Entries in SSIS (12)


PASS Summit 2013 - Examples Package

I was asked by some attendees of my PASS 2013 Session, Unseen Minions : Improving parallel processing in SSIS, if I had any sample code or examples of some of the ways my framework could be used.  I have put together 3 samples along with supporting sql code.  You will find a link to that zip file here: Example_Package.  


IF-Then-Else statement in SSIS Derived Columns

Here is another one of those rarely used and often forgotten little tidbits that hit me like a sharp tack when I sit down.

I have an input file that is fixed width and there are numerous columns that represent numbers, but the text files are coming from an old publishing program known as VISTA. This system, I believe runs on ISAM files (yeah that old). The output of these files is odd, because if you have a negative number, it is represented as such 3.75-, whereas a positive is 3.75. SQL Server 2005 doesn't appreciate 3.75- as an integer so when importing with SSIS I used a conditional statement within a derived column.

[Column 9] is defined as a 11 characters fixed width, the right most character is '-' if it is negative, and ' ' if it is positive. It represents a percentage, so it is defined within the database as decimal(5,2). Within the Derived Column Transformation I need to extract the negative number and here is how I do it:

RIGHT([Column 9],1) == "-" ? ((DT_NUMERIC,5,2)SUBSTRING([Column 9],1,10)) * -1 : (DT_NUMERIC,5,2)SUBSTRING([Column 9],1,10)

Let's break down the above statement into the standard IF-Then-Else.

IF RIGHT([Column9],1) == "-"

 THEN ((DT_NUMERIC,5,2)SUBSTRING([Column 9],1,10)) * -1

ELSE  (DT_NUMERIC,5,2)SUBSTRING([Column 9],1,10)

If you substitute the ? for the Then and the : for the ELSE, it becomes very easy to build conditional statements within Derived Column Transformations.  The other pieces of code simply state that if a find a "-" at the 11th position, multiply the number by -1 (after converting the first 10 characters to a decimal data type) and if the 11th position is not a "-", just convert the first 10 characters.



SSIS, Raw Files and Local IO

Our architecture places SSIS packages on their own servers, away from our SQL Servers.  This reduces contention on the sql server boxes themselves and makes the SSIS servers independent, allowing resources to be adjusted accordingly.  There are a few processes where we use SSIS 'Raw Files' to improve performance.  Originally we put the raw files on a remote file server and performance was acceptable.  The idea was floated that putting the raw files locally on the SSIS server should improve performance by reducing network traffic.  It turns out that performance went down by over 80%.  A little investigation determine it was the load on the SSIS box.  The SSIS box has 2 dual cores and 8 GB of memory.  The packages we run use 8 DataFlow Tasks, which is much more than the 4 CPUs installed on the machine.  The reason for this is that the sql we run some time (anywhere from a few seconds to a few minutes) so not all 8 DFTs will be active at the exact same time.  Also there are times where multiple SSIS jobs are running at the same time so the machine is almost always at full capacity.  Back to why performance for local IO is worse than networked IO.

It seems as though connecting to a remote file share enables the server to offload the IO requests to the network cards (2x1GB) vs. servicing the requests locally.  While this might seem counter intuitive, it makes some sense.  You are using the network card to process the IO that is being sent, so you simply queue it up and off it goes.  When you are asking for local IO, it requires processing power, since the OS needs to satisfy the request.  That local request needs a cpu, which takes away from the SSIS processes.


Using Variables in SSIS Script Component

I've found a technique to efficiently exchange information between a script component and the host SSIS package. I always use package level variables as I have not seen nor heard of any reason to use component level variables. I don't use the readonly/writeonly variables in the script task editor. Instead I use the following VB code:



The above code locks the sales_file variable for read and the ftp_sales variable for write. I then use the following piece of code:

Dim variablesList As Variables


The above statement takes all of the variables that we have locked for read/write and puts them into the variablesList variable. To extract a read-only variable, use the following code:

Dim SFileName As String

SFileName = variablesList("User::sales_file").Value.ToString

I can now reference SFileName throughout my script. To push a variable from within a Script Component back out to the SSIS package, I use the following code snippet:

variablesList("User::ftp_sales").Value = FTPSales

When all is said and done you need to run the following snippet:


Until you run the unlock, the variablesList prevents other packages from using these variables, even to read them. I tend to unlock them as soon as possible, so if I am simply doing reads, I unlock them as soon as the lastDts.VariableDispenser.LockForRead is issued.  If I am doing writes, then I wait until the last write is done and then unlock them.


SQL Output to an Email Attachment

This is a quick example of how to take the output of a sql query and email it to someone.

The above image shows a few of the pieces.  Create a variable, in this case filename.  You then open the properties editor for the sendmail task.  You select fileAttachments and link the variable.  This will send the text file you create as an attachment.  Now we go to the DataFlow task to see how to connect the output of your sql to that same textfile.

You create a DataReader source and connect the output to a FlatFile destination.  To create the FlatFile destination you need a FlatFileConnectionManager and that is where you connect the output to a physical file.  Click on your FlatFileConnectionManager then on the properties window on the right, click on the  + sign to the left of Expressions.  Click on the ellipses (...) on the right side, which will open the PropertiesExpressionsEditor you see on the left.  Select the ConnectionString property and select the filename variable.  

You have now connected the output of your sql to the file attachment in your email.