Achievements

« SQL-U Day Four Advanced SSIS : Enhanced file importing | Main | SQL-U Day Two Advanced SSIS : Sourcing a stored procedure »
Wednesday
Jan262011

SQL-U Day Three Advanced SSIS : Custom destination components

Today we are going to learn how to make a custom destination component.  Actually, how to make one work.  This isn't nearly as involved or complex as a custom source as there are fewer moving parts.  

Before we look at an example of one, here is a picture for review.  The color codes are to differentiate the two types of destination components that we are going to take apart.  And you thought you were only going to look at 1!!

 The magenta destinations are used to move information from the data flow to external, SSIS variables.  The green destinations not only pass information up to SSIS, they send information to the database, and get information back from the database.  You'll see what information later on.

Here is the environment of the magenta destination component.  It is design to count.  If you look closely I am counting the number of existing invoices and what division (Canadian or Trade).  One thing to note, because this is a destination, it has access to the Row Count components above.  In my package, the count will always be one and can be either Canada or Trade. 

Now that you have some background, we are going to open up the magenta destination.  You should all have one at your desk, pinned open.

Common to all destination components are 3 methods: PreExecute(), PostExecute, <name>_ProcessInputRow().  Similar to the structures found in the source component we learned about yesterday, the PreExecute() method is called when the Data Flow component is initialized, not when the component is used.  This is where you would work with any read-only variables.  The PreExecute() method is the only place that you can access read-only variables.  You will get an error otherwise.  PostExecute() method is called when the component is finished.  This could mean it has processed all of the rows sent to it or it was sent none at all.  This is the only place you can access read-write variables.  

If you look closely at the PostExecute() method, you'll see this is where the magic happens.  I have two 'global' variables, called cninvexistscount and trinvexistscount.  These are global variables in SSIS that count the overall number of Canadian and Trade invoices that are duplicates.  Each of the 8 Data Flow tasks put their counts into this variable.  What I add to each of these global variables are the local variables, Engine0cninvexists and Engine0trinvexists.  I also set another global variable Engine0flag which is used in the ETF to control the engines (We'll get to that on Friday). 

The <name> section of the ProcessInputRow() method might look confusing, but it isn't.  If you do nothing to a new destination component, it comes with a default input buffer named, Input0.  If you added a second input, it would be named Input1.  I didn't bother to change it.  Normally this is the engine of the component.  Rows are absorbed and transmogrified here!  Whereas the other two methods are called once during the life of the component, this method is called FOR EVERY RECORD.  This is where the multi-pathing I spoke about on Day One bit me and it hurt!  

I figured the component would only be called when there is a row to process.  WRONG!.  The ProcessInputRow() method is called only when there are records to process.  Remember, in my package, only 1 of the four paths are valid.  If I have an new invoice, I didn't think there would be rows to process on the 'existing invoice' path and the component wouldn't be instantiated. Wrong!!   I was getting weird counts all over the place and sporadic at best.  What I forgot to mention about multi-pathing is that the order of evaluation is random.  SSIS might evaluate the 'Duplicate Statement' path first one time and last another.  I needed a way to know when this component was the correct path for a given .pdf.  That is where the ProcessInputRow() method comes in.  To fix the issue of knowing when this component was the proper path for a given document, I set a boolean (Process) to true, when it 'processed' a row.  

That, was the easy destination component.  Drop that in the garbage and put the green destination component on your trays.

The green component needed to do two things.  It was the reason I had to design a custom component.  Front-end restrictions prevented the application from using the 20-byte hash built on each document as a way to extract a document from the database.  We opted for an identity column as a second unique identifier.  That posed a problem.  The standard OLEDB Destination component does not allow me to extract any information from the insert action.  I needed to take that identity column value and insert it into another table later on in the package.  To do so I needed to extract it and pass it up to an SSIS global variable.

Careful cutting this one open.  It's fresh and it might squirt a bit.  Take hold of the PreExecute() method.

It simply sets up the connection object for use later on.  In this workload is done in the ProcessInputRow() method.  You'll need to be careful with this, it's big.  You might want to use two hands when pulling this out or ask a classmate for help.

There are two subparts to this method.  I first insert a record into a table and then, because I have control over the transaction, I am able to get the identity column.  Now that I have set up the sql statement to insert and then retrieve the identity, I need to get that from SQL Server.

 I pass the single value back to the webunique variable, by calling the command with cmd.ExecuteScalar(). This method is called for every row, so make sure you build accordingly.  It's not like the source component where CreateOutputRows() is called once and you loop inside.  Notice the Process variable is set to let the component know it processed a row.  This is a component level variable, created inside so I can access that anywhere.  I CANNOT access any SSIS variables in here, so you need to store intermediate results in component variables to pass along.  Speaking of passing along, tear that out and poke around the PostExecute() method.

Here you can see there is a check to make sure this component processed a row.  If that check is met, we transfer the local variables out to SSIS.

There you have it.  How to engineer your own custom destination components.  Tomorrow I am going to go over either how to source variables in a custom source component or how to import multiple files using custom components and NOT ForEach loop.  Send me a tweet or email and let me know which you would prefer.

PrintView Printer Friendly Version

EmailEmail Article to Friend

References (5)

References allow you to track sources for this article, as well as articles that were written in response to this article.
  • Response
    Response: www.land-sjo.se
    Thank you so much for this wonderful content , this is actually the type of factor which keeps myself proceeding throught the day.
  • Response
    Response: sports betting
    SQL-U Day Three Advanced SSIS : Custom destination components - Journal - SSIS - SQL Server Tidbits
  • Response
    Response: hosted call center
    SQL-U Day Three Advanced SSIS : Custom destination components - Journal - SSIS - SQL Server Tidbits
  • Response
    SQL-U Day Three Advanced SSIS : Custom destination components - Journal - SSIS - SQL Server Tidbits
  • Response
    SQL-U Day Three Advanced SSIS : Custom destination components - Journal - SSIS - SQL Server Tidbits

Reader Comments (1)

Nice one, thanks so much for sharing the time and knowledge...

June 20, 2011 | Unregistered CommenterJason Yousef

PostPost a New Comment

Enter your information below to add a new comment.

My response is on my own website »
Author Email (optional):
Author URL (optional):
Post:
 
Some HTML allowed: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>