Navigation
Follow Along

Powered by Squarespace
Tuesday
21Jul2009

Large Page IO for SQL 2005

Very insightful article. Large Page IO

Wednesday
15Jul2009

Running DTS2000 from within SSIS 2005

Found something interesting today. I have an SSIS package that simply runs a DTS package using the 'Execute DTS 200 Package Task' (I know, I didn't like it either, but deadlines prevented a rewrite of this DTS, which happen to be 2 200 line active X tasks.) The DTS package currently exists within a SQL Server 2005 machine, so we were running this with 'Windows Authentication'. We kept the package security at 'EncryptSensitiveWithUserKey'To get this to run I needed first to make sure ran in 32-bit mode not 64-bit mode. After running I got this error:

Error: 2009-07-15 08:46:38.82

Code: 0xC0016016

Source:

Description: Failed to decrypt protected XML node "PackagePassword" with error 0x80070002 "The system cannot find the file specified.". You may not be authorized to access this information. This error occurs when there is a cryptographic error. Verify that the correct key is available.

End Error

Error: 2009-07-15 08:46:38.82

Code: 0xC0016016

Source:

Description: Failed to decrypt protected XML node "SQLPassword" with error 0x80070002 "The system cannot find the file specified.". You may not be authorized to access this information. This error occurs when there is a cryptographic error. Verify that the correct key is available.

End Error

 

Turns out, you need to set your package to 'DontSaveSensitiveData'.

Wednesday
08Jul2009

Using parameters in stored procedures in Execute SQL Task in SSIS.

Every now and again this bites me.  I can't remember how to pass parameters into and out of stored procedures.  This is for OLEDB connections:

 

Friday
03Jul2009

Division of Labor; SSIS and Sql Server

SSIS was born for parallel processing, while sql server wasn't; well not from the users perspective. Yes, sql server breaks a user's query into parallel processes, but there is no way I can 'fork' within a stored procedure, or do an asynchronous call. I originally designed a process which I outline here "Parallel Processing in SQL"which works beautifully if in fact the work to be done is broken into equal parts.

As an example, if you have 100 units of work and decide to divide it up between 5 threads, each thread would have 20 units of work. If each unit takes 1 min, then the entire process will take 20 mins. Now, lets say that somewhere in the 20 units of work that thread 4 has (around task # 15 and #17) those are different and take 10 min each. Rather than the process being done in 20 min, the process will finish in 38 min. The reason being is that when the other threads finish in 20 min, thread 4 will be stuck doing work for an extra 18 min (+9 min for task #15 and +9 min for task#17). The reason for this is because when you use the modulo strategy each thread is assigned a workload that cannot be transferred. If a thread happens to get stuck behind one or more long running pieces of work, the entire process suffers. Having seen this in some of my load processes, I came up with a new strategy that divides the work evenly, allowing the other threads to pick up work if one or more lag behind.

The new strategy creates a pool of work. Let's go back to the 100 units of work. I create a table that contains 100 rows, each with a number of columns representing the variables or work I need to do. In the example I give, I am extracting a statement_account and a rowcount. If you've ever programmed in a mutli-threaded language you would be familiar with a semaphore or latch. This is a nice mechanism to control access to a specific section of code. For us, we will be protecting that table of work, which for a lack of a better term I'll call work_queue.

There is no way within sql server that I can prevent a stored procedure from being called by two or more connections, but I can affect access to a table by way of locks. I devised a rather efficient way, within a procedure, of taking one record off the work queue for processing. In effect I have used the work_queue table as a semaphore to make sure each thread gets a single unit of work, without holding up the others for an extended period of time and no duplication of work. Here is the procedure

create procedure usp_workunit

(@rcount int output,

@statement char(8) output)

as

begin

set nocount on

declare @tblvar as table (stmt char(8))

delete top(1) work_queue

output deleted.statement_acct into @tblvar

select @rcount=@@rowcount

select @statement=stmt

from @tblvar

end

The important piece of work is using the output keyword from within the delete. As you see I can remove the top record from the work_queue table and extract the actual value without needing to lock the work_queue table (with a begin tran), select one record, delete that record, and then release the lock. I simply use the single delete statement. Also note that work_queue is simply a heap. The statement and rowcount are exported to SSIS. I use the rowcount variable to determine when there is no more work to extract and end that thread's execution loop.  Also note that since there is no assigned work queue, each thread simply retrieves the next valid record, there is no backup and only the thread that picked the short straw (longer work work unit) will be slowed.  I will show the structure I use within SSIS that couples with the work unit in my next post.

Monday
29Jun2009

Why use configfiles with SSIS?

I have had this conversation with quite a few people, including a recent conversation with http://twitter.com/joechr309, and decided to blog about the choices that I've made regarding configuration files and sql server. I happen to love the idea of de-coupling a DTS package from the sql server. I can now completely isolate sql server from external programs. If I happen to have a very non-sql intensive SSIS process (importing various files, moving them, manipulating them, ftp'ing them, etc) I no longer have to consume sql resources (granted I need to do lookups and other minor sql tasks, which is why it is an SSIS package). I can also use SSIS high availability functionality. Continuing on this path of separating SQL Server from SSIS, why not have a configuration file to make the necessary changes? Why bind the SSIS package with the need to go to a specific sql server to extract run-time variables, when I can simply provide a configfile with all of that information. I keep both the configuration file and the SSIS package on a secure fileshare (with limited access) and I have met SOX compliance and de-coupled my sql servers.

Having tables contain run-time information can get messy. Do I put the configuration for multiple SSIS packages on the same server and then if I do, where do I put them? Does the development team mandate that every sql server will have a database named 'configurations' and within that one or more tables that the SSIS packages will look for their specific information? How do I efficiently store this? Do I store it by unique SSIS package ID or by another convention? I am sure there are many answers to these many questions, but if you look at the simple solution, it would be a configuration file.  I don't need to grant developers access to databases to configure their packages, they can simply edit the configfile on their own.  They no longer need a database change request to make configfile modifications.  These are a few reasons why I love the idea of using configuration files over other methods. I have tried to see the downside to it and I am looking for those who taken the database approach to comment.

Page 1 ... 3 4 5 6 7 ... 8 Next 5 Entries ยป