Identifying SSIS execution errors in SQL Server 2012+

First of, if you are not using the project deployment model in SSIS 2012+ this blog post is not for you – sorry. The techniques described below assume you are storing your packages in SSISDB. However, you don’t necessarily need to execute your packages through SQL Agent jobs.

Okay, let’s get right into it. After getting too frustrated with waiting for SSIS execution reports out of SSMS, which are not only pretty slow but also give you a lot of information you don’t really need, I figured there has to be a better way to retrieve relevant information directly. What better place to look than tapping directly into SSISDB, the system database which is created automatically when you create an Integration Services Catalog?

Even though this is far from a deep dive into the table structure of SSISDB, I’d like to give you a little more background information on the tables we care about for this script (note that most of the documentation refers to objects with the catalog schema (Microsoft Documentation), those are views that sit on top of the internal schema tables which we use here):

  • internal.executions: lots of helpful information in this table, in particular the [package_name], which account was used to execute the package ([executed_as_name]), additional information about the project the package is part of ([project_name]) and if an environment was used to schedule the package we get information about this as well ([environment_name])
  • internal.operations: this table will tell us when exactly the package was executed ([start_time] and [end_time])
  • internal.operation_messages: holds the error message ([message]) we came here for in the first place. This table holds a LOT of information, so we will have to limit our query to those lines that hold errors, we use the [message_type] column to identify what we want to bring back. It might also help to know the exact time the error occurred ([message_time])
  • internal.event_messages: tells us which step in the SSIS package failed, given you name each component in your package ([message_source_name]), otherwise you’ll see a lot of [Data Flow Task]’s and [Execute SQL Task]’s here
  • internal.execution_parameter_values: all package parameters are listed in this table ([parameter_name] & [parameter_value])
  • internal.event_message_context: last but not least the probably second most useful piece of information is right here, execution-time variable and connection string values ([context_source_name]/[package_path] & [property_value])

To tie it all together there are 2 key identifiers we have to use. Unfortunately Microsoft decided to use [execution_id] on some tables as the primary key and [operation_id] on others, just like you find [event_message_id] here and [operation_message_id] there. Confusing at times and doesn’t make it easy to get a grasp of the structure at first but I am sure they have good reasons for doing so…?!

Now lets have a look at the script (finally!) and I’ll walk you through it:

Since you might want to limit your output to a specific time window (I usually only care about the last ~3 hours) we have the variable @Offset to allow us to define how many hours you want to go back in time. The script itself is pretty straight forward, a few INNER JOINs to link all the tables together and two OUTER APPLYs to generate XML outputs which come in handy when you try to identify if the error is caused by a wrong connection string, incorrect package parameter or variable. The WHERE clause limits the output to anything within the specified time window based on [message_time] and [message_type] = 120 ensures that we only return errors. I like to see most recent errors on top, but feel free to remove the ORDER BY in case you don’t care. Just go ahead and remove it!

If you check the code within the OUTER APPLYs you might notice that I limit the output somewhat, for example I am not returning any sensitive package parameters as those will be scrambled anyways and won’t help you figuring out anything. Once again, feel free to add or remove WHERE clauses so it works best for you!

Since I am lazy I chose to cast the error message as XML so you can quickly click on it and see it in a new window for convenience. You might ask what in the devils name made me use WITH (NOLOCK) all over the place, but in a busy environment you might find yourself waiting for quite some time as SSISDB tables are often used heavily, the logging is quite verbose but of course it mostly depends on your SSIS usage. I also threw in an OPTION(RECOMPILE) for good measure as the number of records you want to retrieve can vary quite a bit and more often than not it does more good than harm for this specific query.

Now let’s not stop here. Say we also want to gather any information we can get about the SQL Agent job which was actually executing the SSIS package (you are executing packages that way, RIGHT?!). Back in the old days™ you were able to use the SQL Agent job history to troubleshoot SSIS errors, but nowadays all you get is a generic error message:

Executed as user: [USERACCOUNT HERE]. Microsoft (R) SQL Server Execute Package Utility Version xx.x.xxxx.x for 64-bit Copyright (C) Microsoft Corporation. All rights reserved. Started: xx:xx:xx PM Package execution on IS Server failed. Execution ID: 12345, Execution Status:4. To view the details for the execution, right-click on the Integration Services Catalog, and open the [All Executions] report Started: xx:xx:xx PM Finished: xx:xx:xx PM Elapsed: xx.xxx seconds. The package execution failed. The step failed.

They tell you to open the reports for more information. But we don’t want to do that anymore, so I am eternally grateful that the error message contains the [execution_id], we will use it to tie everything back to SSISDB (hooray!). All we need to do now is join a few system tables in msdb and as if by magic we know the job name and step along with the command to execute the package (or whatever else you might want to retrieve):

And there you have it. Let me know if it helped you troubleshoot your painful SSIS issues and how you would improve it. Who knows, maybe I even got it wrong somewhere?

 

 

Okay, you are still here, ey? I got one more for you then. Here is a bonus index for you to improve query performance:

That’s all folks!