SSIS insert and update rows in a table based on the contents of a Excel file

Lets imagine you have an Excel file of First name, Surname, and Age. Also contained in this file is a unique identifier (UID), here is an example:

ID First Surname Age
1 Bob Smith 12
2 Tom Stoner 34
3 Brenda Jones 45

Now, lets continue to imagine that you have a corresponding table in SQL Server:

So, the situation is thus; We want to update records that exist in both the table and the Excel file and insert those that exist in the Excel file but not in the table. How do we do this…. Up steps SSIS.

I am going to assume you know the basics of SSIS and so will start from the perpective of a new project in BIDS.

Under Control Flow we need to add a new Control Flow Item – Data Flow. Rename this to Insert and Update records. Now double click on this item, or select the Data Flow tab.
Now things get exciting, we want to create a data flow that queries both the Excel File and the database table, both will need to be sorted on the same column, and then we perform a left outer join. At this stage we can determine which rows of the Excel sheet are new, and which exist already. To do this we use a conditional split where we ascertain if the UID is null (from the database table), if it is we know this is a new row and so should be inserted into the table, if it isn’t we know this row exists in the table and so the row should be updated (perhaps one of the column values has changed).

Without further ado, let us begin.

1. Drag onto the canvas an Excel Source
2. Configure the Excel Source to open our Excel file (you will need to create an Excel Connection manager)
3. Drag onto the canvas an OLE DB source
4. Configure the OLE DB source to reference the person table (you will need to create an OLE DB Connection manager instance)
5. Before sorting the columns, I would advise converting the Excel columns to the same type as database columns, in this case I have selected the four columns:

a. ID converted to Four Byte Signed Integer
b. Firstname to String(100)
c. Surname to String(100)
d. Age to Four Byte Signed Integer

6. Now that is in place, both streams of data need to be sorted by the ID columns (note, make sure you pass through the columns you have just converted and don’t pass through the Excel columns.

Ok, so now the exciting bit. We will perform the left outer join and the conditional to determine if each row is either a new row or existing row.
1. Drag a Merge Join onto the canvas
2. Drag a Conditional split onto the canvas
3. Drag the line from the Excel Sort to the Merge Join, this will be your Merge Left Input
4. Drag the line from the DB Sort to the Merge Join
5. Double click on the Merge Join, select the UID column from the database, and the UID, FirstName, Surname, and Age columns from the Excel Source.

6. Remeber to change the join type to Left outer Join

Now we can perform the conditional split:
1. Drag the line from the Merge Join to the conditional split
2. Double click the conditional split and configure as shown below:

This will now give us two streams coming out of the conditional split, those belonging to the the NewRow and those belonging to the ExistingRow. The data belonging to the new row will be inserted into the table, and the data belonging to the Existing Row will update said row.

So, configuring the insertion is quite easy;
1. Drag an OLE DB Destination onto the canvas
2. Configure the OLD DB Destination to insert to the person table
3. Drag the line from the conditional split to the OLE DB Destination, you will be prompted to choose the output, you want to select NewRow (as defined in the Conditional Split)
4. Choosing the mapping tab and ensure they map to the columns in the table

That is the insert sorted, how about the update;

1. Drag an OLD DB Command onto the canvas
2. Drag the green line from the conditional split to the OLD DB Command, you will be prompted to choose the output, select ExistingRow (as defined in the conditional split)
3. To configure the OLD DB Command, Double click on the item:

a. Select the relevant connection manager (it is the DB one you created earlier)
b. Under Custom Properties select SQLCOMMAND and the button at the end of the input.
c. You are now prompted with an input box, you need to enter an SQL update command (example below)

d. You then need to configure the parameter mappings top the columns (example below)

With that complete you are done!! I will leave you to configure error checking etc, and how you want to trigger this. Your data flow should look something like this:

I hope this proves useful?

Point of note, I suffered a slight problem on my machine because it is 64Bit, I received the error message “DTS_E_OLEDB_EXCEL_NOT_SUPPORTED” – Have a read of these two articles: here and here.

Tags: , , , ,

32 Responses to “SSIS insert and update rows in a table based on the contents of a Excel file”

  1. SSIS Loading multiple files containing a column value within the filename. « It's Smee Blog Says:

    […] That is the script object done, now the data flow task. Now, the data flow task will consist of a number of tasks, but I will only focus on a few. These being the derived column, and Excel Connection. From the Merge Join down that is performing the insert or update which I have previously covered here. […]

  2. Rev Says:

    Thank you so much! Easy to implement and explained perfectly.

    Thanks again!

  3. 2010 in review – A Wordpress email… « It's Smee Blog Says:

    […] SSIS insert and update rows in a table based on the contents of a Excel file October 2010 2 comments 3 […]

  4. is696 Says:

    The best explanation on the web on the issue. Five stars.

  5. Sudipt Sahu Says:

    The way of presentation is perfect. Easy to understand. Thanks

  6. Stuart Kirkup Says:

    Fantastic. Today is my first day of using SSIS, and I implemented this in about 15 minutes. I’d NEVER have worked out how to do that on my own with so little experience. Really nicely presented

  7. ashuthinks Says:

    hi help me here
    my question :

  8. Nico Says:

    Thanks for this intuitive example.
    I have the same flow setup now, with a conditional split to separate rows to insert, update and delete. When running the task the Insert succeeds, but the following Updates and Deletions fail due to a transaction deadlock. I’ve tried unchecking the TableLock on Insert to no avail. And also messed around with connection and transaction setting – still no luck. Is there a solution to this issue or is the only way out to restructure the package and create separate Data Flow Tasks for Insert, Update and Delete?

    • itssmee Says:

      My first question is why do you want to delete? Is it the case that your input contains only the data that should be in the table? If so, you should delete / truncate the table before you perform the insert / update?

      The reason you are having problem is because of the delete, you need to separate that out i.e. have an SSIS job for the deletes, and then one for the insert / update.

  9. ByronOne Says:

    The new row element works perfectly but the update row doesn’t seem to make the changes – there is no error but the changes in the Excel file are not reflected in the SQL table – any ideas why this might be?

  10. David Walker Says:

    If you only have new rows and updated rows, you can use a lookup transformation to separate the rows into two data flows.

  11. David Walker Says:

    SSIS in SQL 2008 complains that the SQL does not return any information. (It’s looking for metadata to come from the SQL command, but since this is an Update command, there is no metadata.) I can’t get past this error, so I’m going to use a SQL Merge statement in the control flow tasks and just have the data flow task populate the staging table. I would like to do it this way, but I can’t get it to work in SQL 2008.

  12. Roger Ong Says:

    Thanks. That really helped me. However, the article needs a little proof reading (E.G. ‘OLD’ DB Command) and some aspects of what needed to be done required a further look up as well (how to sort outputs). But again: Thanks.

  13. Abhinav Says:

    Thank you very much for this article. You made that stuff easier to understand what others have tried to make unnecessarily difficult through meaningless examples. It was really easy to understand. Lots of Thanks!

  14. Galina Says:

    It’s really clear explanation. But what about perfomance? I have 60000 rows in table and Merge join works very slowly.

  15. brian Says:

    its updating all the records …please help

  16. brian Says:

    sorted out.. 😀

  17. Jaykumar Says:

    The Condition column in Conditional Split should be UID and not uid! That’s how the Left Join works! Let me know if I am missimg out the trick here!

  18. Ali Says:

    THANK YOU very much Smee for this post, it was very helpful 🙂

  19. Abhay Says:

    Hey OLEDB Command updating all existing rows if we want to update only changed records then how t do ………. can you clear this

  20. Iain Says:

    I’ve implemented this method and I’m finding that it updates all my updated rows with the same input record. Any thoughts as to what would cause this behavior?

  21. ladeana Says:

    I’m getting the same results as Iain. any ideas?

  22. singh Says:

    Its probably not working because update command doesn’t restrict to to each item. you need to add “Where UID = ?”

  23. Iain Says:

    Absolutely right. Just added a WHERE clause and got better results. Anyone care to explain why? Is this an oversight in the tutorial or are certain joins going to require where while others can get away without restrictions?

  24. Sameh Abdo Salem Says:

    need to add where ID=?

  25. rohit Says:

    it is nice article but i do not understand clearly can you upload the video of same so the it can clearly understand the the article
    thanks thanks thanks for uploading the video in advance
    with regards

  26. louis Says:

    on inserting the merge join I get the ‘no input column found. Atleast one input must be mapped to an output column……Please can someone help?

  27. Mikeo Says:

    I could not figure out where to place this “where ID=?” I ended up creating a stored procedure to make this work with my SQL Server SSIS Package.

  28. janakiraman Says:


  29. John Says:

    This was exceptionally helpful. Thank you for breaking it down like this!

  30. Vince Says:

    Very helpful link

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: