Lets imagine you have an Excel file of First name, Surname, and Age. Also contained in this file is a unique identifier (UID), here is an example:
Now, lets continue to imagine that you have a corresponding table in SQL Server:
So, the situation is thus; We want to update records that exist in both the table and the Excel file and insert those that exist in the Excel file but not in the table. How do we do this…. Up steps SSIS.
I am going to assume you know the basics of SSIS and so will start from the perpective of a new project in BIDS.
Under Control Flow we need to add a new Control Flow Item – Data Flow. Rename this to Insert and Update records. Now double click on this item, or select the Data Flow tab.
Now things get exciting, we want to create a data flow that queries both the Excel File and the database table, both will need to be sorted on the same column, and then we perform a left outer join. At this stage we can determine which rows of the Excel sheet are new, and which exist already. To do this we use a conditional split where we ascertain if the UID is null (from the database table), if it is we know this is a new row and so should be inserted into the table, if it isn’t we know this row exists in the table and so the row should be updated (perhaps one of the column values has changed).
Without further ado, let us begin.
1. Drag onto the canvas an Excel Source
2. Configure the Excel Source to open our Excel file (you will need to create an Excel Connection manager)
3. Drag onto the canvas an OLE DB source
4. Configure the OLE DB source to reference the person table (you will need to create an OLE DB Connection manager instance)
5. Before sorting the columns, I would advise converting the Excel columns to the same type as database columns, in this case I have selected the four columns:
a. ID converted to Four Byte Signed Integer
b. Firstname to String(100)
c. Surname to String(100)
d. Age to Four Byte Signed Integer
6. Now that is in place, both streams of data need to be sorted by the ID columns (note, make sure you pass through the columns you have just converted and don’t pass through the Excel columns.
Ok, so now the exciting bit. We will perform the left outer join and the conditional to determine if each row is either a new row or existing row.
1. Drag a Merge Join onto the canvas
2. Drag a Conditional split onto the canvas
3. Drag the line from the Excel Sort to the Merge Join, this will be your Merge Left Input
4. Drag the line from the DB Sort to the Merge Join
5. Double click on the Merge Join, select the UID column from the database, and the UID, FirstName, Surname, and Age columns from the Excel Source.
6. Remeber to change the join type to Left outer Join
Now we can perform the conditional split:
1. Drag the line from the Merge Join to the conditional split
2. Double click the conditional split and configure as shown below:
This will now give us two streams coming out of the conditional split, those belonging to the the NewRow and those belonging to the ExistingRow. The data belonging to the new row will be inserted into the table, and the data belonging to the Existing Row will update said row.
So, configuring the insertion is quite easy;
1. Drag an OLE DB Destination onto the canvas
2. Configure the OLD DB Destination to insert to the person table
3. Drag the line from the conditional split to the OLE DB Destination, you will be prompted to choose the output, you want to select NewRow (as defined in the Conditional Split)
4. Choosing the mapping tab and ensure they map to the columns in the table
That is the insert sorted, how about the update;
1. Drag an OLD DB Command onto the canvas
2. Drag the green line from the conditional split to the OLD DB Command, you will be prompted to choose the output, select ExistingRow (as defined in the conditional split)
3. To configure the OLD DB Command, Double click on the item:
a. Select the relevant connection manager (it is the DB one you created earlier)
b. Under Custom Properties select SQLCOMMAND and the button at the end of the input.
c. You are now prompted with an input box, you need to enter an SQL update command (example below)
d. You then need to configure the parameter mappings top the columns (example below)
With that complete you are done!! I will leave you to configure error checking etc, and how you want to trigger this. Your data flow should look something like this:
I hope this proves useful?