Posted by: jasonbub | April 10, 2008

Unpivot that data, it’s easy!

I was recently asked if there was an easy way to transform tabular data into a format that was ready for a data warehouse (or at least was normalized).

Initially, I wondered if there was a simple way to do this in Excel. I tried the cut and paste special with the Transpose option checked but that didn’t work.

I then went to the place that I should have started, SQL Server Business Intelligence Development Studio and found a great Data Flow Transformation called Unpivot. This tool was exactly what I was looking for. It was so simple to do this semi-complex task, I thought that I’d outline the steps, as when I looked for information all of the others weren’t very clear.

Example Data

  1. Open SQL Server Business Intelligence Development Studio
  2. Create a New Project (File > New Project)
  3. From the Control Flow Items toolbox, drag a Data Flow Task to the Control Flow workspace on your new project and then double click it to work on the Data Flow Task details.
  4. From the Data Flow Sources toolbox, drag an Excel Source object to the Data Flow Task workspace.
  5. Double click the Excel Source, click New to create a new Excel Connection Manager by Browsing to your file with the tabular data.  Then select the worksheet with your data and click OK

    You will have an Excel Connection Manager created when you click the “New” button.  This manages the connection to the Excel file and can be used to handle the result data.

  6. From the Data Flow Transformations toolbox, drag the Unpivot transformation object to the Data Flow Task workspace and connect the Excel Data Source output (Green Arrow) to the Unpivot transformation by dragging it on top of it.
  7. Double-click the Unpivot transformation object.  Check all of the fields that have denormalized columns of data, i.e., Title A, Title B, Title C, in my example.  Then provide a name for the column that will store which title each new row is associated with, in my example Title A, Title B, Title C description will be stored in the new field Title and then click OK

  8. From the Data Flow Destinations toolbox, drag the Excel Destination object to the Data Flow Task workspace and connect the Unpivot output (Green Arrow) to the Excel Destination object by dragging it on top of it.
  9. Double-Click the Excel Destination, click New next to the Name of the Excel Sheet and create a new Excel sheet to store the results of the transformation.  Then click the Mappings items and map your transformed fields to the fields in your new Excel Worksheet.


    Click OK.
  10. All that is left to do it run the project.  Click Debug > Start Debugging (F5)
  11. When you are done you should see all Green objects and the new of items that were transformed
  12. Open the Excel file and you’ll have normalized data.

That’s it.  Very simple.

Here are other methods to do this using VBA in Excel, but this will be a much high performance solution.

VBA Example

Source Credit
Option Explicit

Sub MakeDataBaseTable()
Dim SummaryTableRange As Range
Dim PivotTableSheet As Worksheet
Set SummaryTableRange = ActiveCell.CurrentRegion
If SummaryTableRange.Count = 1 Or SummaryTableRange.Rows.Count < 3 Then
MsgBox "Select a cell in the summary table.", vbCritical
Exit Sub
End If
ActiveWorkbook.PivotCaches.Add _
(SourceType:=xlConsolidation, _
SourceData:=Array(SummaryTableRange.Address(True, True, xlR1C1, True))) _
.CreatePivotTable TableDestination:="", _
TableName:="PivotTable1"
Set PivotTableSheet = ActiveSheet
With PivotTableSheet
.PivotTableWizard TableDestination:=ActiveSheet.Cells(3, 1)
.PivotTables("PivotTable1").DataPivotField.PivotItems("Sum of Value").Position = 1
.PivotTables("PivotTable1").PivotFields("Row").Orientation = xlHidden
.PivotTables("PivotTable1").PivotFields("Column").Orientation = xlHidden
End With
Range("B4").ShowDetail = True
Application.DisplayAlerts = False
PivotTableSheet.Delete
Application.DisplayAlerts = True
End Sub

Hope this helps!

JasonBub

About these ads

Responses

  1. Thanks! But how do you account for another title being added (Title E, or Title F) without manually fixing the SSIS package every time?

  2. […] […]


Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Categories

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: