I was recently asked if there was an easy way to transform tabular data into a format that was ready for a data warehouse (or at least was normalized).
Initially, I wondered if there was a simple way to do this in Excel. I tried the cut and paste special with the Transpose option checked but that didn’t work.
I then went to the place that I should have started, SQL Server Business Intelligence Development Studio and found a great Data Flow Transformation called Unpivot. This tool was exactly what I was looking for. It was so simple to do this semi-complex task, I thought that I’d outline the steps, as when I looked for information all of the others weren’t very clear.
- Open SQL Server Business Intelligence Development Studio
- Create a New Project (File > New Project)
- From the Control Flow Items toolbox, drag a Data Flow Task to the Control Flow workspace on your new project and then double click it to work on the Data Flow Task details.
- From the Data Flow Sources toolbox, drag an Excel Source object to the Data Flow Task workspace.
- Double click the Excel Source, click New to create a new Excel Connection Manager by Browsing to your file with the tabular data. Then select the worksheet with your data and click OK
You will have an Excel Connection Manager created when you click the “New” button. This manages the connection to the Excel file and can be used to handle the result data.
- From the Data Flow Transformations toolbox, drag the Unpivot transformation object to the Data Flow Task workspace and connect the Excel Data Source output (Green Arrow) to the Unpivot transformation by dragging it on top of it.
- Double-click the Unpivot transformation object. Check all of the fields that have denormalized columns of data, i.e., Title A, Title B, Title C, in my example. Then provide a name for the column that will store which title each new row is associated with, in my example Title A, Title B, Title C description will be stored in the new field Title and then click OK
- From the Data Flow Destinations toolbox, drag the Excel Destination object to the Data Flow Task workspace and connect the Unpivot output (Green Arrow) to the Excel Destination object by dragging it on top of it.
- Double-Click the Excel Destination, click New next to the Name of the Excel Sheet and create a new Excel sheet to store the results of the transformation. Then click the Mappings items and map your transformed fields to the fields in your new Excel Worksheet.
- All that is left to do it run the project. Click Debug > Start Debugging (F5)
- When you are done you should see all Green objects and the new of items that were transformed
- Open the Excel file and you’ll have normalized data.
That’s it. Very simple.
Here are other methods to do this using VBA in Excel, but this will be a much high performance solution.
Source Credit Option Explicit Sub MakeDataBaseTable() Dim SummaryTableRange As Range Dim PivotTableSheet As Worksheet Set SummaryTableRange = ActiveCell.CurrentRegion If SummaryTableRange.Count = 1 Or SummaryTableRange.Rows.Count < 3 Then MsgBox "Select a cell in the summary table.", vbCritical Exit Sub End If ActiveWorkbook.PivotCaches.Add _ (SourceType:=xlConsolidation, _ SourceData:=Array(SummaryTableRange.Address(True, True, xlR1C1, True))) _ .CreatePivotTable TableDestination:="", _ TableName:="PivotTable1" Set PivotTableSheet = ActiveSheet With PivotTableSheet .PivotTableWizard TableDestination:=ActiveSheet.Cells(3, 1) .PivotTables("PivotTable1").DataPivotField.PivotItems("Sum of Value").Position = 1 .PivotTables("PivotTable1").PivotFields("Row").Orientation = xlHidden .PivotTables("PivotTable1").PivotFields("Column").Orientation = xlHidden End With Range("B4").ShowDetail = True Application.DisplayAlerts = False PivotTableSheet.Delete Application.DisplayAlerts = True End Sub
Hope this helps!