Power Query – Data Transformation: Difference between revisions
Jump to navigation
Jump to search
No edit summary |
|||
Line 64: | Line 64: | ||
== Unpivoting Columns == | == Unpivoting Columns == | ||
'''Goal:''' Reshape data so that monthly sales are listed in rows rather than across columns. | |||
'''File used:''' | |||
* [[Media:Monthly_Sales_2023.xlsx|Download Monthly_Sales_2023.xlsx]] | |||
'''Steps:''' | '''Steps:''' | ||
# | # In Excel → go to the '''Data''' tab → click '''Get Data''' → '''From File''' → '''From Workbook''' | ||
# Select | # Select the file: '''Monthly_Sales_2023.xlsx''' and click '''Import''' | ||
# In Navigator | # In the Navigator window: | ||
#* Select the worksheet that appears (there is only one sheet in this file) | |||
#* Click the button '''Transform Data''' to open Power Query Editor | |||
# In Power Query Editor: | # In Power Query Editor: | ||
#* | #* Confirm that columns like `Jan`, `Feb`, ..., `Jun` are present | ||
#* Go to the '''Transform''' tab | #* Select all month columns by clicking on `Jan`, then holding Shift and clicking `Jun` | ||
# Go to the '''Transform''' tab → click '''Unpivot Columns''' | |||
# | # Power Query will convert each of the selected columns into two columns: | ||
#* One called '''Attribute''' (containing the month names) | |||
#* One called '''Value''' (containing the corresponding sales values) | |||
# You can rename the new columns: | |||
#* Right-click the column '''Attribute''' → choose '''Rename''' → type `Month` | |||
#* Right-click the column '''Value''' → choose '''Rename''' → type `Sales` | |||
'''Optional:''' | |||
* Change the data type of the new `Sales` column to a number: | |||
#* Click the small icon in the `Sales` column header (e.g. ABC123) and choose '''Whole Number''' | |||
'''Result:''' | |||
You now have a normalized table where each row represents a single product's sales in a particular month. This structure is ideal for analysis or pivoting in Excel. | |||
== Reordering Columns == | == Reordering Columns == |
Revision as of 10:19, 22 June 2025
Module 3: Data Transformation Techniques
Objective
Clean and reshape data using Power Query’s built-in tools in Excel.
Files Used
Removing Blank Rows and Unnecessary Columns
Steps:
- Go to Excel → Data tab → Get Data → From File → From Workbook
- Select and load Survey_Results.xlsx
- In Navigator → select the worksheet → click Transform Data
- In Power Query Editor → Go to Home tab → click Remove Rows → choose Remove Blank Rows
- To remove unwanted columns:
- Click the header of the column you want to remove
- Hold Ctrl and select additional columns if needed
- Right-click any selected header → choose Remove Columns
Changing Data Types
Steps:
- In Power Query Editor, locate the column you want to change
- In the column header, click the small data type icon (e.g. ABC123 or calendar icon)
- Select the appropriate data type: Text, Whole Number, Decimal Number, Date, etc.
- Alternatively, go to the Transform tab → click Data Type dropdown → choose desired type
Renaming Columns
Steps:
- In Power Query Editor, right-click on the column header you want to rename
- Click Rename
- Type the new column name and press Enter
- OR: double-click on the column name to rename directly
Filtering and Sorting Data
Steps:
- In Power Query Editor, locate the column by which you want to filter or sort
- Click the dropdown arrow in the column header
- To filter:
- Check/uncheck boxes or use number/text filters
- To sort:
- Click Sort Ascending or Sort Descending in the same dropdown menu
Grouping Data
Steps:
- Select the column that contains the values to group (e.g. Region or Category)
- Go to the Transform tab
- Click Group By
- In the dialog:
- Choose Basic or Advanced
- Select the column to group
- Choose operation (e.g. Count Rows, Sum, Average)
- Click OK
Splitting Columns
Steps:
- Select a column (e.g. Full Name or "City, State")
- Go to the Home tab → click Split Column
- Choose a method (e.g. By Delimiter)
- In the dialog:
- Choose the delimiter (e.g. comma, space, custom)
- Choose how to split (e.g. at each occurrence, left-most, right-most)
- Click OK
Unpivoting Columns
Goal: Reshape data so that monthly sales are listed in rows rather than across columns.
File used:
Steps:
- In Excel → go to the Data tab → click Get Data → From File → From Workbook
- Select the file: Monthly_Sales_2023.xlsx and click Import
- In the Navigator window:
- Select the worksheet that appears (there is only one sheet in this file)
- Click the button Transform Data to open Power Query Editor
- In Power Query Editor:
- Confirm that columns like `Jan`, `Feb`, ..., `Jun` are present
- Select all month columns by clicking on `Jan`, then holding Shift and clicking `Jun`
- Go to the Transform tab → click Unpivot Columns
- Power Query will convert each of the selected columns into two columns:
- One called Attribute (containing the month names)
- One called Value (containing the corresponding sales values)
- You can rename the new columns:
- Right-click the column Attribute → choose Rename → type `Month`
- Right-click the column Value → choose Rename → type `Sales`
Optional:
- Change the data type of the new `Sales` column to a number:
- Click the small icon in the `Sales` column header (e.g. ABC123) and choose Whole Number
Result: You now have a normalized table where each row represents a single product's sales in a particular month. This structure is ideal for analysis or pivoting in Excel.
Reordering Columns
Steps:
- In Power Query Editor, click and drag the column header to a new position
- OR: right-click the column header → choose Move → select desired direction (Left, Right, To Beginning, To End)
Duplicating and Referencing Queries
Steps:
- In the Queries pane on the left side of Power Query Editor:
- Right-click the query you want to reuse
- To create a full copy: choose Duplicate
- To create a linked version that depends on the original: choose Reference
Additional Exercise: Cleanup and Pivot Prep
Goal: Prepare survey data for pivot table analysis
Steps:
- Load Survey_Results.xlsx and remove rows with missing values in the satisfaction column
- Rename the satisfaction column to lowercase (e.g. "satisfaction")
- Group the table by satisfaction rating:
- Go to Transform → Group By → select satisfaction → operation: Count Rows
- Load the final grouped table to a new worksheet in Excel
Extension: Add a Conditional Column
Goal: Categorize satisfaction ratings into "Positive", "Neutral", or "Negative"
Steps:
- In Power Query Editor → go to Add Column tab → click Conditional Column
- Set:
- New column name: `SatisfactionLevel`
- If Satisfaction = 4 or 5 → then "Positive"
- Else if Satisfaction = 3 → then "Neutral"
- Else → "Negative"
- Click OK
→ Continue to Module 4: Automation and Parameters
Return to Main Page