Power Query – Data Transformation: Difference between revisions
Jump to navigation
Jump to search
No edit summary |
No edit summary |
||
Line 2: | Line 2: | ||
== Objective == | == Objective == | ||
Clean and reshape data using Power Query’s built-in tools. | Clean and reshape data using Power Query’s built-in tools in Excel. | ||
== Files Used == | == Files Used == | ||
Line 8: | Line 8: | ||
* [[Media:Orders_2023.xlsx|Download Orders_2023.xlsx]] | * [[Media:Orders_2023.xlsx|Download Orders_2023.xlsx]] | ||
== Removing Rows and Columns == | == Removing Blank Rows and Unnecessary Columns == | ||
'''Steps:''' | '''Steps:''' | ||
# | # Go to Excel → Data tab → Get Data → From File → From Workbook | ||
# In Power Query Editor | # Select and load '''Survey_Results.xlsx''' | ||
# In Navigator → select the worksheet → click '''Transform Data''' | |||
#* | # In Power Query Editor → Go to '''Home''' tab → click '''Remove Rows''' → choose '''Remove Blank Rows''' | ||
# To remove unwanted columns: | |||
#* Click the header of the column you want to remove | |||
#* Hold Ctrl and select additional columns if needed | |||
#* Right-click any selected header → choose '''Remove Columns''' | |||
== Changing Data Types == | == Changing Data Types == | ||
'''Steps:''' | '''Steps:''' | ||
# | # In Power Query Editor, locate the column you want to change | ||
# | # In the column header, click the small data type icon (e.g. ABC123 or calendar icon) | ||
# | # Select the appropriate data type: '''Text''', '''Whole Number''', '''Decimal Number''', '''Date''', etc. | ||
# Alternatively, go to the '''Transform''' tab → click '''Data Type''' dropdown → choose desired type | |||
== Renaming Columns == | == Renaming Columns == | ||
'''Steps:''' | '''Steps:''' | ||
# | # In Power Query Editor, right-click on the column header you want to rename | ||
# | # Click '''Rename''' | ||
# Type the new column name and press Enter | |||
# OR: double-click on the column name to rename directly | |||
== Filtering and Sorting == | == Filtering and Sorting Data == | ||
'''Steps:''' | '''Steps:''' | ||
# Click | # In Power Query Editor, locate the column by which you want to filter or sort | ||
# | # Click the dropdown arrow in the column header | ||
# Click | # To filter: | ||
#* Check/uncheck boxes or use number/text filters | |||
# To sort: | |||
#* Click '''Sort Ascending''' or '''Sort Descending''' in the same dropdown menu | |||
== Grouping Data == | == Grouping Data == | ||
'''Steps:''' | '''Steps:''' | ||
# Select | # Select the column that contains the values to group (e.g. Region or Category) | ||
# Go to '''Transform''' tab | # Go to the '''Transform''' tab | ||
# Choose operation | # Click '''Group By''' | ||
# In the dialog: | |||
#* Choose '''Basic''' or '''Advanced''' | |||
#* Select the column to group | |||
#* Choose operation (e.g. Count Rows, Sum, Average) | |||
# Click '''OK''' | |||
== Splitting Columns == | == Splitting Columns == | ||
'''Steps:''' | '''Steps:''' | ||
# Select a column (e.g. Full Name) | # Select a column (e.g. Full Name or "City, State") | ||
# Go to '''Home''' tab → '''Split Column''' | # Go to the '''Home''' tab → click '''Split Column''' | ||
# Choose method (e.g. | # Choose a method (e.g. '''By Delimiter''') | ||
# In the dialog: | |||
#* Choose the delimiter (e.g. comma, space, custom) | |||
#* Choose how to split (e.g. at each occurrence, left-most, right-most) | |||
# Click '''OK''' | |||
== Unpivoting Columns == | == Unpivoting Columns == | ||
'''Steps:''' | '''Steps:''' | ||
# | # Go to Excel → Data tab → Get Data → From File → From Workbook | ||
# Select | # Select and load '''Orders_2023.xlsx''' | ||
# Go to '''Transform''' tab | # In Navigator → select the worksheet → click '''Transform Data''' | ||
# In Power Query Editor: | |||
#* Select all columns representing months (e.g. Jan, Feb, Mar) | |||
#* Go to the '''Transform''' tab | |||
#* Click '''Unpivot Columns''' | |||
# This will convert columns into attribute-value pairs | |||
== Reordering Columns == | == Reordering Columns == | ||
'''Steps:''' | '''Steps:''' | ||
# | # In Power Query Editor, click and drag the column header to a new position | ||
# | # OR: right-click the column header → choose '''Move''' → select desired direction (Left, Right, To Beginning, To End) | ||
== Duplicating and Referencing Queries == | == Duplicating and Referencing Queries == | ||
'''Steps:''' | '''Steps:''' | ||
# Right-click | # In the Queries pane on the left side of Power Query Editor: | ||
# | #* Right-click the query you want to reuse | ||
# To create a full copy: choose '''Duplicate''' | |||
# | # To create a linked version that depends on the original: choose '''Reference''' | ||
== Additional Exercise: Cleanup and Pivot Prep == | == Additional Exercise: Cleanup and Pivot Prep == | ||
'''Goal:''' Prepare survey data for analysis | '''Goal:''' Prepare survey data for pivot table analysis | ||
'''Steps:''' | '''Steps:''' | ||
# | # Load '''Survey_Results.xlsx''' and remove rows with missing values in the satisfaction column | ||
# | # Rename the satisfaction column to lowercase (e.g. "satisfaction") | ||
# Group by satisfaction rating | # Group the table by satisfaction rating: | ||
# Load | #* Go to '''Transform''' → '''Group By''' → select satisfaction → operation: Count Rows | ||
# Load the final grouped table to a new worksheet in Excel | |||
== Extension: Conditional | == Extension: Add a Conditional Column == | ||
'''Goal:''' | '''Goal:''' Categorize satisfaction ratings into "Positive", "Neutral", or "Negative" | ||
'''Steps:''' | '''Steps:''' | ||
# | # In Power Query Editor → go to '''Add Column''' tab → click '''Conditional Column''' | ||
# | # Set: | ||
#* New column name: `SatisfactionLevel` | |||
#* If Satisfaction = 4 or 5 → "Positive" | #* If Satisfaction = 4 or 5 → then "Positive" | ||
#* | #* Else if Satisfaction = 3 → then "Neutral" | ||
#* | #* Else → "Negative" | ||
# Click OK | # Click '''OK''' | ||
---- | ---- |
Revision as of 09:54, 22 June 2025
Module 3: Data Transformation Techniques
Objective
Clean and reshape data using Power Query’s built-in tools in Excel.
Files Used
Removing Blank Rows and Unnecessary Columns
Steps:
- Go to Excel → Data tab → Get Data → From File → From Workbook
- Select and load Survey_Results.xlsx
- In Navigator → select the worksheet → click Transform Data
- In Power Query Editor → Go to Home tab → click Remove Rows → choose Remove Blank Rows
- To remove unwanted columns:
- Click the header of the column you want to remove
- Hold Ctrl and select additional columns if needed
- Right-click any selected header → choose Remove Columns
Changing Data Types
Steps:
- In Power Query Editor, locate the column you want to change
- In the column header, click the small data type icon (e.g. ABC123 or calendar icon)
- Select the appropriate data type: Text, Whole Number, Decimal Number, Date, etc.
- Alternatively, go to the Transform tab → click Data Type dropdown → choose desired type
Renaming Columns
Steps:
- In Power Query Editor, right-click on the column header you want to rename
- Click Rename
- Type the new column name and press Enter
- OR: double-click on the column name to rename directly
Filtering and Sorting Data
Steps:
- In Power Query Editor, locate the column by which you want to filter or sort
- Click the dropdown arrow in the column header
- To filter:
- Check/uncheck boxes or use number/text filters
- To sort:
- Click Sort Ascending or Sort Descending in the same dropdown menu
Grouping Data
Steps:
- Select the column that contains the values to group (e.g. Region or Category)
- Go to the Transform tab
- Click Group By
- In the dialog:
- Choose Basic or Advanced
- Select the column to group
- Choose operation (e.g. Count Rows, Sum, Average)
- Click OK
Splitting Columns
Steps:
- Select a column (e.g. Full Name or "City, State")
- Go to the Home tab → click Split Column
- Choose a method (e.g. By Delimiter)
- In the dialog:
- Choose the delimiter (e.g. comma, space, custom)
- Choose how to split (e.g. at each occurrence, left-most, right-most)
- Click OK
Unpivoting Columns
Steps:
- Go to Excel → Data tab → Get Data → From File → From Workbook
- Select and load Orders_2023.xlsx
- In Navigator → select the worksheet → click Transform Data
- In Power Query Editor:
- Select all columns representing months (e.g. Jan, Feb, Mar)
- Go to the Transform tab
- Click Unpivot Columns
- This will convert columns into attribute-value pairs
Reordering Columns
Steps:
- In Power Query Editor, click and drag the column header to a new position
- OR: right-click the column header → choose Move → select desired direction (Left, Right, To Beginning, To End)
Duplicating and Referencing Queries
Steps:
- In the Queries pane on the left side of Power Query Editor:
- Right-click the query you want to reuse
- To create a full copy: choose Duplicate
- To create a linked version that depends on the original: choose Reference
Additional Exercise: Cleanup and Pivot Prep
Goal: Prepare survey data for pivot table analysis
Steps:
- Load Survey_Results.xlsx and remove rows with missing values in the satisfaction column
- Rename the satisfaction column to lowercase (e.g. "satisfaction")
- Group the table by satisfaction rating:
- Go to Transform → Group By → select satisfaction → operation: Count Rows
- Load the final grouped table to a new worksheet in Excel
Extension: Add a Conditional Column
Goal: Categorize satisfaction ratings into "Positive", "Neutral", or "Negative"
Steps:
- In Power Query Editor → go to Add Column tab → click Conditional Column
- Set:
- New column name: `SatisfactionLevel`
- If Satisfaction = 4 or 5 → then "Positive"
- Else if Satisfaction = 3 → then "Neutral"
- Else → "Negative"
- Click OK
→ Continue to Module 4: Automation and Parameters
Return to Main Page