Power Query for Analysts: Difference between revisions
Line 313: | Line 313: | ||
#Alternatively, use an M expression: | #Alternatively, use an M expression: | ||
Table.SelectRows(PQ_sales, each [Cost] >= Parameters[MinCost] and [Cost] <= Parameters[MaxCost]) | Table.SelectRows(PQ_sales, each [Cost] >= Parameters[MinCost] and [Cost] <= Parameters[MaxCost]) | ||
Step 4: Building a Dynamic Data Source | Step 4: Building a Dynamic Data Source | ||
Line 323: | Line 323: | ||
#Instead of using a static file path, create a parameter for the file location. | #Instead of using a static file path, create a parameter for the file location. | ||
#Where FilePath is a user-defined parameter stored in PQ_parameters.xlsx. | #Where FilePath is a user-defined parameter stored in PQ_parameters.xlsx. | ||
=== Task === | === Task === |
Revision as of 14:40, 27 February 2025
Module 1: Introduction to Power Query and Basic Transformations
Objective
In this exercise you will learn to:
Import data from a CSV file into Power Query. Examine and adjust data types, focusing on converting a text-formatted date into an actual date type. Apply basic filtering to the data. Use external help (e.g., ChatGPT) for hints on creating custom M code without directly copying solutions.
Provided Data
You are provided with a downloadable CSV file named 📂File:PQ sales.csv that contains sales order data. The file includes the following columns:
- OrderID (integer)
- OrderDate (text, in a non-standard date format)
- Customer (text)
- Product (text)
- Quantity (integer)
- Cost (number)
Instructions
- Step 1: Import the Data
- Open Power Query in Excel or Power BI.
- Import the data from the sales.csv file.
- Observe that the OrderDate column is imported as text due to its format (dd/MM/yyyy).
- Step 2: Check and Convert Data Types
- Verify that each column has the correct data type.
- Manually convert the OrderDate column from text to date type.
- Hint: If you have trouble, consider asking ChatGPT for guidance on how to write an M function for converting text to a date.
- Step 3: Apply Basic Filtering
- Filter the dataset so that only rows where Cost is greater than 200 remain.
- Suggestion: Use the graphical interface of Power Query or write a simple M script to apply the filter. If needed, consult ChatGPT for ideas on how to #implement this filter.
- Step 4: Review and Save Your Work
- Confirm that the transformations have been applied correctly by reviewing the data preview.
- Save your query and document the steps you took.
Task
Complete the steps outlined above in Power Query. Experiment with the transformation options available and try to understand how each step affects your data. Use ChatGPT for hints or troubleshooting, but avoid copying complete solutions verbatim.
Module 2: Combining and Merging Data from Multiple Sources
Objective
In this exercise you will learn to:
Import data from multiple CSV files into Power Query. Merge (join) data from different sources based on a common key. Use a Left Outer Join to add customer details to sales orders. Leverage external help (e.g., ChatGPT) for hints on writing custom M code without copying complete solutions.
Provided Data
You are provided with two downloadable CSV files:
📂 File:PQ sales2.csv – Contains sales order data:
📂 File:PQ customers.csv – Contains customer information:
Instructions
- Step 1: Import the Data
Open Power Query in Excel or Power BI. Import data from both PQ_sales.csv and PQ_customers.csv. Verify that both queries load correctly.
- Step 2: Check and Convert Data Types
Confirm that each column has the appropriate data type in both queries. For example, note that the OrderDate column in PQ_sales.csv is imported as text due to its non-standard format. Hint: Use transformation functions if any adjustments are needed.
- Step 3: Merge the Data
Merge the PQ_sales.csv query with the PQ_customers.csv query. Use the Customer column as the matching key. Select a Left Outer Join so that every sales order is retained along with its corresponding customer details. Suggestion: If you’re unsure how to write the M code for this merge, ask ChatGPT for guidance on merging queries.
- Step 4: Review the Merged Data
Confirm that the resulting query includes additional columns (e.g., Region, CustomerSince) from the PQ_customers.csv file. Inspect the merged data to ensure that customer details have been correctly joined to the appropriate sales orders.
- Step 5: Save Your Work
Save your query and document the transformation steps you applied.
Task
Complete the steps outlined above using Power Query. Experiment with both the graphical interface and custom M code to perform the merge. Use external resources (e.g., ChatGPT) for hints or troubleshooting, but avoid copying complete solutions verbatim.
Module 3: Creating Custom Columns and Functions
Objective
In this exercise, you will learn to:
Create custom calculated columns using Power Query.
Use built-in Power Query functions to manipulate text, numbers, and dates.
Write custom functions in the M language to automate transformations.
Utilize ChatGPT to assist with writing and optimizing M code.
Provided Data
For this exercise, we will use the following datasets:
📂File:PQ sales2.csv (used in previous modules)
📂File:PQ discounts.csv (new dataset) - contains discount rates based on product type.
Instructions
- Step 1: Import the Data
- Open Power Query in Excel or Power BI.
- Import both PQ_sales.csv and PQ_discounts.csv.
- Ensure that both tables are loaded correctly.
- Step 2: Creating a Custom Column for Total Cost
- In the PQ_sales table, add a new custom column:
- Navigate to Add Column → Custom Column.
- Name it TotalCost.
- Create a formula to calculate the total cost as:
Quantity * Cost
- Click OK and check the results.
- Step 3: Applying Discounts Using a Merge
- Merge PQ_sales with PQ_discounts using the Product column as the key.
- Expand the DiscountRate column into PQ_sales.
- Add another custom column called DiscountedPrice:
[TotalCost] - ([TotalCost] * [DiscountRate])
- Verify that the new column correctly applies the discounts.
- Step 4: Creating a Custom Function in M
- Create a function to categorize products into different price ranges:
- Navigate to Home → Advanced Editor.
- Write an M function that takes Cost as an input and returns a category:
Low if Cost < 500
- Medium if Cost is between 500 and 1500
High if Cost > 1500
Hint: If you're unsure how to structure the function, ask ChatGPT: "How do I write an M function that categorizes prices into Low, Medium, and High?"
- Step 5: Assigning Categories
- In the PQ_sales table, add a custom column using the function.
- Name the column PriceCategory.
- Ensure that the categories appear correctly based on the Cost value.
Task
- ✔ Complete all steps in Power Query.
- ✔ Experiment with both the graphical interface and M code.
- ✔ Use ChatGPT to troubleshoot or improve your M script.
Module 4: Advanced Data Transformations in Power Query
Objective
In this module, you will learn:
- ✔ How to pivot and unpivot data in Power Query.
- ✔ How to split and merge columns for better data structuring.
- ✔ How to use conditional transformations.
- ✔ How to leverage ChatGPT for complex M scripting.
Provided Data
For this exercise, we will introduce a new dataset: PQ_sales_pivot.csv, which contains monthly sales figures for different products.
📂 File:PQ sales pivot.csv - Pivoted data structure:
Instructions
- Step 1: Import the Data
- Open Power Query in Excel or Power BI.
- Import the file PQ_sales_pivot.csv.
- Ensure that the table is loaded correctly.
- Step 2: Unpivot the Data
- The current table has a wide format, which is not ideal for analysis.
- Unpivot the month columns so that the data is structured as:
- Product
- Category
- Month
- Sales Amount
How to do this:
- Click Transform → Use First Row as Headers to ensure the correct column names.
- Select the month columns (Jan 2025, Feb 2025, etc.).
- Click Transform → Unpivot Columns.
- Rename the resulting columns:
- "Attribute" → Month
- "Value" → Sales Amount
- Step 3: Splitting and Merging Columns
- The Month column currently has values like "Jan 2025".
- Split this column into Month Name and Year:
- Select the Month column.
- Click Transform → Split Column → By Delimiter.
- Choose the space (" ") delimiter.
- Rename the new columns as Month Name and Year.
Merge Columns Example:
If you want to merge Product and Category, select both.
- Click Transform → Merge Columns.
- Use " - " as a separator (e.g., "Monitor - Electronics").
- Step 4: Adding Conditional Transformations
Add a new custom column called "Sales Performance":
if [Sales Amount] < 300 then "Low" else if [Sales Amount] >= 300 and [Sales Amount] < 800 then "Medium" else "High"
Ensure that the column correctly categorizes sales performance.
Task
- ✔ Complete all steps in Power Query.
- ✔ Experiment with unpivoting, splitting, merging, and conditional logic.
- ✔ Use ChatGPT to troubleshoot or improve your M script.
Module 5: Parameterization and Dynamic Queries in Power Query
=== Objective ===In this module, you will learn:*✔ How to create parameters in Power Query.*✔ How to use parameters to filter and control query results dynamically.*✔ How to build dynamic data sources based on user input.*✔ How to leverage ChatGPT for writing and optimizing M code for parameterization.
=== Provided Data ===For this exercise, we will use the PQ_sales.csv file and introduce a new dataset PQ_parameters.xlsx, which contains dynamic filter values.
📂 File:PQ parameters.xlsx - Parameter table for dynamic filtering:
Instructions
Step 1: Import the Data
- Open Power Query in Excel or Power BI.
- Import both datasets:
PQ_sales.csv (Sales transactions)
PQ_parameters.xlsx (Filter parameters)
- Ensure that the tables load correctly.
Step 2: Creating and Using Parameters Dynamically
- Load the PQ_parameters.xlsx table into Power Query.
- Ensure that the table has two columns: ParameterName and Value.
- Click on Transform → Use First Row as Headers to make sure column names are correctly applied.
- Convert the table into a record for easy reference:
Select the ParameterName column and pivot it so that each parameter becomes a column.
Click Transform → Pivot Column, setting Value as the values column.
This should create a record with fields StartDate, EndDate, MinCost, MaxCost.
- Ensure that all values are in the correct data type (Date for StartDate and EndDate, Number for MinCost and MaxCost).
- Now, the parameters are dynamically retrieved from the PQ_parameters.xlsx file.
Step 3: Applying Parameters to Filter Data
Filter Sales Data Based on Date Range:
- Open the PQ_sales query in Power Query.
- Ensure that the OrderDate column is in Date format. If not, change it to Date using "Transform → Data Type → Date".
- Click on the filter dropdown for the OrderDate column.
- Select "Date Filters → Custom Filter".
- In the filter conditions:
Set the first condition to is after or equal to and reference the StartDate from the record.
Set the second condition to is before or equal to and reference the EndDate from the record.
- Click OK to apply the filter.
- Alternatively, use an M expression to filter data:
Table.SelectRows(PQ_sales, each [OrderDate] >= Parameters[StartDate] and [OrderDate] <= Parameters[EndDate])
Filter Sales Data Based on Cost Range:
- Ensure that the Cost column is in Number format.
- Click on the filter dropdown for the Cost column.
- Select "Number Filters → Between".
- Set the first condition to greater than or equal to and reference the MinCost from the record.
- Set the second condition to less than or equal to and reference the MaxCost from the record.
- Click OK to apply the filter.
- Alternatively, use an M expression:
Table.SelectRows(PQ_sales, each [Cost] >= Parameters[MinCost] and [Cost] <= Parameters[MaxCost])
Step 4: Building a Dynamic Data Source
Make the File Path Dynamic:
Source = Excel.Workbook(File.Contents(Parameters[FilePath]), null, true)
- Instead of using a static file path, create a parameter for the file location.
- Where FilePath is a user-defined parameter stored in PQ_parameters.xlsx.
Task
- ✔ Create and use parameters to filter the data dynamically.
- ✔ Implement a dynamic data source with a parameterized file path.
- ✔ Use ChatGPT to troubleshoot or improve your M script.