Sign up for our FREE Power BI Bootcap
Close Icon

Power Query Shopping List

If you don't know, Power Query is a data connection technology that allows you to discover, connect, combine, and refine data across various sources. Power Query offers a robust and high-performing data management experience when used well. I repeat when used well.

If you're new to Power BI, the loose ideal that I follow is:

If it can be done in Power Query, it probably should be done in Power Query.

But that is oversimplifying, and to that end, this blog post will guide you through some tips and best practices to optimise your Power Query operations.

Selecting the Right Connector

Power Query offers a growing selection of data connectors ranging from TXT, CSV, and Excel files, databases such as Microsoft SQL Server, and popular SaaS services such as Microsoft Dynamics 365 and Salesforce.

If your data source isn't directly listed, you can use the ODBC or OLEDB connectors.

It feels weird to say this, but choosing the right connector is essential for maximising performance.

Ensure you get this step right.

Don't forget that each data connector adheres to an everyday experience with a stage called Data Preview.

Here, you can conveniently select your data and preview it before extraction.

Filter Early

Implement filters as early as possible in your query to avoid unnecessary computation.

Power Query takes advantage of your filters through query folding, resulting in increased performance.

Also, filter out irrelevant data to focus better on the task.

Power Query offers various filtering options. You can use the auto filter menu, which displays a unique list of values in your column, or type-specific filters for date, datetime, or even date timezone columns.

Perform Expensive Operations Last

Certain operations require reading the whole data source to yield results, making them slow to preview. If possible, perform such "expensive" functions last.

This will help reduce the amount of time you spend waiting for the preview to render each time you add a new step to your query.

Some examples of expensive operations are:

  1. Merging or Appending Large Data Tables: These operations require Power Query to compare and combine large volumes of data, which can be resource-intensive, especially if the tables have many rows or columns.
  2. Sorting Large Data Sets: Sorting can be resource-heavy, particularly for large data sets. This is because sorting requires comparing values and rearranging rows, which can be computationally expensive.
  3. Nested Calculations or Recursive Functions: These can be very costly because Power Query has to perform the same calculations or operations multiple times. This increases exponentially with the amount of data or complexity of the calculations.
  4. Complex Aggregations: Aggregation functions like SUM, AVERAGE, COUNT, etc., when used on large data sets or across many different groups, can significantly slow down your query.
  5. Row-Level Calculations or Transformations: When you have to perform a calculation or transformation for each row in a large data set, this can be very time-consuming and resource-intensive.

Temporarily Use a Subset of Your Data

If adding new steps to your query is slow, consider first performing a "Keep First Rows" operation and limiting the number of rows you're working against.

This way, you can add all the necessary steps quickly and then remove the "Keep First Rows" step later.

Useful when working on slower machines or through a proxy or remote desktop, too.

Ensure Correct Data Types

Having correct data types for your columns is crucial.

Features in Power Query are contextual to the data type of the column selected.

For example, specific options and transformations will be unavailable if the column type isn't set. It's also essential when working with type-specific filters clear to particular data types.

Some more performance, data integrity, and functionality issues are at play.

Bottom line - select the correct data type.

Document Your Steps

Remember to document your queries by renaming or adding descriptions to your steps, questions, or groups. This helps to keep your work organised and understandable, especially after you move on and some other developer picks up where you left off.

Next week we will go through a handy documentation hack that will speed up this process tenfold.

Embrace Modularity

If a query contains many steps, consider splitting it into multiple questions where one query references the next.

This makes your transformation phases easier to understand by breaking them into smaller, digestible parts.

It can get a little weird, so ensure comments/documentation inside and outside of Power BI.

Organise with Groups

Groups in Power Query are like folders for your queries. They help keep your work organised. You can even create groups within groups if necessary.

Again, just getting ourselves organised.

Leverage Parameters and Custom Functions

Parameters can make your queries more dynamic and flexible. They allow you to store and manage a value that easily.

Things like starting your calendar from the first date of a data set.

Parameters can make life easier.

Anyway, Power Query is a gem. There are a lot of resources out there, and it's wise to refine your skills regularly.

Actionable tips to take you from developer to Power BI business owner

Written by humans. We'll never share your data
Thank you! The Reports are yours!
Be sure to check your junk if you don't see our confirmation email.
Oops! Something went wrong while submitting the form.