The following strategies can be implemented in order to manage and improve the Power Query M code that transforms large datasets in Power BI.
Minimize Steps and Avoid Excess Transformations: Processing time increases with each additional step in Power Query. Therefore, it is best to limit the number of transformation steps, particularly those involving operations such as merges and groupings, which tend to make performance worse. Do not include unnecessary steps that will not help achieve the desired output.
Use Query Folding Whenever Feasible: Whenever the provided source system supports it, instead of processing operations in Power BI, data modifications are pushed down to the database source. This is even better, especially with SQL Server or any other database that has query folding capability. This also means that taking data transformations to the source will help cut down the data load time tremendously. To confirm query folding, right-click on each step and click on “View Native Query.”
Filter Early, Aggregate Early: Do not wait until the transformed data is almost ready to start filtering or aggregating it to manageable amounts. Most of the time, there is no necessity to carry all columns and rows with the data, which can easily be aggregated. Carrying out such processing early in load processes tends to speed up processing in Power BI and also save on memory resources.
Make Use of Variables When Writing M Code: Using variables when compiling any M code, especially an original one, facilitates more transformations since it allows the redeployment of the values worked out without having to carry out the transformations again. It also enhances the visual composition of the code.
Disable Those Intermediate Queries from Loading Data: Where data shaping is through intermediate queries, ensure that they are marked as ‘Enable Load’ unchecked so that Power BI can avoid loading data that is tight and slim unnecessarily.
Apply Buffer Function Efficiently: Table. The buffer () function is known to enhance performance by caching data in memory. However, it should be used judiciously. It proves useful when some transformation is stepped, and the data needs to be maintained throughout the steps, but if overused, it can lead to increased memory consumption.
Avoid Creating Unnecessary Custom Columns: If this limit can be set, it is best to use inbuilt transforms or calculated columns after data load in Power BI rather than custom columns, as they are, in most cases, inefficient in Power Query.
Employing these user strategies helps in handling and improving the Power Query M code writing, and implementing data transformations facilitates speed and effectiveness, even with large data sets.