As we learned in Part 1 and Part 2 in our blog series, data modeling in Power BI depends on a foundation of clean tables and the relationships we establish. When it comes to extending our data model with DAX formulas, the quality of our insights depend on our understanding of key concepts and how functions work together.
DAX, or Data Analysis eXpressions, is a functional query language designed by Microsoft with a familiar Excel look and feel. While DAX may appear simple on the surface, it is not easy to master. The statement “Dax is simple but not easy” was made popular by Alberto Ferrari from SQLBI. There are few statements that ring more true when learning DAX.
No DAX function or concept is difficult on its own, however combining elements together can lead to complexity. Regardless, the strong Excel user should enjoy some confidence early on, followed by some careful learning as more advanced DAX is required.
Below is the Dunnage Kruger effect. I saw this during a presentation and thought it was a great illustration of my own journey learning DAX as an Excel user.
The good news is, is that you are not alone; many of us have experienced a similar learning curve in DAX. The Excel users that push through the curve gain an appreciation for DAX, and an ability to provide insights on a much larger scale.
It is impossible to cover all that DAX has to offer in a blog post, but I do try to provide enough concept and hands on examples to get the new user rolling with Power BI.
This also turned into a long post, so I have split the DAX posts into two separate posts. Post one will cover core concepts and two simple DAX examples. Next week I will post two more DAX examples where we will modify the evaluation context in our expression.
Before we start with DAX, let’s get familiar with some foundational DAX concepts:
“Evaluation Context identifies the current environment for each value, then works with our tables to return the correct result,”
Whenever DAX executes a formula, it uses one of two calculation approaches or Evaluation Contexts. The two options are Filter and Row Context.
Evaluation context defaults to either Filter Context or Row Context depending on where the value sits in Power BI. If the value sits in a report visual, then filter context works to filter the tables in your data model before the expression calculates. However, if the value sits in a calculated column in your data table, then by default row context calculates each row result, one row at a time.
Filter Context can’t do what Row Context can do and vice versa. There are times you will use one over the other, and times when you’ll use both in the same formula.
“Filter Context evaluates your formula expression only after filtering tables in the model that correspond with the filters surrounding your report value.”
Filter Context is used when our values sit in a report visual. DAX evaluates the relevant information surrounding each cell in the report visual, then uses that information to filter our tables before calculating the new cell result.
We can use an Excel Pivot Table to visualize the steps necessary to return the correct results. Imagine we have the below table on the right, as well as the corresponding Pivot Table summary on the left.
As fields are pulled into the Pivot row (Red, Blue), column (Mo1, Mo2), and value box (Amount). Each cell value in the Pivot Table instantly returns a specific result based on the Row and Column coordinates in the Pivot Table.
What happens under the hood between our data table and our Pivot Table to make this work?:
1. The cell that represents Red and Mo 1 in the Pivot Table sends Excel back to the data table with instructions to get the total for Red and Mo 1.
2. Excel goes back to the data table and filters the table to view only the rows that apply to Color = Red and Month = Mo 1.
3. After filtering, it totals the remaining rows under the Amount column.
4. Excel then brings this total amount back to the Pivot Table, dropping it off in the original cell. For Red Mo 1, our result is 3.
Excel moves on to the next cell to do 1-4 all over again for all the other cells in the pivot table.
In Power BI the same concept holds true under the Filter Context. When DAX evaluates a formula under a filter context:
1. The Filter Context is first defined by surrounding filters, including row and column intersects in the visual, as well as any external filters (such as report slicers) that further narrow down the result in the cell.
2. DAX takes the Filter Context from above and applies filters to all tables that relate to the value. In doing so, the remaining rows in the entire model pertain only to that specific report cell.
3. After the model is filtered, the expression evaluates over the remaining rows in the model.
4. The expression results in an aggregation (sum, max, average, etc.) over the filtered rows. This result is returned to the visual and becomes the updated report value under its original filter context.
“Row Context evaluates your formula expression by iterating (repeating) over a table row-by-row.”
Row context is used by default when we add a calculated column to our table in the data model. Row Context serves a completely different but equally valuable purpose. Instead of filtering tables and calculating over remaining rows, Row Context isolates each row one at a time to produce the correct result. In Power BI terms, we say that Row Context iterates (repeats row by row) over tables.
To tie this back to our pivot table example, it would be similar to adding a calculated column to our data sheet that serves as the source to our pivot table. We can add as many calculated columns as we need on our data table, then later drag them into our pivot table as summarized (filter context) fields in our pivot table.
When writing DAX, it is critical to know which context you are in. As we’ll see in our examples, writing DAX formulas that don’t line up with our evaluation context will likely produce unexpected results.
There are only two ways to create formulas in DAX, Calculated Columns and Measures. The Excel person will feel right at home with calculated columns so this will be our first example.
Let’s open our file from where we left off in Post 2 and get ready to write our first DAX example.
If you are just joining us, you can start from here and download the Power BI report right where we left off in Post 2.
With our report open, lets add a table visual from the Data view in Power BI.
Next drag the following columns form the Part table and our Stepping Stone Sales table:
Since our Doc# field is numeric, Power BI assumes we want to sum the values of the doc numbers in our table. Lets remove the auto sum by clicking on the arrow to the right of the Doc# field and choosing “Do not Summarize”.
Our table should now looks like this:
Example 1: Calculated Columns
“Calculated columns materialize values in a column, and thus defaults to using row Context and iterates row by row”
Calculated Columns are very similar to adding an additional column on to our spreadsheet. When we calculate a new column in Excel, we write the formula based on the first row, then click and drag that formula all the way down the bottom of our sheet. Somehow Excel knows that we expect that formula to repeat for each new row.
The same holds true in DAX. When we choose to add a calculated column, we by default call Row Context for our expression. We will override this Evaluation Context in a later example, but for this example, our calculated column expression automatically activates row context, and thus row by row evaluation.
Let’s try this out in our model! Click on the Table icon from the left side of the report page, and choose the Stepping Stone table.
Either right click over a column in the table and choose ‘New Column’, or click on the ‘New Column’ icon in the ribbon.
Note: Our cursor may not automatically be active in the formula bar. Just click in the formula bar to edit the placeholder text ‘Column =‘.
Type the following formula just as written below:
Tip: To save time, take advantage of intellisense and start out by just typing “Qty” then the tab key.
Note: With calculated columns, that we need to use the full column reference including both the table (SteppingStone) and column name (Unit Price). We’ll see shortly that the same does not apply to Measures.
Now lets drag this new calculated column into our table.
As expected, our calculated column displays the appropriate total sales amount both at the sales order row level as well as at the grand total level.
This ensures that no matter how we group our sales in our report, we will always step row by row, and arrive at the correct result on our visual.
Let’s make sure that our math holds true if we remove the Document Number, Quantity, and Unit Price fields.
Even after removing our Doc# field, we still show the proper amounts calculated on each row, including the Total line of $533.
Example 2: Measures
“Measures materialize in report visuals and call on Filter Context. We filter the model first, then return the result over surviving rows.”
Contrast this to the default behavior now in a Measure, which has a different purpose than our calculated column did under row context. Measures ultimately depend on Filter Context to filter the model, then evaluate the expression over surviving table rows to ensure the correct result.
Measures may include multiple steps and arguments, but in the end, every measure must provide just one value as the answer for the cell (Referred to as a Scalar Value). Thus we can not reference a column name without using an aggregation function such as =SUM, =MAX, =AVERAGE, to ensure a single value result.
Nothing highlights the difference between Row and Filter Context better than trying to replicate our calculated column from above, but in a Measure.
Let’s make an attempt at this in a measure, and look at the result.
Right click over a column in the table columns and choose “New Measure”, or choose the New Measure button from the ribbon above.
Lets write out our “Measure Sales” then drag the measure into our table next to “Row Sales” and compare the result:
What caused our incorrect result? To understand this better, we need to walk through the process of filter context.
By comparing our Calculated Column/Row Context on the left, to our Measure/Filter Context on the right, we see the effect of filtering our table FIRST, THEN aggregating our expression using a SUM() for each column Qty and Unit Price.
It is impossible to write a measure like this without calling upon Row Context (Which we will learn to call successfully in a measure in our next post) and expect the correct result. For this expression we must use the row context to evaluate the Qty * Unit Price, row-by-row.
Note: In the Grand total line, we don’t see a sum of the numbers listed in the column. Rather than calculating a sum of the amounts visible in the column, DAX treats the grand total row just like any other row in the visual… except it clears all filters over the Product Class, and then performs the expression. This means that our quantity column grand total is multiplied against our Unit Price grand total. This is why we get $8,160 instead of the sum of visible values above the Grand Total line, which would otherwise total 2,902.
In this post we’ve touched on key DAX concepts and walked through a couple of basic DAX calculations. While DAX is simple, mastering DAX takes a commitment to learn fundamentals, hands on practice, and gain experience troubleshooting unexpected results.
For the Excel person that pushes through the learning curve, there is light at the end of the tunnel. My recommendation is to find a good DAX book in addition to other learning resources early on, wrap your head around the foundational concepts as you start to practice.
As always, thanks for reading! – Mark