As we learned in Part 1 and Part 2 in our blog series, data modeling in Power BI depends on a foundation of clean tables and the relationships we establish. When it comes to extending our data model with DAX formulas, the quality of our insights depend on our understanding of key concepts and how functions work together.
DAX, or Data Analysis eXpressions, is a functional query language designed by Microsoft with a familiar Excel look and feel a basic level. While DAX may use many of the functions we know and love in Excel, DAX quickly progresses into new functions and concepts to learn.
No DAX function or concept by itself is difficult to learn, however combining or nesting functions can lead to complexity. Another source of complexity reveals itself when we consider that DAX is a functional and query language. We use DAX functions to produce calculated results, but we also have the ability to write table queries that calculate our own virtual (or physical) table available to use in more complex DAX calculations. The dedicated DAX learner should enjoy some confidence early on, followed by some careful learning as more advanced DAX is required.
Below is the Dunnage Kruger effect. I think this is a great illustration of my own journey learning DAX as an Excel user.
It is impossible to cover all that DAX has to offer in a blog post, but I do try to provide core concepts and hands on examples to get the new user rolling.
This is a long post, so I have split the DAX posts into two separate posts. Post one will cover core concepts and two simple DAX examples. Next week I will post two more DAX examples where we will modify the evaluation context in our expression.
Before we start with DAX, let’s get familiar with some foundational DAX concepts:
“Evaluation Context identifies the current environment for our written function, then works with tables to return the result,”
When DAX executes a formula, it starts by identifying one of two Evaluation Contexts, Filter or Row Context.
Filter context can be physically seen by carefully observing each cell in a Power BI matrix visual on our report visual. Each cell result is influenced by the row and column labels seen in our report. Add the additional user selections filtered by slicers, and we realize that one function can bring back different results based on a combination of all of the applied labels and selected slicer filters. Appropriately, we call this “Filter Context”. As a matter of fact, everything we drag onto the reporting canvas is ultimately based on a filter context result.
Row Context on the other hand is used by default if the calculated function sits in a calculated column in a physical table in ur model. In this DAX does not drive the result from an initial awareness of various labels and filters, but rather just identifies all the rows in a table, and calculates our expression one row at a time. This is not unlike adding a column in Excel, and dragging that one formula down the side of our sheet. Same formula, applied row by row, to return the result. Appropriately, we call this concept “Row Context”. Note that row context happens naturally in a calculated column, but we can also nest a row context calculation within our written measures. We’ll learn about measures later in this series.
Filter Context can’t do what Row Context can do and vice versa. There are times you will use one over the other, and times when you’ll use both in the same formula.
“Filter Context evaluates your calculation only after filtering tables in the model that correspond with the row and column labels and other filters that surround your report value.”
Filter Context happens when our values sit in a report visual. DAX evaluates the relevant information surrounding each cell in the report visual, then uses that information to filter physical tables in our model before calculating the new cell result.
It might be a bit overwhelming to think that every visit cell takes all of the current filters for a specific cell, goes back to the data model, and filters and reduces our data model tables in a way that matches the visual cell’s row column and slicer filters, then runs the math expression to return the correct result just for that one cell in our matrix, but that is exactly what happens.
A measure is a formula that we write as an extension of our model. We don’t see the results of a measure in a physical table in our model. Not having to materialize the result and store a measure in our model helps us reduce the physical weight of the tables in our model. In order to see results from a measure, we need to include it in our report visual on our Power BI canvas.
We can use an Excel Pivot Table to visualize the steps necessary to return the correct results. Imagine we have the below table on the right, as well as the corresponding Pivot Table summary on the left.
As fields are pulled into the Pivot row (Red, Blue), column (Mo1, Mo2), and value box (Amount). Each cell value in the Pivot Table instantly returns a specific result based on the Row and Column coordinates in the Pivot Table.
What happens under the hood between our data table and our Pivot Table to make this work?:
1. The cell that represents Red and Mo 1 in the Pivot Table sends Excel back to the data table with instructions to get the total for Red and Mo 1.
2. Excel goes back to the data table and filters the table to view only the rows that apply to Color = Red and Month = Mo 1.
3. After filtering, it totals the remaining rows under the Amount column.
4. Excel then brings this total amount back to the Pivot Table, dropping it off in the original cell. For Red Mo 1, our result is 3.
Excel moves on to the next cell to do 1-4 all over again for all the other cells in the pivot table.
In Power BI the same concept holds true under the Filter Context. When DAX evaluates a formula under a filter context:
1. The Filter Context is first defined by surrounding filters, including row and column intersects in the visual, as well as any external filters (such as report slicers) that further narrow down the result in the cell.
2. DAX takes the Filter Context from above and applies filters to all tables that relate to the value. In doing so, the remaining rows in the entire model pertain only to that specific report cell.
3. After the model is filtered, the expression evaluates over the remaining rows in the model.
4. The expression results in an aggregation (sum, max, average, etc.) over the filtered rows. This result is returned to the visual and becomes the updated report value under its original filter context.
“Row Context evaluates your formula expression by iterating (repeating) over a table row-by-row.”
Row context is used by default when we add a calculated column to our table in the data model. Row Context serves a completely different but equally valuable purpose. Instead of filtering tables and calculating over remaining rows, Row Context isolates each row one at a time to produce the correct result. In Power BI terms, we say that Row Context iterates (repeats row by row) over tables.
To tie this back to our pivot table example, it would be similar to adding a calculated column to our data sheet that serves as the source to our pivot table. We can add as many calculated columns as we need on our data table, then later drag them into our pivot table as summarized (filter context) fields in our pivot table.
When writing DAX, it is critical to know which context you are in. As we’ll see in our examples, writing DAX formulas that don’t line up with our evaluation context will likely produce unexpected results.
There are only two ways to create formulas in DAX, Calculated Columns and Measures. The Excel person will feel right at home with calculated columns so this will be our first example.
Let’s open our file from where we left off in Post 2 and get ready to write our first DAX example.
If you are just joining us, you can start from here and download the Power BI report right where we left off in Post 2.
With our report open, lets add a table visual from the Data view in Power BI.
Next drag the following columns form the Part table and our Stepping Stone Sales table:
Since our Doc# field is numeric, Power BI assumes we want to sum the values of the doc numbers in our table. Lets remove the auto sum by clicking on the arrow to the right of the Doc# field and choosing “Do not Summarize”.
Our table should now looks like this:
Example 1: Calculated Columns
“Calculated columns materialize values in a column, and thus defaults to using row Context and iterates row by row”
Calculated Columns are very similar to adding an additional column on to our spreadsheet. When we calculate a new column in Excel, we write the formula based on the first row, then click and drag that formula all the way down the bottom of our sheet. Somehow Excel knows that we expect that formula to repeat for each new row.
The same holds true in DAX. When we choose to add a calculated column, we by default call Row Context for our expression. We will override this Evaluation Context in a later example, but for this example, our calculated column expression automatically activates row context, and thus row by row evaluation.
Let’s try this out in our model! Click on the Table icon from the left side of the report page, and choose the Stepping Stone table.
Either right click over a column in the table and choose ‘New Column’, or click on the ‘New Column’ icon in the ribbon.
Note: Our cursor may not automatically be active in the formula bar. Just click in the formula bar to edit the placeholder text ‘Column =‘.
Type the following formula just as written below:
Tip: To save time, take advantage of intellisense and start out by just typing “Qty” then the tab key.
Note: With calculated columns, that we need to use the full column reference including both the table (SteppingStone) and column name (Unit Price). We’ll see shortly that the same does not apply to Measures.
Now lets drag this new calculated column into our table.
As expected, our calculated column displays the appropriate total sales amount both at the sales order row level as well as at the grand total level.
This ensures that no matter how we group our sales in our report, we will always step row by row, and arrive at the correct result on our visual.
Let’s make sure that our math holds true if we remove the Document Number, Quantity, and Unit Price fields.
Even after removing our Doc# field, we still show the proper amounts calculated on each row, including the Total line of $533.
Example 2: Measures
“Measures materialize in report visuals and call on Filter Context. We filter the model first, then return the result over surviving rows.”
Contrast this to the default behavior now in a Measure, which has a different purpose than our calculated column did under row context. Measures ultimately depend on Filter Context to filter the model, then evaluate the expression over surviving table rows to ensure the correct result.
Measures may include multiple steps and arguments, but in the end, every measure must provide just one value as the answer for the cell (Referred to as a Scalar Value). Thus we can not reference a column name without using an aggregation function such as =SUM, =MAX, =AVERAGE, to ensure a single value result.
Nothing highlights the difference between Row and Filter Context better than trying to replicate our calculated column from above, but in a Measure.
Let’s make an attempt at this in a measure, and look at the result.
Right click over a column in the table columns and choose “New Measure”, or choose the New Measure button from the ribbon above.
Lets write out our “Measure Sales” then drag the measure into our table next to “Row Sales” and compare the result:
What caused our incorrect result? To understand this better, we need to walk through the process of filter context.
By comparing our Calculated Column/Row Context on the left, to our Measure/Filter Context on the right, we see the effect of filtering our table FIRST, THEN aggregating our expression using a SUM() for each column Qty and Unit Price.
It is impossible to write a measure like this without calling upon Row Context (Which we will learn to call successfully in a measure in our next post) and expect the correct result. For this expression we must use the row context to evaluate the Qty * Unit Price, row-by-row.
Note: In the Grand total line, we don’t see a sum of the numbers listed in the column. Rather than calculating a sum of the amounts visible in the column, DAX treats the grand total row just like any other row in the visual… except it clears all filters over the Product Class, and then performs the expression. This means that our quantity column grand total is multiplied against our Unit Price grand total. This is why we get $8,160 instead of the sum of visible values above the Grand Total line, which would otherwise total 2,902.
In this post we’ve touched on key DAX concepts and walked through a couple of basic DAX calculations. While DAX is simple, mastering DAX takes a commitment to learn fundamentals, hands on practice, and gain experience troubleshooting unexpected results.
For the Excel person that pushes through the learning curve, there is light at the end of the tunnel. My recommendation is to find a good DAX book in addition to other learning resources early on, wrap your head around the foundational concepts as you start to practice.
As always, thanks for reading! – Mark
All things Power BI