Minimize
02

A common calculation required in business is moving averages. How can we create it in DAX?  Lets walk through the calculation.

Imagine we have the following PowerPivot table:

image

In order to calculate a three month moving average, we first need to be able to grasp a reference to the last two months and the current month – for each of the months listed on the table.  In other words, if I am in September of 2001, I should be able to grab July, August and September of 2001.    How do we do this in DAX?

Using DAX filter context you have the ability to make a reference to the in-memory date table, and - for each month in the pivot table – filter the table only for the current month and the last 2 prior months.  Lets see this in action:

Lets start by trying to grab a reference to the current month. To test it, we can use DAX’s VALUES function which returns a table with distinct values depending on the current context.  As we want to “grab” the current month, we try creating a measure using this expression:

=VALUES(DimDate[EnglishMonthName])

DimDate is the in-memory table in the PowerPivot window that contains all dates, and “EnglishMonthName” is a column on that table.  This is the column that has been placed on the pivot table.

The result we get is an error.  It says:

ERROR - CALCULATION ABORTED: MdxScript(Sandbox) (5, 57) A table of multiple values was supplied where a single value was expected.

Hmm.   The error says there is a table with multiple values.  Clearly, each cell’s output should represent only one value.  But.. why do we have multiple values when there is only one month in context?   The reason is that there is a grand total row (not shown above).   In the case of the grand total, there ARE multiple month values.  The way to avoid that error is by ensuring only one distinct month value is in context.   Try this now:

=IF(COUNTROWS(VALUES(DimDate[EnglishMonthName])) = 1, VALUES(DimDate[EnglishMonthName]))

image

Great. so now we have a reference to the current month.   Now lets try getting a reference to two months ago.   We can do this using the function PARALLELPERIOD. 

=IF(COUNTROWS(VALUES(DimDate[EnglishMonthName])) = 1, PARALLELPERIOD(DimDate[FullDateAlternateKey], -2, MONTH) )

We get an error again. As you notice, the PARALLELPERIOD function takes a date for its first parameter, not a month.  In the context of the month, there are multiple dates.  When using PARALLELPERIOD, we get an output of multiple dates and that cannot be represented in once cell unless we tell the function which date to use.  So lets try this now:

=IF(COUNTROWS(VALUES(DimDate[EnglishMonthName])) = 1, FIRSTDATE(PARALLELPERIOD(DimDate[FullDateAlternateKey], -2, MONTH)))

The result is what we expect:

image

So if we get the first date 2 months ago, shouldn’t we capture the three months by getting the last date of the current month? Yes! and one way to accomplish that is by using the following expression:

=IF(COUNTROWS(VALUES(DimDate[EnglishMonthName])) = 1, LASTDATE(PARALLELPERIOD(DimDate[FullDateAlternateKey], 0, MONTH)))

image

Ok, so now we have the boundaries of dates we need.  In DAX, a filter context can add or restrict the current context based on the expression used.  Using CALCULATE we can test we are properly aggregating the months we need:

=IF(COUNTROWS(VALUES(DimDate[EnglishMonthName])) = 1,
    CALCULATE(
        SUM( FactResellerSales[SalesAmount] ),
            DATESBETWEEN(
                DimDate[FullDateAlternateKey],
                FIRSTDATE(PARALLELPERIOD(DimDate[FullDateAlternateKey], -2, MONTH)),
                LASTDATE(PARALLELPERIOD(DimDate[FullDateAlternateKey], 0, MONTH))
            )

    )
)

We use the DATESBETWEEN function to return a one column table starting two months ago and ending on the last date of the current month. DAX will then add all the sales amount data over that ‘calculated’ table. The result?

image

What?  The formula gives the exact same result as the regular “Sales Amount” measure? 

The reason is that the expression is evaluated on the current’s month context.  In order to force the evaluation to happen across months, we must force the filter context to evaluate all dates (not just the current month).  We do this by adding one more filter to the expression:

=IF(COUNTROWS(VALUES(DimDate[EnglishMonthName])) = 1,
    CALCULATE(
        SUM( FactResellerSales[SalesAmount] ),
            DATESBETWEEN(
                DimDate[FullDateAlternateKey],
                FIRSTDATE(PARALLELPERIOD(DimDate[FullDateAlternateKey], -2, MONTH)),
                LASTDATE(PARALLELPERIOD(DimDate[FullDateAlternateKey], 0, MONTH))
            ), ALL(DimDate)
    )
)

image

Great! so now we have a 3 month moving sum.  If you are still following me, all else we need to do is change the aggregation from sum to average.   Simple change, right?

=IF(COUNTROWS(VALUES(DimDate[EnglishMonthName])) = 1,
    CALCULATE(
        AVERAGE( FactResellerSales[SalesAmount] ),
            DATESBETWEEN(
                DimDate[FullDateAlternateKey],
                FIRSTDATE(PARALLELPERIOD(DimDate[FullDateAlternateKey], -2, MONTH)),
                LASTDATE(PARALLELPERIOD(DimDate[FullDateAlternateKey], 0, MONTH))
            ), ALL(DimDate)
    )
)

The result, however , is not what we expect.

image

Why?  Each DAX calculation “activates” certain rows on the tables loaded into the PowerPivot window.  For example, for July of 2001 the reported sales amount is 489,329.  However, this is an aggregate number: the underlying data is at the date level.   When we execute a DAX average function, we ask the PowerPivot engine to compute the average on the active rows, that is, all relevant rows at the level of granularity of the fact table. The output of the average calculation is, then, the correct daily average: not what we want.

Instead, we can add the sales amount for the period and then divide by the distinct count of months in the current context. Here is the complete expression:

IF(COUNTROWS(VALUES(DimDate[EnglishMonthName])) = 1,
    CALCULATE(
    SUM( FactResellerSales[SalesAmount] ) /  COUNTROWS(  VALUES ( DimDate[EnglishMonthName] ) )  ,
            DATESBETWEEN(
                DimDate[FullDateAlternateKey],
                FIRSTDATE(PARALLELPERIOD(DimDate[FullDateAlternateKey], -2, MONTH)),
                LASTDATE(PARALLELPERIOD(DimDate[FullDateAlternateKey], 0, MONTH))
            ), ALL(DimDate)
    )
)

That’s it!  The result of the final formula is displayed below.

image

powered by metaPost
Posted in: Blog

Comments

There are currently no comments, be the first to post one.

Post Comment

Only registered users may post comments.

MSBiCentral Bloggers Minimize

Latest Entries Minimize
Using VBA to enhance interactivity of PowerPivot reports by JavierGuillen
Presenting for PowerPivot Analytics Expo (hosted by SSWUG on July 15) by JavierGuillen
PowerPivot Analytics Expo
Quartile, Percentile and Median in PowerPivot by JavierGuillen
Using MDX in a PowerPivot model
Median calculation in PowerPivot / DAX by JavierGuillen
How to calculate median over a PowerPivot measure
Calculating moving averages in DAX by JavierGuillen
After SSRS 2008 R2 Upgrade: Some Reports Hang by wsnyder
You just upgraded to SSRS 2008 R2 and some reports are hanging up, while others seem to work fine. How do you debug for this, and what is the fix?
SSRS SystemOverflowexception: Value was either too large or too small for an Int32 by wsnyder
You install or upgrade to SSRS 2008R2 Gold, and get the following error "SSRS SystemOverflowexception: Value was either too large or too small for an Int32". Here is what is going on....
PASS Virtual Chapter Presentation Slides on PowerPivot / DAX by JavierGuillen
Presentation slides for PASS VC webinar on PowerPivot and DAX
Business Intelligence Prototyping with PowerPivot by JavierGuillen
PowerPivot can be a really powerful tool for prototyping Business Intelligence projects.
Speaking for PASS Business Intelligence Virtual Chapter by JavierGuillen
Page 1 of 3First   Previous   [1]  2  3  Next   Last   

Copyright 2004-2012 MSBICentral.com Terms Of Use Privacy Statement