Entries for 'Javier Guillen'

10

PowerPivot currently has no programmatic interface, so there is really no automated way to leverage its capabilities (other than those provided by PowerPivot for SharePoint).  However, we must remember that one way to consume PowerPivot data is through an Excel pivot table (there are others, like Excel Cube functions).  Pivot tables do have an API, which allows for a certain degree of programmatic interactivity with PowerPivot data.

A client recently asked me if a PowerPivot report could behave in this way:  upon expansion of a row lable, a graph should display only the ‘child’ members of the selected row.  A screenshot will probably help in explaining the desired effect:

image

One may be inclined to think this functionality is now possible but keeping in mind that pivot tables do have an API, it was worth giving it a try.  As you may have seen on my prior blog entries, PowerPivot automatically generates an interface of measures and dimensions.  We can use this interface to dynamically generate the desired effect.

The first thing we must do is generate a second pivot table with countries in the row labels along with a slicer for the ‘Sales Territory’ (the one the user will be clicking on). I call this second pivot table a utility pivot table as it is not meant to be shown to the user, but will allow us to get the job done. It will not only be sliced by the same ‘Calendar Year’ as the original one, but will serve as the source for chart data.

image

In order to have both tables react to the slicer, we must configure the ‘Calendar Year’ slicer to connect to both.  This can be easily done in the slicer ribbon tab:

image

Now all we have to do is generate a VBA script to take the user selection of ‘SalesTerritoryGroup’ (in the slicer) and filter the utility table as well. First we must pick the right event handler to capture a user click. For this we can use the Worksheet_PivotTableUpdate event handler.  Once the event is triggered, we can capture the text on the active cell, and pass that to the slicer on the utility pivot table. 

 

Private Sub Worksheet_PivotTableUpdate(ByVal Target As PivotTable)
         
    ActiveWorkbook.SlicerCaches("Slicer_SalesTerritoryGroup").VisibleSlicerItemsList _
        = Array( _
        "[DimSalesTerritory].[SalesTerritoryGroup].&[" & ActiveCell.Text & "]")
 
End Sub

 

“Slicer_SalesTerritoryGroup” is the name of the slicer of the utility pivot table.   Upon pivot table update, we pass to that slicer a reference to the fully qualified dimension member which the user has just clicked on.  The result is a chart that dynamically shows the child countries of the territory the user just double clicked on.  This happens at the same time the row expands to show those children in the pivot table itself:

image

Though the solution may not be perfect (the effect happens not only on expansion but also when collapsing a row), it is still an interesting way to add functionality to a PowerPivot report. 

powered by metaPost
Posted in: Blog
12
This Friday July 15 I will be presenting on DAX as part of the PowerPivot Analytics Expo.   There is a great line-up of speakers, covering a number of topics on PowerPivot technology - from Data Mining to SharePoint integration.  

I will be online during my presentation to answer questions. Don't miss it !!

For more information, visit
http://www.vconferenceonline.com/event/sessions.aspx?id=281&offset=7


Posted in: Blog
08

Lets examine what the options are for calculating Quartile / Percentiles in PowerPivot.  On my last post, I explained one way to calculate Median using pure DAX.   Though the calculation works, don’t expect it to be speedy when dealing with large datasets;  The reason for this is that it relies on row context iterations to rank values prior to selecting the median.  

As similar technique can be used to calculate Quartile or Percentile, following this blog entry.  However, the result is not always optimal:  not only the calculation can be slow, be the results do not match Excel’s native functions.    The reason for this is that Excel uses interpolation to calculate both Quartile and Percentile, whereas the DAX expression above picks one value out of the set (in other words, its just an approximation).

Can we use a technique to match Excel’s results within a PowerPivot model?   The answer is yes, though the technique discussed here does not use DAX.   What I want to share is another way of using the inherent capabilities of PowerPivot: being itself a version of SQL Server Analysis Services, it is actually able to understand MDX.  It is quite interesting to notice that PowerPivot can behave in ways that reveal its SSAS nature:  from the ability to respond to Excel cube functions to generating a trace file which can be read in SQL Profiler.

So what does this have to do with calculating Percentile?   Well, even though DAX does not include any of those functions you can leverage PowerPivot’s multidimensional interface to create an MDX calculated member that will encapsulate each of those functions.  Lets see how.

In order to see PowerPivot queries to the Vertipaq engine in MDX format, we can use a free codeplex add-in called ‘OLAP PivotTable Extensions’ (download it here, and be sure to select the correct bitness version).  This add-in was created with SSAS in mind, but – without any further effort from the add-in itself – PowerPivot responds to it. Once installed, create a PowerPivot table and then use the add-in to view the MDX query (select the ‘MDX’ tab on the window that appears when you click on the OLAP Pivot Table Extensions menu item)

image

Sample PowerPivot table using AdventureWorks data

The MDX query we get is:

SELECT
    NON EMPTY Hierarchize(
        {DrilldownLevel({[DimDate].[EnglishMonthName].[All]},,,INCLUDE_CALC_MEMBERS)})
        DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME ON COLUMNS 
FROM [Sandbox]
    WHERE (
        [DimDate].[CalendarYear].&[2004],
        [Measures].[Sum of SalesAmount]
        )

So lets create a calculated member to compute Median.

Select the ‘Calculations’ tab on the add-in menu.  This is used to submit calculated members along with the rest of the pivot table query.  Type the following query:

Median(                                  
        (
         [DimDate].[CalendarYear].CurrentMember *
         [DimDate].[EnglishMonthName].[All].Children
        ) 
        , CoalesceEmpty ( [Measures].[Sum of SalesAmount] , 0 )                          
      )

This should look in the add-in window as follows:

image

The result matches Excel regular ‘Median’ function:

image

Lets examine the actual query.   MDX has a built-in ‘Median’ function ( see definition here ).   As you can see, the expression uses parameters that look like regular SSAS dimensions.  This is the PowerPivot data model though, and we didn’t have to use Visual Studio to create it: the PowerPivot engine automatically creates dimensions based on the tabular model loaded into Vertipaq.

The measure [Sum of SalesAmount] is the aggregate PowerPivot created based on the Sales Amount column of the in-memory fact table.  Once it became an actual measure in the PowerPivot model, we can use it as the measure expression of the MDX function.

Lets try now creating Quartile and Percentiles.  It gets more interesting here:  Unlike the ‘Median’ function, MDX has no built-in Percentile or Quartile function.   So here we have it:  neither DAX nor MDX will help us here.  However, we can leverage a another feature of SSAS: the ability to perform COM interop calls through the ExcelMDX assembly.   You can see this library on a regular SSAS installation by browsing the assemblies folder:

image

It is quite interesting when you think about it.   We are asking an Excel pivot table to communicate with the PowerPivot engine through a calculated member which in turn invokes Excel functionality not present in the MDX language. Pretty cool indeed. So lets try it:  Using the following expression we can compute Quartiles:

Excel!Quartile(
                                SetToArray ( 
                                                
                                                                ( [DimDate].[CalendarYear].CurrentMember * [DimDate].[EnglishMonthName].[All].Children 
                                                ) 
                                , CoalesceEmpty ( [Measures].[Sum of SalesAmount]  , 0 )
                                ) 
                ,1)

The output is the following:

image

As you can see, the output matches Excel’s native quartile function. The calculation is now built into the PowerPivot model, and hence is dynamic and can respond to slicers and filters, as well as cell context by using MDX properties like ‘CurrentMember’.  And though the PowerPivot is performing a COM interop call to Excel (marshalling calls from one environment to the other), it should still perform better than a DAX calculation based heavily on row context.

If you examine the MDX that is sent to the Vertipaq engine, you will notice the calculated member is declared as a query scoped expression:

WITH
MEMBER [Measures].[Quartile] as Excel!Quartile(
                                SetToArray ( 
                                                
                                                                ( [DimDate].[CalendarYear].CurrentMember * [DimDate].[EnglishMonthName].[All].Children 
                                                ) 
                                , CoalesceEmpty ( [Measures].[Sum of SalesAmount]  , 0 )
                                ) 
                ,1)

SELECT
    {
    [Measures].[Sum of SalesAmount],
    [Measures].[Quartile]
    }
    DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME ON COLUMNS ,
NON EMPTY
    Hierarchize(
        DrilldownMember(
            CrossJoin(
                {
                [DimDate].[CalendarYear].[All],
                [DimDate].[CalendarYear].[CalendarYear].AllMembers
                },
                {([DimDate].[EnglishMonthName].[All])}
                ),
                [DimDate].[CalendarYear].[CalendarYear].AllMembers,
                [DimDate].[EnglishMonthName]))
    DIMENSION PROPERTIES PARENT_UNIQUE_NAME,HIERARCHY_UNIQUE_NAME ON ROWS 
FROM [Sandbox] CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS

The dimension used (DimDate) as well as the measure (Sum of SalesAmount) were automatically created by the PowerPivot engine.  This is what provides compatibility with OLAP interfaces:  for each table in the PowerPivot model, a default ‘count’ measure has been created as well as a dimension, in which each column is an attribute.  In other words, dimensions and measures are somewhat interchangeable in PowerPivot, following a more flexible model than with the traditional SSAS engine.  Also, notice the query is executed against a cube called ‘Sandbox’.  This is the default cube-interface name for Vertipaq PowerPivot databases.  It is the same name you will see in the cubes generated in the dedicated SSAS instance used on PowerPivot for SharePoint (This is an instance of SQL Server Analysis Services 2008 R2 in Vertipaq mode).

One downside of this method is that the PowerPivot field list is not aware of the calculation.  However, if you open the standard pivot table field list you will see it there. In other words, and as I said before, this calculation is now part of the data model.  Which means you can access it outside of the constraints of the pivot table: through Excel cube functions.   As an example, using the following formula expression on a worksheet cell we can dynamically invoke the MDX calculated member against PowerPivot, with the ability to specify a query context to it:

=CUBEVALUE("PowerPivot Data",CUBESET("PowerPivot Data","Head([DimDate].[CalendarYear].[All].Children)"),"[Measures].[Quartile]")

The result will be 925,015 again.  In other words, we were able to dynamically invoke the first member of the CalendarYear attribute on the DimDate dimension and leverage the MDX calculated member, all in one cell – no need of pivot tables.  Pretty powerful right?

You can verify that our Quartile calculation is now part of the intrinstic model by browsing through measures members when constructing the cube formula:

image

What I find pretty interesting is that the list of available measures on the PowerPivot model includes now regular DAX measures ([Measures].[Sum of Sales]) as well as MDX measures ([Measures].[Quartile]).

How about Percentile?  Using the same method, it is pretty simple to calculate it.  Just replace Quartile in the MDX expression… that’s it!

Excel!Percentile(
                                SetToArray ( 
                                                
                                                                ( [DimDate].[CalendarYear].CurrentMember * [DimDate].[EnglishMonthName].[All].Children 
                                                ) 
                                , CoalesceEmpty ( [Measures].[Sum of SalesAmount]  , 0 )
                                ) 
                ,0.30)

image

Be careful to not define your underlying measure as ‘currency’ type.  Though I always thought data types in the PowerPivot window were purely decorative (as they don’t carry over the pivot table), the MDX calculated member will not work on currency types.  Instead, try decimal type and you will be fine.

And one last thing… as you may have noticed, this technique opens the possiblity to a huge range of calculated expressions.  An though it is not the standard PowerPivot/DAX implementation, it still leverages PowerPivot technology by making use of its inherent data model. In other words, PowerPivot applications can not only benefit from the amazing power of DAX , but also that of MDX (which in turn opens the door to further functionality through interfaces with Excel and VBA).  Things like Internal Rate of Return (IRR) are also possible using this method Smile

Excel!IRR( SetToArray (  [Table3].[b].[All].Children, [Measures].[Sum of a]  ) )

image

 

powered by metaPost
03

Although PowerPivot – in its currently implementation – lacks a dedicated ‘median’ function, it is possible to recreate it using existing functionality.   There is a great blog entry here by Marco Russo where he describes how to calculate median over the values of a PowerPivot column. 

His calculation is as follows:

MINX( FILTER( VALUES( People[Age] ),
              CALCULATE( COUNTROWS( People ),
                         People[Age] <= EARLIER( People[Age] ) ) 
              > COUNTROWS( People ) / 2 ),
      People[Age] )

Recently I tried to adapt this calculation to a PowerPivot application I was working on.  I ran into two issues:

1) The calculation can be applied over a regular column on a table in the PowerPivot window or a DAX calculated column. However, what if you must calculated the median over the output of a DAX measure instead of a regular or calculated column?  In this case, you won’t be able to use the DAX functions EARLIER and VALUES, as these only take columns as parameters.  Due to this, the calculation above won’t work

2) I realized there should be a separate logic depending on whether the set of values is even or odd.   For even sets, there is no discrete middle value and one must calculate an average of both values in the middle.

Fortunately, it is fairly easy to deal with both issues.  With Alberto Ferrari’s help, I implemented the calculation below.  There are three steps for it, the first one which implies the creation of an index, which we will use to select the values in the middle.  I then created an alternate execution path for even sets, which simply averages the two values in the middle.

To see how it works, copy and past the data below and import it into PowerPivot (as a linked table):

Month

Amount

January

1,317,542

February

2,384,847

March

1,563,955

April

1,865,278

May

2,880,753

June

1,987,873

July

2,665,651

August

4,212,972

September

4,047,574

October

2,282,116

November

3,483,161

December

3,510,949

Step 1

Create a measure.  This is to test that we can calculate a median over a column or measure using the same method.

[Test Measure] =SUM([Amount]) / 2

image

Step 2

Create another measure which will serve to rank the months by the value in [Test Measure]. We will use this ranking value to determine which value is the median (in case of odd sets), or to average the two values in the middle (in case of even sets)

[Index] =IF (
    COUNTROWS ( VALUES ( Table1[Month] ) ) = 1,
    COUNTROWS (
        FILTER (
            ALL ( Table1[Month] ),
            [Test Measure] <= CALCULATE (
               [Test Measure], Table1[Month] = VALUES ( Table1[Month] )
            )
        )
    )
)
image

Step 3

Create median expression, reflecting the logic discussed above:

[Median] =IF( COUNTROWS( VALUES( Table1[Month] ) ) > 1,
     IF( MOD( COUNTROWS( ALL( Table1[Month] ) ) ,2 ) = 0,
          (
            CALCULATE( MAXX( Table1, [Test Measure] ),
                FILTER(
                 ALL(Table1[Month]),
                 [Index] <= ROUND( COUNTROWS(ALL(Table1[Month])) / 2, 0)
                )
            ) +
            CALCULATE( MINX( Table1, [Test Measure] ),
                FILTER(
                 ALL(Table1[Month]),
                 [Index] > ROUND( COUNTROWS(ALL(Table1[Month])) / 2, 0)
                )
            )
          ) / 2
  ,
          CALCULATE(
           [Test Measure],
            FILTER(
             ALL(Table1[Month]),
             [Index] = ROUND( COUNTROWS(ALL(Table1[Month])) / 2, 0)
            )
          )
        )
    )

The calculation first uses the MOD function to determine if the set is even (which is the case in our example).  It then gets the two values in the middle by:

1) Filtering the months for all that have an index of less or equal to the count of months divided by two.  Take the maximum value of [Test Measure] for this set.

2) Filter the months for all that have an index greater than the count of months divided by two. Take the minimum value of [Test Measure] for this set

3) add these values and divide by two (simple average)

image

The calculation is correct and matches Excel’s regular median function output. One word of caution: this expression relies heavily on row context which can greatly impact performance over big datasets.    In the following blog post, I will talk about a potential way around this issue in which we can arrive to correct the Median output by leveraging PowerPivot’s intrinsic compatibility with MDX.

powered by metaPost
Posted in: Blog
01
These are the slides for the presentation I gave on Friday April 29, 2011 For the Professional Association for SQL Server (PASS) Business Intelligence Virtual Chapter.

http://cid-3f1bb9738e005da8.office.live.com/self.aspx/.Public/Dynamic%20Formulas%20in%20PowerPivot.pdf

If you have any questions or feedback about them, please send me a message through my email: javiguillen@hotmail.com

Thank you!

28
I will be speaking tomorrow for the Professional Association for SQL Server (PASS) Business Intelligence Virtual Chapter;   my topic will be again one of my favorite: DAX calculations!

Check out the abstract @
http://bi.sqlpass.org/ (Friday 28, 2011 webcast)
03

When developing DAX calculations in order to create meaningful business metrics you normally want to remove or refine an existing filter in the current filter context.  For example, when calculating the sales contribution of a country to the sales across all countries in the Adventure Works sample database one could use:

SUM(FactInternetSales[SalesAmount])/

CALCULATE(SUM(FactInternetSales[SalesAmount]),ALL(DimDate[CalendarYear]))

This would yield the following result




So far so good.  However, what if we want to pick only certain years and have the expression dynamically recalculate only for the selected years?  We could add a slicer with the year values. This won’t work, though: the ALL function will force the filter context to evaluate across all years – regardless of slicer selection:




As you can see, DAX computes the contribution across the total for all years in the in-memory DimDate table.  The result incorrectly adds up to only 54.82% as the denominator is still taking into account all years, not just the selected ones.  How can we make this dynamic, to use only the selected values on the slicer? 

The solution is to take advantage of the inherent flexibility offered by calculated columns. We can create another ‘Year’ column in our data model, and then add another context to the calculation that will use ALL years but only within the boundaries defined by the user defined filters.  The nice thing about this solution is that the output only takes into account the years in the slicer (in other words, the query context), instead of either taking ALL years or just the CURRENT year.

So how to do this?

1)      Add a calculated column called "QueryContextYear" to the in-memory FactResellerSales table, using this DAX expression :

    RELATED(DimDate[CalendarYear])

2)      Replace the slicers in the spreadsheet with this column:  Instead of using the CalendarYear from DimDate in the slicer, use the new QueryContextYear from FactResellerSales

That’s it!  Now the result is dynamic to your slicer selection: the denominator will be calculated only using the selected years.




The reason why works is that the ALL function in the DAX expression removes the filters only on the DimDate[CalendarYear] column, but not in FactResellerSales[QueryContextYear]; The output of a calculation in DAX
is always the result of stacking all the filtered tables.  So while FactResellerSales[QueryContextYear] is being filtered, DimDate[CalendarYear] is not; And the intersection of these filters determines the calculation output.

Posted in: Blog
30

If PowerPivot for Excel was working fine and all of the sudden it disappeared, try checking to see if the add-in got disabled (it has happened to me a few times and I’m still not quite sure why Excel disabled it)

Go to File -> Options -> Add-ins.  Once there, Select COM Add-Ins and click ‘GO’



Once there, a window with open with the option to re-enable PowerPivot again

04

Recently, I encountered a situation that required the use of SUMX as part of a DAX measure.  Interestingly, SUMX seemed to only be taking in the integer part of the decimal used as part of the expression.

DAX expression # 1:

SUMX(FactResellerSales, FactResellerSales[SalesAmount] * 0.9 )

The result is 0 for all the cells on the pivot table.





DAX expression # 2:

SUMX(FactResellerSales, FactResellerSales[SalesAmount] * 2.9 )

The result reflects the fact that DAX only used 2 in the multiplication (instead of 2.9)




Hopefully this will be fixed in the next release.

As a workaround, you can create a DAX calculated column (instead of a measure), as the decimal is used properly there:


 

 

 

 

 

 

 

03

On March 19, I will be speaking about DAX Evaluation Context at SQL Saturday # 70 in Columbia, SC.  As part of the presentation, I will compare the different ways in which DAX and MDX derive context when interpreting an expression. 

Here is a link to the event: http://www.sqlsaturday.com/schedule.aspx

If you can make it to the event, please pass by and say hello!



Posted in: Blog
24

One of the interesting DAX Time Intelligence functions is the TOTALYTD, which can define a measure calculation over a year in a cumulative way.

The same output can be achieved in alternate ways using the CALCULATE function.  Nevertheless, as it is a very common computation in Excel reports, the PowerPivot team has created a dedicated function in order to help developing YTD calculations much faster.

A TOTALYTD behavior that is worth nothing however, is the fact that if a filter context defines multiple years (perhaps accidentally), this function will then use the last year available in order to perform the aggregation.

The screenshots below show a simple pivot table using Adventure Works data.  The DAX measure defined for YTD aggregate is:

TOTALYTD(SUM([SalesAmount]), DimTime[FullDateAlternateKey], ALL(DimTime))

When the slicer selection specifies only one year, the regular aggregate of [Sales Amount] over each month as well as the DAX [YTD] measure act on that sole year.

 However when two years are selected, the regular aggregate of [Sales Amount] acts over the correct filter context (defined over two years), whereas the YTD measure uses only the last year (2002). 

 

At first it can be confusing, but if you select different sets of multiple years in the slicers it becomes evident pretty quick that the YTD is only showing the latest year of the set. 

 

Although it can be argued that it is a good assumption given that the TOTALYTD function is not meant to work over multiple years, it is still worth documenting this behavior and keeping it in mind in case a complex report produces unexpected results due to a YTD formula that is picking for you what year should go into the calculation.

12




This is the final installment on the series of blog entires regarding how to leverage PowerPivot data using Excel cube functions. The main purpose of the series has been to explain how to use Excel in a way that allows the report author to create appealing reports and dashboards using PowerPivot in-memory data but beyond the constraints and limitations imposed by Pivot tables.

As you can see,   the final report (screenshot above) uses a combination Pivot table references, MDX filtering functionality, DAX measures and Excel Cube functions.  There is a reason why each of these was used, which is what we discuss next.

To recap, is worth mentioning a few things that you can do when working with PowerPivot data using Excel cube functions:

-          Cube functions access the in-memory tables using the workbook connection named “PowerPivot Data”, which was automatically created when importing data to PowerPivot.

-          Even though the storage is SSAS in-memory columnar mode (Vertipaq), Cube functions can access this data as if it was SSAS in OLAP mode.

-          Cube functions allow using a limited amount of MDX functionality which is very useful when constructing custom sets to be used across the report

-          DAX calculated columns and measures can be called via Excel Cube functions. Even though they are essentially relational, when using these DAX calculations over Excel Cube functions you can think about them in terms or traditional UDM/OLAP measures and dimensions.

Cube functions: not a vehicle for free form MDX

Even though MDX is allowed inside Cube functions, do not think they will just simply pass the MDX expression to the storage engine (SSAS in OLAP or Vertipaq mode), as if you were in a free form query window like SSMS.  There are limitations that are important to keep in mind:

-          CUBESET function only allows one expression when building the set.  What this means is that you cannot combine two filters in the functions to create a set across 2 or more dimensions (in othere words, a single-tuple set).  You can construct this kind of set however, if you hardcode the set definition manually.



Only one set expression is allowed. Two filters cannot create a tuple set, unless hardcoded manually, as in the expression below:

=CUBESET("PowerPivot Data","({[DimDate].[CalendarYear].[All].[2005],[DimDate].[CalendarYear].[All].[2006]},[DimGeography].[ContinentName].[All].[Asia])","MySet")

Remember: [DimGeography] and [DimDate] are not real OLAP dimensions when using PowerPivot data in this report.  They are in-memory tables.  PowerPivot converts them to ‘dimension equivalents’ after loading the data.  But, you may ask, how does PowerPivot know what is supposed to be a dimension and what a measure?  It doesn’t.  In Vertipaq, any field can be either a dimension or a measure. Try dragging [DimGeography].[RegionCountryName] to the values section of the Pivot table. Vertipaq automatically turns it into a measure, which is by default a count of values.

-          Even though you can define tuples that include more than one dimension following the technique above, the result will be equivalent to a resultset over a column or row axis with crossjoined dimensions. In other words, there is no way to generate the equivalent to a ‘slicer’ axis using the CUBESET function.




There is no equivalent slicer axis in the CUBESET function.  Dis
playing members of a tuple set will yield duplicate values. In this case the first Asia value is associated with Year 2005, and the second one with Year 2006.

-          CUBEVALUE function only allows calling a measure which is then filtered using member expressions. What this means is that you cannot create MDX calculated members on the fly using the CUBEVALUE function.

In order to compensate for the first two limitations above, if there is a need to apply a slicer to a set it is best to reference existing pivot tables, as the default behavior on them is to filter the dimension members on the column or rows axis in a way that is equivalent to the ‘WHERE’ clause in T-SQL or MDX (slicer axis).

 

 

 

 

 

 

 

 

In a pivot table, members are filtered following the behavior of a WHERE slicer clause. When building custom reports, there may be a need to reference pivot tables in other spreadsheets if slicing the data in this particular way is necessary.

 

Regarding the CUBEVALUE limitation (not being able to construct on the fly calculated members), the way to circumvent it is by way of DAX calculations.  These can be called as if they were MDX members, even to the point of being able to change the context of a calculation in the exact same way once you would when invoking a regular SSAS measure.  This is due to the fact that both languages assign context as a function of the coordinate pointers over stored data’s memory space.  MDX does it over a cube space, wheras DAX does it over a relational system of tables and relationships. However, they are in essence very similar, and to Cube functions they look exactly the same.  An example will explain this better:

In order to calculate the Accumulated Sales Growth Since Inception of the product, we use the following DAX measures:

       Measure [Sales (RunningTotal)]:

     

IF([Sales] <> "",

CALCULATE([Sales],DATESBETWEEN(DimDate[Datekey],CALCULATE(FIRSTDATE(DimDate[Datekey]), ALL(DimDate)),LASTDATE(DimDate[Datekey]))),

BLANK())


      Measure [Product Sales Growth (Since Inception)]:


            IF([Sales (Running Total)] <> "",

 

            (

                                    CALCULATE(

                                                            [Sales (Running Total)],

                                                                                    LASTNONBLANK(

                                                                                                            ALL(DimDate[CalendarMonth]), [Sales (Running Total)])) -

                                    CALCULATE(

                                                            [Sales (Running Total)],

                                                                                    FIRSTNONBLANK(

                                                                                                            ALL(DimDate[CalendarMonth]), [Sales (Running Total)]))

            )                       /

                                    CALCULATE(

                                                            [Sales (Running Total)],

                                                            FIRSTNONBLANK(

                                                                                    ALL(DimDate[CalendarMonth]), [Sales (Running Total)]))

            ,BLANK())

This function removes any current filter context on [CalendarMonth] when calculating growth.  It won’t affect, however, the [CalendarYear] attribute.  When applying this DAX measure to a specific calendar Year, it will reflect the accumulated sales growth over the year by using a simple growth formula:

(Final Accumulated Sales – Initial Sales) / (Initial Sales)

If what we want is to display the growth for ALL years, we must remove any existing filter context for the [CalendarYear] attribute when evaluating the DAX measure.  In the sample report, we also want to display this total rate in one cell, outside of a pivot table.  We can accomplish this using the following Cube formula:



We can call the DAX measure within the Cube formula, and by specifying the [All] member of the [CalendarYear] attribute we change the filter context of evaluation, which gives us the total accumulated sales growth across all years.

Finally, despite the limitations exposed above regarding MDX functionality in Cube formulas, it can be used when filtering set values or constructing custom sets.  In the sample report the X axis values applied to the chart where filtered to only the years that have associated fact data, which was accomplished using the MDX ‘Exists’ function. In that way, the chart won't show empty labels for those years that have no data.



The MDX ‘Exists’ function keeps only those ones that have [Sales] data. MDX can be used even though Vertipaq is fundamentally relational, not multidimensional.

In summary, Excel cube functions using PowerPivot data is a great combination that gives I.T. or data analysts a way to build custom Excel reports and dashboards that break out the constraints of pivot tables.  DAX calculation and MDX set manipulation functions can be used in a single Excel report in order to generate highly customized views of data that a user can still interact with. 

 

19

Even though you can directly reference the slicer name range in the cube formula, it won’t work if you select multiple slicer values –

 


But there is a trick I learned thanks to the great PowerPivotPro blog.

Going back to the sheet that holds the PowerPivot table, we had previously selected ‘Calendar Year’ as a vertical slicer.  Interestingly, you can also drag and drop the same field into the report filter field, which in effect enables this field to be both a slicer and a regular filter!



By using this trick, the slicer selection will always be in sync with the pivot table filter.  And in turn, we can use the regular filter cell reference in the cube formula, as they graciously understand multiple selected members in a regular filter.

 


After you change the cube formula, you may have to refresh the PowerPivot table, in order to override the cached values.

16

 

PowerPivot Slicers

Slicers are a new addition to Excel 2010.  They allow filtering in pivot tables, but with a lot of added advantages compared to traditional filters.  First, they look a lot better – they are much more inviting!  Second, when multiple filters are selected, you can actually SEE the selected options as opposed to the [multiple values] that you get with traditional filters.  And last, when using multiple slicers against a pivot table, they can actually autoselect the members with available data based on the selections on the other slicers (in effect, generating a cascading filter)

 It is possible to add a slicer to an Excel worksheet without any pivot table at all.  However, in order to benefit from the cascading mechanism, we must tie it to Pivot table (in that way, members of the slicer will know if they have or not numeric data available for them).

Load the PowerPivot window with the prepopulated Contoso BI data, and select PivotTable under the PowerPivot home tab.  When asked where it should be placed, select “new worksheet”.

Add the RegionCountryName to the rows field, the Sales Amount to the Values field, and the Calendar year and ContinentName as vertical slicers.

 

 

The reason we add the rows and values, even though all we want is the slicers, is to generate the autoexists functionality in the slicers as described above (they will act as cascading filters). In the picture below, you can see this feature in action – members without any data are colored in grey, indicating they have no measure values associated with them.

 

If we move the pivot table to another page (PivotTable Tools tab->Move PivotTable), the slicers will stay on the current spreadsheet.  In that way, we can use them when constructing our cube formula based report while still having the PowerPivot field list available to us in case we want to review available dimension/measure attributes or create new DAX calculations.

There are 7 cube functions in Excel.  They can be used to simulate MDX query functionality, from set creation to retrieval of measure values based on filter expressions.  In our current report, we will use 3 of them -

CUBESET:  used to define a set of members.  You can leverage this function to create tuples that span multiple dimensions as well; its usage is not restricted to members of the same dimension as in regular named sets.

CUBERANKEDMEMBER: used to access each of the members of a CUBESET.

CUBEVALUE: used to retrieve measure values in the context of specific slicers (expressions).

If we wanted to know - for example - the total sales for Greece in our sample Contoso BI data, we could use the following formula:

=CUBEVALUE("PowerPivot Data","[DimGeography].[RegionCountryName].[All].[Greece]","[Measures].[Sales]")

The formula queries Vertipaq and retrieves the measure from the equivalent MDX expression of

Aggregate([DimGeography].[RegionCountryName].[All].[Greece], [Measures].[Sales])


Adding Slicers to the Mix

In order to tie our current slicers with the cube functions, we can leverage the fact that Excel makes available a name range as a pointer to the slicer member set.

Query against Vertipaq

 

 

 

When building cube formulas against the Excel vertipaq database, we leverage the automatically generated “PowerPivot Data” connection.  There is intellisense in the formulas, which helps guide us see the options we have available.

Note how Vertipaq generated a dimension for each of the queries loaded in-memory.  There is also an automatically generated ‘Measures’ dimension.

 

 

 

This dimension holds a not very intuitive ‘count’ aggregate for each of the queries in the vertipaq data store as well. This was done in order to force Analysis Service to create a measure group placeholder, in which to build on when creating DAX calculations.  In our example using Contoso BI, some DAX calculated members were already created for us, and Vertipaq correctly placed them as Measure members ([Measures].[BaseQty] below)

 

 

 

You can also see and edit these DAX measures in the PowerPivot field list. They have a calculator icon next to them.

 

 

 

In our specific report example, we will be using the [Measures].[Sales] DAX calculation, which is simple a SUM of all SalesAmount values in the FactSales query.  This DAX measure will be evaluated in the context of the slicers used as expressions within the cube function. The picture below shows the DAX measure editor for this calculation.

 

 

Add both slicers to the formula -

 

 

 

 

 

 

 

Tying everything up, we end up with the formula below.

=CUBEVALUE("PowerPivot Data","[Measures].[Sales]",Slicer_ContinentName, Slicer_CalendarYear)

This allows us to uses the slicers to dynamically evaluate the [Measures].[Sales] member in the context of the slicers’ selection.

 

In the next entry, I will continue building the report in the attempt to answer the questions:

which are the top 5 selling countries per continent and year?

Which were the best selling brands for one or more of the countries above?

Which was the top product out of the best selling brand group above, and what were it sales performance for the last year?

Posted in: SSAS, MDX, Blog
28

  

PowerPivot Data + Cube Formulas = a powerful combination
 

 

One of the interesting things about PowerPivot is that it really is, under the covers, an instance of SQL Server Analysis Service.  And because of this we get some features for ‘free’, features that already work nicely and reliably in the traditional SSAS engine.  One of those features is the interactive functionality between Excel and the cube, and specifically - Excel Cube functions.  These are functions that allow us to build highly customized SSAS connected Excel based reports, well beyond any inherent limitations of pivot tables.  You may have seen these formulas automatically generated by Excel, when selecting the option to convert an SSAS connected Pivot Table to formulas (via the OLAP tools menu) - 

 

 

 

 

 

 

 

 

Two great advantages of using these functions are that we can build highly customized, asymmetric reports and that we can use MDX functionality to query either a traditional SSAS cube or the in-memory Vertipaq engine.   

The term "Excel functions" and "Excel formulas" is sometimes used interchangeably.  Excel Cube functions are the 7 built-in functions available in the formula bar, under the 'Cube' category;  Excel cube formulas are cell formulas that use cube functions in order to achieve dynamic query behavior to the Analysis Service instance.

 


But how?  If PowerPivot is loading data in-memory in what looks like a relational schema (one tab for each table), then how is it that we connect to a ‘cube’?

 

 

Well, PowerPivot (or Vertipaq, shall we say) builds an in-memory OLAP cube.  You can see this when running DAX queries against the PowerPivot data, Excel displays a ‘Running OLAP query’ message on the status bar. (It really is, when you think about it, an amazing technology… cubes are embedded in the workbook!)

Anyhow, going back to topic – cube formulas.  PowerPivot creates an embedded connection to the in-memory data. You can see the connection string under the data tab in the Excel ribbon.  Select connection and you will see one called ‘PowerPivot Data’.  This was automatically created when importing data via the PowerPivot window.

 

When building customized Excel reports, you can retrieve data from this source with any of the built-in Excel Cube functions. 

PowerPivot applications... without Pivot tables!

Cube formulas are good to build custom reports that are difficult to create with standard pivot tables, yet still leveraging the functionality of being connected to the database (in the case of PowerPivot, the in-memory Vertipaq database).  We can also leverage DAX measures, and execute them through Cube functions outside the pivot table.

I the next following posts I will go over a sample report, highly customized to create the type of business reporting that decision makers would want to see. The example uses the new Microsoft business intelligence database, Contoso Retail DW (available for download here), and leverages some of the interesting features of PowerPivot and Cube functions to build Excel reports that require functionality beyond what is normally exposed through pivot tables.

Posted in: Blog
28

The new SSAS engine (only currently available to PowerPivot, either as part of Excel or Sharepoint 2010, and shipped along with SQL Server 2008 R2) has a completely different architecture than the traditional disk based storage engine.

As a columnar database,  its driving principle is that the variation of data across 1 column is usually low (for example, if a column stores “gender” it only needs to store two values for the entire column). This in contrast to the traditional storage in which values are stored for each row, no matter how much data variance there is (as you can see - when storing columns with high degree of variance, the columnar approach may lose its advantage).   

Due to the reduced need to store discrete values (along with other data pattern algorithms), high levels of compression are possible which allows the Excel version of Vertipaq to run in-memory, all based on RAM storage. The huge advantage here over traditional disk based SSAS storage is data retrieval and calculations happen at a much faster rate, as disk I/O processing is omitted. 

Given that current personal computers have much higher RAM specifications, it is finally possible to hold in-memory an entire database table representing millions of rows, thanks to the Vertipaq's architecture. 

Another characteristic of  PowerPivot's Vertipaq engine is that it is an in-process memory storage.  That means it is bound to Excel, as they are both the same file.  The advantages are it allows portability, as the file IS the database; however, the disadvantage is that any process failure compromises the whole system.


Vertipaq, in its PowerPivot for Sharepoint version, requires a dedicated SQL Server Analysis Server instance which only responds to the PowerPivot system, and as its Excel counterpart, does not store the data in pre-aggregated mode but instead computes aggregations on the fly. Other server functions like data unloading (removing data from cache at specified intervals, if users do not make requests) are also featured in this type of configuration.


Some of the features that were currently left out for version 1.0 of PowerPivot (like many-to-many relationships and dimension hierarchies) are available to the underlying engine, as it really is in fact, a repackaged SSAS instance.  Those features will eventually be exposed, in order to make the Vertipaq/PowerPivot system even more robust.


At this point, it is not possible to access PowerPivot OLAP cubes via a programmatic interface (though you can access OLAP cube members via Excel cube functions).  Let’s hope the next version will include a way to interact with it (and with Vertipaq) via managed code.

Posted in: Blog
08

 

One of the things I am fascinated with is how familiar data manipulation tools like Excel can be powerful BI enablers.   While traditionally spreadsheet software hasn’t been part of the corporate BI initiative (in fact, it was sometimes seen as an enemy as it generated multiple isolated spreadmarts), newer BI technologies are now embracing it, in part due to the recognition that information workers love working with it and it is the BI program that needs to adapt to the way those workers crunch and analyze data and not the other way around.   Cube functions/PowerPivot/DAX and the Vertipaq engine were born under that philosophy -- the possibility of having Excel interfaces connected to the data warehouse to enable data analysts to create fast ad hoc analysis needed for strategic decision making is what is now being touted as ‘BI for the masses’.
Pivoting in Excel is as easy as drag & drop. However, I have seen many power users needing to go beyond the implicit limitations of pivot tables and create more customized reports. PowerPivot is an excellent tool that can help in most of those cases, but in its current incarnation there is no concept of dimension hierarchies. A power user or BI developer can still leverage the corporate cube dimension hierarchies to build very customized reports in Excel, and even use some basic MDX directly in the Excel functions to build reports not normally available to standard Pivot tables (unless the BI team builds that capability in the cube of course)
In my experience, building some Excel templates using this technology can enable non technical power users to leverage and extend them and from there create very custom reports in a fairly fast way and without having to delve too much in the code. For example, changing the level retrieved from an Descendants function can be a simple thing from a code perspective but can quickly empower a data analyst to satisfy a quick request, provided he has some basic conceptual understanding of what a hierarchy is  –

I wrote a small article that describes the process here: http://www.msbicentral.com/Resources/Articles/tabid/88/articleType/ArticleView/articleId/103/Beyond-Excel-pivot-tables-Leveraging-cube-formulas-with-MDX.aspx

Posted in: Blog
Copyright 2004-2013 MSBICentral.com Terms Of Use Privacy Statement