01
These are the slides for the presentation I gave on Friday April 29, 2011 For the Professional Association for SQL Server (PASS) Business Intelligence Virtual Chapter.

http://cid-3f1bb9738e005da8.office.live.com/self.aspx/.Public/Dynamic%20Formulas%20in%20PowerPivot.pdf

If you have any questions or feedback about them, please send me a message through my email: javiguillen@hotmail.com

Thank you!

28

PowerPivot is generally marketed as a ‘Self-Service BI’ tool.  However, it can alternatively be used as a way to enhance the design phases of traditional Business Intelligence projects, even if the final implementation does not use PowerPivot technology at all.

Why is that?  Because it more closely aligns business experts with BI developers during architectural discussions. In order to explain this better, I think it’s relevant to first describe the process by which data analysts can provide relevant input to BI designers:

A lot of intermediate Excel users will probably never use VBA macros and will barely feel comfortable with Pivot Tables.  For advanced Excel users, however, there is a thrill associated with getting a cool formula to work, or knowing the most interesting formatting shortcuts or even knowing how to add dropdowns to a spreadsheet to achieve dynamic interactivity.

These folks do not normally belong to the technology team; however, in addition to being savvy Excel users they are in the front lines of the decision making process which many times makes them subject matter experts (SMEs). It is likely they will eventually be motivated to start playing with the data import facility in PowerPivot and start developing DAX calculations.

Fast forward a little into the future, and imagine these users developing meaningful and insightful PowerPivot reports.  These contain connections to a multitude of data sources and aggregate data in a way that provides enlightening information to decision makers.

Though not always aware of it, these users will have accomplished three things of tremendous value for Business Intelligence professionals:

·         Validated one or more sources of quality data

·         Validated a useful data model

·         Validated one or more business calculations

In other words, as they benefit and make use of ‘Self-Service BI’ technology they will unknowingly serve as ‘data explorers’ to the Business Intelligence and Data Warehousing team that benefits from their discoveries.

In the past, these ‘spreadsheet-based discoveries’ went unnoticed by members of the technology team (and became spreadmarts).  However, the Business Intelligence team members can now gain insight into popular spreadsheet reports across the enterprise (PowerPivot for SharePoint). In addition to that, they will be able to take these PowerPivot applications and open them in Visual Studio as a SQL Server Analysis Services Business Intelligence Semantic Models (BISM) in SQL Server 2011.  This is the server equivalent of a PowerPivot for Excel workbook: it is built using the same technology (the columnar based storage engine "Vertipaq") but with a number of robust features proper of an enterprise solution.

One of the great things of this new approach to BI that Microsoft is proposing is that it will most likely help close the gap between Excel power users and Business Intelligence professionals by creating an analytical environment built on the same technical foundation on both ends: DAX.  This will most certainly help avoid communication challenges that normally occur when both sides speak in totally different “languages”:  Excel formulas \ business calculations on one side, and T-SQL/MDX and OLAP on the other.

So what about BI prototyping? And how can we benefit from it even before SQL Server 2011 is released? Well, imagine having the ability to engage subject matter experts in their most familiar environment (Excel) and with their help be able to rapidly test assumptions about BI project deliverables, before starting development.  Rather than purely describing a calculation in words or spreadsheets disconnected from a database, we can instead quickly and iteratively build a sample of it tied to a data source and observe the calculation in action as its being sliced by multiple dimensions.  This can generate the necessary early insights that can help correct potential misunderstandings quickly and generate a more complete and accurate data model that can be used to guide actual development. And these insights have been already validated by the SMEs (in collaboration with the BI team).

As an example, imagine a dimension called “Orders”.  During the initial design phase, it becomes clear it must be added to the cube in order to generate the expected reports. Fast forward a few months (and many hours of development).  While building the final reports, the BI team becomes aware something that is missing:  Even though DimOrders is used in many reports (as business users started at the beginning of the project) it was built at a different grain that what reports require;  conditional logic is discovered – yes, at the last stages of the project – that treats some Order Details differently from others.  What to do? Different approaches are possible, from redesigning parts of the ETL to creating new measure groups based on named queries in order include the data that is missing. Architectural re-arrangements are going to have to be made in order to complete it on time; and any ‘quick’ solution will risk compromising the ad hoc nature of the final product (which will in turn limit the ability to use it for analysis).

How would this change if PowerPivot is used as a prototyping tool?  Using an Agile approach, during  early scrum meetings it will be possible to ‘sample’ the Order dimension against a copy of all relevant reports.  By loading a relatively small set of data into PowerPivot and designing ad hoc reports to mirror actual views across any functional existing reports, one is able to quickly uncover the fact that different grain is required for the dimension in question.  This work would probably be done in very small amount of time compared to what it will require to re-desing and re-develop parts of the ETL and OLAP solution.  Time – and money – can be saved while increase accuracy of the design.

How would it be possible to ‘mimic’ the dimension ‘Orders’ without building into a real SSAS cube? This is the most interesting part:  PowerPivot was built on SQL Server Analysis Services code. In other words, it is a version of SSAS.  All rules expected by SSAS when building cubes apply to PowerPivot as well; for example, dimensional tables in a star schema translate to ‘lookup’ tables in the PowerPivot window.  PowerPivot data is consumed by Excel using ‘cube’ interfaces  (MS OLAP provider, Excel CUBE functions, etc); However, PowerPivot was built for fast development which makes it an ideal tool for ‘mimicking’ an SSAS cube without the complexity of advanced server features.

I want to emphasize, we are not trying to build the entire BI project or corporate data warehouse in PowerPivot.  We are just ‘sampling’ the data model and ‘testing’ the different calculations in order to allow subject matter experts to help us gain a much clarity regarding the quality of the data, the relationships among the different data entities and the validity and relevance of the business calculations.   It is often the case that business users will start ‘refining’ their idea of what they want once they see a working model. 

This final result of a BI project that used PowerPivot for prototyping can very well be a traditional UDM/OLAP environment using MDX calculations. The discoveries gained during the prototyping phase have no direct relationship to the specific technology that will be implemented; they only offer pointers to a better architecture that will increase the chances of a successful implementation, one that will have a profound impact on the business.

A lot of Business Intelligence projects fail for two major reasons: lack of Business Sponsorship and poor Data Quality.  There is a good chance that using PowerPivot as a prototyping tool can help tackle these issues as subject matter experts use their familiar Excel environment while collaborating with BI professionals under the iterative Agile framework. Rapid dimension/measure sampling can quickly uncover data quality issues while establishing a solid data model that analysts can understand which gradually builds support for the BI vision (in contrast to solutions they perceive as overly complex and many times inaccurate, which is the mean reason why ‘spreadmarts’ exist in the first place).

28
I will be speaking tomorrow for the Professional Association for SQL Server (PASS) Business Intelligence Virtual Chapter;   my topic will be again one of my favorite: DAX calculations!

Check out the abstract @
http://bi.sqlpass.org/ (Friday 28, 2011 webcast)
30

If PowerPivot for Excel was working fine and all of the sudden it disappeared, try checking to see if the add-in got disabled (it has happened to me a few times and I’m still not quite sure why Excel disabled it)

Go to File -> Options -> Add-ins.  Once there, Select COM Add-Ins and click ‘GO’



Once there, a window with open with the option to re-enable PowerPivot again

24

One of the interesting DAX Time Intelligence functions is the TOTALYTD, which can define a measure calculation over a year in a cumulative way.

The same output can be achieved in alternate ways using the CALCULATE function.  Nevertheless, as it is a very common computation in Excel reports, the PowerPivot team has created a dedicated function in order to help developing YTD calculations much faster.

A TOTALYTD behavior that is worth nothing however, is the fact that if a filter context defines multiple years (perhaps accidentally), this function will then use the last year available in order to perform the aggregation.

The screenshots below show a simple pivot table using Adventure Works data.  The DAX measure defined for YTD aggregate is:

TOTALYTD(SUM([SalesAmount]), DimTime[FullDateAlternateKey], ALL(DimTime))

When the slicer selection specifies only one year, the regular aggregate of [Sales Amount] over each month as well as the DAX [YTD] measure act on that sole year.

 However when two years are selected, the regular aggregate of [Sales Amount] acts over the correct filter context (defined over two years), whereas the YTD measure uses only the last year (2002). 

 

At first it can be confusing, but if you select different sets of multiple years in the slicers it becomes evident pretty quick that the YTD is only showing the latest year of the set. 

 

Although it can be argued that it is a good assumption given that the TOTALYTD function is not meant to work over multiple years, it is still worth documenting this behavior and keeping it in mind in case a complex report produces unexpected results due to a YTD formula that is picking for you what year should go into the calculation.

12




This is the final installment on the series of blog entires regarding how to leverage PowerPivot data using Excel cube functions. The main purpose of the series has been to explain how to use Excel in a way that allows the report author to create appealing reports and dashboards using PowerPivot in-memory data but beyond the constraints and limitations imposed by Pivot tables.

As you can see,   the final report (screenshot above) uses a combination Pivot table references, MDX filtering functionality, DAX measures and Excel Cube functions.  There is a reason why each of these was used, which is what we discuss next.

To recap, is worth mentioning a few things that you can do when working with PowerPivot data using Excel cube functions:

-          Cube functions access the in-memory tables using the workbook connection named “PowerPivot Data”, which was automatically created when importing data to PowerPivot.

-          Even though the storage is SSAS in-memory columnar mode (Vertipaq), Cube functions can access this data as if it was SSAS in OLAP mode.

-          Cube functions allow using a limited amount of MDX functionality which is very useful when constructing custom sets to be used across the report

-          DAX calculated columns and measures can be called via Excel Cube functions. Even though they are essentially relational, when using these DAX calculations over Excel Cube functions you can think about them in terms or traditional UDM/OLAP measures and dimensions.

Cube functions: not a vehicle for free form MDX

Even though MDX is allowed inside Cube functions, do not think they will just simply pass the MDX expression to the storage engine (SSAS in OLAP or Vertipaq mode), as if you were in a free form query window like SSMS.  There are limitations that are important to keep in mind:

-          CUBESET function only allows one expression when building the set.  What this means is that you cannot combine two filters in the functions to create a set across 2 or more dimensions (in othere words, a single-tuple set).  You can construct this kind of set however, if you hardcode the set definition manually.



Only one set expression is allowed. Two filters cannot create a tuple set, unless hardcoded manually, as in the expression below:

=CUBESET("PowerPivot Data","({[DimDate].[CalendarYear].[All].[2005],[DimDate].[CalendarYear].[All].[2006]},[DimGeography].[ContinentName].[All].[Asia])","MySet")

Remember: [DimGeography] and [DimDate] are not real OLAP dimensions when using PowerPivot data in this report.  They are in-memory tables.  PowerPivot converts them to ‘dimension equivalents’ after loading the data.  But, you may ask, how does PowerPivot know what is supposed to be a dimension and what a measure?  It doesn’t.  In Vertipaq, any field can be either a dimension or a measure. Try dragging [DimGeography].[RegionCountryName] to the values section of the Pivot table. Vertipaq automatically turns it into a measure, which is by default a count of values.

-          Even though you can define tuples that include more than one dimension following the technique above, the result will be equivalent to a resultset over a column or row axis with crossjoined dimensions. In other words, there is no way to generate the equivalent to a ‘slicer’ axis using the CUBESET function.




There is no equivalent slicer axis in the CUBESET function.  Dis
playing members of a tuple set will yield duplicate values. In this case the first Asia value is associated with Year 2005, and the second one with Year 2006.

-          CUBEVALUE function only allows calling a measure which is then filtered using member expressions. What this means is that you cannot create MDX calculated members on the fly using the CUBEVALUE function.

In order to compensate for the first two limitations above, if there is a need to apply a slicer to a set it is best to reference existing pivot tables, as the default behavior on them is to filter the dimension members on the column or rows axis in a way that is equivalent to the ‘WHERE’ clause in T-SQL or MDX (slicer axis).

 

 

 

 

 

 

 

 

In a pivot table, members are filtered following the behavior of a WHERE slicer clause. When building custom reports, there may be a need to reference pivot tables in other spreadsheets if slicing the data in this particular way is necessary.

 

Regarding the CUBEVALUE limitation (not being able to construct on the fly calculated members), the way to circumvent it is by way of DAX calculations.  These can be called as if they were MDX members, even to the point of being able to change the context of a calculation in the exact same way once you would when invoking a regular SSAS measure.  This is due to the fact that both languages assign context as a function of the coordinate pointers over stored data’s memory space.  MDX does it over a cube space, wheras DAX does it over a relational system of tables and relationships. However, they are in essence very similar, and to Cube functions they look exactly the same.  An example will explain this better:

In order to calculate the Accumulated Sales Growth Since Inception of the product, we use the following DAX measures:

       Measure [Sales (RunningTotal)]:

     

IF([Sales] <> "",

CALCULATE([Sales],DATESBETWEEN(DimDate[Datekey],CALCULATE(FIRSTDATE(DimDate[Datekey]), ALL(DimDate)),LASTDATE(DimDate[Datekey]))),

BLANK())


      Measure [Product Sales Growth (Since Inception)]:


            IF([Sales (Running Total)] <> "",

 

            (

                                    CALCULATE(

                                                            [Sales (Running Total)],

                                                                                    LASTNONBLANK(

                                                                                                            ALL(DimDate[CalendarMonth]), [Sales (Running Total)])) -

                                    CALCULATE(

                                                            [Sales (Running Total)],

                                                                                    FIRSTNONBLANK(

                                                                                                            ALL(DimDate[CalendarMonth]), [Sales (Running Total)]))

            )                       /

                                    CALCULATE(

                                                            [Sales (Running Total)],

                                                            FIRSTNONBLANK(

                                                                                    ALL(DimDate[CalendarMonth]), [Sales (Running Total)]))

            ,BLANK())

This function removes any current filter context on [CalendarMonth] when calculating growth.  It won’t affect, however, the [CalendarYear] attribute.  When applying this DAX measure to a specific calendar Year, it will reflect the accumulated sales growth over the year by using a simple growth formula:

(Final Accumulated Sales – Initial Sales) / (Initial Sales)

If what we want is to display the growth for ALL years, we must remove any existing filter context for the [CalendarYear] attribute when evaluating the DAX measure.  In the sample report, we also want to display this total rate in one cell, outside of a pivot table.  We can accomplish this using the following Cube formula:



We can call the DAX measure within the Cube formula, and by specifying the [All] member of the [CalendarYear] attribute we change the filter context of evaluation, which gives us the total accumulated sales growth across all years.

Finally, despite the limitations exposed above regarding MDX functionality in Cube formulas, it can be used when filtering set values or constructing custom sets.  In the sample report the X axis values applied to the chart where filtered to only the years that have associated fact data, which was accomplished using the MDX ‘Exists’ function. In that way, the chart won't show empty labels for those years that have no data.



The MDX ‘Exists’ function keeps only those ones that have [Sales] data. MDX can be used even though Vertipaq is fundamentally relational, not multidimensional.

In summary, Excel cube functions using PowerPivot data is a great combination that gives I.T. or data analysts a way to build custom Excel reports and dashboards that break out the constraints of pivot tables.  DAX calculation and MDX set manipulation functions can be used in a single Excel report in order to generate highly customized views of data that a user can still interact with. 

 

19

Even though you can directly reference the slicer name range in the cube formula, it won’t work if you select multiple slicer values –

 


But there is a trick I learned thanks to the great PowerPivotPro blog.

Going back to the sheet that holds the PowerPivot table, we had previously selected ‘Calendar Year’ as a vertical slicer.  Interestingly, you can also drag and drop the same field into the report filter field, which in effect enables this field to be both a slicer and a regular filter!



By using this trick, the slicer selection will always be in sync with the pivot table filter.  And in turn, we can use the regular filter cell reference in the cube formula, as they graciously understand multiple selected members in a regular filter.

 


After you change the cube formula, you may have to refresh the PowerPivot table, in order to override the cached values.

07

 

PowerPivot and DAX are a powerful technologies, but there is still a good amount of work that can be done with traditional cube functions. In its current version, PowerPivot lacks the concept of dimension hierarchies; however very customized Excel reports that leverage cube hierarchies can still be built using MDX enabled cube functions. This approach goes beyond the limiting nature of traditional pivot tables by enabling power users to transverse dimensions using custom formulas that are a mix of MDX with traditional Excel functionality.
This approach can be used, for example, to determine the top city for each of the displayed countries in a traditional SSAS connected pivot table, as shown below (using Adventure Works 2008 SSAS cube)
 
By converting the pivot table to formulas, we free ourselves from any of the inherent pivot table limitations -
 
Notice that filters are still kept as dropdowns, which is a powerful feature as it allows the construction of very dynamic formula based reports.
 
Selecting the member Germany (cell A8 in screenshot above) shows us how Excel navigates the cube in order to retrieve the member name -
 
 
We can use a similar approach with the CUBESET Excel function in order to create an MDX set of the top cities by reseller sales for each of the countries displayed in the spreadsheet;  In this example, I leave the set name empty which will make it invisible to people viewing the spreadsheet cell values. However, even though there no value is displayed, the cell contains a reference to the set object now, and we can use reference it via Excel formulas in order to complete our report:
 
 
Using the CUBERANKEDMEMBER function, we retrieve the first member of the set and show the city name (In this example the set only contained one member)
 
Just as easily, a power user with understanding of hierarchies could slightly modify the MDX descendant formula to retrieve a different level of the hierarchy, like the top postal code (using the value of 3 for the descendants formula, in effect changing the level of granularity retrieved from the dimension)
 
Even though is a fairly simple technique, I hope this small article shows the potential of using MDX enabled cube formulas. More complicated calculations like weighted averages or trends can be also derived using this technique, enabling power users to quickly satisfy questions from decision makers and build highly customized reports. In turn, the BI team can leverage the successful and popular calculations and ‘promote’ them to the cube (which will most likely happen with DAX measures as well).

 

 My Blog Friends Minimize


    
Copyright 2004-2012 MSBICentral.com Terms Of Use Privacy Statement