Posted on May 13, 2012 03:28
This is called a cartesian product.
Have you tried to get a simple list of two related tables in PowerPivot? Did you get frustrated? I did. Imagine two tables ProductCategory and ProductSubcategory, and I simply want a list of each ProductCategory with all of the ProductSubcategories associated with it.
Add the tables in PowerPivot and relate them on the proper keys. Everything looks good so far. Add a pivot table in Excel and drag ProductCategory name, and then ProductSubcategory name into it.
Here is the simple list, all closed down.
We expand Accessories and see the list of subcategories.
Close Accessories and open Bikes.
It’s the same list. Every subcategory is associated with every category. This is not what we wanted.
This is called a cartesian product.
There are two ways to fix this.
The first way is to import these two tables as a single table, which requires joining them in SQL. There is a query editor for this:
1. As you import the tables, choose “Write a Query” as in the diagram below. Then choose Next.
2. Choose the Design button in the Table Import Wizard.
3. Choose the ProductCategory Name and ProductSubcategory Name columns. When the tables are related within the database, they will be joined automatically. If they are not related, you must de-select the “Auto Detect” button and connect the joining columns in the next screen. Then choose OK.
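If you prefer to write the join by hand rather than use the designer, it is a simple inner join. A sketch using AdventureWorks-style names (adjust the schema and column names to your database):

```sql
-- One row per subcategory, with its parent category's name
SELECT c.Name AS Category,
       s.Name AS Subcategory
FROM Production.ProductCategory AS c
JOIN Production.ProductSubcategory AS s
    ON s.ProductCategoryID = c.ProductCategoryID
ORDER BY c.Name, s.Name;
```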
Since all of the data is in a single table, when you drag the category and subcategory, you will see what you expect.
The other way is to drag any column from the child table (ProductSubcategory) or a measure from any table related to both Category and subcategory to the Values landing zone. I dragged the name column from ProductSubcategory.
Now we get the list we want.
Posted on April 16, 2012 06:51
In a previous blog post, we saw there can be a performance penalty when using text parameters in SSRS, and I promised to let you know if I found a workaround. Here is the story.
In order to test this on the AdventureWorks database, I copied AdventureWorks to another database. For the test, I decided to use the Sales.SalesOrderDetail table. I added a computed column which converts CarrierTrackingNumber from nvarchar(25) to varchar(25); we will use this column to filter on the Unicode parameters coming from SSRS. I called the new column SO. The column definition is below:
[SO] AS (CONVERT(varchar(25), CarrierTrackingNumber)) PERSISTED
SSRS treats queries with multi-select parameters differently when one value is chosen vs when multiple values are chosen. When one value is chosen, sp_executesql is used, as in the code below.
exec sp_executesql N'SELECT Sales.SalesOrderDetail.*
FROM Sales.SalesOrderDetail
WHERE SO in (@SOParm)',N'@SOParm nvarchar(12)',@SOParm=N'000A-434D-BC'
However when multiple values are selected, the query SSRS submits looks like the one below:
WHERE SO in (N'4911-403C-98',N'6431-4D57-83')
So any solution with a multi-select parameter must work in both of these cases, and it must pass values that are not Unicode. For the Unicode part, if we could just generate the string literals without the N prefix, that would tell SQL Server the values are not Unicode. So we need to put some calculation around the query, or at least around the parameters. I did not want to generate the entire query in report code because the SQL was about 800 lines long with lots of quoted string values in it, and embedding quoted strings inside another string is painful. That means a single parameter string must be passed to SQL Server, which must then parse it in a usable way. How can we pass a single string value down to a SQL query, but have it behave as if multiple values were passed, so the query can do something like: Select yada from table where SO in ('The string we pass')?
First we need to pass 1 or more parameter values as a single string. The parameter for this report was called @SOParm. In the data set, I used the JOIN function to create a single, comma delimited string with all of the parameter values. So now we have a single comma-delimited string which is not Unicode.
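The dataset parameter expression that does this looks along these lines - Join is a standard SSRS expression function, and SOParm is the report parameter from the post:

```
=Join(Parameters!SOParm.Value, ",")
```

This collapses however many values the user selects into one string, so SSRS always sends a single parameter whether one value or many are chosen.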
The resulting string looks like the last row on the right in the report below.
Now the question becomes: how can we turn a query like the one above (the sp_executesql one) into something that can parse the string @SOParm back into multiple values that can be used in a where clause? You can see below that the parameter is now a single comma-delimited string.
exec sp_executesql N'SELECT Sales.SalesOrderDetail.*
FROM Sales.SalesOrderDetail
WHERE SO in (@SOParm)',N'@SOParm nvarchar(25)',@SOParm=N'4911-403C-98,6431-4D57-83'
How can we transform this into something that still uses sp_executesql, but where the parameter values are treated as two values instead of a single string? Remember that SQL syntax allows
WHERE SO in (Select col from Anothertable)
This helps us because we can create a table-valued function in SQL that will take our string, parse it into individual values and present it back as a table. The code for this function follows:
CREATE FUNCTION [dbo].[ufn_CSVToTable]
( @StringInput NVARCHAR(4000), @delimiter NCHAR(1) )
RETURNS @OutputTable TABLE ( [col] VARCHAR(4000) )
AS
BEGIN
    DECLARE @String VARCHAR(4000)

    WHILE LEN(@StringInput) > 0
    BEGIN
        -- Take everything up to the next delimiter (or the rest of the string)
        SET @String = LEFT(@StringInput,
                           ISNULL(NULLIF(CHARINDEX(@delimiter, @StringInput) - 1, -1),
                                  LEN(@StringInput)))

        -- Drop the piece we just took, along with its delimiter
        SET @StringInput = SUBSTRING(@StringInput,
                                     ISNULL(NULLIF(CHARINDEX(@delimiter, @StringInput), 0),
                                            LEN(@StringInput)) + 1,
                                     LEN(@StringInput))

        INSERT INTO @OutputTable ( [col] )
        VALUES ( @String )
    END

    RETURN
END
This works fine, but notice it has a limit of 4000 characters for the string. You can test this function in SSMS with the following query.
SELECT * FROM Sales.SalesOrderDetail
WHERE SO in (Select col from [dbo].[ufn_CSVToTable]('4911-403C-98,6431-4D57-83', ','))
This works and still gives us an index seek with no nvarchar-to-varchar conversions. Performance is much better, especially for large tables, where the cost of a table scan is greatest.
Now we change the query in our report to:
WHERE SO in (Select col from [dbo].[ufn_CSVToTable](@SOParm, ','))
And the query that actually gets sent to sql server:
exec sp_executesql N'SELECT Sales.SalesOrderDetail.*
FROM Sales.SalesOrderDetail
WHERE SO in (Select col from [dbo].[ufn_CSVToTable](@SOParm, '',''))',N'@SOParm nvarchar(25)',@SOParm=N'4911-403C-98,6431-4D57-83'
That’s the workaround we used. I wish the SSRS developers at Microsoft would see fit to add a single-byte character set data type to SSRS, but maybe there is no .NET equivalent to that. I don’t know. Even though this is a pain, it certainly allowed our reports to run much, much faster than before.
Posted on April 15, 2012 06:51
Too many PowerPivot Connections?
Have you ever created a PowerPivot in Excel, then changed the data source, and things didn’t seem to work right? Maybe it acted like the data was coming from different servers, some from the old server, and some from the new server. It is very easy to get yourself into this situation, without even realizing it. Here is what you might do.
Go into PowerPivot, Select From Database, From SQL Server, add your servername and database name. Then click next.
Moving to the next screen, you select the main table you are interested in, in this case Product Category.
You click Finish and the table is imported. You mess with it some, then decide you want another table, this time ProductSubcategory. You import it the same way. Later still you decide you want to see all of the Products, and import the Product table. As you keep working on this PowerPivot, you end up with 15 or 20 tables, and have created yourself a really nice report. At some point in the future you wish to point the PowerPivot to another server, maybe from a dev server to a production server.
You click on the Design tab, and choose Existing Connections. You see something like the picture below, except you have 15 connections. Every time you added a table, you created a new connection – even though the new connection points to the same place as the old one. To point this PowerPivot to another server, you must edit each one of the connections.
I am not sure that the PowerPivot add-in for Excel will allow you to change the data source connection for a table, so this might not be fixable without starting all over again.
How should this have been done?
If you select all of the tables you need right at the start, they will share one connection. But it is unlikely you will always know in advance which tables you need, and you will likely add to your PowerPivot over time.
Assuming you wish to add additional tables, using the same connection, follow these steps.
Instead of using From Database on the Home tab, go to the Design tab and choose Existing Connections. Don’t feel bad – the GUI seems to lead us in the wrong direction.
You should see a single connection, the one you created when you imported your first table – Select that connection.
Then choose Open. The name on the button didn’t make sense to me, but that is what you do: select your connection and click Open.
Now you will see the Table Import Wizard – this is the same place you were the first time around. Click next, choose the additional tables and move on.
Now, when you wish to add new tables from the same source database, you know how to share a single connection. Should you ever need to change your data source – it just got easier.
Posted on March 25, 2012 18:26
Recently, I got a call complaining about poor query performance in an SSRS Report. The performance problem was caused by a data type conversion issue. This is what I figured out.
SSRS text parameters have a data type of Unicode – two bytes per character. Many of the databases I deal with use a single-byte character set for text fields. The SQL data types for Unicode are nvarchar and nchar; the single-byte character set data types are varchar and char. Although SQL Server automatically converts between the two, your query plan and performance may suffer.
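The root cause is SQL Server’s data type precedence: nvarchar outranks varchar, so when the two are compared, it is the varchar column side that gets converted. A minimal illustration (table, column, and index here are hypothetical):

```sql
-- Business_Unit is varchar and indexed.
-- The N'...' literal is nvarchar; the column is implicitly converted for the
-- comparison, which (depending on the collation) can prevent an index seek.
SELECT * FROM dbo.Ledger WHERE Business_Unit = N'10192';

-- A plain string literal is varchar - no conversion, and the index
-- can be used for a straightforward seek.
SELECT * FROM dbo.Ledger WHERE Business_Unit = '10192';
```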
The report had a parameter called Business Unit which was a text parameter. The parameter values were populated from a SQL column which is varchar data type. I multi-selected 5 values and ran the query, capturing the actual query using SQL Profiler. I made two versions of the query, one with the parameter values not Unicode (Query 1), and the other was the original query sent from the report using Unicode(Query 2).
Query 1 (non-Unicode):

SELECT *
FROM [Fin_Internal].[Ledger_AvB]
WHERE Business_Unit in ('10192','10195','10191','10137','10656')

Query 2 (Unicode):

SELECT *
FROM [Fin_Internal].[Ledger_AvB]
WHERE Business_Unit in (N'10192',N'10195',N'10191',N'10137',N'10656')
The Ledger_AvB table used in this example has 14 million rows, and the query is VERY selective – returning only about 20 rows.
After running both queries in SSMS, let’s look at the results. You can see below that Query 1(Non-Unicode) had 7% cost and Query 2(Unicode) was 97%. Remember the data is stored using single byte character set, so the Unicode passed in requires an extra conversion.
The non-unicode version does an index seek, while Unicode executes as a table scan. In some cases, the performance of the report can vary widely. Notice the estimated cost below for Query 1 is 80 and Query 2 is 1019. The inability of SSRS to provide single byte character set parameters causes this problem.
If SSRS simply allowed a check box for text parameters so one could select Unicode or not – the query generated could be the appropriate version of above, yielding much better performance for a large portion of SSRS users, while not penalizing any international Unicode users.
Query 1 (Non-Unicode)
I created a Connect item for this on 3/22/2012 (#732626). It was closed as a duplicate of 543243, which was closed as “By Design”. The Connect response is below.
Thanks for your feedback. We're resolving this item as a duplicate of an existing one: http://connect.microsoft.com/SQLServer/feedback/details/543243/major-performance-problem-with-parameterized-queries-on-non-unicode-databases. Note that it was resolved as By Design, so you may want to submit a Suggestion for the feature.
I will work on this some more and then post what we did to improve performance.
Posted on May 26, 2011 10:25
After a recent SSRS upgrade from SQL 2008 to SQL 2008 R2 Gold (no patches), we began to get complaints from users that reports were running slowly and Internet Explorer was hanging when certain reports were executed.
We ran some of our standard SSRS reports, and there were no unusual report failures. We looked at the SSRS logs and there was nothing which could explain the problem either.
We could reproduce the problem by running one of our SSRS reports – it seemed to run forever, when normally it would run in less than 10 seconds.
It seemed that every time we touched the report – things would hang up. While the report was running, sometimes your laptop would go unresponsive as well. If you managed to click on any other tab in IE, the title bar for IE would include (Internet Explorer Unresponsive), and the entire IE window would grey out.
Running the same report, with the same data, in Report Builder 3 was fine.
We ran Fiddler to see what we might learn by following the network traffic. The times between send and receives were pretty fast. Summing up the times accounted for only a small portion of the 1:40 that it took to run the report. However we did notice a pattern – there was a large wait on the client side. Fiddler looked like:
Send -> receive
Send -> receive
   (long client-side wait)
Send -> receive
Send -> receive
The sends come from the IE client, and the receive is the response from the SSRS report server. The client would send a request, get a quick response, then sometimes just wait. Since the unexplained time was after a receive and before the next send, we knew the client was the problem. If the client would simply do the next send – all would be good.
Running Task Manager on the client during the report run, we would see one or two processors using nearly 100% during the wait, then returning to normal. Below is Task Manager during the unresponsive period.
After a while we realized that the problem occurs when the parameters refresh. It took 1:40 to refresh the parameter list. There were several parameters, and simply choosing a different item from the parameter list would send us into this 1:40 wait.
I pulled out the query which generates the parameter lists, and ran it in Management Studio – and it is fast, fast, fast.
Running the report in the development environment also did not exhibit the same behavior, even though it was on the same version of SSRS as the problem prod environment. There were lots of reports in prod that did not have this problem. We typically run 25,000 to 35,000 reports per day in production. Most were fine, but several reports, only in production, exhibited this behavior.
Continuing to work, we discovered that the number of values in the parameter lists of the troubled prod reports was between 8K and 11K. The same report in dev, which did not exhibit the problem, had much shorter parameter lists. Changing the data source on the dev report to point to the production database, the dev report took forever to refresh, just like prod. Now we knew the problem was related to having a large number of values in the parameter list. Checking other reports with the same problem, they all had large parameter lists.
Microsoft has KB articles about this problem (http://support.microsoft.com/kb/2522708 and http://support.microsoft.com/kb/2506799/LN). It is fixed in CU7 for SQL Server 2008 R2. This CU is not included in SP1, which is in beta right now; however, I assume it will be included in SP2 and later, when they come out.
The KB article describes the problem as applying to “The report contains a large multi-select drop-down parameter list.” Our reports with single-select drop-downs also showed this problem, so it does not apply ONLY to multi-select.
Another thing to note is that the initial load of the parameters when the report first comes up in IE is fine. However when you change one of the parameter values, and the parameters refresh – you wait, wait, wait.
How do I know if I have this problem?
- Report was created in an earlier version of SSRS than SQL 2008 R2
- Initial report load and execution time are the same as pre-upgrade
- Parameter refresh takes a long time, during which CPU utilization goes near 100%
- You are running SQL Server 2008 R2 pre-CU7
We installed CU 7 and the 1:40 refresh time dropped to about 10 seconds, and processor utilization dropped – problem fixed. CU7 can be found at http://support.microsoft.com/kb/2489376.
Warning: I believe that installing SQL 2008 R2 CU7 breaks intellisense in Visual Studio 10. If so, there is a patch which fixes this also.
Posted on May 26, 2011 08:55
We recently upgraded SQL Server Reporting Services for a large SSRS user from SQL 2008 to SQL 2008 R2 Gold (no SQL 2008 R2 patches).
Immediately some reports began to fail with a System.OverflowException. Sometimes the report would run correctly. Other times the report would fail. The implementation was 4 load balanced servers on the front end. Failures were happening on all of the servers, so the problem wasn’t specific to a particular server.
The error that customers would see in SSRS is in the screen shot below:
Looking at the SSRS Error Logs, we found errors like these:
processing!WindowsService_0!165c!05/24/2011-06:00:18:: e ERROR: An exception has occurred in data set 'OutageDataSet'. Details: System.OverflowException: Value was either too large or too small for an Int32.
processing!WindowsService_0!165c!05/24/2011-06:00:18:: e ERROR: Throwing Microsoft.ReportingServices.ReportProcessing.ProcessingAbortedException: , Microsoft.ReportingServices.ReportProcessing.ProcessingAbortedException: An error has occurred during report processing. ---> System.OverflowException: Value was either too large or too small for an Int32.
It was always System.OverflowException: Value was either too large or too small for an int32.
As it turns out, this is a known bug, documented in the KB article (http://support.microsoft.com/kb/2359606).
There is a workaround in the article which tells us to convert a grouping field from int to a double or long, using the CDbl or CLng functions. However, if you do this, you’ll have to figure out which item needs to be changed. The error message will give you the dataset, but not the column, nor the item within the report that exhibits the problem.
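The workaround amounts to wrapping the grouping expression in a conversion, along these lines (the field name here is hypothetical):

```
=CDbl(Fields!OutageCount.Value)
```

CDbl converts the value to a Double, so it no longer overflows Int32; CLng (which converts to a 64-bit Long) works similarly.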
This is fixed in Cumulative Update 4(http://support.microsoft.com/kb/2345451) , which you can download and install immediately. This fix is also included in SQL 2008 R2 SP1, which at the time of this writing, is still in beta.
Posted on January 27, 2011 13:58
One of my friends and co-workers has been working on an SSRS in SharePoint Integrated mode problem. Melissa Coates has an excellent blog post which details the issues. Derek Sanderson also has a blog post about this issue. The short version is that images in reports in SharePoint Integrated Mode can cause the report to run much more slowly than we thought. Think about a scorecard with the small images indicating good, bad, growing, slowing, etc. Each one of those images is a separate HTTP GET – another round trip. In her example, Melissa has a dashboard-type report which runs in 0.5 seconds in native mode, and about 11 seconds in integrated mode.
Do some testing to ensure there are no surprises in your SSRS SharePoint Integrated Mode installation.
To Melissa and others on her team who worked on this - great job!
We are going to follow up with more testing around this.... and we'll post the results.
Posted on September 29, 2010 07:35
This morning I got an email from Craig Ellis with a picture of us at PASS (probably 2008 or 2009). It was titled "Me and the Prez". Craig is one of the many folks whom I have had the pleasure to know at PASS. It reminds me that PASS has been an anchor for me! PASS has been the source of many good friends. My career has been made better because of the things I learn at PASS – in fact, I truly believe that my life has been improved as a result. Exciting things are happening at PASS – stay tuned.
PASS is an organization which is composed of SQL people who simply wish for us all to be better, to do better, and to live better. While there have always been folks who would disparage PASS, I try not to let it bother me. Of course - I could be better. Of course - PASS could be better - duh! However, there is no vast conspiracy, not even a small conspiracy, no evil empire, no secret society - it just ain't true - none of it! Never has been!
PASS is just you and me, trying to do as good as we can in a busy world!
I'm just a SQL guy – a country boy from Charlotte, NC. Most of the folks I have come in contact with are just the same – folks simply trying to do their job, trying to do good for themselves, their company, and their family. They volunteer for PASS because it is good for all of us – helping someone else is the best way to help yourself. To all of you who have volunteered for PASS, led a chapter, helped with a committee (even the program committee which did not accept my session this year) – I say: Thanks! Thanks for all of us. And thanks from me.
The PASS conference is often the best week of the year for me! I get to learn stuff, and visit with all of my friends. It is always a great time! See you the first week of November!
"Me and the Prez" - just makes me feel a little bit sentimental!
Posted on September 28, 2010 08:07
I recently read a research report from ITIC which studied the number of security flaws reported against different databases since 2002. ITIC is Information Technology Intelligence Consulting, an independent research and consulting firm based in Boston that covers high technology.
The research states that, since 2002, Microsoft SQL Server has had the fewest reported vulnerabilities of any major database platform – only 49. These statistics were reported from NIST (the National Institute of Standards and Technology), a government monitoring agency.
Oracle has reported a whopping 321 flaws, more than 6 times that of SQL Server. So when someone talks about how great Oracle is compared with SQL Server, or complains about security patches from Microsoft, this research may come in handy. I guess this is the result of the Trustworthy Computing initiative which began in 2002. You may recall the SQL Slammer worm which really messed us all up. Afterwards, the SQL Server group stopped all new development and spent months going through existing code with the purpose of making it safer and more secure. Perhaps this is the payoff – good job SQL TEAM!
Security Vulnerabilities since 2002
The entire report can be seen at http://itic-corp.com/blog/2010/09/sql-server-most-secure-database-oracle-least-secure-database-since-2002/
Posted on September 27, 2010 14:24
Right now I am looking at rolling out SQL Server 2008 R2 SSRS in a large enterprise and thinking about the developers' desktop software and how this will impact BIDS and Visual Studio. Since SQL Server tools are shared across instances of the same version, AND since SQL 2008 and SQL 2008 R2 are the same version (sort of), they share the same tools – like BIDS. That means you can have only SQL 2008 BIDS OR SQL 2008 R2 BIDS on your machine – not both. So how will that affect us in the enterprise?
BIDS can deploy SQL 2008 and R2 reports.
BIDS can preview SQL 2008 and R2 reports.
BIDS will attempt to upgrade SQL 2005 reports when they are imported.
The functionality of SSRS has changed between 2008 and R2. For instance, maps, sparklines, and data bars have been added, and the ability to deploy and share datasets and report parts is supported. BIDS for R2 needs to be able to deploy reports to both 2008 and R2. This is done by adding information to the report project configuration: TargetServerVersion tells BIDS whether your SSRS version is SQL 2008 or R2, and location information for shared datasets and report parts is also added to the project file (.rptproj).
Here is an example of a part of the project file for SQL 2008 R2 SSRS project. I have bolded some of the new stuff in the project file.
<?xml version="1.0" encoding="utf-8"?>
<Project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" ToolsVersion="2.0">
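The snippet above is truncated here. The configuration elements being discussed look roughly like the sketch below – the element values are illustrative examples, not the original file:

```xml
<!-- Illustrative sketch of an R2-era configuration block; values are examples -->
<Configuration>
  <Name>Debug</Name>
  <Options>
    <OutputPath>bin\Debug</OutputPath>
    <TargetServerVersion>SSRS2008R2</TargetServerVersion>
    <TargetServerURL>http://myserver/reportserver</TargetServerURL>
    <TargetDatasetFolder>Datasets</TargetDatasetFolder>
    <TargetReportPartFolder>Report Parts</TargetReportPartFolder>
    <ErrorLevel>2</ErrorLevel>
  </Options>
</Configuration>
```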
Technet says (http://technet.microsoft.com/en-us/library/ee635898.aspx) and below:
If you save a SQL Server 2008 Report Server project in the SQL Server 2008 R2 version Business Intelligence Development Studio you can no longer open it in the SQL Server 2008 version of Business Intelligence Development Studio.
I did some testing on this. It turns out that SQL 2008 BIDS allows you to open the R2 project, and simply ignores the tags it does not understand. You can save an existing report and be fine. But if you do anything which causes a change in the project file, like adding a new report or changing the configuration parameters, a save will remove the R2-specific project items. You can still open the project in R2, and you will be asked if you wish to upgrade. Then you will have to re-do the configuration information.
Reports can be deployed to SQL 2008 or R2, but unsupported features will be stripped away.
BIDS for R2 can deploy reports for SQL 2008 and R2, but not SQL 2005. You can set up multiple configurations, as before, and for each configuration specify the target environment. If you put R2-only features into a report and attempt to deploy it to SQL 2008, those features will be stripped out. So, for instance, a map will be removed from a report when you try to deploy to SQL 2008. This happens during the build phase of deployment, and a result, ErrorLevel, is returned. Valid values are below. Note they go from most to least severe.
4 - Most severe and unavoidable build issues that prevent preview and deployment of reports.
3 - Severe build issues that change the report layout drastically.
2 - Less severe build issues that change report layout insignificantly.
1 - Minor build issues that change the report layout in minor ways that might not be noticeable.
0 - Used only for publishing warnings.
You can set the ErrorLevel in a project configuration, as below:
Any issue with a value <= the ErrorLevel in the configuration is considered an error and will kill the deployment. Otherwise, it is considered a warning and the build continues. 2 seems to be the default.
This project indicates the target to be R2. We could change this to be SQL 2008. Changing it does NOT change the RDL files for the reports. This conversion will occur for the deployment, and not the source.
I created a simple report with a map and an image, using R2 as the target. After the build, both the source and the OutputPath were the same. Then I changed the target to SQL 2008, and did another build. The error below was returned.
Error 1 The map, Map1, was detected in the report. SQL Server 2008 Reporting Services does not support map report items. Either reduce the error level to a value less than 2 or change the ReportServerTargetVersion to a version that supports map report items. C:\Documents and Settings\wsnyder\My Documents\Visual Studio 2008\Projects\TiffTest\TiffTest\Report1.rdl 0 0
After changing the ErrorLevel in the project to 0, I did another build. The source RDL was unchanged, but the copy in OutputPath had the Map stripped from it.
Posted on September 24, 2010 10:16
It feels strange that my first blog post will not be technical. Technical will come soon. I work for a great company, Mariner, based in Charlotte, NC. Mariner is an ALL BI - ALL THE TIME company where you can be challenged with good work, and encouraged and expected to keep up to date with all the latest MS-stack Business Intelligence practices and software.
Mariner is looking for people with excellent SSIS, SSAS, SSRS, and SQL skills to join the team that I work in. We are getting new projects and wish to add some folks. The team you will work with are motivated, smart, hard-working people who enjoy their jobs and their lives. If you are really good, we'd love you to interview to join us.
Mariner is a small company (30-ish), but well respected and specialized. Our home is in Charlotte, NC. We do what we say - both for our customers and co-workers. If you want to be the best, then come work with the best. We are looking for people right now!! http://www.mariner-usa.com/jobs/