09

Have you ever got the following warning when creating or modifying a table in SQL?

Warning: The table "SomeTable" has been created, but its maximum row size exceeds the allowed maximum of 8060 bytes. INSERT or UPDATE to this table will fail if the resulting row exceeds the size limit.

Are you wondering how close to the limit you really are for the set of data the table contains?  If so this information will help determine that answer.

The records on disk are actually stored in 8K byte pages.  The layout of the record in the page is split into 8 parts as shown in the following table.

Name

Size

Description

Status Bits

2 bytes

Two status bits each stored in one byte.

Fixed Length

2 Byte

The length of the fixed portion of the record.

Fixed Length Data

N bytes

The actual data for the fixed length columns that are defined in the table.

Column Count

2 bytes

The number of columns in the table.

Null Bitmap

N bytes, 1 byte for each set of 8 columns, NULL bit for each column.

 

Variable Length Column Count

2 bytes

Number of variable length columns.

Variable Length End Position

2 bytes for each variable length column.

The ending position of the variable length data associated with the variable length column.  When the variable being stored is off page the upper most bit will be set and is not really part of the end position.  It means the data that is on page is a pointer to the actual data off the page.

Variable Length Data

N Bytes

The actual data stored in the variable length column.

 

So what are the data types that have fixed data lengths?

Data type

Size

Bit

1 byte (optimized by SQL 8 columns of bit data type will be put in 1 byte)

Bigint

8

Int

4

Smallint

2

Tinyint

1

Numeric (p,1-9)

5

Numeric(p,10-19

9

Numeric(p,20-28)

13

Numeric(p,29-38)

17

Decimal (p,1-9)

5

Decimal(p,10-19

9

Decimal(p,20-28)

13

Decimal(p,29-38)

17

Smallmoney

4

Money

8

Real

4

Float(1-24)

4

Float(25-53)

8

Date

3

DateTime2(0-2)

6

DateTime2(3-4)

7

DateTime2(5-7)

8

DateTime

8

SmallDateTime

4

Time

5

Datetimeoffset

10

Char(n)

N

Nchar(n)

2*n

Binary(n)

N

Timestamp

8

Uniqueidentifier

16

 

The following data types that have variable lengths may be stored on or off page depending on the available variable space left on the page.

Data type

Size

Varchar(n)

N (null and empty string take no space except for end position and bit in null bitmap)

Nvarchar(n)

2*n (null and empty strings take no space except for end position and bit in null bitmap)

Image

Image fields may or may not be stored in the page depending on the ‘text in row’ option of the table.  If the text is off page, a 16 bytes pointer to the text field is stored in variable part of the record.  Null fields take no space on page except for end   position and bit in null bitmap.

Text (Deprecated so should not use)

Text fields may or may not be stored in the page depending on the ‘text in row’ option of the table.  If the text is off page, a 16 bytes pointer to the text field is stored in variable part of the record.  Null fields take no space on page.    1 byte per character.

NText (Deprecated so should not use)

Text fields may or may not be stored in the page depending on the ‘text in row’ option of the table.  If the text is off page, a 16 bytes pointer to the text field is stored in variable part of the record.  Null fields take no space on page.  2 byte per character.

VarChar(max)

Fields that fit on the page will take up 1 byte per character plus 2 bytes for end position and 1 bit of the null bitmap.  If the field will not fit on the page then a 36 bytes pointer is stored on page. (null and empty string values take no space except end position and 1 bit of null bitmap)

Nvarchar(max)

Fields that fit on the page will take up 2 byte per character plus 2 bytes for end position and 1 bit of the null bitmap.  If the field will not fit on the page then a 36 bytes pointer is stored on page. (null and empty string values take no space except end position and 1 bit of null bitmap)

VarBinary(n)

N bytes plus 2 bytes for end position and 1 bit in null bitmap. (null and empty string values take no space except end position and 1 bit of null bitmap)

VarBinary(max)

Fields that fit on the page will take up 1 byte per character plus 2 bytes for end position and 1 bit of the null bitmap.  If the field will not fit on the page then a 36 bytes pointer is stored on page. (null and empty string values take no space except end position and 1 bit of null bitmap)

XML

Fields that fit on the page will be place on the page.  If the field will not fit on the page then a 36 byte pointer is stored on page.  The xml itself is compressed into a parsed data tree so the exact length is hard to determine.  (null and empty string values take no space except end position and 1 bit of null bitmap)

 

Let’s see this in action by using DBCC to dump the pages of a table.  First create a table.

USE [tempdb]

GO

 

 

CREATE TABLE [dbo].[Table1](

      [PK] [bigint] NOT NULL,

      [T2] [varchar](50) NULL,

      [T3] [varchar](max) NULL,

      [T4] [varchar](max) NULL,

      [X1] [xml] NULL,

 CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED

(

      [PK] ASC

)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]

) ON [PRIMARY]

 

GO

Now we will insert a record.

INSERT INTO dbo.Table1

      VALUES(

            1

            ,'Field2'

            ,null

            ,CONVERT(varchar(max),Replicate('X',8000)) + CONVERT(varchar(max),Replicate('X',60))

            ,'<root>xml value</root>')

Let’s analyze the data that is actually on disk to do that we need to use DBCC.  First we need to get the file number and page number for the table.


DBCC IND('tempdb','dbo.Table1',-1)

PageFID

PagePID

IAMFID

IAMPID

ObjectID

IndexID

PartitionNumber

PartitionID

iam_chain_type

PageType

IndexLevel

NextPageFID

NextPagePID

PrevPageFID

PrevPagePID

1

171

NULL

NULL

549576996

1

1

2449958198642081792

In-row data

10

NULL

0

0

0

0

1

166

1

171

549576996

1

1

2449958198642081792

In-row data

1

0

0

0

0

0

1

164

NULL

NULL

549576996

1

1

2449958198642081792

LOB data

10

NULL

0

0

0

0

1

163

1

164

549576996

1

1

2449958198642081792

LOB data

3

0

0

0

0

0

1

165

1

164

549576996

1

1

2449958198642081792

LOB data

3

0

0

0

0

0


Now we know the FID and PID for the page that has the row data in it which is the page with a page type of 1.  Use these values in a DBCC PAG command.

DBCC TRACEON(3604) -- DBCC PAGE to client

DBCC PAGE('tempdb',1/* file num*/,166 /*page num*/,1)

 

DBCC execution completed. If DBCC printed error messages, contact your system administrator.

 

PAGE: (1:166)

 

 

BUFFER:

 

 

BUF @0x0397FE20

 

bpage = 0x05000000                   bhash = 0x00000000                   bpageno = (1:166)

bdbid = 2                            breferences = 0                      bUse1 = 51000

bstat = 0x1c0000b                    blog = 0x1bbbbbbb                    bnext = 0x00000000

 

PAGE HEADER:

 

 

Page @0x05000000

 

m_pageId = (1:166)                   m_headerVersion = 1                  m_type = 1

m_typeFlagBits = 0x4                 m_level = 0                          m_flagBits = 0xc000

m_objId (AllocUnitId.idObj) = 32527  m_indexId (AllocUnitId.idInd) = 13568

Metadata: AllocUnitId = 3819052486141870080                              

Metadata: PartitionId = 2449958198642081792                               Metadata: IndexId = 1

Metadata: ObjectId = 549576996       m_prevPage = (0:0)                   m_nextPage = (0:0)

pminlen = 12                         m_slotCnt = 1                        m_freeCnt = 7985

m_freeData = 205                     m_reservedCnt = 0                    m_lsn = (41:274:130)

m_xactReserved = 0                   m_xdesId = (0:0)                     m_ghostRecCnt = 0

m_tornBits = 0                      

 

Allocation Status

 

GAM (1:2) = ALLOCATED                SGAM (1:3) = ALLOCATED              

PFS (1:1) = 0x60 MIXED_EXT ALLOCATED   0_PCT_FULL                         DIFF (1:6) = CHANGED

ML (1:7) = NOT MIN_LOGGED           

 

DATA:

 

 

Slot 0, Offset 0x60, Length 109, DumpStyle BYTE

 

Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP VARIABLE_COLUMNS

Record Size = 109                   

Memory Dump @0x6388C060

 

00000000:   30000c00 01000000 00000000 05000404 †0...............        

00000010:   001f001f 0043806d 00466965 6c643204 †.....C.m.Field2.        

00000020:   00000201 00000086 65000068 1f0000a5 †.......†e..h...¥        

00000030:   00000001 0000007c 1f0000a3 00000001 †.......|...£....        

00000040:   000000df ff01b004 f0047200 6f006f00 †...ßÿ.°.ð.r.o.o.        

00000050:   7400ef00 0001f801 11097800 6d006c00 †t.ï...ø..    x.m.l.        

00000060:   20007600 61006c00 75006500 f7†††††††† .v.a.l.u.e.÷           

 

OFFSET TABLE:

 

Row - Offset                        

0 (0x0) - 96 (0x60)                 

 

 

DBCC execution completed. If DBCC printed error messages, contact your system administrator.

Now we can analyze the Memory Dump section:

 

00000000:   30000c00 01000000 00000000 05000404 †0...............        

00000010:   001f001f 0043806d 00466965 6c643204 †.....C.m.Field2.        

00000020:   00000201 00000086 65000068 1f0000a5 †.......†e..h...¥        

00000030:   00000001 0000007c 1f0000a3 00000001 †.......|...£....        

00000040:   000000df ff01b004 f0047200 6f006f00 †...ßÿ.°.ð.r.o.o.        

00000050:   7400ef00 0001f801 11097800 6d006c00 †t.ï...ø..    x.m.l.        

00000060:   20007600 61006c00 75006500 f7†††††††† .v.a.l.u.e.÷           

 

Name

Data

Notes

Status Bits

3000

Two status bits each stored in one byte.

Fixed Length

0c00

The length of the fixed portion of the record.

Fixed Length Data

01000000 00000000

The actual data for the fixed length columns that are defined in the table.

Column Count

0500

The number of columns in the table.

Null Bitmap

04

 

Variable Length Column Count

0400

Number of variable length columns.

Variable Length End Position

1f001f 0043806d 00

The ending position of the variable length data associated with the variable length column.  When the variable being stored is off page the upper most bit will be set and is not really part of the end position.  It means the data that is on page is a pointer to the actual data off the page.

Variable Length Data

N Bytes

The actual data stored in the variable length column.

T2 Data (Field2)

466965 6c6432

The varchar data can be stored in the data record so there is the actual value.

T3

Notice that there is no data in the variable part of the record.  If you look at the variable length end positions you see that the first two end positions are the same so either the string is null or empty to determine if it is null you have to look at the null bitmap field and see that 3 bit is one for T3 so it is null.

T4

0400000201 00000086 65000068 1f0000a500000001 0000007c 1f0000a3 00000001 000000

Contains more data that cannot be stored in the record so variable part is a pointer to another page.  The first bit of the end positions is on for this field.

X1

df ff01b004 f0047200 6f006f00 7400ef00 0001f801 11097800 6d006c00 20007600 61006c00 75006500 f7

The XML tree can fit on the page so it is stored in the variable length data.

 

Armed with this information you can get a better feel for how close you are to exceeding the SQL limit for row size.  For instance if your record has a large number of variable length fields (nvarchar, varchar, varbinary…)  that are null or empty then you probably are good because they do not take any space other than the bit in the null bitmap and 2 bytes in the end positions but if you have a large number of varchar(max) fields with data that does not fit on the page then each of those fields take 36 bytes of data for pointer and 2 more bytes for the end position so you can run out of space not because of the actual value but not enough room  on the page for the pointers.  There are many other permutation that could cause you to fill up the record so that is why the warning is just a warning until SQL has a chance to analyze the actual data that need to be stored it cannot determine the true size of the record.

Are there any tools to find out what is actually being used?  The data management view sys.dm_db_index_physical_stats will tell you the maximum record size of the records on different types of pages.  If your table is clustered then just look at the level 0 IN_ROW_DATA record in the view to determine the maximum row size in the table.

SELECT * FROM sys.dm_db_index_physical_stats

    (DB_ID(N'tempdb'), OBJECT_ID(N'dbo.table1'), NULL, NULL , 'DETAILED');

database_id

object_id

index_id

partition_number

index_type_desc

alloc_unit_type_desc

index_depth

index_level

avg_fragmentation_in_percent

fragment_count

avg_fragment_size_in_pages

page_count

avg_page_space_used_in_percent

record_count

ghost_record_count

version_ghost_record_count

min_record_size_in_bytes

max_record_size_in_bytes

avg_record_size_in_bytes

forwarded_record_count

compressed_page_count

2

709577566

1

1

CLUSTERED INDEX

IN_ROW_DATA

1

0

0

1

1

1

1.34667655053126

1

0

0

109

109

109

NULL

0

2

709577566

1

1

CLUSTERED INDEX

LOB_DATA

1

0

0

NULL

NULL

2

49.9629355077835

2

0

0

34

8054

4044

NULL

NULL

 

Hopefully now armed with this information you can feel better about how much space you have in your table.

powered by metaPost
13

Applies to SSAS 2008 and Excel 2007.

Also see this blog entry: OLAP Reporting with Excel 2007: Use Member Properties!! at http://blog.davyknuysen.be/2009/08/03/olap-reporting-with-excel-2007-use-member-properties/

As you add columns from a SSAS cube to pivot table in Excel it will progressively get slower to the point that it is virtually unresponsive. In order to understand exactly why this occurs we need to look at how Excel builds its MDX. Let’s take a look at a simple pivot table against AdventureWorks.

clip_image002

So what is the underlying MDX that is executed to get this result?

SELECT NON EMPTY

       CrossJoin(

              CrossJoin(

                     Hierarchize({DrilldownLevel({[Product].[Product].[All Products]})})

                     , Hierarchize({DrilldownLevel({[Product].[Color].[All Products]})}))

                     , Hierarchize({DrilldownLevel({[Product].[Size].[All Products]})}))

                     DIMENSION PROPERTIES PARENT_UNIQUE_NAME

                     ,[Product].[Product].[Product].[Class]

                     ,[Product].[Product].[Product].[Color]

                     ,[Product].[Product].[Product].[Days to Manufacture]

                     ,[Product].[Product].[Product].[Dealer Price]

                     ,[Product].[Product].[Product].[End Date]

                     ,[Product].[Product].[Product].[Large Photo]

                     ,[Product].[Product].[Product].[List Price]

                     ,[Product].[Product].[Product].[Model Name]

                     ,[Product].[Product].[Product].[Reorder Point]

                     ,[Product].[Product].[Product].[Safety Stock Level]

                     ,[Product].[Product].[Product].[Size]

                     ,[Product].[Product].[Product].[Size Range]

                     ,[Product].[Product].[Product].[Standard Cost]

                     ,[Product].[Product].[Product].[Start Date]

                     ,[Product].[Product].[Product].[Status]

                     ,[Product].[Product].[Product].[Style]

                     ,[Product].[Product].[Product].[Subcategory]

                     ,[Product].[Product].[Product].[Weight] ON COLUMNS 

FROM [Adventure Works] WHERE ([Measures].[Order Count])

CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS

Notice that there are two CrossJoin function calls in the MDX which build a set of 3 Hierarchize function call for each of the selected dimension attributes. The CrossJoin function performs a cross product of two set and the sets are the results of Hierarchize function which organizes the members of the set in hierarchical order. As you can see the number of member of the set returned can grow exponentially as more and more columns are added to the pivot table. As you add a column to the excel spreadsheet Excel will add another CrossJoin and Hierarchize function the MDX. For example the adding the Size Range attribute leads to the following query:

SELECT NON EMPTY

       CrossJoin(

              CrossJoin(

                     CrossJoin(

                           Hierarchize({DrilldownLevel({[Product].[Product].[All Products]})})

                           , Hierarchize({DrilldownLevel({[Product].[Color].[All Products]})}))

                     , Hierarchize({DrilldownLevel({[Product].[Size].[All Products]})}))

              , Hierarchize({DrilldownLevel({[Product].[Size Range].[All Products]})}))

              DIMENSION PROPERTIES PARENT_UNIQUE_NAME

       ,[Product].[Product].[Product].[Class]

       ,[Product].[Product].[Product].[Color]

       ,[Product].[Product].[Product].[Days to Manufacture]

       ,[Product].[Product].[Product].[Dealer Price]

       ,[Product].[Product].[Product].[End Date]

       ,[Product].[Product].[Product].[Large Photo]

       ,[Product].[Product].[Product].[List Price]

       ,[Product].[Product].[Product].[Model Name]

       ,[Product].[Product].[Product].[Reorder Point]

       ,[Product].[Product].[Product].[Safety Stock Level]

       ,[Product].[Product].[Product].[Size],[Product].[Product].[Product].[Size Range]

       ,[Product].[Product].[Product].[Standard Cost]

       ,[Product].[Product].[Product].[Start Date]

       ,[Product].[Product].[Product].[Status]

       ,[Product].[Product].[Product].[Style]

       ,[Product].[Product].[Product].[Subcategory]

       ,[Product].[Product].[Product].[Weight] ON COLUMNS 

FROM [Adventure Works] WHERE ([Measures].[Order Count])

CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS

So you can see all these cross products quickly get out of hand and performance suffers. Since we cannot control how Excel builds the MDX how can we help this.

There are two options that I am going to bring up here. First we could just add a unnatural hierarchy to the dimension which consist of Product, Color, Size and Size Range and this will materialize the cross join in the cube so that performance will be much better but this is really impractical since there is virtually unlimited number of combinations that would have to be handled. Another approach is to use another less well know feature of Excel and that is Show Properties in Report.

If you notice in our example [Product].[Product] is the primary key of the dimension so the measure is unaffected by the additional columns in pivot table. A product can only have 1 color, size and size range. We are at the grain of the dimension once we have put Product in the pivot table.

In order to have Excel not CrossJoin the additional attributes we do the following:

1. Remove the Color and Size attribute from the pivot table.

2. Select them to show in the report using the Show Properties in Report field menu item.

clip_image004

Now we have the same data in Excel and at virtual no extra cost we can add any number of additional attributes to the pivot table.

clip_image006

The MDX generated looks like this:

SELECT NON EMPTY

       Hierarchize({DrilldownLevel({[Product].[Product].[All Products]})})

              DIMENSION PROPERTIES PARENT_UNIQUE_NAME

       ,[Product].[Product].[Product].[Class]

       ,[Product].[Product].[Product].[Color]

       ,[Product].[Product].[Product].[Days to Manufacture]

       ,[Product].[Product].[Product].[Dealer Price]

       ,[Product].[Product].[Product].[End Date]

       ,[Product].[Product].[Product].[Large Photo]

       ,[Product].[Product].[Product].[List Price]

       ,[Product].[Product].[Product].[Model Name]

       ,[Product].[Product].[Product].[Reorder Point]

       ,[Product].[Product].[Product].[Safety Stock Level]

       ,[Product].[Product].[Product].[Size],[Product].[Product].[Product].[Size Range]

       ,[Product].[Product].[Product].[Standard Cost]

       ,[Product].[Product].[Product].[Start Date]

       ,[Product].[Product].[Product].[Status]

       ,[Product].[Product].[Product].[Style]

       ,[Product].[Product].[Product].[Subcategory]

       ,[Product].[Product].[Product].[Weight] ON COLUMNS 

FROM [Adventure Works] WHERE ([Measures].[Order Count])

CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, BACK_COLOR, FORE_COLOR, FONT_FLAGS

Notice there is no longer any CrossJoin at all and we have virtual the same data in Excel.

If you want to control the order of the properties that appear in Excel you can use the Field Properties tool in the pivot table ribbon.

clip_image008

You can really see the difference in performance if you just put all the dimension attributes as columns in the pivot table and compare its performance with the performance when you Show all the Properties in the Report.

Conclusion

Using Show Properties in Report can provide much better performance but it is not always possible because the attributes you are looking for are not properties of a column you are showing but it is just one more way to get better performance out of Excel.

In general once you pull in the key attribute of the dimension you should use Show Properties in Report for all other attributes from that dimension.

powered by metaPost
29

Annotation tables and the fact that SSAS formula engine allows non numeric measures can be a very powerful tool to show just about anything as a measure in a cube.  I just published an article that show how to put non-numeric measures in a cube using annotation tables.  The sample show you how to  display a good approximation of the last processed date of the cube or any other text you would like to display.  Using Annotation Tables in SSAS to Show Last Processed Time and Latest Data Updates as a Measure in a Cube (  http://www.msbicentral.com/Blogs/JohnHall/tabid/132/articleType/ArticleView/articleId/127/Using-Annotation-Tables-in-SSAS-to-Show-Last-Processed-Time-and-Latest-Data-Updates-as-a-Measure-in-a-Cube.aspx).

powered by metaPost
Posted in: Blog
02
I recently was getting unexplained #Error when trying to replace a null with 0 on a SSRS report when the data source was a Analysis Service database.  Check out this article for a solution:  Checking a Field For Null Exception in SSRS.
Posted in: Blog
  Minimize

picture of John Hall

John Hall


Our Blog Friends Minimize

Advertisement Minimize

Copyright 2004-2010 MSBICentral.com Terms Of Use Privacy Statement