Categories
Uncategorized

Why You Need to Routinely ProcessFull

Earlier this week, one of the client’s routine SQL agent jobs failed during an Analysis Services processing task with the following error message:

Error: 2013-07-18 10:27:36.18
Code: 0xC1020041
Source: Process Dimensions Analysis Services Execute DDL Task
Description: File system error: A FileStore error from WriteFile occurred. Physical file: \\?\E:\OLAP\Data\Sales.18.db\Product Review.0.dim\342.Comments.bsstore. Logical file: . .
End Error

This is a recurring job – running every 2 hours – that simply kicks off an SSIS package.  The SSIS package loads new data into several dimensions and fact tables before refreshing the associated cube.  The cube is refreshed using the following pattern:

  1. Dimensions: ProcessUpdate
  2. Cube: ProcessFull

Having never seen this exact error message before, my first thought was to check the free space on the SSAS server to make sure there was room to write new data.  But a quick check showed other SSAS processing tasks completing successfully…if it the disk was full, everything would be failing – so drive space couldn’t be the issue.

Next stop – google bing.  Initial search results indicated that we may have run into the infamous 4GB string store limit. All the signs were there…this was a fact dimension and the attribute was called “Comments” so it could be a wide column with lots of rows…lending credit to the 4GB string store limit hypothesis.

Note: SSAS 2012, provides for an increased range which you can read about here and here.

Since none of the developers have access to the production server (a good idea), we alerted the DBA and and asked her to check the size of the *.bsstore file on the production SSAS server.  In the meantime, mostly out of curiosity, I ran the following query against the production DW to get an idea of the potential size for this attribute:

[sql]

SELECT SUM(DATALENGTH(ISNULL(f.column,1))) / 1024.0 / 1024.0 AS Size_In_MB
FROM dbo.fact f

[/sql]

This may seem odd that an attribute with a size of ~10 MB in the database could have ballooned to ~4GB on the SSAS server.  But, if we recall, this dimension has been getting reprocessed every 2 hours (via ProcessUpdate) for the past few months without an occasional ProcessFull.  So if the underlying data is changing somewhat frequently (which it was, as I later discovered) then it is definitely possible for the file size to balloon.

This is because of how the ProcessUpdate works…essentially it tries to do as little work as possible to accommodate underlying changes in the source table while avoiding the need to reprocess the related partitions:

  • Inserts: data is added as new members in the dimension
  • Deletes: corresponding members are logically marked as inactive in the SSAS-files, but not removed
  • Updates: the existing member is logically marked as inactive and a new member (with the changed values) is added to the SSAS-files

So as you can see, the files are growing for every Insert/Update that occurs in the source table.  And when a record is deleted from the source file, the corresponding member(s) remain in the SSAS physical file.  And so the SSAS files grow and grow and grow.

This is why it is good practice to occasionally run a ProcessFull on the dimensions. ProcessFull wipes out everything and rebuilds it from scratch…so all those “logically marked as inactive” attribute members get removed from the files resulting in a smaller files (and smaller files = less I/O). The downside is that you then need to go back and reprocess all the related partitions which can take considerable time.

Once the DBA confirmed the *.bsstore file on the production SSAS server was ~4GB on disk, we executed a ProcessFull on the associated dimension, followed by a ProcessDefault on the related partitions.  This shrunk the file size back down to ~14MB which is much more reasonable given the data size in the source.

In the future, we plan to implement a regularly scheduled ProcessFull to prevent a recurrence of this issue.

6 replies on “Why You Need to Routinely ProcessFull”

Hi Bill,

This is a very timely and useful post for me. I remember reading that multiple “Process Updates” should be broken up by regular “Process Full”, but I’m not sure I ever knew the “why” behind it. Thanks for the clear explanation.

BTW, in your code snippet, I believe “DATALEN” should be “DATALENGTH” and you meant to query a dimension table instead of the fact table?

Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s