This is the third part in a series of posts on the “OLAP Farm” – which is an Analysis Services architecture that makes use of remote partitions on the backend and linked dimensions and linked measure groups on the front end in order to support solutions where datasets and latency requirements simply can’t be handled by a single processing server. It was discussed in more detail in the previous blog post – so if you’re not familiar, stop now and check it out before continuing.
- Building an OLAP Farm in the Cloud with Azure Virtual Machines – Part 1
- Building an OLAP Farm in the Cloud with Azure Virtual Machines – Part 2
Quick Recap: in the last post we focused on establishing a baseline for processing a SSAS database (based on an enlarged version of the AdventureWorksDW sample dataset). After applying a several common techniques for improving processing performance (e.g. partitioning, source indexing, etc) we were able to reduce processing duration from 18min to 13min which will now serve as our baseline from which to try and further improve upon by distributing processing tasks across multiple backend servers.
Setting Up Remote Partitions
In order to distribute the processing workload across several servers, we need to setup remote partitions. This process is fairly well documented (here) so I’m not going to go into great depths about the initial configuration.
- Create SSAS database projects for each backend server
- Update SSAS server-level configuration on backend and master servers
- Set MasterDataSourceID database property on backend servers
What we end up with is a SSAS database solution with 3 projects:
- AdventureWorksDW – master SSAS database deployed to AZVM-SSAS-MSTR\MASTER
- AdventureWorksDW-Remote01 – backend database deployed to AZVM-SSAS-BE01\BE01
- AdventureWorksDW-Remote02 – backend database deployed to AZVM-SSAS-BE02\BE02
Note: the AdventureWorksDW database on the MASTER instance in the screenshot above is from the previous post where we focused on tuning processing for a single server. In this scenario, where we’re distributing partitions across backend servers, AdventureWorksDW-Remote is the “MASTER” database.
Now at this point, we’re ready to re-configure the partitions in master SSAS database (i.e. AdventureWorksDW) to reside on the remote instances. Again, instructions for this piece have already been covered in the link above so I don’t want to spend too much time here. All you need to know is that I’ve configured the Internet Customers measure group to reside on AZVM-SSAS-BE01 and the Internet Orders measure group to reside on AZVM-SSAS-BE02.
Finally, we can re-deploy the master SSAS database (i.e. AdventureWorksDW-Remote) to the master SSAS instance (i.e. AZVM-SSAS-MSTR\MASTER) and process the database. Afterward, we notice that the rest of the objects (e.g. data source, data source views, cubes, dimensions, etc) have been created in the remote SSAS databases on the backend instances – where as before these backend databases simply consisted of an empty database shell containing only a single data source pointing back to the master SSAS database. The one interesting thing to point out is the partitions – only the partitions that have been configured for the remote instance will show up. As you can see in the screenshot below, only the partitions for the Internet Customers measure group was created in the Remote01 database on BE01. And if we were to drill into the Remote02 database on BE02, we would only see the partition object for the Internet Orders measure group.
Baseline Distributed Processing Performance
Now that the SSAS database has been deployed across a single master and 2 remote SSAS servers it’s time to run through our processing test scenarios to see how much time we can shave off.
Test01: Remote Partitions (FAILED after 15 minutes)
After the initial setup in the previous section, I decided to effectively pickup where we left off in the last post – each measure group partitioned by month – the obvious difference being that the partitions for [Internet Customers] and [Internet Orders] have now been moved off to remote instances (i.e. AZVM-SSAS-BE01, AZVM-SSAS-BE02).
Unfortunately, processing failed after ~15 minutes – the remote servers simply ran out of memory during partition index processing. This can be seen in the following 2 screenshots showing memory utilization for BE01 and BE02 respectively:
Notice how, in both cases, the [Memory Usage KB] touches the [Memory Limit Hard KB] line just before 2:31 PM – that’s bad – but more importantly it will cause processing to fail.
One quick option is to bump the memory on these VMs, but the only way to do that is to move up to the next tier – which is A3 (4 cores, 7.5 GB memory). Unfortunately, that’s going to put me over my allotted core count – MSDN subscriptions only provide you with $150 of monthly credit and access to 20 CPU cores at a time.
Note: Other options include lazy processing, adjustments to the dimension attributes to reduce size/number of indexes.
Below is what I’m currently using…
|DC01||Domain Controller||1 cores; 1.75 MB memory|
|SQLDW||SQL Server database||8 cores; 56 GB memory|
|SSAS-MSTR||SSAS (master)||4 cores; 7 GB memory|
|SSAS-BE01||SSAS (remote01)||2 cores; 3.5 GB memory|
|SSAS-BE02||SSAS (remote02)||2 cores; 3.5 GB memory|
That’s 17 cores…if I bump BE01/BE02 from A2 to A3, I’ll be at 21 cores – and over the limit. One option is to move the domain controller onto another box (e.g. SQLDW) which will get me there.
Note – One thing I discovered with this architecture is that the messaging when processing remote partitions becomes VERY chatty. Up until now, I’d been simply executing an XMLA command (which controlled the processing order: dimensions, partition data, partition indexes) via SSMS without issue. However, this time there was a message in the SSMS results window indicating the the memory consumption was approaching the limit for SSMS and output would no longer be logged. Fearing this might have somehow contributed to the errors that ultimately resulted in processing failure, I switched over to using PowerShell to execute the XMLA processing command.
Below is the script I used:
Test02: Remote Partitions (~12 minutes)
Ok – so after running into some resource constraints with the previous scenario, I’ve adjusted the spread of partitions across the backend servers. I ended up bringing the [Internet Customers] and [Internet Orders] measure groups back on to the master instance and instead spreading the monthly partitions for the [Internet Sales] measure group across the backend server in an alternating pattern: 201101 (remote01), 201102 (remote02), 201103 (remote01), etc.
In this test we fair much better – processing completed in just ~12 minutes – which is already 1 minute faster than our fastest time with a single SSAS server (though arguably within normal variance – so don’t get too happy just yet). Let’s take a look at the resource utilization across the servers…
Memory: Nothing to write home about other than we didn’t hit the wall
CPU Utilization: For the master instance, we’re not maxed on CPU yet…though we bounced around the 80% mark during the majority of processing before hitting the ceiling during the Process Indexes phase. The backend servers (BE01/BE02) are very clearly hitting the limit for the work they’re doing – the 2 big 100% humps are during the Process Data and Process Indexes phase…which is all these servers are really responsible for (where as the master server also needs to handle dimension processing).
BE01 – the break between the peaks is because we have to wait for the Process Data phase (of Internet Customers and Internet Orders) on the master instance (AZVM-SSAS-MASTER) to complete before we can start the Process Index phase. If you look closely across the charts for threads and rows converted (next sections) you’ll see that Process Data phase completes around 4:08pm on both BE01/BE02 but not until about 4:09:50pm on MASTER.
BE02 – same observation as BE01
MASTER – the plateau is during the process data phase and the barrier is about 42-43 processing threads. Unlike the final run on the single processing server (from last post) we do not see any queuing – but we also don’t see as high of a barrier (last time was closer to 50 processing threads) – this can most likely be attributed to spreading out the work.
BE01 – during process data phase, busy threads tick up to 25, then queuing occurs and things are backed down a bit. This is most likely due to a CPU bottleneck which is completely maxed during processing thread utilization.
BE02 – basically the same story as BE01.
MASTER – average during process data phase of 680k rows converted/sec
BE01 – average during process data phase of 350k rows converted/sec
BE02 – average during process data phase of 340k rows converted/sec
Test03: Remote Partitions (~12 minutes)
Same as the previous test…but after migrating the domian controller from AZVM-DC01 to AZVM-MASTER and bumping BE01 and BE02 from A2 to A3.
When this didn’t produce any improvement, I took a closer look and realized the bottleneck was still at the source system (i.e. AZVM-SQLDW). I should have noticed this sooner – clearly increasing resources on BE01/BE02 was a step in the wrong direction and instead I should have decreased resources on AZVM-SSAS-MSTR instead. That would move the bottleneck from the source system back over to the SSAS processing servers – and thus allowed the scaled out solution to provide relief to the processing bottleneck. Bottom-Line: if the processing bottleneck is on the source system, scaling out the SSAS processing environment is NOT going to improve speed!
At this point I decided to throw in the towel. Mainly because I was frustrated with the goof I made (and time wasted by moving the DC to MSTR and increasing resources on BE01/BE02 while bottleneck was still at the source system)…but even more frustrated because I failed to notice that the features needed to make this architecture possible (i.e. Remote partitions, Remote linked measure groups, Linked dimensions) are all going to be removed in a future version (see here). Why i didn’t notice this sooner – i have no idea! I have enjoyed thinking through this architecture and absolutely have enjoyed working with Azure VMs but at this point with so much more interesting stuff to play with, it seems like a waste to double-down on something that has been deprecated.
The truth is that an OLAP Farm architecture made much more sense back in the days of 32-bit systems. This really became obvious as I needed to bring down the resources on the SSAS Servers (AZVM-SSAS-MSTR, BE01, and BE02) in order to push the bottleneck from the source (AZVM-SQLDW) to the SSAS processing server (AZVM-SSAS-MSTR)…in order to highlight the benefit of scaling out the SSAS backend processing. And now that 64-bit systems are ubiquitous, there is even less reason for the added complexity of scaling out the backend in an OLAP Farm architecture.
But what about those solutions that, even with compression, wouldn’t come close to fitting in memory on a server with 2 TB? This is, after all, the target use-case for a scaled-out OLAP Farm (when data won’t fit in memory). Well, for those situations we can point to the amazing improvements in storage technology over the last few years. Now the thought of cycling data in and out of memory is much less of a concern…especially for those fortunate enough to be working with flash storage technologies such as Fusion IO and it’s many competitors.