Welcome back! This is the second post in a series of posts focusing on performance monitoring for Analysis Services.
Extended Events Tracing
Extended Events (xEvents) is an event-handling system providing insight to the behavioral and performance characteristics of an Analysis Services instance. Even though it was first introduced for the database engine in SQL Server 2008, it didn’t make its way into Analysis Services until SQL Server 2012. It is the successor to SQL Profiler (which is has been deprecated) so unless you’re still working with SSAS 2008R2 (or below), it’s time to get comfortable w/ xEvents.
For processing workloads, not only will an xEvent trace provide information such as the overall processing duration for the cube or tabular model, but we can also see how long it takes to process each major object and phase (e.g. dimension, measure group, partition, aggregation, etc). From an xEvent trace, we can also determine as which major components are being processed in parallel. This is helpful information to have because the solution to most performance-related problems for processing workloads is very different depending on whether the primary contributor is a dimension (as opposed to a measure group). Furthermore, and something we’ll discuss and explore in more detail later in this series, the problem may not be due to a “single” major object. Instead, we may have a situation where it’s actually a group of objects being processed in parallel. For example, perhaps several major objects have slowly increased in size to the point where we’re now running into a resource (e.g. CPU, disk IO) bottleneck that’s causing the high processing duration.
For query workloads, we can see important information about every single query that hits the system including details such as the total duration of the query, query text (MDX/DAX), start and end times, as well as the associated user account. We can also determine details as to how the query was executed such as the number of partitions scanned, aggregation hits/misses, cache hits/misses, other queries running at the same time, etc…all of which have an effect on the performance of any one particular query. A secondary benefit is that we’ll be able to identify the usage pattern(s) of folks using the cube. For example, is usage low/moderate throughout the week with a heavy spike on Friday mornings?
The major drawback with xEvents for Analysis Services is the lack of a GUI – that’s not coming until SSAS 2016. Until then, you’ll have to use XMLA (XML for Analysis Services) commands to start and stop the traces.
Below is an example of an XMLA command that starts an Analysis Service xEvents trace.
<Create xmlns="http://schemas.microsoft.com/analysisservices/2003/engine" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:ddl200_200="http://schemas.microsoft.com/analysisservices/2010/engine/200/200" xmlns:ddl300_300="http://schemas.microsoft.com/analysisservices/2011/engine/300/300"> <ObjectDefinition> <Trace> <ID>Trace01</ID> <Name>TraceName01</Name> <ddl300_300:XEvent> <event_session name="xeas" dispatchLatency="1" maxEventSize="4" maxMemory="4" memoryPartitionMode="none" eventRetentionMode="allowSingleEventLoss" trackCausality="true"> <!-- ### COMMAND EVENTS ### --> <event package="AS" name="CommandEnd" /> <!-- ### ERRORS AND WARNING ### --> <event package="AS" name="Error" /> <!-- ### PROGRESS REPORTS ### --> <event package="AS" name="ProgressReportEnd" /> <event package="AS" name="ProgressReportError" /> <!-- ### QUERY EVENTS ### --> <event package="AS" name="QueryEnd" /> <!-- ### QUERY PROCESSING ### --> <event package="AS" name="QuerySubcube" /> <event package="AS" name="VertiPaqSEQueryCacheMatch" /> <event package="AS" name="VertiPaqSEQueryEnd" /> <event package="AS" name="ResourceUsage" /> <!-- ### TARGET ### --> <target package="Package0" name="event_file"> <parameter name="filename" value="Trace01.xel" /> <parameter name="max_file_size" value="32" /> </target> </event_session> </ddl300_300:XEvent> </Trace> </ObjectDefinition> </Create>
There are many other events-types you can include in an extended events trace for analysis services. However, you’ll want to be careful about which (and how many) events-types you include, especially if you’re running the trace in a production environment, because of the observer overhead effect. This particular trace definition (shown above) includes everything you’ll need to derive the information described in the Processing/Query workload sections and it’s what I use on a regular basis when working with clients to troubleshoot and resolve their SSAS performance problems. Only on rare occasions will I add additional events and its usually only after we’ve narrowed down the problem to a specific handful of problematic queries.
Performance Monitor Tracing
Performance Monitor (perfmon) is a utility included in Windows for collecting OS and application-specific performance information. It works by reading measurements (referred to as “counters”) on a regular interval. From the perspective of monitoring the performance of an Analysis Services server, this tool can be used to track the utilization and availability of server resources (e.g. CPU, memory, disk IO, etc) and allow you to determine if (and where) any bottlenecks exist.
During processing workloads, you can use perfmon to see things like how much memory is being consumed (and how much is still available) throughout the various phases of processing. For example, over time you might notice more and more memory being consumed while processing a group of dimensions in parallel. At some point the server may run out of available memory and processing will fail. Instead of simply waiting for that failure to happen, you can (and should) track memory usage during processing workloads and review regularly. When you notice memory usage increasing and approaching the limit, you can take corrective action (e.g. add memory to the server, change the way the cube/tabular model is processed so that fewer objects are processed in parallel, etc) and avoid the “unexpected” failure – which isn’t really unexpected, we just weren’t paying attention.
Perfmon traces can also be used to benchmark and monitor processing performance in the context of data-throughput (i.e. rows per second). This can be very helpful for detecting changes to related components in the environment such as the source SQL Server database or the SAN. For example, if your cube or tabular model typically averages 400k rows read per second when processing a large measure group, and then at one point that average drops down to ~120k rows read per second…we know there’s something is wrong and it’s probably related to components involved in the movement of data from the source database to the Analysis Services server. Perhaps there’s new activity on the source SQL Server causing blocking/contention with the queries pulling data for SSAS processing. Or maybe the network was changed and a new switch was added and isn’t configured properly.
From the perspective of a query workload, we can use a perfmon trace in a similar manner. Some of the drivers for bottlenecks may be different (e.g. an increasing number of users or increasing number of queries, etc) but the method and use case is the same as we saw for processing workloads.
Perhaps my 2 favorite characteristics of performance monitor (when compared to extended events traces) are…
- there’s a GUI for starting, stopping as well as reviewing traces built right into Windows (shown below)
- it isn’t SQL Server specific which means there’s already quite a bit of info out there on the internet on how to best use this utility (here, here, here, etc)
Both of these reasons greatly reduce the complexity for those just getting started. However, from the perspective of the on-going performance monitoring of an analysis services solution, you might want to script it out using PowerShell or .NET.
At this point, you should have a firm grasp on why performance monitoring for Analysis Services is important and a general idea as to the types of problems it can help you avoid (or at least significantly reduce the time spent troubleshooting). And now that you’ve been introduced to the 2 primary tools used to collect the performance-related data from your analysis services server it’s time to start closing the gap.
In the next post, we’re going to discuss in detail which xEvents and perfmon counters you’ll want to include and, most importantly, what each can tell you (alone or in conjunction w/ other events/counters) about existing or potential performance problems and/or bottlenecks. It will essentially be a cheat sheet on what to look out for and why. Stay tuned!
Quick Note (based on some reader feedback): the purpose of this series is only to teach others how to start monitoring and identifying Analysis Services performance issues with a few tips and scripts to help jump start your learning path. If you are following this series or simply landed upon this single post, I can only assume you have dealt with (or are currently dealing with) Analysis Services performance issues – or maybe you’re just a true SSAS-geek!
That said, Analysis Services and associated performance issues is a very complex and nuanced topic that could quickly fill hundreds of pages to thoroughly cover. My goal here is not to provide a blueprint for solving your specific SSAS issue – it is merely an attempt at teaching you how to start fishing. If you are currently dealing with a major Analysis Services performance issue, please feel free to reach out directly and I’ll be happy to discuss your issues and even give you a point in the right direction for free. However, if it becomes evident that you need more than just a point in the right direction, I will likely suggest a brief consulting engagement.