Earlier this summer, I started working with a new client to develop the next version of their reporting platform. This client already had the reporting requirements and a good jump start on the underlying data model, but had run into some serious performance issues when trying to report directly from the relational data model using stored procedures. After some discussion about the requirements (e.g. types of reports, calculations, granularity, data size, security) it was quite clear that Analysis Services in tabular mode was a very good fit. In just over a week, we were able to build out a prototype that accomplished all of the requirements (including a dramatic increase in performance). Following the engagement, the client decided to move forward with SSAS tabular as the engine behind their new reporting platform.
Below are a few observations picked up along the way.
Telerik is Sexy
In this case, the client had already chosen Telerik and the reports they’d mocked up (which I really wish I could share) were absolutely stunning.
Note: If you’ve never heard of Telerik Reporting, just think of a younger, less mature SQL Server Reporting Services (SSRS). It’s got a great look and feel, plenty of flexibility when integrating with a custom application, and it’s cheap! On the other hand, if you’re already using SQL Server and have the resource capacity to install SSRS on the same server (meaning no additional cost) you’ll probably get more bang for your buck w/ SSRS. Telerik is working on their own version of a Report Server, but it is still in its infancy and way behind in terms of features. Plus SSRS can be just as sexy as Telerik when you know what you’re doing.
The only question then was whether Telerik Reporting supported DAX queries against an SSAS tabular model.
Short answer: yes.
Longer answer: After wasting an hour trying to find documentation for this capability and coming up completely empty, I turned to Twitter. A day or so later I received a response from @Telerik indicating that this functionality was not available but that I could “log this as a feature request”. At this point, I started to get a little nervous. Before bringing this to the attention of the client, I decided to download the trial version and confirm for myself – after all, even Microsoft’s very own SSRS “barely” supports DAX-based datasets against SSAS tabular models (details), so perhaps there was a similar workaround for Telerik. Fortunately, the trial version allowed me to confirm that both DAX and MDX queries can be executed via the CubeDataSource component. Crisis Averted!
Using Multi-Value Parameters w/ DAX
One common requirement with operational reporting is the ability to apply a filter on multiple values for a single attribute. For example, in a typical sales report, you may need to be able to filter the results so that they only include sales orders from 3 out of 10 territories. With MDX, I’d typically use a combination of the Join (SSRS) function and the StrToSet (MDX) function. However, the StrToSet function doesn’t exist in DAX. Thankfully, Sayed Saeed (b | t) documented a very clever workaround in this post.
Sayed’s workaround leverages the power of the PATHCONTAINS function to create a filter context based on a string of values for a particular column. For example, the PATHCONTAINS function can be used in the second argument of the FILTER function to limit the years returned to just 2012 and 2014.
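A sketch of what such a query might look like (the original code sample isn’t reproduced here; the 'Sales' and 'Date' table and column names are placeholders, so adjust them to your model):

```dax
EVALUATE
CALCULATETABLE (
    SUMMARIZE (
        'Sales',
        'Date'[Calendar Year],
        "Total Sales", SUM ( 'Sales'[Sales Amount] )
    ),
    -- PATHCONTAINS treats "2012|2014" as a pipe-delimited path and
    -- returns TRUE when the current year appears anywhere in it
    FILTER (
        VALUES ( 'Date'[Calendar Year] ),
        PATHCONTAINS ( "2012|2014", 'Date'[Calendar Year] )
    )
)
```

In the parameterized version, the hard-coded "2012|2014" string would simply be replaced by the report parameter.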
Note: Once parameterized, the application would need to pass in a pipe-delimited string of values for the @pYearList parameter.
This is not exactly the intended use of the PATHCONTAINS function, but it certainly gets the job done! See Sayed’s blog post for a full example.
Tabular is Wicked Fast
Performance is definitely one of the primary selling points of this technology…but it’s always nice to see it validated in practice.
The first big performance win was with the queries used to populate the parameter lists. This client had some interesting (in a good way) requirements with regard to the user experience when selecting parameters. The intention was to prevent the user from selecting a combination of parameter values that resulted in zero data showing up in the body of the report.
Note: I wish more folks would adhere to this type of requirement. Nothing makes me want to flip a table more than wasting 10 minutes picking a super specific combination of parameter values, executing the report, and seeing no data. Grrrrrrrrrrrrrrrrh!
This meant that the query for each parameter list needed to cross the fact table with a filter context consisting of the selected values from each of the other parameters. Given that many of the reports had 6+ multi-value parameters, I was a bit worried about the performance. The previous solution was taking over 10 seconds to prepare the parameter lists…but it was designed by (outside) app-developers so not too surprising (/sarcasm). After a bit of testing w/ tabular, we were hitting an average response time under 100ms (cold-cache) which was more than fast enough.
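As a rough sketch of the pattern (the 'Territory', 'Sales', and 'Date' names are invented for illustration), each parameter-list query filters a dimension’s values down to those that still produce fact rows under the other parameters’ current selections:

```dax
EVALUATE
FILTER (
    VALUES ( 'Territory'[Territory Name] ),
    -- context transition filters 'Sales' to the current territory;
    -- the inner FILTER applies the years chosen in another parameter
    CALCULATE (
        COUNTROWS ( 'Sales' ),
        FILTER (
            VALUES ( 'Date'[Calendar Year] ),
            PATHCONTAINS ( "2012|2014", 'Date'[Calendar Year] )
        )
    ) > 0
)
ORDER BY 'Territory'[Territory Name]
```

The same shape is repeated for each parameter, swapping in the selections from all of the other parameters as filter arguments.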
Note: This client is on SQL Server 2014, so it’s certainly possible (perhaps likely) that similar performance could have been achieved with the use of a clustered columnstore index on the fact table. However, I’m doubtful that a columnstore index would help w/ the next issue.
The second big performance win with this prototype involved leaf-level calculations – which can be major performance hurdles when working with multidimensional cubes. Due to the complexity of the data (common in Healthcare), many of the measures were not simple aggregations where you group by some columns in the dimension and aggregate (e.g. SUM) a column in the fact table. First we had to group and rank records in the fact table at a very low level (not the grain, but close). Then we had to filter that result set for the lowest rank in each group. Then we had to aggregate the remaining result set up to a higher level (i.e. the reporting level) and calculate the measures.
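A heavily simplified sketch of that group-filter-aggregate shape (the 'Claims' table, its columns, and the “earliest claim per encounter” rule are all invented stand-ins for the client’s actual ranking logic):

```dax
EVALUATE
ROW (
    "Total of First Claims",
    SUMX (
        FILTER (
            ADDCOLUMNS (
                -- 1) group the fact table close to (but not at) the grain
                SUMMARIZE ( 'Claims', 'Claims'[EncounterKey], 'Claims'[ClaimKey] ),
                "ClaimDate",   CALCULATE ( MIN ( 'Claims'[ServiceDate] ) ),
                "ClaimAmount", CALCULATE ( SUM ( 'Claims'[Amount] ) ),
                -- earliest claim date across the whole encounter
                "FirstDate",
                    CALCULATE (
                        MIN ( 'Claims'[ServiceDate] ),
                        ALLEXCEPT ( 'Claims', 'Claims'[EncounterKey] )
                    )
            ),
            -- 2) keep only the lowest-ranked (earliest) claim per encounter
            [ClaimDate] = [FirstDate]
        ),
        -- 3) aggregate what survives up to the reporting level
        [ClaimAmount]
    )
)
```

The real queries were considerably more involved, but this is the general structure: build a near-grain row set, rank/filter it, then aggregate the survivors.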
Note: a common maneuver in this situation is to try and push this logic down into the ETL and reduce the report query to a basic aggregation. However, due to the complexity of the existing logic in place to populate the ranked column, it was nearly impossible to remodel without completely exploding the fact table, adding a ton of work for the ETL developer, and forcing the client to wave bye-bye to the target deadline.
Initially, this report query was taking too long (can’t recall the actual timing) when run against the relational database using a SQL stored procedure. In multidimensional, it’s possible (even likely) this would have taken even longer. However, with tabular, we were able to get this report query to perform in under 2.5 seconds across 2 years of data. It took several attempts at writing this query to get it to perform that well and the first few attempts came up short. At one point, there was some serious concern about whether this was going to work since this particular type of calculation was used in at least half of the reports.
The tabular query engine may be wicked fast, but it’s also terribly unforgiving and you may have to refactor many times to get the performance you expect.
DAX Studio Rocks!
DAX Studio is the best tool out there for writing DAX queries…and it’s FREE!
Performance was a high priority for this client and we spent a lot of time tuning the DAX queries to eke out as much performance as possible. DAX Studio allowed us to iterate and compare performance at a super detailed level very quickly. At this point, I simply can’t imagine not having it and think Microsoft should incorporate this functionality into SSMS.