How big is your Big Data set? If it isn’t measured in petabytes, then perhaps it doesn’t qualify. In any case, as the volume of data grows, the challenges grow disproportionately, and in areas you might not expect. In this post I raise a number of challenges and concerns you should be thinking about, especially if you are interested in business analytics over large data sets.
My recommendation: if you are serious about large data sets and numerical precision, you have no real choice but to adopt 128-bit arithmetic in most scenarios, especially financial ones.
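As a minimal illustration in SQL Server (the table and column names here are hypothetical), consider summing a currency column stored in the 8-byte MONEY type. MONEY tops out near 922,337,203,685,477.58, so an aggregate over billions of high-value rows can raise an arithmetic overflow error; widening to a high-precision DECIMAL keeps the sum exact:

```sql
-- Hypothetical fact table with an 8-byte MONEY amount column.
-- SUM over a large enough fact table can exceed MONEY's range and
-- fail with an arithmetic overflow error.
SELECT SUM(SalesAmount) AS TotalSales          -- may overflow
FROM dbo.FactSales;

-- DECIMAL(38,4) uses 17 bytes of storage -- well beyond 64-bit
-- precision -- so the aggregate stays exact.
SELECT SUM(CAST(SalesAmount AS DECIMAL(38,4))) AS TotalSales
FROM dbo.FactSales;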
The title says it all. See this Microsoft Connect suggestion and vote to make it happen: SSAS needs larger datatypes to store currency (128-bit currency and/or decimal).
When additional aggregate expressions are added to the SELECT list of a query over a large fact table with a ColumnStore index, performance degrades in a step-wise linear fashion, with large steps. It may be quicker to execute several less-complex queries than a single complex query, as sketched below.
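A sketch of that workaround, with hypothetical table and measure names; the idea is simply to split one wide aggregate into several narrower scans and combine the results on the client:

```sql
-- One wide query: each extra aggregate expression can add a large,
-- step-wise cost against the ColumnStore index.
SELECT SUM(Qty), SUM(Price), SUM(Discount), SUM(Tax)
FROM dbo.FactSales;

-- Sometimes cheaper overall: several narrower queries, with the
-- results combined by the caller.
SELECT SUM(Qty), SUM(Price)    FROM dbo.FactSales;
SELECT SUM(Discount), SUM(Tax) FROM dbo.FactSales;
```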
I’ve submitted a Microsoft Connect bug report here: https://connect.microsoft.com/SQLServer/feedback/details/761895/.
Continue reading for full details, or download this attachment: https://www.box.com/s/5hp9f0ditg0fspghu506 [PDF file also available via Microsoft Connect]. It includes actual test results, steps to reproduce, and more pretty graphs :).
Essentially, the performance of a non-grouping (scalar aggregate) SQL SELECT query degrades when run against a ColumnStore index. This has been tested with SQL Server 2012 RTM CU1. Performing partial aggregation instead can result in a 15x performance improvement in some of the cases I observed.
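By partial aggregation I mean grouping on some column first and then rolling the partial results up, likely because the grouped inner query becomes eligible for batch-mode processing while the scalar aggregate is not. A sketch, with hypothetical table and column names:

```sql
-- Non-grouping (scalar) aggregate: can run in the slower row mode
-- against a ColumnStore index on SQL Server 2012.
SELECT SUM(SalesAmount) AS Total
FROM dbo.FactSales;

-- Partial aggregation: group on some convenient column first, then
-- roll the partial sums up in an outer query.
SELECT SUM(PartialSum) AS Total
FROM (SELECT SUM(SalesAmount) AS PartialSum
      FROM dbo.FactSales
      GROUP BY DateKey) AS Partials;
```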
See the full details in my Microsoft Connect submission here: https://connect.microsoft.com/SQLServer/feedback/details/761469/. Or download the details here: https://www.box.com/s/frf7imhyclb2efz2tfvb.