Bugs and errors often arise when fetching data from the server. This can leave system administrators baffled as all work comes to a standstill. Incomplete or incorrect transactions cause confusion among employees working on different sections of a database.
Category Archives: SQL Server 2014
Query performance is a very important area of SQL Server, and there are always badly performing queries around.
Query Store is the newest tool for tracking and resolving performance problems in SQL Server.
In this article, we are going to look at some practical uses of the SQL Server Query Store.
What is Query Store?
Microsoft has described the Query Store as a ‘flight data recorder’ for SQL Server queries. It tracks the queries run against the database, recording the details of the queries, their plans, and their runtime characteristics. Query Store is a per-database feature. Once turned on, nothing further needs to be done to get it to track data: it runs in the background, collecting data and storing it for later analysis.
Query Store is available in SQL Server 2016 and later, and in Azure SQL Database V12 and later. It is available in all editions of SQL Server, even Express edition.
How is Query store different from other tracking options?
We have had query performance tracking for some time, mostly in the form of dynamic management views (chiefly sys.dm_exec_query_stats and sys.dm_exec_query_plan) and tracing tools such as SQL Server Profiler and Extended Events.
So, what makes Query Store different? Let me start answering that by describing a scenario that I encountered a couple of years ago.
A particular critical system was suddenly performing badly. It had been fine the previous week, and no Extended Events sessions or Profiler traces had been running historically. The admin had restarted the server when the performance problem started, just to make sure it was not something related to a pending reboot.
As a result, there was no historical performance data at all, and working out what had happened and why query performance was different that week was extremely difficult.
Query Store would have solved the problem of the missing historical data. Once turned on, Query Store tracks query performance automatically. The data collected is persisted in the user database, so, unlike the DMV data, it is not lost on a server restart.
Because the Query Store data is persisted in the user database, it is included in the backups of that database as well. This makes it much easier to do performance analysis somewhere other than the production server.
What exactly is a query performance regression?
A dictionary does not help much here. The Oxford dictionary defines regression as returning to an earlier state, which is not the meaning we want.
A performance regression occurs when a query has degraded in performance over time. The degradation may be sudden or gradual, although a sudden regression is probably more common. It may be permanent, or the query may at a later point return to its previously acceptable behavior.
What causes a regression?
A common cause of a query performance regression is a plan change. Let’s briefly talk about query optimization and the plan caching process to see why.
T-SQL is a declarative language: the query expresses the desired results, not the process of getting to those results. It is up to the SQL Server engine to figure out how to get those results, and the portion of the engine that does so is the query optimizer, which takes the query and outputs a query plan, a set of operations that the query processor can then execute to obtain the desired results. For anything other than a trivial query, there are multiple plan shapes that produce the same results but differ in their internal details and in how long they take to execute. In theory, the optimizer will always take a query and produce a good plan, one that, if not the fastest possible, is fast enough. However, that is not always the case, and it is perfectly possible for a query to have a plan that is fast for some parameter values and really slow for others.
Then there is plan caching, which adds another layer of complexity. Optimization is an expensive process, so SQL Server caches execution plans. When the query executes again, it can fetch the plan from the cache and execute it without paying the cost of optimization again.
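To see plan caching at work, a query along the following lines lists the most frequently reused plans currently in the cache. It is only a minimal sketch built on the sys.dm_exec_cached_plans and sys.dm_exec_sql_text DMVs; the TOP (10) cut-off is an arbitrary choice made for illustration, and the query needs VIEW SERVER STATE permission.

```sql
-- List the ten most frequently reused plans in the plan cache.
-- TOP (10) is an arbitrary limit chosen for this sketch.
SELECT TOP (10)
    cp.usecounts,        -- how many times the cached plan has been looked up and reused
    cp.objtype,          -- Adhoc, Prepared, Proc, and so on
    cp.size_in_bytes,
    st.text AS query_text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
ORDER BY cp.usecounts DESC;
```

A high usecounts value means many executions are sharing one compiled plan, which is exactly the situation in which a plan compiled for an unusual parameter value can hurt every later execution.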
There are many causes of plan changes, such as:
- Bad parameter sniffing.
- Out of date statistics.
- Bad query patterns.
- Overly-complicated queries.
These all tend to cause temporary performance regressions.
Sometimes a regression is caused by data growth, in which case it will be persistent.
Also, code changes or schema changes can cause a performance regression.
Tracking and diagnosing query performance regressions with the query store:
First of all, enable the Query Store option using the following statement:
ALTER DATABASE [SQLSHACK_Demo] SET QUERY_STORE = ON;
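To confirm that the setting took effect, the Query Store options can be read back from the database. The sketch below assumes the same SQLSHACK_Demo database and uses the sys.database_query_store_options catalog view.

```sql
USE [SQLSHACK_Demo];
GO
-- desired_state_desc is what was requested; actual_state_desc is what the engine is doing.
-- They can differ, for example when Query Store has gone read-only because it is full.
SELECT desired_state_desc,
       actual_state_desc,
       current_storage_size_mb,
       max_storage_size_mb,
       interval_length_minutes
FROM sys.database_query_store_options;
```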
Then, whenever you encounter query performance problems with your application, you can simply open your database in SQL Server Management Studio, expand the Query Store folder, and open the Regressed Queries report.
The report opens with the default configuration, which shows total duration, but this is not ideal. What we want is to check for regressions in CPU time, because that eliminates cases of blocking, and then to set the intervals that we want to investigate:
You can change what the report displays by clicking the Configure button at the top right of the report, which opens the following page:
Now we can see how the query behaved over time, and we can see that it has two plans associated with it.
If we hover over the bars of the chart in the top-left corner, we can see the query’s behavior in the recent and historic intervals and how they differ.
If we click on the query’s bar, the top-right pane shows how many plans the query has, and clicking on any of the plans displays its graphical plan at the bottom of the window.
Conveniently, you can select both plans and click the Compare Plans button to compare them.
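The same information the report shows can also be pulled directly from the Query Store catalog views. The query below is a rough sketch rather than what the report itself runs: it lists every query that has more than one plan, with the execution-weighted average CPU time of each plan across all recorded intervals.

```sql
-- Queries with more than one plan, and how each plan has performed on CPU.
WITH plan_stats AS
(
    SELECT p.query_id,
           p.plan_id,
           p.is_forced_plan,
           SUM(rs.count_executions) AS executions,
           -- weight each interval's average by its execution count (microseconds)
           SUM(rs.avg_cpu_time * rs.count_executions)
               / NULLIF(SUM(rs.count_executions), 0) AS avg_cpu_time_us
    FROM sys.query_store_plan AS p
    JOIN sys.query_store_runtime_stats AS rs
        ON rs.plan_id = p.plan_id
    GROUP BY p.query_id, p.plan_id, p.is_forced_plan
)
SELECT ps.query_id,
       ps.plan_id,
       ps.is_forced_plan,
       ps.executions,
       ps.avg_cpu_time_us,
       qt.query_sql_text
FROM plan_stats AS ps
JOIN sys.query_store_query AS q
    ON q.query_id = ps.query_id
JOIN sys.query_store_query_text AS qt
    ON qt.query_text_id = q.query_text_id
WHERE ps.query_id IN (SELECT query_id
                      FROM sys.query_store_plan
                      GROUP BY query_id
                      HAVING COUNT(*) > 1)
ORDER BY ps.query_id, ps.avg_cpu_time_us DESC;
```

A large gap in avg_cpu_time_us between two plans of the same query is the kind of difference the plan comparison in Management Studio helps to explain.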
In our case, we can conclude that we definitely have a case of bad parameter sniffing. We have two plans: one that is appropriate for all executions, and one that was generated for a NULL parameter value and is not suitable for the majority of executions of this query. We know that we can fix this properly later.
But what if we need to fix this now? Query Store offers the easiest way to do it: select the fastest plan and click the Force Plan button.
From that point onwards, the query will be executed with the forced plan no matter what parameter values it is compiled with, so the application now performs well.
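Plan forcing can also be done in T-SQL with the sp_query_store_force_plan procedure. In the sketch below, the query ID 42 and plan ID 7 are placeholders, not values from this demo; substitute the IDs shown in the report.

```sql
-- 42 and 7 are hypothetical IDs; use the query_id and plan_id from the report.
EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 7;

-- Check which plans are currently forced and whether forcing has ever failed.
SELECT query_id,
       plan_id,
       is_forced_plan,
       force_failure_count,
       last_force_failure_reason_desc
FROM sys.query_store_plan
WHERE is_forced_plan = 1;

-- Once the root cause has been fixed, the forcing can be removed again:
-- EXEC sp_query_store_unforce_plan @query_id = 42, @plan_id = 7;
```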
Now we want to do further analysis to identify why this happened and how to prevent it in the long term without resorting to plan forcing. This is the kind of investigation we do not really want to do on a production server. Here, too, Query Store offers the easiest way: back up the production database, restore it to the test environment, and then look at the Query Store Regressed Queries report there.
Query performance regressions can hit any DBA on any day, so you need to know what causes them and how to track them over time. This article has shown how much easier Query Store makes that. I hope it has been informative for you.
SQL Server data compression is now available in all editions of SQL Server, starting with 2016 SP1.
In this article, you will learn not only how data compression saves space, but also how compression can sometimes improve performance.
Space Savings vs. Performance
When I first heard about compression back in 2008, my first thought was that it would have a big performance penalty. Compression would save disk space, but it would probably decrease performance as the data was compressed and decompressed. It turns out that compression can improve performance instead. Because compressed data fits in a smaller number of data pages, there are decreased I/O requirements. Since I/O is generally the bottleneck in SQL Server, this can improve performance. Compressed data also has a decreased memory requirement. When querying compressed data, a smaller number of pages will be copied to the buffer pool. The one area that is impacted is CPU. You do need to have some CPU headroom because compression will require some additional CPU resources.
The good thing is that if the workload is reasonably tuned, many SQL Server instances have more CPU resources than they need. One note of caution here: don’t look to compression as the solution to major performance issues. You need to look at physical resources, configuration, indexing, and query tuning. The point I’m trying to make is that decreased I/O and better memory utilization will benefit the workload in many cases.
Compression has been available in every version of SQL Server from 2008 onwards, but until SQL Server 2016 only in Enterprise Edition. Beginning with SQL Server 2016 SP1, it is available in Enterprise, Standard, and Express. I’m excited about this because now compression is within the reach of any organization.
There are two types of compression that you can use to compress a table, index, or even a partition:
- Row level compression works by storing fixed-width data types as variable length data types. Nulls and zeroes do not take any space. For example, in an uncompressed table, an integer column takes 4 bytes per row, even for those rows that have a small number such as 1 or 10, or even null. Once row compression is implemented on the table, each value will be stored with the smallest possible number of bytes. So outside of the metadata, storing a 1 in an integer column will take 1 byte, storing a null or 0 will take 0 bytes.
- Starting with SQL Server 2008 R2, Unicode compression is implemented when you use row compression. This applies to Unicode columns of type nvarchar(n) and nchar(n). In an uncompressed table, each character takes up 2 bytes in a Unicode column, even if a small character set is used. In those cases, compressing the table will store the Unicode characters in 1 byte instead.
- In the technical article, Data Compression: Strategy, Capacity Planning, and Best Practices, Microsoft recommends using row compression on all data, as long as you have 10% extra CPU capacity, and, of course, as long as you achieve some space savings.
- It has a low CPU penalty.
- When you implement page level compression, the rows are automatically row-compressed first.
- Page level compression also removes repeated data within a page by two mechanisms, prefix and dictionary compression.
- Page level compression can compress a table to a smaller size than row compression, but it is recommended for tables that mostly receive inserts and are not updated very often.
- It does have a higher CPU penalty.
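Before choosing between the two types, you can ask SQL Server to estimate the savings and then enable compression with a rebuild. The sketch below uses a hypothetical table, dbo.SalesHistory, purely for illustration; sp_estimate_data_compression_savings and the DATA_COMPRESSION rebuild option are the standard mechanisms.

```sql
-- dbo.SalesHistory is a hypothetical table name used only for this sketch.
-- Estimate the space saved by row compression on all indexes and partitions.
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'SalesHistory',
    @index_id         = NULL,   -- NULL = all indexes
    @partition_number = NULL,   -- NULL = all partitions
    @data_compression = 'ROW';  -- or 'PAGE' / 'NONE'

-- Apply row compression to the table (heap or clustered index).
ALTER TABLE dbo.SalesHistory REBUILD WITH (DATA_COMPRESSION = ROW);

-- Apply page compression to every index on the table instead.
ALTER INDEX ALL ON dbo.SalesHistory REBUILD WITH (DATA_COMPRESSION = PAGE);
```

Running the estimate once with 'ROW' and once with 'PAGE' gives a quick way to judge whether the extra CPU cost of page compression buys enough additional space.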
Neither of these compression types will work on row-overflow data. This is data from a row that exceeds 8K. For example, you can create a table that has two varchar(8000) columns. If a row exceeds 8,060 bytes, SQL Server moves one or more of those columns to another page so that the row fits. One way around this limitation is the new COMPRESS function.
- You can use this function to compress individual values, which includes row-overflow data.
- It uses a GZIP algorithm to compress the values. There is a downside to this, however. The COMPRESS function must be applied each time a value is inserted or updated, and a DECOMPRESS function must then be used to read the value. This means that there would be changes to the application or stored procedure.
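As a small illustration of the round trip, the sketch below compresses a repetitive string and then decompresses it again. The sample text is made up; note that DECOMPRESS returns varbinary(max), so the result has to be cast back to the original data type.

```sql
-- Build a long, repetitive sample value (made-up text for illustration).
DECLARE @original nvarchar(max) =
    REPLICATE(CAST(N'SQL Server compression ' AS nvarchar(max)), 1000);

-- COMPRESS returns the GZIP-compressed value as varbinary(max).
DECLARE @compressed varbinary(max) = COMPRESS(@original);

SELECT DATALENGTH(@original)   AS original_bytes,
       DATALENGTH(@compressed) AS compressed_bytes,
       -- DECOMPRESS returns varbinary(max); cast it back to the source type
       CASE WHEN CAST(DECOMPRESS(@compressed) AS nvarchar(max)) = @original
            THEN 1 ELSE 0 END  AS round_trip_ok;
```

The cast must match the original type: a value compressed from nvarchar has to be cast back to nvarchar, not varchar.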
In this demo, we’ll do some performance comparisons. We’ll take a close look at I/O and memory impact. We’ll see how compression affects both reads and writes to the data.
I created two tables; one with page compression called “bigTransactionHistoryPAGE” and one with row compression called “bigTransactionHistoryROW”. I also created a table with no compression called “bigTransactionHistory” that I’ll use during this demonstration as well. And finally, I populated the two compressed tables with around 31 million records.
Let’s take a look at the number of pages in each table or index:
SELECT OBJECT_NAME(i.[object_id]) AS TableName,
    i.name AS IndexName,
    SUM(s.used_page_count) AS IndexPages,
    FORMAT(1 - SUM(s.used_page_count) * 1.0 /
        CASE WHEN i.name LIKE 'IX%' THEN 131819 ELSE 143645 END, 'P') AS PercentSaved
FROM sys.dm_db_partition_stats AS s
JOIN sys.indexes AS i
    ON s.[object_id] = i.[object_id] AND s.index_id = i.index_id
WHERE OBJECT_NAME(i.[object_id]) LIKE 'bigTransactionHistory%'
    AND OBJECT_NAME(i.[object_id]) <> 'bigTransactionHistoryTEST'
GROUP BY i.[object_id], i.name
ORDER BY IndexPages DESC;
The uncompressed table is the largest. With row compression, the non-clustered index is about 30% smaller and the clustered index about 37% smaller. The page-compressed table is really interesting: the clustered index is much smaller than the original table, while the non-clustered index is about 39% smaller.
I’m going to turn on STATISTICS IO to compare the number of pages touched when I query each table:
SET STATISTICS IO ON;
GO
SELECT SUM(Quantity) AS ItemsPurchased FROM bigTransactionHistory;
SELECT SUM(Quantity) AS ItemsPurchased FROM bigTransactionHistoryROW;
SELECT SUM(Quantity) AS ItemsPurchased FROM bigTransactionHistoryPAGE;
You can see that less I/O is needed for the compressed tables. That is to be expected, since the clustered and non-clustered indexes are smaller when compressed.
The next query shows how many pages are in the buffer for each index:
SELECT COUNT(*) AS cached_pages_count,
    name,
    index_id
FROM sys.dm_os_buffer_descriptors AS bd
INNER JOIN
(
    SELECT OBJECT_NAME(object_id) AS name, index_id, allocation_unit_id
    FROM sys.allocation_units AS au
    INNER JOIN sys.partitions AS p
        ON au.container_id = p.hobt_id AND (au.type = 1 OR au.type = 3)
    UNION ALL
    SELECT OBJECT_NAME(object_id) AS name, index_id, allocation_unit_id
    FROM sys.allocation_units AS au
    INNER JOIN sys.partitions AS p
        ON au.container_id = p.partition_id AND au.type = 2
) AS obj
    ON bd.allocation_unit_id = obj.allocation_unit_id
WHERE database_id = DB_ID()
    AND name LIKE 'bigTransactionHistory%'
GROUP BY name, index_id
ORDER BY cached_pages_count DESC;
Again, when the index resides on fewer pages, the data takes less space and memory. Notice that the clustered index was used for the page-compressed table. We saw that the page-compressed clustered index is much smaller than the non-clustered index.
On this Azure VM, I have seen inconsistent results in the time to run the queries. For example, if I use DBCC DROPCLEANBUFFERS, or even restart SQL Server to ensure the data must be loaded from disk, one of these queries could run anywhere from 1 second to even a minute. I suspect that the VM is sharing resources with other VMs, and that is causing my discrepancies. I decided to run the queries each in a loop to better see the difference. Inside the loop, I use DBCC DROPCLEANBUFFERS to make sure that the data was loaded from disk, and not cached data.
DECLARE @Count INT = 0;
WHILE @Count < 100
BEGIN
    DBCC DROPCLEANBUFFERS;
    SELECT SUM(Quantity) AS ItemsPurchased FROM bigTransactionHistory;
    SET @Count += 1;
END;
The loop against the uncompressed table took 3 minutes and 31 seconds. The row-compressed table took 3 minutes and 4 seconds and the page compressed table took 3 minutes and 18 seconds. So in this case, the row-compressed query performed the best, with the page-compressed query next.
Let’s take a look at inserting data. In this script, I populated an uncompressed table and our two compressed tables with a loop of 1000 inserts of 1000 rows each. To make things fair, I ran DROPCLEANBUFFERS before each loop.
DBCC DROPCLEANBUFFERS;
GO
--2:40
DECLARE @Count INT = 0;
WHILE @Count < 1000
BEGIN
    INSERT INTO dbo.bigTransactionHistoryTEST
    SELECT * FROM bigTransactionHistory
    WHERE TransactionID BETWEEN @Count * 1000 + 1 AND (@Count + 1) * 1000;
    SET @Count = @Count + 1;
END;
The uncompressed inserts took 2 minutes and 40 seconds. The row-compressed inserts completed a bit faster at 2 minutes and 23 seconds, and the page-compressed inserts completed in 2 minutes and 14 seconds. The page-compressed inserts were actually fastest, and I suspect that it might be due to less I/O.
Let’s take a look at updates.
DECLARE @Count INT = 0;
WHILE @Count < 1000
BEGIN
    -- The UPDATE line is missing from the original listing; the table name here is assumed
    UPDATE dbo.bigTransactionHistory
    SET Quantity = Quantity + 1
    WHERE TransactionID BETWEEN @Count * 1000 + 1 AND (@Count + 1) * 1000;
    SET @Count = @Count + 1;
END;
Again, I ran loops. In this case, it’s 1000 updates of 1000 rows. Updating the uncompressed table took 1 minute and 48 seconds. Updating the row-compressed table took 1 minute and 36 seconds. The page-compressed table was much slower at 2 minutes and 21 seconds. As mentioned earlier, Microsoft recommends page compression for workloads that have few updates, and you can see that the updates are slower with page compression.
To summarize, I saw decent space savings, especially with page compression. The compressed tables performed better for both selects and inserts. Updates, however, performed noticeably worse with page compression.
This article is an overview of data compression. Compression is now available in all editions of SQL Server, starting with 2016 SP1. You can implement row and page level compression, and also use the new COMPRESS function. Compression can improve performance because of decreased I/O and memory pressure. I hope this article has been informative for you.
As DBAs, we all get to the point where we are asked to set up a new server for a particular environment. Setting up a new server is not a big thing, but answering the question “how well will it work?” can be tricky.
There are tons of items we can measure to see how well the newly installed server will respond, but here I will discuss one of the most valuable resources of the server: the disk. Most often the disk is not measured correctly, and I have seen environments where the disk response time has never been measured. I will discuss a tool from Microsoft that is very handy and can answer the question very quickly: diskspd.exe.
It supersedes SQLIO, which was previously used to measure disk I/O response time. The source code of diskspd.exe is hosted on GitHub. You can download this free utility from Microsoft’s website using this link.
After you download the utility, you will get a zip file. Just unzip it, and you will get the folders and files shown in the screenshot below. You will need the diskspd exe inside the folder “amd64fre” if you have a 64-bit version of SQL Server (most of us do).
One of our customers changed the value returned by @@SERVERNAME. SQL Server kept working without problems, but an unexpected behavior appeared: changing the value of @@SERVERNAME caused the backups to fail.
Looking at the maintenance jobs, I found that all of them had completed successfully and without issues. However, the database’s statistics showed that no backups had completed.
Because the database in question is part of an AlwaysOn Availability Group (AG), SQL Server executes sys.fn_hadr_backup_is_preferred_replica to determine whether the backup should take place on the current node. However, when a preferred replica is set, the function returned 0 for all databases. This is because it checks whether it is running on the preferred server by comparing the value of @@SERVERNAME with the value of replica_server_name in sys.availability_replicas. Because the values never match, the database is skipped on both the primary and the secondary replica.
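A quick way to check for this mismatch on a replica is to put the two names side by side and ask the function directly. The queries below are a minimal diagnostic sketch using only system objects; run them on each replica.

```sql
-- The name stored for @@SERVERNAME versus the name the instance reports.
SELECT @@SERVERNAME                  AS server_name_variable,
       SERVERPROPERTY('ServerName')  AS server_property_name;

-- Does any replica name in the AG metadata match @@SERVERNAME on this node?
SELECT ar.replica_server_name,
       CASE WHEN ar.replica_server_name = @@SERVERNAME
            THEN 1 ELSE 0 END AS matches_servername
FROM sys.availability_replicas AS ar;

-- What the backup-preference function returns for each AG database here.
SELECT d.name,
       sys.fn_hadr_backup_is_preferred_replica(d.name) AS is_preferred_replica
FROM sys.databases AS d
WHERE d.replica_id IS NOT NULL;   -- only databases that belong to an AG
```

If matches_servername is 0 for every row on every replica, no node considers itself preferred, which is consistent with the backups being skipped everywhere.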
I have created a Microsoft Connect item (link) asking for this bit of information to be added to the Books Online article (link). A similar request submitted by Ola Hallengren (Blog | Twitter) was closed as Won’t Fix (link). Please vote!
After 10 great years, extended support for all versions of SQL Server 2005 comes to an end on April 12, 2016. It is time to upgrade to SQL Server 2014.
Upgrading is not just a maintenance task, but an opportunity to provide new value to your business. Moving to SQL Server 2014 or Azure SQL Database enables you to achieve breakthrough performance:
- SQL Server 2014 has been benchmarked to be 13 times faster than SQL Server 2005, before the additional performance gains available from in-memory OLTP.
- AlwaysOn availability groups provide more reliable High Availability.
- Together with Windows Server 2012 R2 you can scale up across compute, networking and storage.
- New tools in SQL Server Management Studio provide an easy on-ramp to cloud to help you get more from your data platform investment.
Six Benefits to Planning for SQL Server 2005 and Windows Server 2003 End of Support Now
This is the final post in the series, in which I hope to summarize my findings for everyone. Thanks for all the comments and feedback I have received to date on these posts.
Before providing the summary of all the testing, research and reading, I want to provide some internals for folks.
NUMA Memory Allocation
If a server has a NUMA configuration, then when we define the max memory setting, memory is allocated evenly from all NUMA nodes. For example, if I allocate 8GB on a server that has 4 NUMA nodes, 2GB is allocated from each NUMA node for the instance in question. So be careful if you are using CPU affinity. Even if, say, you only want to use NUMA node 1 for an instance, memory allocation still happens across all NUMA nodes (at least in SQL Server 2014), which can lead to foreign memory access. So even though you might only have a subset of schedulers online, memory will still be accessed across all NUMA nodes. Although foreign memory access does not have a major impact on performance on new hardware, I still recommend avoiding it. If you must use CPU affinity, allocate a subset of the processors from each NUMA node.
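To see how memory is actually spread across the nodes, the per-node figures can be read from sys.dm_os_memory_nodes. This is a minimal sketch for SQL Server 2012 and later (the pages_kb and foreign_committed_kb columns do not exist in earlier versions).

```sql
-- Memory committed on each NUMA memory node of this instance.
SELECT memory_node_id,
       virtual_address_space_committed_kb,
       locked_page_allocations_kb,
       pages_kb,
       foreign_committed_kb   -- memory committed from other nodes
FROM sys.dm_os_memory_nodes
WHERE memory_node_id <> 64;   -- node 64 is typically the dedicated admin connection (DAC) node
```

Roughly equal pages_kb values across the nodes are what the even allocation described above would produce; a non-zero foreign_committed_kb points to foreign memory access.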
Lazy Writer Responsibility
Lazy Writer is a background process, and one thread is created per physical NUMA node (for Soft-NUMA we only get one thread; you can read details here by Jonathan Kehayias (Blog|Twitter)). The Lazy Writer’s responsibility is to make sure that a sufficient number of free pages, about 640, is always available for new data to be loaded. If the Lazy Writer thread is spinning, that is usually a clear signal of internal memory pressure.
Note: Internal memory pressure does not mean the server doesn’t have enough memory. Multiple factors can contribute to it; one factor is a low Max Server Memory setting, which is what has been explored in these articles.
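One way to watch the Lazy Writer threads is to look for them in sys.dm_exec_requests and map each one to its scheduler and NUMA node. The query below is a sketch of that idea; sampling it twice a few seconds apart and comparing cpu_time gives a rough sense of whether a thread is spinning.

```sql
-- One row per Lazy Writer thread, with the CPU and NUMA node it runs on.
SELECT r.session_id,
       r.command,
       r.status,
       r.cpu_time        AS cpu_time_ms,   -- cumulative CPU used by the thread
       s.scheduler_id,
       s.cpu_id,
       s.parent_node_id                    -- NUMA node of the scheduler
FROM sys.dm_exec_requests AS r
LEFT JOIN sys.dm_os_schedulers AS s
    ON s.scheduler_id = r.scheduler_id
WHERE r.command = N'LAZY WRITER';
```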
Original Scenario: SQL Server 2012 Latest Build + Windows 2008 R2 Latest Build + Locked Pages In Memory + Low Max Server Memory + AMD Processors + NUMA Configuration
In this build, I can consistently reproduce the issue. If I set Max Server Memory low enough, it causes the Lazy Writer on NUMA node 0 to spin non-stop, as I documented in SQL Server 2012 NUMA Node Imbalance – Cont’d. Beyond that, I now understand some additional internals around this, thanks to a colleague who helped, as I was not truly sure what was happening.
If we look at DBCC MEMORYSTATUS, we notice the following for Node 0:
Memory node Id = 0                KB
VM Reserved                127086340
VM Committed                  642452
Locked Pages Allocated         55996
Pages Allocated                54008
Pages Free                         0
Target Committed              512000
Current Committed             698448
Foreign Committed                  0
Away Committed                     0
Taken Away Committed               0
Compare Target Committed and Current Committed above. Target Committed is the memory this NUMA node has been allocated, and Current Committed is what the node is actually consuming; because current exceeds target, the Pages Free count is 0. The Lazy Writer therefore spins trying to free up memory, but it is unable to, so it spins indefinitely and pins the CPU. The question to answer: why can’t the Lazy Writer free up memory from NUMA Node 0?
Looking further into MEMORYSTATUS, we also notice, that most of the memory allocated into NUMA Node 0 is for buffer cache:
MEMORYCLERK_SQLBUFFERPOOL (node 0)        KB
VM Reserved                        103971820
VM Committed                          524288
Locked Pages Allocated                  2524
SM Reserved                                0
SM Committed                               0
Pages Allocated                            8
Test Scenario #1: Windows 10 + SQL Server 2012 RTM + Low Max Server Memory + Intel Processors + No-NUMA (Physical)
In this scenario I was just trying to reproduce the issue on my desktop, with no luck. Memory allocation and related figures were normal.
Test Scenario #2: Windows 2012 + SQL Server 2014 RTM + Low Max Server Memory + Intel Processors + No-NUMA (Hyper-V)
Again no issue. As per the title of the series, NUMA Imbalance, it was expected that these two non-NUMA scenarios would not show any issues, but I wanted to confirm it.
Test Scenario #3: SQL Server 2008 R2 + Windows 2008 R2 + Low Max Server Memory + AMD Processors
This test was on the same servers as the original scenario; however, we did not notice any issue.
*** During testing, we realized an interesting fact about NUMA configuration in SQL Server 2008 R2. SQL Server swaps the NUMA 0 and NUMA 1 CPU mapping when creating the logical mapping for schedulers. For example, if you have 2 NUMA nodes, each with 4 cores, SQL Server binds NUMA 0 (processors 0 to 3) to schedulers 4 to 7 and NUMA 1 (processors 4 to 7) to schedulers 0 to 3. This is known as a NUMA swap, documented in this MSDN blog article here by the CSS Team. However, this behavior does not seem to persist in SQL Server 2012.
Test Scenario #4: SQL Server 2014 + Windows 2012 R2 + Low Max Server Memory + Intel Processors + NUMA (Azure IaaS Virtual Machine)
This was an interesting case. When I set Max Server Memory to a low value, the processor time on NUMA node 0 started to increase. It was not as severe as with SQL Server 2012, although these are different processors. I saw the same behavior as noted above: Current Committed was higher than Target Committed, so the Lazy Writer thread on Node 0 was spinning. The interesting thing to note is that it was not as aggressive as in SQL Server 2012. It was not spinning continuously and allowed the CPU to breathe; however, it would wake up frequently, as often as every 20ms.
!!! Eliminated Locked Pages in Memory as a factor for NUMA Imbalance.
Test Scenario #5: SQL Server 2014 + Windows 2012 R2 + Low Max Server Memory + Intel Processors + Locked Pages In Memory + NUMA (Azure IaaS Virtual Machine)
Exactly the same behavior as above was noticed. However, after a restart, SQL Server no longer had Current Committed higher than Target Committed. Even lowering the max memory further than in Scenario #4 did not cause as dramatic a shift.
*** Another interesting thing to note in SQL Server 2014: up to SQL Server 2012, the Lazy Writer thread was bound to CPU 0 of each NUMA node. In SQL Server 2014, the Lazy Writer thread binds to the last CPU of each NUMA node.
Conclusion & Things to Consider
- Configuring SQL Server Max Server Memory too low will cause the Lazy Writer to spin due to internal memory pressure.
- There is no fixed value at which this behavior starts; it depends on the load of the server. For example, in the Original Scenario, because of the server load the Lazy Writer thread started spinning below 4000MB; in Test Scenario #4, however, I had to go down as far as 1000MB. Therefore, on your servers, look at the Lazy Writer thread in conjunction with CPU and DBCC MEMORYSTATUS to understand what is truly happening.
- This behavior is consistent whether or not Locked Pages in Memory is enabled.
- In Windows 2012 R2, NUMA allocation seems to be much better than in Windows 2008 R2.
- The SQL Server 2014 Lazy Writer thread does not seem to be as aggressive as in SQL Server 2012.
- A few KB articles that fix NUMA-related issues in SQL Server 2012 came up during my research: KB2819662 and KB2926223.
- There have been some fixes released for NUMA on the Windows 2008 R2 operating system as well; please review KB2155311 and KB2510206.
- In addition, if you are running a server with more than 64 logical cores, please review the K-Group configuration; for details, please reference this article.
- Last but not least, verify that memory is distributed evenly on the physical server, that is, that the memory banks have been evenly populated and you do not have more memory in one bank than in another. Otherwise you might also see one NUMA node working harder if there is a significant amount of foreign memory access.
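When checking a server against the points above, it helps to start from the configured limit and the instance-wide committed figures. The sketch below reads both; the committed_kb and committed_target_kb columns of sys.dm_os_sys_info assume SQL Server 2012 or later.

```sql
-- The Max Server Memory value currently in effect.
SELECT c.name,
       c.value_in_use AS max_server_memory_mb
FROM sys.configurations AS c
WHERE c.name = N'max server memory (MB)';

-- Instance-wide committed memory versus its target.
-- committed_kb persistently above committed_target_kb suggests internal memory pressure.
SELECT committed_kb,
       committed_target_kb
FROM sys.dm_os_sys_info;
```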
Question to answer, why can’t Lazy Writer free up memory from NUMA Node 0?
The SQL Server Lazy Writer is only responsible for clearing memory from the buffer pool; it cannot remove or clear memory from any other memory clerk. It is therefore possible that other internal components have memory allocated on NUMA Node 0 only, so that freeing buffer pool pages alone cannot release enough memory on that node. Looking at DBCC MEMORYSTATUS, we can see which memory clerks allocate from NUMA Node 0 only. There are a few: for example, MEMORYCLERK_SQLCLR, MEMORYCLERK_SOSMEMMANAGER, OBJECTSTORE_LOCK_MANAGER, and MEMORYCLERK_XE_BUFFER all have allocations from NUMA Node 0 only.
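The same breakdown can be obtained in tabular form from sys.dm_os_memory_clerks, which is easier to filter than the DBCC MEMORYSTATUS output. The query below is a sketch for SQL Server 2012 and later (it relies on the pages_kb column); clerks that show a row for node 0 but not for the other nodes are the ones allocating from Node 0 only.

```sql
-- Memory used by each clerk type on each memory node.
SELECT type,
       memory_node_id,
       SUM(pages_kb)                    AS pages_kb,
       SUM(virtual_memory_committed_kb) AS vm_committed_kb
FROM sys.dm_os_memory_clerks
GROUP BY type, memory_node_id
HAVING SUM(pages_kb) + SUM(virtual_memory_committed_kb) > 0
ORDER BY type, memory_node_id;
```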