Last week, I wrote about what MaxDOP is and why you may not want to leave it at the default value of zero on your multi-processor systems. This week I'm going to talk about how to change it and give some results from my server where I made a change.
MaxDOP is an advanced configuration option, so if your server is not already configured to do so, you'll need to set it to display advanced configuration options:
EXEC dbo.sp_configure 'show advanced options', 1; GO RECONFIGURE; GO
Now if you just issue the sp_configure command without any parameters, you can see what your current MaxDOP setting is:
You can see I've already changed my MaxDOP value from the default of zero to a value of 8. (The maximum value of 64 is the maximum value SQL Server will accept, not the maximum possible value for this system.)
If you are more of a GUI person, you can see this info in SSMS by right clicking the server in Object Explorer and choosing Properties, then the Advanced tab:
Changing the setting is realtively easy and does not require a server restart to take effect. If you are using the GUI, you can enter a new value directly in the screen shown above. For T-SQL, use the following:
EXEC dbo.sp_configure 'max degree of parallelism', 8; GO RECONFIGURE; GO
Substitute whatever value you want for 8 in the above command, of course.
Now that you know how to change the value, how do you know if you should change the value?
Introducing the CXPACKET wait type
SQL Server can spend time waiting on all sorts of things - data to be read from disk, data to be written to disk, waiting for locks to be acquired or released, or, of particular interest to us, waiting for the results of parallelized queries to get recombined. This later wait is tracked via a wait type counter in SQL Server called CXPACKET. This counter is available in SQL Server 2000 through the DBCC sqlperf(waitstats) command, but in SQL 2005 and later, you can also see it with the sys.dm_os_wait_stats dynamic management view.
The dynamic management view will return a value, in milliseconds, of how long SQL has spent waiting to combine the results for parallelized queries since the last time the server was restarted or the last time the wait stats were cleared. Glenn Berry has, as part of his great diagnostic queries, a query that will list the top wait states on your servers. His query is:
-- Isolate top waits for server instance since last restart or statistics clear WITH Waits AS (SELECT wait_type ,wait_time_ms / 1000. AS wait_time_s ,100. * wait_time_ms / SUM(wait_time_ms) OVER () AS pct ,ROW_NUMBER() OVER (ORDER BY wait_time_ms DESC) AS rn FROM sys.dm_os_wait_stats WHERE wait_type NOT IN ( 'CLR_SEMAPHORE', 'LAZYWRITER_SLEEP', 'RESOURCE_QUEUE', 'SLEEP_TASK', 'SLEEP_SYSTEMTASK', 'SQLTRACE_BUFFER_FLUSH', 'WAITFOR', 'LOGMGR_QUEUE', 'CHECKPOINT_QUEUE', 'REQUEST_FOR_DEADLOCK_SEARCH', 'XE_TIMER_EVENT', 'BROKER_TO_FLUSH', 'BROKER_TASK_STOP', 'CLR_MANUAL_EVENT', 'CLR_AUTO_EVENT', 'DISPATCHER_QUEUE_SEMAPHORE', 'FT_IFTS_SCHEDULER_IDLE_WAIT', 'XE_DISPATCHER_WAIT', 'XE_DISPATCHER_JOIN', 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP', 'ONDEMAND_TASK_QUEUE', 'BROKER_EVENTHANDLER', 'SLEEP_BPOOL_FLUSH') ) SELECT W1.wait_type ,CAST(W1.wait_time_s AS DECIMAL(12, 2)) AS wait_time_s ,CAST(W1.pct AS DECIMAL(12, 2)) AS pct ,CAST(SUM(W2.pct) AS DECIMAL(12, 2)) AS running_pct FROM Waits AS W1 INNER JOIN Waits AS W2 ON W2.rn <= W1.rn GROUP BY W1.rn ,W1.wait_type ,W1.wait_time_s ,W1.pct HAVING SUM(W2.pct) - W1.pct < 99 OPTION (RECOMPILE); -- percentage threshold
There are a couple of nice things this query does. First, it converts the milliseconds value the DMV returns to seconds. Second, it orders the waits from largest to smallest and also provides a running total percentage.
So that's how you see what your CXPACKET wait is. But how do you know if you've got a problem. Microsoft suggests that if your CXPACKET wait consists of more than 5% of your total wait time, then you may have a parallelism bottleneck. I will again point out the It Depends SQL caveat applies here, as it does with all general SQL configuration advice.
Tracking Your Wait Times
If you decide you need to change your MaxDOP setting, you're going to want a baseline of your pre-change system operation to compare against to determine if you've actually improved performance. One way to do that is to manually reset the wait stats, let the system run for a set period of time, make the MaxDOP setting change, reset the wait stats again, then let the system run for the same length of time again. Ideally, your two test periods should include times of similar usage.
At this point, I feel the need to point out that you should always test system configuration changes on a test system before making them on production system.
You can reset your wait stats using the following command:
DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR)
For my test, I created two jobs. The first job cleared the wait stats and ran on at 8 AM on Monday morning. The second job executed Glenn Berry's query above on the following Friday at 9 PM and wrote the results to a text file. I set up the schedules for the jobs to run twice. (I could have written the data to a table, but I was in a hurry setting this up, so I just wrote to a text file. If you do this, make sure you choose the Append option so you don't overwrite existing data with your second execution.) After the first week, I had my baseline data. When I came in on the second Monday, I made sure I changed the MaxDOP setting prior to 8 AM. (No restart is required when you change this setting.) When the following Friday rolled around, I had my second set of performance data gathered after my changes.
I should point out, the server I am testing is a 16 core system in a production environment. It is NUMA-enabled and has 8 processor cores per NUMA node. Based on Microsoft's recommendation, I determined I should set the MaxDOP value to 8. My baseline numbers:
After I made the change, these were my results:
So you can see, the time SQL Server spent waiting for parallelism to complete went down by 160,981,67 seconds - 44.7 hours! I just gave this server one man-week of additional computing time! It spent an additional 6 hours of that time waiting to acquire intent shared locks and an additional 3.7 hours waiting for OLE DB connections, but I'm completely OK with this because those wait times are indicative of a working system. An increase in OLEDB means the server is spending more time waiting for data to return from either a call to a linked server or for data to load from the bulk insert task or some other process that uses the SQL Server native client OLE DB provider. Which, in turn means, SQL is processing more work than it used to.
Now, I'm not going to say that OLEDB wait stat value is a good one. It now represents 35% of the time SQL Server spends waiting and if I can trim that down, I can squeeze some more performance out of this server. But that will require some deeper digging to find out what the OLE DB calls being made are and what, if anything, I can do about them.
If I take a step back to look at the overall picture and add up the total time spent waiting on the top 15 waits (which are the waits Glenn's query returns), I can see my total time spent waiting dropped from 321 hours to 294 hours!
I'm also not going to claim this is a scientifically accurate test. For one, I had no control over the load the server experienced during this week, so the two workloads may have been significantly different. However, by looking at an entire week, I can more or less average out any one-time anomalies a large one-off query might introduce. I also know the nominal workload of my servers and I can make a reasonable assumption that the workload was more or less the same. Further, as a working DBA, I'm more concerned about my actual performance improvement than a theoretical improvement. So all in all, for a first step in performance improvement, I'm very happy with this result!
Bonus: When analyzing performance data, I like to find alternative ways to validate my conclusions. My conclusion above was that this server was now spending more time processing requests. So if I use my PerfMon monitoring routine to look at the number of batch requests per second this server is handling, I should see an increase from the first week to the second week.
Sure enough, if you look at the average low point over the two weeks, you'll see the lows were under 200 for the first week and over 200 for the second. Likewise, the highs are higher, going from typically under 600 to peaking over 600 multiple times.