Technical: Microsoft – SQL Server – Wait Stats – CXPacket

Technical: Microsoft – SQL Server – Wait Stats – CXPacket

Introduction

Reviewing SQL Server Wait Stats is a beaten down path towards understanding and optimizing your database’s bottlenecks.

What is CXPacket wait?

When the SQL Server Engine chooses to break down a query into smaller pieces and process the individual pieces in parallel there is a lag between when the first results is received and when the results are compiled.

Craig Freedman has a very good and short design lead in:

Craig Freedman – Introduction to Parallel Query Execution
http://blogs.msdn.com/b/craigfr/archive/2006/10/11/introduction-to-parallel-query-execution.aspx

Caveat

One should keep in mind that CXPacket wait stats are not wholly distinct from other wait stats. That is, because waits are marked as CXPacket, one should not over-read and think that for a specific request waits related to I/O, CPU, and Memory, Locks and Latches are not contributing to the query’s CXPacket wait stat.

In fact, it seems that once a query is parallelized, a timer is started and only stopped once all parallelized threads complete and their results are assembled.

Trouble spot

Based on Microsoft’s published documentation if the percentile of Wait Stats attributed to CXPacket exceeds 5% one should take a good and clear eye view of the settings that are relevant to Parallelism.

Extrapolate Wait Stats

Extrapolate Wait Stats – All Wait Stats

Let us quickly look at all Wait Stats.

Publicly stealing from Paul Randal:

Tell me where it hurts:

http://www.sqlskills.com/blogs/paul/wait-statistics-or-please-tell-me-where-it-hurts/



;WITH [Waits] AS
    (SELECT 
		*
		, ROW_NUMBER() OVER(ORDER BY [ResourceS] DESC) 
                    AS [RowNum]
     from
	   (

	      select
		[wait_type],
		[wait_time_ms] / 1000.0 AS [WaitS],
		([wait_time_ms] - [signal_wait_time_ms]) / 1000.0 
                    AS [ResourceS],
		[signal_wait_time_ms] / 1000.0 AS [SignalS],
		[waiting_tasks_count] AS [WaitCount],
		(100.0 * [wait_time_ms]) / SUM ([wait_time_ms]) OVER() 
                    AS [Percentage]
	     FROM sys.dm_os_wait_stats										
	     where waiting_tasks_count != 0

	  ) tblIn		

	)				
SELECT 
    [W1].[wait_type] AS [WaitType],
    CAST ([W1].[WaitS] AS DECIMAL(14, 2)) AS [Wait_S],
    CAST ([W1].[ResourceS] AS DECIMAL(14, 2)) AS [Resource_S],
    CAST ([W1].[SignalS] AS DECIMAL(14, 2)) AS [Signal_S],
    [W1].[WaitCount] AS [WaitCount],
    CAST ([W1].[Percentage] AS DECIMAL(4, 2)) AS [Percentage],
    CAST (([W1].[WaitS] / [W1].[WaitCount]) AS DECIMAL (14, 4)) 
        AS [AvgWait_S],
    CAST (([W1].[ResourceS] / [W1].[WaitCount]) AS DECIMAL (14, 4)) 
        AS [AvgRes_S],
    CAST (([W1].[SignalS] / [W1].[WaitCount]) AS DECIMAL (14, 4)) 
        AS [AvgSig_S]
FROM [Waits] AS [W1]
INNER JOIN [Waits] AS [W2]
    ON [W2].[RowNum] <= [W1].[RowNum]
GROUP BY [W1].[RowNum], [W1].[wait_type], [W1].[WaitS],
    [W1].[ResourceS], [W1].[SignalS], [W1].[WaitCount], [W1].[Percentage]
HAVING SUM ([W2].[Percentage]) - [W1].[Percentage] < 95
order by CAST ([W1].[Percentage] AS DECIMAL(4, 2)) desc 
; -- percentage threshold
GO

Output:

SQLServerWaitStats-QA

From the above we are able to see our most costly wait categories:

 Wait Type What does it mean?
CLR_AUTO_EVENT  CLR
SQLTRACE_INCREMENTAL_FLUSH_SLEEP  SQLTRACE
LOGMGR_QUEUE  LOGMGR
REQUEST_FOR_DEADLOCK_SEARCH System looking for Deadlock (preventive maintenance)
……
CXPACKET Parallelism

Extrapolate Wait Stats – Some Wait Stats

Filter out some wait stats categories, and only look at some.

Again resting on Paul Randal:



WITH [Waits] AS
    (SELECT
        [wait_type],
        [wait_time_ms] / 1000.0 AS [WaitS],
        ([wait_time_ms] - [signal_wait_time_ms]) / 1000.0 
            AS [ResourceS],
        [signal_wait_time_ms] / 1000.0 AS [SignalS],
        [waiting_tasks_count] AS [WaitCount],
        100.0 * [wait_time_ms] / SUM ([wait_time_ms]) OVER()
            AS [Percentage],
        ROW_NUMBER() OVER(ORDER BY [wait_time_ms] DESC)
            AS [RowNum]
    FROM sys.dm_os_wait_stats
    WHERE [wait_type] NOT IN (
        N'CLR_SEMAPHORE',    N'LAZYWRITER_SLEEP',
        N'RESOURCE_QUEUE',   N'SQLTRACE_BUFFER_FLUSH',
        N'SLEEP_TASK',       N'SLEEP_SYSTEMTASK',
        N'WAITFOR',          N'HADR_FILESTREAM_IOMGR_IOCOMPLETION',
        N'CHECKPOINT_QUEUE', N'REQUEST_FOR_DEADLOCK_SEARCH',
        N'XE_TIMER_EVENT',   N'XE_DISPATCHER_JOIN',
        N'LOGMGR_QUEUE',     N'FT_IFTS_SCHEDULER_IDLE_WAIT',
        N'BROKER_TASK_STOP', N'CLR_MANUAL_EVENT',
        N'CLR_AUTO_EVENT',   N'DISPATCHER_QUEUE_SEMAPHORE',
        N'TRACEWRITE',       N'XE_DISPATCHER_WAIT',
        N'BROKER_TO_FLUSH',  N'BROKER_EVENTHANDLER',
        N'FT_IFTSHC_MUTEX',  N'SQLTRACE_INCREMENTAL_FLUSH_SLEEP',
        N'DIRTY_PAGE_POLL',  N'SP_SERVER_DIAGNOSTICS_SLEEP')
    )
SELECT
    [W1].[wait_type] AS [WaitType],
    CAST ([W1].[WaitS] AS DECIMAL(14, 2)) AS [Wait_S],
    CAST ([W1].[ResourceS] AS DECIMAL(14, 2)) AS [Resource_S],
    CAST ([W1].[SignalS] AS DECIMAL(14, 2)) AS [Signal_S],
    [W1].[WaitCount] AS [WaitCount],
    CAST ([W1].[Percentage] AS DECIMAL(4, 2)) AS [Percentage],
    CAST (([W1].[WaitS] / [W1].[WaitCount]) AS DECIMAL (14, 4)) 
         AS [AvgWait_S],
    CAST (([W1].[ResourceS] / [W1].[WaitCount]) AS DECIMAL (14, 4)) 
         AS [AvgRes_S],
    CAST (([W1].[SignalS] / [W1].[WaitCount]) AS DECIMAL (14, 4)) 
         AS [AvgSig_S]
FROM [Waits] AS [W1]
INNER JOIN [Waits] AS [W2]
    ON [W2].[RowNum] <= [W1].[RowNum]
GROUP BY [W1].[RowNum], [W1].[wait_type], [W1].[WaitS],
    [W1].[ResourceS], [W1].[SignalS], [W1].[WaitCount], [W1].[Percentage]
HAVING SUM ([W2].[Percentage]) - [W1].[Percentage] < 95; 
GO

Output (Testing):

 SQLServerWaitStats-Filtered-QA

Output (Production):

SQLServerWaitStats-Filtered-Prod

sys.dm_os_wait_stats (Transact SQL)

http://technet.microsoft.com/en-us/library/ms179984.aspx

 Wait Type What does it mean?
CXPACKET Occurs when trying to synchronize the query processor exchange iterator. You may consider lowering the degree of parallelism if contention on this wait type becomes a problem.
OLEDB Occurs when SQL Server calls the SQL Server Native Client OLE DB Provider. This wait type is not used for synchronization. Instead, it indicates the duration of calls to the OLE DB provider.
LATCH_EX Occurs when waiting for an EX (exclusive) latch.
ASYNC_NETWORK_IO Occurs on network writes when the task is blocked behind the network. Verify that the client is processing data from the server.
BACKUPBUFFER Database Backup
ASYNC_IO_COMPLETION Occurs when a task is waiting for I/Os to finish
BACKUPIO Database Backup
CXROWSET_SYNC Occurs during a parallel range scan.

Wait Stats Decision?

Which Wait Stats shall we look at ?  As our title suggests, CXPacket.

Review CXPacket Wait Stats Related Settings

  • Machine → Number of CPUs
  • Machine → Hyper-Threading (HT) setting
  • SQL Server Instance Setting → Max degree of parallelism
  • SQL Server Instance Setting → Parallelism Cost Threshold
  • SQL Server Instance Setting → CPU & I/O Affinity
  • SQL Server – Exposes – NUMA Characteristics

Machine / Number of CPU

The decision as to what to set SQL Server’s Instance setting for “Max degree of Parallelism” to is based on Number of Processor Cores on the box.

There are various tools for determining the number of cores on your box; our choices includes:

  • CPUID (http://www.cpuid.com/)
  • System Information ( built into Windows \ Accessories \ System Tools \ System Information)

My box happens to be a virtualized host and I really need to dig in and determine what my true # of processors is.

Machine / Hyper-Threading setting

Like any groundbreaking Technology, it seems that it took a few iterations for Chip manufacturers (Intel and AMD), OS Vendor (Microsoft), and Database Vendor to fully “bake” HT Technology.

Discussions about HT and how it relates to SQL Server warrant several postings, and I will suggest that you Google for more popular postings

Machine / Hyper-Threading setting – SQL Command

To get an idea on whether Hyper-Threading is being used, how many logical processors is being exposed, and how many physical processors you have, please issue a query similar to the one below.

SQL  Command:

Courtesy of Dattatray Nale
http://dattatraynale.blogspot.com/2012/11/how-to-find-number-of-physical.html



SELECT 

  cpu_count AS NumberOfLogicalCPUs

, hyperthread_ratio

, ( cpu_count / hyperthread_ratio ) AS NumberOfPhysicalCPUs

, CASE
      WHEN hyperthread_ratio = cpu_count THEN cpu_count
      ELSE ( ( cpu_count - hyperthread_ratio ) / 
             ( cpu_count / hyperthread_ratio ) )
 END AS NumberOfCoresInEachCPU

, CASE
	WHEN hyperthread_ratio = cpu_count THEN cpu_count
	ELSE ( cpu_count / hyperthread_ratio ) 
	* ( ( cpu_count - hyperthread_ratio ) / 
            ( cpu_count / hyperthread_ratio ) )
  END AS TotalNumberOfCores

FROM sys.dm_os_sys_info

OS Command:


wmic cpu get NumberofCores, NumberofLogicalProcessors

To rest, determine your HT Setting and ensure that it is properly considered as you make a decision as to # of CPU Cores and what to set your “Max degree of Parallelism” to.

SQL Server Instance / Max degree of Parallelism

Issue SQL Server Query:

exec sp_configure 'max degree of parallelism'

Output:

maxdegreeofParallelism

SQL Server Instance / Cost threshold for Parallelism

Issue SQL Server Query:

exec sp_configure 'cost threshold for parallelism'

Output:

CostThresholdforParallelism

 

SQL Server Instance – CPU & I/O Affinity

Each SQL Server Instance can be configured to use specific processor and I/O bindings.

To determine a SQL Server Instance current setting:

GUI

  • In “Management Studio”, right click on the SQL Server Instance node
  • From the drop-down menu, select the “Properties” menu entry
  • From the “Server Properties”window, click the “Processor” page
  • Review the settings for Processor and I/O Affinity

Here is ours:

ProcessorAffinity

We can see that:

  • Processor and I/O affinity are not set
  • We have a single NUMA node

Query


exec sp_configure 'show advanced options', 1;

reconfigure with override;

--Processor Mask
exec sp_configure 'affinity mask';

--IO mask
exec sp_configure 'affinity I/O mask';

SQL Server Instance – Exposes – NUMA Characteristics

Non-Uniform Memory Access

http://en.wikipedia.org/wiki/Non-Uniform_Memory_Access

Non-Uniform Memory Access is a computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to a processor. Under NUMA, a processor can access its own local memory faster than non-local memory.

Limiting the number of memory accesses provided the key to extracting high performance from a modern computer. For commodity processors, this meant installing an ever-increasing amount of high-speed cache memory and using increasingly sophisticated algorithms to avoid cache misses. But the dramatic increase in size of the operating systems and of the applications run on them has generally overwhelmed these cache-processing improvements. Multi-processor systems without NUMA make the problem considerably worse. Now a system can starve several processors at the same time, notably because only one processor can access the computer’s memory at a time.

NUMA attempts to address this problem by providing separate memory for each processor, avoiding the performance hit when several processors attempt to address the same memory. For problems involving spread data (common for servers and similar applications), NUMA can improve the performance over a single shared memory by a factor of roughly the number of processors (or separate memory banks).

Thus, Multi-processor machines utilize NUMA to more gracefully partition and protect memory.

It is better to minimize cross NUMA Node memory access.  And, so keep that in mind as you set “Max degree of Parallelism” .

Basically, you want to limit the maximum set of processors accessed to the number of processors per NUMA bank.

Queries

Once you have documented the relevant settings that might affect whether a query is ran in parallel or not, it really makes sense to gain awareness as to which queries are being ran in parallel.

Review Cached Plan:

Looked on the Net for sample code.  And, quite a few of the usual public committers  posted good queries:

Settled on the one by StrateSQL; here it is:



/*

StrateSQL
Can You Dig It? – Parallelism in the Plan Cache

http://www.sqlservercentral.com/blogs/stratesql/2010/12/9/can-you-dig-it_3F00_-_1320_-parallelism-in-the-plan-cache/

*/

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

WITH XMLNAMESPACES
(DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')

SELECT
	  COALESCE(DB_NAME(p.dbid)
  , p.query_plan.value('(//RelOp/OutputList/ColumnReference/@Database)[1]'
                               ,'nvarchar(128)')) 
            AS DatabaseName 
	, DB_NAME(p.dbid) + '.' 
               + OBJECT_SCHEMA_NAME(p.objectid, p.dbid) 
               + '.' + OBJECT_NAME(p.objectid, p.dbid) 
              AS ObjectName

	, cp.objtype

	, p.query_plan

	, cp.UseCounts

	, cp.plan_handle

	, CAST('' AS xml) AS SQLText

FROM sys.dm_exec_cached_plans cp

	CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) p

	CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) as q

WHERE cp.cacheobjtype = 'Compiled Plan'

AND p.query_plan.exist('//RelOp[@Parallel = "1"]') = 1

ORDER BY 

   COALESCE(DB_NAME(p.dbid)
                , p.query_plan.value(
                   '(//RelOp/OutputList/ColumnReference/@Database)[1]'
                    , 'nvarchar(128)'
               )
            )

  , UseCounts DESC

Review Active Sessions:

If on a whim, you will like to see which active, ongoing queries are running in parallel try the untested SQL Script I pasted below:



;with cteParallelProcesses (spid, numberofMultipleECIDs)
as
	(
		select   
			  spid
			, count(ecid) as numberofMultipleECIDs
		from     sys.sysprocesses
		group by 
			spid
		having 
			COUNT(ecid) > 1	
	)		

select 
	  tblSysProcess.spid
	, tblCTE.numberofMultipleECIDs  
	, suser_sname(tblSessions.security_id) as [userName]
	, tblSysProcess.lastwaittype
	, tblSysProcess.waitresource
	, tblSysProcess.waittime
	, db_name(tblSysProcess.dbid) as [databaseName]
	, tblSysProcess.cpu
	, tblSysProcess.physical_io
	, tblSysProcess.memusage
	, tblSysProcess.status
	, tblSysProcess.hostname
	, tblSysProcess.program_name
--	, tblSysProcess.cmd
	, tblSQLText.text as sqlText
	, object_name(tblSQLText.objectid, tblSQLText.dbid) 
              as objectName

from   sys.sysprocesses tblSysProcess

  inner join sys.dm_exec_connections tblConnections

		on tblSysProcess.spid = tblConnections.session_id

  inner join sys.dm_exec_sessions tblSessions

		on tblSysProcess.spid = tblSessions.session_id

  cross apply sys.dm_exec_sql_text(most_recent_sql_handle) tblSQLText

	 inner join cteParallelProcesses tblCTE

		on tblSysProcess.spid = tblCTE.spid

where tblSysProcess.spid > 50

order by 
         tblCTE.numberofMultipleECIDs desc
; 

Understand Parallelized Queries

I think we will like to review the batch of Queries that are flagged as Parallelized and get an idea on why?

Factors

Here are some of the recurring reasons:

  • Missing Indexes (which leads to more IO)
  • Missing column statistics
  • Complex SQL including ones included in exec (‘–sql—’)
  • Resource pressure that could end up serializing what would otherwise have benefited from being actually ran in parallel ( crediting sqldoctor @ http://sqldoctor.idera.com/wait-stats/parallelism-hurry-up-and-wait/)

Blind Sides

There are so many blind sides to this parallelism story.

Variable Argument vs Concrete Argument

Though estimated Query Plan might indicate that parallelism will occur, when your run the query and supply actual values for the parameters the optimized might see that the data set is narrow enough and the arguments for Parallelism get decimated.

Here is a sampler:

  • In the top pane, variables were submitted (@retirementAge) in the where clause and in the bottom actual values were submitted (’7/1/2013′)

Estimated vs Actual

Hardening Parallelism for specific user through Resource Group

To constrain this posting I will only briefly touch on this point.  My thoughts were excited  via a blog posting from Cindy Gross.

Cindy Gross – The ins and outs of MAXDOP

http://blogs.msdn.com/b/cindygross/archive/2011/01/28/the-ins-and-outs-of-maxdop.aspx

“… and your login is not limited by MAX_DOP settings in a Resource Governor resource pool.”

Here are some contributory postings:

How much of Parallelism is based on your Hardware?

I have a lot more reading and LAB work to do to determine to what extent parallelism decisions are anchored to Hardware performance…

But, I think you want to to get the sets of queries that are being ran in parallel.   This should be accessed across your entire deployment systems (Development, Testing, and Production).

You also want to give a bit more attention to data from your Production box; for the following reasons:

  • It will have more data
  • And, the hardware (Storage and CPU) is more representative to as what your users are experiencing
  • Your concurrent workload which affects whether SQL Server Engine parallelizes is also more helpful when seen through your Production lenses
  • Against Production DB, consider running server-side SQL Trace and using the replay functionality.  And, in turn run that against systems in the lower tier to simulate Production DB  load

Adjust your Settings

Max degree of Parallelism

What should you set “Max degree of Parallelism” to:

Here is a formula that I found on the Web for calculating Max degree of Parallelism:

Sakthivel Chidambaram - Wow… We have MAXDOP calculator for SQL Server… It makes my job easier:
http://blogs.msdn.com/b/sqlsakthi/archive/2012/05/24/wow-we-have-maxdop-calculator-for-sql-server-it-makes-my-job-easier.aspx

http://blogs.msdn.com/b/sqlsakthi/p/maxdop-calculator-sqlserver.aspx

Get # of NUMA Nodes

http://blogs.msdn.com/b/sqlsakthi/p/maxdop-calculator-sqlserver.aspx

From SQL Server issue:


select COUNT(distinct memory_node_id) as memory_node_id
from    sys.dm_os_memory_clerks
where   memory_node_id != 64

 

Our result:

distinct_memory_node_id

Get # of Processor Cores

http://blogs.msdn.com/b/sqlsakthi/p/maxdop-calculator-sqlserver.aspx

From command line issue:


powershell -Command "& Get-WmiObject -namespace "root\CIMV2"
-class Win32_Processor -Property NumberofCores | select NumberofCores"

Output:

NumberofCores

Submit Data:

Submit data to http://blogs.msdn.com/b/sqlsakthi/p/maxdop-calculator-sqlserver.aspx

Here is what we came up with:

SQLServerMaxDegreeofParallelismCaclculator

Cost threshold for Parallelism

Based on Joe Chang’s published hard-work, consider adjusting your “Cost threshold for Parallelism” to match your hardware.

That is, on bigger boxes, if you are think too may queries are being parallelized determine their average cost and raise your “Cost threshold for Parallelism” a bit higher.

I found Chris Mcgowan’s blog posting to be most useful.

Cost Threshold for Parallelism

http://sqlserver365.blogspot.com/2013/02/cost-threshold-for-parallelism.html



declare @PlanCacheForMaxDop TABLE
(
      StatementSubTreeCost FLOAT 
    , UseCounts INT 
    , PlanSizeInBytes INT
);

-- Collect parallel plan information
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
WITH XMLNAMESPACES  
(DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan') 
INSERT INTO @PlanCacheForMaxDop
(
      StatementSubTreeCost
    , UseCounts
    , PlanSizeInBytes
)
SELECT 
     n.value('(@StatementSubTreeCost)[1]', 'VARCHAR(128)') 
        AS StatementSubTreeCost,
     ecp.usecounts,
     ecp.size_in_bytes 
FROM sys.dm_exec_cached_plans AS ecp
CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS eqp
CROSS APPLY 
query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple')  AS qn(n)

WHERE  n.query('.').exist('//RelOp[@PhysicalOp="Parallelism"]') = 1;

-- Return grouped parallel plan information
SELECT  MAX(CASE WHEN StatementSubTreeCost BETWEEN 1 AND 5 THEN '1-5'
                 WHEN StatementSubTreeCost BETWEEN 5 AND 6 THEN '5-6'
                 WHEN StatementSubTreeCost BETWEEN 6 AND 7 THEN '6-7'
                 WHEN StatementSubTreeCost BETWEEN 7 AND 8 THEN '7-8'
                 WHEN StatementSubTreeCost BETWEEN 8 AND 9 THEN '8-9'
                 WHEN StatementSubTreeCost BETWEEN 9 AND 10 THEN '9-10'
                 WHEN StatementSubTreeCost BETWEEN 10 AND 11 THEN '10-11'
                 WHEN StatementSubTreeCost BETWEEN 11 AND 12 THEN '11-12'
                 WHEN StatementSubTreeCost BETWEEN 12 AND 13 THEN '12-13'
                 WHEN StatementSubTreeCost BETWEEN 13 AND 14 THEN '13-14'
                 WHEN StatementSubTreeCost BETWEEN 14 AND 15 THEN '14-15'
                 WHEN StatementSubTreeCost BETWEEN 15 AND 16 THEN '15-16'
                 WHEN StatementSubTreeCost BETWEEN 16 AND 17 THEN '16-17'
                 WHEN StatementSubTreeCost BETWEEN 17 AND 18 THEN '17-18'
                 WHEN StatementSubTreeCost BETWEEN 18 AND 19 THEN '18-19'
                 WHEN StatementSubTreeCost BETWEEN 19 AND 20 THEN '19-20'
                 WHEN StatementSubTreeCost BETWEEN 20 AND 25 THEN '20-25'
                 WHEN StatementSubTreeCost BETWEEN 25 AND 30 THEN '25-30'
                 WHEN StatementSubTreeCost BETWEEN 30 AND 35 THEN '30-35'
                 WHEN StatementSubTreeCost BETWEEN 35 AND 40 THEN '35-40'
                 WHEN StatementSubTreeCost BETWEEN 40 AND 45 THEN '40-45'
                 WHEN StatementSubTreeCost BETWEEN 45 AND 50 THEN '45-50'
                 WHEN StatementSubTreeCost > 50 THEN '>50'
                 ELSE CAST(StatementSubTreeCost AS VARCHAR(100))
            END) AS StatementSubTreeCost 
        , COUNT(*) AS countInstance

        , avg(PlanSizeInBytes) /1000 avgPlanSizeInKB

FROM    @PlanCacheForMaxDop
GROUP BY CASE WHEN StatementSubTreeCost BETWEEN 1 AND 5 THEN 2.5
              WHEN StatementSubTreeCost BETWEEN 5 AND 6 THEN 5.5
              WHEN StatementSubTreeCost BETWEEN 6 AND 7 THEN 6.5
              WHEN StatementSubTreeCost BETWEEN 7 AND 8 THEN 7.5
              WHEN StatementSubTreeCost BETWEEN 8 AND 9 THEN 8.5
              WHEN StatementSubTreeCost BETWEEN 9 AND 10 THEN 9.5
              WHEN StatementSubTreeCost BETWEEN 10 AND 11 THEN 10.5
              WHEN StatementSubTreeCost BETWEEN 11 AND 12 THEN 11.5
              WHEN StatementSubTreeCost BETWEEN 12 AND 13 THEN 12.5
              WHEN StatementSubTreeCost BETWEEN 13 AND 14 THEN 13.5
              WHEN StatementSubTreeCost BETWEEN 14 AND 15 THEN 14.5
              WHEN StatementSubTreeCost BETWEEN 15 AND 16 THEN 15.5
              WHEN StatementSubTreeCost BETWEEN 16 AND 17 THEN 16.5
              WHEN StatementSubTreeCost BETWEEN 17 AND 18 THEN 17.5
              WHEN StatementSubTreeCost BETWEEN 18 AND 19 THEN 18.5
              WHEN StatementSubTreeCost BETWEEN 19 AND 20 THEN 19.5
              WHEN StatementSubTreeCost BETWEEN 10 AND 15 THEN 12.5
              WHEN StatementSubTreeCost BETWEEN 15 AND 20 THEN 17.5
              WHEN StatementSubTreeCost BETWEEN 20 AND 25 THEN 22.5
              WHEN StatementSubTreeCost BETWEEN 25 AND 30 THEN 27.5
              WHEN StatementSubTreeCost BETWEEN 30 AND 35 THEN 32.5
              WHEN StatementSubTreeCost BETWEEN 35 AND 40 THEN 37.5
              WHEN StatementSubTreeCost BETWEEN 40 AND 45 THEN 42.5
              WHEN StatementSubTreeCost BETWEEN 45 AND 50 THEN 47.5
              WHEN StatementSubTreeCost > 50 THEN 100
              ELSE StatementSubTreeCost
         END;
GO

Interestingly enough it shows that as a group Statements costed below our default threshold of 5 are actually the highest on our list:

StatementSubTreeCost

And, like they say that is not for now; but for another trek up the learning Tree…

Thanks

My interest in this area was piqued by a couple of articles publicly published by Jimmy May.  I have listed the postings in the References section.

One of the commenters, Mike B, posted:

There is a misunderstanding here. You can reduce CPU utilization even more by turning off the computer. It should be understood that parallel processing is inherently “inefficient”, but you use it because it executes faster.

With passion like this, who needs Court TV.  Just like CSI you almost always learn something, once you have the requisite interest.

 

References:

References – Wait Stats:

References – Wait Stats / SQLCLR:

References – Parallelism – Architecture & Design:

References – General:

References – Parallelism in your Plan Cache:

References – Max degree of Parallelism Settings

References – CPU (Processor):

References – Jimmy May

References – Joe Chang:

References – Maxdop Calculator:

References – Parallelism – Cost Threshold:

This entry was posted in CXPACKET, Dynamic Management View (DMV), Microsoft, MS SQL Server, Parallelism - Query, Wait Stats and tagged . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s