Transact SQL – Warnings – “Operator used tempdb to spill data during execution with spill level”

Background

Reviewing some queries and noticed a warning that stated “Operator used tempdb to spill data during execution with spill level“.

 

Query

Here is the query truncated:


select  count(tblB.pkBookingID)

from   [dbo].[tblBooking] tblB WITH (NOLOCK)

    inner join [dbo].[tblBookingItem] tblBI WITH (NOLOCK) 

        ON  tblB.pkBookingID = tblBI.fkBookingID

        AND tblBI.pkBookingItemID =
                (

                    select max(tblBI_Inner.pkBookingItemID)

                    from   [dbo].[tblBookingItem] tblBI_Inner WITH (NOLOCK) 

                    where  tblBI.fkBookingID = tblBI_Inner.fkBookingID

                    and    tblBI_Inner.active = 1

                    and tblBI_Inner.primaryRoom = 1 

                )

    inner JOIN [dbo].[tblBookingDateTime] tblBDT WITH (NOLOCK)

	    ON tblBI.fkBookingID = tblBDT.fkBookingID
	    AND tblBI.pkBookingItemID = tblBDT.fkBookingItemID 

    INNER JOIN dbo.tblGeneralFieldInformation tblGFI

        ON  tblB.pkBookingID = tblGFI.fkBookingID

where tblB.deleteBooking = 0

and tblBI.primaryRoom = 1 

and  tblBI.active = 1

AND tblGFI.active =1

 

Query Plan

Estimated Query Plan

Here is our estimated query plan:

estimatedQueryPlan

 

Explanation:

  • Nothing big
    • A couple of Merge Joins and a Hash Match
    • A couple of Stream Aggregates
      • Stream Aggregate Showplan Operator
        https://technet.microsoft.com/en-us/library/ms189907%28v=sql.105%29.aspx

        The Stream Aggregate operator groups rows by one or more columns and then calculates one or more aggregate expressions returned by the query. The output of this operator can be referenced by later operators in the query, returned to the client, or both. The Stream Aggregate operator requires input ordered by the columns within its groups. The optimizer will use a Sort Operator prior to this operator if the data is not already sorted due to a prior Sort operator or due to an ordered index seek or scan. In the SHOWPLAN_ALL statement or the graphical execution plan in SQL Server Management Studio, the columns in the GROUP BY predicate are listed in the Argument column, and the aggregate expressions are listed in the Defined Values column.

Actual Query Plan

Here is what happens when we actually run the query:

actualQueryPlan

Our eyes quickly turn to the Yellow Warning Signal:

HashMatchAggregateWarning

Quick Explanation:

  • We see a warning on the Hash Match Join
  • Keep in mind that the Hash Match precedes the Stream Aggregate
  • And, so we should review the Hash Match Operator and see what is being sorted

 

Hash Match Aggregate Operator:

HashMatchCleanedup

 

 

Quick Explanation:

  • There is a big divide between the “Estimated Number of Rows” and the “Actual Number of Rows”
    • The “Estimated Number of Rows” is 1
    • The “Actual Number of Rows” is 397722
  • The “Defined Values”
    • Scalar Operator(ANY([DB].[dbo].[tblGeneralFieldInformation].[fkBookingID] as [tblGFI].[fkBookingID]))
  • Estimated Execution Mode
    • Row
  • Parallel
    • False –> Which means no parallelism
  • Warnings
    • There are 3 warnings
    • Operator used tempdb to spill data during execution with …
      • spill level 1
      • spill level 2
      • spill level 3

Warnings

 

Query Rewritten

Windowing Functions

I suspect the original query is a bit slow due to the correlated sub-query on [dbo].[tblBookingItem].

Let us re-write using Windowing Functions.

Query uses Windowing Functions:

 

;with cteBI
as
(

    select
              pkBookingItemID
            , fkBookingID
            , primaryRoom
            , active

            , DENSE_RANK()
                over (PARTITION by tblBI.fkBookingID order
                        by tblBI.[pkBookingItemID] desc
                    )
                        as rankID

        from  [dbo].[tblBookingItem] tblBI WITH (NOLOCK) 

        where    tblBI.active = 1

        and tblBI.primaryRoom = 1 

)
select  count(tblB.pkBookingID)

from   [dbo].[tblBooking] tblB WITH (NOLOCK)

    /*
            inner join [dbo].[tblBookingItem] tblBI WITH (NOLOCK) 

                ON  tblB.pkBookingID = tblBI.fkBookingID

                AND tblBI.pkBookingItemID =
                        (

                            select max(tblBI_Inner.pkBookingItemID)

                            from   [dbo].[tblBookingItem] tblBI_Inner WITH (NOLOCK) 

                            where  tblBI.fkBookingID = tblBI_Inner.fkBookingID

                            and    tblBI_Inner.active = 1

                                Make sure that it is the primary room
                            and tblBI_Inner.primaryRoom = 1 

                        )

    */

        inner join cteBI

                    on tblB.pkBookingID = cteBI.fkBookingID
                    and cteBI.rankID = 1    

    inner JOIN [dbo].[tblBookingDateTime] tblBDT WITH (NOLOCK)

	    ON cteBI.fkBookingID = tblBDT.fkBookingID
	    AND cteBI.pkBookingItemID = tblBDT.fkBookingItemID 

    INNER JOIN dbo.tblGeneralFieldInformation tblGFI

        ON  tblB.pkBookingID = tblGFI.fkBookingID

where tblB.deleteBooking = 0

and cteBI.primaryRoom = 1 

and  cteBI.active = 1

AND tblGFI.active =1

 

Execution Plan

 

actualQueryPlan_WF

 

Performance Analysis

Let us do a quick comparison

Query Cost

QueryCostCompare

Statistics I/O

Image:

statisticsIO

 

 

Statistics Time

Image:

 

statisticsTime

 

Category Factor Correlated Join  Window Function
Query Cost  
 34%  66%
 Time
 CPU time  2059 ms 1248 ms
 elapsed time  2503 ms  1278 ms
 I/O
 Worktable Scan count = 0
Logical reads = 0
Physical Reads = 0
Scan count = 1711
logical reads = 18037
physical reads = 0
 Workfile Scan count = 68
Logical reads = 3624
Physical Reads = 380
Read aheads =3748
Scan count =0
logical reads =0
physical reads = 0
Read aheads = 0
 tblBookingDateTime Scan count = 397722
Logical reads =  1196123
Scan count =1
logical reads =1850
tblGeneralFieldInformation Scan count = 1
Logical reads = 2111
Scan count =1
logical reads =1102
tblBookingItem Scan count = 2
Logical reads = 5114
Scan count = 1
logical reads = 2557
 tblBooking Scan count = 1
Logical reads = 694
Scan count = 1
logical reads = 695

 

 

 

 

SQL Server Profiler

Configure:

We are filtering:

EventsSelection

 

Output:

Image:

SQLServerProfiler

 

Tabulated:

Column Correlated Join  Window Function
 CPU 2184  1404
 Reads 1204774 25955
 Writes  0  0
 Duration  2684  1497

 

 

Quick Explanation:

  • We can see the Hash Warnings between the beginning and completion of the correlated join; as expected there are 3 of them
  • There is also a big schism is the amount of I/O between both queries

 

Diagnostics via Dynamic Management View

Let us compare the queries by utilizing Dynamic Management Views.

Requirements:

  • Need one of the following permissions
    • Need sysadmin
    • VIEW SERVER STATE

Query

Clear Cache

As we are using a development box, we are able to clear memory and procedure cache.


   dbcc dropcleanbuffers
   dbcc freeproccache

 Actual Dynamic Management View ( DMV) Query


SELECT
          plan_handle = ecp.plan_handle
        , [schema] = object_schema_name(sqlText.objectid)
        , [object] = object_name(sqlText.objectid)
        , sqltext = sqlText.text 

        , ecp.objtype
        , [memoryObjectType] = omo.[type]

        , planSize = sum(ecp.size_in_bytes)
        , [pageSize] = sum(page_size_in_bytes )

        , workerTime = cast(cast(avg(qs.total_worker_time / 1E6) as decimal(10,2)) as varchar(40) ) + ' secs'
        , readsLogical  = avg(qs.total_logical_reads)
        , executionCount = sum(qs.execution_count)
        , totalRows = avg(qs.total_rows)

  FROM sys.dm_exec_cached_plans AS ecp 

        JOIN sys.dm_os_memory_objects AS omo 

                ON
                    (
                           ( ecp.memory_object_address = omo.memory_object_address )
                        OR ( ecp.memory_object_address = omo.parent_address)
                    )

        CROSS APPLY sys.dm_exec_sql_text(plan_handle) sqlText

        LEFT OUTER JOIN sys.dm_exec_query_stats AS qs 

                on ecp.plan_handle = qs.plan_handle

  where sqlText.text not like '%sys%'

  and    sqlText.dbid = db_id()

  group by
          ecp.plan_handle
        , object_schema_name(sqlText.objectid)
        , object_name(sqlText.objectid)
        , sqlText.text
        , ecp.objtype
        , omo.[type]

  order by
          sum(qs.total_logical_reads) desc
        , sum(page_size_in_bytes ) desc

  GO

 

Explanation:

  • Filter out queries that reference the sys schema
  • Only look in our current user database [ db_id() ]

Output:

DMV_1

Tabulated:

Column Correlated Join  Window Function
Plan Size 114688 98304
Page Size 8192 8192
Worker Time 3.25 secs  2.02 secs
Logical Reads 1317079 29418

 

 

Summary

With tending a bit of attention and time, we can tackle sorts and consequent temp-db workload.

 

 

 

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s