Transact SQL – XML Column Hidden In SUBSTRING

Background

Found a query that was causing SQL Server timeouts, let us see if we can unpack it a bit.

 

Code

Original Code

Here is a quick snippet


select app.*

from CsLogins.dbo.users usr

join CsLogins.dbo.UserInRole ures on ures.userID = usr.userID

join CsLogins.dbo.resources res on res.resourceID = ures.resourceID

left join Admin.CourseManagerUserLog UserLog1 

	on UserLog1.username=usr.userName 
	and 
		(
			   ( UserLog1.[message] like '%set as cancelled%' )
			OR ( UserLog1.[message] like '%has been added%' )
		) 
	AND (UserLog1.record_created between @periodStart and @periodEnd)

join [dbo].Appointments App 
		on cast(App.appointmentID as nvarchar(36)) 
			= SUBSTRING
				(
					cast(UserLog1.details as nvarchar(max))
				  , CHARINDEX
						(
							 'apptID='
							, cast(UserLog1.details as nvarchar(max))
							, 1
						) + 8
					, 36
				)

 

Quick Explanation

  1. Not bought into the idea that we need a left outer join into Admin.CourseManagerUserLog
  2. Let us change it into an inner join

 

Simplify Code

Let us simplify the code and see what is going in with the SubString function

Code



select
		UserLog1.*

from    Admin.CourseManagerUserLog UserLog1 with (nolock)

join [dbo].BTWTAppointments App 

	on App.appointmentID 
		= SUBSTRING
			(
				  cast(UserLog1.details as nvarchar(max))
				, CHARINDEX('apptID=', cast(UserLog1.details as nvarchar(max)), 1) + 8
				, 36
			)

where 
	(
	     ( UserLog1.[message] like '%set as cancelled%' )
	  OR ( UserLog1.[message] like '%has been added%' )

	) 

Statistics

Query Plan

SubString-QueryPlan

Statistics I/O

SubString-StatisticsIO

 

Discovered Substring column is XML

Discovery

Reviewed the column that substring is being applied to and discovered that it is an XML Column.

Code



select top 10

	  [id]
	, [username]
	, UserLog1.[message]
	, userLog1.details

from    Admin.CourseManagerUserLog UserLog1 with (nolock)

where 
			(
				   ( UserLog1.[message] like '%set as cancelled%' )
				OR ( UserLog1.[message] like '%has been added%' )
			) 

Code

XMLColumn

 

XPATH

Let us use XPATH to break the details column into XML Elements.

 

Explanation

  1. XPath
    • We are using XML Path to splice the nodes of the details column
    • The exact XML path is /in-car-lsn-status
  2. And, joined against the Appointment table
    • AppointmentID = UserLogInCarLSN.node.value
    • attribute (@apptID as varchar(60))
    • cast to uniqueIdentifier

Code



select 
	UserLog1.*

from    Admin.CourseManagerUserLog UserLog1 with (nolock)

cross apply UserLog1.details.nodes('/in-car-lsn-status') UserLogInCarLSN(node)

join [dbo].BTWTAppointments App 

	on App.appointmentID 
		= cast(UserLogInCarLSN.node.value('@apptID', 'varchar(60)') AS UniqueIdentifier)

where 
	(
		   ( UserLog1.[message] like '%set as cancelled%' )
		OR ( UserLog1.[message] like '%has been added%' )
	) 


Statistics

Query Plan

NoXMLIndexes-QueryPlan

 

Statistics I/O

NoXMLIndexes-StatisticsIO-20160225

 

Explanation

Once we started using XPath, we noticed the following:

  1. In the Query Plan
    • Operators that includes XMLReader with XPath filter

 

XML – Index

As we in the database world, we likely need Indexes.

XML – Index – Main

Let us add one on the XML code.

Code


--create primary xml index
CREATE PRIMARY XML INDEX [INDX_DBA_XML_MAIN] 
ON  Admin.CourseManagerUserLog
	(
		[details]
	)
GO

 

Query Plan

XMLIndexMain-QueryPlan

 

Statistics IO

XMLIndexMain-StatisticsIO

 

Explanation

  1. Query Plan
    • We see two XML Index seeks
      • They are both Clustered Index Seeks
  2. Statistics IO
    • Reference to xml_index_nodes_x1_x2

 

XML – Index – Attribute

As we are looking to match on specific attributes, let us create a property XML Index, as well.

Here is our Query Snippet

Query Snippet – Match on Attribute

Clause-XMLProperty

Code

Sample

--create property xml index
CREATE XML INDEX [INDX_DBA_XML_Propertyx] 
ON  Admin.CourseManagerUserLog
	(
		[details]
	)
USING XML INDEX [INDX_DBA_XML_MAIN]
FOR PROPERTY    
GO

Code

Query Plan

XMLIndex-Property-QueryPlan

 

Statistics I/O

XMLIndex-Property-StatisticsIO

 

Indexes Used

The statistics I/O does not list the specific XML Indexes used.  It lists the index as xml_index_nodes_[nnnnnnnn].

To get the actual XML Index, access the Operator and review Object entry.

In our case,we had two XML Operators.

One accesses the Main XML Index, the other accesses the XML Property Index.

MAIN

QueryPlanXMLMain

 

Property

QueryPlanXMLProperty

Metrics

Statistics I/O

 

Data Source CHAR_INDEX XQUERY (NO INDEX) XQUERY (INDEX – PRIMARY) XQUERY (INDEX – SECONDARY)
Appointments Scan count 57, logical reads 167808, physical reads 3 Scan count 0, logical reads 328, physical reads 1, read-ahead reads 98  Scan count 0, logical reads 177, physical reads 0, read-ahead reads 0 Scan count 0, logical reads 177, physical reads 94, read-ahead reads 0
xml_index_nodes  Scan count 116, logical reads 355, physical reads 0 Scan count 116, logical reads 408, physical reads 41, read-ahead reads 0
Resources Scan count 0, logical reads 1041, physical reads 3, read-ahead reads 0 Scan count 1, logical reads 2, physical reads 2, read-ahead reads 0  Scan count 0, logical reads 1041, physical reads 0, read-ahead reads 0 Scan count 0, logical reads 1041, physical reads 3, read-ahead reads 0
UserInRole Scan count 57, logical reads 114, physical reads 8, read-ahead reads 0 Scan count 57, logical reads 236, physical reads 1, read-ahead reads 7  Scan count 57, logical reads 114, physical reads 0, read-ahead reads 0 Scan count 57, logical reads 114, physical reads 8, read-ahead reads 0
Student Scan count 0, logical reads 228, physical reads 86, read-ahead reads 0 Scan count 0, logical reads 228, physical reads 86, read-ahead reads 0  Scan count 0, logical reads 228, physical reads 0, read-ahead reads 0 Scan count 0, logical reads 228, physical reads 86, read-ahead reads 0
Users Scan count 0, logical reads 114, physical reads 1, read-ahead reads 0 ‘Users’. Scan count 0, logical reads 114, physical reads 1, read-ahead reads 0  Scan count 0, logical reads 114, physical reads 0, read-ahead reads 0 Scan count 0, logical reads 114, physical reads 1, read-ahead reads 0,
CourseManagerUserLog Scan count 1, logical reads 52946, physical reads 4, read-ahead reads 52940 Scan count 1, logical reads 52946, physical reads 4, read-ahead reads 52940  Scan count 1, logical reads 52946, physical reads 0, read-ahead reads 0 Scan count 1, logical reads 52946, physical reads 4, read-ahead reads 52940
Worktable Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0

 

 

Statistics Time

 

Time CHAR_INDEX XQUERY (NO INDEX) XQUERY (INDEX – PRIMARY) XQUERY (INDEX – SECONDARY)
CPU Time 477129 ms  936 ms  530 ms 780 ms
Elapsed Time 479800 ms  2637 ms  531 ms 3593 ms
Duration 7.59 minutes  2 seconds  0 seconds 3 sec

 

 

 

Technical Review

Here is some key takeaways

Filtering Using CHAR-INDEX Versus XPATH

  1. Statistics
    • I/O
      • Appointments Table, Logical Reads
        • CHAR-INDEX = 167808
        • XPATH/NO-INDEX = 328
        • XPATH/INDEX = 177
    • Time taken
      • CHAR-INDEX = 8 minutes
      • XPATH = 2 to 3 seconds
  2. Extraneous resources referenced in Statistics I/O
    • CHAR-INDEX
      • None
    • XPATH (  NO-INDEX )
      • Work Table
        • Unfortunately, metrics are not actually recorded
    • XPATH ( INDEX )
      • xml_index_nodes
        • XPATH ( MAIN INDEX Only )
          • Scan count 116, logical reads 355, physical reads 0
        • XPATH ( MAIN & Secondary Index )
          • Scan count 116, logical reads 408, physical reads 41, read-ahead reads 0

 

Original Query

Once we had a good sense that our XPATH revision will yield benefit over the earlier CHAR_INDEX syntax, we went back and revised our original query.

Here is a quick summary of how the two queries compare.

Statistics IO

Graphical

StatisticsIO

 

Tabulated

Data Source CHAR_INDEX XPATH
Student Scan count 0, logical reads 228, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Scan count 0, logical reads 228, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
BTWTAppointments Scan count 57, logical reads 167808, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Scan count 0, logical reads 171, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Users Scan count 0, logical reads 228, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Scan count 0, logical reads 114, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
CourseManagerUserLog Scan count 1, logical reads 6239, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Scan count 1, logical reads 6239, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
xml_index_nodes_1598628738_32000 Scan count 114, logical reads 402, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 

 

Statistics Time

Graphical

StatisticsTime

 

Tabulated

Time CHAR_INDEX XPATH
CPU Time 9360 ms 78 ms
Elapsed Time 9383 ms 107 ms

 

 

Query Plan – Comparison

Compare

QueryPlanCompare

 

 

CHAR-INDEX

SubtreeCost-OriginalPlan

 

XPATH

SubtreeCost-RevisedPlan

 

 

Tabulated

Identifier CHAR_INDEX XPATH
Query Cost 95% 5%
Estimated Sub Tree Cost 4.16539 0.209969

 

 

Dedicated

Like Kim, dedicated to a few good fellas

  1. XML
  2. Miguel de Icaza, CTO and co-founder of Xamarin
  3. Scott Gu
  4. Seth Delconte

All of my very quick study on XML Indexes is based on Seth Delconte’s work referenced above.

But, really can not let an opportunity to speak well about Miguel de Icaza’s brilliant wilderness work on Xamarin go by.

One thought on “Transact SQL – XML Column Hidden In SUBSTRING

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s