Microsoft – SQL Server – DML – Merge Statement

Microsoft – SQL Server – DML – Merge Statement

Reviewing a Stored Procedure that uses the SQL Merge Statement.  I remember a few years back a trusted Oracle DBA was discussing its use in an ETL Application that he and I jointly worked on.

He seems to like it OK.

But, nevertheless, wanted to find potential performance implications around it.  And, also see what areas will lead to the most optimal processing.

And, so found the web posts listed below:

  1. Optimizing Merge Statement
    http://technet.microsoft.com/en-us/library/cc879317.aspx

Upon reviewing the docs, here are some considerations per optimization:

  1. In the source table(s), consider having a primary key or unique, covering index on the join columns
  2. In the destination table, consider having a unique clustered index
  3. On the joining/On Clause, only specify the columns that are involved in the join; that is, do not specify additional filtering items
  4. Additional restriction(s) should be placed in the “When Matched” clause
  5. Review the Query Plan to determine which join operator is being used.  When the tables are similar in size a merge join will probably be used.  Whereas, when the source table is much smaller, a nested loop will probably be used.  If you deem it necessary to force a particular join type, consider using a Option <Query Hint>
  6. Review to what extent the Query \ Query Plan is parameterized.  You can tilt towards parameters by using Stored Procedures, variables rather than hard-coded query \ values.  Also consider using Plan Guides
  7. The usage of the TOP Clause to ensure that only a stated number of records are inserted\updated seems to be a bit problematic for the Merge Statement.  The reasons includes the fact that the entire tables are “read through” thus potentially bypassing the benefits of  a well chosen TOP clause.  Care should be taken to ensure that data already processed and are not re-introduced during subsequent merge statement iterations (especially when the merge statement is inside a loop offering)
  8.  Note that the Merge Operator can be used in a bulk-load scenario to bring in new data.  The Source table can be offered via the OpenRowset syntax.  When used in such a case consider the use of Order and Unique hints – This ensures that the Query Optimizer can generate better Query Plan

 

 

References

  1. Optimizing Merge Statement
    http://technet.microsoft.com/en-us/library/cc879317.aspx

  2. Christian Etter (CE’s Blog – Stuff 1 found noteworthy)
    Merging Inserted Data Using OUTPUT in SQL Server 2005
    http://www.christian-etter.de/?p=593

     

     

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s