Microsoft – SQL Server – unique Index – Pros and Cons (mostly cons in this particular case)

Background

We tried addressing a “Duplicate Key Violation” error by adding a unique index. The idea was to get ahead of the DB engine and avoid returning “Duplicate Key” errors to the client.

A few of the available options are documented in:

Some of our thoughts are documented in https://danieladeniji.wordpress.com/2013/01/25/microsoft-sql-server-fast-db-inserts-and-error-suppression/

Weeks Later

But here we are, a couple of weeks later, seeing a lot of database deadlocks that trace back to that index.

[Image: DuplicateKeyError]

I Googled for help and liked the following write-ups:

Somnath Muluk
http://dba.stackexchange.com/questions/10782/sql-server-indexing-performance-in-unique-duplicate-column

A unique index guarantees that the index key contains no duplicate values and therefore every row in the table is in some way unique. Specifying a unique index makes sense only when uniqueness is a characteristic of the data itself. For example, if you want to make sure that the values in the NationalIDNumber column in the HumanResources.Employee table are unique, when the primary key is EmployeeID, create a UNIQUE constraint on the NationalIDNumber column.

The benefits of unique indexes include the following:

  • Data integrity of the defined columns is ensured.
  • Additional information helpful to the query optimizer is provided.

Creating a PRIMARY KEY or UNIQUE constraint automatically creates a unique index on the specified columns. There are no significant differences between creating a UNIQUE constraint and creating a unique index independent of a constraint. Data validation occurs in the same manner and the query optimizer does not differentiate between a unique index created by a constraint or manually created. However, you should create a UNIQUE or PRIMARY KEY constraint on the column when data integrity is the objective. By doing this the objective of the index will be clear.
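To make the constraint-versus-index point concrete, here is a minimal sketch. The table dbo.employee_demo and all of its column, constraint, and index names are made up for illustration; they are not from the quoted answer or from AdventureWorks.

```sql
use [tempdb]
go

-- Hypothetical demo table; every name here is illustrative only
if object_id('dbo.employee_demo') is not null
    drop table dbo.employee_demo
go

create table dbo.employee_demo
(
      [employeeID]       int           not null identity(1,1)
    , [nationalIDNumber] nvarchar(15)  not null
    , [loginID]          nvarchar(256) not null

    , constraint PK_employee_demo primary key ([employeeID])

    -- Uniqueness declared as a constraint; SQL Server builds a unique index for it
    , constraint UQ_employee_demo_nationalIDNumber unique ([nationalIDNumber])
)
go

-- Uniqueness declared directly as an index, independent of any constraint
create unique index idx_employee_demo_loginID
    on dbo.employee_demo ([loginID])
go

-- Compare how the three indexes are recorded in the catalog
select
      [name]
    , [is_primary_key]
    , [is_unique]
    , [is_unique_constraint]
from sys.indexes
where [object_id] = object_id('dbo.employee_demo')
go

drop table dbo.employee_demo
go
```

All three indexes should report is_unique = 1, while only UQ_employee_demo_nationalIDNumber is flagged with is_unique_constraint = 1. The constraint form mainly documents intent, which is the point the quoted answer makes.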

SQL Kiwi (Paul White)
http://dba.stackexchange.com/questions/10782/sql-server-indexing-performance-in-unique-duplicate-column

Performance does not necessarily increase if an index is marked unique, though there are many good reasons to make an index unique if it is guaranteed to be so. One consideration is integrity: IDENTITY columns should always have a unique index or constraint, for example, since this column property does not enforce uniqueness in itself, but most database designs would expect that guarantee to exist. UNIQUEIDENTIFIER is another type that does not self-enforce uniqueness, but if used as a key, then that uniqueness must be explicitly enforced. The same argument applies to any candidate key of a relation, of course.
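The IDENTITY remark is easy to check. The sketch below uses a throwaway table, dbo.identity_demo (a hypothetical name), with no primary key or unique index on the identity column, and then forces a duplicate value in through identity_insert.

```sql
use [tempdb]
go

if object_id('dbo.identity_demo') is not null
    drop table dbo.identity_demo
go

-- Deliberately no primary key, unique constraint, or unique index on [id]
create table dbo.identity_demo
(
      [id]      int     not null identity(1,1)
    , [website] sysname not null
)
go

-- Normal insert; the identity property assigns the first value (1)
insert into dbo.identity_demo ([website])
values ('http://www.microsoft.com')
go

-- Explicitly supply the same value; nothing on the table rejects it
set identity_insert dbo.identity_demo on

insert into dbo.identity_demo ([id], [website])
values (1, 'http://www.example.com')

set identity_insert dbo.identity_demo off
go

-- List any identity values that occur more than once
select [id], count(*) as [copies]
from dbo.identity_demo
group by [id]
having count(*) > 1
go

drop table dbo.identity_demo
go
```

The final select should list id 1 with two copies. Adding a primary key or unique index on [id] would make the second insert fail instead.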

Uniqueness has the biggest potential for performance gains in the query optimizer. Uniqueness guarantees allow many simplifications to be applied, and these usually result in ‘better’ execution plans. In the best case, a uniqueness guarantee might allow entire operations to be ‘optimized away’, which will usually benefit performance markedly.

The storage engine can also benefit from uniqueness, even though physical storage size may be unaffected either way. Take the common example of an equality seek on an index. If the index is constrained to be unique, the storage engine can perform a singleton seek: knowing that at most only a single value can be returned allows certain physical optimizations to be applied.

Where an index is not defined as unique, the storage engine must scan (forward or backward) from the starting point to ensure it returns all duplicated values. Performance testing shows that singleton seeks on a unique index can be 30-40% faster than the seek + range scan that occurs on the same data and non-unique-index. The situation is not entirely clear-cut however, if SQL Server uses linear interpolation search on a unique index with an unfortunate data distribution, performance can be 70% worse (on 64-bit systems).
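One rough way to observe the singleton seek versus range scan distinction is through sys.dm_db_index_operational_stats, which keeps separate singleton_lookup_count and range_scan_count counters per index. The sketch below assumes the two tables from the Test Code section of this post (dbo.customer and dbo.customer_UniqueKey) already exist in tempdb and hold the sample row, and that the login has VIEW SERVER STATE permission. The plan the optimizer actually picks on such tiny tables may differ, so treat the counters as indicative only.

```sql
use [tempdb]
go

-- Equality seek against the non-unique index
select [customerID]
from dbo.customer
where [website] = 'http://www.microsoft.com'

-- Same equality seek against the unique index
select [customerID]
from dbo.customer_UniqueKey
where [website] = 'http://www.microsoft.com'
go

-- Compare how the storage engine counted the accesses on each index
select
      object_name(s.[object_id]) as [tableName]
    , i.[name]                   as [indexName]
    , i.[is_unique]
    , s.[singleton_lookup_count]
    , s.[range_scan_count]
from sys.dm_db_index_operational_stats(db_id(), null, null, null) s
join sys.indexes i
    on  i.[object_id] = s.[object_id]
    and i.[index_id]  = s.[index_id]
where s.[object_id] in
      (
          object_id('dbo.customer')
        , object_id('dbo.customer_UniqueKey')
      )
go
```

If both queries use their nonclustered index, one would expect the unique index to accumulate singleton lookups and the non-unique index to accumulate range scans.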

Perhaps the biggest hidden cost to uniqueness is the cost of enforcing it. The query processor always needs to perform extra work to check for uniqueness violations when modifying a unique index. Overall, my advice is still to enforce uniqueness wherever it logically exists, but to also be aware of the downsides of uniqueness too.

Test Code

Here is some quick test code. It is also available at:

http://sqlfiddle.com/#!3/75f12/1


set nocount on
go

use [tempdb]
go

if object_id('dbo.customer') is null
begin
    create table dbo.customer
    (

            [customerID] bigint  not null identity(1,1) primary key
          , [website] sysname not null

    )

    create index idx_WebSite
        on dbo.customer
        (
            [website]
        )

end

if object_id('dbo.customer_UniqueKey') is null
begin

    create table dbo.customer_UniqueKey
    (

            [customerID] bigint  not null identity(1,1) primary key
          , [website] sysname not null

    )

    create unique index idx_Unique_WebSite
        on dbo.customer_UniqueKey
        (
            [website]
        )
        with (ignore_dup_key = on)

end

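-- Baseline: insert into the table with the non-unique index
insert into dbo.customer
([website])
values ('http://www.microsoft.com')

-- Comparison: same insert into the table with the unique index (ignore_dup_key)
insert into dbo.customer_UniqueKey
([website])
values ('http://www.microsoft.com')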

And here is the SQL Server “Query Cost”:

[Image: UniqueIndexCost]

As the chart above shows, the insert into the table with the unique index is a bit more expensive, because SQL Server needs to check for existing entries first.
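The test code inserts each row only once, so it never triggers the duplicate check. As a follow-up sketch, assuming the test code above has already been run once against tempdb, repeating the same inserts shows how the two tables diverge, with set statistics io on to compare the reads each insert incurs.

```sql
use [tempdb]
go

set statistics io on
go

-- Non-unique index: a second copy of the row is simply stored
insert into dbo.customer
([website])
values ('http://www.microsoft.com')

-- Unique index with ignore_dup_key: the duplicate row is discarded and
-- SQL Server raises the warning "Duplicate key was ignored." rather than an error
insert into dbo.customer_UniqueKey
([website])
values ('http://www.microsoft.com')
go

set statistics io off
go

-- Compare how many rows each table ended up with
select 'customer' as [tableName], count(*) as [rowsStored]
from dbo.customer
union all
select 'customer_UniqueKey', count(*)
from dbo.customer_UniqueKey
go
```

After the second round, dbo.customer should hold one more row than dbo.customer_UniqueKey. The client sees no error on the unique table, only the warning message.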

(Please let me know your thoughts)
