SQL Server – GUID Generation Options

 

Background

As a quick followup to our last post on generating unique identifiers for Transact SQL, I searched the Net for straightforward C# code that I could co-opt and benchmark.

 

How does SQL Server Sort GUIDs?

Before, we dig into the options, let us be upfront and acknowledge that it is likely that you will likely be surprised that SQL Server does not sort GUIDs in the fashion that will think it does.

The sort is akin to the Eastern Way of lettering as compared to the Western Way.

LAST 6 BYTES

Alberto Ferrari is one of his blog Postings spoke to this somewhat esoteric area.  The specific post is:

How are GUIDs sorted by SQL Server?
http://sqlblog.com/blogs/alberto_ferrari/archive/2007/08/31/how-are-guids-sorted-by-sql-server.aspx

 

Hypothesis

Here is the sample data:

Sample

 

And, here is his hypothesis

  1. A..F are evaluated in right to left  left to right order and are the most important
  2. 8..9 are evaluated in left to left order
  3. 6..7 are evaluated in left to right order
  4. 4..5 are evaluated in left to right order
  5. 0..3 are evaluated in left to right order and are the less important

 

Jeremy Todd

jeremy Todd has this to say:

It gets even more complicated, because one eccentricity of Microsoft SQL Server is that it orders GUID values according to the least significant six bytes (i.e. the last six bytes of the Data4 block).  So, if we want to create a sequential GUID for use with SQL Server, we have to put the sequential portion at the end.  Most other database systems will want it at the beginning. 

Sample

It is always good to test this things out, and so we did so.

Bottom

Issued the following SQL against one of our tables..

Code:

select top 5 
		  [sourced] = 'Bottom'
		, [id]
from   [dbo].[BannerLinkLog]
order by [id] desc

 

Output:

Bottom

Explanation:
  1. When ordered from the highest number, here is what we see
    • F6D8CD28-9001-4D78-84EA-FFFFF778F28D
    • F2B20B4F-0C67-4312-8D81-FFFFF150F422
    • B9B011AB-A7ED-4C4C-9538-FFFFEC9AB13C
    • 06D2C885-0676-46C1-A807-FFFFE3B623A5
    • D6FC23D5-B2B0-42D4-AF89-FFFFDCAAF476
  2. I think it is best to focus on the last 6 bytes ( 12 digits ) and go from left to right

Top

Code:

select top 5 
		  [sourced] = 'Top'
		, [id]
from   [dbo].[BannerLinkLog]
order by [id] asc

 

Output:

Top

 

 

 

Sample Code

Found nice ones at:

Transact SQL

  1. Jimmy Nilsson – Comb

CLR

  1. Jeremy Todd – GUID as fast primary keys under multiple database
    http://www.codeproject.com/Articles/388157/GUIDs-as-fast-primary-keys-under-multiple-database
  2. Dan Guzman – Improving Uniqueidentifier Performance
    http://www.dbdelta.com/improving-uniqueidentifier-performance/
  3. David Browne – How to generated Sequential GUIDs for SQL Server in .Net
    http://blogs.msdn.com/b/dbrowne/archive/2012/07/03/how-to-generate-sequential-guids-for-sql-server-in-net.aspx

 

Extended Stored Procedure

  1. Gert Drapers
    XPGUID.DLL

    • No longer available

COMB

Jimmy Nilsson

Jimmy Nilsson in an article, The Cost of GUIDs as Primary Keys, detailed how he was able to create progressive increasing GUIDs by combining NewID and getDate(). The getdate is the suffix; as we detailed earlier it is the last 6 bytes that matters for sorts.

Sample Code

Here is his sample code:


DECLARE @aGuid UNIQUEIDENTIFIER

SET @aGuid = CAST(CAST(NEWID() AS BINARY(10)) 
+ CAST(GETDATE() AS BINARY(6)) AS UNIQUEIDENTIFIER)

Encapsulated

And, here is our implementation;  basically wrapper code so that we can re-use in various places:

[Constant].[vw_GUID]


if schema_id('Constant') is null
begin
	
	exec('create schema [Constant] authorization [dbo]')

end
go


if object_id('[Constant].[vw_GUID]') is null
begin
	
	exec('create view [Constant].[vw_GUID] as select 1/0 as [shell]')

end
go

alter view [Constant].[vw_GUID]
with schemabinding
as

	select
	 
		 [EMPTY]
			=
				CAST(
						'00000000-0000-0000-0000-000000000000' 
						AS UNIQUEIDENTIFIER
					)				

		, [COMB]
			=
				CAST(
						  CAST(NEWID() AS BINARY(10)) 
						+ CAST(GETDATE() AS BINARY(6)) 
						AS UNIQUEIDENTIFIER
					)
go

grant select on [Constant].[vw_GUID] to [public]
go


[dbo].[fnc_getNextGUID]


SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO


/*
	drop function [dbo].[fnc_getNextGUID]
*/
if object_id('[dbo].[fnc_getNextGUID]') is null
begin

	exec('CREATE FUNCTION [dbo].[fnc_getNextGUID]
			()
			RETURNS uniqueidentifier
			AS
			BEGIN	

				return
					(
						select 
							cast (''00000000-0000-0000-0000-000000000000'' as uniqueIdentifier)
					)

			END
		')
END
go



ALTER FUNCTION [dbo].[fnc_getNextGUID]()
RETURNS uniqueidentifier
AS
BEGIN	

	return
		(
			select
					[newID] = [COMB]
			from  [Constant].[vw_GUID]
		)
END
go


grant execute on [dbo].[fnc_getNextGUID] to [public]
go


 

CLR Code

Here, I made very little changes to Jeremy Todd’s code.

The change was to convert from a regular C# DLL that is callable from a .Net Application into a CLR Scalar function that is accessible from SQL Server.

 

Source Code

SQL Server CLR




/*

GUIDs as fast primary keys under multiple databases

Jeremy Todd, 15 Mar 2013	http://www.codeproject.com/Articles/388157/GUIDs-as-fast-primary-keys-under-multiple-database

*/
using System;
using System.Security.Cryptography;

using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

using System.Runtime.InteropServices;

public enum SequentialGuidType
{
  SequentialAsString,
  SequentialAsBinary,
  SequentialAtEnd
} 


public partial class GuidGenerator
{

	[DllImport("rpcrt4.dll", CharSet = CharSet.Auto)]
    public static extern int UuidCreateSequential(ref Guid guid);

		
  private static readonly RNGCryptoServiceProvider _rng = new RNGCryptoServiceProvider();

  [SqlFunction()]
  public static Guid NewSequentialGuidSequentialAsString()
  {
	  return NewSequentialGuidJeremyTodd(SequentialGuidType.SequentialAsString);
  }  

  
  [SqlFunction()]
  public static Guid NewSequentialGuidSequentialAsBinary()
  {
	  return NewSequentialGuidJeremyTodd(SequentialGuidType.SequentialAsBinary);
  }  

  [SqlFunction()]
  public static Guid NewSequentialGuidSequentialAtEnd()
  {
	  return NewSequentialGuidJeremyTodd(SequentialGuidType.SequentialAtEnd);
  }  
  
  [SqlFunction()]
  public static Guid NewSequentialGuidMSSQLJeremyTodd()
  {
	  return NewSequentialGuidJeremyTodd(SequentialGuidType.SequentialAtEnd);
  }  

  [SqlFunction()]
  public static Guid NewSequentialGuidMSSQLDanGuzman()
  {
	  return NewSequentialGuidDanGuzman();
  }  

  [SqlFunction()]
  public static Guid NewSequentialGuidMSSQLDavidBrowne()
  {
	  return NewSequentialGuidDavidBrowne();
  }  
  
  [SqlFunction()]
   public static Guid NewGuid()
  {
	  
    return (Guid.NewGuid());
	
  }
  
  private static Guid NewSequentialGuidJeremyTodd(SequentialGuidType guidType)
  
  {
    byte[] randomBytes = new byte[10];
    _rng.GetBytes(randomBytes);

    long timestamp = DateTime.UtcNow.Ticks / 10000L;
    byte[] timestampBytes = BitConverter.GetBytes(timestamp);

    if (BitConverter.IsLittleEndian)
    {
      Array.Reverse(timestampBytes);
    }

    byte[] guidBytes = new byte[16];

    switch (guidType)
    {
      case SequentialGuidType.SequentialAsString:
      case SequentialGuidType.SequentialAsBinary:
        Buffer.BlockCopy(timestampBytes, 2, guidBytes, 0, 6);
        Buffer.BlockCopy(randomBytes, 0, guidBytes, 6, 10);

        // If formatting as a string, we have to reverse the order
        // of the Data1 and Data2 blocks on little-endian systems.
        if (guidType == SequentialGuidType.SequentialAsString && BitConverter.IsLittleEndian)
        {
          Array.Reverse(guidBytes, 0, 4);
          Array.Reverse(guidBytes, 4, 2);
        }
        break;

      case SequentialGuidType.SequentialAtEnd:
        Buffer.BlockCopy(randomBytes, 0, guidBytes, 0, 10);
        Buffer.BlockCopy(timestampBytes, 2, guidBytes, 10, 6);
        break;
    }

    return new Guid(guidBytes);
  }
  
  
  private static Guid NewSequentialGuidDanGuzman()
  {
		const int S_OK = 0;
		const int RPC_S_UUID_LOCAL_ONLY = 1824;

		Guid oldGuid = Guid.Empty;

		int result = UuidCreateSequential(ref oldGuid);
		if (result != S_OK && result != RPC_S_UUID_LOCAL_ONLY)
		{
			throw new ExternalException("UuidCreateSequential call failed", result);
		}

		byte[] oldGuidBytes = oldGuid.ToByteArray();
		byte[] newGuidBytes = new byte[16];
		oldGuidBytes.CopyTo(newGuidBytes, 0);

		// swap low timestamp bytes (0-3)
		newGuidBytes[0] = oldGuidBytes[3];
		newGuidBytes[1] = oldGuidBytes[2];
		newGuidBytes[2] = oldGuidBytes[1];
		newGuidBytes[3] = oldGuidBytes[0];

		// swap middle timestamp bytes (4-5)
		newGuidBytes[4] = oldGuidBytes[5];
		newGuidBytes[5] = oldGuidBytes[4];

		// swap high timestamp bytes (6-7)
		newGuidBytes[6] = oldGuidBytes[7];
		newGuidBytes[7] = oldGuidBytes[6];

		//remaining 8 bytes are unchanged (8-15) 

		return new Guid(newGuidBytes);

    }  
	
  /*
  How to Generate Sequential GUIDs for SQL Server in .NET
http://blogs.msdn.com/b/dbrowne/archive/2012/07/03/how-to-generate-sequential-guids-for-sql-server-in-net.aspx
  */  
  private static Guid NewSequentialGuidDavidBrowne()
  {
		Guid guid = Guid.Empty;
		
		
        UuidCreateSequential(ref guid);		
        var s = guid.ToByteArray();
        var t = new byte[16];
        t[3] = s[0];
        t[2] = s[1];
        t[1] = s[2];
        t[0] = s[3];
        t[5] = s[4];
        t[4] = s[5];
        t[7] = s[6];
        t[6] = s[7];
        t[8] = s[8];
        t[9] = s[9];
        t[10] = s[10];
        t[11] = s[11];
        t[12] = s[12];
        t[13] = s[13];
        t[14] = s[14];
        t[15] = s[15];
        return new Guid(t);

    }  	
} 

Compile


set DIR_COMPILER=C:\WINDOWS\Microsoft.NET\Framework\v4.0.30319

%DIR_COMPILER%\csc /target:library /define:DEBUG /debug /pdb:GuidGenerator.pdb /optimize /out:GuidGenerator.dll GuidGenerator.cs

Deploy Code


if object_id('[dbo].[GuidCLR]') is not null
begin
	DROP FUNCTION [dbo].[GuidCLR]
end
go

if object_id('[dbo].[SequentialGuidCLR]') is not null
begin
	DROP FUNCTION [dbo].[SequentialGuidCLR]
end
go


if object_id('[dbo].[SequentialGuidCLRJT]') is not null
begin
	DROP FUNCTION [dbo].[SequentialGuidCLRJT]
end
go

if object_id('[dbo].[SequentialGuidCLRDG]') is not null
begin
	DROP FUNCTION [dbo].[SequentialGuidCLRDG]
end
go

if exists
	(
		select *
		from   sys.assemblies
		where  name = 'GuidGenerator'
	)
begin
	DROP ASSEMBLY [GuidGenerator]
end
go

CREATE ASSEMBLY [GuidGenerator]
from 'D:\Personal\DanielAdeniji\Blog\Microsoft\SQLServer\TransactSQL\Datatype\UniqueIdentifier\CLR\SQLCLR\GuidGenerator.dll' 
WITH PERMISSION_SET = UNSAFE;
go



CREATE FUNCTION [dbo].[GuidCLR]()
RETURNS uniqueIdentifier
AS EXTERNAL NAME [GuidGenerator].[GuidGenerator].[NewGuid];
GO


CREATE FUNCTION [dbo].[SequentialGuidCLRJT]()
RETURNS uniqueIdentifier
AS EXTERNAL NAME [GuidGenerator].[GuidGenerator].[NewSequentialGuidMSSQLJeremyTodd];
GO

CREATE FUNCTION [dbo].[SequentialGuidCLRDG]()
RETURNS uniqueIdentifier
AS EXTERNAL NAME [GuidGenerator].[GuidGenerator].[NewSequentialGuidMSSQLDanGuzman];
GO

Validate Code

Let us quickly validate the code:

Code


set nocount on;

select [guidCLR] = [dbo].[GuidCLR]()

declare @guidSequencialCLR1 uniqueIdentifier
declare @guidSequencialCLR2 uniqueIdentifier
declare @guidSequencialCLR3 uniqueIdentifier
declare @guidSequencialCLR4 uniqueIdentifier
declare @guidSequencialCLR5 uniqueIdentifier
declare @guidSequencialCLR6 uniqueIdentifier

select @guidSequencialCLR1 = [dbo].[SequentialGuidCLR]()
select @guidSequencialCLR2 = [dbo].[SequentialGuidCLR]()
select @guidSequencialCLR3 = [dbo].[SequentialGuidCLR]()
select @guidSequencialCLR4 = [dbo].[SequentialGuidCLR]()
select @guidSequencialCLR5 = [dbo].[SequentialGuidCLR]()
select @guidSequencialCLR6 = [dbo].[SequentialGuidCLR]()

select 
		  [1] = @guidSequencialCLR1
		, [2] = @guidSequencialCLR2
		, [3] = @guidSequencialCLR3

select
		  [4] = @guidSequencialCLR4
		, [5] = @guidSequencialCLR5
		, [6] = @guidSequencialCLR6

Output

reviewGeneratedNumbers201511180821AM

Explanation

  1. For the Sequencing GUID our numbers are obviously not progressive
    • 39D42D92-D2C6-ACF5-8AF3-85F87B413342
    • 39D42D92-D2C6-0D7D-EB87-690CC6AA93B0
    • 39D42D92-D2C6-D03B-1915-0BCE3FE77594
    • 39D42D92-D2C6-0030-B22E-C499B543307E
    • 39D42D92-D2C6-115B-EF25-6757A1256DF5
    • 39D42D92-D2C6-106A-9712-9AB31F70BE87

Instrumentation

Workload Processing Time

The numbers, in terms of time taken to generate 1 million records were similar for the various use-cases:

  1. SQL Server Built in NewID
  2. SQL Server Built in NewSequentialID
  3. CLR – NewGUID
  4. CLR – Sequential GUID
    • Using Jeremy Todd’s code

Fragmentation

Let us access sys.dm_db_index_physical_stats and inquire about fragmentation level:

Code


/*

	Check SQL Server a specified database index fragmentation percentage (SQL)
		https://gallery.technet.microsoft.com/scriptcenter/Check-SQL-Server-a-a5758043

*/
SELECT 
		  TableName
			= OBJECT_NAME(ind.OBJECT_ID)
		, IndexName
			= ind.name
		, IndexType
			= indexstats.index_type_desc
		, [NumberofRows]	
			= p.[rows]
		, [%Fragmentation]
			= indexstats.avg_fragmentation_in_percent 

FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, NULL) indexstats 

INNER JOIN sys.indexes ind  
		ON ind.object_id = indexstats.object_id 
		AND ind.index_id = indexstats.index_id 

INNER JOIN sys.partitions p
		ON p.object_id = indexstats.object_id 
		AND p.index_id = indexstats.index_id 

WHERE OBJECT_SCHEMA_NAME(ind.OBJECT_ID) = 'datatypeUniqueIdentifier'

ORDER BY indexstats.avg_fragmentation_in_percent DESC

Output

Indexfragmentation.20151208.0253PM

Explanation

  1. The least index fragmentation was for the in-built NewSequentialID (1.2%)
  2. Next came Jimmy Nilsson Comb’s function at 30%
  3. And, then for Jeremy Todd’s Sequential ID (38%)
  4. Both the CLR Unique Identifier and the regular NEWID came in at 99%

 

Storage Utilization

Let us access sys.dm_db_partition_stats and inquire into Storage utilized by each table:

Code



; with cteTotal
as
	(
		SELECT 
			  sum(PART.rows) AS PartitionRows 
			, sum(STAT.row_count) AS StatRowCount 
			, sum(STAT.used_page_count * 8) AS UsedSizeKB 
			, sum(STAT.reserved_page_count * 8) AS ReservedSizeKB 
	FROM sys.partitions AS PART 
		 INNER JOIN sys.dm_db_partition_stats AS STAT 
			 ON PART.partition_id = STAT.partition_id 
				AND PART.partition_number = STAT.partition_number 
		 INNER JOIN sys.objects AS OBJ 
			 ON STAT.object_id = OBJ.object_id 
		 INNER JOIN sys.schemas AS SCH 
			 ON OBJ.schema_id = SCH.schema_id 
		 INNER JOIN sys.indexes AS INDX 
			 ON STAT.object_id = INDX.object_id 
				AND STAT.index_id = INDX.index_id 

	WHERE OBJECT_SCHEMA_NAME(OBJ.OBJECT_ID) = 'datatypeUniqueIdentifier'



	)
SELECT 
		  SCH.name AS SchemaName 
		, OBJ.name AS ObjName 
        , PART.rows AS PartitionRows 
        , STAT.row_count AS StatRowCount 
        , UsedSizeKB 
			= (STAT.used_page_count * 8)
        , ReservedSizeKB 
			= (STAT.reserved_page_count * 8)
	, [%ReservedSizeKB]
		= cast
			(
				  (STAT.reserved_page_count * 8 * 100.00)
				/ NULLIF(cteTotal.reservedSizeKB, 0)
				as decimal(10,2)
			)
FROM sys.partitions AS PART 

     INNER JOIN sys.dm_db_partition_stats AS STAT 
         ON PART.partition_id = STAT.partition_id 
            AND PART.partition_number = STAT.partition_number 
     INNER JOIN sys.objects AS OBJ 
         ON STAT.object_id = OBJ.object_id 
     INNER JOIN sys.schemas AS SCH 
         ON OBJ.schema_id = SCH.schema_id 
     INNER JOIN sys.indexes AS INDX 
         ON STAT.object_id = INDX.object_id 
            AND STAT.index_id = INDX.index_id 
	CROSS APPLY cteTotal

WHERE OBJECT_SCHEMA_NAME(OBJ.OBJECT_ID) = 'datatypeUniqueIdentifier'

ORDER BY 
   	     STAT.reserved_page_count
	   , SCH.name 
           , OBJ.name 

 

Output:

20151208.1236

Explanation

  1. Microsoft’s own NewSequentialID uses about half as much storage as some of the other CLR functions that we tried out
    • The reason is because of Page Splits

 

Storage Utilization & Index Defragmentation

Code



DECLARE @db_id SMALLINT;
DECLARE @object_id INT;

SET @db_id = db_id()


SELECT 

		  [schema] = object_schema_name(objIPS.object_id)

		, [object] = object_name(objIPS.object_id)

		, [pageCount]
			=objIPS.page_count

		, [fragmentCount]
			=objIPS.fragment_count

		, [%fragmention]
			=objIPS.avg_fragmentation_in_percent


FROM sys.dm_db_index_physical_stats
		(
				  @db_id
				, @object_id
				, NULL
				, NULL 
				, 'LIMITED'
		) objIPS

WHERE OBJECT_SCHEMA_NAME(objIPS.OBJECT_ID) = 'datatypeUniqueIdentifier'

order by 
			objIPS.avg_fragmentation_in_percent asc

GO



 

Output:

physicalStats.20151210

 

Explanation:

  1. We can quickly see that for newSequentialID table, we use 20 thousand pages while for our home grown CLR sequential ID we used about 38 thousand pages
    • So for page count we use 50% less storage for newSequentialID
  2. The fragmentCount is also far less
    • Fragment Count is the number of consecutive fragments

 

Connect Items

Here are related Microsoft Connect Items:

  1. ID :- 732399
    • Title
      • NEWSEQUENTIALID() should be callable as select NEWSEQUENTIALID(), similar to NEWID()
    • Description
      • The current design choice of only allowing NEWSEQUENTIALID() as a default on a column makes it much more difficult to use in applications, which want an id up-front, before inserting into a table.

        Currently getting an error: The newsequentialid() built-in function can only be used in a DEFAULT expression for a column of type ‘uniqueidentifier’ in a CREATE TABLE or ALTER TABLE statement. It cannot be combined with other operators to form a complex scalar expression.

    •  Feedback
      • Hello Andrew,
        Thanks for your feedback. While I agree that having per-row semantics for NEWSEQUENTIALID is good from a consistency point of view, we do not have plans to support it yet. Currently, NEWID & NEXT VALUE FOR are the only functions that have per-row semantics and each has its own restrictions. Additionally, NEWSEQUENTIALID has blocking semantics unlike NEWID to guarantee that the guids are sequential. This has performance implications when used from queries.
        Anyway, I am closing this request as won’t fix for now given the priority of this request with others already in our pipeline. We will look at it in the future if we get more feedback on this.–
        Umachandar, SQL Programmability Team

Performant Measurement

With a limited battery of test, it seems that we will likely achieve the most progressively increasing numbers with the Built-In SQL Server Function, NewSequentialID.

All of the CLR functions we tested do not quite keep state and thus the GUIDs generated can be a bit over the map.

 

Summary

 

Thankfully, with a bit of Internet prowling it is nice to see and know that SQL Server is easily extensible.

 

Listening

Here is to someone who puts in a lot more work into each day….

Adele – When we were young

References

GUID and SQL Server

  1. Stephen Cleary – A few words on GUID
    http://blog.stephencleary.com/2010/11/few-words-on-guids.html
  2. Telerik – Creating Client Side Sequential GUIDs
    http://docs.telerik.com/data-access/feature-reference/api/context-api/id-generators/feature-ref-api-context-api-client-seq-guids

Multiple Databases

  1. GUID as fast primary keys under multiple databases
    http://www.codeproject.com/Articles/388157/GUIDs-as-fast-primary-keys-under-multiple-database

 

Index Fragmentation

  1. Example: Index fragmentation with insert/updates, measuring it and fixing it by Sunil Agarwal
    http://blogs.msdn.com/b/sqlserverstorageengine/archive/2008/03/01/example-data-fragmentation-with-insert-updates-measuring-it-and-fixing-it.aspx

 

C# & SQL Unique Identifier

  1. C# GUID and SQL Unique Identifier
    http://stackoverflow.com/questions/1435908/c-sharp-guid-and-sql-uniqueidentifier 

Output Parameters in C#

  1. using Stored Procedure Output Parameters in C#
    http://stackoverflow.com/questions/10905782/using-stored-procedure-output-parameters-in-c-sharp

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s