Transact SQL and the C Level Programmer ( String Casing and Indexes )

Background

Again, back reviewing code and saw yet another interesting one.

The code looks like this:


select username
from   [dbo].[contactFNTest]
where  LOWER(lastname) = LOWER(@lastname)

 

Thinking

Left me thinking why all the string conversion/proper casing when your Database is case-insensitive. And, I was wondering will an index help.

 

Lab

Let us create a small little lab environment.

 

Create Table

set noexec off
go

use tempdb
go

if object_id('dbo.contactFNTest') is not null
begin
   set noexec on
end
go

create table [dbo].[contactFNTest]
(

    [username]    varchar(30) not null
  , [firstname]   varchar(60) not null
  , [lastname]    varchar(60) not null

  , [phoneNumber] varchar(30) null

  , constraint PK_CONTACTFNTEST PRIMARY KEY
    (
      [username]
    )

)

go

set noexec off
go

if not exists
(
   select 1
   from   sys.indexes tblSI
   where  tblSI.object_id = object_id('dbo.contactFNTest')
   and    tblSI.name = 'idx_lastname'
)
begin

    create index idx_lastname
    on [dbo].[contactFNTest]
    (
      [lastname]
    )

end
go

 

Populate Table


set nocount on
go

use tempdb
go

truncate table [dbo].[contactFNTest]
go

insert into [dbo].[contactFNTest]
([username], [firstname], [lastname], [phoneNumber] )
values
  ( 'dadeniji', 'Daniel', 'Adeniji', '803-111-2222')
, ( 'troberts', 'Tanya', 'Roberts', '415-121-2222')
, ( 'Trinidad', 'Felix', 'Trinidad', '310-121-2222')
, ( 'SHAGER', 'Sammy', 'Hager', '707-121-2222')
, ( 'smosely', 'Shane', 'Mosely', '213-121-2222')

 

Query Table


set nocount on
go

declare @lastname varchar(60)

print 'Product Version    :  ' + cast(serverproperty('productversion') as varchar)
print 'Server Collation   :  ' + cast(serverproperty('collation') as varchar)
print 'Database           :  ' + db_name()
print 'Database Collation :  ' + cast(databasepropertyex(db_name(), 'collation') as varchar)

select username
from   [dbo].[contactFNTest]
where  LOWER(lastname) = LOWER(@lastname)

select username
from   [dbo].[contactFNTest]
where  lastname = @lastname

 

DB Info

Here is Server and DB Info:

DBInfo

Explanation:

  • Product Version :- 12.0.2000.8 – 12 means SQL Server 2014
  • Database Collation :- Latin1_General_CI_AS – CI means Case Insensitive

 

Review Query Plan

QueryPlan

A quick explanation:

  • The query that uses the Lower on the column name does an index scan
  • And, the one that does not use the lower does an index seek

They both were able to put the index on the lastname to work.

 

C Level Programmer

I titled this post C-Level programmer for a couple of reasons.  In C language we had strcmp, strcmpi.

And, so when you bring a C Developer to the SQL World he assumes that the DB Engine is by default configured for case-sensitive searches and comparison; which is in-fact quicker and faster.

But, if truth be told, I personally have seen more databases setup with case-insensitive collation \ sort.

So you are thinking Daniel, you are making an assumption that just because someone cares about string case, that does not make them C-Developers.

But, then the dead giveaway was this code line


where  LOWER(substring(countryCode, 0, 3)) = LOWER(substring(@CountryCode, 0, 3))

I am thinking to myself that in SQL we do not start counting at 0, but at 1.

 

Substring – Start

Let us compare substring start argument at 0 with start argument 1


    use [tempdb]
    go

    select
         username
       , phonenumber
       , substring(phoneNumber,0,3) [returnsFirst2Chars]
       , substring(phoneNumber,1,3) [returnsFirst3Chars]
    from [dbo].[contactFNTest]

 

Output:

phoneNumber

 

 

Microsoft Take

Here is Microsoft’s documentation on SubString ( https://msdn.microsoft.com/en-us/library/ms187748.aspx )

SUBSTRING ( expression ,start , length )

Is an integer or bigint expression that specifies where the returned characters start. If start is less than 1, the returned expression will begin at the first character that is specified in expression. In this case, the number of characters that are returned is the largest value of either the sum of start + length– 1 or 0. If start is greater than the number of characters in the value expression, a zero-length expression is returned.

I will let you think more why you will want to pass a value less than 1 as the start argument.

 

 

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s