SQL Server – Image and Text Data – Day 1

Background

Experienced a SQL Server Instance crash on Tuesday night.

Yes, there is very little reason for SQL to crash.

 

Sql Error & Crash Logs

We have collected the SQL Server Instance crash logs and will post about them at a later time.

 

What Precipitated the Crash

Glad you asked.  As an aside there was really nothing in the logs that contains nuggets as to why the Instance crashed.

And, so happy that the lady that was running a Job reached out to us within the first hour of the next day.

I asked her for a quick write-up and she shared that she was loading data into one of the tables.  The data that was being loaded included PDF Images.

 

Table Definition

Here is the table’s structure


CREATE TABLE [dbo].[ten95C](
	[ten95C_id] [int] IDENTITY(1,1) NOT NULL,
	[ten95C_ssn] [char](9) NOT NULL,
	[ten95C_emp_id] [char](9) NOT NULL,
	[ten95C_emp_loc] [char](2) NOT NULL,
	[ten95C_tax_year] [smallint] NOT NULL,
	[ten95C_showable] [char](1) NULL,
	[ten95C_view_cnt] [int] NULL,
	[ten95C_view_last] [datetime2](6) NULL,
	[ten95C_notes] [varchar](50) NULL,
	[ten95C_create_ts] [datetime2](0) NOT NULL,
	[ten95C_pdf] [varbinary](max) NOT NULL,
        CONSTRAINT [pk_ten95C_id] PRIMARY KEY CLUSTERED 
        (
	   [ten95C_id] ASC
        ) 
        ON [PRIMARY]
) 
ON [PRIMARY] 
TEXTIMAGE_ON [PRIMARY]

GO

ALTER TABLE [dbo].[ten95C] ADD  DEFAULT ('') FOR [ten95C_ssn]
GO

ALTER TABLE [dbo].[ten95C] ADD  DEFAULT ((0)) FOR [ten95C_view_cnt]
GO

Explanation

Here is a quick structural overview:

  1. The Image Column is defined as [ten95C_pdf] [varbinary](max) 
    • Because FileStream is not explicitly noted we are storing images using the classic method
  2. The table itself is saved on the Primary File Group
  3. The Text Image is also saved on the Primary File Group

 

Table Metrics

Using SSMS and choosing digging deep into the Table, here is what the “Table Properties” look like

tableproperties-dbo-ten95c

 

Explanation

Quick Notes

  1. Again both Table and Text Filegroup is Primary
  2. Data Space is 81, 499 MB or 82 GB
  3. Index is 7.875 MB
  4. Row count is 147112 or 150 thousand

 

Optimization Choices

There are a couple of alternate pathways that we can evaluate to potentially achieve better performance.

  1. Storage
    • Currently, data, indexes, and LOB data are all saved in the same file groups
    • We should consider placing them on different filegroups .  And, targeting those filegroups in on their own physical drives
    • Also, will using File Streams offer better performance
  2. Client End Tooling
    1. The front-end tool is Java.
    2. And, the DB interface is Hibernate
    3. Will revisit optimization choices for Hibernate in a later post

 

Approach

In this post, we will go with creating a new file group dedicated to LOB data.

 

FileGroup

FileGroup – Add file groups


USE [master]
GO

ALTER DATABASE [rbpivr1_20161111]
	ADD FILEGROUP [FileGroupIndexes]
GO

ALTER DATABASE [rbpivr1_20161111] 
	ADD FILEGROUP [FileGroupTextAndImage]
GO

 

 

FileGroup – Add files to new file groups

FileGroup – Add files to new file groups – Indexes


USE [master]
GO

exec xp_create_subdir N'Z:\Microsoft\SQLServer\Indexes\rbpivr1\'
go

ALTER DATABASE [rbpivr1_20161111] 
ADD FILE 
	( 
		  NAME = N'rbpivr1_indexes_01'
		, FILENAME = N'Z:\Microsoft\SQLServer\Indexes\rbpivr1\rbpivr1_indexes_01.ndf' 
		, SIZE = 4096KB 
		, FILEGROWTH = 200MB 
	) 
	TO FILEGROUP [FileGroupIndexes]

GO

 

FileGroup – Add files to new file groups – TextAndImage


USE [master]
GO

exec xp_create_subdir N'Z:\Microsoft\SQLServer\TextAndImage\rbpivr1\'
go

ALTER DATABASE [rbpivr1_20161111]
ADD FILE
(
NAME = N'rbpivr1_text_image_01'
, FILENAME = N'Z:\Microsoft\SQLServer\TextAndImage\rbpivr1\rbpivr1_TextAndImage_01.ndf'
, SIZE = 4096KB
, FILEGROWTH = 200MB
)
TO FILEGROUP [FileGroupTextAndImage]

GO

Table

Let us create new table and be sure to target the indexes and text column on the Index FileGroup and Image & Text file group accordingly.

SET ANSI_NULLS OFF
GO

SET QUOTED_IDENTIFIER ON
GO

if schema_id('fgTI') is null
begin

	exec('create schema [fgTI] authorization [dbo]')

end
;

if object_id('[fgTI].[ten95C]') is null
begin

	CREATE TABLE [fgTI].[ten95C]
	(
		[ten95C_ssn] [char](9) NOT NULL,
		[ten95C_emp_id] [char](9) NOT NULL,
		[ten95C_emp_loc] [char](2) NOT NULL,
		[ten95C_tax_year] [smallint] NOT NULL,
		[ten95C_showable] [char](1) NULL,
		[ten95C_view_cnt] [int] NULL,
		[ten95C_view_last] [datetime2](6) NULL,
		[ten95C_notes] [varchar](50) NULL,
		[ten95C_create_ts] [datetime2](6) NOT NULL,
		[ten95C_pdf] [varbinary](max) NOT NULL

		, CONSTRAINT [pk_ten95C] PRIMARY KEY CLUSTERED 
		(
			[ten95C_ssn] ASC,
			[ten95C_emp_id] ASC,
			[ten95C_emp_loc] ASC,
			[ten95C_tax_year] ASC,
			[ten95C_create_ts] ASC
		)
		WITH 
		(
			  PAD_INDEX = OFF
			, STATISTICS_NORECOMPUTE = OFF
			, IGNORE_DUP_KEY = OFF
			, ALLOW_ROW_LOCKS = ON
			, ALLOW_PAGE_LOCKS = ON
		) 
		ON [PRIMARY]
	
		, CONSTRAINT [ten95C_tax] UNIQUE NONCLUSTERED 
		(
			[ten95C_emp_id] ASC,
			[ten95C_emp_loc] ASC,
			[ten95C_tax_year] ASC,
			[ten95C_create_ts] ASC
		)
		WITH 
		(
			  PAD_INDEX = OFF
			, STATISTICS_NORECOMPUTE = OFF
			, IGNORE_DUP_KEY = OFF
			, ALLOW_ROW_LOCKS = ON
			, ALLOW_PAGE_LOCKS = ON
		) 
		--ON [PRIMARY]
		ON [FileGroupIndexes]

	) 
	ON [PRIMARY] 

	--By default Text and Image and saved on FileGroup- Primary
	--TEXTIMAGE_ON [PRIMARY]
	TEXTIMAGE_ON [FileGroupTextAndImage]

	ALTER TABLE [fgTI].[ten95C] 
		ADD  DEFAULT ('') FOR [ten95C_ssn]

	ALTER TABLE [fgTI].[ten95C] 
		ADD  DEFAULT ((0)) FOR [ten95C_view_cnt]

end
go

Explanation

  1. We kept the Primary Key and accordingly the table in the primary file group
  2. Our lone index is targeting the newly created FileGroupIndexes filegroup
  3. And, the TextImage_On is targeting FileGroupTextAndImage

 

 

Transition Data

To transition data we have a lot of options, such as :

  1. ETL Tools
    • SSIS anyone
  2. BulkCopy
    • Bulk Insert
    • BCP
  3. Transact SQL
    • Merge Statement

 

Transact SQL – Merge Statement


use [rbpivr1]
go

set XACT_ABORT on
set nocount on

declare @id int
declare @idLen int
declare @idAsString varchar(60)
declare @spaceLength varchar(10)
declare @BLOCK_LENGTH_OF_SPACE tinyint
declare @idLow bigint
declare @idHigh bigint


declare @idLowBlock bigint
declare @idHighBlock bigint
declare @iBlockSize bigint
declare @commit bit

declare @log varchar(600)

set @idLow = 1
set @idHigh = 1E4

set @BLOCK_LENGTH_OF_SPACE = 4

set @idLowBlock = @idLow
set @iBlockSize = 1E3
set @commit =1

truncate table  [rbpivr1_20161111].[fgTI].[ten95C];

begin tran

	delete 
	from [rbpivr1_20161111].[fgTI].[ten95C]

	set identity_insert [rbpivr1_20161111].[fgTI].[ten95C] on
	
	set @id =1
	
	while (
			(@idLowBlock <= @idHigh) ) begin set @idHighBlock = @idLowBlock + @iBlockSize set @idLen = len(@id) set @spaceLength = @BLOCK_LENGTH_OF_SPACE - @idLen set @idAsString = replicate(' ', @spaceLength) + cast(@id as varchar(10)) set @log = 'Retrieving data from ' + @idAsString + ') ' + cast(@idLowBlock as varchar(10)) + ' ' + cast(@idHighBlock as varchar(10)) print @log merge [rbpivr1_20161111].[fgTI].[ten95C] tblTarget using ( select [ten95C_id] ,[ten95C_ssn] ,[ten95C_emp_id] ,[ten95C_emp_loc] ,[ten95C_tax_year] ,[ten95C_showable] ,[ten95C_view_cnt] ,[ten95C_view_last] ,[ten95C_notes] ,[ten95C_create_ts] ,[ten95C_pdf] from [rbpivr1].[dbo].[ten95C] tbl95C with (nolock) where ( ( tbl95C.[ten95C_id] between @idLowBlock and @idHighBlock) ) ) tblSource ON ( ( tblTarget.[ten95C_id] = tblSource.[ten95C_id]) ) WHEN NOT MATCHED BY TARGET THEN INSERT ( [ten95C_id] ,[ten95C_ssn] ,[ten95C_emp_id] ,[ten95C_emp_loc] ,[ten95C_tax_year] ,[ten95C_showable] ,[ten95C_view_cnt] ,[ten95C_view_last] ,[ten95C_notes] ,[ten95C_create_ts] ,[ten95C_pdf] ) VALUES ( tblSource.[ten95C_id] ,tblSource.[ten95C_ssn] ,tblSource.[ten95C_emp_id] ,tblSource.[ten95C_emp_loc] ,tblSource.[ten95C_tax_year] ,tblSource.[ten95C_showable] ,tblSource.[ten95C_view_cnt] ,tblSource.[ten95C_view_last] ,tblSource.[ten95C_notes] ,tblSource.[ten95C_create_ts] ,tblSource.[ten95C_pdf] ) ; set @log = 'Posted data' + @idAsString + ') ' + cast(@idLowBlock as varchar(10)) + ' ' + cast(@idHighBlock as varchar(10)) print @log set @idLowBlock = @idHighBlock set @id =@id +1 end ---block work while (@@TRANCOUNT > 0)
begin

	if (@commit = 1)
	begin
		print 'commit'
		commit tran
	end

	if (@commit = 0)
	begin
		print 'rollback'			
		rollback tran
	end

end

set identity_insert [rbpivr1_20161111].[fgTI].[ten95C] off;


 

Introspection

Let us compare our original design with the slight modification we made

 

Table Allocation Sizes


declare @tableName sysname

set @tableName = 'ten95C'

; with cte
(
	  	  [objectName]
		, [index_id]
		, [index_name]
		, [filegroup]
		, [fileGroupType]
		, [allocationUnitType]
		, [data_space_id]
		, [numberOfPages]
		, [totalPagesInMB]
		, [usedPagesInMB]
		, [dataPagesInMB]

)
as
(
	SELECT 

		[objectName]
		=   quoteName(tblSS.[name])
		  + '.'
		  + quoteName(tblSO.[name])

		, [index_id]
			= PA.index_id

		, [index_name]
			= tblSI.[name]

		, [filegroup]
			= FG.[name] 

		, [fileGroupType]
			= FG.[type_desc]

		, [allocationUnitType]
			= AU.[type_desc]

		, FG.[data_space_id]

		, [numberOfPages]
			= (sum(AU.[total_pages]))

		, [totalPagesInMB]
			= (sum(AU.[total_pages]) * 8 ) / 1024

		, [usedPagesInMB]
			= (sum(AU.[used_pages]) * 8) / 1024

		, [dataPagesInMB]
			= (sum(AU.[data_pages] * 8 ) / 1024)

	FROM sys.filegroups FG 

	INNER JOIN sys.allocation_units AU 
		ON AU.data_space_id = FG.data_space_id 

	INNER JOIN sys.partitions PA 
		ON PA.partition_id = AU.container_id 

	INNER JOIN sys.objects tblSO
		ON PA.object_id = tblSO.object_id


	INNER JOIN sys.indexes tblSI
		ON  PA.object_id = tblSI.object_id
		AND PA.index_id = tblSI.index_id

	INNER JOIN sys.schemas tblSS
		ON tblSO.schema_id = tblSS.schema_id

	WHERE  tblSO.[type] = 'U'

	and    tblSO.[name] = isNull(@tableName, tblSO.[name])
	
	group by

		quoteName(tblSS.[name])
		  + '.'
		  + quoteName(tblSO.[name])

		, PA.index_id

		, tblSI.[name]

		, FG.[name] 

		, FG.[type_desc]

		, AU.[type_desc]

		, FG.[data_space_id]

)
select 

		  [objectName]
		--, [index_id]
		, [index_name]
		, [filegroup]
		--, [fileGroupType]
		, [allocationUnitType]
		--, [data_space_id]
		, [numberOfPages]
		, [totalPagesInMB]
		, [usedPagesInMB]
		, [dataPagesInMB]

from   cte

order by
		[index_id]

Output

Original

tableproperties-dbo-ten95c

Revised

tableproperties-fgti-ten95c

Explanation
  1. In our original design
    • All data is targeted at the primary FileGroup
  2. In the revised design
    • LOB Data
      • Is the biggest storage hoarder, but now it is in its own FileGroup [fileGroupTextAndImage]
    • PRIMARY
      • Regular table data and the Primary Key is all that remains on the PRIMARY file group
    • Indexes
      • Our two indexes are placed next to each other in the Index FileGroup

Summary

This is Day 1 and so we will only deep into our LOB (Image) datatype.

A lot of these posts are just to disarm areas we do not deal with everyday.

One thought on “SQL Server – Image and Text Data – Day 1

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s