Transact SQL – Warnings – No Join Predicate / Query Optimization – Rules and Costs Based

Background

The first thing one learns about Database Performance is the importance of “up to date” Statistics.  As soon as things start running slow, one hears the harried echoes – Is there blocking, are the stats up to date, are the Indexes Defragged, are those bad queries running again?

I am always amazed as to how many things get thrown at the lowly and lonely DBA.

 

No Join Predicate

And, so that I am really jazzed that Microsoft Research along with the Database Core group continues to think ahead and evolve the product.  Let us take for example this “No Join predicate” thing.

Optimization Path

Rules Based

We has DBAs have to quickly understand what is new, how it helps us, and what do we have to do to ensure we are reaping unearned credits.

How does it work?  Well, it looks at the query and tries to decipher whether it’s join clause is fully defined.

In the Join clause it says have we included all the columns needed to ensure that a unique row is returned.  Uniqueness is satisfied via:

  • Primary Key
  • Unique Index
  • Unique Constraints

In previous post, we defined our table as

 


SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

if schema_id('njp') is null
begin

    exec('create schema [njp] authorization [dbo]')

end
go

if object_id('[njp].[AttributeList]') is not null
begin
    drop table [njp].[AttributeList]
end
go

if object_id('[njp].[listofAttributes]') is not null
begin
    drop table [njp].[listofAttributes]
end
go

if object_id('[njp].[person]') is not null
begin
    drop table [njp].[person]
end
go

create table [njp].[person]
(
    [personID] bigint not null identity(1,1)

     CONSTRAINT [PK_PERSON_ID] PRIMARY KEY CLUSTERED
    (
	    [personID] ASC
    )

)

create table [njp].[listofAttributes]
(
    [attributeName] nvarchar(50) not null primary key
)
go

CREATE TABLE [njp].[AttributeList]
(
	  [pkAttributeListID] [bigint] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL

    , [personID] bigint not null

	, [attributeName] [nvarchar](50) not null

	, [attributeValue] [nvarchar](380)  null

	, [addedBy] [sysname] NOT NULL
        CONSTRAINT constrantDefaultNJPAttributeList
            default (SUSER_SNAME())

	, [addedOn] [datetime] NOT NULL
         CONSTRAINT constrantDefaultNJPAttributeListAddedOn
            default getdate() 

     CONSTRAINT [PK_AttributeList] PRIMARY KEY NONCLUSTERED
    (
	    [pkAttributeListID] ASC
    )

    , CONSTRAINT [FK_AttributeList_PersonID]
        FOREIGN KEY
            (
	            [personID]
            )
        REFERENCES [njp].[person]
            (
	            [personID]
            )

     , CONSTRAINT [FK_AttributeList_AttributeName]
        FOREIGN KEY
            (
	            [attributeName]
            )
        REFERENCES [njp].[listofAttributes]
            (
	            [attributeName]
            )

)

go

create clustered index idx_personID_attributeName
on [njp].[AttributeList]
(
      [personID]
	, [attributeName]
)
GO

create nonclustered index idx_pkAttributeListID
on [njp].[AttributeList]
(
      [pkAttributeListID]
)
GO

/*
    drop index [njp].[AttributeList].idx_unique_personID_attributeName
    create unique index idx_unique_personID_attributeName
    on [njp].[AttributeList]
    (
          [personID]
	    , [attributeName]
    );

    update statistics [njp].[AttributeList] idx_unique_personID_attributeName with fullscan;

*/
go

/*

    -- drop index [njp].[AttributeList].idx_unique_pkAttributeListID
    create unique nonclustered index idx_unique_pkAttributeListID
    on [njp].[AttributeList]
    (
          [pkAttributeListID]
    )

*/
go

/*

    alter table [njp].[AttributeList]
        drop constraint [PK_AttributeList] 

    alter table [njp].[AttributeList]
        add constraint PK_AttributeList
            primary key
    (
          [personID]
	    , [attributeName]
    );

*/

/*

    -- Check for unique constraints

    select
              object_name(id) as objectName
            , object_name(constid) as constraintName
            , tblC.name as columnName
            , tblSC.colid as columnID

    from  sys.sysconstraints tblSC

            inner join sys.columns tblC

                on tblSC.id = tblC.object_id
                and tblSC.colid = tblC.column_id

    where tblSC.id = object_id('[njp].[AttributeList]')
    and   tblSC.constid = object_id('constraint_AttributeList_Unique_personID_attributeName')

*/

/*

    select * from sys.check_constraints
    select * from sys.default_constraints

*/

/*

    --Drop Unique Constraint, if is already exists
    if exists
        (
            select *

            from  sys.indexes tblSI

            where  tblSI.is_unique_constraint = 1

            and    tblSI.object_id = object_id('[njp].[AttributeList]')

            and    tblSI.name = 'constraint_AttributeList_Unique_personID_attributeName'

        )
    begin

        alter table [njp].[AttributeList]
            drop constraint constraint_AttributeList_Unique_personID_attributeName

    end

    alter table [njp].[AttributeList]
        add constraint constraint_AttributeList_Unique_personID_attributeName
          unique
            (
                  [personID]
	            , [attributeName]
            );

*/
go

 

 

And, did a self join


  declare @personID bigint
     declare @attributeName nvarchar(50)
    declare @attributeNameAlt nvarchar(50)
    declare @attributeNameAlt2 nvarchar(50)

    set @attributeName = 'BirthMonth'
    set @attributeNameAlt = 'favoriteColor'
    set @attributeNameAlt2 = 'BirthCity'

   set @personID = 1 

   select
          tblAL.personID
        , tblAL.attributeName
        , tblAL.attributeValue

        , tblALAlt.attributeName
        , tblALAlt.attributeValue

from [njp].[AttributeList] tblAL

    left outer JOIN [njp].[AttributeList] tblALAlt
            ON tblAL.personID = tblALAlt.personID
            AND tblALAlt.attributeName = @attributeNameAlt

where tblAL.personID = @personID 

and tblAL.attributeName = @attributeName

;

 

Upon returning to our criteria, we can see that our join columns ( personID and attributeName) does not satisfy our uniqueness criteria:

  • The Primary Key is our identity column ( pkAttributeListID).   This ensures that we have an ever increasing index.  The good is that we will not incur split pages; but we will have hotspots at the shared insertion page
  • We have an index on personID and  attributeName
  • We do not have unique constraints

 

And, so we are hoping the system will hint us with “No Join Predicate” warning.

Query Plan -NoWarning

But, no it is giving us a pass.

 

Cost Based

Let us see whether we can trip the warning by introducing a bit of cost.  There are a couple of ways to do so:

  • Add data
  • Play game with the statistics

Statistics Tweaking

Let us play games with our statistics

 

Set hard-coded Statistics

 

Syntax:

     update statistics [njp].[AttributeList] with rowcount = <rowcount>, pagecount = <page-count>

 

Sample:

In the example below, we wll set the Number of rows to 1000000

     update statistics [njp].[AttributeList] with rowcount = 1000000;

 

Query Plan

Once we have statistics, we have a far more helpful query plan.

QueryPlan - Stats - Artificial

 

Hints

Here are our hints:

  • No Join Predicate
  • No Stats

 

No Join Predicates

Warnings - No Join Predicate

 

No Stats

 

Warnings - Columns With No Statistics

 

 

  • TableCardinality :- 1000000

    • The table cardinality of 1 million is obviously due to us faking a rowcount of 1 million records
  • Warnings :- Columns With No Statistics: [DBLAB].[njp].[AttributeList].personID, [DBLAB].[njp].[AttributeList].attributeName
    • The columns with no statistics is indicative of the fact that we have table level statistics, but we have no way of creating column level data stats

Check Statistics
 

DBCC SHOW_STATISTICS ('[njp].[AttributeList]', 'idx_personID_attributeName')

 

Output:

 

 

Statistics-Index-idx_PersonID_attributeName-RawData

 

Stats:

Here are the Stats through GUI:

We can also use the GUI by  transversing Database \ Tables \ <Table name> \ Statistics \ <Index-name>

Steps

StepsToAccessStatsThroughGUI

 

Output

StatsThroughGUI

Either path we take, issuing SQL statement or via GUI, we have no stats.

 

Take Care of Hints

Let us go address our helpful hints.

No Join Predicate

As we discussed on a previous post, we need to ensure that our joins result in a single row being returned.

We also offered up primary key or unique indexes. A unique constraint is a first cousin of unique index and we will use it.

 


    --Drop Unique Constraint, if is already exists
    if exists
        (
            select *

            from  sys.indexes tblSI

            where  tblSI.is_unique_constraint = 1

            and      tblSI.object_id = object_id('[njp].[AttributeList]')

            and     tblSI.name = 'constraint_AttributeList_Unique_personID_attributeName'

        )
    begin

        alter table [njp].[AttributeList]
            drop constraint constraint_AttributeList_Unique_personID_attributeName

    end

    alter table [njp].[AttributeList]
        add constraint constraint_AttributeList_Unique_personID_attributeName
          unique
            (
                  [personID]
                , [attributeName]
            );

 

Btw, I prefer to check if an object exists, before attempting to create it.  That is the basis for the if exists code-line above.

And, as always say our blind spots are not obvious outside of getting out a pen and writing it out.

To check for unique constraint, my original thoughts were geared towards checking the constraints table

 


    -- Check for unique constraints

    select
              object_name(id) as objectName
            , object_name(constid) as constraintName
            , tblC.name as columnName
            , tblSC.colid as columnID

    from  sys.sysconstraints tblSC

            inner join sys.columns tblC

                on tblSC.id = tblC.object_id
                and tblSC.colid = tblC.column_id

    where tblSC.id = object_id('[njp].[AttributeList]')
    and   tblSC.constid = object_id('constraint_AttributeList_Unique_personID_attributeName')

But, sys.sysconstrainsts does not expose unique constraints, just default and check constraints.

Also, keep in mind that we have sys.check_constraints and sys.default_constraints.  But, no sys.unique_constraints.

 

No Stats

Once we create the unique constraint ( index ), we were able to rid ourselves of the corresponding missing stats warning, as well.

But, keep in mind, we will still without detailed column stats, until we add data rows.

Once we add data rows, we can now see stats:

Statistics-Constraint

 

 

Source Control

GitHub

The Transact SQL is available GitHub @ https://github.com/DanielAdeniji/TransactSQLWarningNoJoinPredicate .

 

Additional Reading

As I prepared this post, Goggled on this and that.

And, was lucky enough to find this gem…

More Common Mistakes Java Developers Make when Writing SQL
http://blog.jooq.org/2013/08/12/10-more-common-mistakes-java-developers-make-when-writing-sql/

Item 8 – Not using row value expressions where they are supported

SELECT c.first_name, c.last_name, a.street
FROM   customer c
JOIN   address a
ON  (c.id, c.tenant_id) = (a.id, a.tenant_id)

Unfortunately, not all databases support row value expressions in the same way. But the SQL standard had defined them already in 1992, and if you use them, sophisticated databases like Oracle or Postgres can use them for calculating better execution plans.

 

Dedicated

I will be remiss if I do not dedicate this post to Microsoft – Tech Support, Research, and SQL Server.  The “No Join predicates” is very useful tooling.

Yes, I know that often I use this forum to throw things at MSFT.  And, anyone sympathetic to their cause will and should borrow from Chris Brown (CB) and say

Miss me with that BS”.

But, this is house business.  Though we take it to the streets ever so succinctly; it doesn’t stay that way.

 

Summary:

We need declarative tooling to establish row level uniqueness.  Primary Keys, Unique Indexes or Constraints should suffice.

But, it seems we also might need actual data or faked number of row statistics to trigger “No Join Predicate” warnings.

 

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s