Tuesday, March 27, 2012

Difference between Index & Statistics

2000 & 2005 (The DDL was pulled from the 2005 box, but should be the same, or
very close, on the 2000 box)
I know what statistics are: distribution of values used by the Query
Optimizer.
I know what indexes are.
I do not understand why I see both CREATE INDEX and CREATE STATISTICS on a
table, nor do I understand why Visio seem to be marking a column with
Statistics (ObjectID) as having a Unique index.
The column in question is "ObjectID" and is used in a JOIN to other tables.
REATE TABLE [dbo].[Documents](
[DocumentID] [int] IDENTITY(1,1) NOT NULL,
[ObjectID] [int] NULL,
[ObjectTypeID] [int] NULL,
[StatusID] [int] NOT NULL CONSTRAINT [DF_Documents_StatusID] DEFAULT (1),
[StatusComment] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[Searchable] [bit] NOT NULL CONSTRAINT [DF_Documents_Searchable] DEFAULT
(1),
[DateEntered] [datetime] NOT NULL CONSTRAINT [DF_Documents_DateEntered]
DEFAULT (getdate()),
[DateModified] [datetime] NOT NULL CONSTRAINT [DF_Documents_DateModified]
DEFAULT (getdate()),
[ModifiedBy] [int] NULL,
[ReleaseDate] [smalldatetime] NULL,
[ExpireDate] [smalldatetime] NULL,
[ViewCount] [int] NOT NULL CONSTRAINT [DF_Documents_ViewCount] DEFAULT
((0)),
[AddedBy] [int] NULL,
[DateAuthorCreated] [datetime] NULL,
[DateAuthorRevised] [datetime] NULL,
[Title] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[ShortTitle] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
CONSTRAINT [PK_Document] PRIMARY KEY CLUSTERED (
[DocumentID] ASC
)
WITH (PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY])
ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
CREATE STATISTICS [_dta_stat_29243159_1_2] ON
[dbo].[Documents]([DocumentID], [ObjectID])
GO
ALTER TABLE [dbo].[Documents] WITH CHECK ADD CONSTRAINT
[FK_Documents_ltblObjectType] FOREIGN KEY([ObjectTypeID])
REFERENCES [dbo].[ltblObjectType] ([ObjectTypeID])
GO
ALTER TABLE [dbo].[Documents] CHECK CONSTRAINT [FK_Documents_ltblObjectType]
> I do not understand why I see both CREATE INDEX and CREATE STATISTICS on a
> table
That you should ask the one show created the statistics. In fact, the name implies it was done by
Database Engine Tuning Advisor. Can that be correct? Anyhow, the statistics is on the column
*combination* (DocumentID, ObjectID). The only index I see is the one created for the PK which is on
only the column DocumentID. Even though distributiution information is for only the first column,
SQL Server *does* maintain density for the two columns (see output from DBCC SHOW_STATISTICS). So my
guess is that someone did a DTA for a workload and DTA suggested to create this statistics.

> nor do I understand why Visio seem to be marking a column with
> Statistics (ObjectID) as having a Unique index.
A bug in Visio? I suggest you ask in a Visio group, since you are more likely to find Visio experts
there.
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://sqlblog.com/blogs/tibor_karaszi
"JayKon" <JayKon@.discussions.microsoft.com> wrote in message
news:1BE2F35F-7A55-4B69-8FF7-B48CECD48E91@.microsoft.com...
> 2000 & 2005 (The DDL was pulled from the 2005 box, but should be the same, or
> very close, on the 2000 box)
> I know what statistics are: distribution of values used by the Query
> Optimizer.
> I know what indexes are.
> I do not understand why I see both CREATE INDEX and CREATE STATISTICS on a
> table, nor do I understand why Visio seem to be marking a column with
> Statistics (ObjectID) as having a Unique index.
> The column in question is "ObjectID" and is used in a JOIN to other tables.
> REATE TABLE [dbo].[Documents](
> [DocumentID] [int] IDENTITY(1,1) NOT NULL,
> [ObjectID] [int] NULL,
> [ObjectTypeID] [int] NULL,
> [StatusID] [int] NOT NULL CONSTRAINT [DF_Documents_StatusID] DEFAULT (1),
> [StatusComment] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
> [Searchable] [bit] NOT NULL CONSTRAINT [DF_Documents_Searchable] DEFAULT
> (1),
> [DateEntered] [datetime] NOT NULL CONSTRAINT [DF_Documents_DateEntered]
> DEFAULT (getdate()),
> [DateModified] [datetime] NOT NULL CONSTRAINT [DF_Documents_DateModified]
> DEFAULT (getdate()),
> [ModifiedBy] [int] NULL,
> [ReleaseDate] [smalldatetime] NULL,
> [ExpireDate] [smalldatetime] NULL,
> [ViewCount] [int] NOT NULL CONSTRAINT [DF_Documents_ViewCount] DEFAULT
> ((0)),
> [AddedBy] [int] NULL,
> [DateAuthorCreated] [datetime] NULL,
> [DateAuthorRevised] [datetime] NULL,
> [Title] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
> [ShortTitle] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
> CONSTRAINT [PK_Document] PRIMARY KEY CLUSTERED (
> [DocumentID] ASC
> )
> WITH (PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY])
> ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
>
> GO
> CREATE STATISTICS [_dta_stat_29243159_1_2] ON
> [dbo].[Documents]([DocumentID], [ObjectID])
> GO
> ALTER TABLE [dbo].[Documents] WITH CHECK ADD CONSTRAINT
> [FK_Documents_ltblObjectType] FOREIGN KEY([ObjectTypeID])
> REFERENCES [dbo].[ltblObjectType] ([ObjectTypeID])
> GO
> ALTER TABLE [dbo].[Documents] CHECK CONSTRAINT [FK_Documents_ltblObjectType]
>
|||Thanks Tibor,
After reading your reply, I decided to look at the table again. For some
reason I was expecting MSSMS to give me all the DDL to the table in a single
option. My bad.
There is a non-unique index on ObjectID of the Documents table, so Visio's
"U" means non-unique and "I" means unique Oh well, Visio vs. ERwin.
As to the DTA, it kinda of sounds like you don't think much of it. Yes? No?
As to the Statistics, I'm still unclear why I would want them and not an
index. I am, of course, only refering to the statistics that show up in the
DDL, not the engine stats. Am I correct in the distinction I just made, or
should it be phrased differently?
"Tibor Karaszi" wrote:

> That you should ask the one show created the statistics. In fact, the name implies it was done by
> Database Engine Tuning Advisor. Can that be correct? Anyhow, the statistics is on the column
> *combination* (DocumentID, ObjectID). The only index I see is the one created for the PK which is on
> only the column DocumentID. Even though distributiution information is for only the first column,
> SQL Server *does* maintain density for the two columns (see output from DBCC SHOW_STATISTICS). So my
> guess is that someone did a DTA for a workload and DTA suggested to create this statistics.
>
> A bug in Visio? I suggest you ask in a Visio group, since you are more likely to find Visio experts
> there.
> --
> Tibor Karaszi, SQL Server MVP
> http://www.karaszi.com/sqlserver/default.asp
> http://sqlblog.com/blogs/tibor_karaszi
>
> "JayKon" <JayKon@.discussions.microsoft.com> wrote in message
> news:1BE2F35F-7A55-4B69-8FF7-B48CECD48E91@.microsoft.com...
>
|||> There is a non-unique index on ObjectID of the Documents table, so Visio's
> "U" means non-unique and "I" means unique Oh well, Visio vs. ERwin.
Now, that's weird. I guess different tool makers has different preferences...

> As to the DTA, it kinda of sounds like you don't think much of it.
No, that was not what I was trying to say. DTA is been much improved since Index Tuning Wizard (2000
and 7.0). IMO, a tool like this will never replace the human brain, but it is a good complement to
the work we do.

> As to the Statistics, I'm still unclear why I would want them and not an
> index. I am, of course, only refering to the statistics that show up in the
> DDL, not the engine stats. Am I correct in the distinction I just made, or
> should it be phrased differently?
Sometimes, statistics can help the optimizer pick a better plan, even in cases where an index
wouldn't be used. So, in these cases, why carry a b-tree if it won't be used?
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://sqlblog.com/blogs/tibor_karaszi
"JayKon" <JayKon@.discussions.microsoft.com> wrote in message
news:9CAF481A-3FF4-4557-9D7E-8053544E480D@.microsoft.com...[vbcol=seagreen]
> Thanks Tibor,
> After reading your reply, I decided to look at the table again. For some
> reason I was expecting MSSMS to give me all the DDL to the table in a single
> option. My bad.
> There is a non-unique index on ObjectID of the Documents table, so Visio's
> "U" means non-unique and "I" means unique Oh well, Visio vs. ERwin.
> As to the DTA, it kinda of sounds like you don't think much of it. Yes? No?
> As to the Statistics, I'm still unclear why I would want them and not an
> index. I am, of course, only refering to the statistics that show up in the
> DDL, not the engine stats. Am I correct in the distinction I just made, or
> should it be phrased differently?
> "Tibor Karaszi" wrote:

No comments:

Post a Comment