Hi today I would like to share with you experiment I did small experiment for checking improvement of operations in table with primary key that has unique identifier (GUID) type and it is PRIMARY KEY in this table. I want to check if PK will be CLUSTERED (by default) and if PK will be NONCLUSTERED index. There is a pervasive question in any job interview, so probably all of you already know the difference between both index kinds. But I would explain that CLUSTERED index is sorted physically and the NONCLUSTERED is not. I used 1 million rows for both test tables and measured INSERT and SELECT statements times. Below you can find the test script. The script is straightforward and collects data from SQL Statistics. If you like to reproduce this experiment, please run every part finished with the go statement 1 by 1 in order. I hope TSQL code it is simple to understand.
use tempdb go create table TestTableClustered ( ID uniqueidentifier not null, Number bigint not null identity(1,1), Stamp timestamp not null, constraint PK_TestTableClustered primary key clustered (ID) ) create nonclustered index IX_TestTableClustered on TestTableClustered (Number) create table TestTableNonClustered ( ID uniqueidentifier not null, Number bigint not null identity(1,1), Stamp timestamp not null, constraint PK_TestTableNonClustered primary key clustered (Number) ) create nonclustered index IX_TestTableNonClustered on TestTableNonClustered (ID) go declare @count int = 1000000 set statistics time on ;with n(n) as ( select 1 n union all select n+1 from n where n < @count ) insert into TestTableClustered(ID) select NEWID() from n option (maxrecursion 0) set statistics time off go declare @count int = 1000000 set statistics time on ;with n(n) as ( select 1 n union all select n+1 from n where n < @count ) insert into TestTableNonClustered(ID) select NEWID() from n option (maxrecursion 0) set statistics time off go set statistics time on select * from TestTableClustered order by Number set statistics time off go set statistics time on select * from TestTableNonClustered order by Number set statistics time off go drop index IX_TestTableClustered on TestTableClustered drop table TestTableClustered drop index IX_TestTableNonClustered on TestTableNonClustered drop table TestTableNonClustered go
And here you have results for above scripts. I dropped results for SELECT statements, so it is not whole output of the scripts, but just last result set.
Operation | Clustered Time | Clustered CPU | Non Clustered Time | Non Clustered CPU |
INSERT | 17363 | 16442 | 17253 | 16380 |
SELECT | 13119 | 2871 | 11263 | 359 |
As you can see this experiment clearly shows that selecting from table with NONCLUSTERED index as PRIMARY KEY is a lot more efficient.That is because CLUSTERED index causes a lot of fragmentation in data. Please let me know if you had such dilemma in your coding experience so far?
p ;).
Use NEWSEQUENTIALID() (as default on column) instead of inserting NEWID().
Thank you very much. I did not know about that option. And it works very fast.
I never knew there was a difference between Clustered and Non Clustered, until now, thanks a lot piotr