Why is SQL Server Creating Statistics When You Create a Primary Key?

 

Why is SQL Server Creating Statistics When You Create a Primary Key?

When you create a Primary Key in SQL Server, it automatically creates a Clustered Index (unless a different index type is specified). As part of this process, SQL Server also creates statistics to help optimize query performance.


📌 What are Statistics in SQL Server?

Statistics in SQL Server store data distribution information about a column or an index. The Query Optimizer uses these statistics to create efficient execution plans for queries.

🔹 Why Are Statistics Important?

  • Help the Query Optimizer determine the best way to execute queries.
  • Improve index usage by providing row count estimates.
  • Reduce query execution time by avoiding full table scans.

📌 Why Does SQL Server Create Statistics When You Add a Primary Key?

1. Statistics Are Created Automatically for Indexes

  • When you create a Primary Key, SQL Server automatically creates a Clustered Index (if no other Clustered Index exists).
  • SQL Server automatically generates statistics for any new index to optimize query execution.

🔹 Example:

ALTER TABLE HumanResources.Department  
ADD CONSTRAINT PK_Department_DepartmentID  
PRIMARY KEY CLUSTERED (DepartmentID ASC)  
ON [PRIMARY];

✔ This command creates an index on DepartmentID.
✔ SQL Server automatically generates statistics on DepartmentID to help with query optimization.


2. SQL Server Uses Statistics to Estimate Query Performance

  • When you run a query, SQL Server looks at statistics to estimate how many rows will match the query condition.
  • This helps SQL Server decide whether to:
    • Use index seek (fast) or index scan (slower).
    • Use nested loops, hash joins, or merge joins.

🔹 Example Query:

SELECT * FROM HumanResources.Department WHERE DepartmentID = 5;
  • SQL Server checks the statistics on DepartmentID to estimate how many rows match DepartmentID = 5.
  • If only a few rows match, SQL Server may use an index seek.
  • If many rows match, SQL Server may choose an index scan or a different execution plan.

📌 How to Check Statistics for an Index?

You can check the statistics SQL Server created using:

DBCC SHOW_STATISTICS ('HumanResources.Department', 'PK_Department_DepartmentID');

This shows:

  • Histogram (distribution of data values)
  • Density Vector (uniqueness of values)
  • Row Count Estimates

📌 Can I Manually Update Statistics?

Yes! Statistics update automatically, but you can also manually update them:

UPDATE STATISTICS HumanResources.Department (PK_Department_DepartmentID);

or for all statistics in the database:

EXEC sp_updatestats;

📝 Key Takeaways

Statistics store data distribution information to help the Query Optimizer.
When you create a Primary Key (or Index), SQL Server automatically creates statistics.
The Query Optimizer uses statistics to choose the best execution plan.
You can manually update statistics using UPDATE STATISTICS or sp_updatestats.

Would you like an example of how to analyze statistics for performance tuning? 😊

No comments:

Post a Comment