About Me

My photo
I am an MCSE in Data Management and Analytics, specializing in MS SQL Server, and an MCP in Azure. With over 19+ years of experience in the IT industry, I bring expertise in data management, Azure Cloud, Data Center Migration, Infrastructure Architecture planning, as well as Virtualization and automation. I have a deep passion for driving innovation through infrastructure automation, particularly using Terraform for efficient provisioning. If you're looking for guidance on automating your infrastructure or have questions about Azure, SQL Server, or cloud migration, feel free to reach out. I often write to capture my own experiences and insights for future reference, but I hope that sharing these experiences through my blog will help others on their journey as well. Thank you for reading!

Why is SQL Server Creating Statistics When You Create a Primary Key?

 

Why is SQL Server Creating Statistics When You Create a Primary Key?

When you create a Primary Key in SQL Server, it automatically creates a Clustered Index (unless a different index type is specified). As part of this process, SQL Server also creates statistics to help optimize query performance.


📌 What are Statistics in SQL Server?

Statistics in SQL Server store data distribution information about a column or an index. The Query Optimizer uses these statistics to create efficient execution plans for queries.

🔹 Why Are Statistics Important?

  • Help the Query Optimizer determine the best way to execute queries.
  • Improve index usage by providing row count estimates.
  • Reduce query execution time by avoiding full table scans.

📌 Why Does SQL Server Create Statistics When You Add a Primary Key?

1. Statistics Are Created Automatically for Indexes

  • When you create a Primary Key, SQL Server automatically creates a Clustered Index (if no other Clustered Index exists).
  • SQL Server automatically generates statistics for any new index to optimize query execution.

🔹 Example:

ALTER TABLE HumanResources.Department  
ADD CONSTRAINT PK_Department_DepartmentID  
PRIMARY KEY CLUSTERED (DepartmentID ASC)  
ON [PRIMARY];

✔ This command creates an index on DepartmentID.
✔ SQL Server automatically generates statistics on DepartmentID to help with query optimization.


2. SQL Server Uses Statistics to Estimate Query Performance

  • When you run a query, SQL Server looks at statistics to estimate how many rows will match the query condition.
  • This helps SQL Server decide whether to:
    • Use index seek (fast) or index scan (slower).
    • Use nested loops, hash joins, or merge joins.

🔹 Example Query:

SELECT * FROM HumanResources.Department WHERE DepartmentID = 5;
  • SQL Server checks the statistics on DepartmentID to estimate how many rows match DepartmentID = 5.
  • If only a few rows match, SQL Server may use an index seek.
  • If many rows match, SQL Server may choose an index scan or a different execution plan.

📌 How to Check Statistics for an Index?

You can check the statistics SQL Server created using:

DBCC SHOW_STATISTICS ('HumanResources.Department', 'PK_Department_DepartmentID');

This shows:

  • Histogram (distribution of data values)
  • Density Vector (uniqueness of values)
  • Row Count Estimates

📌 Can I Manually Update Statistics?

Yes! Statistics update automatically, but you can also manually update them:

UPDATE STATISTICS HumanResources.Department (PK_Department_DepartmentID);

or for all statistics in the database:

EXEC sp_updatestats;

📝 Key Takeaways

Statistics store data distribution information to help the Query Optimizer.
When you create a Primary Key (or Index), SQL Server automatically creates statistics.
The Query Optimizer uses statistics to choose the best execution plan.
You can manually update statistics using UPDATE STATISTICS or sp_updatestats.

Would you like an example of how to analyze statistics for performance tuning? 😊

No comments: