About Me

My photo
I am MCSE in Data Management and Analytics with specialization in MS SQL Server and MCP in Azure. I have over 13+ years of experience in IT industry with expertise in data management, Azure Cloud, Data-Canter Migration, Infrastructure Architecture planning and Virtualization and automation. Contact me if you are looking for any sort of guidance in getting your Infrastructure provisioning automated through Terraform. I sometime write for a place to store my own experiences for future search and read by own blog but can hopefully help others along the way. Thanks.

SQL Server Architecture

SQL Server Architecture 




 


what are the different components of relational engine in SQL Server Architecture

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The relational engine is a component of the SQL Server database engine that manages the storage and retrieval of data

stored in a relational database. The relational engine comprises several components, including:

Query Processor:

~~~~~~~~~~~~~~~~~ 

The query processor is responsible for interpreting and optimizing SQL queries.

It parses SQL queries, performs query optimization and generates execution plans for efficient query processing.

Access Methods:

~~~~~~~~~~~~~~~~

Access methods are used to retrieve data from tables and indexes stored in the database. 

SQL Server uses different access methods like B-Tree, Heap, Full-text search, and XML Index Access.

Transaction Manager:-

~~~~~~~~~~~~~~~

 The transaction manager is responsible for ensuring the consistency and durability of transactions.

It manages transactions by maintaining the ACID properties (Atomicity, Consistency, Isolation, and Durability) of the database.

Lock Manager:-

~~~~~~~~~~~~~~

 The lock manager is responsible for managing locks on resources in the database. 

It provides concurrency control by ensuring that multiple transactions can access the same resource without interfering with each other.

Buffer Manager:-

~~~~~~~~~~~~~~~ 

The buffer manager is responsible for managing the in-memory cache of data pages. It caches frequently accessed data pages in memory to reduce the number of physical I/O operations required for data access.

Database Manager:-

~~~~~~~~~~~~~~~ 

The database manager is responsible for managing the physical storage of data in the database. It manages the creation, modification, and deletion of database objects like tables, indexes, and views.

Memory Manager: 

~~~~~~~~~~~~~~~~

The memory manager is responsible for managing the memory allocation and deallocation for SQL Server processes. It manages the buffer pool, query workspace memory, and other memory structures used by the database engine.


Together, these components of the relational engine work together to provide a powerful and efficient database management system.

Storage engine in SQL Server 2019 :-

~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In Microsoft SQL Server 2019, the storage engine is a core component of the relational database management system (RDBMS)

that is responsible for managing the storage and retrieval of data in the database. 

The storage engine interacts with the underlying file system to read and write data to disk or memory, 

and provides an interface for the SQL Server Database Engine to manage the data.

what are different components of Storage engine in SQL Server

The storage engine in SQL Server is comprised of several different components that work together to manage the storage 

and retrieval of data. Some of the key components of the SQL Server storage engine include:

Data files: These are the physical files that store the data in the database. In SQL Server, data files can be divided 

into filegroups, which can be used to manage storage and backup operations.

Transaction log: The transaction log is a file that stores a record of all changes to the data in the database. 

This log is used to support recovery operations and ensure data consistency.

Buffer pool: The buffer pool is a cache of frequently accessed data pages in memory. 

The buffer pool helps improve performance by reducing the need for disk I/O operations.

Lock manager: The lock manager is responsible for managing concurrency control in the database.

 It ensures that multiple users can access and modify the data in the database without causing conflicts.

Query processor: The query processor is responsible for processing queries against the data in the database.

It optimizes queries and generates execution plans to ensure optimal performance.

Indexes: Indexes are used to speed up queries by providing quick access to data. 

SQL Server supports several types of indexes, including clustered and non-clustered indexes.

Full-text search: Full-text search is a feature that allows users to search for text within documents stored in the database.

These components work together to provide a comprehensive storage and retrieval system for data in SQL Server. By managing the storage and retrieval of data, the storage engine ensures that data is available to applications when it is needed, while also ensuring data consistency and integrity.

what is buffer Pool

~~~~~~~~~~~~~~~~~~~~

In Microsoft SQL Server, the buffer pool is a memory cache that stores data pages and index pages from disk. 

The buffer pool is managed by the SQL Server Database Engine and is used to improve the performance of database operations 

by reducing the need to read data from disk.


When SQL Server reads data from disk, it places the data pages into the buffer pool in memory. 

When an application requests data, SQL Server checks the buffer pool first to see if the requested data is already in memory.

If the data is in the buffer pool, SQL Server can retrieve the data quickly without needing to read from disk. 

This process is known as a buffer hit.


If the requested data is not in the buffer pool, SQL Server must read the data from disk and place it into the buffer pool. 

This process is known as a buffer miss. 

As more data is read from disk and placed into the buffer pool, the likelihood of a buffer hit increases, 

improving the performance of the database.


The buffer pool size can be configured in SQL Server, 

allowing database administrators to allocate memory resources to optimize database performance. 

However, it's important to balance the size of the buffer pool with the available system resources and other applications running 

on the server. 

If the buffer pool is too small, SQL Server may need to read data from disk more frequently, reducing performance.

If the buffer pool is too large, it may compete with other applications for system memory, also reducing performance.


what is data cache

~~~~~~~~~~~~~~~~~

In Microsoft SQL Server, the data cache is a memory structure that stores recently accessed data pages from the database 

on the server. 

The data cache is a subset of the buffer pool, which is a larger memory cache that stores both data pages and index pages.

When SQL Server reads data from a database, it first checks to see if the data is in the data cache. 

If the data is in the cache, SQL Server retrieves it from memory instead of reading it from disk,

which can improve performance. If the data is not in the cache, 

SQL Server reads it from disk and stores it in the cache for future use.

The data cache is dynamically managed by SQL Server to ensure that the most frequently accessed data is kept in memory. 

SQL Server uses a Least Recently Used (LRU) algorithm to determine which data pages to remove from the cache when the cache

is full. 

The LRU algorithm removes the least recently used data pages from the cache first, making room for new data pages to be cached.

The size of the data cache can be configured in SQL Server,

 allowing database administrators to allocate memory resources to optimize database performance. 

However, it's important to balance the size of the data cache with the available system resources and other applications 

running on the server.

If the data cache is too small, SQL Server may need to read data from disk more frequently, reducing performance. 

If the data cache is too large, it may compete with other applications for system memory, also reducing performance.


what is dirty page in SQl Server

---------------------------------


In Microsoft SQL Server, a dirty page refers to a data page in the buffer pool that has been modified but not yet written to disk. 

When data is modified in SQL Server, the changes are made in memory in the buffer pool. 

If the data is part of a table or index that is stored on disk, the modified data page is marked as dirty in the buffer pool.

SQL Server uses a mechanism called write-ahead logging to ensure that modifications to the database are written to disk in a 

consistent manner. 

The modified data page is first written to the transaction log, 

which is a file that contains a record of all changes made to the database. 

The transaction log is written to disk before the modified data page is written to disk.

This ensures that the modifications can be recovered in the event of a system failure or crash.

When SQL Server writes a dirty page to disk, it is said to be flushed.

The process of flushing dirty pages to disk is called a checkpoint.

During a checkpoint, SQL Server writes all dirty pages from memory to disk and updates the metadata in the 

database to reflect the changes.

It's important to manage dirty pages in SQL Server to ensure that data is written to disk in a timely manner 

and to avoid excessive memory usage. 

If there are too many dirty pages in memory, 

SQL Server may need to flush them frequently, which can reduce performance. 

If there are not enough dirty pages in memory, SQL Server may need to read data from disk more frequently, 

also reducing performance.


Plan cache

-------------------

In Microsoft SQL Server, the plan cache is a memory structure that stores the execution plans for SQL queries and 

stored procedures. 

When a query or stored procedure is executed, SQL Server generates an execution plan, 

which is a set of instructions that describe how the query should be executed.

The execution plan is stored in the plan cache so that it can be reused the next time the query is executed.

The plan cache is an important component of SQL Server's query processing engine, as it allows SQL Server to 

reuse execution plans and avoid the overhead of generating a new plan each time a query is executed. 

Reusing execution plans can significantly improve the performance of SQL Server, 

as generating a new execution plan can be a time-consuming process.

The plan cache is managed dynamically by SQL Server, and the plans in the cache are periodically aged out or removed 

to make room for new plans. SQL Server uses a number of factors to determine which plans to remove from the cache, 

including the frequency of use, the size of the plan, and the available memory on the server.

Database administrators can monitor the plan cache using various SQL Server performance monitoring tools, 

and can clear the cache manually if necessary. However, clearing the plan cache should be done with caution, 

as it can cause a temporary increase in CPU and memory usage while new execution plans are generated.


What is Buffer Manager

------------------------

In Microsoft SQL Server, the buffer manager is a component of the database engine that manages the buffer pool, 

which is a memory cache that stores recently accessed data and index pages from the database on the server.

The buffer manager is responsible for ensuring that the most frequently accessed data is kept in memory,

which can improve the performance of SQL Server by reducing the number of times data needs to be read from disk.


The buffer manager uses a least recently used (LRU) algorithm to manage the contents of the buffer pool.

When data is read from the database, SQL Server checks to see if the data is in the buffer pool.

If the data is not in the buffer pool, SQL Server reads it from disk and stores it in the buffer pool.

If the buffer pool is full, the buffer manager uses the LRU algorithm to determine which data or index pages to 

remove from the buffer pool to make room for new pages.

The buffer manager is also responsible for managing dirty pages, which are data pages that have been modified 

but not yet written to disk. 

SQL Server uses a write-ahead logging mechanism to ensure that modifications to the database are written to disk

in a consistent manner.

When a dirty page needs to be written to disk, the buffer manager is responsible for flushing the page to disk,

which is known as a checkpoint.

Database administrators can monitor the buffer manager using various SQL Server performance monitoring tools to 

ensure that the buffer pool is properly sized and configured for optimal performance. 

The buffer manager is an important component of SQL Server's query processing engine, 

and its proper configuration and management can significantly improve the performance of SQL Server.


what is checkpoint

-------------------


In SQL Server, a checkpoint is a process that writes all dirty pages (modified pages that are still in memory and have 

not yet been written to disk) from the buffer cache to disk. 

Checkpoints are a critical component of the SQL Server recovery process, 

as they ensure that all changes made to the database are written to disk and available for recovery in the event of a failure.


During normal database operation, SQL Server writes changes to the database to the transaction log in a process known as write-ahead logging. 

However, it is not efficient to write every change immediately to disk, as this would result in a lot of disk I/O overhead. Instead, SQL Server keeps modified pages in memory until they can be written to disk in a batch process.


Checkpoints are triggered automatically by SQL Server in several situations, including:


When the transaction log reaches a certain size.

When a database is shut down cleanly.

When the recovery interval specified by the server configuration is reached.

Once a checkpoint is triggered, SQL Server writes all dirty pages to disk and updates the log to reflect the fact that the changes have been written to disk. This ensures that the changes are durable and can be recovered in the event of a failure.


The frequency of checkpoints can be configured in SQL Server using the recovery interval setting.

A shorter recovery interval will result in more frequent checkpoints and more frequent disk I/O, but will also ensure that changes are written to disk more quickly. A longer recovery interval will result in fewer checkpoints and less frequent disk I/O, but may increase the risk of data loss in the event of a failure.



What is least recently used Machanism

--------------------------------

Least Recently Used (LRU) is a mechanism used in computer memory caching systems to determine which items should be removed 

from the cache when the cache becomes full and new items need to be added. 

The basic idea behind LRU is that the items that have not been accessed for the longest period of time are the least likely 

to be accessed in the future, and so they can be safely removed from the cache.

In the context of database management systems, LRU can be used to manage the buffer pool, which is a cache of frequently 

accessed data pages in memory. 

When a query needs to access a data page that is not in the buffer pool, the buffer manager needs to find a page 

to evict from the pool to make room for the new page. LRU is one of several algorithms that can be used to determine which page to evict.


In an LRU-based buffer pool, each data page is assigned a timestamp indicating the last time it was accessed. 

When a new page needs to be added to the pool and there is no room, 

the buffer manager selects the page with the oldest timestamp for eviction. 

This ensures that the least recently used pages are the first to be evicted,

 which is likely to be the pages that are least important for query performance.


There are variations of LRU, such as least frequently used (LFU), which takes into account the frequency of page access 

in addition to the recency. 

LRU is a commonly used algorithm in database management systems because it is simple, 

easy to implement, and effective at managing the buffer pool to maximize query performance.


What is write ahead logging

-------------------------


Write-Ahead Logging (WAL) is a technique used in database management systems, including SQL Server, 

to ensure the durability of transactions. 

The basic idea behind WAL is that changes to the database are first written to a log file on disk before they are written to the

 database itself.


In SQL Server, when a transaction modifies data in the database, the changes are first written to the transaction log,

which is a sequential file on disk. This ensures that the log is always up-to-date and reflects all changes to the database.

Once the changes are written to the transaction log, they can be safely applied to the database itself.


The WAL technique provides several benefits to SQL Server, including:


Durability: Since the transaction log is written before changes are made to the database, it ensures that transactions are durable even in the event of a system failure or crash.


Rollback: The transaction log can be used to undo changes made by a transaction, enabling rollbacks in case of an error or user rollback request.


Recovery: The transaction log can be used to restore the database to a consistent state in the event of a failure or system crash.


In summary, Write-Ahead Logging (WAL) is a critical component of the SQL Server architecture that ensures transaction durability, enables rollbacks, and facilitates recovery in the event of a system failure.


What is Access Method in the Architecture

---------------------------------------------


In SQL Server, an access method is a component of the relational engine that is responsible for retrieving data from tables and 

indexes stored in the database. 

The SQL Server access methods provide a layer of abstraction between the logical data model and the physical storage model 

of the database.


The primary access methods used by SQL Server are:

Heap Access: A heap is an unordered set of data pages that contain data records without any specific order. 

SQL Server uses a heap access method to scan and retrieve data from tables that do not have a clustered index.

B-Tree Access: A B-Tree is a balanced tree data structure that is used to index data in a table. 

SQL Server uses a B-Tree access method to retrieve data from tables with clustered indexes or non-clustered indexes.

Full-Text Search Access: Full-text search is a technology used to search text-based data efficiently.

SQL Server uses a full-text search access method to retrieve data from tables that contain large amounts of text-based data.

XML Index Access: SQL Server uses an XML index access method to retrieve data from tables that have XML columns. 

XML indexes are used to optimize queries that involve XML data.

The choice of access method depends on various factors such as the size of the database, the nature of the data, 

the frequency of access, and the type of processing required. A well-designed access method can significantly improve the performance and efficiency of a database.

Transaction Manager:-

In SQL Server architecture, the transaction manager is responsible for ensuring the Atomicity, Consistency,

Isolation, and Durability (ACID) properties of transactions. 

The transaction manager manages transactions at the database level and is responsible for the following:


Coordinating the beginning and end of transactions: The transaction manager receives requests from the client 

applications to begin a transaction and ensures that the transaction is started successfully. Once the transaction is started,

the transaction manager keeps track of the transaction until it is completed or rolled back.

Ensuring atomicity: The transaction manager ensures that transactions are treated as a single unit of work and that all changes made within a transaction are either committed together or rolled back together in case of a failure. It ensures that a transaction is either completely successful or completely unsuccessful, without any partial or incomplete changes to the database.

Maintaining consistency: The transaction manager ensures that the database remains in a consistent state throughout the transaction. It ensures that the integrity constraints, triggers, and other business rules defined on the database are satisfied before and after the transaction.

Providing isolation: The transaction manager ensures that each transaction executes in isolation from other concurrent transactions, using various locking and concurrency control mechanisms to prevent interference between transactions.

Ensuring durability: The transaction manager ensures that the changes made to the database by a committed transaction are persisted and durable even in the event of a system failure or crash.

Managing transaction logs: The transaction manager maintains a log of all transactions, including the changes made to the database, to ensure that they can be rolled back or recovered in case of a failure.

In SQL Server, the transaction manager is a key component of the relational engine and provides the necessary mechanisms to ensure transactional consistency and integrity in the database.


what is query optimizer

Query optimizer is a component of the relational engine in SQL Server that is responsible for generating an optimized 

execution plan for a given SQL query. Its main goal is to generate an execution plan that will execute the query with the 

lowest possible cost in terms of time and resources.


The query optimizer is a complex algorithm that analyzes the query, considers various execution plans, 

and chooses the most efficient one. It takes into account various factors such as the availability of indexes, 

data distribution, query complexity, and system resources to generate the best possible execution plan.


The optimizer works in two stages: the parsing stage and the optimization stage. 

In the parsing stage, the query is parsed and converted into a parse tree.

In the optimization stage, the parse tree is analyzed, and various execution plans are generated and compared based 

on their estimated cost. 

The optimizer then chooses the best execution plan and generates the optimized query.


Query optimization is an important aspect of SQL Server performance tuning,

as a poorly optimized query can cause performance problems and slow down the entire database system. 

Therefore, it is important to understand how the query optimizer works and to design queries that can be optimized efficiently.


what is command parser in sql server architecture

The command parser is a component of the relational engine in SQL Server that is responsible for parsing and validating

SQL commands. When a user submits a SQL command, the command parser first checks the syntax of the command to ensure

that it is valid. 

If the syntax is correct, the parser then verifies that the objects referenced in the command (such as tables, columns, or stored procedures) exist and are accessible to the user.

The command parser is the first step in processing any SQL command, and its role is crucial in ensuring that the command 

is properly structured and valid. 

If the command parser detects any errors, it will generate an error message and prevent the command from executing.

The command parser is a critical component of the SQL Server architecture, 

and its role is to ensure the integrity and security of the database by preventing invalid or unauthorized commands

 from being executed. In addition, the parser helps to optimize query performance by validating the syntax and structure 

 of the command, which allows the query optimizer to generate the most efficient execution plan.


No comments: