About Me

My photo
I am an MCSE in Data Management and Analytics, specializing in MS SQL Server, and an MCP in Azure. With over 19+ years of experience in the IT industry, I bring expertise in data management, Azure Cloud, Data Center Migration, Infrastructure Architecture planning, as well as Virtualization and automation. I have a deep passion for driving innovation through infrastructure automation, particularly using Terraform for efficient provisioning. If you're looking for guidance on automating your infrastructure or have questions about Azure, SQL Server, or cloud migration, feel free to reach out. I often write to capture my own experiences and insights for future reference, but I hope that sharing these experiences through my blog will help others on their journey as well. Thank you for reading!

MCP Server Deployment Checklist

 # MCP Server Deployment Checklist


Use this checklist to ensure a successful deployment of your enterprise MCP Server.

## Pre-Deployment

### Prerequisites
- [ ] Azure CLI installed and configured (`az --version`)
- [ ] Terraform >= 1.5.0 installed (`terraform --version`)
- [ ] Docker installed (`docker --version`)
- [ ] Node.js >= 20.0.0 installed (`node --version`)
- [ ] Azure subscription with Owner or Contributor role
- [ ] Valid Azure Entra ID tenant

### Azure Entra ID Setup
- [ ] Run `setup-entra-id.ps1` or `setup-entra-id.sh`
- [ ] Save Tenant ID, Client ID, and Client Secret securely
- [ ] Grant admin consent for API permissions in Azure Portal
- [ ] Assign test users to the application
- [ ] (Optional) Configure application roles
- [ ] (Optional) Set up conditional access policies

### Configuration
- [ ] Update `terraform/terraform.tfvars` with your values
- [ ] Choose globally unique names for ACR and PostgreSQL
- [ ] Set strong PostgreSQL admin password
- [ ] Configure tags for resource management
- [ ] Review network configuration (address spaces, subnets)

### Security
- [ ] Obtain or generate SSL certificate for Application Gateway
- [ ] Place certificate in `terraform/cert.pfx`
- [ ] Set certificate password in variables
- [ ] Review NSG rules and adjust if needed
- [ ] Configure allowed CORS origins

## Deployment Phase

### Infrastructure Deployment
- [ ] Navigate to `terraform/` directory
- [ ] Run `terraform init`
- [ ] Review `terraform plan` output carefully
- [ ] Run `terraform apply` and confirm
- [ ] Verify all resources created successfully
- [ ] Save Terraform outputs (ACR, PostgreSQL FQDN, etc.)

### Application Deployment
- [ ] Navigate to `server/` directory
- [ ] Login to ACR: `az acr login --name <acr-name>`
- [ ] Build Docker image: `docker build -t mcpserver:latest .`
- [ ] Tag image for ACR
- [ ] Push image to ACR
- [ ] Verify image in ACR: `az acr repository list --name <acr-name>`

### Container App Update
- [ ] Update Container App with new image
- [ ] Wait for deployment to complete
- [ ] Check Container App status: `az containerapp show`
- [ ] Verify replicas are running

## Post-Deployment

### Verification
- [ ] Test health endpoint: `curl https://<ip>/health`
- [ ] Test readiness endpoint: `curl https://<ip>/ready`
- [ ] Test authentication with Azure CLI token
- [ ] Verify MCP SSE endpoint connection
- [ ] Check logs in Log Analytics
- [ ] Review Container App metrics

### DNS and SSL
- [ ] Create DNS A record pointing to Application Gateway IP
- [ ] Update Application Gateway with production SSL certificate
- [ ] Verify SSL certificate validity
- [ ] Test HTTPS connection
- [ ] Enable HTTP to HTTPS redirect

### Monitoring Setup
- [ ] Create Azure Monitor alerts for:
  - [ ] High error rate (>5%)
  - [ ] High response time (>2s)
  - [ ] Authentication failures
  - [ ] Low availability
  - [ ] High resource usage
- [ ] Configure action groups for notifications
- [ ] Create custom dashboard in Azure Portal
- [ ] Set up Log Analytics saved queries
- [ ] Test alert notifications

### Client Configuration
- [ ] Distribute client configuration to users
- [ ] Update `claude_desktop_config.json` with production URL
- [ ] Test client connection from multiple machines
- [ ] Verify authentication works for all users
- [ ] Document any troubleshooting steps

### Documentation
- [ ] Update internal wiki with deployment info
- [ ] Document server URL and configuration
- [ ] Create runbook for common issues
- [ ] Document escalation procedures
- [ ] Share monitoring dashboard links

## User Onboarding

### Azure Entra ID
- [ ] Assign users to MCP Server application
- [ ] Grant appropriate roles (Admin vs User)
- [ ] Configure group-based access if needed
- [ ] Test user authentication

### Training
- [ ] Provide client configuration guide to users
- [ ] Document how to get access tokens
- [ ] Explain available MCP tools and capabilities
- [ ] Share troubleshooting guide
- [ ] Set up support channel (Teams/Slack)

## Security Hardening

### Network
- [ ] Review and restrict NSG rules
- [ ] Enable private endpoints for all services
- [ ] Configure Application Gateway WAF to Prevention mode
- [ ] Review firewall rules
- [ ] Enable DDoS protection

### Access Control
- [ ] Implement principle of least privilege
- [ ] Review and remove unnecessary permissions
- [ ] Enable Azure AD PIM if available
- [ ] Configure conditional access policies
- [ ] Enable MFA for admin accounts

### Secrets
- [ ] Rotate client secrets
- [ ] Store all secrets in Key Vault
- [ ] Enable Key Vault soft delete
- [ ] Configure access policies
- [ ] Set up secret expiration alerts

### Compliance
- [ ] Enable audit logging
- [ ] Configure log retention per compliance requirements
- [ ] Set up log export to long-term storage
- [ ] Document data residency
- [ ] Review compliance with organizational policies

## Operational Readiness

### Backup and Recovery
- [ ] Configure PostgreSQL automated backups
- [ ] Test database restore procedure
- [ ] Document recovery time objective (RTO)
- [ ] Document recovery point objective (RPO)
- [ ] Create disaster recovery plan

### Cost Management
- [ ] Set up budget alerts
- [ ] Review resource SKUs for optimization
- [ ] Enable auto-shutdown for non-prod
- [ ] Tag all resources for cost allocation
- [ ] Schedule monthly cost review

### Maintenance
- [ ] Schedule regular update windows
- [ ] Document update procedures
- [ ] Create rollback plan
- [ ] Set up change management process
- [ ] Define SLA commitments

## Sign-Off

### Technical Review
- [ ] DevOps team approval
- [ ] Security team review completed
- [ ] Network team approval
- [ ] Database team verification

### Business Review
- [ ] Stakeholder notification sent
- [ ] User communication prepared
- [ ] Support team trained
- [ ] Documentation published
- [ ] Go-live date confirmed

### Final Checks
- [ ] All checklist items completed
- [ ] No critical issues outstanding
- [ ] Monitoring and alerts verified
- [ ] Support procedures documented
- [ ] Rollback plan tested

---

## Deployment Sign-Off

**Deployment Date**: _________________

**Deployed By**: _________________

**Reviewed By**: _________________

**Approval**: _________________

---

## Post-Go-Live

### Week 1
- [ ] Daily monitoring of logs and metrics
- [ ] User feedback collection
- [ ] Performance tuning as needed
- [ ] Address any issues immediately

### Week 2-4
- [ ] Continue monitoring
- [ ] Optimize based on usage patterns
- [ ] Scale resources if needed
- [ ] Document lessons learned

### Month 1+
- [ ] Regular maintenance schedule
- [ ] Monthly cost review
- [ ] Quarterly security review
- [ ] Annual disaster recovery test

No comments: