Monitor deployments
Go beyond simple threshold alerts to detect meaningful performance changes and potential issues before they impact people using your product.
How it Works
Cased’s deployment monitoring uses AI analysis to detect anomalies and provide post-deploy observability:
- Baseline Establishment: Cased analyzes historical metrics to establish normal performance baselines for your applications
- Real-time Monitoring: During and after deployments, Cased continuously monitors key metrics against these baselines
- Anomaly Detection: AI algorithms detect significant deviations from normal patterns, not just threshold breaches
- Contextual Analysis: Cased correlates multiple metrics to provide meaningful insights about deployment health
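As a rough sketch of the baseline-and-deviation approach described above (a simplified illustration, not Cased's implementation; the sample data and multiplier are assumptions), a baseline can be derived from historical samples and a new sample flagged when it exceeds that baseline by a given factor:

```python
from statistics import mean

def establish_baseline(history: list[float]) -> float:
    """Baseline is the average of historical samples (simplified)."""
    return mean(history)

def detect_anomaly(current: float, baseline: float, multiplier: float = 1.25) -> bool:
    """Flag a deviation when the current value exceeds the baseline
    by the given multiplier (1.25x corresponds to MEDIUM severity below)."""
    return current > baseline * multiplier

# Hypothetical historical CPU utilization samples (percent)
cpu_history = [31.0, 29.5, 33.2, 30.8, 28.9, 32.1]
baseline = establish_baseline(cpu_history)

# A post-deploy sample well above baseline would be flagged
print(detect_anomaly(47.0, baseline))  # True: roughly 1.5x baseline
```

In practice, Cased correlates many such signals rather than judging any single metric in isolation.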
Monitored Metrics
Resource Metrics
- CPU Utilization: Detects unusual CPU spikes or sustained high usage
- Memory Usage: Monitors memory pressure and potential memory leaks
- Disk I/O: Tracks read/write latency and throughput changes
- Network Traffic: Monitors network utilization and packet loss
Application Metrics
- Error Rates: Detects increases in 4xx and 5xx HTTP errors
- Response Times: Monitors API latency and response time degradation
- Throughput: Tracks request volume and processing rates
- Queue Depths: Monitors message queue backlogs and processing delays
Infrastructure Metrics
- Load Balancer Health: Monitors healthy/unhealthy host counts
- Database Performance: Tracks query latency and connection counts
- Cache Hit Rates: Monitors cache performance and effectiveness
- Container Health: Tracks container restart rates and resource limits
Anomaly Detection Thresholds
Cased uses intelligent thresholds based on statistical analysis:
Severity Levels
- Medium (1.25x baseline): 25% increase from normal - worth investigating
- High (1.5x baseline): 50% increase from normal - significant issue
- Critical (2.0x baseline): 100% increase from normal - urgent attention required
Metric-Specific Thresholds
- CPU/Memory: Resource constraints that could impact performance
- Error Rates: More sensitive thresholds since small increases matter
- Latency: Response time increases that affect user experience
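As a concrete illustration of these tiers (a hypothetical helper, not Cased's code), a metric's current value can be mapped to a severity by comparing its ratio to baseline, with a tighter set of multipliers standing in for the more sensitive error-rate thresholds:

```python
def classify_severity(current: float, baseline: float,
                      thresholds: dict[str, float] | None = None) -> str | None:
    """Map the ratio of the current value to baseline onto a severity level."""
    if baseline <= 0:
        return None
    thresholds = thresholds or {"critical": 2.0, "high": 1.5, "medium": 1.25}
    ratio = current / baseline
    for level in ("critical", "high", "medium"):
        if ratio >= thresholds[level]:
            return level
    return None

# Default tiers: 60% above baseline falls in the "high" band
print(classify_severity(16.0, 10.0))  # high

# Hypothetical tighter tiers for error rates, where small increases matter
error_thresholds = {"critical": 1.5, "high": 1.25, "medium": 1.1}
print(classify_severity(1.2, 1.0, error_thresholds))  # medium
```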
Supported Data Sources
AWS CloudWatch
- Native integration with AWS services
- Comprehensive metric coverage for EC2, RDS, ELB, Lambda, and more
- Custom metrics and dashboards
- Automated alerting and dashboard creation
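For example, the kind of CloudWatch data that feeds this analysis can be pulled with boto3. This is an illustrative query (the instance ID and time window are placeholders), not how Cased itself connects to CloudWatch:

```python
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch")

# Average CPU utilization for one EC2 instance over the last hour,
# in 5-minute buckets (the instance ID is a placeholder).
now = datetime.now(timezone.utc)
response = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    Period=300,
    Statistics=["Average"],
)

for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 2))
```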
Datadog
- Full-stack monitoring across infrastructure and applications
- Custom metrics and synthetic monitoring
- APM (Application Performance Monitoring) integration
- Log correlation and analysis
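Likewise, a Datadog timeseries can be fetched with the `datadog` Python client; the metric query and environment variable names here are placeholders for illustration only:

```python
import os
import time
from datadog import initialize, api

# API and application keys are read from the environment (placeholder names).
initialize(api_key=os.environ["DD_API_KEY"], app_key=os.environ["DD_APP_KEY"])

# Average CPU usage across all hosts for the last hour.
end = int(time.time())
start = end - 3600
result = api.Metric.query(start=start, end=end, query="avg:system.cpu.user{*}")

for series in result.get("series", []):
    print(series["metric"], len(series["pointlist"]), "datapoints")
```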
Getting Started
To use deployment monitoring:
- Connect Data Sources: Ensure your CloudWatch or Datadog integration is configured
- Deploy with Monitoring: Cased automatically monitors deployments when data sources are connected
- Review Anomalies: Check the Mission Control dashboard for detected anomalies
Example Monitoring Scenarios
Post-Deployment Monitoring
After deploying version 2.1.3:
- CPU usage increased 45% above baseline (HIGH severity)
- Error rate increased 150% (CRITICAL severity)
- Memory usage within normal range
- Database latency increased 30% (MEDIUM severity)
Recommendation: Investigate error rate spike and CPU usage
Resource Anomaly Detection
Deployment monitoring detected:
- Memory usage consistently 80% above baseline
- Potential memory leak in new code
- Container restart rate increased 3x
- Database connection pool exhaustion
Action: Rollback recommended, investigate memory management
Performance Degradation
Gradual performance degradation detected:
- API response times increased 60% over 2 hours
- Database query latency doubled
- Cache hit rate decreased 25%
- No error rate increase
Analysis: Database performance issue, possibly query optimization needed
Best Practices
Monitoring Strategy
- Establish Baselines: Allow sufficient time for baselines to form before relying on anomaly detection
- Gradual Rollouts: Use canary deployments to limit blast radius of issues
- Multiple Metrics: Don’t rely on a single metric - correlate multiple data points
- Regular Reviews: Periodically review anomaly detection accuracy and adjust thresholds
Integration with CI/CD
Cased’s deployment monitoring integrates seamlessly with your existing CI/CD pipeline:
- Automatic Activation: Monitoring starts automatically when deployments are detected
- Status Reporting: Deployment status updated based on monitoring results
- Rollback Triggers: Can trigger automated rollbacks based on anomaly severity
- Pipeline Integration: Works with GitHub Actions
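One way to picture a rollback trigger in a pipeline (a hypothetical gate, not a documented Cased interface) is a step that fails the workflow - and thereby kicks off a rollback job - when any detected anomaly reaches a chosen severity:

```python
import sys

# Severities ordered from least to most urgent, matching the levels above.
SEVERITY_ORDER = {"medium": 1, "high": 2, "critical": 3}

def should_roll_back(anomalies: list[dict], threshold: str = "critical") -> bool:
    """Return True if any detected anomaly is at or above the threshold severity."""
    limit = SEVERITY_ORDER[threshold]
    return any(SEVERITY_ORDER[a["severity"]] >= limit for a in anomalies)

# Hypothetical anomalies reported after a deploy
detected = [
    {"metric": "error_rate", "severity": "critical"},
    {"metric": "cpu_utilization", "severity": "high"},
]

if should_roll_back(detected):
    print("Anomaly at or above critical severity - failing step to trigger rollback")
    sys.exit(1)
```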