Test Flakiness Detection
Track and analyze flaky tests in your CI pipeline
Note: A flaky test is one that produces different results with the same code and test inputs. This guide explains how to integrate Cased’s flakiness detection into your workflow to identify flaky tests across your CI runs over time.
Understanding Flakiness Detection
How It Works
Cased analyzes test results across your CI pipeline runs to identify patterns of inconsistent behavior. For each test, we calculate:
Key Metrics
- Flakiness Score: Percentage of CI runs where the test result differed from the majority outcome (0-100%)
- Detection Pattern: How the flakiness manifests (e.g., random, timing-dependent)
- First/Last Detected: Timestamps of when flaky behavior was first and most recently observed
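The score definition above can be sketched in a few lines. This is an illustrative computation only (Cased calculates the score server-side from your uploaded reports); the function name and outcome labels are assumptions:

```python
from collections import Counter

def flakiness_score(outcomes):
    """Percentage of runs whose result differed from the majority outcome.

    `outcomes` is a list like ["pass", "fail", "pass", ...] collected
    across CI runs of the same test. Illustrative sketch only.
    """
    if not outcomes:
        return 0.0
    _, majority_count = Counter(outcomes).most_common(1)[0]
    return 100.0 * (len(outcomes) - majority_count) / len(outcomes)

# A test that failed in 2 of 10 runs has a 20% flakiness score.
print(flakiness_score(["pass"] * 8 + ["fail"] * 2))  # 20.0
```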
Detection Patterns
Common Flakiness Patterns
- Random: Results appear to be randomly inconsistent across different CI runs
- Timing: Test failures correlate with execution timing in certain CI environments
- Resource: Failures relate to system resource availability on different CI runners
- Order: Test result depends on execution order within the test suite
- Setup: Failures occur during test setup/teardown in specific CI environments
Flakiness Score
The flakiness score indicates how frequently a test produces inconsistent results across CI runs:
- High Flakiness (75-100%): Test results are highly unreliable across CI runs
- Medium Flakiness (40-74%): Test has significant stability issues in CI
- Low Flakiness (1-39%): Test occasionally produces inconsistent results
- Not Flaky (0%): Test produces consistent results
Language Integrations
Python
pytest Integration
Cased works out of the box with pytest. Configure the JSON reporter in your CI:
Warning: Make sure the JSON report captures all test metadata needed for analysis.
Recommended pytest Configuration
| Setting | Purpose |
|---|---|
| `json-report` | Required. Generates detailed test reports |
| `junit-xml` | Optional. Provides additional execution metadata |
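As a sketch, assuming the `pytest-json-report` plugin supplies the JSON reporter, a CI step might look like this (file names are illustrative):

```shell
# Install pytest plus the JSON report plugin (adjust to your setup)
pip install pytest pytest-json-report

# Required: JSON report. Optional: JUnit XML for extra execution metadata.
pytest --json-report --json-report-file=report.json \
       --junit-xml=results.xml
```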
JavaScript
Jest Integration
Jest provides built-in JSON reporter support. Add this to your package.json:
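A minimal sketch using Jest's built-in `--json` and `--outputFile` CLI options (the script name and output path are illustrative):

```json
{
  "scripts": {
    "test:ci": "jest --ci --json --outputFile=test-results.json"
  }
}
```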
Install required dependency:
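Assuming Jest itself is the only dependency needed (its JSON reporter is built in):

```shell
npm install --save-dev jest
```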
Tip: Using Jest’s built-in JSON reporter eliminates the need for additional reporter plugins.
Ruby
RSpec Integration
Add the JSON formatter to your .rspec file:
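A sketch of the .rspec contents: the first formatter keeps normal console output, and the second writes JSON to a file (the output path is illustrative):

```
--format progress
--format json --out rspec-results.json
```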
Rust
Cargo Test Integration
Rust’s built-in test framework can output JSON results:
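JSON output from Rust's built-in test harness is currently unstable, so this sketch assumes a nightly toolchain is available in your CI (the output file name is illustrative):

```shell
# Requires nightly: libtest's JSON format is behind -Z unstable-options
cargo +nightly test -- -Z unstable-options --format json > test-results.json
```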
Go
Go Test Integration
Go's testing package supports JSON output through the -json flag:
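For example, redirecting the JSON stream to a file (the file name is illustrative):

```shell
go test -json ./... > test-results.json
```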
For more detailed test output, you can use gotestsum:
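A sketch using gotestsum's `--jsonfile` flag, which writes the raw `go test -json` stream alongside its human-readable summary (the file name is illustrative):

```shell
gotestsum --jsonfile test-results.json ./...
```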
Tip: gotestsum provides richer test metadata and is recommended for better flakiness detection.
API Reference
Request Format
To send test results to Cased, make a POST request to our CI agent endpoint:
https://app.cased.com/api/v1/projects/{repository}/ci-agent/
Use the following format:
Authorization: include your Cased API token in the request headers.
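As a sketch, here is how such a request could be constructed with Python's standard library. The payload field names and the Bearer token scheme are assumptions for illustration, not Cased's documented schema; in practice you would send your test reporter's full output:

```python
import json
import urllib.request

# Illustrative payload; these field names are assumptions, not
# Cased's documented schema.
payload = {
    "ci_run_id": "12345",
    "branch": "main",
    "results": [{"name": "test_login", "outcome": "passed"}],
}

def build_request(repository, token):
    """Build (but do not send) the POST request to the CI agent endpoint."""
    url = f"https://app.cased.com/api/v1/projects/{repository}/ci-agent/"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # token scheme is an assumption
        },
        method="POST",
    )

req = build_request("my-org/my-repo", "YOUR_API_TOKEN")
print(req.full_url)
```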
CI Integration
GitHub Actions
Info: Here’s an example of how to integrate Cased with GitHub Actions to collect test results.
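A minimal workflow sketch, assuming a pytest project: the secret name (`CASED_API_TOKEN`), report file name, and upload step are illustrative, and the payload format should follow the API reference above:

```yaml
name: tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run tests with JSON report
        run: pytest --json-report --json-report-file=report.json
      - name: Upload results to Cased
        if: always()  # send results even when tests fail
        run: |
          curl -X POST \
            -H "Authorization: Bearer ${{ secrets.CASED_API_TOKEN }}" \
            -H "Content-Type: application/json" \
            --data @report.json \
            "https://app.cased.com/api/v1/projects/${{ github.repository }}/ci-agent/"
```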
Best Practices
Recommended Practices:
- Consistent Environments: Use consistent CI environments to reduce false positives from environment differences
- Complete Metadata: Include detailed test and environment information in your reports
- Regular Collection: Configure test result collection for all CI runs to build comprehensive data
- Monitor Trends: Watch for patterns in flakiness across different branches and environments
Troubleshooting
Warning: Common issues and solutions:
- Missing Data: Ensure all required fields are included in your JSON payload
- Authentication Errors: Verify your Cased API token is correctly configured
- Incomplete Results: Check that your test reporter is configured to capture all necessary metadata