
Test Scenario Definition: from business to script

How to translate business requirements into concrete and executable performance test scenarios.

"Just test it to see if it holds up." That's not a test scenario. A real scenario has an objective, context, success criteria, and an execution plan. This article teaches you how to transform vague requirements into concrete and meaningful test scenarios.

A good test scenario answers: What, Why, How, How much, and For how long.

Anatomy of a Scenario

Essential components

Test Scenario:
  Name: [Clear identification]
  Objective: [What we want to discover]
  Context: [When/why this test is relevant]
  Pre-conditions: [Required initial state]
  Workload: [Load to be applied]
  Duration: [For how long]
  Metrics: [What to measure]
  Success criteria: [When it passes]
  Failure criteria: [When to stop]

Complete example

Name: Peak Hour Load Test - Checkout Flow
Objective: Validate that checkout supports Black Friday peak
Context: Preparation for event in 30 days

Pre-conditions:
  - Staging environment with prod data (anonymized)
  - Cache warmed with top 1000 products
  - DB with production-like volume

Workload:
  Virtual users: 5000
  Ramp-up: 30 minutes
  Think time: 20-60s (log-normal)
  Distribution:
    - Browse: 60%
    - Search: 25%
    - Checkout: 15%

Duration:
  - Ramp-up: 30 min
  - Steady state: 2 hours
  - Ramp-down: 10 min

Metrics:
  - Latency p50, p95, p99 per endpoint
  - Throughput (req/s)
  - Error rate
  - CPU, Memory, DB connections

Success criteria:
  - Checkout p95 < 3s
  - Error rate < 1%
  - Zero 5xx in checkout
  - CPU < 70%

Failure criteria (stop test):
  - Error rate > 10%
  - p99 > 30s
  - Timeout in > 5% of requests
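
Translated into a load-generation tool, the workload above could look roughly like this Locust sketch (endpoints, payloads and the log-normal parameters are assumptions, not part of the original spec):

```python
import math
import random

from locust import HttpUser, task


class CheckoutFlowUser(HttpUser):
    """Browse/Search/Checkout mix from the workload above; endpoints are assumed."""

    def wait_time(self):
        # Log-normal think time centred near 30s, clamped to the 20-60s range
        # (mu and sigma here are assumed values).
        return min(max(random.lognormvariate(math.log(30), 0.4), 20), 60)

    @task(60)  # Browse: 60%
    def browse(self):
        self.client.get("/products")

    @task(25)  # Search: 25%
    def search(self):
        self.client.get("/search", params={"q": "tv"})

    @task(15)  # Checkout: 15%
    def checkout(self):
        self.client.post("/checkout", json={"cart_id": "demo"})
```

Something like `locust -f scenario.py --headless --users 5000 --spawn-rate 3 --run-time 2h40m` would approximate the 30-minute ramp-up plus two hours of steady state.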

Types of Scenarios

1. Load Test (Normal load)

Objective: Validate performance under expected load

Characteristics:
  - Load within expectations
  - Extended duration (1-4 hours)
  - Focus on stability

Example:
  Name: Daily Peak Load
  Load: 2x daily average
  Duration: 2 hours
  Criterion: No degradation over time

2. Stress Test (Beyond limits)

Objective: Find the breaking point

Characteristics:
  - Increasing load until failure
  - Identify first bottleneck
  - Document behavior under stress

Example:
  Name: Breaking Point Discovery
  Load: Increment 10% every 10 minutes
  Duration: Until failure
  Criterion: Identify maximum sustainable load
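
As a sketch, the increasing load can be expressed with a custom load shape in Locust (the 1000-user baseline and the spawn rate are assumed values):

```python
from locust import LoadTestShape


class BreakingPointShape(LoadTestShape):
    """Step load: start from a baseline and add 10% every 10 minutes until stopped."""

    baseline_users = 1000   # assumed starting load
    step_seconds = 600      # 10 minutes per step
    spawn_rate = 50

    def tick(self):
        step = int(self.get_run_time() // self.step_seconds)
        users = int(self.baseline_users * 1.10 ** step)
        # Keep growing; the operator (or the failure criteria) stops the run.
        return users, self.spawn_rate
```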

3. Spike Test (Sudden peak)

Objective: Validate response to sudden peaks

Characteristics:
  - Abrupt load increase
  - Short duration
  - Focus on recovery

Example:
  Name: Flash Sale Spike
  Load: 0 → 10x normal in 1 minute
  Peak duration: 5 minutes
  Criterion: Recovers in < 2 min after peak
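
A possible Locust load shape for this profile, assuming 500 users as the "1x normal" level:

```python
from locust import LoadTestShape


class FlashSaleSpike(LoadTestShape):
    """0 -> 10x normal in 1 minute, hold for 5 minutes, then drop back to normal."""

    normal_users = 500   # assumed "1x normal" level
    peak_users = 5000    # 10x normal
    spawn_rate = 100     # users started/stopped per second

    def tick(self):
        t = self.get_run_time()
        if t < 60:                      # 1-minute ramp to the peak
            return int(self.peak_users * t / 60), self.spawn_rate
        if t < 60 + 5 * 60:             # 5 minutes at the peak
            return self.peak_users, self.spawn_rate
        if t < 60 + 5 * 60 + 2 * 60:    # recovery window back at normal load
            return self.normal_users, self.spawn_rate
        return None                     # stop the test
```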

4. Soak Test (Endurance)

Objective: Identify long-term problems

Characteristics:
  - Moderate load
  - Very long duration (8-24h)
  - Focus on memory leaks, resource exhaustion

Example:
  Name: 24h Endurance Test
  Load: 1x daily average
  Duration: 24 hours
  Criterion: No degradation, no memory leak
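
One way to make the "no memory leak" criterion checkable is to fit a trend line to memory samples collected during the run. A sketch, assuming samples every 5 minutes and a 10 MB/h tolerance (both assumed values):

```python
import statistics


def looks_like_memory_leak(memory_mb: list[float], sample_interval_min: float = 5.0,
                           max_growth_mb_per_hour: float = 10.0) -> bool:
    """Fit a line to memory samples from the soak run and flag sustained growth."""
    hours = [i * sample_interval_min / 60 for i in range(len(memory_mb))]
    slope, _intercept = statistics.linear_regression(hours, memory_mb)
    return slope > max_growth_mb_per_hour
```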

5. Capacity Test (Planning)

Objective: Determine maximum capacity

Characteristics:
  - Multiple configurations tested
  - Capacity curve
  - Input for capacity planning

Example:
  Name: Capacity Planning Q4
  Tests: 1x, 2x, 3x, 4x, 5x baseline
  Output: Capacity vs resources chart
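
A sketch of how the five runs could be driven, assuming the Locust scenario file from the earlier example and standard Locust CLI options; each run leaves a CSV that feeds the capacity chart:

```python
import subprocess

BASELINE_USERS = 1000  # assumed 1x baseline

for multiplier in (1, 2, 3, 4, 5):
    users = BASELINE_USERS * multiplier
    # Each run writes capacity_<n>x_stats.csv; plot throughput and p95 against load afterwards.
    subprocess.run([
        "locust", "-f", "scenario.py", "--headless",
        "--users", str(users), "--spawn-rate", str(max(users // 60, 1)),
        "--run-time", "30m", "--csv", f"capacity_{multiplier}x",
    ], check=True)
```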

From Requirement to Scenario

Step 1: Understand the business requirement

Vague requirement:
  "The system needs to handle Black Friday"

Questions to clarify:
  1. How many users do we expect?
     → "3x normal peak, ~150K simultaneous"

  2. Which features are critical?
     → "Search and checkout. Browse can degrade."

  3. What latency is acceptable?
     → "Checkout < 5s, search < 2s"

  4. How long does the event last?
     → "Peak of 4 hours, 6-10h total"

  5. What is considered failure?
     → "Any error in checkout"

Step 2: Translate to technical metrics

Requirement: "150K simultaneous users"

Translation:
  - Average think time: 30s
  - Page views: 150K users / 30s think time = 5K pages/s
  - Requests per page: 8
  - Total: 5K × 8 = 40K req/s on the API
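
The same translation as a back-of-envelope calculation:

```python
concurrent_users = 150_000
avg_think_time_s = 30
requests_per_page = 8

page_views_per_s = concurrent_users / avg_think_time_s       # 150K / 30s = 5,000 pages/s
api_requests_per_s = page_views_per_s * requests_per_page    # 5,000 x 8 = 40,000 req/s

print(f"{page_views_per_s:,.0f} pages/s -> {api_requests_per_s:,.0f} req/s on the API")
```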

Step 3: Define workload model

Based on production analytics:

Action distribution (share of sessions performing each action):
  home: 100%
  search: 45%
  product_view: 60%
  add_cart: 12%
  checkout: 4%
  payment: 3%

Test data:
  - Pool of 10,000 products
  - 1,000 real search terms
  - Unique user IDs
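
A sketch of how this model can drive a session simulation, with placeholder test-data pools of the sizes listed above; treating the funnel steps as independent probabilities is a simplification:

```python
import random

# Assumed test-data pools matching the sizes above (IDs and terms are placeholders).
products = [f"sku-{i}" for i in range(10_000)]
search_terms = [f"term-{i}" for i in range(1_000)]

# Probability that a session performs each action, from the distribution above.
FUNNEL = [
    ("search", 0.45),
    ("product_view", 0.60),
    ("add_cart", 0.12),
    ("checkout", 0.04),
    ("payment", 0.03),
]


def simulate_session() -> list[str]:
    """Every session opens the home page; each later step fires with its own probability."""
    actions = ["home"]
    for action, probability in FUNNEL:
        if random.random() < probability:
            actions.append(action)
    return actions
```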

Step 4: Define success criteria

SLOs derived from requirement:

Latency:
  - GET /search: p95 < 2s
  - GET /product: p95 < 1s
  - POST /checkout: p95 < 5s
  - POST /payment: p95 < 3s

Availability:
  - Overall error rate: < 1%
  - Checkout error rate: < 0.1%
  - Zero timeout in payment

Resources:
  - CPU < 80%
  - Memory < 85%
  - DB connections < 90% of pool
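
These latency SLOs can be turned into an automatic pass/fail check. A minimal sketch over per-endpoint latency samples:

```python
import statistics

# p95 thresholds in seconds, taken from the SLOs above.
LATENCY_SLOS = {
    "GET /search": 2.0,
    "GET /product": 1.0,
    "POST /checkout": 5.0,
    "POST /payment": 3.0,
}


def p95(samples: list[float]) -> float:
    # statistics.quantiles with n=100 returns 99 cut points; index 94 is the 95th percentile.
    return statistics.quantiles(samples, n=100)[94]


def evaluate(latencies: dict[str, list[float]]) -> dict[str, bool]:
    """Pass/fail per endpoint; any False fails the scenario."""
    return {endpoint: p95(latencies[endpoint]) < slo for endpoint, slo in LATENCY_SLOS.items()}
```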

Scenario Template

# Scenario: [Scenario Name]

## 1. Context
**Event/Motivation**: [Why this test]
**Planned date**: [When it will run]
**Environment**: [Staging/Perf/Prod-like]

## 2. Objective
**Main question**: [What we want to discover]
**Hypothesis**: [What we expect to find]

## 3. Scope
**Features tested**:
- [ ] Feature A
- [ ] Feature B

**Features excluded**:
- [ ] Feature C (reason)

## 4. Workload

### Load profile
| Parameter | Value |
|-----------|-------|
| Virtual Users | X |
| Ramp-up | X min |
| Steady state | X min |
| Think time | X-Y s |

### Action distribution
| Action | % | Req/s |
|--------|---|-------|
| Action A | X% | X |
| Action B | Y% | Y |

### Test data
- Product pool: X items
- Search terms: X unique
- Users: X unique

## 5. Infrastructure

### Test environment
| Component | Spec |
|-----------|------|
| App servers | X × [size] |
| Database | [type], [size] |
| Cache | [type], [size] |

### Monitoring
- [ ] APM configured
- [ ] Custom metrics
- [ ] Centralized logs
- [ ] Alerts active

## 6. Criteria

### Success (all must pass)
- [ ] Latency p95 < X ms
- [ ] Error rate < X%
- [ ] Throughput > X req/s

### Failure (any one stops test)
- [ ] Error rate > X%
- [ ] Latency p99 > X s
- [ ] OOM or crash

### Warnings (document but don't stop)
- [ ] CPU > X%
- [ ] Memory > X%

## 7. Execution

### Prerequisites
- [ ] Environment provisioned
- [ ] Data loaded
- [ ] Cache warmed
- [ ] Stakeholders notified

### Execution checklist
1. [ ] Validate environment
2. [ ] Run smoke test (1 min)
3. [ ] Capture baseline
4. [ ] Execute full test
5. [ ] Collect results
6. [ ] Document observations

### Contacts
- **Owner**: [Name]
- **Infra**: [Name]
- **On-call**: [Name]

## 8. Results (filled after)
- **Status**: [Pass/Fail]
- **Metrics**: [Link to dashboard]
- **Observations**: [Notes]
- **Actions**: [Next steps]

Common Mistakes

1. Scenario without clear objective

❌ "Run load test"
✅ "Validate that checkout supports 5000 req/s with p95 < 3s"

2. Vague success criteria

❌ "System should be fast"
✅ "p95 latency < 2s, error rate < 0.5%"

3. Unrealistic workload

❌ "1000 VUs doing checkout in loop"
✅ "1000 VUs with funnel: 80% browse, 15% search, 5% checkout"

4. Insufficient duration

❌ "5 minute test"
✅ "2 hours steady state + 30 min ramp"

5. Ignoring warm-up

❌ "Start test immediately"
✅ "10 min warm-up for JIT, cache, connection pools"

Conclusion

A well-defined test scenario:

  1. Has clear objective - answers a specific question
  2. Reflects reality - workload based on real data
  3. Defines success - objective and measurable criteria
  4. Is reproducible - complete documentation
  5. Is communicable - stakeholders can understand it

A test without a defined scenario is just noise. A well-defined scenario is knowledge.


This article is part of the series on the OCTOPUS Performance Engineering methodology.

