Mau E2E Testing Framework - Comprehensive Plan

Version: 1.0
Date: 19 February 2026
Status: Design Phase

Table of Contents

  1. Executive Summary
  2. Research Findings
  3. Architecture Overview
  4. Technology Stack
  5. Test Scenarios
  6. File Structure
  7. Implementation Phases
  8. Example Test Case Walkthrough
  9. Framework Comparison
  10. CI/CD Integration
  11. Debugging & Observability
  12. Open Questions & Future Work

Executive Summary

This document outlines a comprehensive end-to-end (E2E) testing framework for Mau, a P2P file synchronization system built on Kademlia DHT. The framework provides two complementary modes:

  1. Interactive CLI Mode (mau-e2e tool) - Manual control and exploratory testing
  2. Automated Testing Mode (go test) - CI/CD integration and regression detection

Both modes share the same core testenv library, ensuring consistency between manual exploration and automated validation.

Key Design Principles:

  • Interactive-first design - Developers can see P2P behavior, not just assert it worked
  • Deterministic by default, chaos-ready by design
  • Easy to add new test cases with minimal boilerplate
  • Rich observability with comprehensive logging, tracing, and state inspection
  • Network condition simulation (latency, packet loss, partitions)
  • CI/CD friendly with parallelizable, isolated test execution
  • Production-grade reliability for long-term maintenance

Primary Goals:

  1. [Interactive] Enable manual exploration of P2P synchronization behavior
  2. [Automated] Test N Mau instances discovering each other via Kademlia DHT
  3. [Automated] Verify friend relationship establishment and maintenance
  4. [Automated] Validate file synchronization across peers
  5. [Both] Simulate network failures and verify recovery
  6. [Automated] Stress test with varying peer counts (2-100+)
  7. [Automated] Detect regressions before they reach production

See Also: CLI_DESIGN.md for detailed interactive CLI specifications


Research Findings

Analysis of Existing P2P Testing Frameworks

1. libp2p/test-plans

What it does: Interoperability testing for libp2p implementations
Approach:

  • Docker containers with different libp2p implementations
  • Language-agnostic test orchestration
  • Test scenarios defined in separate containers
  • Results published to GitHub Pages

Key Learnings:

  • Language-agnostic orchestration allows testing across implementations
  • Docker-based isolation ensures reproducibility
  • Separate test definition from implementation (test scenarios as standalone containers)
  • ⚠️ Complex setup for simple scenarios

Applicable to Mau:

  • Use Docker containers for peer isolation
  • Define test scenarios declaratively
  • Collect structured test results

2. Ethereum Hive

What it does: End-to-end test harness for Ethereum clients
Approach:

  • Simulator framework in Go
  • Client implementations run in Docker containers
  • Simulators orchestrate multi-client scenarios
  • JSON-based test result reporting

Key Learnings:

  • Simulator pattern separates test logic from client runtime
  • Client-agnostic interface allows testing different implementations
  • Structured result format (JSON) enables trend analysis
  • Trophy list motivates participation and showcases bugs found

Applicable to Mau:

  • Adopt simulator pattern for orchestration
  • Use Go for test harness (matches Mau’s language)
  • Implement structured test results with detailed traces
  • Create “bug trophy list” to validate framework effectiveness

3. Testcontainers-Go

What it does: Programmatic container lifecycle management for tests
Approach:

  • Go API for creating/managing Docker containers in tests
  • Lifecycle hooks (startup, ready checks, teardown)
  • Network management, volume mounts, log streaming
  • Automatic cleanup with defer

Key Learnings:

  • Native Go integration - no external orchestration needed
  • Type-safe API prevents configuration errors
  • Automatic cleanup reduces test pollution
  • Wait strategies ensure containers are ready before testing
  • Log capture for debugging failures
  • ⚠️ Requires Docker daemon access

Applicable to Mau:

  • Primary orchestration tool for E2E tests
  • Use GenericContainer with custom Mau image
  • Implement custom wait strategies for Kademlia bootstrap
  • Capture logs per-container for debugging

4. Toxiproxy

What it does: Network condition simulation proxy
Approach:

  • TCP proxy with HTTP API for adding “toxics”
  • Toxics: latency, bandwidth limits, timeouts, connection resets
  • Upstream/downstream control (client→server vs server→client)
  • Probability-based application (toxicity parameter)

Supported Toxics:

  • latency: Add delay ± jitter
  • bandwidth: Rate limiting (KB/s)
  • timeout: Drop data, optionally close connection
  • slow_close: Delay TCP FIN
  • reset_peer: TCP RST simulation
  • slicer: Fragment packets
  • limit_data: Close after N bytes
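Dynamic control happens over Toxiproxy's HTTP API: adding a toxic is a POST to `/proxies/{proxy}/toxics`. A sketch of a latency-toxic request body (field names per the Toxiproxy README; the values here are arbitrary):

```json
{
  "name": "latency_downstream",
  "type": "latency",
  "stream": "downstream",
  "toxicity": 1.0,
  "attributes": {
    "latency": 500,
    "jitter": 100
  }
}
```

Deleting the toxic (`DELETE /proxies/{proxy}/toxics/latency_downstream`) restores the connection without restarting any container.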

Key Learnings:

  • Surgical network failure injection without container restarts
  • Directional control (affect only requests or responses)
  • Dynamic reconfiguration during test execution
  • Lightweight - runs alongside services
  • ⚠️ HTTP API adds complexity but enables dynamic control

Applicable to Mau:

  • Run Toxiproxy sidecars in Docker network
  • Configure each Mau peer to route through proxy
  • Inject latency/partitions during synchronization tests
  • Test Kademlia resilience under packet loss

5. Chaos Engineering Principles

Source: principlesofchaos.org, Netflix Simian Army

Core Principles:

  1. Define steady state - measurable system output indicating normal behavior
  2. Hypothesize steady state continues under perturbation
  3. Introduce real-world failure variables (crashes, network issues, resource exhaustion)
  4. Disprove hypothesis by detecting steady state deviation

Advanced Principles:

  • Focus on system behavior, not internals
  • Vary real-world events by impact/frequency
  • Run in production when possible (not applicable to Mau tests)
  • Automate continuously
  • Minimize blast radius

Key Learnings:

  • ✅ Define “sync success” steady state for Mau
  • ✅ Test assumptions about resilience explicitly
  • ✅ Automate chaos scenarios in CI
  • ✅ Start with small perturbations, increase gradually

Applicable to Mau:

  • Define steady state: “All peers have consistent file states”
  • Hypothesis: “File sync completes within 5s under normal conditions”
  • Variables: peer crashes, network partitions, high latency
  • Validation: Check file SHA256 consistency across peers
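The steady-state validation reduces to a pure check; a minimal sketch in Go, with each peer's copy of the file modeled as a byte slice for illustration:

```go
package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
)

// digest returns the hex-encoded SHA256 of a file's contents.
func digest(content []byte) string {
    sum := sha256.Sum256(content)
    return hex.EncodeToString(sum[:])
}

// steadyState reports whether every peer holds an identical copy of the file.
func steadyState(peerContents [][]byte) bool {
    if len(peerContents) == 0 {
        return true
    }
    want := digest(peerContents[0])
    for _, c := range peerContents[1:] {
        if digest(c) != want {
            return false
        }
    }
    return true
}

func main() {
    synced := [][]byte{[]byte("hello"), []byte("hello"), []byte("hello")}
    diverged := [][]byte{[]byte("hello"), []byte("hell0")}
    fmt.Println(steadyState(synced), steadyState(diverged)) // true false
}
```

In the real framework the contents would come from each peer container's file tree rather than in-memory slices.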

Technology Stack Evaluation

| Component | Options Considered | Selected | Rationale |
|---|---|---|---|
| Container Orchestration | Docker Compose, Testcontainers-Go, Kubernetes | Testcontainers-Go | Native Go integration, programmatic control, automatic cleanup |
| Mau Instance Packaging | Binary in container, Docker image | Docker image | Consistent environment, easy version management |
| Network Simulation | Toxiproxy, tc (Linux), Pumba, Comcast | Toxiproxy | Cross-platform, programmable, well-documented |
| Test Framework | Go testing, Ginkgo/Gomega, Testify | Go testing + Testify | Minimal dependencies, familiar to Mau contributors |
| Logging | Container logs, Loki, ELK | Structured JSON logs + file export | Simple, parseable, CI-friendly |
| Tracing | OpenTelemetry, Jaeger, custom | Custom trace IDs in logs | Lightweight, no external dependencies |
| Metrics | Prometheus, InfluxDB, none | Test execution metrics in JSON | Easy CI integration, no infra overhead |
| Assertion Library | Standard Go, Testify, Gomega | Testify | Rich assertions, already used in Mau codebase |

Architecture Overview

Dual-Mode Architecture

┌────────────────────────────────────────────────────────────────┐
│                     Mau E2E Framework                          │
│                                                                │
│  ┌──────────────────────┐    ┌──────────────────────┐         │
│  │  Interactive Mode    │    │   Automated Mode     │         │
│  │  (mau-e2e CLI)       │    │   (go test)          │         │
│  │                      │    │                      │         │
│  │  - Manual control    │    │  - CI/CD testing     │         │
│  │  - Exploration       │    │  - Regression detect │         │
│  │  - Debugging         │    │  - Assertions        │         │
│  │  - Demonstrations    │    │  - Coverage tracking │         │
│  └──────────┬───────────┘    └──────────┬───────────┘         │
│             │                           │                     │
│             └────────┬──────────────────┘                     │
│                      ▼                                        │
│           ┌──────────────────────┐                           │
│           │   Shared Core        │                           │
│           │   (testenv library)  │                           │
│           │                      │                           │
│           │  - Peer management   │                           │
│           │  - Network control   │                           │
│           │  - State persistence │                           │
│           │  - Assertions        │                           │
│           └──────────┬───────────┘                           │
└──────────────────────┼───────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────────┐
│                     Docker Network (mau-test-net)               │
│                                                                 │
│  ┌───────────────┐   ┌───────────────┐   ┌───────────────┐    │
│  │  Mau Peer 1   │   │  Mau Peer 2   │   │  Mau Peer 3   │    │
│  │  Container    │   │  Container    │   │  Container    │    │
│  │               │   │               │   │               │    │
│  │  - Account    │   │  - Account    │   │  - Account    │    │
│  │  - Server     │   │  - Server     │   │  - Server     │    │
│  │  - DHT Node   │   │  - DHT Node   │   │  - DHT Node   │    │
│  │  - Files      │   │  - Files      │   │  - Files      │    │
│  └───────┬───────┘   └───────┬───────┘   └───────┬───────┘    │
│          │                   │                   │            │
│          └───────────────────┼───────────────────┘            │
│                              │                                │
│  ┌───────────────────────────┴────────────────────────────┐   │
│  │              Toxiproxy (Optional)                      │   │
│  │  - Latency injection                                   │   │
│  │  - Bandwidth limiting                                  │   │
│  │  - Network partitions                                  │   │
│  └────────────────────────────────────────────────────────┘   │
│                                                                │
│  ┌────────────────────────────────────────────────────────┐   │
│  │         Bootstrap Node (Optional)                      │   │
│  │  - Kademlia DHT seed                                   │   │
│  └────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
             │
             ▼
┌─────────────────────────────────────────────────────────────────┐
│                     Observability Layer                         │
│                                                                 │
│  - Container logs (JSON structured)                             │
│  - Test result artifacts (JSON)                                 │
│  - File state snapshots (for debugging)                         │
│  - Network traffic traces (optional)                            │
└─────────────────────────────────────────────────────────────────┘

Component Breakdown

1. Test Coordinator

  • Language: Go
  • Framework: go test with Testcontainers-Go
  • Responsibilities:
    • Parse test configuration (peer count, network topology, test scenario)
    • Build/pull Mau Docker image
    • Create isolated Docker network
    • Spawn Mau peer containers
    • Inject friend relationships
    • Inject files to sync
    • Wait for synchronization
    • Assert expected state
    • Collect logs and artifacts
    • Cleanup containers and networks

2. Mau Peer Container

  • Base Image: golang:1.21-alpine (multi-stage build)
  • Contents:
    • Mau binary (server + DHT node)
    • PGP keyring initialization
    • Configuration via environment variables
    • Healthcheck endpoint
    • Structured logging to stdout

Container Lifecycle:

  1. Init: Generate PGP account or import existing
  2. Bootstrap: Connect to DHT seed nodes (if provided)
  3. Ready: HTTP server listening, DHT routing table populated
  4. Runtime: Accept friend additions, file synchronization
  5. Shutdown: Graceful stop, flush logs

3. Toxiproxy Sidecar (Optional)

  • When to use: Chaos/resilience tests
  • Configuration:
    • Each Mau peer routes through local Toxiproxy instance
    • Toxiproxy forwards to actual Mau server
    • Test coordinator controls toxics via HTTP API

Example Proxy Configuration:

{
  "name": "mau_peer_1",
  "listen": "0.0.0.0:8080",
  "upstream": "mau-peer-1:8080",
  "enabled": true
}

4. Bootstrap Node

  • Purpose: Seed Kademlia DHT for peer discovery
  • Implementation:
    • Dedicated Mau instance with known address
    • All peers configured with bootstrap node in environment
    • Not itself under test; treated as infrastructure

Technology Stack

Core Technologies

| Component | Technology | Version | Purpose |
|---|---|---|---|
| Container Runtime | Docker | 20.10+ | Peer isolation |
| Orchestration | Testcontainers-Go | v0.27+ | Programmatic container management |
| Test Framework | Go testing | 1.21+ | Test execution |
| Assertions | Testify | v1.8+ | Rich assertions (already in Mau) |
| Network Proxy | Toxiproxy | 2.5+ | Network condition simulation |
| Logging | Zerolog / Zap | Latest | Structured JSON logging |

Supporting Tools

| Tool | Purpose | When Used |
|---|---|---|
| Docker Compose | Manual test environment setup | Local development/debugging |
| jq | Log parsing/filtering | Debugging failed tests |
| Make | Build automation | CI/CD pipeline |
| GitHub Actions | CI/CD runner | Automated testing |

Test Scenarios

Level 1: Basic Functionality (Deterministic)

TC-001: Two-Peer Discovery

Objective: Verify two Mau peers can discover each other via Kademlia DHT
Setup:

  • Peer A, Peer B
  • Shared bootstrap node
  • Same Docker network

Steps:

  1. Start bootstrap node
  2. Start Peer A, configure bootstrap node
  3. Start Peer B, configure bootstrap node
  4. Wait for DHT routing tables to populate
  5. Query Peer A for Peer B’s fingerprint
  6. Query Peer B for Peer A’s fingerprint

Assertions:

  • Peer A finds Peer B within 5 seconds
  • Peer B finds Peer A within 5 seconds
  • DHT distance calculation is correct

TC-002: Two-Peer Friend Sync

Objective: Verify friend relationship establishment and file synchronization
Setup:

  • Peer A, Peer B
  • Peer A adds Peer B as friend (exchange public keys)
  • Peer B adds Peer A as friend

Steps:

  1. Start both peers
  2. Exchange public keys via test coordinator
  3. Inject friend relationship on both sides
  4. Peer A creates file hello.txt encrypted for Peer B
  5. Wait for synchronization
  6. Verify Peer B has hello.txt
  7. Decrypt and verify content matches

Assertions:

  • Friend relationship established (both sides)
  • File appears on Peer B within 10 seconds
  • Decrypted content matches original
  • File permissions/metadata preserved

TC-003: Multi-Peer Sync (N=5)

Objective: Verify synchronization across 5 peers in a friend graph
Friend Graph:

    A
   /|\
  B C D
   \ /
    E

Setup:

  • 5 peers
  • Friend relationships as shown
  • Peer A creates public file

Steps:

  1. Start all peers
  2. Establish friend relationships
  3. Peer A publishes public file
  4. Wait for propagation
  5. Verify all peers receive file

Assertions:

  • All peers have file within 30 seconds
  • File SHA256 matches across all peers
  • No duplicate file fetches (check logs)

TC-004: Version Conflict Resolution

Objective: Test behavior when two peers edit same file concurrently
Setup:

  • Peer A, Peer B (mutual friends)
  • Both have shared.txt version 1

Steps:

  1. Network partition: isolate A and B
  2. Peer A edits shared.txt → version 2a
  3. Peer B edits shared.txt → version 2b
  4. Restore network
  5. Wait for synchronization

Assertions:

  • Both versions exist (.versions/ directory)
  • Latest version determined by timestamp or conflict resolution rules
  • No data loss

Level 2: Resilience Testing (Chaos)

TC-101: Peer Crash During Sync

Objective: Verify resilience when peer crashes mid-synchronization
Setup:

  • Peer A, Peer B, Peer C (all friends)
  • Peer A has large file (100MB)

Steps:

  1. Peer A starts sharing file
  2. Peer B starts downloading (50% complete)
  3. Kill Peer A container
  4. Wait 10 seconds
  5. Restart Peer A
  6. Verify Peer B resumes download

Assertions:

  • Peer B resumes from last checkpoint (HTTP Range request)
  • Download completes successfully
  • File SHA256 matches
  • Peer C unaffected
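The checkpoint-resume assertion relies on standard HTTP range requests: Peer B re-requests the file starting at the first missing byte and expects a 206 Partial Content response. A sketch of building such a request (the URL is a placeholder, not Mau's real endpoint):

```go
package main

import (
    "fmt"
    "net/http"
)

// resumeRequest builds a GET that asks the serving peer for everything
// from byte offset onward (RFC 9110 Range header).
func resumeRequest(url string, offset int64) (*http.Request, error) {
    req, err := http.NewRequest(http.MethodGet, url, nil)
    if err != nil {
        return nil, err
    }
    req.Header.Set("Range", fmt.Sprintf("bytes=%d-", offset))
    return req, nil
}

func main() {
    // Peer B had fetched 52,428,800 of 104,857,600 bytes before the crash.
    req, err := resumeRequest("http://mau-peer-1:8080/files/big.bin", 52428800)
    if err != nil {
        panic(err)
    }
    fmt.Println(req.Header.Get("Range")) // bytes=52428800-
}
```

A server that honors the header replies 206 with a Content-Range; a 200 reply would mean the download restarted from zero, which the test should flag.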

TC-102: Network Partition (Split Brain)

Objective: Test synchronization after network partition heals
Friend Graph:

Partition 1: A - B
Partition 2: C - D

Setup:

  • 4 peers in two groups
  • Toxiproxy creates network partition

Steps:

  1. All peers connected initially
  2. Create partition: A-B can’t reach C-D
  3. Peer A publishes file X
  4. Peer C publishes file Y
  5. Wait 30 seconds
  6. Heal partition
  7. Wait for sync

Assertions:

  • After healing, all peers have both files (X and Y)
  • No file corruption
  • Sync time < 60 seconds

TC-103: High Latency Network (500ms)

Objective: Verify synchronization under high latency
Setup:

  • 3 peers
  • Toxiproxy adds 500ms latency ± 100ms jitter

Steps:

  1. Start all peers with latency toxic
  2. Peer A publishes 10 files (1KB each)
  3. Measure sync time

Assertions:

  • Sync completes (may be slow)
  • No timeout errors
  • All files synced correctly
  • Test logs latency measurements

TC-104: Bandwidth Limitation (10 KB/s)

Objective: Test large file sync under bandwidth constraints
Setup:

  • Peer A, Peer B
  • Toxiproxy limits bandwidth to 10 KB/s
  • File size: 1 MB

Steps:

  1. Start both peers
  2. Apply bandwidth toxic
  3. Peer A shares file
  4. Measure sync time

Assertions:

  • Sync time ~= 100 seconds (1MB / 10KB/s)
  • No connection drops
  • File SHA256 correct

TC-105: Packet Loss (10%)

Objective: Verify TCP retransmission handles packet loss
Setup:

  • Peer A, Peer B
  • Toxiproxy configured to approximate 10% loss (Toxiproxy proxies TCP streams and cannot drop raw packets; combine slicer/timeout toxics, or use tc/netem for true packet loss)

Steps:

  1. Start both peers
  2. Apply packet loss toxic
  3. Peer A shares 100 small files
  4. Monitor sync

Assertions:

  • All files eventually sync
  • Retransmissions visible in logs
  • Sync time < 5 minutes

Level 3: Stress Testing

TC-201: 10-Peer Full Mesh

Objective: Test scalability with 10 peers all friends with each other
Setup:

  • 10 peers
  • 45 friend relationships (full mesh)
  • Peer 1 publishes file

Steps:

  1. Start all peers
  2. Establish all friend relationships
  3. Peer 1 publishes file
  4. Wait for propagation

Assertions:

  • All peers receive file within 2 minutes
  • DHT routing table sizes < 20 entries (k-bucket limit)
  • No memory leaks (check container stats)
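The 45-relationship figure above is n(n-1)/2 for n = 10; a helper a test could use to enumerate the pairs to wire up:

```go
package main

import "fmt"

// meshPairs returns every unordered pair (i, j), i < j, over n peers —
// the friend relationships of a full mesh.
func meshPairs(n int) [][2]int {
    var pairs [][2]int
    for i := 0; i < n; i++ {
        for j := i + 1; j < n; j++ {
            pairs = append(pairs, [2]int{i, j})
        }
    }
    return pairs
}

func main() {
    fmt.Println(len(meshPairs(10))) // 45
}
```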

TC-202: 100-Peer Network (Sparse Graph)

Objective: Validate DHT performance with 100 peers
Friend Graph: Random graph, average degree = 5
Setup:

  • 100 peers
  • Random friend relationships
  • Bootstrap node

Steps:

  1. Start all peers (parallel batches)
  2. Establish friend relationships
  3. 10 random peers publish files
  4. Wait for propagation

Assertions:

  • DHT queries succeed for all peers
  • Average lookup time < 1 second
  • Sync eventually reaches all connected peers
  • Test completes in < 30 minutes
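A random graph with average degree d over n peers has n·d/2 edges (250 here). Generating it from a fixed seed keeps the topology reproducible across runs; a sketch (note it does not guarantee connectivity, hence "all connected peers" above):

```go
package main

import (
    "fmt"
    "math/rand"
)

// randomGraph returns n*avgDegree/2 distinct undirected edges over n peers,
// drawn with a fixed seed so the topology is reproducible across runs.
func randomGraph(n, avgDegree int, seed int64) [][2]int {
    r := rand.New(rand.NewSource(seed))
    want := n * avgDegree / 2
    seen := make(map[[2]int]bool)
    var edges [][2]int
    for len(edges) < want {
        a, b := r.Intn(n), r.Intn(n)
        if a == b {
            continue // no self-friendship
        }
        if a > b {
            a, b = b, a // normalize so each pair appears once
        }
        e := [2]int{a, b}
        if seen[e] {
            continue // no duplicate relationships
        }
        seen[e] = true
        edges = append(edges, e)
    }
    return edges
}

func main() {
    edges := randomGraph(100, 5, 42)
    fmt.Println(len(edges)) // 250
}
```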

TC-203: Churn Test (Peers Join/Leave)

Objective: Test DHT stability under peer churn
Setup:

  • Initial: 20 peers
  • Every 30 seconds: 2 peers leave, 2 new peers join
  • Duration: 10 minutes

Steps:

  1. Start initial 20 peers
  2. Start churn loop
  3. Publish file every minute from random peer
  4. Monitor sync success rate

Assertions:

  • File sync success rate > 95%
  • DHT routing table recovers from churn
  • No peer becomes permanently isolated

Level 4: Security Testing

TC-301: Unauthorized File Access

Objective: Verify encrypted files not accessible without decryption key
Setup:

  • Peer A (file owner)
  • Peer B (authorized friend)
  • Peer C (unauthorized, not a friend)

Steps:

  1. Peer A creates file encrypted for Peer B only
  2. Peer C attempts to download file (if discoverable)
  3. Peer C attempts to decrypt file

Assertions:

  • Peer C cannot decrypt file
  • File content remains confidential
  • No plaintext leakage in logs

TC-302: DHT Sybil Attack Resistance

Objective: Test DHT behavior under Sybil attack (future work)
Setup:

  • 10 honest peers
  • 50 malicious peers with coordinated IDs

Steps:

  1. Honest peers establish DHT
  2. Malicious peers join with IDs near target peer
  3. Attempt to monopolize routing table
  4. Honest peer tries to find another honest peer

Assertions:

  • Lookup success rate > 90%
  • S/Kademlia defenses (if implemented) mitigate attack

File Structure

mau/
├── e2e/                              # E2E test framework root
│   ├── PLAN.md                       # This document
│   ├── README.md                     # Quick start guide
│   ├── Makefile                      # Build and test automation
│   │
│   ├── framework/                    # Core test framework
│   │   ├── testenv/                  # Test environment setup
│   │   │   ├── testenv.go            # Testcontainers orchestration
│   │   │   ├── peer.go               # Mau peer container wrapper
│   │   │   ├── network.go            # Docker network management
│   │   │   ├── toxiproxy.go          # Toxiproxy integration
│   │   │   └── bootstrap.go          # Bootstrap node management
│   │   │
│   │   ├── assertions/               # Custom assertions
│   │   │   ├── sync.go               # File sync assertions
│   │   │   ├── dht.go                # Kademlia DHT assertions
│   │   │   └── friend.go             # Friend relationship assertions
│   │   │
│   │   ├── helpers/                  # Utility functions
│   │   │   ├── pgp.go                # PGP key generation/management
│   │   │   ├── files.go              # File creation/comparison
│   │   │   ├── logs.go               # Log collection/parsing
│   │   │   └── wait.go               # Wait strategies
│   │   │
│   │   └── types/                    # Shared types
│   │       ├── config.go             # Test configuration
│   │       ├── peer.go               # Peer metadata
│   │       └── result.go             # Test result structures
│   │
│   ├── scenarios/                    # Test scenarios
│   │   ├── basic/                    # Level 1 tests
│   │   │   ├── discovery_test.go     # TC-001: Two-peer discovery
│   │   │   ├── friend_sync_test.go   # TC-002: Two-peer friend sync
│   │   │   ├── multi_peer_test.go    # TC-003: Multi-peer sync
│   │   │   └── version_conflict_test.go # TC-004: Version conflicts
│   │   │
│   │   ├── resilience/               # Level 2 tests
│   │   │   ├── peer_crash_test.go    # TC-101: Peer crash
│   │   │   ├── partition_test.go     # TC-102: Network partition
│   │   │   ├── latency_test.go       # TC-103: High latency
│   │   │   ├── bandwidth_test.go     # TC-104: Bandwidth limits
│   │   │   └── packet_loss_test.go   # TC-105: Packet loss
│   │   │
│   │   ├── stress/                   # Level 3 tests
│   │   │   ├── full_mesh_test.go     # TC-201: 10-peer mesh
│   │   │   ├── large_network_test.go # TC-202: 100-peer network
│   │   │   └── churn_test.go         # TC-203: Peer churn
│   │   │
│   │   └── security/                 # Level 4 tests
│   │       ├── unauthorized_access_test.go # TC-301
│   │       └── sybil_attack_test.go  # TC-302 (future)
│   │
│   ├── docker/                       # Docker configurations
│   │   ├── Dockerfile.mau            # Mau peer image
│   │   ├── Dockerfile.bootstrap      # Bootstrap node image
│   │   ├── docker-compose.yml        # Manual test environment
│   │   └── entrypoint.sh             # Container entrypoint script
│   │
│   ├── configs/                      # Test configurations
│   │   ├── default.json              # Default test config
│   │   ├── ci.json                   # CI-optimized config
│   │   └── stress.json               # Stress test config
│   │
│   ├── scripts/                      # Utility scripts
│   │   ├── build-images.sh           # Build Docker images
│   │   ├── run-tests.sh              # Run test suite
│   │   ├── parse-logs.sh             # Extract logs from failed tests
│   │   └── generate-report.sh        # Generate HTML test report
│   │
│   └── docs/                         # Documentation
│       ├── writing-tests.md          # Guide for adding new tests
│       ├── debugging.md              # Debugging failed tests
│       ├── architecture.md           # Framework architecture
│       └── toxiproxy-guide.md        # Toxiproxy usage guide
│
├── go.mod                            # Add e2e dependencies
└── .github/
    └── workflows/
        └── e2e-tests.yml             # GitHub Actions workflow

Implementation Phases

Phase 1: Foundation (Weeks 1-2)

Goal: Basic framework with simple 2-peer tests + interactive CLI foundation

Deliverables:

  • Docker image for Mau peer (e2e/docker/Dockerfile.mau)
  • Shared testenv library (e2e/framework/testenv/)
  • Interactive CLI structure (e2e/cmd/mau-e2e/)
  • mau-e2e up/down commands
  • mau-e2e peer add/list commands
  • State persistence (~/.mau-e2e/)
  • TC-001: Two-peer discovery (automated test)
  • TC-002: Two-peer friend sync (automated test)
  • Makefile for building and running tests
  • CI workflow (GitHub Actions)

Success Criteria:

  • Can start 2 peers with mau-e2e up --peers 2
  • Can list peers with mau-e2e peer list
  • Tests run locally with make test-e2e
  • Tests pass in CI
  • Logs captured on failure
  • Same testenv library used by both CLI and tests

Key Files:

e2e/docker/Dockerfile.mau
e2e/framework/testenv/testenv.go
e2e/framework/testenv/peer.go
e2e/scenarios/basic/discovery_test.go
e2e/scenarios/basic/friend_sync_test.go
e2e/Makefile
.github/workflows/e2e-tests.yml

Example testenv.go structure:

package testenv

import (
    "context"
    "testing"

    "github.com/testcontainers/testcontainers-go"
)

type TestEnv struct {
    ctx     context.Context
    network testcontainers.Network
    peers   []*MauPeer
    t       *testing.T
}

func NewTestEnv(t *testing.T) *TestEnv {
    // Create isolated Docker network
    // Return TestEnv instance
}

func (e *TestEnv) AddPeer(name string) (*MauPeer, error) {
    // Create and start Mau peer container
    // Wait for readiness
    // Return MauPeer wrapper
}

func (e *TestEnv) Cleanup() {
    // Stop all containers
    // Remove network
    // Collect logs
}

Phase 2: Multi-Peer & Peer Interaction (Weeks 3-4)

Goal: Expand to multi-peer scenarios + file/friend CLI commands

Deliverables:

  • Custom assertion library (e2e/framework/assertions/)
    • AssertFilesSynced(peers []*MauPeer, filename string, timeout time.Duration)
    • AssertDHTLookup(peer *MauPeer, targetFingerprint string, timeout time.Duration)
    • AssertFriendRelationship(peer1, peer2 *MauPeer)
  • mau-e2e friend add/list commands
  • mau-e2e file add/list/cat commands
  • mau-e2e peer inspect command
  • TC-003: Multi-peer sync (5 peers)
  • TC-004: Version conflict resolution
  • Helper for complex friend graph setup
  • Documentation: docs/writing-tests.md

Success Criteria:

  • Can manually test 2-peer sync via CLI
  • 5-peer test completes in < 2 minutes
  • Assertions provide clear failure messages
  • New test cases easy to write (< 50 lines)

Phase 3: Real-time Monitoring + Chaos (Weeks 5-6)

Goal: Introduce Toxiproxy and real-time observability

Deliverables:

  • Toxiproxy integration (e2e/framework/testenv/toxiproxy.go)
  • mau-e2e file watch command (real-time sync events)
  • mau-e2e status --watch command (live dashboard)
  • mau-e2e net partition/heal commands
  • mau-e2e net latency/limit commands
  • Color-coded CLI output
  • Proxy configuration per peer
  • TC-101: Peer crash during sync
  • TC-102: Network partition
  • TC-103: High latency
  • TC-104: Bandwidth limitation
  • TC-105: Packet loss
  • Documentation: docs/toxiproxy-guide.md

Success Criteria:

  • Can observe sync happening in real-time via CLI
  • Can create network partitions interactively
  • Toxiproxy dynamically controlled during tests
  • Chaos tests reproducible (same seed → same result)
  • Tests detect real bugs (validate against known issues)

Example Toxiproxy usage:

func TestNetworkPartition(t *testing.T) {
    env := testenv.NewTestEnv(t)
    defer env.Cleanup()

    // Create 4 peers
    peers := env.AddPeers(4)

    // Establish friend relationships
    env.MakeFriends(peers[0], peers[1])
    env.MakeFriends(peers[2], peers[3])

    // Create network partition: {0,1} vs {2,3}
    partition := env.CreatePartition([]int{0, 1}, []int{2, 3})

    // Publish files on both sides
    env.AddFile(peers[0], "fileA.txt", "content A")
    env.AddFile(peers[2], "fileB.txt", "content B")

    time.Sleep(5 * time.Second)

    // Assert files don't cross partition
    assert.NoFile(t, peers[2], "fileA.txt")
    assert.NoFile(t, peers[0], "fileB.txt")

    // Heal partition
    partition.Heal()

    // Assert files eventually sync
    assertions.AssertFilesSynced(t, peers, "fileA.txt", 60*time.Second)
    assertions.AssertFilesSynced(t, peers, "fileB.txt", 60*time.Second)
}

Phase 4: Stress Testing (Weeks 7-8)

Goal: Validate scalability and performance

Deliverables:

  • TC-201: 10-peer full mesh
  • TC-202: 100-peer network (if CI resources allow)
  • TC-203: Peer churn
  • Performance metrics collection
  • Memory/CPU usage monitoring
  • Test result trending (store results in Git)

Success Criteria:

  • 10-peer test completes in < 5 minutes
  • 100-peer test completes in < 30 minutes (optional)
  • No memory leaks detected
  • Performance baselines established

Resource Considerations:

  • 100-peer test may require dedicated CI runners
  • Consider matrix testing: run 100-peer test weekly, not on every PR
  • Implement early exit if resource exhaustion detected

Phase 5: Advanced CLI Features (Weeks 9-10)

Goal: Complete interactive feature set

Deliverables:

  • Interactive shell mode (mau-e2e shell)
  • Predefined scenarios (mau-e2e scenario <name>)
  • Snapshot/restore (mau-e2e snapshot/restore)
  • DHT commands (dht lookup/table)
  • Structured logging with trace IDs
  • Log aggregation script (scripts/parse-logs.sh)
  • State snapshot capture on failure (peer file trees, DHT tables)
  • HTML test report generation (scripts/generate-report.sh)
  • Documentation: docs/debugging.md
  • Automatic log upload to CI artifacts

Success Criteria:

  • Interactive shell provides seamless workflow
  • Can prototype test scenarios interactively
  • Failed test produces:
    • Full logs for all peers
    • File system state snapshots
    • DHT routing table dumps
    • Network traffic summary (if available)
  • Debugging time reduced by 80%

Example Log Format:

{
  "timestamp": "2026-02-19T14:30:00Z",
  "level": "info",
  "peer": "peer-1",
  "fingerprint": "ABAF11C65A2970B130ABE3C479BE3E4300411886",
  "trace_id": "test-tc002-abc123",
  "component": "sync",
  "event": "file_download_started",
  "file": "hello.txt",
  "source_peer": "peer-2",
  "source_fingerprint": "BBAF11C65A2970B130ABE3C479BE3E4300411887"
}
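Given one JSON object per line, isolating a single test's events from interleaved peer logs is a decode-and-filter pass — the job scripts/parse-logs.sh (or jq) performs. A sketch of the same idea in Go:

```go
package main

import (
    "bufio"
    "encoding/json"
    "fmt"
    "strings"
)

// logLine holds the subset of fields this sketch cares about.
type logLine struct {
    Peer    string `json:"peer"`
    TraceID string `json:"trace_id"`
    Event   string `json:"event"`
}

// filterByTrace returns the events from raw JSON-lines logs that belong
// to the given trace ID, skipping lines that fail to parse.
func filterByTrace(logs, traceID string) []logLine {
    var out []logLine
    sc := bufio.NewScanner(strings.NewReader(logs))
    for sc.Scan() {
        var l logLine
        if err := json.Unmarshal(sc.Bytes(), &l); err != nil {
            continue
        }
        if l.TraceID == traceID {
            out = append(out, l)
        }
    }
    return out
}

func main() {
    logs := `{"peer":"peer-1","trace_id":"test-tc002-abc123","event":"file_download_started"}
{"peer":"peer-2","trace_id":"test-tc001-zzz","event":"dht_lookup"}
{"peer":"peer-1","trace_id":"test-tc002-abc123","event":"file_download_finished"}`
    for _, l := range filterByTrace(logs, "test-tc002-abc123") {
        fmt.Println(l.Peer, l.Event)
    }
}
```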

Phase 6: Polish & Documentation (Weeks 11-12)

Goal: Production-ready framework with excellent docs

Deliverables:

  • TC-301: Unauthorized file access
  • TC-302: DHT Sybil attack (basic version)
  • Comprehensive CLI documentation
  • Video tutorial (screencast of interactive usage)
  • Example demo scripts
  • Parallel test execution in CI
  • Test result caching (skip unchanged tests)
  • Nightly stress test runs
  • Security test suite in separate workflow
  • Badge generation (test pass rate, coverage)
  • Integration verification (ensure CLI + tests share code)

Success Criteria:

  • New developer can use CLI productively in < 15 minutes
  • Video tutorial demonstrates P2P sync visually
  • CI pipeline completes in < 15 minutes (basic tests)
  • Nightly stress tests run without supervision
  • Security tests detect unauthorized access attempts
  • Test failures block PR merges

GitHub Actions Workflow Structure:

name: E2E Tests

on:
  pull_request:
  push:
    branches: [main]
  schedule:
    - cron: '0 2 * * *'  # Nightly at 2 AM

jobs:
  basic-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: docker/setup-buildx-action@v2
      - name: Build Mau image
        run: make -C e2e build-image
      - name: Run basic tests
        run: make -C e2e test-basic
      - name: Upload logs
        if: failure()
        uses: actions/upload-artifact@v3
        with:
          name: test-logs-basic
          path: e2e/test-results/

  chaos-tests:
    runs-on: ubuntu-latest
    steps:
      # Similar structure

  stress-tests:
    runs-on: ubuntu-latest
    if: github.event_name == 'schedule'  # Only nightly
    steps:
      # Run TC-202 (100-peer test)

Example Test Case Walkthrough

Test: TC-002 - Two-Peer Friend Sync

File: e2e/scenarios/basic/friend_sync_test.go

package basic

import (
    "strings"
    "testing"
    "time"

    "github.com/mau-network/mau/e2e/framework/assertions"
    "github.com/mau-network/mau/e2e/framework/testenv"
    "github.com/stretchr/testify/assert"
    "github.com/stretchr/testify/require"
)

func TestTwoPeerFriendSync(t *testing.T) {
    // Step 1: Create test environment
    env := testenv.NewTestEnv(t)
    defer env.Cleanup() // Ensures cleanup even on test failure

    // Step 2: Start two Mau peers
    peerA, err := env.AddPeer("peer-a")
    require.NoError(t, err, "Failed to create peer A")

    peerB, err := env.AddPeer("peer-b")
    require.NoError(t, err, "Failed to create peer B")

    // Step 3: Exchange public keys and establish friend relationship
    err = env.MakeFriends(peerA, peerB)
    require.NoError(t, err, "Failed to establish friendship")

    // Step 4: Verify friend relationship from both sides
    assertions.AssertFriendRelationship(t, peerA, peerB)
    assertions.AssertFriendRelationship(t, peerB, peerA)

    // Step 5: Peer A creates a file encrypted for Peer B
    fileContent := "Hello from Peer A!"
    err = peerA.AddFile("hello.txt", strings.NewReader(fileContent), []string{peerB.Fingerprint()})
    require.NoError(t, err, "Failed to create file on peer A")

    // Step 6: Wait for synchronization (with timeout)
    syncTimeout := 30 * time.Second
    err = assertions.WaitForFile(t, peerB, "hello.txt", syncTimeout)
    require.NoError(t, err, "File did not sync to peer B within timeout")

    // Step 7: Verify file content matches
    content, err := peerB.ReadFile("hello.txt")
    require.NoError(t, err, "Failed to read file from peer B")
    assert.Equal(t, fileContent, content, "File content mismatch")

    // Step 8: Verify file is encrypted (PGP format)
    rawContent, err := peerB.ReadFileRaw("hello.txt")
    require.NoError(t, err, "Failed to read raw file")
    assert.Contains(t, rawContent, "-----BEGIN PGP MESSAGE-----", "File not encrypted")

    // Step 9: Check synchronization logs for debugging
    logs := peerB.GetLogs()
    assert.Contains(t, logs, "file_download_completed", "Sync event not logged")
}

How This Test Executes:

  1. Test Environment Creation:

    • testenv.NewTestEnv(t) creates isolated Docker network mau-test-<uuid>
    • Initializes cleanup handlers
  2. Peer A Container Startup:

    • Pulls/uses mau-e2e:latest image
    • Generates PGP account or uses pre-generated
    • Starts HTTP server on random port (mapped to host)
    • Joins DHT with bootstrap node (if configured)
    • Exposes health endpoint: GET /health
    • Testcontainers waits for healthy status (max 30s)
  3. Peer B Container Startup:

    • Same process as Peer A
    • Different fingerprint, different port
  4. Friend Relationship Setup:

    • Test coordinator extracts Peer B’s public key via API: GET /p2p/<peer-b-fpr>/account.pgp
    • Injects into Peer A’s keyring: POST /admin/friends (test-only endpoint)
    • Repeats in reverse direction
    • Verifies keyring files created: .mau/<peer-fpr>.pgp
  5. File Creation:

    • Test coordinator calls Peer A API: POST /admin/files
      {
        "name": "hello.txt",
        "content": "SGVsbG8gZnJvbSBQZWVyIEEh",  // base64
        "encrypt_for": ["BBAF..."]  // Peer B fingerprint
      }
      
    • Peer A encrypts file with Peer B’s public key
    • Writes to <peer-a-fpr>/hello.txt.pgp
  6. Synchronization:

    • Peer B periodically polls Peer A: GET /p2p/<peer-a-fpr> (If-Modified-Since header)
    • Response includes hello.txt metadata
    • Peer B downloads: GET /p2p/<peer-a-fpr>/hello.txt
    • Verifies signature, decrypts, writes to local storage
  7. Assertion:

    • Test coordinator calls Peer B: GET /admin/files/hello.txt
    • Decrypts and returns plaintext
    • Compares with original content
  8. Cleanup:

    • defer env.Cleanup() triggers
    • Stops containers
    • Collects logs to e2e/test-results/<test-name>/
    • Removes Docker network
    • On failure: preserves container state for debugging

Execution Time: ~15 seconds (including container startup)


Framework Comparison

Approach 1: Pure Docker Compose

How it works:

  • Define all peers in docker-compose.yml
  • Use shell scripts to orchestrate (docker-compose up/down)
  • Manual assertion via docker exec commands

Pros:

  • ✅ Simple to understand
  • ✅ Easy to run manually for debugging
  • ✅ No Go dependencies

Cons:

  • ❌ Not programmatic - hard to parameterize (N peers)
  • ❌ Poor test isolation (shared Docker Compose project)
  • ❌ Manual cleanup prone to errors
  • ❌ Difficult to integrate with go test
  • ❌ No automatic log collection on failure

Verdict: ❌ Not Recommended - Good for manual exploration, bad for automated testing

Approach 2: Testcontainers-Go

How it works:

  • Go test code creates containers programmatically
  • Full control over lifecycle, networking, configuration
  • Native integration with go test

Pros:

  • ✅ Type-safe, programmatic control
  • ✅ Automatic cleanup with defer
  • ✅ Parameterized tests (easy to vary peer count)
  • ✅ Test isolation (each test gets unique network)
  • ✅ Rich ecosystem (wait strategies, log streaming)
  • ✅ CI-friendly (integrates with GitHub Actions)

Cons:

  • ⚠️ Requires Docker daemon (already needed for Mau development)
  • ⚠️ Learning curve for Testcontainers API (well-documented)

Verdict: ✅ Recommended - Best balance of control and maintainability


Approach 3: Kubernetes-based (e.g., kind, k3d)

How it works:

  • Deploy Mau peers as Kubernetes pods
  • Use Kubernetes CRDs for test orchestration
  • Tools: Kubetest2, Chainsaw, Sonobuoy

Pros:

  • ✅ Production-like environment
  • ✅ Advanced networking (NetworkPolicies for partitions)
  • ✅ Resource management (CPU/memory limits)

Cons:

  • ❌ Massive overkill for Mau’s scope
  • ❌ Slow startup time (k8s cluster initialization)
  • ❌ Complex debugging
  • ❌ CI resource intensive

Verdict: ❌ Not Recommended - Overkill, stick with Docker


Approach 4: Custom Test Harness (like Ethereum Hive)

How it works:

  • Build custom orchestration tool in Go
  • Test scenarios as separate binaries
  • Client implementations containerized

Pros:

  • ✅ Maximum flexibility
  • ✅ Client-agnostic (could test Rust/Python Mau implementations)
  • ✅ Reusable across projects

Cons:

  • ❌ Huge development effort (weeks to build harness)
  • ❌ Maintenance burden
  • ❌ Not justified for single-implementation project (Mau only has Go impl)

Verdict: ⚠️ Overkill Now, Revisit Later - Good if Mau gets multiple implementations


Recommendation Matrix

Criterion                 Docker Compose   Testcontainers-Go   Kubernetes   Custom Harness
Ease of Use               ★★★★☆            ★★★☆☆               ★☆☆☆☆        ★★☆☆☆
Programmatic Control      ★☆☆☆☆            ★★★★★               ★★★☆☆        ★★★★★
Test Isolation            ★★☆☆☆            ★★★★★               ★★★★★        ★★★★☆
CI/CD Integration         ★★☆☆☆            ★★★★★               ★★★☆☆        ★★★☆☆
Debugging                 ★★★★☆            ★★★★☆               ★★☆☆☆        ★★★☆☆
Maintenance               ★★★☆☆            ★★★★☆               ★☆☆☆☆        ★★☆☆☆
Scalability (100+ peers)  ★★☆☆☆            ★★★★☆               ★★★★★        ★★★★☆
Setup Time                5 min            15 min              60 min       120 min

Final Recommendation: Testcontainers-Go for automated tests, with Docker Compose retained for manual debugging


CI/CD Integration

GitHub Actions Workflow Design

File: .github/workflows/e2e-tests.yml

name: E2E Tests

on:
  pull_request:
    paths:
      - '**.go'
      - 'e2e/**'
      - '.github/workflows/e2e-tests.yml'
  push:
    branches: [main, develop]
  schedule:
    - cron: '0 2 * * *'  # Nightly stress tests

env:
  GO_VERSION: '1.21'
  DOCKER_BUILDKIT: 1

jobs:
  build-image:
    name: Build Mau E2E Image
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Build and export
        uses: docker/build-push-action@v5
        with:
          context: .
          file: e2e/docker/Dockerfile.mau
          tags: mau-e2e:${{ github.sha }}
          outputs: type=docker,dest=/tmp/mau-e2e.tar

      - name: Upload image artifact
        uses: actions/upload-artifact@v4
        with:
          name: mau-e2e-image
          path: /tmp/mau-e2e.tar
          retention-days: 1

  test-basic:
    name: Basic Tests (Level 1)
    needs: build-image
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v4

      - name: Set up Go
        uses: actions/setup-go@v4
        with:
          go-version: ${{ env.GO_VERSION }}

      - name: Download image
        uses: actions/download-artifact@v4
        with:
          name: mau-e2e-image
          path: /tmp

      - name: Load image
        run: docker load --input /tmp/mau-e2e.tar

      - name: Run basic tests
        run: |
          cd e2e
          go test -v -timeout 10m ./scenarios/basic/...
        env:
          MAU_E2E_IMAGE: mau-e2e:${{ github.sha }}

      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: test-results-basic
          path: e2e/test-results/

  test-resilience:
    name: Resilience Tests (Level 2)
    needs: build-image
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      # Similar to test-basic
      - name: Run resilience tests
        run: |
          cd e2e
          go test -v -timeout 25m ./scenarios/resilience/...

  test-stress:
    name: Stress Tests (Level 3)
    needs: build-image
    runs-on: ubuntu-latest-8-cores  # Larger runner
    if: github.event_name == 'schedule' || contains(github.event.head_commit.message, '[stress]')
    timeout-minutes: 60
    steps:
      # Similar to test-basic
      - name: Run stress tests
        run: |
          cd e2e
          go test -v -timeout 50m ./scenarios/stress/...

  test-security:
    name: Security Tests (Level 4)
    needs: build-image
    runs-on: ubuntu-latest
    timeout-minutes: 20
    steps:
      - name: Run security tests
        run: |
          cd e2e
          go test -v -timeout 15m ./scenarios/security/...

  report:
    name: Generate Test Report
    needs: [test-basic, test-resilience, test-security]
    if: always()
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Download all results
        uses: actions/download-artifact@v4
        with:
          path: all-results

      - name: Generate HTML report
        run: |
          cd e2e
          ./scripts/generate-report.sh ../all-results

      - name: Upload report
        uses: actions/upload-artifact@v4
        with:
          name: test-report
          path: e2e/report.html

      - name: Comment PR
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            // Parse test results and post summary comment

Optimization Strategies

  1. Parallel Test Execution:

    • Use t.Parallel() in Go tests where safe
    • Run test levels (basic/resilience/stress) in parallel jobs
    • Resource limits: max 4 parallel stress tests
  2. Image Caching:

    • Cache Mau Docker image layers in GitHub Actions
    • Only rebuild on source changes
    • Use Docker Buildx cache export
  3. Test Result Caching:

    • Hash test inputs (code + config)
    • Skip tests if hash matches previous run
    • Stored in GitHub Actions cache
  4. Fast Failure:

    • Run basic tests first
    • Fail fast on basic test failures
    • Stress tests only on nightly or manual trigger
  5. Resource Management:

    • Limit concurrent containers per test (max 20)
    • Use Docker resource limits (CPU/memory)
    • Clean up orphaned containers with reaper

Debugging & Observability

Log Collection Strategy

Structured Logs:
All Mau peers emit JSON logs to stdout:

{
  "timestamp": "2026-02-19T14:30:00Z",
  "level": "info",
  "peer_id": "peer-a",
  "fingerprint": "ABAF11C65A2970B130ABE3C479BE3E4300411886",
  "trace_id": "tc002-run42",
  "component": "sync",
  "event": "file_download_started",
  "file": "hello.txt",
  "source_peer": "peer-b",
  "bytes": 1024
}

Fields:

  • trace_id: Links all logs from one test execution
  • peer_id: Container name (e.g., peer-a)
  • component: dht, sync, server, keyring
  • event: Structured event name

Collection:

  • Testcontainers auto-captures stdout/stderr
  • On test failure: dump to e2e/test-results/<test-name>/logs/<peer-id>.json
  • Use jq for filtering: jq 'select(.component == "sync")' peer-a.json

State Snapshots

What to capture on test failure:

  1. File System State:

    • Peer directory trees (.mau/, <fpr>/)
    • tar -czf peer-a-files.tar.gz /data
  2. DHT Routing Tables:

    • Admin API: GET /admin/dht/routing-table
    • Save as JSON
  3. Friend Lists:

    • Admin API: GET /admin/friends
    • Shows keyring state
  4. Container Stats:

    • docker stats snapshot (CPU/memory usage)
    • Helps detect resource exhaustion
  5. Network State:

    • Active Toxiproxy toxics
    • Container connectivity matrix

Automated Snapshot Script:

func (e *TestEnv) CaptureSnapshot(testName string) error {
    snapshotDir := filepath.Join("test-results", testName, "snapshots")
    if err := os.MkdirAll(snapshotDir, 0755); err != nil {
        return err
    }

    for _, peer := range e.peers {
        // Capture file tree
        peer.ExecTar("/data", filepath.Join(snapshotDir, peer.Name+"-files.tar.gz"))

        // Capture DHT state
        dht, _ := peer.GetDHTState()
        writeJSON(filepath.Join(snapshotDir, peer.Name+"-dht.json"), dht)

        // Capture logs
        logs := peer.GetLogs()
        os.WriteFile(filepath.Join(snapshotDir, peer.Name+".log"), []byte(logs), 0644)
    }
    return nil
}

Debugging Workflow

When a test fails:

  1. Check CI Artifacts:

    • Download test-results-<level>.zip
    • Extract to local machine
  2. Read Test Summary:

    • test-results/<test-name>/summary.json
    • Shows which assertion failed
  3. Filter Logs by Trace ID:

    cd test-results/TestTwoPeerFriendSync/logs
    jq '. | select(.trace_id == "tc002-run42")' peer-*.json | less
    
  4. Inspect File State:

    tar -xzf snapshots/peer-a-files.tar.gz
    tree data/
    
  5. Reproduce Locally:

    # Use Docker Compose for manual control
    cd e2e
    docker-compose -f docker/docker-compose.yml up
    # Manually trigger actions via API
    curl -X POST http://localhost:8080/admin/files -d '...'
    
  6. Enable Verbose Logging:

    // In test file
    env.SetLogLevel("debug")  // Enables DEBUG level logs
    
  7. Pause Test on Failure:

    if t.Failed() {
        fmt.Println("Test failed, containers still running. Press enter to cleanup...")
        bufio.NewReader(os.Stdin).ReadString('\n')
    }
    

Observability Tools

Tool               Purpose                             Integration
jq                 Log filtering/analysis              Manual, CI scripts
Docker logs        Real-time log tailing               docker logs -f <container>
Docker stats       Resource monitoring                 docker stats during test
Wireshark/tcpdump  Network traffic capture (advanced)  Manual debugging
Grafana/Loki       Log aggregation (future)            Optional for large test suites

Open Questions & Future Work

Open Questions

  1. DHT Bootstrap Strategy:

    • Should tests use a dedicated bootstrap node or peer-to-peer discovery?
    • Trade-off: Bootstrap node simplifies setup but adds dependency
  2. Test Data Persistence:

    • Should test results be stored in Git for trend analysis?
    • Or use external service (TestRail, Allure)?
  3. Performance Baselines:

    • What is acceptable sync time for 10 peers? 100 peers?
    • Need empirical data to set thresholds
  4. Chaos Test Reproducibility:

    • How to ensure random failures are reproducible?
    • Solution: Seed-based randomness with seed in test name
  5. Security Test Scope:

    • How deep should Sybil attack testing go?
    • May require S/Kademlia implementation first
  6. Test Environment Variables:

    • Should tests read config from env vars (for CI tuning)?
    • Or strictly use code-defined configs?

Future Enhancements

Phase 7+: Advanced Features

  1. Visual Test Reports:

    • HTML dashboard with pass/fail trends
    • Peer graph visualization (D3.js)
    • Timeline view of peer interactions
  2. Mutation Testing:

    • Inject bugs into Mau code
    • Verify E2E tests catch them
    • Measures test effectiveness
  3. Fuzz Testing Integration:

    • Use go-fuzz to generate file content
    • Test PGP encryption with malformed keys
    • Kademlia message fuzzing
  4. Performance Regression Detection:

    • Store sync time metrics in database
    • Alert on >20% slowdown
    • Integration with GitHub Status Checks
  5. Multi-Platform Testing:

    • Test on ARM64 (e.g., Raspberry Pi simulation)
    • Windows containers (if Mau supports)
  6. Record/Replay:

    • Record network interactions during test
    • Replay for deterministic debugging
    • Tools: VCR, go-replay
  7. Chaos Mesh Integration:

    • More advanced chaos scenarios
    • CPU/memory pressure testing
    • Clock skew simulation (important for PGP timestamp validation)
  8. Contract Testing:

    • Verify HTTP API backwards compatibility
    • Pact or OpenAPI validation

Conclusion

This E2E testing framework is designed to:

  • Validate Mau’s core P2P functionality (discovery, sync, friend management)
  • Detect regressions early via automated CI/CD integration
  • Simulate real-world conditions (network failures, high latency, peer churn)
  • Scale from 2 to 100+ peers with minimal test code changes
  • Provide excellent debugging with rich logs, state snapshots, and artifacts
  • Remain maintainable with clear structure and comprehensive documentation

Next Steps

  1. Review this plan with the team
  2. Approve technology choices (Testcontainers-Go, Toxiproxy)
  3. Prioritize test scenarios (start with TC-001, TC-002)
  4. Begin Phase 1 implementation
  5. Iterate based on real-world bugs found

Success Metrics

After 6 months of use:

  • Test coverage: >80% of P2P scenarios
  • Bug detection: >10 bugs caught before production
  • Developer adoption: New contributors can add tests in <1 hour
  • CI reliability: <1% flaky test rate
  • Debugging time: <30 minutes to root-cause failures

Document Version: 1.0
Last Updated: 19 February 2026
Reviewers: [To be assigned]
Status: Awaiting Review