Mau E2E Testing Framework - Comprehensive Plan

Version: 1.0
Date: 19 February 2026
Status: Design Phase

Table of Contents

  1. Executive Summary
  2. Research Findings
  3. Architecture Overview
  4. Technology Stack
  5. Test Scenarios
  6. File Structure
  7. Implementation Phases
  8. Example Test Case Walkthrough
  9. Framework Comparison
  10. CI/CD Integration
  11. Debugging & Observability
  12. Open Questions & Future Work

Executive Summary

This document outlines a comprehensive end-to-end (E2E) testing framework for Mau, a P2P file synchronization system built on Kademlia DHT. The framework provides two complementary modes:

  1. Interactive CLI Mode (mau-e2e tool) - Manual control and exploratory testing
  2. Automated Testing Mode (go test) - CI/CD integration and regression detection

Both modes share the same core testenv library, ensuring consistency between manual exploration and automated validation.

Key Design Principles:

  • Interactive-first design - Developers can see P2P behavior, not just assert it worked
  • Deterministic by default, chaos-ready by design
  • Easy to add new test cases with minimal boilerplate
  • Rich observability with comprehensive logging, tracing, and state inspection
  • Network condition simulation (latency, packet loss, partitions)
  • CI/CD friendly with parallelizable, isolated test execution
  • Production-grade reliability for long-term maintenance

Primary Goals:

  1. [Interactive] Enable manual exploration of P2P synchronization behavior
  2. [Automated] Test N Mau instances discovering each other via Kademlia DHT
  3. [Automated] Verify friend relationship establishment and maintenance
  4. [Automated] Validate file synchronization across peers
  5. [Both] Simulate network failures and verify recovery
  6. [Automated] Stress test with varying peer counts (2-100+)
  7. [Automated] Detect regressions before they reach production

See Also: CLI_DESIGN.md for detailed interactive CLI specifications


Research Findings

Analysis of Existing P2P Testing Frameworks

1. libp2p/test-plans

What it does: Interoperability testing for libp2p implementations
Approach:

  • Docker containers with different libp2p implementations
  • Language-agnostic test orchestration
  • Test scenarios defined in separate containers
  • Results published to GitHub Pages

Key Learnings:

  • Language-agnostic orchestration allows testing across implementations
  • Docker-based isolation ensures reproducibility
  • Separate test definition from implementation (test scenarios as standalone containers)
  • ⚠️ Complex setup for simple scenarios

Applicable to Mau:

  • Use Docker containers for peer isolation
  • Define test scenarios declaratively
  • Collect structured test results

2. Ethereum Hive

What it does: End-to-end test harness for Ethereum clients
Approach:

  • Simulator framework in Go
  • Client implementations run in Docker containers
  • Simulators orchestrate multi-client scenarios
  • JSON-based test result reporting

Key Learnings:

  • Simulator pattern separates test logic from client runtime
  • Client-agnostic interface allows testing different implementations
  • Structured result format (JSON) enables trend analysis
  • Trophy list motivates participation and showcases bugs found

Applicable to Mau:

  • Adopt simulator pattern for orchestration
  • Use Go for test harness (matches Mau’s language)
  • Implement structured test results with detailed traces
  • Create “bug trophy list” to validate framework effectiveness

3. Testcontainers-Go

What it does: Programmatic container lifecycle management for tests
Approach:

  • Go API for creating/managing Docker containers in tests
  • Lifecycle hooks (startup, ready checks, teardown)
  • Network management, volume mounts, log streaming
  • Automatic cleanup with defer

Key Learnings:

  • Native Go integration - no external orchestration needed
  • Type-safe API prevents configuration errors
  • Automatic cleanup reduces test pollution
  • Wait strategies ensure containers are ready before testing
  • Log capture for debugging failures
  • ⚠️ Requires Docker daemon access

Applicable to Mau:

  • Primary orchestration tool for E2E tests
  • Use GenericContainer with custom Mau image
  • Implement custom wait strategies for Kademlia bootstrap
  • Capture logs per-container for debugging

4. Toxiproxy

What it does: Network condition simulation proxy
Approach:

  • TCP proxy with HTTP API for adding “toxics”
  • Toxics: latency, bandwidth limits, timeouts, connection resets
  • Upstream/downstream control (client→server vs server→client)
  • Probability-based application (toxicity parameter)

Supported Toxics:

  • latency: Add delay ± jitter
  • bandwidth: Rate limiting (KB/s)
  • timeout: Drop data, optionally close connection
  • slow_close: Delay TCP FIN
  • reset_peer: TCP RST simulation
  • slicer: Fragment packets
  • limit_data: Close after N bytes
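Dynamic control happens over Toxiproxy's HTTP API: adding a toxic is a POST to `/proxies/{proxy}/toxics`. A sketch of a latency-toxic request body (field names per the Toxiproxy README; the values here are arbitrary):

```json
{
  "name": "latency_downstream",
  "type": "latency",
  "stream": "downstream",
  "toxicity": 1.0,
  "attributes": {
    "latency": 500,
    "jitter": 100
  }
}
```

Deleting the toxic (`DELETE /proxies/{proxy}/toxics/latency_downstream`) restores the connection without restarting any container.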

Key Learnings:

  • Surgical network failure injection without container restarts
  • Directional control (affect only requests or responses)
  • Dynamic reconfiguration during test execution
  • Lightweight - runs alongside services
  • ⚠️ HTTP API adds complexity but enables dynamic control

Applicable to Mau:

  • Run Toxiproxy sidecars in Docker network
  • Configure each Mau peer to route through proxy
  • Inject latency/partitions during synchronization tests
  • Test Kademlia resilience under packet loss

5. Chaos Engineering Principles

Source: principlesofchaos.org, Netflix Simian Army

Core Principles:

  1. Define steady state - measurable system output indicating normal behavior
  2. Hypothesize steady state continues under perturbation
  3. Introduce real-world failure variables (crashes, network issues, resource exhaustion)
  4. Disprove hypothesis by detecting steady state deviation

Advanced Principles:

  • Focus on system behavior, not internals
  • Vary real-world events by impact/frequency
  • Run in production when possible (not applicable to Mau tests)
  • Automate continuously
  • Minimize blast radius

Key Learnings:

  • ✅ Define “sync success” steady state for Mau
  • ✅ Test assumptions about resilience explicitly
  • ✅ Automate chaos scenarios in CI
  • ✅ Start with small perturbations, increase gradually

Applicable to Mau:

  • Define steady state: “All peers have consistent file states”
  • Hypothesis: “File sync completes within 5s under normal conditions”
  • Variables: peer crashes, network partitions, high latency
  • Validation: Check file SHA256 consistency across peers
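The steady-state validation reduces to a pure check; a minimal sketch in Go, with each peer's copy of the file modeled as a byte slice for illustration:

```go
package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
)

// digest returns the hex-encoded SHA256 of a file's contents.
func digest(content []byte) string {
    sum := sha256.Sum256(content)
    return hex.EncodeToString(sum[:])
}

// steadyState reports whether every peer holds an identical copy of the file.
func steadyState(peerContents [][]byte) bool {
    if len(peerContents) == 0 {
        return true
    }
    want := digest(peerContents[0])
    for _, c := range peerContents[1:] {
        if digest(c) != want {
            return false
        }
    }
    return true
}

func main() {
    synced := [][]byte{[]byte("hello"), []byte("hello"), []byte("hello")}
    diverged := [][]byte{[]byte("hello"), []byte("hell0")}
    fmt.Println(steadyState(synced), steadyState(diverged)) // true false
}
```

In the real framework the contents would come from each peer container's file tree rather than in-memory slices.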

Technology Stack Evaluation

| Component | Options Considered | Selected | Rationale |
|---|---|---|---|
| Container Orchestration | Docker Compose, Testcontainers-Go, Kubernetes | Testcontainers-Go | Native Go integration, programmatic control, automatic cleanup |
| Mau Instance Packaging | Binary in container, Docker image | Docker image | Consistent environment, easy version management |
| Network Simulation | Toxiproxy, tc (Linux), Pumba, Comcast | Toxiproxy | Cross-platform, programmable, well-documented |
| Test Framework | Go testing, Ginkgo/Gomega, Testify | Go testing + Testify | Minimal dependencies, familiar to Mau contributors |
| Logging | Container logs, Loki, ELK | Structured JSON logs + file export | Simple, parseable, CI-friendly |
| Tracing | OpenTelemetry, Jaeger, custom | Custom trace IDs in logs | Lightweight, no external dependencies |
| Metrics | Prometheus, InfluxDB, none | Test execution metrics in JSON | Easy CI integration, no infra overhead |
| Assertion Library | Standard Go, Testify, Gomega | Testify | Rich assertions, already used in Mau codebase |

Architecture Overview

Dual-Mode Architecture

┌────────────────────────────────────────────────────────────────┐
│                     Mau E2E Framework                          │
│                                                                │
│  ┌──────────────────────┐    ┌──────────────────────┐         │
│  │  Interactive Mode    │    │   Automated Mode     │         │
│  │  (mau-e2e CLI)       │    │   (go test)          │         │
│  │                      │    │                      │         │
│  │  - Manual control    │    │  - CI/CD testing     │         │
│  │  - Exploration       │    │  - Regression detect │         │
│  │  - Debugging         │    │  - Assertions        │         │
│  │  - Demonstrations    │    │  - Coverage tracking │         │
│  └──────────┬───────────┘    └──────────┬───────────┘         │
│             │                           │                     │
│             └────────┬──────────────────┘                     │
│                      ▼                                        │
│           ┌──────────────────────┐                           │
│           │   Shared Core        │                           │
│           │   (testenv library)  │                           │
│           │                      │                           │
│           │  - Peer management   │                           │
│           │  - Network control   │                           │
│           │  - State persistence │                           │
│           │  - Assertions        │                           │
│           └──────────┬───────────┘                           │
└──────────────────────┼───────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────────┐
│                     Docker Network (mau-test-net)               │
│                                                                 │
│  ┌───────────────┐   ┌───────────────┐   ┌───────────────┐    │
│  │  Mau Peer 1   │   │  Mau Peer 2   │   │  Mau Peer 3   │    │
│  │  Container    │   │  Container    │   │  Container    │    │
│  │               │   │               │   │               │    │
│  │  - Account    │   │  - Account    │   │  - Account    │    │
│  │  - Server     │   │  - Server     │   │  - Server     │    │
│  │  - DHT Node   │   │  - DHT Node   │   │  - DHT Node   │    │
│  │  - Files      │   │  - Files      │   │  - Files      │    │
│  └───────┬───────┘   └───────┬───────┘   └───────┬───────┘    │
│          │                   │                   │            │
│          └───────────────────┼───────────────────┘            │
│                              │                                │
│  ┌───────────────────────────┴────────────────────────────┐   │
│  │              Toxiproxy (Optional)                      │   │
│  │  - Latency injection                                   │   │
│  │  - Bandwidth limiting                                  │   │
│  │  - Network partitions                                  │   │
│  └────────────────────────────────────────────────────────┘   │
│                                                                │
│  ┌────────────────────────────────────────────────────────┐   │
│  │         Bootstrap Node (Optional)                      │   │
│  │  - Kademlia DHT seed                                   │   │
│  └────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
             │
             ▼
┌─────────────────────────────────────────────────────────────────┐
│                     Observability Layer                         │
│                                                                 │
│  - Container logs (JSON structured)                             │
│  - Test result artifacts (JSON)                                 │
│  - File state snapshots (for debugging)                         │
│  - Network traffic traces (optional)                            │
└─────────────────────────────────────────────────────────────────┘

Component Breakdown

1. Test Coordinator

  • Language: Go
  • Framework: go test with Testcontainers-Go
  • Responsibilities:
    • Parse test configuration (peer count, network topology, test scenario)
    • Build/pull Mau Docker image
    • Create isolated Docker network
    • Spawn Mau peer containers
    • Inject friend relationships
    • Inject files to sync
    • Wait for synchronization
    • Assert expected state
    • Collect logs and artifacts
    • Cleanup containers and networks

2. Mau Peer Container

  • Base Image: golang:1.21-alpine (multi-stage build)
  • Contents:
    • Mau binary (server + DHT node)
    • PGP keyring initialization
    • Configuration via environment variables
    • Healthcheck endpoint
    • Structured logging to stdout

Container Lifecycle:

  1. Init: Generate PGP account or import existing
  2. Bootstrap: Connect to DHT seed nodes (if provided)
  3. Ready: HTTP server listening, DHT routing table populated
  4. Runtime: Accept friend additions, file synchronization
  5. Shutdown: Graceful stop, flush logs

3. Toxiproxy Sidecar (Optional)

  • When to use: Chaos/resilience tests
  • Configuration:
    • Each Mau peer routes through local Toxiproxy instance
    • Toxiproxy forwards to actual Mau server
    • Test coordinator controls toxics via HTTP API

Example Proxy Configuration:

{
  "name": "mau_peer_1",
  "listen": "0.0.0.0:8080",
  "upstream": "mau-peer-1:8080",
  "enabled": true
}

4. Bootstrap Node

  • Purpose: Seed Kademlia DHT for peer discovery
  • Implementation:
    • Dedicated Mau instance with known address
    • All peers configured with bootstrap node in environment
    • Not itself under test; treated as infrastructure

Technology Stack

Core Technologies

| Component | Technology | Version | Purpose |
|---|---|---|---|
| Container Runtime | Docker | 20.10+ | Peer isolation |
| Orchestration | Testcontainers-Go | v0.27+ | Programmatic container management |
| Test Framework | Go testing | 1.21+ | Test execution |
| Assertions | Testify | v1.8+ | Rich assertions (already in Mau) |
| Network Proxy | Toxiproxy | 2.5+ | Network condition simulation |
| Logging | Zerolog / Zap | Latest | Structured JSON logging |

Supporting Tools

| Tool | Purpose | When Used |
|---|---|---|
| Docker Compose | Manual test environment setup | Local development/debugging |
| jq | Log parsing/filtering | Debugging failed tests |
| Make | Build automation | CI/CD pipeline |
| GitHub Actions | CI/CD runner | Automated testing |

Test Scenarios

Level 1: Basic Functionality (Deterministic)

TC-001: Two-Peer Discovery

Objective: Verify two Mau peers can discover each other via Kademlia DHT
Setup:

  • Peer A, Peer B
  • Shared bootstrap node
  • Same Docker network

Steps:

  1. Start bootstrap node
  2. Start Peer A, configure bootstrap node
  3. Start Peer B, configure bootstrap node
  4. Wait for DHT routing tables to populate
  5. Query Peer A for Peer B’s fingerprint
  6. Query Peer B for Peer A’s fingerprint

Assertions:

  • Peer A finds Peer B within 5 seconds
  • Peer B finds Peer A within 5 seconds
  • DHT distance calculation is correct

TC-002: Two-Peer Friend Sync

Objective: Verify friend relationship establishment and file synchronization
Setup:

  • Peer A, Peer B
  • Peer A adds Peer B as friend (exchange public keys)
  • Peer B adds Peer A as friend

Steps:

  1. Start both peers
  2. Exchange public keys via test coordinator
  3. Inject friend relationship on both sides
  4. Peer A creates file hello.txt encrypted for Peer B
  5. Wait for synchronization
  6. Verify Peer B has hello.txt
  7. Decrypt and verify content matches

Assertions:

  • Friend relationship established (both sides)
  • File appears on Peer B within 10 seconds
  • Decrypted content matches original
  • File permissions/metadata preserved

TC-003: Multi-Peer Sync (N=5)

Objective: Verify synchronization across 5 peers in a friend graph
Friend Graph:

    A
   /|\
  B C D
   \ /
    E

Setup:

  • 5 peers
  • Friend relationships as shown
  • Peer A creates public file

Steps:

  1. Start all peers
  2. Establish friend relationships
  3. Peer A publishes public file
  4. Wait for propagation
  5. Verify all peers receive file

Assertions:

  • All peers have file within 30 seconds
  • File SHA256 matches across all peers
  • No duplicate file fetches (check logs)

TC-004: Version Conflict Resolution

Objective: Test behavior when two peers edit same file concurrently
Setup:

  • Peer A, Peer B (mutual friends)
  • Both have shared.txt version 1

Steps:

  1. Network partition: isolate A and B
  2. Peer A edits shared.txt → version 2a
  3. Peer B edits shared.txt → version 2b
  4. Restore network
  5. Wait for synchronization

Assertions:

  • Both versions exist (.versions/ directory)
  • Latest version determined by timestamp or conflict resolution rules
  • No data loss

Level 2: Resilience Testing (Chaos)

TC-101: Peer Crash During Sync

Objective: Verify resilience when peer crashes mid-synchronization
Setup:

  • Peer A, Peer B, Peer C (all friends)
  • Peer A has large file (100MB)

Steps:

  1. Peer A starts sharing file
  2. Peer B starts downloading (50% complete)
  3. Kill Peer A container
  4. Wait 10 seconds
  5. Restart Peer A
  6. Verify Peer B resumes download

Assertions:

  • Peer B resumes from last checkpoint (HTTP Range request)
  • Download completes successfully
  • File SHA256 matches
  • Peer C unaffected
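The checkpoint-resume assertion relies on standard HTTP range requests: Peer B re-requests the file starting at the first missing byte and expects a 206 Partial Content response. A sketch of building such a request (the URL is a placeholder, not Mau's real endpoint):

```go
package main

import (
    "fmt"
    "net/http"
)

// resumeRequest builds a GET that asks the serving peer for everything
// from byte offset onward (RFC 9110 Range header).
func resumeRequest(url string, offset int64) (*http.Request, error) {
    req, err := http.NewRequest(http.MethodGet, url, nil)
    if err != nil {
        return nil, err
    }
    req.Header.Set("Range", fmt.Sprintf("bytes=%d-", offset))
    return req, nil
}

func main() {
    // Peer B had fetched 52,428,800 of 104,857,600 bytes before the crash.
    req, err := resumeRequest("http://mau-peer-1:8080/files/big.bin", 52428800)
    if err != nil {
        panic(err)
    }
    fmt.Println(req.Header.Get("Range")) // bytes=52428800-
}
```

A server that honors the header replies 206 with a Content-Range; a 200 reply would mean the download restarted from zero, which the test should flag.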

TC-102: Network Partition (Split Brain)

Objective: Test synchronization after network partition heals
Friend Graph:

Partition 1: A - B
Partition 2: C - D

Setup:

  • 4 peers in two groups
  • Toxiproxy creates network partition

Steps:

  1. All peers connected initially
  2. Create partition: A-B can’t reach C-D
  3. Peer A publishes file X
  4. Peer C publishes file Y
  5. Wait 30 seconds
  6. Heal partition
  7. Wait for sync

Assertions:

  • After healing, all peers have both files (X and Y)
  • No file corruption
  • Sync time < 60 seconds

TC-103: High Latency Network (500ms)

Objective: Verify synchronization under high latency
Setup:

  • 3 peers
  • Toxiproxy adds 500ms latency ± 100ms jitter

Steps:

  1. Start all peers with latency toxic
  2. Peer A publishes 10 files (1KB each)
  3. Measure sync time

Assertions:

  • Sync completes (may be slow)
  • No timeout errors
  • All files synced correctly
  • Test logs latency measurements

TC-104: Bandwidth Limitation (10 KB/s)

Objective: Test large file sync under bandwidth constraints
Setup:

  • Peer A, Peer B
  • Toxiproxy limits bandwidth to 10 KB/s
  • File size: 1 MB

Steps:

  1. Start both peers
  2. Apply bandwidth toxic
  3. Peer A shares file
  4. Measure sync time

Assertions:

  • Sync time ~= 100 seconds (1MB / 10KB/s)
  • No connection drops
  • File SHA256 correct

TC-105: Packet Loss (10%)

Objective: Verify TCP retransmission handles packet loss
Setup:

  • Peer A, Peer B
  • Toxiproxy configured to approximate 10% loss (Toxiproxy proxies TCP streams and cannot drop raw packets; combine slicer/timeout toxics, or use tc/netem for true packet loss)

Steps:

  1. Start both peers
  2. Apply packet loss toxic
  3. Peer A shares 100 small files
  4. Monitor sync

Assertions:

  • All files eventually sync
  • Retransmissions visible in logs
  • Sync time < 5 minutes

Level 3: Stress Testing

TC-201: 10-Peer Full Mesh

Objective: Test scalability with 10 peers all friends with each other
Setup:

  • 10 peers
  • 45 friend relationships (full mesh)
  • Peer 1 publishes file

Steps:

  1. Start all peers
  2. Establish all friend relationships
  3. Peer 1 publishes file
  4. Wait for propagation

Assertions:

  • All peers receive file within 2 minutes
  • DHT routing table sizes < 20 entries (k-bucket limit)
  • No memory leaks (check container stats)
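The 45-relationship figure above is n(n-1)/2 for n = 10; a helper a test could use to enumerate the pairs to wire up:

```go
package main

import "fmt"

// meshPairs returns every unordered pair (i, j), i < j, over n peers —
// the friend relationships of a full mesh.
func meshPairs(n int) [][2]int {
    var pairs [][2]int
    for i := 0; i < n; i++ {
        for j := i + 1; j < n; j++ {
            pairs = append(pairs, [2]int{i, j})
        }
    }
    return pairs
}

func main() {
    fmt.Println(len(meshPairs(10))) // 45
}
```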

TC-202: 100-Peer Network (Sparse Graph)

Objective: Validate DHT performance with 100 peers
Friend Graph: Random graph, average degree = 5
Setup:

  • 100 peers
  • Random friend relationships
  • Bootstrap node

Steps:

  1. Start all peers (parallel batches)
  2. Establish friend relationships
  3. 10 random peers publish files
  4. Wait for propagation

Assertions:

  • DHT queries succeed for all peers
  • Average lookup time < 1 second
  • Sync eventually reaches all connected peers
  • Test completes in < 30 minutes
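A random graph with average degree d over n peers has n·d/2 edges (250 here). Generating it from a fixed seed keeps the topology reproducible across runs; a sketch (note it does not guarantee connectivity, hence "all connected peers" above):

```go
package main

import (
    "fmt"
    "math/rand"
)

// randomGraph returns n*avgDegree/2 distinct undirected edges over n peers,
// drawn with a fixed seed so the topology is reproducible across runs.
func randomGraph(n, avgDegree int, seed int64) [][2]int {
    r := rand.New(rand.NewSource(seed))
    want := n * avgDegree / 2
    seen := make(map[[2]int]bool)
    var edges [][2]int
    for len(edges) < want {
        a, b := r.Intn(n), r.Intn(n)
        if a == b {
            continue // no self-friendship
        }
        if a > b {
            a, b = b, a // normalize so each pair appears once
        }
        e := [2]int{a, b}
        if seen[e] {
            continue // no duplicate relationships
        }
        seen[e] = true
        edges = append(edges, e)
    }
    return edges
}

func main() {
    edges := randomGraph(100, 5, 42)
    fmt.Println(len(edges)) // 250
}
```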

TC-203: Churn Test (Peers Join/Leave)

Objective: Test DHT stability under peer churn
Setup:

  • Initial: 20 peers
  • Every 30 seconds: 2 peers leave, 2 new peers join
  • Duration: 10 minutes

Steps:

  1. Start initial 20 peers
  2. Start churn loop
  3. Publish file every minute from random peer
  4. Monitor sync success rate

Assertions:

  • File sync success rate > 95%
  • DHT routing table recovers from churn
  • No peer becomes permanently isolated

Level 4: Security Testing

TC-301: Unauthorized File Access

Objective: Verify encrypted files not accessible without decryption key
Setup:

  • Peer A (file owner)
  • Peer B (authorized friend)
  • Peer C (unauthorized, not a friend)

Steps:

  1. Peer A creates file encrypted for Peer B only
  2. Peer C attempts to download file (if discoverable)
  3. Peer C attempts to decrypt file

Assertions:

  • Peer C cannot decrypt file
  • File content remains confidential
  • No plaintext leakage in logs

TC-302: DHT Sybil Attack Resistance

Objective: Test DHT behavior under Sybil attack (future work)
Setup:

  • 10 honest peers
  • 50 malicious peers with coordinated IDs

Steps:

  1. Honest peers establish DHT
  2. Malicious peers join with IDs near target peer
  3. Attempt to monopolize routing table
  4. Honest peer tries to find another honest peer

Assertions:

  • Lookup success rate > 90%
  • S/Kademlia defenses (if implemented) mitigate attack

File Structure

mau/
├── e2e/                              # E2E test framework root
│   ├── PLAN.md                       # This document
│   ├── README.md                     # Quick start guide
│   ├── Makefile                      # Build and test automation
│   │
│   ├── framework/                    # Core test framework
│   │   ├── testenv/                  # Test environment setup
│   │   │   ├── testenv.go            # Testcontainers orchestration
│   │   │   ├── peer.go               # Mau peer container wrapper
│   │   │   ├── network.go            # Docker network management
│   │   │   ├── toxiproxy.go          # Toxiproxy integration
│   │   │   └── bootstrap.go          # Bootstrap node management
│   │   │
│   │   ├── assertions/               # Custom assertions
│   │   │   ├── sync.go               # File sync assertions
│   │   │   ├── dht.go                # Kademlia DHT assertions
│   │   │   └── friend.go             # Friend relationship assertions
│   │   │
│   │   ├── helpers/                  # Utility functions
│   │   │   ├── pgp.go                # PGP key generation/management
│   │   │   ├── files.go              # File creation/comparison
│   │   │   ├── logs.go               # Log collection/parsing
│   │   │   └── wait.go               # Wait strategies
│   │   │
│   │   └── types/                    # Shared types
│   │       ├── config.go             # Test configuration
│   │       ├── peer.go               # Peer metadata
│   │       └── result.go             # Test result structures
│   │
│   ├── scenarios/                    # Test scenarios
│   │   ├── basic/                    # Level 1 tests
│   │   │   ├── discovery_test.go     # TC-001: Two-peer discovery
│   │   │   ├── friend_sync_test.go   # TC-002: Two-peer friend sync
│   │   │   ├── multi_peer_test.go    # TC-003: Multi-peer sync
│   │   │   └── version_conflict_test.go # TC-004: Version conflicts
│   │   │
│   │   ├── resilience/               # Level 2 tests
│   │   │   ├── peer_crash_test.go    # TC-101: Peer crash
│   │   │   ├── partition_test.go     # TC-102: Network partition
│   │   │   ├── latency_test.go       # TC-103: High latency
│   │   │   ├── bandwidth_test.go     # TC-104: Bandwidth limits
│   │   │   └── packet_loss_test.go   # TC-105: Packet loss
│   │   │
│   │   ├── stress/                   # Level 3 tests
│   │   │   ├── full_mesh_test.go     # TC-201: 10-peer mesh
│   │   │   ├── large_network_test.go # TC-202: 100-peer network
│   │   │   └── churn_test.go         # TC-203: Peer churn
│   │   │
│   │   └── security/                 # Level 4 tests
│   │       ├── unauthorized_access_test.go # TC-301
│   │       └── sybil_attack_test.go  # TC-302 (future)
│   │
│   ├── docker/                       # Docker configurations
│   │   ├── Dockerfile.mau            # Mau peer image
│   │   ├── Dockerfile.bootstrap      # Bootstrap node image
│   │   ├── docker-compose.yml        # Manual test environment
│   │   └── entrypoint.sh             # Container entrypoint script
│   │
│   ├── configs/                      # Test configurations
│   │   ├── default.json              # Default test config
│   │   ├── ci.json                   # CI-optimized config
│   │   └── stress.json               # Stress test config
│   │
│   ├── scripts/                      # Utility scripts
│   │   ├── build-images.sh           # Build Docker images
│   │   ├── run-tests.sh              # Run test suite
│   │   ├── parse-logs.sh             # Extract logs from failed tests
│   │   └── generate-report.sh        # Generate HTML test report
│   │
│   └── docs/                         # Documentation
│       ├── writing-tests.md          # Guide for adding new tests
│       ├── debugging.md              # Debugging failed tests
│       ├── architecture.md           # Framework architecture
│       └── toxiproxy-guide.md        # Toxiproxy usage guide
│
├── go.mod                            # Add e2e dependencies
└── .github/
    └── workflows/
        └── e2e-tests.yml             # GitHub Actions workflow

Implementation Phases

Phase 1: Foundation (Weeks 1-2)

Goal: Basic framework with simple 2-peer tests + interactive CLI foundation

Deliverables:

  • Docker image for Mau peer (e2e/docker/Dockerfile.mau)
  • Shared testenv library (e2e/framework/testenv/)
  • Interactive CLI structure (e2e/cmd/mau-e2e/)
  • mau-e2e up/down commands
  • mau-e2e peer add/list commands
  • State persistence (~/.mau-e2e/)
  • TC-001: Two-peer discovery (automated test)
  • TC-002: Two-peer friend sync (automated test)
  • Makefile for building and running tests
  • CI workflow (GitHub Actions)

Success Criteria:

  • Can start 2 peers with mau-e2e up --peers 2
  • Can list peers with mau-e2e peer list
  • Tests run locally with make test-e2e
  • Tests pass in CI
  • Logs captured on failure
  • Same testenv library used by both CLI and tests

Key Files:

e2e/docker/Dockerfile.mau
e2e/framework/testenv/testenv.go
e2e/framework/testenv/peer.go
e2e/scenarios/basic/discovery_test.go
e2e/scenarios/basic/friend_sync_test.go
e2e/Makefile
.github/workflows/e2e-tests.yml

Example testenv.go structure:

package testenv

import (
    "context"
    "testing"

    "github.com/testcontainers/testcontainers-go"
)

type TestEnv struct {
    ctx     context.Context
    network testcontainers.Network
    peers   []*MauPeer
    t       *testing.T
}

func NewTestEnv(t *testing.T) *TestEnv {
    // Create isolated Docker network
    // Return TestEnv instance
}

func (e *TestEnv) AddPeer(name string) (*MauPeer, error) {
    // Create and start Mau peer container
    // Wait for readiness
    // Return MauPeer wrapper
}

func (e *TestEnv) Cleanup() {
    // Stop all containers
    // Remove network
    // Collect logs
}

Phase 2: Multi-Peer & Peer Interaction (Weeks 3-4)

Goal: Expand to multi-peer scenarios + file/friend CLI commands

Deliverables:

  • Custom assertion library (e2e/framework/assertions/)
    • AssertFilesSynced(peers []*MauPeer, filename string, timeout time.Duration)
    • AssertDHTLookup(peer *MauPeer, targetFingerprint string, timeout time.Duration)
    • AssertFriendRelationship(peer1, peer2 *MauPeer)
  • mau-e2e friend add/list commands
  • mau-e2e file add/list/cat commands
  • mau-e2e peer inspect command
  • TC-003: Multi-peer sync (5 peers)
  • TC-004: Version conflict resolution
  • Helper for complex friend graph setup
  • Documentation: docs/writing-tests.md

Success Criteria:

  • Can manually test 2-peer sync via CLI
  • 5-peer test completes in < 2 minutes
  • Assertions provide clear failure messages
  • New test cases easy to write (< 50 lines)

Phase 3: Real-time Monitoring + Chaos (Weeks 5-6)

Goal: Introduce Toxiproxy and real-time observability

Deliverables:

  • Toxiproxy integration (e2e/framework/testenv/toxiproxy.go)
  • mau-e2e file watch command (real-time sync events)
  • mau-e2e status --watch command (live dashboard)
  • mau-e2e net partition/heal commands
  • mau-e2e net latency/limit commands
  • Color-coded CLI output
  • Proxy configuration per peer
  • TC-101: Peer crash during sync
  • TC-102: Network partition
  • TC-103: High latency
  • TC-104: Bandwidth limitation
  • TC-105: Packet loss
  • Documentation: docs/toxiproxy-guide.md

Success Criteria:

  • Can observe sync happening in real-time via CLI
  • Can create network partitions interactively
  • Toxiproxy dynamically controlled during tests
  • Chaos tests reproducible (same seed → same result)
  • Tests detect real bugs (validate against known issues)

Example Toxiproxy usage:

func TestNetworkPartition(t *testing.T) {
    env := testenv.NewTestEnv(t)
    defer env.Cleanup()

    // Create 4 peers
    peers := env.AddPeers(4)

    // Establish friend relationships
    env.MakeFriends(peers[0], peers[1])
    env.MakeFriends(peers[2], peers[3])

    // Create network partition: {0,1} vs {2,3}
    partition := env.CreatePartition([]int{0, 1}, []int{2, 3})

    // Publish files on both sides
    env.AddFile(peers[0], "fileA.txt", "content A")
    env.AddFile(peers[2], "fileB.txt", "content B")

    time.Sleep(5 * time.Second)

    // Assert files don't cross partition
    assert.NoFile(t, peers[2], "fileA.txt")
    assert.NoFile(t, peers[0], "fileB.txt")

    // Heal partition
    partition.Heal()

    // Assert files eventually sync
    assertions.AssertFilesSynced(t, peers, "fileA.txt", 60*time.Second)
    assertions.AssertFilesSynced(t, peers, "fileB.txt", 60*time.Second)
}

Phase 4: Stress Testing (Weeks 7-8)

Goal: Validate scalability and performance

Deliverables:

  • TC-201: 10-peer full mesh
  • TC-202: 100-peer network (if CI resources allow)
  • TC-203: Peer churn
  • Performance metrics collection
  • Memory/CPU usage monitoring
  • Test result trending (store results in Git)

Success Criteria:

  • 10-peer test completes in < 5 minutes
  • 100-peer test completes in < 30 minutes (optional)
  • No memory leaks detected
  • Performance baselines established

Resource Considerations:

  • 100-peer test may require dedicated CI runners
  • Consider matrix testing: run 100-peer test weekly, not on every PR
  • Implement early exit if resource exhaustion detected

Phase 5: Advanced CLI Features (Weeks 9-10)

Goal: Complete interactive feature set

Deliverables:

  • Interactive shell mode (mau-e2e shell)
  • Predefined scenarios (mau-e2e scenario <name>)
  • Snapshot/restore (mau-e2e snapshot/restore)
  • DHT commands (dht lookup/table)
  • Structured logging with trace IDs
  • Log aggregation script (scripts/parse-logs.sh)
  • State snapshot capture on failure (peer file trees, DHT tables)
  • HTML test report generation (scripts/generate-report.sh)
  • Documentation: docs/debugging.md
  • Automatic log upload to CI artifacts

Success Criteria:

  • Interactive shell provides seamless workflow
  • Can prototype test scenarios interactively
  • Failed test produces:
    • Full logs for all peers
    • File system state snapshots
    • DHT routing table dumps
    • Network traffic summary (if available)
  • Debugging time reduced by 80%

Example Log Format:

{
  "timestamp": "2026-02-19T14:30:00Z",
  "level": "info",
  "peer": "peer-1",
  "fingerprint": "ABAF11C65A2970B130ABE3C479BE3E4300411886",
  "trace_id": "test-tc002-abc123",
  "component": "sync",
  "event": "file_download_started",
  "file": "hello.txt",
  "source_peer": "peer-2",
  "source_fingerprint": "BBAF11C65A2970B130ABE3C479BE3E4300411887"
}
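Given one JSON object per line, isolating a single test's events from interleaved peer logs is a decode-and-filter pass — the job scripts/parse-logs.sh (or jq) performs. A sketch of the same idea in Go:

```go
package main

import (
    "bufio"
    "encoding/json"
    "fmt"
    "strings"
)

// logLine holds the subset of fields this sketch cares about.
type logLine struct {
    Peer    string `json:"peer"`
    TraceID string `json:"trace_id"`
    Event   string `json:"event"`
}

// filterByTrace returns the events from raw JSON-lines logs that belong
// to the given trace ID, skipping lines that fail to parse.
func filterByTrace(logs, traceID string) []logLine {
    var out []logLine
    sc := bufio.NewScanner(strings.NewReader(logs))
    for sc.Scan() {
        var l logLine
        if err := json.Unmarshal(sc.Bytes(), &l); err != nil {
            continue
        }
        if l.TraceID == traceID {
            out = append(out, l)
        }
    }
    return out
}

func main() {
    logs := `{"peer":"peer-1","trace_id":"test-tc002-abc123","event":"file_download_started"}
{"peer":"peer-2","trace_id":"test-tc001-zzz","event":"dht_lookup"}
{"peer":"peer-1","trace_id":"test-tc002-abc123","event":"file_download_finished"}`
    for _, l := range filterByTrace(logs, "test-tc002-abc123") {
        fmt.Println(l.Peer, l.Event)
    }
}
```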

Phase 6: Polish & Documentation (Weeks 11-12)

Goal: Production-ready framework with excellent docs

Deliverables:

  • TC-301: Unauthorized file access
  • TC-302: DHT Sybil attack (basic version)
  • Comprehensive CLI documentation
  • Video tutorial (screencast of interactive usage)
  • Example demo scripts
  • Parallel test execution in CI
  • Test result caching (skip unchanged tests)
  • Nightly stress test runs
  • Security test suite in separate workflow
  • Badge generation (test pass rate, coverage)
  • Integration verification (ensure CLI + tests share code)

Success Criteria:

  • New developer can use CLI productively in < 15 minutes
  • Video tutorial demonstrates P2P sync visually
  • CI pipeline completes in < 15 minutes (basic tests)
  • Nightly stress tests run without supervision
  • Security tests detect unauthorized access attempts
  • Test failures block PR merges

GitHub Actions Workflow Structure:

name: E2E Tests

on:
  pull_request:
  push:
    branches: [main]
  schedule:
    - cron: '0 2 * * *'  # Nightly at 2 AM

jobs:
  basic-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: docker/setup-buildx-action@v2
      - name: Build Mau image
        run: make -C e2e build-image
      - name: Run basic tests
        run: make -C e2e test-basic
      - name: Upload logs
        if: failure()
        uses: actions/upload-artifact@v3
        with:
          name: test-logs-basic
          path: e2e/test-results/

  chaos-tests:
    runs-on: ubuntu-latest
    steps:
      # Similar structure

  stress-tests:
    runs-on: ubuntu-latest
    if: github.event_name == 'schedule'  # Only nightly
    steps:
      # Run TC-202 (100-peer test)

Example Test Case Walkthrough

Test: TC-002 - Two-Peer Friend Sync

File: e2e/scenarios/basic/friend_sync_test.go

package basic

import (
    "strings"
    "testing"
    "time"

    "github.com/mau-network/mau/e2e/framework/assertions"
    "github.com/mau-network/mau/e2e/framework/testenv"
    "github.com/stretchr/testify/assert"
    "github.com/stretchr/testify/require"
)

func TestTwoPeerFriendSync(t *testing.T) {
    // Step 1: Create test environment
    env := testenv.NewTestEnv(t)
    defer env.Cleanup() // Ensures cleanup even on test failure

    // Step 2: Start two Mau peers
    peerA, err := env.AddPeer("peer-a")
    require.NoError(t, err, "Failed to create peer A")

    peerB, err := env.AddPeer("peer-b")
    require.NoError(t, err, "Failed to create peer B")

    // Step 3: Exchange public keys and establish friend relationship
    err = env.MakeFriends(peerA, peerB)
    require.NoError(t, err, "Failed to establish friendship")

    // Step 4: Verify friend relationship from both sides
    assertions.AssertFriendRelationship(t, peerA, peerB)
    assertions.AssertFriendRelationship(t, peerB, peerA)

    // Step 5: Peer A creates a file encrypted for Peer B
    fileContent := "Hello from Peer A!"
    err = peerA.AddFile("hello.txt", strings.NewReader(fileContent), []string{peerB.Fingerprint()})
    require.NoError(t, err, "Failed to create file on peer A")

    // Step 6: Wait for synchronization (with timeout)
    syncTimeout := 30 * time.Second
    err = assertions.WaitForFile(t, peerB, "hello.txt", syncTimeout)
    require.NoError(t, err, "File did not sync to peer B within timeout")

    // Step 7: Verify file content matches
    content, err := peerB.ReadFile("hello.txt")
    require.NoError(t, err, "Failed to read file from peer B")
    assert.Equal(t, fileContent, content, "File content mismatch")

    // Step 8: Verify file is encrypted (PGP format)
    rawContent, err := peerB.ReadFileRaw("hello.txt")
    require.NoError(t, err, "Failed to read raw file")
    assert.Contains(t, rawContent, "-----BEGIN PGP MESSAGE-----", "File not encrypted")

    // Step 9: Check synchronization logs for debugging
    logs := peerB.GetLogs()
    assert.Contains(t, logs, "file_download_completed", "Sync event not logged")
}

How This Test Executes:

  1. Test Environment Creation:

    • testenv.NewTestEnv(t) creates isolated Docker network mau-test-<uuid>
    • Initializes cleanup handlers
  2. Peer A Container Startup:

    • Pulls/uses mau-e2e:latest image
    • Generates PGP account or uses pre-generated
    • Starts HTTP server on random port (mapped to host)
    • Joins DHT with bootstrap node (if configured)
    • Exposes health endpoint: GET /health
    • Testcontainers waits for healthy status (max 30s)
  3. Peer B Container Startup:

    • Same process as Peer A
    • Different fingerprint, different port
  4. Friend Relationship Setup:

    • Test coordinator extracts Peer B’s public key via API: GET /p2p/<peer-b-fpr>/account.pgp
    • Injects into Peer A’s keyring: POST /admin/friends (test-only endpoint)
    • Repeats in reverse direction
    • Verifies keyring files created: .mau/<peer-fpr>.pgp
  5. File Creation:

    • Test coordinator calls Peer A API: POST /admin/files
      {
        "name": "hello.txt",
        "content": "SGVsbG8gZnJvbSBQZWVyIEEh",  // base64
        "encrypt_for": ["BBAF..."]  // Peer B fingerprint
      }
      
    • Peer A encrypts file with Peer B’s public key
    • Writes to <peer-a-fpr>/hello.txt.pgp
  6. Synchronization:

    • Peer B periodically polls Peer A: GET /p2p/<peer-a-fpr> (If-Modified-Since header)
    • Response includes hello.txt metadata
    • Peer B downloads: GET /p2p/<peer-a-fpr>/hello.txt
    • Verifies signature, decrypts, writes to local storage
  7. Assertion:

    • Test coordinator calls Peer B: GET /admin/files/hello.txt
    • Decrypts and returns plaintext
    • Compares with original content
  8. Cleanup:

    • defer env.Cleanup() triggers
    • Stops containers
    • Collects logs to e2e/test-results/<test-name>/
    • Removes Docker network
    • On failure: preserves container state for debugging

Execution Time: ~15 seconds (including container startup)


Framework Comparison

Approach 1: Pure Docker Compose

How it works:

  • Define all peers in docker-compose.yml
  • Use shell scripts to orchestrate (docker-compose up/down)
  • Manual assertion via docker exec commands

Pros:

  • ✅ Simple to understand
  • ✅ Easy to run manually for debugging
  • ✅ No Go dependencies

Cons:

  • ❌ Not programmatic - hard to parameterize (N peers)
  • ❌ Poor test isolation (shared Docker Compose project)
  • ❌ Manual cleanup prone to errors
  • ❌ Difficult to integrate with go test
  • ❌ No automatic log collection on failure

Verdict: ❌ Not Recommended - Good for manual exploration, bad for automated testing

Approach 2: Testcontainers-Go

How it works:

  • Go test code creates containers programmatically
  • Full control over lifecycle, networking, configuration
  • Native integration with go test

Pros:

  • ✅ Type-safe, programmatic control
  • ✅ Automatic cleanup with defer
  • ✅ Parameterized tests (easy to vary peer count)
  • ✅ Test isolation (each test gets unique network)
  • ✅ Rich ecosystem (wait strategies, log streaming)
  • ✅ CI-friendly (integrates with GitHub Actions)

Cons:

  • ⚠️ Requires Docker daemon (already needed for Mau development)
  • ⚠️ Learning curve for Testcontainers API (well-documented)

Verdict: ✅ Recommended - Best balance of control and maintainability


Approach 3: Kubernetes-based (e.g., kind, k3d)

How it works:

  • Deploy Mau peers as Kubernetes pods
  • Use Kubernetes CRDs for test orchestration
  • Tools: Kubetest2, Chainsaw, Sonobuoy

Pros:

  • ✅ Production-like environment
  • ✅ Advanced networking (NetworkPolicies for partitions)
  • ✅ Resource management (CPU/memory limits)

Cons:

  • ❌ Massive overkill for Mau’s scope
  • ❌ Slow startup time (k8s cluster initialization)
  • ❌ Complex debugging
  • ❌ CI resource intensive

Verdict: ❌ Not Recommended - Overkill, stick with Docker


Approach 4: Custom Test Harness (like Ethereum Hive)

How it works:

  • Build custom orchestration tool in Go
  • Test scenarios as separate binaries
  • Client implementations containerized

Pros:

  • ✅ Maximum flexibility
  • ✅ Client-agnostic (could test Rust/Python Mau implementations)
  • ✅ Reusable across projects

Cons:

  • ❌ Huge development effort (weeks to build harness)
  • ❌ Maintenance burden
  • ❌ Not justified for single-implementation project (Mau only has Go impl)

Verdict: ⚠️ Overkill Now, Revisit Later - Good if Mau gets multiple implementations


Recommendation Matrix

Criterion                 Docker Compose   Testcontainers-Go   Kubernetes   Custom Harness
Ease of Use               ★★★★☆            ★★★☆☆               ★☆☆☆☆        ★★☆☆☆
Programmatic Control      ★☆☆☆☆            ★★★★★               ★★★☆☆        ★★★★★
Test Isolation            ★★☆☆☆            ★★★★★               ★★★★★        ★★★★☆
CI/CD Integration         ★★☆☆☆            ★★★★★               ★★★☆☆        ★★★☆☆
Debugging                 ★★★★☆            ★★★★☆               ★★☆☆☆        ★★★☆☆
Maintenance               ★★★☆☆            ★★★★☆               ★☆☆☆☆        ★★☆☆☆
Scalability (100+ peers)  ★★☆☆☆            ★★★★☆               ★★★★★        ★★★★☆
Setup Time                5 min            15 min              60 min       120 min

Final Recommendation: Testcontainers-Go for automated tests, with Docker Compose retained for manual debugging


CI/CD Integration

GitHub Actions Workflow Design

File: .github/workflows/e2e-tests.yml

name: E2E Tests

on:
  pull_request:
    paths:
      - '**.go'
      - 'e2e/**'
      - '.github/workflows/e2e-tests.yml'
  push:
    branches: [main, develop]
  schedule:
    - cron: '0 2 * * *'  # Nightly stress tests

env:
  GO_VERSION: '1.21'
  DOCKER_BUILDKIT: 1

jobs:
  build-image:
    name: Build Mau E2E Image
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Build and export
        uses: docker/build-push-action@v5
        with:
          context: .
          file: e2e/docker/Dockerfile.mau
          tags: mau-e2e:${{ github.sha }}
          outputs: type=docker,dest=/tmp/mau-e2e.tar

      - name: Upload image artifact
        uses: actions/upload-artifact@v4
        with:
          name: mau-e2e-image
          path: /tmp/mau-e2e.tar
          retention-days: 1

  test-basic:
    name: Basic Tests (Level 1)
    needs: build-image
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v4

      - name: Set up Go
        uses: actions/setup-go@v4
        with:
          go-version: ${{ env.GO_VERSION }}

      - name: Download image
        uses: actions/download-artifact@v4
        with:
          name: mau-e2e-image
          path: /tmp

      - name: Load image
        run: docker load --input /tmp/mau-e2e.tar

      - name: Run basic tests
        run: |
          cd e2e
          go test -v -timeout 10m ./scenarios/basic/...
        env:
          MAU_E2E_IMAGE: mau-e2e:${{ github.sha }}

      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: test-results-basic
          path: e2e/test-results/

  test-resilience:
    name: Resilience Tests (Level 2)
    needs: build-image
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      # Similar to test-basic
      - name: Run resilience tests
        run: |
          cd e2e
          go test -v -timeout 25m ./scenarios/resilience/...

  test-stress:
    name: Stress Tests (Level 3)
    needs: build-image
    runs-on: ubuntu-latest-8-cores  # Larger runner
    if: github.event_name == 'schedule' || contains(github.event.head_commit.message, '[stress]')
    timeout-minutes: 60
    steps:
      # Similar to test-basic
      - name: Run stress tests
        run: |
          cd e2e
          go test -v -timeout 50m ./scenarios/stress/...

  test-security:
    name: Security Tests (Level 4)
    needs: build-image
    runs-on: ubuntu-latest
    timeout-minutes: 20
    steps:
      - name: Run security tests
        run: |
          cd e2e
          go test -v -timeout 15m ./scenarios/security/...

  report:
    name: Generate Test Report
    needs: [test-basic, test-resilience, test-security]
    if: always()
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Download all results
        uses: actions/download-artifact@v4
        with:
          path: all-results

      - name: Generate HTML report
        run: |
          cd e2e
          ./scripts/generate-report.sh ../all-results

      - name: Upload report
        uses: actions/upload-artifact@v4
        with:
          name: test-report
          path: e2e/report.html

      - name: Comment PR
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            // Parse test results and post summary comment

Optimization Strategies

  1. Parallel Test Execution:

    • Use t.Parallel() in Go tests where safe
    • Run test levels (basic/resilience/stress) in parallel jobs
    • Resource limits: max 4 parallel stress tests
  2. Image Caching:

    • Cache Mau Docker image layers in GitHub Actions
    • Only rebuild on source changes
    • Use Docker Buildx cache export
  3. Test Result Caching:

    • Hash test inputs (code + config)
    • Skip tests if hash matches previous run
    • Stored in GitHub Actions cache
  4. Fast Failure:

    • Run basic tests first
    • Fail fast on basic test failures
    • Stress tests only on nightly or manual trigger
  5. Resource Management:

    • Limit concurrent containers per test (max 20)
    • Use Docker resource limits (CPU/memory)
    • Clean up orphaned containers with reaper

Debugging & Observability

Log Collection Strategy

Structured Logs:
All Mau peers emit JSON logs to stdout:

{
  "timestamp": "2026-02-19T14:30:00Z",
  "level": "info",
  "peer_id": "peer-a",
  "fingerprint": "ABAF11C65A2970B130ABE3C479BE3E4300411886",
  "trace_id": "tc002-run42",
  "component": "sync",
  "event": "file_download_started",
  "file": "hello.txt",
  "source_peer": "peer-b",
  "bytes": 1024
}

Fields:

  • trace_id: Links all logs from one test execution
  • peer_id: Container name (e.g., peer-a)
  • component: dht, sync, server, keyring
  • event: Structured event name

Collection:

  • Testcontainers auto-captures stdout/stderr
  • On test failure: dump to e2e/test-results/<test-name>/logs/<peer-id>.json
  • Use jq for filtering: jq 'select(.component == "sync")' peer-a.json

State Snapshots

What to capture on test failure:

  1. File System State:

    • Peer directory trees (.mau/, <fpr>/)
    • tar -czf peer-a-files.tar.gz /data
  2. DHT Routing Tables:

    • Admin API: GET /admin/dht/routing-table
    • Save as JSON
  3. Friend Lists:

    • Admin API: GET /admin/friends
    • Shows keyring state
  4. Container Stats:

    • docker stats snapshot (CPU/memory usage)
    • Helps detect resource exhaustion
  5. Network State:

    • Active Toxiproxy toxics
    • Container connectivity matrix

Automated Snapshot Script:

func (e *TestEnv) CaptureSnapshot(testName string) error {
    snapshotDir := filepath.Join("test-results", testName, "snapshots")
    if err := os.MkdirAll(snapshotDir, 0755); err != nil {
        return err
    }

    for _, peer := range e.peers {
        // Capture file tree
        peer.ExecTar("/data", filepath.Join(snapshotDir, peer.Name+"-files.tar.gz"))

        // Capture DHT state
        dht, _ := peer.GetDHTState()
        writeJSON(filepath.Join(snapshotDir, peer.Name+"-dht.json"), dht)

        // Capture logs
        logs := peer.GetLogs()
        os.WriteFile(filepath.Join(snapshotDir, peer.Name+".log"), []byte(logs), 0644)
    }
    return nil
}

Debugging Workflow

When a test fails:

  1. Check CI Artifacts:

    • Download test-results-<level>.zip
    • Extract to local machine
  2. Read Test Summary:

    • test-results/<test-name>/summary.json
    • Shows which assertion failed
  3. Filter Logs by Trace ID:

    cd test-results/TestTwoPeerFriendSync/logs
    jq '. | select(.trace_id == "tc002-run42")' peer-*.json | less
    
  4. Inspect File State:

    tar -xzf snapshots/peer-a-files.tar.gz
    tree data/
    
  5. Reproduce Locally:

    # Use Docker Compose for manual control
    cd e2e
    docker-compose -f docker/docker-compose.yml up
    # Manually trigger actions via API
    curl -X POST http://localhost:8080/admin/files -d '...'
    
  6. Enable Verbose Logging:

    // In test file
    env.SetLogLevel("debug")  // Enables DEBUG level logs
    
  7. Pause Test on Failure:

    if t.Failed() {
        fmt.Println("Test failed, containers still running. Press enter to cleanup...")
        bufio.NewReader(os.Stdin).ReadString('\n')
    }
    

Observability Tools

Tool               Purpose                             Integration
jq                 Log filtering/analysis              Manual, CI scripts
Docker logs        Real-time log tailing               docker logs -f <container>
Docker stats       Resource monitoring                 docker stats during test
Wireshark/tcpdump  Network traffic capture (advanced)  Manual debugging
Grafana/Loki       Log aggregation (future)            Optional for large test suites

Open Questions & Future Work

Open Questions

  1. DHT Bootstrap Strategy:

    • Should tests use a dedicated bootstrap node or peer-to-peer discovery?
    • Trade-off: Bootstrap node simplifies setup but adds dependency
  2. Test Data Persistence:

    • Should test results be stored in Git for trend analysis?
    • Or use external service (TestRail, Allure)?
  3. Performance Baselines:

    • What is acceptable sync time for 10 peers? 100 peers?
    • Need empirical data to set thresholds
  4. Chaos Test Reproducibility:

    • How to ensure random failures are reproducible?
    • Solution: Seed-based randomness with seed in test name
  5. Security Test Scope:

    • How deep should Sybil attack testing go?
    • May require S/Kademlia implementation first
  6. Test Environment Variables:

    • Should tests read config from env vars (for CI tuning)?
    • Or strictly use code-defined configs?

Future Enhancements

Phase 7+: Advanced Features

  1. Visual Test Reports:

    • HTML dashboard with pass/fail trends
    • Peer graph visualization (D3.js)
    • Timeline view of peer interactions
  2. Mutation Testing:

    • Inject bugs into Mau code
    • Verify E2E tests catch them
    • Measures test effectiveness
  3. Fuzz Testing Integration:

    • Use go-fuzz to generate file content
    • Test PGP encryption with malformed keys
    • Kademlia message fuzzing
  4. Performance Regression Detection:

    • Store sync time metrics in database
    • Alert on >20% slowdown
    • Integration with GitHub Status Checks
  5. Multi-Platform Testing:

    • Test on ARM64 (e.g., Raspberry Pi simulation)
    • Windows containers (if Mau supports)
  6. Record/Replay:

    • Record network interactions during test
    • Replay for deterministic debugging
    • Tools: VCR, go-replay
  7. Chaos Mesh Integration:

    • More advanced chaos scenarios
    • CPU/memory pressure testing
    • Clock skew simulation (important for PGP timestamp validation)
  8. Contract Testing:

    • Verify HTTP API backwards compatibility
    • Pact or OpenAPI validation

Conclusion

This E2E testing framework is designed to:

  • Validate Mau’s core P2P functionality (discovery, sync, friend management)
  • Detect regressions early via automated CI/CD integration
  • Simulate real-world conditions (network failures, high latency, peer churn)
  • Scale from 2 to 100+ peers with minimal test code changes
  • Provide excellent debugging with rich logs, state snapshots, and artifacts
  • Remain maintainable with clear structure and comprehensive documentation

Next Steps

  1. Review this plan with the team
  2. Approve technology choices (Testcontainers-Go, Toxiproxy)
  3. Prioritize test scenarios (start with TC-001, TC-002)
  4. Begin Phase 1 implementation
  5. Iterate based on real-world bugs found

Success Metrics

After 6 months of use:

  • Test coverage: >80% of P2P scenarios
  • Bug detection: >10 bugs caught before production
  • Developer adoption: New contributors can add tests in <1 hour
  • CI reliability: <1% flaky test rate
  • Debugging time: <30 minutes to root-cause failures

Document Version: 1.0
Last Updated: 19 February 2026
Reviewers: [To be assigned]
Status: Awaiting Review