CI/CD Integration¶
Run deterministic test scripts in CI/CD pipelines without an LLM.
Clicker Binary Required
E2E tests require the W3Pilot clicker binary, which is not yet publicly distributed. The examples below assume you have access to the clicker binary. See Prerequisites for details.
Overview¶
┌────────────────────────────────────────────────────────────────────────┐
│                          Development Workflow                          │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│  Developer writes         LLM explores &        Script saved           │
│  Markdown test plan  ──▶  records actions  ──▶  to repo                │
│                           (with MCP)                                   │
│                                                                        │
├────────────────────────────────────────────────────────────────────────┤
│                             CI/CD Pipeline                             │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│  git push  ──▶  CI runs  ──▶  w3pilot run test.json  ──▶  Pass/Fail    │
│                 headless      (no LLM needed)                          │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
Benefits¶
| Benefit | Description |
|---|---|
| No LLM costs | Scripts run without API calls |
| Deterministic | Same inputs → same outputs |
| Fast | No LLM latency |
| Auditable | Scripts are version-controlled |
| Parallelizable | Run multiple scripts concurrently |
GitHub Actions¶
Basic Workflow¶
name: E2E Tests
on:
  workflow_dispatch:
    inputs:
      clicker_url:
        description: 'URL to download clicker binary'
        required: true
        type: string
jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.22'
      - name: Download Clicker
        run: |
          curl -L -o clicker "${{ github.event.inputs.clicker_url }}"
          chmod +x clicker
          echo "W3PILOT_CLICKER_PATH=$PWD/clicker" >> $GITHUB_ENV
      - name: Install W3Pilot CLI
        run: go install github.com/plexusone/w3pilot/cmd/w3pilot@latest
      - name: Run E2E Tests
        env:
          W3PILOT_HEADLESS: "1"
        run: |
          w3pilot run tests/login.json
          w3pilot run tests/checkout.json
Manual Trigger
Until the clicker is publicly distributed, E2E workflows should use workflow_dispatch with a clicker_url input rather than automatic triggers on push/PR.
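The Download Clicker step above uses plain `curl -L`, which saves an HTTP error page as `clicker` when the URL is wrong or has expired, so the failure only surfaces later when the tests start. A minimal hardening sketch follows; `fetch_clicker` is a hypothetical helper name for illustration, not part of W3Pilot:

```shell
#!/usr/bin/env bash
# fetch_clicker URL [DEST]: download the clicker binary and make it executable.
# Hypothetical helper; only standard curl flags are used.
fetch_clicker() {
  local url="$1" dest="${2:-clicker}"
  # -f: exit non-zero on HTTP errors (404, 403) instead of saving the error body.
  if ! curl -fsSL -o "$dest" "$url"; then
    echo "clicker download failed" >&2
    return 1
  fi
  # An empty file means the download produced nothing usable.
  if [ ! -s "$dest" ]; then
    echo "clicker download is empty: $dest" >&2
    return 1
  fi
  chmod +x "$dest"
}
```

In a workflow step this could be called as `fetch_clicker "$CLICKER_URL"` before writing `W3PILOT_CLICKER_PATH` to `$GITHUB_ENV`. The URL is deliberately not echoed, since it may be a signed link.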
Matrix Strategy¶
Run tests in parallel:
jobs:
  e2e:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        test: [smoke, auth, checkout, search]
    steps:
      - uses: actions/checkout@v4
      - name: Download Clicker
        run: |
          curl -L -o clicker "${{ github.event.inputs.clicker_url }}"
          chmod +x clicker
          echo "W3PILOT_CLICKER_PATH=$PWD/clicker" >> $GITHUB_ENV
      - name: Setup
        run: go install github.com/plexusone/w3pilot/cmd/w3pilot@latest
      - name: Run ${{ matrix.test }} tests
        env:
          W3PILOT_HEADLESS: "1"
        run: |
          for script in tests/${{ matrix.test }}/*.json; do
            w3pilot run "$script"
          done
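GitHub Actions runs `run` steps with `bash -e` by default, so the loop above stops at the first failing script and later scripts in the same suite are never exercised. If a full failure list per suite is more useful, one option is a small wrapper that keeps going and fails at the end. This is a sketch; `run_all` is a hypothetical name, and it assumes only that `w3pilot run` exits non-zero on failure:

```shell
#!/usr/bin/env bash
# run_all DIR: run every .json script in DIR, report each failure,
# and return non-zero if any script failed.
run_all() {
  local dir="$1" failed=0 script
  for script in "$dir"/*.json; do
    # Skip the unexpanded pattern when the directory has no .json files.
    [ -e "$script" ] || continue
    if ! w3pilot run "$script"; then
      echo "FAILED: $script" >&2
      failed=$((failed + 1))
    fi
  done
  # Arithmetic yields 1 when at least one script failed, 0 otherwise.
  return "$((failed > 0))"
}
```

Inside the matrix job the step body would then be `run_all "tests/${{ matrix.test }}"`.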
Upload Artifacts on Failure¶
- name: Run tests
  run: w3pilot run tests/e2e.json
- name: Upload screenshots on failure
  if: failure()
  uses: actions/upload-artifact@v4
  with:
    name: screenshots
    path: screenshots/
    retention-days: 7
Scheduled Runs¶
GitLab CI¶
e2e:
  image: golang:1.22
  variables:
    CLICKER_URL: "$CLICKER_URL"  # Set in CI/CD variables
    W3PILOT_HEADLESS: "1"
  before_script:
    - curl -L -o clicker "$CLICKER_URL"
    - chmod +x clicker
    - export W3PILOT_CLICKER_PATH=$PWD/clicker
    - go install github.com/plexusone/w3pilot/cmd/w3pilot@latest
  script:
    - w3pilot run tests/smoke.json
    - w3pilot run tests/auth.json
  artifacts:
    when: on_failure
    paths:
      - screenshots/
    expire_in: 1 week
  rules:
    - when: manual  # Manual trigger until clicker is public
CircleCI¶
version: 2.1
jobs:
  e2e:
    docker:
      - image: cimg/go:1.22
    steps:
      - checkout
      - run:
          name: Download Clicker
          command: |
            curl -L -o clicker "$CLICKER_URL"
            chmod +x clicker
            echo "export W3PILOT_CLICKER_PATH=$PWD/clicker" >> $BASH_ENV
      - run:
          name: Install W3Pilot CLI
          command: go install github.com/plexusone/w3pilot/cmd/w3pilot@latest
      - run:
          name: Run E2E Tests
          environment:
            W3PILOT_HEADLESS: "1"
          command: |
            w3pilot run tests/smoke.json
            w3pilot run tests/auth.json
      - store_artifacts:
          path: screenshots
          destination: screenshots
workflows:
  test:
    jobs:
      # Manual approval until clicker is public.
      # An approval job only pauses the workflow; it never runs steps itself,
      # so e2e is a separate job that requires the approval.
      - hold:
          type: approval
      - e2e:
          requires:
            - hold
Jenkins¶
pipeline {
    agent any
    parameters {
        string(name: 'CLICKER_URL', description: 'URL to download clicker binary')
    }
    environment {
        W3PILOT_HEADLESS = '1'
    }
    stages {
        stage('Setup') {
            steps {
                sh '''
                    curl -L -o clicker "${CLICKER_URL}"
                    chmod +x clicker
                '''
                sh 'go install github.com/plexusone/w3pilot/cmd/w3pilot@latest'
            }
        }
        stage('E2E Tests') {
            environment {
                W3PILOT_CLICKER_PATH = "${WORKSPACE}/clicker"
            }
            steps {
                sh 'w3pilot run tests/smoke.json'
                sh 'w3pilot run tests/auth.json'
            }
        }
    }
    post {
        failure {
            archiveArtifacts artifacts: 'screenshots/**', fingerprint: true
        }
    }
}
Azure Pipelines¶
trigger: none  # Manual trigger until clicker is public
parameters:
  - name: clickerUrl
    displayName: 'Clicker Download URL'
    type: string
pool:
  vmImage: 'ubuntu-latest'
steps:
  - task: GoTool@0
    inputs:
      version: '1.22'
  - script: |
      curl -L -o clicker "${{ parameters.clickerUrl }}"
      chmod +x clicker
      echo "##vso[task.setvariable variable=W3PILOT_CLICKER_PATH]$(pwd)/clicker"
    displayName: 'Download Clicker'
  - script: go install github.com/plexusone/w3pilot/cmd/w3pilot@latest
    displayName: 'Install W3Pilot CLI'
  - script: |
      export W3PILOT_HEADLESS=1
      w3pilot run tests/smoke.json
      w3pilot run tests/auth.json
    displayName: 'Run E2E Tests'
  - task: PublishBuildArtifacts@1
    condition: failed()
    inputs:
      pathToPublish: 'screenshots'
      artifactName: 'screenshots'
Test Organization¶
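Recommended structure, with one directory per suite so each suite can run as its own parallel job (adjust the script paths in the pipeline examples to match your layout, e.g. tests/e2e/smoke/*.json):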
tests/
├── e2e/
│ ├── smoke/
│ │ ├── homepage.json
│ │ └── navigation.json
│ ├── auth/
│ │ ├── login.json
│ │ ├── logout.json
│ │ └── password-reset.json
│ ├── checkout/
│ │ ├── add-to-cart.json
│ │ └── purchase.json
│ └── search/
│ └── basic-search.json
├── plans/
│ ├── smoke.md # Markdown test plans
│ ├── auth.md
│ └── checkout.md
└── README.md
Environment Variables¶
| Variable | Description | Default |
|---|---|---|
| W3PILOT_HEADLESS | Run headless | false |
| W3PILOT_DEBUG | Enable debug logs | false |
| W3PILOT_CLICKER_PATH | Path to clicker | Auto-detect |
| W3PILOT_TIMEOUT | Default timeout | 30s |
Best Practices¶
1. Use Headless Mode¶
Always set W3PILOT_HEADLESS=1 in CI, since runners typically have no display.
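In a plain shell step, the simplest form is to export the variable once at the top of the script. The value `1` matches the pipeline examples in this document:

```shell
# Export once; every w3pilot command started later from this shell inherits it.
export W3PILOT_HEADLESS=1
```

CI systems with a per-job or per-step environment block (as in the GitHub Actions, GitLab CI, and CircleCI examples) can set it there instead.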
2. Set Appropriate Timeouts¶
CI environments may be slower than a local machine, so raise W3PILOT_TIMEOUT above its 30s default if steps time out only in CI.
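As a sketch, the timeout can be raised for the whole job with a single export. The `60s` value is illustrative, and the duration format is an assumption based on the `30s` default listed in the Environment Variables table:

```shell
# Double the documented 30s default for slower CI runners.
# Assumption: the variable accepts the same "<n>s" form as the default.
export W3PILOT_TIMEOUT=60s
```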
3. Capture Screenshots on Failure¶
Add screenshot steps for debugging:
{
"steps": [
{"action": "navigate", "url": "https://example.com"},
{"action": "screenshot", "file": "screenshots/step1.png"},
{"action": "click", "selector": "#submit"},
{"action": "screenshot", "file": "screenshots/step2.png"}
]
}
4. Use continueOnError for Non-Critical Steps¶
5. Parallelize Independent Tests¶
Use matrix strategies to run tests concurrently.
6. Version Control Test Scripts¶
- Store scripts in Git alongside code
- Review script changes in PRs
- Track test evolution over time
Debugging CI Failures¶
Enable Debug Logging¶
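Debug output is controlled by W3PILOT_DEBUG (see Environment Variables). The sketch below runs one script with debug logging and keeps the output in a file that can be uploaded as an artifact. `run_with_debug` and the log file name are hypothetical, and the value `1` is an assumption by analogy with W3PILOT_HEADLESS:

```shell
#!/usr/bin/env bash
# run_with_debug SCRIPT [LOG]: run one script with debug logging and keep the log.
run_with_debug() {
  local script="$1" log="${2:-w3pilot-debug.log}" status=0
  # Capture stdout and stderr together so the log file holds the full run.
  W3PILOT_DEBUG=1 w3pilot run "$script" >"$log" 2>&1 || status=$?
  # Echo the log to the job output, then propagate the original exit status.
  cat "$log"
  return "$status"
}
```

The exit status of `w3pilot run` is preserved, so the CI step still fails when the test does.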
Download Artifacts¶
Screenshots and logs uploaded as artifacts help debug failures.
Run Locally¶
Reproduce CI failures locally by running the same script with the same environment variables the pipeline sets.
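A minimal sketch of such a local run, mirroring the variables used in the pipeline examples. `repro_ci` is a hypothetical helper name, and the fallback clicker location (`./clicker` in the current directory) is an assumption:

```shell
#!/usr/bin/env bash
# repro_ci SCRIPT: run one script with the same variables the CI jobs set.
repro_ci() {
  local script="$1"
  # Keep an existing W3PILOT_CLICKER_PATH; otherwise assume ./clicker.
  W3PILOT_HEADLESS=1 \
  W3PILOT_CLICKER_PATH="${W3PILOT_CLICKER_PATH:-$PWD/clicker}" \
    w3pilot run "$script"
}
```

For example, `repro_ci tests/login.json`. Given the documented default of `false` for W3PILOT_HEADLESS, removing that line should make the run visible in a browser window, which can help when a failure does not reproduce headlessly.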
Accessibility Testing in CI/CD¶
For WCAG 2.2 accessibility testing in CI/CD, use agent-a11y:
name: Accessibility
on:
  workflow_dispatch:
    inputs:
      clicker_url:
        description: 'URL to download clicker binary'
        required: true
jobs:
  wcag:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Download Clicker
        run: |
          curl -L -o clicker "${{ github.event.inputs.clicker_url }}"
          chmod +x clicker
          echo "W3PILOT_CLICKER_PATH=$PWD/clicker" >> $GITHUB_ENV
      - name: Setup
        run: go install github.com/agentplexus/agent-a11y/cmd/agent-a11y@latest
      - name: Run WCAG 2.2 AA Evaluation
        env:
          W3PILOT_HEADLESS: "1"
        run: |
          agent-a11y vpat https://staging.example.com \
            --format json --output wcag-results.json
      - name: Upload WCAG Results
        uses: actions/upload-artifact@v4
        with:
          name: wcag-results
          path: wcag-results.json
agent-a11y combines:
- Automated testing (~40% coverage) - axe-core rule-based checks
- Specialized automation (~25% coverage) - keyboard, focus, reflow tests
- LLM-as-a-Judge (~25% coverage) - semantic evaluation (optional)
See the agent-a11y documentation for details.
Example: Complete Workflow¶
See .github/workflows/e2e.yaml in this repository for a complete example.