AI Agent Ergonomics¶
W3Pilot includes features designed specifically for AI agents to interact with web pages more effectively.
Overview¶
AI agents face unique challenges when automating browsers:
- Discovery: Finding interactive elements without prior knowledge of the page
- Validation: Verifying selectors before attempting interactions
- Recovery: Understanding what went wrong when actions fail
- State Management: Saving and restoring browser state across sessions
W3Pilot addresses these with purpose-built tools.
Page Inspection¶
The Inspect method scans the page and returns a structured inventory of interactive elements:
result, err := pilot.Inspect(ctx, nil)
if err != nil {
log.Fatal(err)
}
fmt.Printf("Page: %s\n", result.Title)
fmt.Printf("URL: %s\n", result.URL)
// Discover buttons
for _, btn := range result.Buttons {
fmt.Printf("Button: %s - %s\n", btn.Selector, btn.Text)
}
// Discover inputs
for _, input := range result.Inputs {
fmt.Printf("Input: %s (%s) - %s\n",
input.Selector, input.Type, input.Label)
}
Inspection Options¶
Control what elements to include:
opts := &w3pilot.InspectOptions{
IncludeButtons: true,
IncludeLinks: true,
IncludeInputs: true,
IncludeSelects: true,
IncludeHeadings: true,
IncludeImages: true,
MaxItems: 50, // Per category
}
result, err := pilot.Inspect(ctx, opts)
Inspect Result Structure¶
type InspectResult struct {
URL string
Title string
Buttons []InspectButton // button, input[type=submit], [role=button]
Links []InspectLink // a[href]
Inputs []InspectInput // input, textarea
Selects []InspectSelect // select
Headings []InspectHeading // h1-h6
Images []InspectImage // img[alt]
Summary InspectSummary
}
type InspectButton struct {
Selector string
Text string
Type string // button, submit, reset
Disabled bool
Visible bool
}
type InspectInput struct {
Selector string
Type string // text, password, email, etc.
Name string
Placeholder string
Value string // Masked for password fields
Label string // Associated label text
Required bool
Disabled bool
ReadOnly bool
Visible bool
}
MCP Tool¶
{
"name": "page_inspect",
"arguments": {
"include_buttons": true,
"include_links": false,
"max_items": 25
}
}
CLI Command¶
# Full inspection
w3pilot page inspect
# JSON output for parsing
w3pilot page inspect --format json
# Exclude categories
w3pilot page inspect --no-links --no-images
# Limit items
w3pilot page inspect --max-items 100
Selector Validation¶
Before interacting with elements, validate selectors to check existence and state:
results, err := pilot.ValidateSelectors(ctx, []string{
"#login-button",
"#username",
"#password",
"#nonexistent",
})
for _, v := range results {
if v.Found {
fmt.Printf("FOUND: %s (tag=%s, visible=%t, enabled=%t)\n",
v.Selector, v.TagName, v.Visible, v.Enabled)
} else {
fmt.Printf("NOT FOUND: %s\n", v.Selector)
if len(v.Suggestions) > 0 {
fmt.Printf(" Suggestions: %v\n", v.Suggestions)
}
}
}
Validation Result¶
type SelectorValidation struct {
Selector string
Found bool
Count int // Number of matching elements
Visible bool // First match is visible
Enabled bool // First match is enabled
TagName string // Tag name of first match
Suggestions []string // Alternative selectors if not found
}
MCP Tool¶
{
"name": "test_validate_selectors",
"arguments": {
"selectors": ["#login", ".submit-btn", "button[type=submit]"]
}
}
CLI Command¶
w3pilot test validate-selectors "#login" ".password" "button[type=submit]"
w3pilot test validate-selectors --format json "#nonexistent"
Workflow Recipes¶
High-level workflows combine multiple steps for common tasks.
Login Workflow¶
Automated login with credential filling and success verification:
result, err := pilot.Login(ctx, &w3pilot.LoginOptions{
UsernameSelector: "#email",
PasswordSelector: "#password",
SubmitSelector: "button[type=submit]",
Username: "user@example.com",
Password: "secret123",
SuccessIndicator: "/dashboard", // URL pattern
Timeout: 30 * time.Second,
})
if result.Success {
fmt.Printf("Logged in! Now at: %s\n", result.URL)
} else {
fmt.Printf("Login failed: %s\n", result.ErrorReason)
}
The SuccessIndicator can be:
- A URL pattern (starts with
/,*, orhttp) - A CSS selector (element that appears after successful login)
MCP Tool¶
{
"name": "workflow_login",
"arguments": {
"username_selector": "#email",
"password_selector": "#password",
"submit_selector": "button[type=submit]",
"username": "user@example.com",
"password": "secret123",
"success_indicator": "#dashboard-header"
}
}
Table Extraction¶
Extract HTML table data to structured JSON:
table, err := pilot.ExtractTable(ctx, "table.data", &w3pilot.ExtractTableOptions{
IncludeHeaders: true,
MaxRows: 100,
})
// Access as arrays
for _, row := range table.Rows {
fmt.Println(row) // []string{"value1", "value2", ...}
}
// Access as objects (keyed by headers)
for _, row := range table.RowsJSON {
fmt.Printf("Name: %s, Email: %s\n", row["Name"], row["Email"])
}
MCP Tool¶
{
"name": "workflow_extract_table",
"arguments": {
"selector": "table#users",
"include_headers": true,
"max_rows": 50
}
}
Named State Snapshots¶
Save and restore browser state (cookies, localStorage, sessionStorage) with named snapshots.
SDK Usage¶
// Save current state
state, err := pilot.StorageState(ctx)
if err != nil {
log.Fatal(err)
}
mgr, _ := state.NewManager("") // Uses ~/.w3pilot/states/
// Save to named snapshot
err = mgr.Save("logged-in-session", state)
// Later, restore the state
savedState, err := mgr.Load("logged-in-session")
pilot.SetStorageState(ctx, savedState)
pilot.Reload(ctx) // Apply the restored state
MCP Tools¶
// Save current state
{"name": "state_save", "arguments": {"name": "my-session"}}
// List saved states
{"name": "state_list"}
// Load a saved state
{"name": "state_load", "arguments": {"name": "my-session"}}
// Delete a saved state
{"name": "state_delete", "arguments": {"name": "old-session"}}
CLI Commands¶
# Save current browser state
w3pilot state save my-session
# List all saved states
w3pilot state list
# Load a saved state
w3pilot state load my-session
# Delete a saved state
w3pilot state delete my-session
State Storage Location¶
States are saved to ~/.w3pilot/states/{name}.json with metadata:
{
"name": "my-session",
"created_at": "2026-03-28T10:30:00Z",
"state": {
"cookies": [...],
"origins": [
{
"origin": "https://example.com",
"localStorage": {"key": "value"},
"sessionStorage": {"key": "value"}
}
]
}
}
Output Formatting¶
All CLI commands that return data support the --format flag:
# Default text output
w3pilot page title
# Example Domain
# JSON output
w3pilot page title --format json
# {"title":"Example Domain"}
# JSON for element queries
w3pilot element text "h1" --format json
# {"selector":"h1","text":"Example Domain"}
This enables AI agents to reliably parse command output.
Enhanced Error Context¶
When operations fail, errors include page context to help diagnose issues:
type ElementNotFoundError struct {
Selector string
PageContext *PageContext
Suggestions []string
}
type PageContext struct {
URL string
Title string
VisibleText string
}
This helps AI agents understand:
- What URL they're on when the error occurred
- What similar selectors might work instead
- The current page state for debugging
Best Practices for AI Agents¶
1. Inspect Before Acting¶
// First, understand the page
result, _ := pilot.Inspect(ctx, nil)
// Then interact with discovered elements
for _, input := range result.Inputs {
if input.Type == "email" {
elem, _ := pilot.Find(ctx, input.Selector, nil)
elem.Fill(ctx, "user@example.com", nil)
break
}
}
2. Validate Selectors¶
// Check selectors before using them
validations, _ := pilot.ValidateSelectors(ctx, selectors)
for _, v := range validations {
if !v.Found {
// Use suggestions or try alternatives
if len(v.Suggestions) > 0 {
selectors = append(selectors, v.Suggestions[0])
}
}
}
3. Use State Snapshots for Repeatability¶
# After manual login, save the state
w3pilot state save authenticated
# In automation, load the saved state
w3pilot state load authenticated
w3pilot page navigate https://example.com/dashboard