FEAT_INBOUND: Human-to-Agent Communication

Overview

Enable humans to send messages to AI coding agents through communication channels (Discord, WhatsApp, Twilio), allowing real-time interruption and interaction with local agent runtimes.

Direction terminology:

OUTBOUND (existing): Agent → MCP → Channel → Human
INBOUND (this feature): Human → Channel → AgentBridge → Agent

Problem Statement

AgentComms currently lets agents contact humans (OUTBOUND), but humans cannot interrupt or message agents in return. Because MCP is pull-based (the agent initiates every tool call), a human message sits undelivered until the agent next polls.

For coding assistants like Claude Code or Codex CLI running in tmux, users need the ability to:

Interrupt a running agent ("stop", "pause")
Send follow-up instructions without waiting for the agent to ask
Coordinate multiple agents through separate channels
Monitor agent activity remotely

Goals

Real-time delivery: Human messages reach agents within seconds
Interrupt support: Agents can be stopped/paused mid-task
Multi-agent routing: Each agent has its own channel/conversation
Local-first: Runs on a laptop without cloud infrastructure
Cloud-ready: Architecture supports future multi-tenant deployment
Event-sourced: All communication is logged for replay/debugging

Non-Goals

Building a new agent runtime (use existing Claude Code, Codex CLI, etc.)
Replacing MCP (INBOUND complements OUTBOUND, both coexist)
Real-time collaboration features (Google Docs-style)

Architecture

High-Level Flow

Human
  │
  ▼
Communication Channel (Discord / WhatsApp / Twilio)
  │
  ▼
AgentComms Daemon
  │
  ├── Event Store (JSONL + SQLite)
  │
  ├── Actor Router
  │
  └── AgentBridge
        │
        ▼
      Local Agent (tmux / CLI process)

Components

1. Event Store

Append-only event log with query index.

Storage:

~/.agentcomms/events/{agent_id}.jsonl - append-only log (source of truth)
~/.agentcomms/events.db - SQLite index for queries

Event Schema:

type Event struct {
    ID             string         `json:"id"`
    TenantID       string         `json:"tenant_id"`       // "local" for single-tenant
    ConversationID string         `json:"conversation_id"`
    AgentID        string         `json:"agent_id"`
    HumanID        *string        `json:"human_id,omitempty"`
    ChannelID      string         `json:"channel_id"`
    Type           string         `json:"type"`
    Role           string         `json:"role"`
    Timestamp      time.Time      `json:"timestamp"`
    Payload        map[string]any `json:"payload"`
    Refs           []string       `json:"refs,omitempty"`
    Status         string         `json:"status"`
}

Event Types:

Type	Description
`human_message`	Message from human via chat/phone
`agent_message`	Agent output to human or another agent
`tool_call`	Agent invoking a tool
`tool_result`	Result from tool execution
`interrupt`	Stop/pause/cancel request
`system`	System events (agent started, errors)
`voice_transcript`	Phone call transcript

2. Actor Router

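One goroutine per agent: each agent's events are handled sequentially by a single goroutine, so per-agent state is never accessed concurrently.

// AgentActor owns one agent's inbox; only its goroutine touches that agent's state.
type AgentActor struct {
    id    string
    inbox chan Event
}

// Start launches the actor loop, which handles events one at a time
// until ctx is cancelled.
func (a *AgentActor) Start(ctx context.Context) {
    go func() {
        for {
            select {
            case evt := <-a.inbox:
                a.handle(evt) // handle (not shown) forwards the event to the agent's adapter
            case <-ctx.Done():
                return
            }
        }
    }()
}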

Key properties:

Each agent has its own goroutine and channel
Events processed sequentially per agent (no locks needed)
Router dispatches events to correct agent actor

3. AgentBridge

Adapters connecting events to local agent runtimes.

tmux Adapter:

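// TmuxAdapter delivers input to an agent running in a tmux pane.
type TmuxAdapter struct {
    session string
    pane    string
}

// target builds the send-keys target. tmux reads "session:N" as window N
// of the session; a specific pane inside a window is "session:window.pane".
func (t *TmuxAdapter) target() string {
    return fmt.Sprintf("%s:%s", t.session, t.pane)
}

// Send types msg into the pane, then presses Enter. The -l flag makes tmux
// treat msg as literal text, so a message such as "Enter" or "C-c" is typed
// rather than interpreted as a key name.
func (t *TmuxAdapter) Send(msg string) error {
    if err := exec.Command("tmux", "send-keys", "-t", t.target(),
        "-l", "--", msg).Run(); err != nil {
        return err
    }
    return exec.Command("tmux", "send-keys", "-t", t.target(), "Enter").Run()
}

// Interrupt sends Ctrl-C to the pane.
func (t *TmuxAdapter) Interrupt() error {
    return exec.Command("tmux", "send-keys", "-t", t.target(), "C-c").Run()
}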

Process Adapter:

For agents that run as child processes of the daemon and communicate over stdin/stdout.

4. Transport Integration

Extend existing chat/voice managers to publish INBOUND events.

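// In the Discord handler
func (d *DiscordTransport) onMessage(m *discordgo.MessageCreate) {
    // Skip bot-authored messages so the agent's own OUTBOUND posts
    // are not fed back to it as INBOUND events.
    if m.Author == nil || m.Author.Bot {
        return
    }
    evt := Event{
        Type: "human_message",
        // m.ChannelID is Discord's numeric channel ID. The agent config keys
        // channels by name, so one side must be translated before lookup.
        ChannelID: fmt.Sprintf("discord:%s", m.ChannelID),
        Timestamp: time.Now().UTC(),
        Payload:   map[string]any{"text": m.Content},
    }
    d.eventBus.Publish(evt)
}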
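
The handler above publishes to an event bus that this document names but does not define. A minimal in-memory version, assuming a fan-out design with buffered subscriber channels, might look like this; EventBus, Subscribe, and the drop-on-full policy are all hypothetical.

```go
package main

import "sync"

// Event carries only the fields this sketch needs.
type Event struct {
	Type      string
	ChannelID string
	Payload   map[string]any
}

// EventBus (hypothetical) fans each published event out to every subscriber.
type EventBus struct {
	mu   sync.RWMutex
	subs []chan Event
}

// Subscribe returns a buffered channel that receives every later event.
func (b *EventBus) Subscribe(buffer int) <-chan Event {
	ch := make(chan Event, buffer)
	b.mu.Lock()
	b.subs = append(b.subs, ch)
	b.mu.Unlock()
	return ch
}

// Publish offers evt to all subscribers without blocking and reports how
// many subscribers missed it because their buffer was full.
func (b *EventBus) Publish(evt Event) (dropped int) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	for _, ch := range b.subs {
		select {
		case ch <- evt:
		default:
			dropped++
		}
	}
	return dropped
}
```

In this arrangement the event store and the actor router would each hold their own subscription, so a slow consumer never stalls the transport goroutine. Dropping on a full buffer conflicts with the goal of logging all communication, so a real store subscriber would probably need a blocking or persistent path instead.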

Directory Structure

agentcomms/
├── cmd/agentcomms/
├── internal/
│   ├── events/
│   │   ├── event.go         # Event type
│   │   ├── bus.go           # Event bus (pub/sub)
│   │   └── store.go         # JSONL + SQLite store
│   ├── router/
│   │   ├── router.go        # Actor router
│   │   └── agent_actor.go   # Per-agent actor
│   ├── agentbridge/
│   │   ├── adapter.go       # Adapter interface
│   │   ├── tmux.go          # tmux adapter
│   │   └── process.go       # Process adapter
│   └── transports/
│       ├── discord.go       # Extended for INBOUND
│       └── twilio.go        # Extended for INBOUND
├── pkg/
│   ├── chat/                # (existing)
│   ├── voice/               # (existing)
│   └── config/              # (existing)
└── ent/
    └── schema/
        ├── event.go
        ├── agent.go
        └── conversation.go

Configuration

Agent Registration

# ~/.agentcomms/config.yaml
agents:
  backend:
    type: tmux
    session: agents
    pane: "1"
    channel: "discord:agent-backend"

  frontend:
    type: tmux
    session: agents
    pane: "2"
    channel: "discord:agent-frontend"

  devops:
    type: process
    command: ["claude-code", "--agent"]
    channel: "discord:agent-devops"

Channel Mapping

Discord channels map to agents:

Discord Channel	Agent
#agent-backend	backend
#agent-frontend	frontend
#agent-devops	devops
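
One way to turn the agent table into a routing lookup is to invert it into a channel-to-agent map at startup. The sketch below assumes the config has already been parsed into Go structs; AgentConfig and ChannelIndex are hypothetical names.

```go
package main

import "fmt"

// AgentConfig mirrors one entry under `agents:` in config.yaml.
type AgentConfig struct {
	Type    string
	Session string
	Pane    string
	Command []string
	Channel string
}

// ChannelIndex (hypothetical helper) inverts the agent table into a
// channel -> agent ID lookup and rejects a channel claimed by two agents.
func ChannelIndex(agents map[string]AgentConfig) (map[string]string, error) {
	index := make(map[string]string, len(agents))
	for id, cfg := range agents {
		if cfg.Channel == "" {
			return nil, fmt.Errorf("agent %q has no channel", id)
		}
		if other, ok := index[cfg.Channel]; ok {
			return nil, fmt.Errorf("channel %q is mapped to both %q and %q", cfg.Channel, other, id)
		}
		index[cfg.Channel] = id
	}
	return index, nil
}
```

Failing fast on a duplicate channel keeps the one-channel-per-agent routing goal enforceable at startup rather than at message time.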

macOS Deployment

launchd Service

<?xml version="1.0" encoding="UTF-8"?>
<!-- ~/Library/LaunchAgents/com.agentcomms.plist -->
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.agentcomms</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/bin/agentcomms</string>
        <string>daemon</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
    <key>StandardOutPath</key>
    <string>/tmp/agentcomms.log</string>
    <key>StandardErrorPath</key>
    <string>/tmp/agentcomms.err</string>
</dict>
</plist>

CLI Commands

# Start daemon
agentcomms daemon

# Send message to agent
agentcomms send backend "run the tests"

# Interrupt agent
agentcomms interrupt backend

# List agents
agentcomms agents

# Tail agent events
agentcomms logs backend --follow

Cloud Evolution

Multi-Tenant Support

Local deployment uses tenant_id = "local".

Cloud deployment:

PostgreSQL replaces SQLite
Row-Level Security (RLS) enforces tenant isolation
NATS/Kafka replaces in-memory event bus

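-- PostgreSQL RLS policy (policies apply only once RLS is enabled on the table)
ALTER TABLE events ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON events
    USING (tenant_id = current_setting('app.tenant_id'));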

Ent Schema

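The same schema works for SQLite (local) and PostgreSQL (cloud):

// ent/schema/event.go
package schema

import (
    "entgo.io/ent"
    "entgo.io/ent/schema/field"
)

// Event is the Ent schema for the events index.
type Event struct {
    ent.Schema
}

// Fields mirrors the Event struct in the Event Store section.
func (Event) Fields() []ent.Field {
    return []ent.Field{
        field.String("id").Unique(),
        field.String("tenant_id").Default("local"),
        field.String("conversation_id"),
        field.String("agent_id"),
        field.String("human_id").Optional().Nillable(),
        field.String("channel_id"),
        field.String("type"),
        field.String("role"),
        field.Time("timestamp"),
        field.JSON("payload", map[string]any{}),
        field.Strings("refs").Optional(),
        field.String("status").Default("new"),
    }
}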

Implementation Phases

Phase 1: Core Infrastructure

[ ] Event type and bus
[ ] JSONL event store
[ ] SQLite index with Ent
[ ] Actor router

Phase 2: AgentBridge

[ ] Adapter interface
[ ] tmux adapter
[ ] Process adapter
[ ] Agent configuration

Phase 3: Transport Integration

[ ] Discord INBOUND handler
[ ] Twilio INBOUND handler (SMS)
[ ] WhatsApp INBOUND handler

Phase 4: CLI & Daemon

[ ] agentcomms daemon command
[ ] agentcomms send/interrupt/logs commands
[ ] launchd plist generator
[ ] Log rotation

Phase 5: Cloud Readiness

[ ] PostgreSQL driver support
[ ] tenant_id propagation
[ ] Cloud sync client (optional)

Success Metrics

Message delivery latency < 2 seconds
Zero race conditions under concurrent agent load
Event replay reproduces exact conversation state
Daemon stability > 7 days without restart

Open Questions

MCP integration: Should INBOUND events also be exposed via MCP resources/notifications for agents that poll?
Voice interrupts: How should phone call interrupts work (human says "stop" → transcription → interrupt event)?
Agent discovery: Auto-detect tmux panes vs explicit configuration?
Rate limiting: Prevent message flooding to agents?

References

Beads - Task graph architecture inspiration
MCP Specification - Model Context Protocol
Claude Code - Target agent runtime