# RubiPi – Minimal learning-focused build plan
## Goal
Build a **small, readable Ruby port** of the core “pi” experience to learn how an agent works end-to-end:
- send messages to an LLM
- stream the response
- handle tool calls (function calling)
- execute local tools
- feed tool results back to the model
- repeat until the model finishes
This is not a feature-complete clone. It’s a “core loop + thin CLI” project.
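Concretely, the whole project is an elaboration of a loop like the one below. This is a rough conceptual sketch only; `run_agent`, `model.ask`, and the message shapes are placeholders for illustration, not the API we'll end up with.

```ruby
# Conceptual sketch of the core agent loop (names and shapes are placeholders).
def run_agent(user_prompt, model:, tools:)
  messages = [{ role: "user", content: user_prompt }]

  loop do
    reply = model.ask(messages)                        # stream text, collect any tool calls
    messages << reply

    return messages if reply[:tool_calls].to_a.empty?  # no tools requested -> model is done

    reply[:tool_calls].each do |call|
      result = tools.fetch(call[:name]).execute(call[:arguments])
      messages << { role: "tool", tool_call_id: call[:id], content: result.to_s }
    end
  end
end
```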
## Non-goals (for the minimal version)
- Full-screen terminal UI (TUI), fancy rendering, themes
- Extensions / plugins / skills packages system
- OAuth login flows
- Multi-provider support (start with one provider)
- Session tree navigation (forking/branching)
- Auto-compaction / summarization
- Web UI / Slack bot / pod management
## Architecture (3 layers, one gem)
Keep everything in one repo/gem, but separate concerns by namespace/folders:
1) **`RubiPi::AI`** (LLM connector)
- Models, messages, tools, and a unified streaming interface
- A provider registry/dispatcher (even if we only register one provider initially)
2) **`RubiPi::AgentCore`** (the agent loop)
- The loop that:
- asks the model
- executes tool calls
- appends tool results
- continues until done
- Emits a stream of events for UI/CLI to render
3) **`RubiPi::CodingAgent`** (product harness)
- CLI parsing and “modes” (start with print mode)
- Tool implementations (read/write/bash)
- Optional minimal session persistence (later milestone)
This mirrors the TypeScript monorepo split:
- `packages/ai` → `RubiPi::AI`
- `packages/agent` → `RubiPi::AgentCore`
- `packages/coding-agent` → `RubiPi::CodingAgent`
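To make the split concrete, one possible layout for the gem (paths and file names are suggestions, nothing is fixed yet):

```ruby
# Sketch of the gem layout:
#
#   lib/rubipi/ai/            - provider adapter(s), streaming, message types
#   lib/rubipi/agent_core/    - agent loop, events, tool registry
#   lib/rubipi/coding_agent/  - CLI, tool implementations, sessions
#   bin/rubipi                - executable entry point
module RubiPi
  module AI;          end
  module AgentCore;   end
  module CodingAgent; end
end
```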
## Minimal feature set (what we will actually build)
### A. CLI (print mode)
- Command: `bin/rubipi "message..."` (or stdin)
- Streams assistant text to stdout
- If the assistant calls tools, execute them and continue automatically
- Exits when the assistant is done
### B. One provider (OpenAI-compatible)
- Implement **one** provider using an OpenAI-compatible endpoint:
  - Chat Completions-compatible streaming (SSE) with tool calling
- Environment variable for API key
- Model config provided via CLI flags or defaults:
- `--base-url`, `--model`, `--api-key` (or `OPENAI_API_KEY`)
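A sketch of what the request side of that provider could look like using only the Ruby stdlib. It assumes `--base-url` already includes the version prefix (e.g. ends in `/v1`) and that the endpoint speaks OpenAI-style Chat Completions; the class and method names are placeholders.

```ruby
require "net/http"
require "json"
require "uri"

module RubiPi
  module AI
    class OpenAICompatible
      def initialize(base_url:, model:, api_key: ENV["OPENAI_API_KEY"])
        @uri     = URI("#{base_url}/chat/completions")
        @model   = model
        @api_key = api_key
      end

      # Yields raw SSE chunks as they arrive; parsing happens elsewhere.
      def stream_chat(messages, tools: [], &on_chunk)
        req = Net::HTTP::Post.new(@uri)
        req["Authorization"] = "Bearer #{@api_key}"
        req["Content-Type"]  = "application/json"
        body = { model: @model, messages: messages, stream: true }
        body[:tools] = tools unless tools.empty?
        req.body = JSON.generate(body)

        Net::HTTP.start(@uri.host, @uri.port, use_ssl: @uri.scheme == "https") do |http|
          http.request(req) { |res| res.read_body(&on_chunk) }
        end
      end
    end
  end
end
```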
### C. Tool calling
- Define a Ruby tool interface (see the sketch at the end of this section):
  - `name`, `description`, JSON-Schema-ish `parameters` (minimal at first), and `execute(args)`
- Tools to implement:
- `read`: read a file
- `write`: write/overwrite a file
- `bash`: run a shell command (careful, but fine for local learning)
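A sketch of the tool shape, using `read` as the example. It assumes Ruby 3.x (endless method definitions), and the exact schema format and argument conventions are still open; argument keys are strings here because they would arrive as parsed JSON.

```ruby
module RubiPi
  module CodingAgent
    class ReadTool
      def name        = "read"
      def description = "Read a file relative to the project root"

      # JSON-Schema-ish description of the arguments the model should send.
      def parameters
        {
          type: "object",
          properties: { path: { type: "string", description: "File path to read" } },
          required: ["path"]
        }
      end

      def execute(args)
        File.read(args.fetch("path"))
      end
    end
  end
end
```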
### D. Event stream (for learning + testability)
Emit events like:
- `agent_start`, `turn_start`
- `message_start`, `message_delta`, `message_end`
- `tool_start`, `tool_end`
- `turn_end`, `agent_end`
The CLI prints from these events; later a TUI could also consume them.
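One way to wire this up is a tiny event bus that pushes value objects to whoever subscribed. The sketch below is illustrative; the payload shapes (and whether we use symbols or richer event classes) are still undecided.

```ruby
module RubiPi
  module AgentCore
    Event = Struct.new(:type, :payload, keyword_init: true)

    class EventBus
      def initialize
        @subscribers = []
      end

      def subscribe(&block)
        @subscribers << block
      end

      # Build an event and hand it to every subscriber.
      def emit(type, **payload)
        event = Event.new(type: type, payload: payload)
        @subscribers.each { |sub| sub.call(event) }
      end
    end
  end
end

# CLI usage sketch: print text deltas as they stream in.
# bus = RubiPi::AgentCore::EventBus.new
# bus.subscribe { |e| print e.payload[:text] if e.type == :message_delta }
```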
## Build sequence (milestones)
### Milestone 1: “Hello streaming”
**Outcome:** `rubipi "hi"` streams model text and exits cleanly.
- Implement `RubiPi::AI::Stream` with a single provider adapter
- Parse SSE chunks, produce text deltas, finish with a final message
**Done when:**
- Text appears incrementally (not only at the end)
- Exit code is 0 on success; non-zero with a readable error on failure
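A sketch of the SSE parsing piece, assuming the usual OpenAI-style `data: {...}` framing (the class name and API are provisional):

```ruby
require "json"

module RubiPi
  module AI
    # Incremental SSE parser: feed it raw network chunks, and it yields one
    # parsed JSON object per complete "data:" line.
    class SSEParser
      def initialize
        @buffer = +""
      end

      def feed(chunk)
        @buffer << chunk
        while @buffer.include?("\n")
          line, @buffer = @buffer.split("\n", 2)
          next unless line.start_with?("data:")
          data = line.delete_prefix("data:").strip
          next if data == "[DONE]"      # end-of-stream marker
          yield JSON.parse(data)
        end
      end
    end
  end
end

# Printing deltas from a Chat Completions-style chunk:
# parser.feed(chunk) { |obj| print obj.dig("choices", 0, "delta", "content").to_s }
```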
### Milestone 2: “Tool call round-trip”
**Outcome:** model requests a tool → RubiPi executes it → model continues.
- Implement `RubiPi::AgentCore::AgentLoop`
- Implement minimal tool registry + 1 tool (`read`)
- Support: assistant → toolCall → toolResult → assistant
**Done when:**
- Prompt “Read RubiPi/plan.md and summarize it” results in a tool call + summary
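For reference, the conversation history after one round trip looks roughly like this in Chat Completions terms (the ids and content below are invented):

```ruby
# Illustrative history for one assistant -> toolCall -> toolResult -> assistant cycle.
history = [
  { role: "user",      content: "Read RubiPi/plan.md and summarize it" },
  { role: "assistant", content: nil,
    tool_calls: [{ id: "call_1", type: "function",
                   function: { name: "read", arguments: '{"path":"RubiPi/plan.md"}' } }] },
  { role: "tool",      tool_call_id: "call_1", content: "<file contents>" },
  { role: "assistant", content: "plan.md lays out a minimal Ruby agent in three layers..." }
]
```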
### Milestone 3: “Useful tool set”
**Outcome:** basic coding-agent-ish actions are possible.
- Add `write` and `bash` tools
- Add basic safety rails (see below)
**Done when:**
- Prompt “Create a file X with Y content” causes `write`
- Prompt “Run `ruby -v`” causes `bash`
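Sketches of the two new tools, following the same interface as `read`. The timeout, truncation, and path rails from the safety section below get layered on top of these.

```ruby
require "open3"

module RubiPi
  module CodingAgent
    class WriteTool
      def name = "write"

      def execute(args)
        File.write(args.fetch("path"), args.fetch("content"))
        "wrote #{args["path"]}"
      end
    end

    class BashTool
      def name = "bash"

      def execute(args)
        # Runs the command through the shell and returns combined output.
        stdout, stderr, status = Open3.capture3(args.fetch("command"))
        "exit=#{status.exitstatus}\n#{stdout}#{stderr}"
      end
    end
  end
end
```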
### Milestone 4: “Usable CLI ergonomics”
**Outcome:** feels like a small CLI tool, not a script.
- Support stdin piping
- Add flags:
- `--model`, `--base-url`, `--api-key`
- `--json` (optional) to output event stream as JSONL for debugging
- Clear help output
**Done when:**
- `echo "hello" | rubipi` works
- `rubipi --help` shows options
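Stdlib `OptionParser` covers the flags and the stdin case with very little code; its built-in `--help` output takes care of the help bullet. A sketch (defaults and option wiring are illustrative):

```ruby
require "optparse"

options = { api_key: ENV["OPENAI_API_KEY"] }

OptionParser.new do |opts|
  opts.banner = "Usage: rubipi [options] [message]"
  opts.on("--model MODEL", "Model name")                         { |v| options[:model] = v }
  opts.on("--base-url URL", "OpenAI-compatible base URL")        { |v| options[:base_url] = v }
  opts.on("--api-key KEY", "API key (defaults to OPENAI_API_KEY)") { |v| options[:api_key] = v }
  opts.on("--json", "Emit the event stream as JSONL")            { options[:json] = true }
end.parse!(ARGV)

# Message comes from the remaining args, or from a pipe when stdin isn't a TTY.
message = ARGV.join(" ")
message = $stdin.read if message.empty? && !$stdin.tty?
```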
### Milestone 5 (optional): “Minimal sessions”
**Outcome:** persist conversation to a JSONL file.
- Append messages and tool results
- Support `--session <path>` and `--continue`
**Done when:**
- You can run two commands and the second run continues prior context
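A sketch of the JSONL session store (class name and API are placeholders): one JSON message per line, append-only.

```ruby
require "json"

module RubiPi
  module CodingAgent
    class Session
      def initialize(path)
        @path = path
      end

      def append(message)
        File.open(@path, "a") { |f| f.puts(JSON.generate(message)) }
      end

      def messages
        return [] unless File.exist?(@path)
        File.readlines(@path).map { |line| JSON.parse(line) }
      end
    end
  end
end

# --continue would call `messages` and seed the agent loop with the prior context.
```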
## Safety rails (minimal)
Even for a personal learning project, add a few guardrails early:
- `read`/`write` limited to a project root (default: current working dir)
- `bash` runs with a timeout and caps captured output size (truncate large output)
- Don’t auto-execute “dangerous” commands (add an optional confirmation prompt/allowlist later)
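A sketch of those rails in one helper module; the limits are arbitrary placeholders, and the process handling is deliberately simple.

```ruby
require "open3"

module RubiPi
  module CodingAgent
    module Safety
      MAX_OUTPUT   = 64_000 # bytes of bash output we keep (placeholder)
      BASH_TIMEOUT = 30     # seconds (placeholder)

      # Resolve `path` against `root` and refuse anything that escapes it.
      def self.confine!(path, root: Dir.pwd)
        full = File.expand_path(path, root)
        unless full == root || full.start_with?(root + File::SEPARATOR)
          raise ArgumentError, "path escapes project root: #{path}"
        end
        full
      end

      # Run a command, kill it after BASH_TIMEOUT, and truncate the output.
      def self.run_bash(command)
        Open3.popen2e(command) do |_stdin, out, wait_thr|
          reader = Thread.new { out.read }   # drain output so the child never blocks on a full pipe
          Process.kill("KILL", wait_thr.pid) unless wait_thr.join(BASH_TIMEOUT)
          [reader.value[0, MAX_OUTPUT], wait_thr.value.exitstatus] # exitstatus is nil if killed
        end
      end
    end
  end
end
```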
## Testing approach (keep it lightweight)
- Unit-test the core pieces:
- SSE parsing (feed chunks → expect deltas)
- tool argument decoding
- agent loop transitions (tool call → tool result)
- Avoid end-to-end provider tests unless you want to provide real keys in CI.
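For example, the SSE-parsing test can feed a `data:` line split across two network chunks and assert the delta comes out whole. This targets the parser sketched under Milestone 1, so the names are provisional.

```ruby
require "minitest/autorun"
# require "rubipi"   # however the gem ends up being loaded

class SSEParserTest < Minitest::Test
  def test_reassembles_a_data_line_split_across_chunks
    parser = RubiPi::AI::SSEParser.new
    events = []

    # One SSE event delivered as two chunks, split mid-word.
    chunk1 = 'data: {"choices":[{"delta":{"con'
    chunk2 = 'tent":"hi"}}]}' + "\n\n"

    parser.feed(chunk1) { |e| events << e }
    parser.feed(chunk2) { |e| events << e }

    assert_equal "hi", events.first.dig("choices", 0, "delta", "content")
  end
end
```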
## Open questions (decide as we implement)
- Choose the first API format:
- OpenAI Chat Completions compatible streaming is the simplest starting point
- Later: OpenAI Responses / other providers
- How strict to be on tool parameter validation in Ruby (start permissive, tighten later)