# Workspace overview
_Exported on February 5, 2026 at 16:17:13 CET from Cursor (2.4.28)_
---
**User**
What is this workspace all about?
---
**Cursor**
I’ll quickly scan the root `README.md` and the key package READMEs to summarize what this monorepo does, how it’s organized, and what each major package is for.
## What this workspace is
This is the **Pi monorepo** (`pi-mono`): a **TypeScript/Node.js workspace** containing a set of packages for **building AI agents**, **talking to many LLM providers through one API**, and **running/hosting models**.
At the top level it’s an npm workspaces monorepo (`package.json` lists `packages/*`), with shared tooling for lint/format/typecheck (Biome + TypeScript).
## The main packages (what each is for)
- **`packages/coding-agent` (`@mariozechner/pi-coding-agent`)**: the flagship **interactive terminal coding agent CLI** (“pi”). It’s designed to be extensible via **extensions, skills, prompt templates, themes, and “pi packages”**.
- **`packages/ai` (`@mariozechner/pi-ai`)**: a **unified multi-provider LLM API** (OpenAI, Anthropic, Google, etc.), with streaming events, tool calling, model discovery, usage/cost tracking, cross-provider handoffs, and auth (API keys + some OAuth flows).
- **`packages/agent` (`@mariozechner/pi-agent-core`)**: a **stateful agent runtime** built on `pi-ai`, providing an event-driven prompt/tool-execution loop and message/state management for UIs/CLIs.
- **`packages/tui` (`@mariozechner/pi-tui`)**: a minimal **terminal UI framework** with **differential rendering**, input handling, and components (editor, markdown, lists, overlays, inline images, etc.).
- **`packages/web-ui` (`@mariozechner/pi-web-ui`)**: **web components** (chat UI, artifacts, attachments, IndexedDB storage) for building browser chat apps on top of `pi-ai` + `pi-agent-core`.
- **`packages/mom` (`@mariozechner/pi-mom`)**: “mom”, a **Slack bot** that delegates work to an LLM agent with filesystem + bash-style tooling, designed to be self-managing (often run in Docker).
- **`packages/pods` (`@mariozechner/pi`)**: a different “pi”: a **CLI to deploy/manage vLLM-backed models on remote GPU pods**, exposing OpenAI-compatible endpoints and providing an agent for testing.
## How the pieces fit together (mental model)
A common stack is:
`pi-ai` (LLM streaming + tools) → `pi-agent-core` (agent loop/state) → UIs and harnesses:
- terminal app: `pi-coding-agent` + `pi-tui`
- browser app: `pi-web-ui`
- Slack bot: `pi-mom`
- model hosting: `pi` (pods)
If you tell me what you want to do (use the coding agent CLI, build a web chat, add a provider, deploy models on pods, etc.), I can point you to the exact package(s) and the best entry docs/files.
---
**User**
The idea is to build a port of the TS version of pi to Ruby, in order to learn and understand in depth how all of this works.
---
**Cursor**
Got it. Next I’ll locate the **entrypoints** and the **core runtime loop** in the TypeScript `pi` (coding agent) package, then outline a Ruby port plan that mirrors the same boundaries (LLM API → agent loop → UI/CLI → persistence/extensions).
Search isn’t working in this environment right now (the underlying ripgrep binary can’t be found), so I’ll navigate by listing directories and then reading the key entrypoint files directly.
### How “pi (TypeScript)” is structured (what to port, in layers)
From the current codebase, the clean architectural stack is:
- **LLM API layer** (`packages/ai`): “given a model + context, stream standardized events and produce a final `AssistantMessage`”
```1:60:/Users/thomasjansen/Sites/pi-mono/packages/ai/src/stream.ts
import "./providers/register-builtins.js";
import "./utils/http-proxy.js";
import { getApiProvider } from "./api-registry.js";
import type {
  Api,
  AssistantMessage,
  AssistantMessageEventStream,
  Context,
  Model,
  ProviderStreamOptions,
  SimpleStreamOptions,
  StreamOptions,
} from "./types.js";

export { getEnvApiKey } from "./env-api-keys.js";

function resolveApiProvider(api: Api) {
  const provider = getApiProvider(api);
  if (!provider) {
    throw new Error(`No API provider registered for api: ${api}`);
  }
  return provider;
}

export function stream<TApi extends Api>(
  model: Model<TApi>,
  context: Context,
  options?: ProviderStreamOptions,
): AssistantMessageEventStream {
  const provider = resolveApiProvider(model.api);
  return provider.stream(model, context, options as StreamOptions);
}

export async function complete<TApi extends Api>(
  model: Model<TApi>,
  context: Context,
  options?: ProviderStreamOptions,
): Promise<AssistantMessage> {
  const s = stream(model, context, options);
  return s.result();
}

export function streamSimple<TApi extends Api>(
  model: Model<TApi>,
  context: Context,
  options?: SimpleStreamOptions,
): AssistantMessageEventStream {
  const provider = resolveApiProvider(model.api);
  return provider.streamSimple(model, context, options);
}

export async function completeSimple<TApi extends Api>(
  model: Model<TApi>,
  context: Context,
  options?: SimpleStreamOptions,
): Promise<AssistantMessage> {
  const s = streamSimple(model, context, options);
  return s.result();
}
```
- **Agent loop layer** (`packages/agent`): “run turns until no tool calls and no queued messages; execute tools; support steering/follow-up queues; emit events”
```1:56:/Users/thomasjansen/Sites/pi-mono/packages/agent/src/agent-loop.ts
/**
 * Agent loop that works with AgentMessage throughout.
 * Transforms to Message[] only at the LLM call boundary.
 */
import {
  type AssistantMessage,
  type Context,
  EventStream,
  streamSimple,
  type ToolResultMessage,
  validateToolArguments,
} from "@mariozechner/pi-ai";
import type {
  AgentContext,
  AgentEvent,
  AgentLoopConfig,
  AgentMessage,
  AgentTool,
  AgentToolResult,
  StreamFn,
} from "./types.js";

/**
 * Start an agent loop with a new prompt message.
 * The prompt is added to the context and events are emitted for it.
 */
export function agentLoop(
  prompts: AgentMessage[],
  context: AgentContext,
  config: AgentLoopConfig,
  signal?: AbortSignal,
  streamFn?: StreamFn,
): EventStream<AgentEvent, AgentMessage[]> {
  const stream = createAgentStream();
  // ...
  return stream;
}
```
- **“Coding agent” product layer** (`packages/coding-agent`): wraps the agent loop with **session persistence**, **resource loading** (skills/prompts/extensions/themes), **tools**, **modes** (interactive/print/rpc), **auto-compaction**, **auto-retry**, etc. The key abstraction is `AgentSession`:
```1:14:/Users/thomasjansen/Sites/pi-mono/packages/coding-agent/src/core/agent-session.ts
/**
 * AgentSession - Core abstraction for agent lifecycle and session management.
 *
 * This class is shared between all run modes (interactive, print, rpc).
 * It encapsulates:
 * - Agent state access
 * - Event subscription with automatic session persistence
 * - Model and thinking level management
 * - Compaction (manual and auto)
 * - Bash execution
 * - Session switching and branching
 *
 * Modes use this class and add their own I/O layer on top.
*/
```
And `main.ts` is “CLI args → build session options → choose mode → run”:
```1:35:/Users/thomasjansen/Sites/pi-mono/packages/coding-agent/src/main.ts
/**
 * Main entry point for the coding agent CLI.
 *
 * This file handles CLI argument parsing and translates them into
 * createAgentSession() options. The SDK does the heavy lifting.
*/
import { homedir } from "node:os";
import { isAbsolute, join, relative, resolve } from "node:path";
import { type ImageContent, modelsAreEqual, supportsXhigh } from "@mariozechner/pi-ai";
import chalk from "chalk";
import { createInterface } from "readline";
import { type Args, parseArgs, printHelp } from "./cli/args.js";
// ...
import { InteractiveMode, runPrintMode, runRpcMode } from "./modes/index.js";
```
---
### A practical Ruby port plan (best for learning without boiling the ocean)
If the goal is to **understand deeply**, the highest-signal path is a **vertical slice** that preserves the same boundaries as TS:
#### 1) Port the “pi-ai” equivalent first (small, self-contained)
- **Core types**: `Model`, `Context`, `Message` (user/assistant/toolResult), tool schema, usage/cost.
- **Streaming**: an event stream interface that yields events like `text_delta`, `toolcall_delta`, `done`, `error`.
- Start with **one provider** (OpenAI-compatible Chat Completions is easiest) and implement:
- streaming HTTP
- incremental JSON/tool-call assembly
- final `AssistantMessage` construction
Ruby building blocks:
- HTTP streaming: `Net::HTTP` (chunked) or `httpx` / `faraday` with a streaming adapter (see the sketch after this list)
- Event stream: `Enumerator` or a small `EventStream` class (push + `each`)
- JSON schema validation for tool args: `json_schemer` (or skip validation initially and add it later)
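To make the HTTP-streaming bullet concrete, here is a minimal sketch of streaming an OpenAI-compatible Chat Completions response with `Net::HTTP` and turning the SSE `data:` lines into simple events. It assumes `OPENAI_API_KEY` is set and that `base_url` serves an OpenAI-compatible API; the event names (`:text_delta`, `:done`) and the block-yielding shape are choices made here, not pi's API.
```ruby
require "json"
require "net/http"
require "uri"

# Stream a chat completion and yield simple events as SSE lines arrive.
def stream_chat(messages, model:, base_url: "https://api.openai.com/v1")
  uri = URI("#{base_url}/chat/completions")
  request = Net::HTTP::Post.new(uri)
  request["Authorization"] = "Bearer #{ENV.fetch("OPENAI_API_KEY")}"
  request["Content-Type"] = "application/json"
  request.body = JSON.generate({ model: model, messages: messages, stream: true })

  Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request) do |response|
      buffer = +""
      response.read_body do |chunk|
        buffer << chunk
        # Only complete lines are handled; a partial line waits for the next chunk.
        while (line = buffer.slice!(/.*\n/))
          line = line.strip
          next unless line.start_with?("data: ")
          data = line.delete_prefix("data: ")
          if data == "[DONE]"
            yield({ type: :done })
          else
            delta = JSON.parse(data).dig("choices", 0, "delta", "content")
            yield({ type: :text_delta, text: delta }) if delta
          end
        end
      end
    end
  end
end
```
Swapping the raw block for the small `EventStream` class mentioned above is a natural second step; tool-call deltas would be additional event types layered on top of this.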
#### 2) Port the “pi-agent-core” loop next (the real “agentic” behavior)
Mirror `agent-loop.ts` behavior:
- Append prompt(s) → stream assistant → if tool calls → execute tools → append tool results → continue
- Support:
- **steering queue** (interrupt remaining tools)
- **follow-up queue** (run after the agent would stop)
- tool arg validation and “tool not found” errors
This is the heart of “how it works” and maps very naturally to Ruby (see the loop sketch below).
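To show the shape, here is a skeletal, blocking version of that loop. Names like `client.complete`, `Tool#execute`, and the message hashes are assumptions for the sketch, and the steering/follow-up queues are left out.
```ruby
# Core agent loop: ask the model, run any requested tools, feed results back,
# stop when the assistant makes no more tool calls.
def agent_loop(messages, tools, client)
  loop do
    assistant = client.complete(messages, tools: tools) # blocking call for clarity
    messages << assistant

    tool_calls = assistant[:tool_calls] || []
    return messages if tool_calls.empty?

    tool_calls.each do |call|
      tool = tools.find { |t| t.name == call[:name] }
      result =
        if tool
          tool.execute(call[:arguments]) # arguments already parsed from JSON
        else
          { error: "tool not found: #{call[:name]}" }
        end
      messages << { role: "toolResult", tool_call_id: call[:id], content: result }
    end
  end
end
```
Steering and follow-up queues would be checked at the top of each iteration (inject queued user messages, or skip remaining tool calls) once this basic shape works.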
#### 3) Add a minimal “coding-agent” harness (don’t start with the full TUI)
To keep scope sane, begin with **print mode** (or a minimal REPL) and only later attempt the TUI.
- Implement a minimal `AgentSession` analogue:
- event subscription + persistence
- session file format (JSONL) and basic “append message” (a minimal sketch follows this list)
- Implement just 2–4 tools: `read`, `write`, `bash`, `grep` (or fewer)
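As a sketch of both pieces (the `Tool` struct and `SessionStore` class are hypothetical names; the real TS tool and session formats carry more fields):
```ruby
require "json"

# A tool is just a contract: name, description, expected inputs, and something
# to execute. The model only requests calls; this code actually runs them.
Tool = Struct.new(:name, :description, :parameters, :handler) do
  def execute(args)
    handler.call(args)
  end
end

read_tool = Tool.new(
  "read",
  "Read a file from disk",
  { "type" => "object",
    "properties" => { "path" => { "type" => "string" } },
    "required" => ["path"] },
  ->(args) { File.read(args["path"]) }
)

# Append-only JSONL session log: one JSON object per line.
class SessionStore
  def initialize(path)
    @path = path
  end

  def append(message)
    File.open(@path, "a") { |f| f.puts(JSON.generate(message)) }
  end

  def load
    return [] unless File.exist?(@path)
    File.readlines(@path, chomp: true).map { |line| JSON.parse(line) }
  end
end
```
A `bash` tool is the same shape with a handler that shells out (e.g. via `Open3.capture3`).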
#### 4) Only then tackle the “pi extras”
These are big and can be layered in later:
- **Session tree + branching/forking**
- **Compaction** (and the tricky “overflow recovery” behavior)
- **Extensions/skills/prompt templates** loading and execution
- **Interactive TUI** with differential rendering
---
### What I’d personally port first (minimal but authentic “pi”)
- **Mode**: print mode (single prompt, stream to stdout)
- **Provider**: OpenAI-compatible streaming
- **Tools**: `read`, `bash` (even just stubbing `bash` initially)
- **Loop**: tool calling + tool results + continue
- **Persistence**: append-only JSONL log of messages
That gets you “pi’s essence” quickly, and you can iteratively add sessions, queues, compaction, then UI.
---
### The big conceptual “gotchas” to watch while porting
- **Streaming tool calls**: you often get partial JSON; you must assemble it and only validate/execute at “toolcall_end” (see the assembly sketch after this list).
- **Message ordering constraints**: if you add extra messages (like bash output) during streaming, you can break toolcall/toolResult sequences—TS explicitly defers some messages until after turns.
- **Cross-provider compatibility** (optional): TS has message transforms for handoffs; you can ignore this at first.
- **OAuth flows** (optional): huge scope; skip initially unless it’s a core learning goal.
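For the first gotcha, the assembly step looks roughly like this (the event hashes with `:type`, `:id`, `:name`, `:delta` keys are an assumed shape; real provider events differ in naming):
```ruby
require "json"

# Buffer streamed tool-call argument fragments per call id and parse the JSON
# only when the provider signals the call is complete.
def assemble_tool_calls(events)
  partial = Hash.new { |hash, id| hash[id] = { name: nil, json: +"" } }
  completed = []

  events.each do |event|
    case event[:type]
    when :toolcall_start
      partial[event[:id]][:name] = event[:name]
    when :toolcall_delta
      partial[event[:id]][:json] << event[:delta] # may be half a JSON object
    when :toolcall_end
      call = partial.delete(event[:id])
      completed << { id: event[:id], name: call[:name], arguments: JSON.parse(call[:json]) }
    end
  end

  completed
end
```
Validation against the tool’s schema (e.g. with `json_schemer`) would also happen at the `:toolcall_end` point, never on partial fragments.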
---
### If you tell me your target scope, I can point to the exact TS files to mirror
Two good choices:
- **Scope A (fast learning)**: print-mode-only Ruby clone (LLM + tools + sessions)
- **Scope B (full experience)**: interactive TUI + extensions + tree sessions + compaction
Which one are you aiming for?
---
**User**
Now explain it again in semi- to non-technical terms. What is the core, what happens, and how is it done?
---
**Cursor**
### The “big idea” (in plain language)
This workspace is a set of building blocks for a tool called **pi**: something you can talk to in your terminal (or in Slack / a web UI) that can **use tools** (read files, run commands, edit code) while it chats with you.
Think of it like:
- **A brain connector** (talks to many AI providers)
- **A manager** (decides when to call tools and how to continue)
- **A shell/body** (terminal UI, sessions, settings, plugins)
---
### What is the core?
There are two “cores”, one inside the other:
- **Core #1: “Talk to AI in a consistent way”** (`packages/ai`)
- No matter whether the AI is OpenAI, Anthropic, Google, etc., pi tries to make the interaction look the same: same message format, same event stream (“text arrived”, “tool call requested”, “done”, “error”), same cost/token reporting.
- **Core #2: “Agent loop”** (`packages/agent`)
- This is the logic that turns a chat model into an *agent*:
- Ask the AI something
- If the AI says “I need to use a tool”, run the tool
- Feed the result back to the AI
- Repeat until the AI is finished
Then **`packages/coding-agent`** wraps that core with all the “product” stuff (terminal UI, saved sessions, extensions, commands like `/model`, etc.).
---
### What happens when you use pi? (the story)
Imagine you type:
> “Rename this function everywhere and fix the tests.”
Behind the scenes, pi runs a loop like this:
1) **You send a message**
- pi stores it as part of the conversation history (so the AI can remember what happened).
2) **pi asks the AI for the next step**
- It sends the AI:
- the conversation so far
- the “system prompt” (house rules / instructions)
- a list of available tools (“you can read files, edit files, run bash…”)
3) **The AI starts streaming an answer**
- pi prints the response as it comes in (so you don’t wait for the whole thing).
- Sometimes the AI just writes text.
- Sometimes it says: “Call tool X with these arguments.”
4) **If the AI requests a tool, pi runs it**
- Example: AI requests `read` on `src/foo.ts`
- pi runs the tool locally, gets the result (file contents), and records it.
5) **pi sends the tool result back to the AI**
- Now the AI has new information (“here is the file content”).
- The AI continues: maybe it asks to edit, maybe run tests, maybe read more.
6) **Repeat until the AI stops requesting tools**
- When the AI finally says “done”, pi ends that turn.
That’s the core behavior: **chat → tool calls → tool results → chat → … until done**.
---
### How is it done (conceptually, not code)
#### 1) Everything is “messages + events”
pi treats nearly everything as a stream of events:
- “a user message started/ended”
- “assistant text chunk arrived”
- “assistant is calling tool X”
- “tool execution started/updated/ended”
- “turn ended”
This event stream is what drives:
- the terminal UI updates
- session saving
- extensions/hooks
#### 2) There’s a strict separation of responsibilities
- **LLM connector**: “How do I talk to provider X and stream results?”
- **Agent loop**: “Given tools + messages, how do I keep going until finished?”
- **Session/product wrapper**: “How do I save history, load skills, offer `/commands`, show UI, handle retries, compact long chats?”
This separation is why it’s portable: you can re-implement the same layers in Ruby.
#### 3) Tools are just callable functions with a contract
A tool is basically:
- a name (e.g. `read`, `bash`)
- a description + expected inputs
- an `execute` function that returns output
The AI doesn’t directly read files or run commands. It only **requests** tool calls. pi is the one that actually performs them and returns the output back.
---
### What makes pi “pi” (vs a simple chatbot)
- **Tool calling loop** (the agent loop)
- **Streaming UX** (you see progress live)
- **Sessions** (it remembers and can branch/fork conversation history)
- **Customization** (extensions, skills, prompts, themes)
- **Recovery features**:
- retries on transient provider errors
- “compaction” when the conversation gets too long (summarize older parts)
---
### If you’re porting to Ruby: what you should aim to replicate first
To get the *core experience* with minimal scope:
- A single provider connector (OpenAI-compatible streaming)
- The agent loop (tool calls → run tools → send results → continue)
- 2 tools (`read`, `bash` or even just `read`)
- A simple terminal mode (print streaming text)
Once that works, you’ll “feel” the system, and the rest (sessions, UI polish, extensions) becomes iterative layering.
If you want, I can describe the same flow using a concrete example step-by-step (like “user asks to refactor file X”, and what messages/tools fire in which order).
---
**User**
Rebuilding pi in Ruby for myself, is it really useful to handle its parts as packages, or couldn’t they rather be modules?
---
**Cursor**
### Short answer
For a **personal Ruby port**, you can absolutely start with **modules inside one repo/gem**. Splitting into “packages” is only *required* once you need **separate distribution/versioning**, **separate release cadence**, or **multiple independent consumers**.
### What “packages” buy you (and when it matters)
- **Independent reuse**: TS has multiple products that reuse the same core pieces (terminal app, web UI, Slack bot, GPU-pods CLI). Packages make that clean.
- **Clear boundaries**: “LLM connector” vs “agent loop” vs “UI/session harness” stay decoupled by design.
- **Separate publishing**: people can install only `pi-ai` without pulling in a TUI, Slack bot, etc.
- **Dependency isolation**: web UI wants browser deps; CLI wants Node/TUI deps; separating avoids dragging everything everywhere.
- **CI/test focus**: per-package tests and build steps.
For you, if you’re not publishing multiple libraries and not building multiple apps that reuse the same core, these benefits are mostly **organizational** rather than strictly necessary.
### What modules buy you (and why it’s great for a solo Ruby rebuild)
- **Much lower overhead**: one gemspec/Gemfile, one test setup, one release.
- **Faster iteration**: refactors across boundaries are easy early on.
- **Still possible to keep boundaries**: you can enforce them with folders/namespaces, not separate gems.
### A good compromise for Ruby (recommended)
Keep **one repo + one gem**, but structure it like packages internally:
- `lib/pi_ai/...` (LLM providers + streaming + message types)
- `lib/pi_agent_core/...` (agent loop, tools interface, event stream)
- `lib/pi_coding_agent/...` (session persistence, commands, modes, UI glue)
- `bin/pi` (CLI entrypoint)
This gives you the *learning value* of the architecture (clear layers) without the operational cost of multiple gems.
### When to split into multiple gems later
Split only if one of these becomes true:
- You want to release `pi_ai` as a standalone gem others can use.
- You’re building multiple frontends (CLI + web + Slack) and want strict dependency separation.
- You want different versioning cadences (core stable, UI churn, etc.).
- Boot time / dependency weight becomes annoying.
### Why the TS repo is monorepo “packages” instead of just modules
Because in TS they genuinely have **multiple deliverables** (CLI, web UI, Slack bot, pods manager) and a shared core. Packages make it possible to publish and consume those parts independently.
### Bottom line
- **Start with modules** (single gem) and keep the same conceptual boundaries.
- **Split into gems only if reuse/publishing/dep isolation becomes real pain**.
If you tell me whether you’re aiming for *just the CLI* or *CLI + web + Slack*, I can suggest a concrete Ruby folder layout and the minimum boundaries to keep.
---
**User**
Yes, do that in a "RubiPi" folder.
---
**Cursor**
### I can’t create folders/files for you in Ask mode
But I can give you an exact **`RubiPi/`** layout + the minimum starter files to create (copy/paste), and a build order that mirrors pi’s real architecture without multiple gems.
---
### Recommended `RubiPi/` structure (single gem, layered modules)
Use **one repo + one gem**, but keep “package-like” boundaries as Ruby namespaces:
```text
RubiPi/
  README.md
  Gemfile
  rubi_pi.gemspec
  bin/
    rubipi
  lib/
    rubi_pi.rb
    rubi_pi/
      version.rb
      ai/
        types.rb               # Model, Context, Message, Tool schema-ish types
        stream.rb              # stream/complete API (provider registry dispatch)
        providers/
          openai_compat.rb     # first provider: OpenAI-compatible streaming
      agent_core/
        event_stream.rb        # push/subscribe/enumerator
        agent_loop.rb          # the core loop (prompt -> toolcalls -> toolresults -> continue)
        types.rb               # AgentMessage, AgentTool, AgentEvent
      coding_agent/
        agent_session.rb       # session wrapper: persistence, queues, retry/compaction hooks later
        session_manager.rb     # JSONL storage, append, load (start minimal)
        tools/
          read.rb
          bash.rb
          write.rb             # optional early
        modes/
          print_mode.rb        # first “UI”
          interactive_mode.rb  # later
        cli.rb                 # parse args, choose mode
  spec/
    spec_helper.rb
    rubi_pi_smoke_spec.rb
```
**Why this works well for your goal**
- You still get the real conceptual boundaries from TS (`ai` vs `agent_core` vs `coding_agent`)
- But you avoid the overhead of separate gems/packages until you actually need them
---
### What each layer “is” in RubiPi terms
- **`RubiPi::AI`**: “Talk to an LLM provider and stream events consistently.”
- **`RubiPi::AgentCore`**: “Given messages + tools, keep looping until the model is done.”
- **`RubiPi::CodingAgent`**: “Make it feel like pi: sessions, CLI modes, tool implementations.”
This mirrors the TS repo (a wiring sketch follows the mapping):
- `packages/ai` → `lib/rubi_pi/ai`
- `packages/agent` → `lib/rubi_pi/agent_core`
- `packages/coding-agent` → `lib/rubi_pi/coding_agent`
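For orientation, here is a possible `lib/rubi_pi.rb` that wires the three layers together; the file names follow the tree above, and this is a sketch rather than a finished entry point:
```ruby
# lib/rubi_pi.rb -- single entry point for the gem; each group of requires
# corresponds to one package-like namespace.
module RubiPi
end

require_relative "rubi_pi/version"

require_relative "rubi_pi/ai/types"
require_relative "rubi_pi/ai/stream"
require_relative "rubi_pi/ai/providers/openai_compat"

require_relative "rubi_pi/agent_core/types"
require_relative "rubi_pi/agent_core/event_stream"
require_relative "rubi_pi/agent_core/agent_loop"

require_relative "rubi_pi/coding_agent/agent_session"
require_relative "rubi_pi/coding_agent/session_manager"
require_relative "rubi_pi/coding_agent/cli"
```
Keeping the requires grouped this way makes it obvious when a lower layer starts depending on a higher one (which it shouldn’t).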
---
### Minimal starter scope (what to build first)
If you want the fastest “it works” moment:
- **Mode**: `print_mode` only
- **Provider**: OpenAI-compatible streaming only
- **Tools**: `read` + `bash` (or even just `read`)
- **Loop**: tool calls → run tool → send result → continue
- **Persistence**: append-only JSONL later (or stub it at first)
---
### Commands you can run to scaffold `RubiPi/`
Pick one of these approaches:
#### Option A: Bundler gem scaffold (recommended)
From wherever you keep projects:
```bash
bundle gem rubi_pi --test=rspec
mv rubi_pi RubiPi && cd RubiPi
```
Then place your code under `lib/rubi_pi/...` as in the tree above, and create `bin/rubipi`.
#### Option B: Plain folder (manual)
Just create the directories/files from the tree and wire `bin/rubipi` to call `RubiPi::CodingAgent::CLI.run(ARGV)`.
---
### Next: I’ll give you the first 3 Ruby files to copy/paste
If you confirm one detail, I’ll tailor the exact skeleton:
- **Which provider do you want first?** (OpenAI Responses vs OpenAI Chat Completions compatible endpoint, or Anthropic)
If you don’t care, I’ll assume **OpenAI-compatible Chat Completions streaming** as the simplest starting point and provide:
- `lib/rubi_pi/ai/types.rb`
- `lib/rubi_pi/agent_core/agent_loop.rb`
- `bin/rubipi` + a tiny `print_mode` that streams to stdout (a first sketch of `bin/rubipi` follows below)
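Until then, assuming that default, a minimal `bin/rubipi` could look like this (the `RubiPi::CodingAgent::CLI` class is the one named in Option B above and does not exist yet):
```ruby
#!/usr/bin/env ruby
# bin/rubipi -- thin executable: load the gem, hand ARGV to the CLI layer.
require_relative "../lib/rubi_pi"

RubiPi::CodingAgent::CLI.run(ARGV)
```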