Pipeline Progress ?⚪ not started not yet begun✏️ draft initial version🔍 in review under review🚧 in progress actively worked on⛔ blocked waiting on dependency✅ done completed⏭️ skipped not applicable
Requirements: LLMLarge Language Model. ProviderThe runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`). Abstraction
Overview
pi-ai provides a single unified API for chat and image generation across 30+ LLMLarge Language Model. providers, abstracting away differences in wire protocols, authentication, streaming formats, tool-calling schemas, and token/cost accounting.
Consumers (the agent runtime, the coding agent, and third-party SDKSoftware Development Kit (the embeddable programmatic API). users) program against one Models collection and one event-stream contract instead of per-provider SDKs.
Stakeholders
| Stakeholder | Interest |
|---|---|
| Agent runtime (pi-agent-core) | A stable StreamFn contract and message types to drive the agent loop |
| Coding agent (pi-coding-agent) | Broad provider coverage, automatic auth resolution, and a model catalog |
| SDKSoftware Development Kit (the embeddable programmatic API). / extension authors | A clean public API to register custom providers and models |
| End users | Access to their chosen model and subscription with minimal setup |
Functional Requirements
Order rows by priority: Must first, then Should, then May.
| ID | Priority | Requirement |
|---|---|---|
| FRFunctional Requirement.-01MustThe system shall expose a `Models` collection that routes model lookups and streams by owning provider. | Must | The system shall expose a Models collection that routes model lookups and streams by owning provider. |
| FRFunctional Requirement.-02MustThe system shall support the major wire protocols: Anthropic Messages, OpenAI Responses, OpenAI Completions, Google Generative AI, Google Vertex, Mistral Conversations, and Bedrock Converse Stream. | Must | The system shall support the major wire protocols: Anthropic Messages, OpenAI Responses, OpenAI Completions, Google Generative AI, Google Vertex, Mistral Conversations, and Bedrock Converse Stream. |
| FRFunctional Requirement.-03MustThe system shall stream assistant responses as an async-iterable event stream emitting `start`, text/thinking/toolcall deltas, `done`, and `error` events. | Must | The system shall stream assistant responses as an async-iterable event stream emitting start, text/thinking/toolcall deltas, done, and error events. |
| FRFunctional Requirement.-04Must"aborted"` and never throw out of a stream. | Must | The system shall encode all stream failures as terminal events with stopReason: "error" \ | "aborted" and never throw out of a stream. |
| FRFunctional Requirement.-05MustThe system shall resolve provider authentication via a credential store first (OAuthOpen Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). or API keyApplication Programming Interface key (ambient provider authentication).), falling back to ambient env-var resolution only when nothing is stored. | Must | The system shall resolve provider authentication via a credential store first (OAuthOpen Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). or API keyApplication Programming Interface key (ambient provider authentication).), falling back to ambient env-var resolution only when nothing is stored. |
| FRFunctional Requirement.-06MustThe system shall support OAuthOpen Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). flows (PKCEProof Key for Code Exchange (OAuth flow used by pi-ai). and device code) for subscription-based providers. | Must | The system shall support OAuthOpen Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). flows (PKCEProof Key for Code Exchange (OAuth flow used by pi-ai). and device code) for subscription-based providers. |
| FRFunctional Requirement.-07MustThe system shall represent tool definitions and tool calls using TypeBoxThe schema library used for tool parameter definitions (serializable JSON, self-validating). schemas that are serializable and self-validating. | Must | The system shall represent tool definitions and tool calls using TypeBoxThe schema library used for tool parameter definitions (serializable JSON, self-validating). schemas that are serializable and self-validating. |
| FRFunctional Requirement.-08MustThe system shall lazily load provider SDKs on first request rather than at import time. | Must | The system shall lazily load provider SDKs on first request rather than at import time. |
| FRFunctional Requirement.-09MustThe system shall generate its model catalog (`models.generated.ts`) from live upstream sources (models.dev, OpenRouter, Vercel AI Gateway), filtering to tool-capable models. | Must | The system shall generate its model catalog (models.generated.ts) from live upstream sources (models.dev, OpenRouter, Vercel AI Gateway), filtering to tool-capable models. |
| FRFunctional Requirement.-10ShouldThe system shall track token usage and cost per request. | Should | The system shall track token usage and cost per request. |
| FRFunctional Requirement.-11ShouldThe system shall support cross-provider handoffs, converting thinking blocks from foreign providers into tagged text. | Should | The system shall support cross-provider handoffs, converting thinking blocks from foreign providers into tagged text. |
| FRFunctional Requirement.-12ShouldThe system shall provide per-provider compatibility flags (e.g. `OpenAICompletionsCompat`) auto-detected from baseUrl and overridable per model. | Should | The system shall provide per-provider compatibility flags (e.g. OpenAICompletionsCompat) auto-detected from baseUrl and overridable per model. |
| FRFunctional Requirement.-13ShouldThe system shall provide a `faux` in-memory provider for deterministic testing. | Should | The system shall provide a faux in-memory provider for deterministic testing. |
| FRFunctional Requirement.-14MayThe system shall provide an image-generation API (`ImagesModels`) parallel to the chat API. | May | The system shall provide an image-generation API (ImagesModels) parallel to the chat API. |
| FRFunctional Requirement.-15MayThe system shall preserve a deprecated `/compat` entrypoint for the legacy global API during migration. | May | The system shall preserve a deprecated /compat entrypoint for the legacy global API during migration. |
Non-Functional Requirements
Order rows by priority: Must first, then Should, then May.
| ID | Priority | Category | Requirement |
|---|---|---|---|
| NFRNon-Functional Requirement.-01MustThe root `index.ts` shall be side-effect free. | Must | Compatibility | The root index.ts shall be side-effect free. |
| NFRNon-Functional Requirement.-02MustSynchronous model reads (`getModels`, `getModel`) shall return last-known data without awaiting; `refresh()` shall be the explicit async verb. | Must | Performance | Synchronous model reads (getModels, getModel) shall return last-known data without awaiting; refresh() shall be the explicit async verb. |
| NFRNon-Functional Requirement.-03MustOAuthOpen Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). token refresh shall use double-checked locking so concurrent requests refresh at most once. | Must | Reliability | OAuthOpen Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). token refresh shall use double-checked locking so concurrent requests refresh at most once. |
| NFRNon-Functional Requirement.-04MustThe system shall not silently fall back to ambient auth after a failed stored-credential refresh. | Must | Security | The system shall not silently fall back to ambient auth after a failed stored-credential refresh. |
| NFRNon-Functional Requirement.-05ShouldSDKSoftware Development Kit (the embeddable programmatic API). loading shall not block stream creation; setup may run behind a lazily returned stream. | Should | Performance | SDKSoftware Development Kit (the embeddable programmatic API). loading shall not block stream creation; setup may run behind a lazily returned stream. |
| NFRNon-Functional Requirement.-06ShouldProviderThe runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`). factories shall be thin wrappers over a shared `createProvider` helper. | Should | Maintainability | ProviderThe runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`). factories shall be thin wrappers over a shared createProvider helper. |
Constraints
- Only tool-call-capable models are cataloged.
- Generated files (
models.generated.ts,*.models.ts) must never be hand-edited. - Erasable TypeScript syntax only; ESM only; Node
>=22.19.0.
Acceptance Criteria
Every FRFunctional Requirement. and NFRNon-Functional Requirement. shall have at least one acceptance criterion.
Order criteria by FRs first (sorted by ID), then NFRs (sorted by ID).
- FRFunctional Requirement.-01MustThe system shall expose a `Models` collection that routes model lookups and streams by owning provider.
- Given a
Modelscollection with multiple registered providers - When a consumer requests a model by
provider/id - Then the owning provider resolves the model and its stream behavior.
- Given a
- FRFunctional Requirement.-03MustThe system shall stream assistant responses as an async-iterable event stream emitting `start`, text/thinking/toolcall deltas, `done`, and `error` events.
- Given a configured provider and a prompt
- When the consumer calls the stream function
- Then it receives an
AssistantMessageEventStreamyielding ordered text/thinking/toolcall deltas and a terminaldoneevent.
- FRFunctional Requirement.-04Must"aborted"` and never throw out of a stream.
- Given a provider request that fails mid-stream
- When the failure occurs
- Then the stream emits an
errorevent with partial content and astopReasonoferrororabortedinstead of throwing.
- FRFunctional Requirement.-05MustThe system shall resolve provider authentication via a credential store first (OAuthOpen Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). or API keyApplication Programming Interface key (ambient provider authentication).), falling back to ambient env-var resolution only when nothing is stored.
- Given a provider with both a stored credential and an ambient env var
- When auth is resolved
- Then the stored credential is used and the env var is ignored.
- FRFunctional Requirement.-07MustThe system shall represent tool definitions and tool calls using TypeBoxThe schema library used for tool parameter definitions (serializable JSON, self-validating). schemas that are serializable and self-validating.
- Given a tool defined with a TypeBoxThe schema library used for tool parameter definitions (serializable JSON, self-validating). parameter schema
- When the provider returns a tool call
- Then the arguments are parsed and available for validation without provider-specific handling.
- FRFunctional Requirement.-08MustThe system shall lazily load provider SDKs on first request rather than at import time.
- Given a fresh process importing the root entrypoint
- When no request has been made
- Then no provider SDKSoftware Development Kit (the embeddable programmatic API). (
@anthropic-ai/sdk,openai, etc.) is loaded.
- FRFunctional Requirement.-09MustThe system shall generate its model catalog (`models.generated.ts`) from live upstream sources (models.dev, OpenRouter, Vercel AI Gateway), filtering to tool-capable models.
- Given the model generator
- When run via
npm run generate-models - Then per-provider
*.models.tsfiles and the aggregatormodels.generated.tsare written, containing only tool-capable models.
- NFRNon-Functional Requirement.-01MustThe root `index.ts` shall be side-effect free.
- Given a consumer importing the root
index.ts - When the import completes
- Then no provider factories, generated catalogs, or OAuthOpen Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). implementations have executed as side effects.
- Given a consumer importing the root
- NFRNon-Functional Requirement.-03MustOAuthOpen Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). token refresh shall use double-checked locking so concurrent requests refresh at most once.
- Given an expired OAuthOpen Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). token and concurrent requests
- When both attempt to refresh
- Then exactly one network refresh occurs and both requests use the refreshed token.
Conflicts
None identified yet.
Open Questions
- What is the timeline and completion criteria for retiring the
/compatentrypoint once coding-agent'sModelManagermigration finishes? - Which providers, if any, are considered tier-1 (must-ship) versus community-maintained for the purposes of future feature work?
Specification: LLMLarge Language Model. ProviderThe runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`). Abstraction
Overview
pi-ai is a provider-centric LLMLarge Language Model. abstraction.
Each provider owns its model catalog, auth, and stream behavior and delegates to one of a small set of shared wire-protocol ("API") implementations.
A Models collection routes by provider, streams flow through a unified AssistantMessageEventStream, and SDKs load lazily on first use.
Architecture
Consumer (Agent / SDKSoftware Development Kit (the embeddable programmatic API).)
|
v
+-------------------+ routes by provider id
| Models collection |----+ (createModels / builtinProviders)
+-------------------+ |
v
+-------------------+ owns auth + catalog
| ProviderThe runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`). |
+---------+---------+
| delegates to
v
+-------------------+ wire protocol
| API impl (lazy) | (anthropic-messages,
+---------+---------+ openai-responses, ...)
| streams
v
+-------------------+
| AssistantMessage |
| EventStream |
+-------------------+
Data Models
ProviderThe runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`).
| Field | Type | Constraints | Description |
|---|---|---|---|
| id | string | unique | ProviderThe runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`). identifier (e.g. anthropic) |
| name | string | not null | Display name |
| baseUrl | string | optional | API base URL |
| auth | ProviderAuth | not null | API keyApplication Programming Interface key (ambient provider authentication). and/or OAuthOpen Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). resolvers |
| models | Model[] | not null | Catalog of models |
| api | ApiFactory | not null | Wire-protocol factory |
Model
| Field | Type | Constraints | Description |
|---|---|---|---|
| id | string | unique per provider | Model identifier |
| provider | string | FK -> ProviderThe runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`)..id | Owning provider |
| contextWindow | number | not null | Max context tokens |
| pricing | object | optional | Input/output/cache cost per million tokens |
| reasoning | object | optional | Supported thinking levels and budgets |
| compat | object | optional | ProviderThe runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`). compatibility flags |
AssistantMessageEvent
| Field | Type | Constraints | Description |
|---|---|---|---|
| type | enum | not null | start, text_*, thinking_*, toolcall_*, done, error |
| partial | AssistantMessage | optional | Cumulative message at event time |
| contentIndex | number | optional | Associates delta with a content block |
| stopReason | enum | on done/error | stop, length, toolCall, error, aborted |
Tool
| Field | Type | Constraints | Description |
|---|---|---|---|
| name | string | not null | Tool name |
| description | string | not null | LLMLarge Language Model.-facing description |
| parameters | TSchema (TypeBoxThe schema library used for tool parameter definitions (serializable JSON, self-validating).) | not null | Serializable, self-validating schema |
API Contracts
Models.stream(model, context)
Request
| Field | Type | Required | Description |
|---|---|---|---|
| model | Model | yes | Target model |
| context | Context | yes | System prompt, messages, tools, reasoning level |
| apiKey | ModelAuth | optional | Explicit auth override |
Response: an AssistantMessageEventStream (async iterable) whose .result() resolves to the final AssistantMessage.
Error contract: failures are encoded as a final event with stopReason: "error" \| "aborted" and errorMessage; the function never throws.
Models.refresh()
Resolves the latest catalogs/auth; returns a promise. Synchronous readers (getModels/getModel) return last-known data without awaiting.
Sequences
Streaming a prompt
Consumer -> Models: stream(model, context)
Models -> ProviderThe runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`).: resolve auth (store first, then ambient)
ProviderThe runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`). -> API impl: lazy-load SDKSoftware Development Kit (the embeddable programmatic API)., open stream
API impl -> Consumer: AssistantMessageEventStream
loop: yield start/text_delta/toolcall_delta ... done|error
Consumer awaits stream.result() -> AssistantMessage
OAuthOpen Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). token refresh (double-checked locking)
Request -> resolveStoredOAuth: token expired?
yes -> acquire credentials.modify lock
-> re-check expiry (another request may have refreshed)
-> if still expired: refresh once, persist
-> release lock
no -> use cached token (zero locks)
Technical Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Routing unit | ProviderThe runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`). (not API) | A provider owns auth + catalog; multiple providers share one wire protocol |
| SDKSoftware Development Kit (the embeddable programmatic API). loading | Lazy via .lazy.ts wrappers |
Keeps import side-effect-free and startup fast |
| Failure model | Encoded in stream | Callers always get a stream object; no try/catch around the call |
| Schema library | TypeBoxThe schema library used for tool parameter definitions (serializable JSON, self-validating). | JSON-serializable, self-validating, works across providers |
| Sync vs async reads | Sync last-known + explicit refresh() |
Avoids blocking the agent loop on network |
Risks and Unknowns
- The
/compatentrypoint duplicates the provider-centric API; its removal depends on the coding-agentModelManagermigration completing. - Generated catalogs depend on live upstream APIs (models.dev, OpenRouter, Vercel); outages during generation could stall releases.
- ProviderThe runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`).-specific compatibility quirks (token counting, cache headers, thinking-level support) require ongoing manual overrides in the generator.
Out of Scope
- The agent loop and tool execution (FEAT-0002).
- SessionA persistent, branchable conversation log stored as JSONL (`SessionHeader`, messages, compaction summaries, branch summaries). persistence, compaction, and branching (FEAT-0006).
- The TUITerminal User Interface (the interactive mode, and the `pi-tui` library). and interactive experience (FEAT-0003, FEAT-0004).
requirements
- What is the timeline and completion criteria for retiring the
/compatentrypoint once coding-agent'sModelManagermigration finishes? - Which providers, if any, are considered tier-1 (must-ship) versus community-maintained for the purposes of future feature work?
Vocabulary
Domain Terms
| Term | Definition |
|---|---|
| PiThe project: a minimal, self-extensible terminal coding agent harness and its libraries. | The project: a minimal, self-extensible terminal coding agent harness and its libraries. |
| HarnessThe coding agent runtime that wires the agent loop, tools, sessions, and UI together. | The coding agent runtime that wires the agent loop, tools, sessions, and UI together. |
| ExtensionA TypeScript module with a default export `function (pi: ExtensionAPI)` that augments the agent with tools, commands, events, UI, or providers. | A TypeScript module with a default export function (pi: ExtensionAPI) that augments the agent with tools, commands, events, UI, or providers. |
| SkillAn on-demand capability package following the Agent Skills standard (`SKILL.md` + optional frontmatter), invoked as `/skill:name`. | An on-demand capability package following the Agent Skills standard (SKILL.md + optional frontmatter), invoked as /skill:name. |
| Prompt templateA Markdown file with `{{variable}}` expansion invoked as `/templatename`. | A Markdown file with {{variable}} expansion invoked as /templatename. |
| pi packageA distributable bundle (npm or git) of extensions, skills, prompts, themes, or custom providers, installed via `pi install`. | A distributable bundle (npm or git) of extensions, skills, prompts, themes, or custom providers, installed via pi install. |
| SessionA persistent, branchable conversation log stored as JSONL (`SessionHeader`, messages, compaction summaries, branch summaries). | A persistent, branchable conversation log stored as JSONL (SessionHeader, messages, compaction summaries, branch summaries). |
| BranchingTree-structured session forking (`/fork`, `/clone`, `/tree`) where each entry has `id`/`parentId`. | Tree-structured session forking (/fork, /clone, /tree) where each entry has id/parentId. |
| CompactionLossy summarization of older session messages to reclaim context; original JSONL is preserved. | Lossy summarization of older session messages to reclaim context; original JSONL is preserved. |
| SteeringA queued message delivered to a streaming agent after the current tool batch completes. | A queued message delivered to a streaming agent after the current tool batch completes. |
| Follow-upA queued message delivered after the agent fully stops. | A queued message delivered after the agent fully stops. |
| Project trustA per-folder decision (`~/.pi/agent/trust.json`) gating whether project settings, resources, and extensions execute. | A per-folder decision (~/.pi/agent/trust.json) gating whether project settings, resources, and extensions execute. |
| Scope (model) | A scoped model set selected with --models pat1,pat2 for Ctrl+P cycling. |
| Faux providerAn in-memory scripted provider (`providers/faux.ts`) used for deterministic tests with no real API calls. | An in-memory scripted provider (providers/faux.ts) used for deterministic tests with no real API calls. |
| CoreThe deliberately minimal set of built-in capabilities (four tools); features outside core must be extensions. | The deliberately minimal set of built-in capabilities (four tools); features outside core must be extensions. |
Technical Terms
| Term | Definition |
|---|---|
| ProviderThe runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`). | The runtime unit owning a model catalog, auth, and stream behavior (e.g. anthropic, openai). |
| API implementationA wire-protocol backend shared by providers (e.g. `anthropic-messages`, `openai-responses`, `openai-completions`, `google-generative-ai`, `bedrock-converse-stream`). | A wire-protocol backend shared by providers (e.g. anthropic-messages, openai-responses, openai-completions, google-generative-ai, bedrock-converse-stream). |
Models collection |
pi-ai's provider registry that routes model lookups and streams by owning provider. |
streamFn / StreamFn |
The injectable function the agent calls to reach the LLMLarge Language Model.; streamSimple is the default. |
AssistantMessageEventStream |
pi-ai's async-iterable event queue (push queue + result promise) carrying start/*_delta/done/error events. |
Agent |
pi-agent-core's stateful class owning the transcript and lifecycle (prompt, continue, abort). |
AgentHarness |
pi-agent-core's higher-level orchestrator wrapping Agent with sessions, compaction, skills, and provider hooks. |
agentLoop |
The low-level prompt-stream-tool-continue loop in pi-agent-core. |
AgentMessage |
pi-agent-core's app-extensible message union (via declaration merging); convertToLlm bridges to pi-ai Message. |
| TypeBoxThe schema library used for tool parameter definitions (serializable JSON, self-validating). | The schema library used for tool parameter definitions (serializable JSON, self-validating). |
| Differential renderingpi-tui's technique of diffing a new line array against the previous frame and writing minimal escape sequences. | pi-tui's technique of diffing a new line array against the previous frame and writing minimal escape sequences. |
| Synchronized outputTerminal escape sequence (`\x1b[?2026h..l`) used by pi-tui for atomic, flicker-free rendering. | Terminal escape sequence (\x1b[?2026h..l) used by pi-tui for atomic, flicker-free rendering. |
| Kitty keyboard protocolTerminal input protocol pi-tui negotiates for richer key reporting. | Terminal input protocol pi-tui negotiates for richer key reporting. |
| Lockstep versioningAll packages share one version and release together. | All packages share one version and release together. |
| Trusted publishingnpm publish via GitHub Actions OIDC (environment `npm-publish`); no local credentials required. | npm publish via GitHub Actions OIDCOpenID Connect (used for npm trusted publishing identity). (environment npm-publish); no local credentials required. |
| Shrinkwrap`packages/coding-agent/npm-shrinkwrap.json`, generated from the root lockfile to pin transitive deps for npm users. | packages/coding-agent/npm-shrinkwrap.json, generated from the root lockfile to pin transitive deps for npm users. |
Acronyms and Abbreviations
| Abbreviation | Expansion |
|---|---|
| TUITerminal User Interface (the interactive mode, and the `pi-tui` library). | Terminal User Interface (the interactive mode, and the pi-tui library). |
| CLICommand-Line Interface (the `pi` binary). | Command-Line Interface (the pi binary). |
| LLMLarge Language Model. | Large Language Model. |
| MCPModel Context Protocol (not built into core; extensions may add it). | Model Context Protocol (not built into core; extensions may add it). |
| SDKSoftware Development Kit (the embeddable programmatic API). | Software Development Kit (the embeddable programmatic API). |
| RPCRemote Procedure Call (the JSONL stdin/stdout protocol mode). | Remote Procedure Call (the JSONL stdin/stdout protocol mode). |
| OAuthOpen Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). | Open Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). |
| PKCEProof Key for Code Exchange (OAuth flow used by pi-ai). | Proof Key for Code Exchange (OAuthOpen Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). flow used by pi-ai). |
| API keyApplication Programming Interface key (ambient provider authentication). | Application Programming Interface key (ambient provider authentication). |
| ACAcceptance Criterion / Acceptance Criteria. | Acceptance Criterion / Acceptance Criteria. |
| FRFunctional Requirement. | Functional Requirement. |
| NFRNon-Functional Requirement. | Non-Functional Requirement. |
| ADRArchitecture Decision Record. | Architecture Decision Record. |
| RICEReach, Impact, Confidence, Effort (issue prioritization scoring). | Reach, Impact, Confidence, Effort (issue prioritization scoring). |
| SLO / SLIService Level Objective / Service Level Indicator. | Service Level Objective / Service Level Indicator. |
| OIDCOpenID Connect (used for npm trusted publishing identity). | OpenID Connect (used for npm trusted publishing identity). |
| CVECommon Vulnerabilities and Exposures. | Common Vulnerabilities and Exposures. |
| IMEInput Method Editor (pi-tui positions the hardware cursor for IME candidate windows). | Input Method Editor (pi-tui positions the hardware cursor for IMEInput Method Editor (pi-tui positions the hardware cursor for IME candidate windows). candidate windows). |
| CJKChinese, Japanese, Korean (terminal width handling for wide characters). | Chinese, Japanese, Korean (terminal width handling for wide characters). |
| WASMWebAssembly (photon-node used for image resizing). | WebAssembly (photon-node used for image resizing). |
| AGENTS.mdProject-specific rules file for humans and agents, read automatically from the repo root. | Project-specific rules file for humans and agents, read automatically from the repo root. |
Pipeline Progress ?⚪ not started not yet begun✏️ draft initial version🔍 in review under review🚧 in progress actively worked on⛔ blocked waiting on dependency✅ done completed⏭️ skipped not applicable
Requirements: Agent Runtime
Overview
pi-agent-core provides the stateful agent runtime that sits between the LLMLarge Language Model. abstraction (pi-ai) and application UIs.
It owns the transcript, runs the prompt-stream-tool-continue loop, executes tools (sequentially or in parallel), manages message queues (steering and follow-up), and exposes a higher-level AgentHarness with sessions, compaction, skills, and provider hooks.
Stakeholders
| Stakeholder | Interest |
|---|---|
| Coding agent (pi-coding-agent) | A reliable Agent/AgentHarness to drive prompts, tools, and session lifecycle |
| SDKSoftware Development Kit (the embeddable programmatic API). / embedding users | A transport-agnostic runtime they can point at any StreamFn |
| ExtensionA TypeScript module with a default export `function (pi: ExtensionAPI)` that augments the agent with tools, commands, events, UI, or providers. authors | Typed hooks (tool_call, tool_result, before_provider_request, etc.) to observe and mutate behavior |
Functional Requirements
Order rows by priority: Must first, then Should, then May.
| ID | Priority | Requirement |
|---|---|---|
| FR-01 | Must | The system shall provide an `Agent` class owning the transcript, tools, model, and thinking level, with `prompt`, `continue`, `abort`, and `subscribe` methods. |
| FR-02 | Must | The system shall run an agent loop that streams an assistant response, extracts tool calls, executes them, and continues until no more tool calls or queued messages remain. |
| FR-03 | Must | The system shall call an injectable `StreamFn` (default pi-ai `streamSimple`) to reach the LLM, keeping the runtime transport-agnostic. |
| FR-04 | Must | The system shall execute tool calls either sequentially or in parallel based on config and per-tool `executionMode`, emitting `tool_execution_*` events in completion order while persisting tool results in source order. |
| FR-05 | Must | The system shall validate tool arguments before execution and support `beforeToolCall`/`afterToolCall` hooks that can block, override, or terminate a tool call. |
| FR-06 | Must | The system shall support message queues for steering (delivered after the current tool batch) and follow-up (delivered after the agent stops). |
| FR-07 | Must | The system shall support abort via an `AbortController` whose signal flows to `streamFn` and tool `execute`. |
| FR-08 | Must | The system shall emit a stable event taxonomy: `agent_start/end`, `turn_start/end`, `message_start/update/end`, `tool_execution_start/update/end`. |
| FR-09 | Must | The system shall provide an `AgentHarness` wrapping `Agent` with sessions, compaction, skills, system-prompt building, and provider hooks. |
| FR-10 | Must | The system shall support app-extensible `AgentMessage` types via declaration merging, with `convertToLlm` bridging to pi-ai messages and an optional `transformContext` hook. |
| FR-11 | Should | The system shall provide a `streamProxy` function for routing LLM calls through a server with bandwidth-reduced events. |
| FR-12 | Should | The system shall include compaction utilities (`compact`, `shouldCompact`, `estimateTokens`, `findCutPoint`). |
| FR-13 | Should | The system shall include branch-summarization utilities. |
| FR-14 | Should | The system shall provide a `NodeExecutionEnv` (filesystem + shell) via the `./node` entrypoint. |
Non-Functional Requirements
Order rows by priority: Must first, then Should, then May.
| ID | Priority | Category | Requirement |
|---|---|---|---|
| NFR-01 | Must | Reliability | Exactly one active run per `Agent` shall be allowed; concurrent `prompt` calls shall throw. |
| NFR-02 | Must | Reliability | Run failures shall be surfaced as a synthetic failure assistant message with `stopReason: "aborted" \| "error"`, followed by `agent_end`. |
| NFR-03 | Must | Correctness | The run shall not be considered idle until `agent_end` listeners have settled. |
| NFR-04 | Must | Portability | Core modules shall be platform-agnostic; Node-specific code shall live behind the `./node` entrypoint. |
| NFR-05 | Should | Maintainability | A stable, backend-independent error taxonomy (`FileErrorCode`, `ExecutionErrorCode`, `CompactionErrorCode`, `SessionErrorCode`) shall be used for fallible operations. |
Constraints
- Built on `pi-ai` via the `StreamFn` boundary; does not import provider SDKs directly.
- Erasable TypeScript syntax only; ESM only.
Acceptance Criteria
Every FR and NFR shall have at least one acceptance criterion.
Order criteria by FRs first (sorted by ID), then NFRs (sorted by ID).
- FR-02 (Must): The system shall run an agent loop that streams an assistant response, extracts tool calls, executes them, and continues until no more tool calls or queued messages remain.
  - Given an agent with tools and a prompt that triggers tool calls
  - When the agent loop runs to completion
  - Then it streams the response, executes the tool calls, appends results, and continues until there are no more tool calls or queued messages.
- FR-04 (Must): The system shall execute tool calls either sequentially or in parallel based on config and per-tool `executionMode`, emitting `tool_execution_*` events in completion order while persisting tool results in source order.
  - Given a batch of parallel-capable tool calls plus one sequential tool
  - When the batch executes
  - Then the whole batch runs sequentially and `tool_execution_end` events fire in completion order, while persisted tool-result messages preserve assistant source order.
- FR-05 (Must): The system shall validate tool arguments before execution and support `beforeToolCall`/`afterToolCall` hooks that can block, override, or terminate a tool call.
  - Given a `beforeToolCall` hook returning `{ block: true }`
  - When the tool is about to execute
  - Then the call is blocked and an error tool result is recorded.
- FR-06 (Must): The system shall support message queues for steering (delivered after the current tool batch) and follow-up (delivered after the agent stops).
  - Given a streaming agent and an incoming steering message
  - When the current tool batch completes
  - Then the steering message is delivered to the agent before it stops.
- FR-07 (Must): The system shall support abort via an `AbortController` whose signal flows to `streamFn` and tool `execute`.
  - Given an in-progress run
  - When `abort()` is called
  - Then the abort signal propagates to the active stream and executing tools.
- NFR-01 (Must): Exactly one active run per `Agent` shall be allowed; concurrent `prompt` calls shall throw.
  - Given an agent with an active run
  - When `prompt` is called again concurrently
  - Then the second call throws rather than interleaving.
- NFR-02 (Must): Run failures shall be surfaced as a synthetic failure assistant message with `stopReason: "aborted" | "error"`, followed by `agent_end`.
  - Given a run that throws unexpectedly
  - When the error is caught
  - Then a synthetic failure assistant message with an appropriate `stopReason` is emitted, followed by `agent_end`.
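The FR-04 ordering rule (events in completion order, persisted results in source order) can be illustrated with a small sketch. The `ToolCall` and `ToolResult` shapes and the `executeParallel` name are hypothetical and are not taken from pi-agent-core; the sketch only assumes that each tool call resolves to a string.

```typescript
// Hypothetical shapes for illustration; not the pi-agent-core types.
interface ToolCall {
  id: string;
  run: () => Promise<string>;
}

interface ToolResult {
  id: string;
  output: string;
}

// Runs every call concurrently. `onEnd` fires as each call finishes
// (completion order); the returned array follows the input order
// (source order), because Promise.all preserves the order of its input.
async function executeParallel(
  calls: ToolCall[],
  onEnd: (id: string) => void,
): Promise<ToolResult[]> {
  const pending = calls.map(async (call): Promise<ToolResult> => {
    const output = await call.run();
    onEnd(call.id); // stands in for a tool_execution_end event
    return { id: call.id, output };
  });
  return Promise.all(pending);
}
```

If call `b` finishes before call `a`, a listener sees `b` then `a`, while the returned results still read `a` then `b`, which is what keeps the transcript deterministic.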
Conflicts
None identified yet.
Open Questions
- Should `streamProxy` remain a first-class supported deployment mode, or is it expected to migrate to an extension?
- What is the long-term relationship between `AgentHarness` and the coding-agent's own `AgentSession` (overlap, merge, or distinct layers)?
Specification: Agent Runtime
Overview
pi-agent-core is split into a low-level layer (Agent, agentLoop) and a high-level AgentHarness.
The low-level layer is transport-agnostic and delegates LLM calls to an injectable StreamFn.
The harness adds durable session storage, compaction, skills, system prompts, and a rich provider/session hook surface.
Architecture
+---------------------------------------------------+
|                   AgentHarness                    |
|  sessions, compaction, skills, system-prompt,     |
|  provider/session hooks (before_provider_request, |
|  tool_call, session_before_compact, ...)          |
+-------------------------+-------------------------+
                          | owns
                          v
+---------------------------------------------------+
|                       Agent                       |
|  transcript, tools, model/thinking, message       |
|  queues, lifecycle (prompt/continue/abort)        |
+-------------------------+-------------------------+
                          | drives
                          v
+---------------------------------------------------+
|                     agentLoop                     |
|  stream -> extract tools -> execute -> continue   |
+-------------------------+-------------------------+
                          | calls
                          v
               +---------------------+
               |  StreamFn (inject)  |  default: pi-ai streamSimple
               +---------------------+  alt: streamProxy (server)
Data Models
AgentState
| Field | Type | Constraints | Description |
|---|---|---|---|
| systemPrompt | string | optional | Active system prompt |
| model | Model | not null | Active LLM model |
| thinkingLevel | ThinkingLevel | not null | Reasoning effort |
| tools | AgentTool[] | not null | Registered tools |
| messages | AgentMessage[] | not null | Transcript (app-extensible) |
| isStreaming | boolean | readonly | True while a run is active |
| streamingMessage | AssistantMessage | readonly | Partial message during stream |
| pendingToolCalls | Set | readonly | Tool calls in flight |
| errorMessage | string | readonly | Last run error, if any |
AgentTool
| Field | Type | Constraints | Description |
|---|---|---|---|
| name | string | not null | Tool name |
| description | string | not null | LLM-facing description |
| parameters | TSchema | not null | TypeBox parameter schema |
| label | string | optional | UI label |
| execute | function | not null | (id, args, signal, onUpdate) -> result |
| prepareArguments | function | optional | Pre-execution arg transform |
| executionMode | enum | optional | sequential or parallel |
AgentMessage
A union of pi-ai Message types plus app-injected custom message types via declaration merging.
convertToLlm bridges AgentMessage[] to pi-ai Message[] for the stream; transformContext optionally rewrites the AgentMessage[] before conversion.
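A minimal sketch of the declaration-merging pattern described above, under stated assumptions: `LlmMessage` stands in for the pi-ai `Message` type, and the registry name `CustomAgentMessages` and the `notice` message are hypothetical. Only `AgentMessage` and `convertToLlm` are names taken from this document, and their real signatures may differ.

```typescript
// Stand-in for the pi-ai Message type (assumed shape).
type LlmMessage = { role: "user" | "assistant"; content: string };

// Registry that applications extend through declaration merging.
// The name `CustomAgentMessages` is hypothetical.
interface CustomAgentMessages {}

// An application adds its own message type by re-declaring the interface.
// Across packages this would sit inside a `declare module` block.
interface CustomAgentMessages {
  notice: { role: "notice"; text: string };
}

// The union grows automatically with every merged registry entry.
type AgentMessage = LlmMessage | CustomAgentMessages[keyof CustomAgentMessages];

function isLlmMessage(message: AgentMessage): message is LlmMessage {
  return message.role !== "notice";
}

// Drops app-only messages so the stream only receives LLM messages.
function convertToLlm(messages: AgentMessage[]): LlmMessage[] {
  return messages.filter(isLlmMessage);
}
```

The transcript can then hold UI-only entries such as a notice without the LLM ever seeing them. The trade-off noted under Risks applies: because the union is assembled from merged declarations, a type error may surface far from the declaration that caused it.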
AgentEvent
| Type | Emitted | Carries |
|---|---|---|
| agent_start / agent_end | run begin / end | - |
| turn_start / turn_end | each turn | - |
| message_start / update / end | stream lifecycle | partial/full assistant message |
| tool_execution_start / update / end | per tool | tool id, params, progress, result |
API Contracts
Agent.prompt(messages) / Agent.continue()
Request: AgentMessage[] for prompt; none for continue (resumes from current context; last message must be user/toolResult).
Response: an async subscription; results arrive via subscribe() events. Throws if an agent is already running.
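The single-active-run rule (NFR-01) could be enforced with a small guard like the one below. This is a sketch only: the `RunGuard` class and its `begin` method are hypothetical names, and the actual `Agent` may track its run state differently.

```typescript
// Hypothetical guard illustrating "exactly one active run per Agent".
class RunGuard {
  private active = false;

  // Mirrors the read-only isStreaming flag from AgentState.
  get isStreaming(): boolean {
    return this.active;
  }

  // Marks a run as started, or throws if one is already active.
  // Returns a function that marks the run as finished.
  begin(): () => void {
    if (this.active) {
      throw new Error("Agent is already running");
    }
    this.active = true;
    return () => {
      this.active = false;
    };
  }
}
```

A `prompt` implementation would call `begin()` first and invoke the returned function only after `agent_end` listeners have settled, so that a second `prompt` issued from inside a listener still throws.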
StreamFn
| Field | Type | Required | Description |
|---|---|---|---|
| model | Model | yes | Target model |
| context | Context | yes | Messages, tools, system prompt, reasoning |
| apiKey | ModelAuth | optional | Explicit auth |
| signal | AbortSignal | optional | Cancellation |
Response: an AssistantMessageEventStream. Must never throw; failures are terminal stream events.
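The "must never throw" rule can be pictured as a wrapper that converts any exception into a terminal error event. The sketch below assumes a simplified event union (`text_delta`, `done`, `error`) and a hypothetical `neverThrow` helper; the real `AssistantMessageEventStream` carries more event types and a result promise.

```typescript
// Simplified event union (assumption; the real stream has more event types).
type StreamEvent =
  | { type: "text_delta"; delta: string }
  | { type: "done" }
  | { type: "error"; message: string };

// Wraps a source stream so that consumers never see an exception:
// any failure becomes a final `error` event instead.
async function* neverThrow(
  source: () => AsyncIterable<StreamEvent>,
): AsyncGenerator<StreamEvent> {
  try {
    for await (const event of source()) {
      yield event;
    }
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    yield { type: "error", message };
  }
}
```

With this shape, the agent loop needs a single code path for failures: it inspects the last event rather than wrapping every `for await` in its own `try`/`catch`.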
Sequences
Agent loop (single turn)
turn_start
-> transformContext (AgentMessage[])
-> convertToLlm (-> Message[])
-> StreamFn(context)
-> stream deltas -> message_update events
-> done -> message_end
-> extract tool calls
-> beforeToolCall (can block)
-> execute tools (sequential|parallel) -> tool_execution_* events
-> afterToolCall (can override/terminate)
-> append tool results to context
turn_end
-> drain steering messages? loop
-> else drain follow-up messages? outer loop
agent_end (after listeners settle)
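The turn sequence above can be condensed into a short sketch. It is deliberately reduced: events, hooks, `transformContext`, and the follow-up queue are omitted, tool arguments are plain strings, and the `Respond`, `Tools`, and `runLoop` names are hypothetical stand-ins rather than the pi-agent-core API.

```typescript
type ToolCall = { id: string; name: string; args: string };
type AssistantMessage = { role: "assistant"; content: string; toolCalls: ToolCall[] };
type Message =
  | { role: "user"; content: string }
  | AssistantMessage
  | { role: "toolResult"; toolCallId: string; content: string };

// Stand-in for StreamFn: resolves with the finished assistant message.
type Respond = (context: Message[]) => Promise<AssistantMessage>;
type Tools = Record<string, (args: string) => Promise<string>>;

async function runLoop(
  context: Message[],
  respond: Respond,
  tools: Tools,
  steering: Message[] = [],
): Promise<Message[]> {
  for (;;) {
    // One turn: get the assistant response, then run its tool calls.
    const assistant = await respond(context);
    context.push(assistant);
    for (const call of assistant.toolCalls) {
      const tool = tools[call.name];
      const content = tool ? await tool(call.args) : `unknown tool: ${call.name}`;
      context.push({ role: "toolResult", toolCallId: call.id, content });
    }
    // Steering messages are delivered after the current tool batch.
    const queued = steering.splice(0, steering.length);
    context.push(...queued);
    // Stop once there is nothing left to answer.
    if (assistant.toolCalls.length === 0 && queued.length === 0) {
      return context;
    }
  }
}
```

A scripted `respond` function, similar in spirit to the faux provider, makes the loop testable without any network calls.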
Failure handling
run executor throws
-> handleRunFailure: synthesize assistant message (stopReason: aborted|error)
-> emit message_end, agent_end
-> finishRun: clear runtime state
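The failure path can be sketched as a function that synthesizes the assistant message and emits the two closing events in order. The `FailureMessage` fields and the `emit` callback are assumptions made for illustration; only the `stopReason` values and the event names come from this specification.

```typescript
type StopReason = "aborted" | "error";

// Hypothetical shape of the synthetic failure message.
interface FailureMessage {
  role: "assistant";
  content: string;
  stopReason: StopReason;
  errorMessage: string;
}

type FailureEvent =
  | { type: "message_end"; message: FailureMessage }
  | { type: "agent_end" };

// Converts a thrown value into a synthetic assistant message and emits
// message_end followed by agent_end, as in the sequence above.
function handleRunFailure(
  err: unknown,
  signal: { aborted: boolean },
  emit: (event: FailureEvent) => void,
): FailureMessage {
  const message: FailureMessage = {
    role: "assistant",
    content: "",
    stopReason: signal.aborted ? "aborted" : "error",
    errorMessage: err instanceof Error ? err.message : String(err),
  };
  emit({ type: "message_end", message });
  emit({ type: "agent_end" });
  return message;
}
```

Passing the abort signal in lets the same path distinguish a user-initiated abort from an unexpected error without inspecting the error object itself.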
Technical Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Transport boundary | Injectable StreamFn | Enables direct, proxy, and custom backends without coupling to pi-ai |
| Message extensibility | Declaration-merged AgentMessage | Apps add custom types without forking; convertToLlm is the bridge |
| Parallel tool execution | Completion-order events, source-order results | Parallelism for speed, deterministic transcript ordering |
| Idle definition | After agent_end listeners settle | Listeners can perform async cleanup before idle |
| Core vs Node | ./node entrypoint | Keeps core platform-agnostic; Node fs/shell isolated |
Risks and Unknowns
- Overlap between `AgentHarness` and coding-agent's `AgentSession` could lead to duplicated session/compaction logic.
- The declaration-merge extensibility model is powerful but can make type errors hard to localize.
- Parallel tool execution ordering semantics are subtle; regressions here are hard to detect without targeted tests.
Out of Scope
- LLM provider abstraction and streaming protocol (FEAT-0001).
- Terminal rendering (FEAT-0003).
- Interactive TUI, slash commands, and the CLI product (FEAT-0004).
- Coding-agent-specific session format and branching UI (FEAT-0006).
Vocabulary
Domain Terms
| Term | Definition |
|---|---|
| Pi | The project: a minimal, self-extensible terminal coding agent harness and its libraries. |
| Harness | The coding agent runtime that wires the agent loop, tools, sessions, and UI together. |
| Extension | A TypeScript module with a default export `function (pi: ExtensionAPI)` that augments the agent with tools, commands, events, UI, or providers. |
| Skill | An on-demand capability package following the Agent Skills standard (`SKILL.md` + optional frontmatter), invoked as `/skill:name`. |
| Prompt template | A Markdown file with `{{variable}}` expansion invoked as `/templatename`. |
| pi package | A distributable bundle (npm or git) of extensions, skills, prompts, themes, or custom providers, installed via `pi install`. |
| Session | A persistent, branchable conversation log stored as JSONL (`SessionHeader`, messages, compaction summaries, branch summaries). |
| Branching | Tree-structured session forking (`/fork`, `/clone`, `/tree`) where each entry has `id`/`parentId`. |
| Compaction | Lossy summarization of older session messages to reclaim context; original JSONL is preserved. |
| Steering | A queued message delivered to a streaming agent after the current tool batch completes. |
| Follow-up | A queued message delivered after the agent fully stops. |
| Project trust | A per-folder decision (`~/.pi/agent/trust.json`) gating whether project settings, resources, and extensions execute. |
| Scope (model) | A scoped model set selected with `--models pat1,pat2` for Ctrl+P cycling. |
| Faux provider | An in-memory scripted provider (`providers/faux.ts`) used for deterministic tests with no real API calls. |
| Core | The deliberately minimal set of built-in capabilities (four tools); features outside core must be extensions. |
Technical Terms
| Term | Definition |
|---|---|
| Provider | The runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`). |
| API implementation | A wire-protocol backend shared by providers (e.g. `anthropic-messages`, `openai-responses`, `openai-completions`, `google-generative-ai`, `bedrock-converse-stream`). |
| Models collection | pi-ai's provider registry that routes model lookups and streams by owning provider. |
| streamFn / StreamFn | The injectable function the agent calls to reach the LLM; streamSimple is the default. |
| AssistantMessageEventStream | pi-ai's async-iterable event queue (push queue + result promise) carrying start/*_delta/done/error events. |
| Agent | pi-agent-core's stateful class owning the transcript and lifecycle (prompt, continue, abort). |
| AgentHarness | pi-agent-core's higher-level orchestrator wrapping Agent with sessions, compaction, skills, and provider hooks. |
| agentLoop | The low-level prompt-stream-tool-continue loop in pi-agent-core. |
| AgentMessage | pi-agent-core's app-extensible message union (via declaration merging); convertToLlm bridges to pi-ai Message. |
| TypeBox | The schema library used for tool parameter definitions (serializable JSON, self-validating). |
| Differential rendering | pi-tui's technique of diffing a new line array against the previous frame and writing minimal escape sequences. |
| Synchronized output | Terminal escape sequence (`\x1b[?2026h..l`) used by pi-tui for atomic, flicker-free rendering. |
| Kitty keyboard protocol | Terminal input protocol pi-tui negotiates for richer key reporting. |
| Lockstep versioning | All packages share one version and release together. |
| Trusted publishing | npm publish via GitHub Actions OIDC (environment `npm-publish`); no local credentials required. |
| Shrinkwrap | `packages/coding-agent/npm-shrinkwrap.json`, generated from the root lockfile to pin transitive deps for npm users. |
Acronyms and Abbreviations
| Abbreviation | Expansion |
|---|---|
| TUI | Terminal User Interface (the interactive mode, and the `pi-tui` library). |
| CLI | Command-Line Interface (the `pi` binary). |
| LLM | Large Language Model. |
| MCP | Model Context Protocol (not built into core; extensions may add it). |
| SDK | Software Development Kit (the embeddable programmatic API). |
| RPC | Remote Procedure Call (the JSONL stdin/stdout protocol mode). |
| OAuth | Open Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). |
| PKCE | Proof Key for Code Exchange (OAuth flow used by pi-ai). |
| API key | Application Programming Interface key (ambient provider authentication). |
| AC | Acceptance Criterion / Acceptance Criteria. |
| FR | Functional Requirement. |
| NFR | Non-Functional Requirement. |
| ADR | Architecture Decision Record. |
| RICE | Reach, Impact, Confidence, Effort (issue prioritization scoring). |
| SLO / SLI | Service Level Objective / Service Level Indicator. |
| OIDC | OpenID Connect (used for npm trusted publishing identity). |
| CVE | Common Vulnerabilities and Exposures. |
| IME | Input Method Editor (pi-tui positions the hardware cursor for IME candidate windows). |
| CJK | Chinese, Japanese, Korean (terminal width handling for wide characters). |
| WASM | WebAssembly (photon-node used for image resizing). |
| AGENTS.md | Project-specific rules file for humans and agents, read automatically from the repo root. |
Requirements: Terminal UI Framework
Overview
pi-tui is a minimal terminal UI framework with differential rendering and synchronized output for flicker-free interactive CLI applications.
It is custom-built (no React/Ink/VDOM): components are plain classes that return styled strings, and a hand-written renderer diffs the new frame against the previous one and writes minimal escape sequences.
It powers the coding agent's interactive mode and is reusable by any terminal application.
Stakeholders
| Stakeholder | Interest |
|---|---|
| Coding agent interactive mode | Primitive components (editor, text, markdown, lists, overlays) and flicker-free rendering |
| SDK / library users | A terminal-agnostic component framework they can target |
| End users | Smooth, responsive terminal rendering across platforms and terminals |
Functional Requirements
Order rows by priority: Must first, then Should, then May.
| ID | Priority | Requirement |
|---|---|---|
| FR-01 | Must | The system shall provide a `Component` model where components implement `render(width): string[]` and optional `handleInput`. |
| FR-02 | Must | The system shall provide a `TUI` manager that composes components, manages focus, overlays, and drives rendering. |
| FR-03 | Must | The system shall perform differential rendering, writing only changed line ranges between frames. |
| FR-04 | Must | The system shall wrap updates in synchronized output escape sequences for atomic, flicker-free rendering. |
| FR-05 | Must | The system shall throttle rendering to a capped frame rate and coalesce multiple invalidations per tick. |
| FR-06 | Must | The system shall provide reusable components: `Editor`, `Text`, `TruncatedText`, `Input`, `Box`, `Markdown`, `SelectList`, `SettingsList`, `Loader`, `CancellableLoader`, `Image`, `Spacer`. |
| FR-07 | Must | The system shall provide an `Overlay` system with anchor-based, percentage, and absolute positioning, plus focus restore. |
| FR-08 | Must | The system shall provide key handling with Kitty keyboard protocol negotiation and legacy sequence fallback. |
| FR-09 | Must | The system shall provide column-aware string utilities (`visibleWidth`, `truncateToWidth`, `sliceByColumn`, `wrapTextWithAnsi`) handling ANSI codes and wide/CJK characters. |
| FR-10 | Must | The system shall be terminal-agnostic via a `Terminal` interface, with `ProcessTerminal` for production. |
| FR-11 | Should | The system shall support terminal image protocols (Kitty and iTerm2 inline images) with capability detection. |
| FR-12 | Should | The system shall support IME candidate windows by emitting a zero-width cursor marker and positioning the hardware cursor. |
| FR-13 | Should | The system shall provide a keybindings manager with default TUI keybindings. |
| FR-14 | Should | The system shall detect terminal color scheme (light/dark) via OSC 11. |
| FR-15 | Should | The system shall provide fuzzy match/filter and autocomplete providers (slash commands + file paths). |
| FR-16 | May | The system shall ship native addons for win32 and darwin to report modifier-key state. |
Non-Functional Requirements
Order rows by priority: Must first, then Should, then May.
| ID | Priority | Category | Requirement |
|---|---|---|---|
| NFR-01 | Must | Correctness | Components shall return lines that do not exceed the provided width; the framework shall error otherwise. |
| NFR-02 | Must | Performance | First render and full re-render strategies shall clear scrollback correctly; differential updates shall never leave stale content. |
| NFR-03 | Must | Compatibility | Each rendered line shall append a full SGR reset and OSC 8 hyperlink reset so styles do not carry across lines. |
| NFR-04 | Should | Testability | The framework shall be testable against a virtual terminal (xterm headless) for deterministic render assertions. |
| NFR-05 | Should | Performance | Rendering shall be capped at approximately 60 fps (`MIN_RENDER_INTERVAL_MS = 16`). |
Constraints
- No React, no Ink, no virtual DOM; imperative component model.
- Dependencies kept minimal (`get-east-asian-width`, `marked`).
- Erasable TypeScript syntax only; ESM only.
Acceptance Criteria
Every FR and NFR shall have at least one acceptance criterion.
Order criteria by FRs first (sorted by ID), then NFRs (sorted by ID).
- FRFunctional Requirement.-03MustThe system shall perform differential rendering, writing only changed line ranges between frames.
- Given a previous frame and a new frame differing on a contiguous range of lines
- When a render is requested
- Then only the changed range is written, with the cursor moved to the first changed line and clear-to-end applied.
- FRFunctional Requirement.-04MustThe system shall wrap updates in synchronized output escape sequences for atomic, flicker-free rendering.
- Given any render update
- When the update is written to the terminal
- Then the bytes are wrapped in synchronized output (
\x1b[?2026h...\x1b[?2026l).
- FRFunctional Requirement.-07MustThe system shall provide an `Overlay` system with anchor-based, percentage, and absolute positioning, plus focus restore.
- Given a focused overlay that is temporarily replaced by another overlay
- When the replacement releases focus
- Then the original overlay reclaims focus via the focus-restore state machine.
- FRFunctional Requirement.-09MustThe system shall provide column-aware string utilities (`visibleWidth`, `truncateToWidth`, `sliceByColumn`, `wrapTextWithAnsi`) handling ANSI codes and wide/CJKChinese, Japanese, Korean (terminal width handling for wide characters). characters.
- Given a string containing ANSI codes and wide CJKChinese, Japanese, Korean (terminal width handling for wide characters). characters
- When
visibleWidthandtruncateToWidthare applied - Then the visible width and truncation account for display columns, not raw character count.
- NFRNon-Functional Requirement.-01MustComponents shall return lines that do not exceed the provided width; the framework shall error otherwise.
- Given a component that returns a line longer than
width - When rendered
- Then the framework raises an error rather than overflowing the terminal.
- Given a component that returns a line longer than
- NFRNon-Functional Requirement.-03MustEach rendered line shall append a full SGR reset and OSC 8 hyperlink reset so styles do not carry across lines.
- Given consecutive rendered lines with different styling
- When composited
- Then each line resets SGR state so no style leaks into the next line.
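The FR-09 criterion (display columns rather than raw character count) can be illustrated with a minimal sketch. The function names `visibleWidthSketch` and `truncateToWidthSketch` are hypothetical stand-ins, not the library's `visibleWidth`/`truncateToWidth`, and the wide-character ranges are a rough approximation; the `get-east-asian-width` dependency would be the authoritative source in practice.

```typescript
// Sketch only: tokenizes a string into escape sequences and code points,
// then counts display columns. Ranges in isWide() are an approximation.

// One token is a CSI sequence, an OSC sequence (e.g. OSC 8), or one code point.
const TOKEN =
  /\x1b\[[0-9;?]*[ -\/]*[@-~]|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)|[\s\S]/gu;

function isWide(cp: number): boolean {
  return (
    (cp >= 0x1100 && cp <= 0x115f) || // Hangul Jamo
    (cp >= 0x2e80 && cp <= 0xa4cf) || // CJK radicals through Yi
    (cp >= 0xac00 && cp <= 0xd7a3) || // Hangul syllables
    (cp >= 0xf900 && cp <= 0xfaff) || // CJK compatibility ideographs
    (cp >= 0xff00 && cp <= 0xff60) || // Fullwidth forms
    (cp >= 0xffe0 && cp <= 0xffe6) // Fullwidth signs
  );
}

// Escape sequences occupy zero columns; wide characters occupy two.
function tokenWidth(token: string): number {
  if (token.startsWith("\x1b")) return 0;
  return isWide(token.codePointAt(0) ?? 0) ? 2 : 1;
}

function visibleWidthSketch(text: string): number {
  let width = 0;
  for (const token of text.match(TOKEN) ?? []) width += tokenWidth(token);
  return width;
}

// Keeps escape sequences intact and stops before the first character
// that would push the visible width past maxWidth.
function truncateToWidthSketch(text: string, maxWidth: number): string {
  let width = 0;
  let out = "";
  for (const token of text.match(TOKEN) ?? []) {
    const w = tokenWidth(token);
    if (width + w > maxWidth) break;
    width += w;
    out += token;
  }
  return out;
}
```

Truncation in this sketch may cut off a trailing reset sequence, which is one reason a per-line SGR and OSC 8 reset (NFR-03) is useful.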
Conflicts
None identified yet.
Open Questions
- Is the native modifier-key addon (`native-modifiers.ts`) expected to gain a Linux path, or is the protocol fallback sufficient there?
- What is the policy for adding new built-in components versus leaving them to consumers?
Specification: Terminal UI Framework
Overview
pi-tui is an imperative terminal UI framework.
Components return arrays of pre-styled strings; the TUI manager diffs the new array against the previous frame and emits minimal escape sequences wrapped in synchronized output.
There is no virtual DOM; callers mutate components then request a render.
Architecture
+-------------------+ mutate + requestRender
| Application code |-------------------------+
+-------------------+ |
| uses |
v v
+-------------------+ render(width) +-------------+
| Components |------------------>| TUI |
| (Editor, Text, | | (manager) |
| Markdown, ...) | +------+------+
+-------------------+ | composites overlays
| diffs vs previousLines
v
+---------------------+
| differential write |
| (sync output, ~60fps)|
+----------+----------+
|
v
+---------------------+
| Terminal interface |
| (ProcessTerminal / |
| VirtualTerminal) |
+---------------------+
Data Models
Component
| Field | Type | Constraints | Description |
|---|---|---|---|
| render | (width: number) => string[] | required | Returns styled lines, each <= width |
| handleInput | (data) => boolean | optional | Returns true if input consumed |
| wantsKeyRelease | boolean | optional | Claims key release |
| invalidate | () => void | required | Marks the component dirty |
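As a minimal sketch, the Component table above could map onto a TypeScript interface like the following. The field names come from the table; the `string` type for input data and the `LabelSketch` class are assumptions made for illustration, not part of pi-tui.

```typescript
// Shape taken from the Component table; the input type is an assumption.
interface Component {
  render(width: number): string[]; // each returned line must be <= width columns
  handleInput?(data: string): boolean; // true if the input was consumed
  wantsKeyRelease?: boolean;
  invalidate(): void;
}

// Hypothetical component: a one-line label clipped to the available width.
// Clipping by string length is only correct for plain ASCII text.
class LabelSketch implements Component {
  private text: string;
  dirty = true;

  constructor(text: string) {
    this.text = text;
  }

  // Imperative model: callers mutate the component, then request a render.
  setText(text: string): void {
    this.text = text;
    this.invalidate();
  }

  render(width: number): string[] {
    this.dirty = false;
    return [this.text.slice(0, Math.max(0, width))];
  }

  invalidate(): void {
    this.dirty = true;
  }
}
```

The class avoids constructor parameter properties so that it stays within erasable TypeScript syntax.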
OverlayOptions
| Field | Type | Constraints | Description |
|---|---|---|---|
| anchor | object | optional | Positioning anchor |
| margins | object | optional | Edge margins |
| visible | () => boolean | optional | Visibility callback |
| nonCapturing | boolean | optional | Does not capture focus |
Key
| Field | Type | Constraints | Description |
|---|---|---|---|
| key | string | optional | Logical key name |
| ctrl/alt/shift | boolean | optional | Modifiers |
| paste | boolean | optional | Bracketed paste flag |
API Contracts
TUI.start() / TUI.stop()
Begins/ends raw mode, Kitty keyboard negotiation, and the render loop.
requestRender() schedules a throttled render; setFocus, showOverlay/hideOverlay manage composition.
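The throttling behind `requestRender()` can be sketched as follows, assuming the `MIN_RENDER_INTERVAL_MS = 16` value from NFR-05. The `RenderSchedulerSketch` class and its injected clock are hypothetical, chosen so the example is deterministic; the real scheduler uses timers rather than an explicit `tick`.

```typescript
const MIN_RENDER_INTERVAL_MS = 16; // value taken from NFR-05

// How long to wait before another render is allowed.
function renderDelayMs(lastRenderAt: number, now: number): number {
  const elapsed = now - lastRenderAt;
  return elapsed >= MIN_RENDER_INTERVAL_MS ? 0 : MIN_RENDER_INTERVAL_MS - elapsed;
}

// Hypothetical scheduler: repeated requests coalesce into one pending render,
// and a render only runs once the minimum interval has elapsed.
class RenderSchedulerSketch {
  private pending = false;
  private lastRenderAt = Number.NEGATIVE_INFINITY;
  renderCount = 0;

  requestRender(): void {
    this.pending = true;
  }

  // `now` is injected (milliseconds) instead of read from a real clock.
  tick(now: number): void {
    if (!this.pending) return;
    if (renderDelayMs(this.lastRenderAt, now) > 0) return;
    this.pending = false;
    this.lastRenderAt = now;
    this.renderCount++;
  }
}
```

Two requests before the first tick produce a single render, and a request made 5 ms after a render waits until the 16 ms boundary.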
Keybindings
matchesKey(keyData, binding) compares parsed keys against configurable bindings.
Defaults live in TUI_KEYBINDINGS; consumers override via setKeybindings.
Sequences
Differential render
application -> component.invalidate() / TUI.requestRender()
TUI (next tick, if >= MIN_RENDER_INTERVAL_MS):
render each focused component -> lines[]
composite overlays into lines[]
extract cursor marker -> compute hardware cursor position
if first render: output all lines (no scrollback clear)
elif full re-render: clear screen + home + clear scrollback; delete Kitty images
else (differential):
find first/last changed line vs previousLines
move cursor to first changed line; clear to end
write only changed range
wrap all output in synchronized output escapes
handle appended/deleted lines and viewport scroll
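The differential branch of the sequence above can be sketched as a pure function that returns the bytes to write. This is an illustration under stated assumptions, not the pi-tui implementation: `cursorRow` is a hypothetical 0-based row relative to the top of the frame, "clear to end" is interpreted as erasing each rewritten line (`\x1b[2K`), and overlays, the cursor marker, Kitty images, and viewport scrolling are left out.

```typescript
const SYNC_BEGIN = "\x1b[?2026h"; // begin synchronized output
const SYNC_END = "\x1b[?2026l"; // end synchronized output

interface ChangedRange {
  first: number;
  last: number;
}

// Finds the first and last line index where the two frames differ.
function findChangedRange(prev: string[], next: string[]): ChangedRange | null {
  const max = Math.max(prev.length, next.length);
  let first = 0;
  while (first < max && prev[first] === next[first]) first++;
  if (first === max) return null; // frames are identical
  let last = max - 1;
  while (last > first && prev[last] === next[last]) last--;
  return { first, last };
}

// Builds the escape-sequence output for one differential update.
function diffWrite(prev: string[], next: string[], cursorRow: number): string {
  const range = findChangedRange(prev, next);
  if (range === null) return "";
  let out = SYNC_BEGIN;
  const delta = range.first - cursorRow;
  if (delta < 0) out += `\x1b[${-delta}A`; // cursor up
  else if (delta > 0) out += `\x1b[${delta}B`; // cursor down
  out += "\r"; // column 0
  for (let row = range.first; row <= range.last; row++) {
    out += "\x1b[2K" + (next[row] ?? ""); // erase the line, then rewrite it
    if (row < range.last) out += "\r\n";
  }
  return out + SYNC_END;
}
```

Returning a string instead of writing to a terminal keeps the sketch easy to assert against, in the spirit of the virtual-terminal testing called for by NFR-04.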
Technical Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Rendering model | Imperative, string-line diff | No VDOM overhead; minimal escape output |
| Atomicity | Synchronized output (`?2026h/l`) | Flicker-free even across slow connections |
| Throttling | ~60fps with nextTick coalescing | Bounds CPU usage while staying responsive |
| IME cursor | Zero-width APC marker | Keeps fake cursor while positioning hardware cursor for IME |
| Line resets | Full SGR + OSC 8 reset per line | Prevents style/hyperlink leakage across lines |
| Terminal abstraction | `Terminal` interface | `ProcessTerminal` in production, `VirtualTerminal` in tests |
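The "Line resets" decision can be illustrated with a short sketch. The helper name `finalizeLine` is hypothetical; the two escape sequences are the standard SGR reset and the empty-URI form of OSC 8 that closes a hyperlink.

```typescript
const SGR_RESET = "\x1b[0m"; // reset all text attributes
const OSC8_RESET = "\x1b]8;;\x1b\\"; // OSC 8 with an empty URI closes any open hyperlink

// Appends both resets so styling on one line cannot leak into the next.
function finalizeLine(line: string): string {
  return line + SGR_RESET + OSC8_RESET;
}

// Applies the reset to every line of a frame before it is written.
function finalizeFrame(lines: string[]): string[] {
  return lines.map(finalizeLine);
}
```

Appending the resets unconditionally costs a few bytes per line but avoids having to track which styles are still open at each line end.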
Risks and Unknowns
- Differential rendering correctness depends on accurate change detection; subtle bugs (Kitty image ranges, overlay compositing above viewport) require regression tests.
- Terminal compatibility variance (Kitty support, synchronized output, color scheme reporting) means fallback paths must be exercised.
- The overlay focus-restore state machine has many states ("eligible"/"blocked") and is a likely source of regressions.
Out of Scope
- LLM streaming and provider abstraction (FEAT-0001).
- Agent loop and tool execution (FEAT-0002).
- The interactive coding agent product built on top of this framework (FEAT-0004).
Vocabulary
Domain Terms
| Term | Definition |
|---|---|
| Pi | The project: a minimal, self-extensible terminal coding agent harness and its libraries. |
| Harness | The coding agent runtime that wires the agent loop, tools, sessions, and UI together. |
| Extension | A TypeScript module with a default export `function (pi: ExtensionAPI)` that augments the agent with tools, commands, events, UI, or providers. |
| Skill | An on-demand capability package following the Agent Skills standard (`SKILL.md` + optional frontmatter), invoked as `/skill:name`. |
| Prompt template | A Markdown file with `{{variable}}` expansion invoked as `/templatename`. |
| pi package | A distributable bundle (npm or git) of extensions, skills, prompts, themes, or custom providers, installed via `pi install`. |
| Session | A persistent, branchable conversation log stored as JSONL (`SessionHeader`, messages, compaction summaries, branch summaries). |
| Branching | Tree-structured session forking (`/fork`, `/clone`, `/tree`) where each entry has `id`/`parentId`. |
| Compaction | Lossy summarization of older session messages to reclaim context; original JSONL is preserved. |
| Steering | A queued message delivered to a streaming agent after the current tool batch completes. |
| Follow-up | A queued message delivered after the agent fully stops. |
| Project trust | A per-folder decision (`~/.pi/agent/trust.json`) gating whether project settings, resources, and extensions execute. |
| Scope (model) | A scoped model set selected with `--models pat1,pat2` for Ctrl+P cycling. |
| Faux provider | An in-memory scripted provider (`providers/faux.ts`) used for deterministic tests with no real API calls. |
| Core | The deliberately minimal set of built-in capabilities (four tools); features outside core must be extensions. |
Technical Terms
| Term | Definition |
|---|---|
| Provider | The runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`). |
| API implementation | A wire-protocol backend shared by providers (e.g. `anthropic-messages`, `openai-responses`, `openai-completions`, `google-generative-ai`, `bedrock-converse-stream`). |
| `Models` collection | pi-ai's provider registry that routes model lookups and streams by owning provider. |
| `streamFn` / `StreamFn` | The injectable function the agent calls to reach the LLM; `streamSimple` is the default. |
| `AssistantMessageEventStream` | pi-ai's async-iterable event queue (push queue + result promise) carrying `start`/`*_delta`/`done`/`error` events. |
| `Agent` | pi-agent-core's stateful class owning the transcript and lifecycle (`prompt`, `continue`, `abort`). |
| `AgentHarness` | pi-agent-core's higher-level orchestrator wrapping `Agent` with sessions, compaction, skills, and provider hooks. |
| `agentLoop` | The low-level prompt-stream-tool-continue loop in pi-agent-core. |
| `AgentMessage` | pi-agent-core's app-extensible message union (via declaration merging); `convertToLlm` bridges to pi-ai `Message`. |
| TypeBox | The schema library used for tool parameter definitions (serializable JSON, self-validating). |
| Differential rendering | pi-tui's technique of diffing a new line array against the previous frame and writing minimal escape sequences. |
| Synchronized output | Terminal escape sequence (`\x1b[?2026h..l`) used by pi-tui for atomic, flicker-free rendering. |
| Kitty keyboard protocol | Terminal input protocol pi-tui negotiates for richer key reporting. |
| Lockstep versioning | All packages share one version and release together. |
| Trusted publishing | npm publish via GitHub Actions OIDC (environment `npm-publish`); no local credentials required. |
| Shrinkwrap | `packages/coding-agent/npm-shrinkwrap.json`, generated from the root lockfile to pin transitive deps for npm users. |
Acronyms and Abbreviations
| Abbreviation | Expansion |
|---|---|
| TUI | Terminal User Interface (the interactive mode, and the `pi-tui` library). |
| CLI | Command-Line Interface (the `pi` binary). |
| LLM | Large Language Model. |
| MCP | Model Context Protocol (not built into core; extensions may add it). |
| SDK | Software Development Kit (the embeddable programmatic API). |
| RPC | Remote Procedure Call (the JSONL stdin/stdout protocol mode). |
| OAuth | Open Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). |
| PKCE | Proof Key for Code Exchange (OAuth flow used by pi-ai). |
| API key | Application Programming Interface key (ambient provider authentication). |
| AC | Acceptance Criterion / Acceptance Criteria. |
| FR | Functional Requirement. |
| NFR | Non-Functional Requirement. |
| ADR | Architecture Decision Record. |
| RICE | Reach, Impact, Confidence, Effort (issue prioritization scoring). |
| SLO / SLI | Service Level Objective / Service Level Indicator. |
| OIDC | OpenID Connect (used for npm trusted publishing identity). |
| CVE | Common Vulnerabilities and Exposures. |
| IME | Input Method Editor (pi-tui positions the hardware cursor for IME candidate windows). |
| CJK | Chinese, Japanese, Korean (terminal width handling for wide characters). |
| WASM | WebAssembly (photon-node used for image resizing). |
| AGENTS.md | Project-specific rules file for humans and agents, read automatically from the repo root. |
Requirements: Interactive Coding Agent
Overview
pi-coding-agent is the CLI product: a minimal terminal coding harness built around four core tools (`read`, `bash`, `edit`, `write`), designed to be extended.
It wires pi-ai, pi-agent-core, and pi-tui into an `AgentSession` that drives prompts, manages models and thinking levels, executes built-in tools, and renders an interactive TUI.
It also exposes print, JSON, and RPC modes plus an embeddable SDK.
Stakeholders
| Stakeholder | Interest |
|---|---|
| End users (developers) | A reliable interactive coding workflow with model choice and responsive UI |
| Extension / SDK authors | A stable SDK surface and clear extension boundaries |
| Maintainers | A minimal core that stays small; features belong in extensions when possible |
Functional Requirements
Order rows by priority: Must first, then Should, then May.
| ID | Priority | Requirement |
|---|---|---|
| FR-01 | Must | The system shall provide a `pi` CLI with an interactive TUI as the default mode. |
| FR-02 | Must | The system shall provide four default built-in tools: `read`, `bash`, `edit`, `write`, and a read-only tool set (`read`, `grep`, `find`, `ls`). |
| FR-03 | Must | The system shall provide an `AgentSession` core that drives the prompt loop, manages model and thinking level, runs compaction, executes bash, and exports HTML. |
| FR-04 | Must | The system shall resolve model selection via `provider/id:thinking` patterns, scoped model cycling, and in-TUI `/model` and `/scoped-models` commands. |
| FR-05 | Must | The system shall provide built-in slash commands (`settings`, `model`, `export`, `fork`, `tree`, `login`, `logout`, `new`, `compact`, `resume`, `reload`, `quit`, and others). |
| FR-06 | Must | The system shall support run modes: interactive (default), print (`-p`), JSON (`--mode json`), RPC (`--mode rpc`), and SDK. |
| FR-07 | Must | The system shall provide global and project settings (`~/.pi/agent/settings.json`, `.pi/settings.json`) with file locking and deep merge, plus keybindings. |
| FR-08 | Must | The system shall gate project resources (settings, extensions, context files) behind a project-trust decision. |
| FR-09 | Must | The system shall discover and load resources (extensions, skills, prompt templates, themes, AGENTS.md/CLAUDE.md) from global, project, and package sources. |
| FR-10 | Must | The system shall build the system prompt from base prompt, project context, skills, date, and cwd. |
| FR-11 | Should | The system shall support theme loading (JSON themes, light/dark auto, hot-reload). |
| FR-12 | Should | The system shall provide update-check and install-telemetry endpoints, disable-able via env flags or `--offline`. |
| FR-13 | Should | The system shall support first-time setup flow (theme + analytics opt-in) behind an experimental flag. |
| FR-14 | May | The system shall provide a `pi install/remove/update/list/config` package manager for pi packages (npm/git). |
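The deep merge named in FR-07 can be sketched as below. Two behaviors here are assumptions rather than documented facts: project settings are treated as overriding global settings, and arrays are replaced wholesale rather than concatenated. The keys used in the usage note are hypothetical.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

function isPlainObject(value: Json | undefined): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merges `override` into a copy of `base`.
// Nested objects merge key by key; every other value type is replaced.
function deepMerge(base: JsonObject, override: JsonObject): JsonObject {
  const out: JsonObject = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const existing = out[key];
    out[key] =
      isPlainObject(existing) && isPlainObject(value)
        ? deepMerge(existing, value)
        : value;
  }
  return out;
}
```

For example, merging a global `{ compaction: { enabled: true, threshold: 0.8 } }` with a project `{ compaction: { threshold: 0.5 } }` keeps `enabled` and replaces only `threshold`. File locking is a separate concern and is not shown.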
Non-Functional Requirements
Order rows by priority: Must first, then Should, then May.
| ID | Priority | Category | Requirement |
|---|---|---|---|
| NFR-01 | Must | Compatibility | The package shall target Node `>=22.19.0` and use erasable TypeScript syntax only. |
| NFR-02 | Must | Reliability | A hot-swap runtime shall tear down and recreate cwd-bound services when switching sessions or cwds. |
| NFR-03 | Must | Security | The system shall not include an in-process sandbox; containerization is delegated to external sandboxes (documented). |
| NFR-04 | Should | Performance | Startup shall defer non-essential network operations when `--offline` / `PI_OFFLINE=1` is set. |
| NFR-05 | Should | Maintainability | The published CLI shall include a generated npm shrinkwrap pinning transitive deps. |
Constraints
- Core is minimal: no built-in MCP, sub-agents, plan mode, or permission popups (extensions provide these).
- Direct dependencies pinned to exact versions; lockfile is ground truth.
- New issues/PRs from new contributors are auto-closed by default (maintainer gate).
Acceptance Criteria
Every FR and NFR shall have at least one acceptance criterion.
Order criteria by FRs first (sorted by ID), then NFRs (sorted by ID).
- FR-02 (Must): The system shall provide four default built-in tools: `read`, `bash`, `edit`, `write`, and a read-only tool set (`read`, `grep`, `find`, `ls`).
  - Given a default agent session
  - When tools are registered
  - Then `read`, `bash`, `edit`, `write` are available, and the read-only set restricts to `read`, `grep`, `find`, `ls`.
- FR-03 (Must): The system shall provide an `AgentSession` core that drives the prompt loop, manages model and thinking level, runs compaction, executes bash, and exports HTML.
  - Given a user prompt
  - When submitted to `AgentSession.prompt()`
  - Then it expands commands/templates, emits lifecycle events, streams the response, executes tools, handles retries/compaction, and persists entries to the session file.
- FR-06 (Must): The system shall support run modes: interactive (default), print (`-p`), JSON (`--mode json`), RPC (`--mode rpc`), and SDK.
  - Given the CLI invoked with `-p`, `--mode json`, `--mode rpc`, or no flag
  - When it starts
  - Then it runs in print, JSON, RPC, or interactive mode respectively.
- FR-08 (Must): The system shall gate project resources (settings, extensions, context files) behind a project-trust decision.
  - Given an untrusted project
  - When the agent starts
  - Then project resources are not loaded until a trust decision is recorded (or `--approve` is passed non-interactively).
- NFR-02 (Must): A hot-swap runtime shall tear down and recreate cwd-bound services when switching sessions or cwds.
  - Given an active interactive session
  - When the user switches sessions or cwd via `/fork` or `/tree`
  - Then the runtime tears down and recreates cwd-bound services without leaking state.
Conflicts
None identified yet.
Open Questions
- Should the experimental first-time setup flow graduate to default behavior, and on what timeline?
- What is the policy for promoting a widely-used extension into a built-in tool or command?
Specification: Interactive Coding Agent
Overview
pi-coding-agent composes the three libraries into a CLI product.
main() parses args, resolves project trust, builds a SessionManager and AgentSessionRuntime, then dispatches to a run mode.
The AgentSession is the central class shared by all modes; InteractiveMode renders events via pi-tui.
Architecture
cli.ts -> main() -------------------------------+
args parse, migrations, trust resolve |
createSessionManager, createAgentSessionRuntime|
| |
v |
+-------------------+ owns session + |
| AgentSession | cwd-bound services |
| Runtime | (hot-swap on |
+---------+---------+ fork/switch/tree) |
| |
v |
+-------------------+ delegates to |
| AgentSession |----> pi-agent-core Agent |
| (prompt loop, |----> ModelRegistry (auth)|
| compaction, bash,|----> pi-ai streamSimple |
| tools, export) | |
+---------+---------+ |
| events |
v |
+------+------+------+ |
| | | |
v v v |
Interactive Print RPC <-- modes dispatch-+
Mode (pi-tui) (text/json)
Data Models
Settings (excerpt)
| Field | Type | Description |
|---|---|---|
| defaultModel | string | provider/id:thinking |
| transport | enum | sse / websocket / auto |
| compaction | object | Thresholds and behavior |
| extensions/skills/prompts/themes | string[] | Enabled resource globs |
| telemetry/analytics | object | Opt-in/out flags |
| projectTrust | object | Trust defaults |
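The `provider/id:thinking` pattern used by `defaultModel` can be split as in the sketch below. The parsing rules are assumptions: the provider is taken as the text before the first `/`, and the thinking level as the text after the last `:`. A real resolver would likely validate the suffix against known thinking levels, because a model id that itself contains `:` would be misparsed here. The model name in the usage note is a placeholder.

```typescript
interface ModelPatternSketch {
  provider: string | undefined;
  id: string;
  thinking: string | undefined;
}

// Splits "provider/id:thinking"; both the provider and the thinking suffix are optional.
function parseModelPattern(pattern: string): ModelPatternSketch {
  let rest = pattern;
  let thinking: string | undefined;
  const colon = rest.lastIndexOf(":");
  if (colon > 0 && colon < rest.length - 1) {
    thinking = rest.slice(colon + 1);
    rest = rest.slice(0, colon);
  }
  let provider: string | undefined;
  const slash = rest.indexOf("/");
  if (slash > 0) {
    provider = rest.slice(0, slash);
    rest = rest.slice(slash + 1);
  }
  return { provider, id: rest, thinking };
}
```

With these rules, `parseModelPattern("anthropic/some-model:high")` yields provider `anthropic`, id `some-model`, and thinking `high`.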
ResourceLoader sources
| Source | Path | Trust-gated |
|---|---|---|
| Global user | `~/.pi/agent/`, `~/.agents/` | No |
| Project | `.pi/`, `.agents/`, `AGENTS.md`, `CLAUDE.md` | Yes |
| Package | installed pi packages | Per-source |
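The trust column in the table above can be read as a filter over resource sources. The sketch below is hypothetical: the `ResourceSource` shape, the `trusted` flag on packages, and the function name are illustrative and not taken from the ResourceLoader code.

```typescript
type SourceKind = "global" | "project" | "package";

interface ResourceSource {
  kind: SourceKind;
  path: string;
  trusted?: boolean; // only meaningful for package sources in this sketch
}

// Returns the sources that may be loaded given the project-trust decision.
function loadableSources(
  sources: ResourceSource[],
  projectTrusted: boolean,
): ResourceSource[] {
  return sources.filter((source) => {
    if (source.kind === "global") return true; // never trust-gated
    if (source.kind === "project") return projectTrusted; // gated on the folder decision
    return source.trusted === true; // packages: decided per source
  });
}
```

Defaulting to exclusion when no decision is recorded matches the FR-08 criterion that project resources are not loaded from an untrusted project.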
Tool sets
| Set | Tools |
|---|---|
| Default (4) | read, bash, edit, write |
| Read-only | read, grep, find, ls |
API Contracts
SDK: createAgentSession()
Factory building an AgentSession wired with a resource loader, model registry, settings, and extension runner.
Exposes prompt, subscribe, state, abort, setModel, setThinkingLevel, compact, fork.
CLI flags (excerpt)
| Flag | Effect |
|---|---|
| `-p` / `--prompt` | Print mode (single-shot) |
| `--mode json` / `--mode rpc` | JSON event stream / RPC protocol |
| `-c` / `-r` / `--fork` / `--session` | Continue / resume / fork / specific session |
| `-t` / `-xt` / `-nbt` / `-nt` | Tool allowlist / exclude / no-builtin / none |
| `--model` / `--models` | Model / scoped cycling set |
| `--approve` / `--no-approve` | Non-interactive project trust |
| `--offline` | Disable startup network ops |
Sequences
Interactive prompt
user input -> AgentSession.prompt()
-> slash command? extension command or builtin
-> emit input event (extensions may transform/intercept)
-> expand skill/template
-> streaming? queue steer/follow-up
-> validate model + auth
-> check pre-prompt compaction
-> before_agent_start (extensions modify system prompt / inject messages)
-> Agent.prompt() -> agentLoop -> streamFn -> pi-ai
-> stream deltas, execute tools, emit events
-> _handlePostAgentRun: retry/compact/queued messages loop
-> append entries to SessionManager JSONL
InteractiveMode renders events incrementally via pi-tui
Run mode dispatch
main() -> resolveAppMode(args)
interactive -> AgentSessionRuntime + InteractiveMode (TUI)
print -> runPrintMode (text or json)
rpc -> runRpcMode (JSONL stdin/stdout)
(SDK) -> createAgentSession() used programmatically
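The dispatch above can be illustrated with a small sketch of mode resolution. `resolveAppModeSketch` is a hypothetical stand-in for `resolveAppMode`; it only looks at `-p` and `--mode`, and the precedence between them (an explicit `--mode` wins) is an assumption.

```typescript
type AppMode = "interactive" | "print" | "json" | "rpc";

// Maps CLI arguments to a run mode; interactive is the default.
function resolveAppModeSketch(args: string[]): AppMode {
  const modeIndex = args.indexOf("--mode");
  const mode = modeIndex >= 0 ? args[modeIndex + 1] : undefined;
  if (mode === "rpc") return "rpc";
  if (mode === "json") return "json";
  if (args.includes("-p")) return "print";
  return "interactive";
}
```

In the sequence above, both the text and JSON variants are served by `runPrintMode`, so a caller would route `"print"` and `"json"` to the same function with a different output format.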
Technical Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Central class | `AgentSession` shared by all modes | One code path for prompt lifecycle across modes |
| Runtime hot-swap | `AgentSessionRuntime` recreates services | Clean state on fork/switch/tree without leaking cwd-bound resources |
| Core tools | Four only | Minimal core; more tools via extensions |
| No in-process sandbox | Delegate to external sandboxes | Core stays small; security boundary is the OS/container |
| Resource discovery | Global + project + package, trust-gated | User controls what project code executes |
| Modes | Interactive default; print/json/rpc alternatives | One binary serves TUI users, scripts, and embeddings |
Risks and Unknowns
- `InteractiveMode` is a very large module (~193KB); changes there are high-churn and hard to review.
- Project trust is a security-critical path; regressions could execute untrusted code.
- The package manager and resource discovery span many source locations; collision and precedence rules must stay consistent.
Out of Scope
- Provider abstraction internals (FEAT-0001).
- Agent loop internals (FEAT-0002).
- TUI rendering internals (FEAT-0003).
- Extension/skills API surface (FEAT-0005).
- Session format and branching internals (FEAT-0006).
Open Questions (requirements)
- Should the experimental first-time setup flow graduate to default behavior, and on what timeline?
- What is the policy for promoting a widely-used extension into a built-in tool or command?
Vocabulary
Domain Terms
| Term | Definition |
|---|---|
| Pi | The project: a minimal, self-extensible terminal coding agent harness and its libraries. |
| Harness | The coding agent runtime that wires the agent loop, tools, sessions, and UI together. |
| Extension | A TypeScript module with a default export `function (pi: ExtensionAPI)` that augments the agent with tools, commands, events, UI, or providers. |
| Skill | An on-demand capability package following the Agent Skills standard (`SKILL.md` + optional frontmatter), invoked as `/skill:name`. |
| Prompt template | A Markdown file with `{{variable}}` expansion invoked as `/templatename`. |
| pi package | A distributable bundle (npm or git) of extensions, skills, prompts, themes, or custom providers, installed via `pi install`. |
| Session | A persistent, branchable conversation log stored as JSONL (`SessionHeader`, messages, compaction summaries, branch summaries). |
| Branching | Tree-structured session forking (`/fork`, `/clone`, `/tree`) where each entry has `id`/`parentId`. |
| Compaction | Lossy summarization of older session messages to reclaim context; original JSONL is preserved. |
| Steering | A queued message delivered to a streaming agent after the current tool batch completes. |
| Follow-up | A queued message delivered after the agent fully stops. |
| Project trust | A per-folder decision (`~/.pi/agent/trust.json`) gating whether project settings, resources, and extensions execute. |
| Scope (model) | A scoped model set selected with `--models pat1,pat2` for Ctrl+P cycling. |
| Faux provider | An in-memory scripted provider (`providers/faux.ts`) used for deterministic tests with no real API calls. |
| Core | The deliberately minimal set of built-in capabilities (four tools); features outside core must be extensions. |
Technical Terms
| Term | Definition |
|---|---|
| Provider | The runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`). |
| API implementation | A wire-protocol backend shared by providers (e.g. `anthropic-messages`, `openai-responses`, `openai-completions`, `google-generative-ai`, `bedrock-converse-stream`). |
| `Models` collection | pi-ai's provider registry that routes model lookups and streams by owning provider. |
| `streamFn` / `StreamFn` | The injectable function the agent calls to reach the LLM; `streamSimple` is the default. |
| `AssistantMessageEventStream` | pi-ai's async-iterable event queue (push queue + result promise) carrying start/*_delta/done/error events. |
| `Agent` | pi-agent-core's stateful class owning the transcript and lifecycle (prompt, continue, abort). |
| `AgentHarness` | pi-agent-core's higher-level orchestrator wrapping `Agent` with sessions, compaction, skills, and provider hooks. |
| `agentLoop` | The low-level prompt-stream-tool-continue loop in pi-agent-core. |
| `AgentMessage` | pi-agent-core's app-extensible message union (via declaration merging); `convertToLlm` bridges to pi-ai `Message`. |
| TypeBox | The schema library used for tool parameter definitions (serializable JSON, self-validating). |
| Differential rendering | pi-tui's technique of diffing a new line array against the previous frame and writing minimal escape sequences. |
| Synchronized output | Terminal escape sequence (`\x1b[?2026h..l`) used by pi-tui for atomic, flicker-free rendering. |
| Kitty keyboard protocol | Terminal input protocol pi-tui negotiates for richer key reporting. |
| Lockstep versioning | All packages share one version and release together. |
| Trusted publishing | npm publish via GitHub Actions OIDC (environment `npm-publish`); no local credentials required. |
| Shrinkwrap | `packages/coding-agent/npm-shrinkwrap.json`, generated from the root lockfile to pin transitive deps for npm users. |
Acronyms and Abbreviations
| Abbreviation | Expansion |
|---|---|
| TUI | Terminal User Interface (the interactive mode, and the `pi-tui` library). |
| CLI | Command-Line Interface (the `pi` binary). |
| LLM | Large Language Model. |
| MCP | Model Context Protocol (not built into core; extensions may add it). |
| SDK | Software Development Kit (the embeddable programmatic API). |
| RPC | Remote Procedure Call (the JSONL stdin/stdout protocol mode). |
| OAuth | Open Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). |
| PKCE | Proof Key for Code Exchange (OAuth flow used by pi-ai). |
| API key | Application Programming Interface key (ambient provider authentication). |
| AC | Acceptance Criterion / Acceptance Criteria. |
| FR | Functional Requirement. |
| NFR | Non-Functional Requirement. |
| ADR | Architecture Decision Record. |
| RICE | Reach, Impact, Confidence, Effort (issue prioritization scoring). |
| SLO / SLI | Service Level Objective / Service Level Indicator. |
| OIDC | OpenID Connect (used for npm trusted publishing identity). |
| CVE | Common Vulnerabilities and Exposures. |
| IME | Input Method Editor (pi-tui positions the hardware cursor for IME candidate windows). |
| CJK | Chinese, Japanese, Korean (terminal width handling for wide characters). |
| WASM | WebAssembly (photon-node used for image resizing). |
| AGENTS.md | Project-specific rules file for humans and agents, read automatically from the repo root. |
Requirements: Extension and Skills Platform
Overview
The extension platform is how pi stays minimal while remaining extensible. TypeScript extensions register custom tools, commands, event handlers, UI primitives, providers, and autocomplete; skills package on-demand capabilities via the Agent Skills standard; prompt templates offer reusable prompt snippets. Together they let users and packages reshape nearly every part of the agent without forking the core.
Stakeholders
| Stakeholder | Interest |
|---|---|
| Extension authors | A rich, stable `ExtensionAPI` and clear loading rules |
| Skills authors | A standard SKILL.md format and predictable invocation |
| End users | Safe, discoverable ways to add capabilities (install, trust) |
| Maintainers | A small, well-considered hook surface that does not bloat core |
Functional Requirements
Order rows by priority: Must first, then Should, then May.
| ID | Priority | Requirement |
|---|---|---|
| FR-01 | Must | The system shall load TypeScript extensions via a default export `function (pi: ExtensionAPI)` using cached, lazy module loading. |
| FR-02 | Must | The system shall allow extensions to register and replace tools (built-ins included). |
| FR-03 | Must | The system shall allow extensions to register slash commands invoked as `/name`. |
| FR-04 | Must | The system shall provide event handlers for input, tool_call, tool_result, message lifecycle, turn/agent lifecycle, session lifecycle, compaction, provider request/response, project_trust, and resources_discover. |
| FR-05 | Must | The system shall allow extensions to render UI primitives: selectors, confirmations, inputs, notifications, status line, widgets, custom footer/header/editor/overlay, and raw terminal input. |
| FR-06 | Must | The system shall allow extensions to define keyboard shortcuts, CLI flags, autocomplete providers, and message renderers. |
| FR-07 | Must | The system shall allow extensions to perform session control actions (setActiveTools, setModel, setThinkingLevel, abort, compact, fork). |
| FR-08 | Must | The system shall load Agent Skills from `SKILL.md` files (global, project parent-walk, or packages), invoked as `/skill:name`, injected into the system prompt on demand. |
| FR-09 | Must | The system shall load prompt templates (Markdown with `{{variable}}` expansion) invoked as `/templatename`. |
| FR-10 | Must | The system shall load resources from global, project, and package sources with project-trust gating. |
| FR-11 | Should | The system shall support a package manager (`pi install/remove/update/list`) for distributing extensions, skills, prompts, and themes via npm or git. |
| FR-12 | Should | The system shall support custom providers via `~/.pi/agent/models.json` or extensions for custom APIs/OAuth. |
| FR-13 | May | The system shall allow extensions to register custom compaction and summarization behavior. |
Non-Functional Requirements
Order rows by priority: Must first, then Should, then May.
| ID | Priority | Category | Requirement |
|---|---|---|---|
| NFR-01 | Must | Security | Project-sourced extensions shall not execute until a project-trust decision is recorded. |
| NFR-02 | Must | Reliability | Extension module loading shall be cached so a module is not re-evaluated per use. |
| NFR-03 | Must | Compatibility | The legacy pi-ai root API used by extensions shall be aliased to `/compat` at runtime so existing extensions keep working. |
| NFR-04 | Should | Maintainability | The hook surface shall be well-considered; new hooks require maintainer discussion to avoid unmaintainable complexity. |
| NFR-05 | Should | Performance | Extension event dispatch shall not block the agent loop on the happy path. |
Constraints
- Extensions are TypeScript modules loaded via `jiti`.
- Skills follow the external Agent Skills standard (`agentskills.io`).
- New hooks that bloat core are rejected; the bar is "well considered and discussed".
Acceptance Criteria
Every FR and NFR shall have at least one acceptance criterion.
Order criteria by FRs first (sorted by ID), then NFRs (sorted by ID).
- FR-02
  - Given an extension calling `pi.registerTool(...)`
  - When the agent runs
  - Then the custom tool is available and may replace a built-in of the same name.
- FR-04
  - Given an extension with a `tool_call` handler
  - When a tool is invoked
  - Then the handler receives the call and may observe or mutate it.
- FR-08
  - Given a `SKILL.md` placed in a discoverable skills directory
  - When the user invokes `/skill:name`
  - Then the skill content is injected into the system prompt for that session.
- FR-10
  - Given an untrusted project with `.pi/` resources
  - When the agent starts
  - Then project-sourced extensions, skills, and prompts are not loaded until trust is granted.
- NFR-01
  - Given a project-sourced extension
  - When the project is not trusted
  - Then the extension's default export is never executed.
- NFR-03
  - Given an extension importing from the pi-ai root
  - When loaded after the `/compat` migration
  - Then the import resolves to the compat entrypoint without code changes.
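The FR-02 and FR-04 criteria above can be illustrated with a minimal sketch. Everything here is a reduced stand-in written for this example: the local `ExtensionAPI`, `ToolDefinition`, and `ToolCallEvent` interfaces, the `word_count` tool, and the `createFakeHost` helper are hypothetical, and the real signatures (for instance tool parameter schemas or asynchronous execution) will differ.

```typescript
// Hypothetical, heavily reduced stand-ins for the real ExtensionAPI types.
interface ToolDefinition {
  name: string;
  description: string;
  execute: (args: Record<string, unknown>) => string;
}
interface ToolCallEvent {
  toolName: string;
  args: Record<string, unknown>;
}
interface ExtensionAPI {
  registerTool(tool: ToolDefinition): void;
  registerCommand(name: string, handler: (input: string) => void): void;
  on(event: "tool_call", handler: (event: ToolCallEvent) => void): void;
}

const observedCalls: string[] = [];

// The extension: in a real file this function would be the default export.
function exampleExtension(pi: ExtensionAPI): void {
  pi.registerTool({
    name: "word_count", // hypothetical tool
    description: "Count whitespace-separated words",
    execute: (args) => {
      const words = String(args["text"] ?? "").split(/\s+/).filter(Boolean);
      return String(words.length);
    },
  });
  pi.registerCommand("hello", () => {});
  pi.on("tool_call", (event) => {
    observedCalls.push(event.toolName); // observe only; no mutation here
  });
}

// In-memory host standing in for the agent, so the sketch runs on its own.
function createFakeHost() {
  const tools = new Map<string, ToolDefinition>();
  const commands = new Map<string, (input: string) => void>();
  const toolCallHandlers: Array<(event: ToolCallEvent) => void> = [];
  const pi: ExtensionAPI = {
    registerTool: (tool) => { tools.set(tool.name, tool); }, // same name replaces
    registerCommand: (name, handler) => { commands.set(name, handler); },
    on: (_event, handler) => { toolCallHandlers.push(handler); },
  };
  const callTool = (toolName: string, args: Record<string, unknown>): string => {
    const event: ToolCallEvent = { toolName, args };
    for (const handler of toolCallHandlers) handler(event); // handlers see the call first
    const tool = tools.get(toolName);
    if (!tool) throw new Error(`unknown tool: ${toolName}`);
    return tool.execute(event.args);
  };
  return { pi, tools, commands, callTool };
}
```

Because the fake host keys tools by name, registering a tool under an existing name replaces it, which mirrors the "may replace a built-in of the same name" clause in FR-02.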
Conflicts
None identified yet.
Open Questions
- What are the stabilization criteria for the `ExtensionAPI` (currently documented in a 104KB extensions.md), and which parts are considered stable versus experimental?
- How should extension-provided permissions/sandboxing hooks interact with the documented external-sandbox patterns?
Specification: Extension and Skills Platform
Overview
Three resource types extend pi: extensions (TypeScript modules), skills (SKILL.md packages), and prompt templates ({{var}} Markdown).
DefaultResourceLoader discovers them from global/project/package sources with trust gating; ExtensionRunner dispatches lifecycle and event hooks; skills and templates are injected into the system prompt on demand.
Architecture
DefaultResourceLoader
discover: global (~/.pi/agent, ~/.agents)
project (.pi, .agents, AGENTS.md/CLAUDE.md) -- trust gated
packages (installed)
|
v
+--------------------+ cached, lazy (jiti)
| Extension loader   |----> default export (pi: ExtensionAPI)
+---------+----------+
|
v
+--------------------+
| ExtensionRunner | lifecycle + event dispatch
| (on/emit hooks) | (input, tool_call, tool_result,
+--------------------+ message_*, session_*, provider_*, ...)
|
v
AgentSession / InteractiveMode (consume hooks, expose UI context)
Skills & Prompt Templates
loaded on demand (/skill:name, /templatename)
-> injected into system prompt or expanded into user message
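The "cached, lazy" property of the extension loader in the diagram can be sketched as a promise cache. This is an assumption-laden illustration: `CachedExtensionLoader`, `ExtensionFactory`, and the injected `Importer` are hypothetical names, and the importer is passed in so the sketch makes no claims about how jiti is actually invoked.

```typescript
// Hypothetical sketch: names and signatures are assumptions, not pi's loader.
type ExtensionFactory = (pi: unknown) => void;
type Importer = (path: string) => Promise<{ default: ExtensionFactory }>;

class CachedExtensionLoader {
  private readonly cache = new Map<string, Promise<ExtensionFactory>>();
  private readonly importer: Importer;

  constructor(importer: Importer) {
    this.importer = importer;
  }

  // Lazy: nothing is imported until the first load() for a path.
  // Cached: later calls reuse the same promise, so the importer runs once per path.
  load(path: string): Promise<ExtensionFactory> {
    let pending = this.cache.get(path);
    if (!pending) {
      pending = this.importer(path).then((mod) => mod.default);
      this.cache.set(path, pending);
    }
    return pending;
  }
}
```

Caching the promise rather than the resolved module means two concurrent `load()` calls for the same path share one import instead of racing.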
Data Models
ExtensionAPI (excerpt)
| Capability | Method | Description |
|---|---|---|
| Tools | `registerTool`, `defineTool` | Add or replace tools |
| Commands | `registerCommand` | Add slash command `/name` |
| Events | `on(event, handler)` | Subscribe to lifecycle hooks |
| UI | `ExtensionUIContext` | Selectors, confirmations, inputs, widgets, overlays |
| Shortcuts | `ExtensionShortcut` | Register key bindings |
| Flags | `ExtensionFlag` | Register CLI flags |
| Session control | `ExtensionContextActions` | setActiveTools, setModel, compact, fork, abort |
Event surface (categories)
| Category | Events |
|---|---|
| Input | input |
| Agent lifecycle | before_agent_start, agent_start/end, turn_start/end |
| Messages | message_start/update/end |
| Tools | tool_call, tool_result |
| Sessions | session_start/shutdown, session_before_compact/compact/fork/switch/tree |
| Provider | before_provider_request, after_provider_response |
| Context | context, resources_discover, project_trust, user_bash |
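A typed on/emit dispatcher over an event map like the one above might look as follows. This is a sketch under stated assumptions: `MiniExtensionRunner`, `HookHandler`, and the two payload shapes in `SketchEvents` are invented for the example, and the real `ExtensionRunner` is likely asynchronous and carries richer payloads.

```typescript
// Hypothetical sketch of an on/emit dispatcher; not the real ExtensionRunner.
type HookHandler<T> = (payload: T) => void;

class MiniExtensionRunner<Events extends Record<string, unknown>> {
  private readonly handlers = new Map<keyof Events, Array<HookHandler<unknown>>>();

  on<K extends keyof Events>(event: K, handler: HookHandler<Events[K]>): void {
    const list = this.handlers.get(event) ?? [];
    list.push(handler as HookHandler<unknown>);
    this.handlers.set(event, list);
  }

  // Calls handlers in registration order and returns how many were invoked.
  emit<K extends keyof Events>(event: K, payload: Events[K]): number {
    const list = this.handlers.get(event) ?? [];
    for (const handler of list) handler(payload);
    return list.length;
  }
}

// Assumed payload shapes for two of the event names listed in the table.
type SketchEvents = {
  input: { text: string };
  tool_call: { toolName: string };
};
```

Typing the event map once gives each `on` handler a payload type that matches its event name, so a handler for `tool_call` cannot accidentally read fields of `input`.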
Skill
| Field | Type | Constraints | Description |
|---|---|---|---|
| name | string | from `/skill:` invocation | Skill identifier |
| path | string | file path | Location of SKILL.md |
| frontmatter | object | optional | Metadata |
| body | string | not null | Markdown injected into prompt |
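The Skill record above could be populated from a `SKILL.md` file roughly like this. The field names follow the table; the parsing rules are assumptions for illustration (a leading block delimited by `---` lines, treated as flat `key: value` pairs), and a real implementation would presumably use a proper YAML parser. `parseSkill` is a hypothetical helper.

```typescript
// Hypothetical sketch; fields follow the table, parsing rules are assumptions.
interface Skill {
  name: string;
  path: string;
  frontmatter?: Record<string, string>;
  body: string;
}

function parseSkill(name: string, path: string, source: string): Skill {
  // Assumed layout: optional frontmatter between two `---` lines, then the body.
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (!match) return { name, path, body: source }; // no frontmatter: whole file is the body

  const frontmatter: Record<string, string> = {};
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const colon = line.indexOf(":");
    // Only flat `key: value` lines are understood by this sketch.
    if (colon > 0) frontmatter[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return { name, path, frontmatter, body: match[2] ?? "" };
}
```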
API Contracts
/skill:name
Injects the skill's SKILL.md body into the system prompt context for the session.
/templatename
Expands {{variable}} placeholders in the Markdown template into a user message.
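A minimal sketch of `{{variable}}` expansion follows. The source does not say how unknown variables or whitespace inside the braces are handled, so both choices here (leave unknown placeholders untouched, tolerate inner whitespace) are assumptions, and `expandTemplate` is a hypothetical name.

```typescript
// Hypothetical sketch; handling of unknown variables is an assumption.
function expandTemplate(template: string, variables: Record<string, string>): string {
  return template.replace(
    /\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g,
    (placeholder: string, name: string): string => {
      // Use own properties only, so names like "constructor" are not picked up.
      const value = Object.prototype.hasOwnProperty.call(variables, name)
        ? variables[name]
        : undefined;
      return value ?? placeholder; // unknown variables are left as written
    },
  );
}
```

Leaving unknown placeholders in place keeps the gap visible in the resulting user message instead of silently dropping text.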
pi.install(name) / pi.remove(name) / pi.update(name)
Package manager operations over npm or git sources, resolving to resource directories (extensions, skills, prompts, themes, models).
Sequences
Tool call through extension hooks
agent -> tool_call event
-> beforeToolCall (extension can {block: true})
-> if not blocked: tool.execute(args)
-> tool_execution_update events (extensions observe)
-> afterToolCall (extension can override content/details/isError/terminate)
-> tool_result event
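The sequence above can be sketched as a single function. This is an illustration under assumptions: it is synchronous for brevity (the real hooks are likely asynchronous), it models only `content` and `isError` out of the overridable fields, and it treats a blocked call as an error result, which the source does not state. `runToolCall`, `ToolHooks`, `ToolCall`, and `ToolResult` are hypothetical names.

```typescript
// Hypothetical sketch of the hook sequence; synchronous for brevity.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}
interface ToolResult {
  content: string;
  isError: boolean;
}
interface ToolHooks {
  beforeToolCall?: (call: ToolCall) => { block: true; reason?: string } | undefined;
  afterToolCall?: (call: ToolCall, result: ToolResult) => Partial<ToolResult> | undefined;
}

function runToolCall(
  call: ToolCall,
  execute: (args: Record<string, unknown>) => string,
  hooks: ToolHooks,
): ToolResult {
  // 1. beforeToolCall may veto the call before anything executes.
  const verdict = hooks.beforeToolCall?.(call);
  if (verdict?.block) {
    // Assumption: a blocked call surfaces as an error result.
    return { content: verdict.reason ?? "blocked by extension", isError: true };
  }

  // 2. Execute the tool, converting thrown errors into an error result.
  let result: ToolResult;
  try {
    result = { content: execute(call.args), isError: false };
  } catch (error) {
    result = { content: String(error), isError: true };
  }

  // 3. afterToolCall may override selected fields of the result.
  const override = hooks.afterToolCall?.(call, result);
  return { ...result, ...override };
}
```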
Resource discovery (trust-gated)
DefaultResourceLoader.discover()
for each source (global, project, package):
if project source and not trusted: skip
collect extensions, skills, prompts, themes, context files
-> resources_discover event (extensions may augment)
-> registry populated
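The trust gate in the loop above can be sketched as follows. The types and the `isTrusted` callback are assumptions for illustration, resources are reduced to plain strings, and the `resources_discover` augmentation step is omitted.

```typescript
// Hypothetical sketch of trust-gated discovery; types and names are assumptions.
type SourceKind = "global" | "project" | "package";

interface ResourceSource {
  kind: SourceKind;
  root: string;
  resources: string[]; // stand-ins for extensions, skills, prompts, themes, context files
}

function discoverResources(
  sources: ResourceSource[],
  isTrusted: (root: string) => boolean,
): string[] {
  const registry: string[] = [];
  for (const source of sources) {
    // Only project sources are gated; global and package sources always load.
    if (source.kind === "project" && !isTrusted(source.root)) continue;
    registry.push(...source.resources);
  }
  return registry;
}
```

Skipping the untrusted source before collecting anything from it means project code is never enumerated for execution, which is the property NFR-01 asks for.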
Technical Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Language | TypeScript via jiti | Type-safe, cached, lazy |
| Skills standard | Agent Skills (`SKILL.md`) | Cross-agent interoperability |
| Hook surface | Curated event taxonomy | Power without unbounded core growth |
| Trust model | Project sources gated | Prevent untrusted code execution by default |
| Compat alias | pi-ai root -> `/compat` | Existing extensions keep working |
Risks and Unknowns
- The `ExtensionAPI` is large and evolving; breaking changes affect the ecosystem.
- Event handler ordering and interaction effects across multiple extensions can be hard to predict.
- Project trust bypass (e.g. via a malicious package source) is a security risk to monitor.
Out of Scope
- The core built-in tools and agent loop (FEAT-0002, FEAT-0004).
- Session persistence and branching (FEAT-0006).
- LLM provider auth internals (FEAT-0001).
Vocabulary
Domain Terms
| Term | Definition |
|---|---|
| Pi | The project: a minimal, self-extensible terminal coding agent harness and its libraries. |
| Harness | The coding agent runtime that wires the agent loop, tools, sessions, and UI together. |
| Extension | A TypeScript module with a default export `function (pi: ExtensionAPI)` that augments the agent with tools, commands, events, UI, or providers. |
| Skill | An on-demand capability package following the Agent Skills standard (`SKILL.md` + optional frontmatter), invoked as `/skill:name`. |
| Prompt template | A Markdown file with `{{variable}}` expansion invoked as `/templatename`. |
| pi package | A distributable bundle (npm or git) of extensions, skills, prompts, themes, or custom providers, installed via `pi install`. |
| Session | A persistent, branchable conversation log stored as JSONL (`SessionHeader`, messages, compaction summaries, branch summaries). |
| Branching | Tree-structured session forking (`/fork`, `/clone`, `/tree`) where each entry has `id`/`parentId`. |
| Compaction | Lossy summarization of older session messages to reclaim context; original JSONL is preserved. |
| Steering | A queued message delivered to a streaming agent after the current tool batch completes. |
| Follow-up | A queued message delivered after the agent fully stops. |
| Project trust | A per-folder decision (`~/.pi/agent/trust.json`) gating whether project settings, resources, and extensions execute. |
| Scope (model) | A scoped model set selected with `--models pat1,pat2` for Ctrl+P cycling. |
| Faux provider | An in-memory scripted provider (`providers/faux.ts`) used for deterministic tests with no real API calls. |
| Core | The deliberately minimal set of built-in capabilities (four tools); features outside core must be extensions. |
Technical Terms
| Term | Definition |
|---|---|
| Provider | The runtime unit owning a model catalog, auth, and stream behavior (e.g. `anthropic`, `openai`). |
| API implementation | A wire-protocol backend shared by providers (e.g. `anthropic-messages`, `openai-responses`, `openai-completions`, `google-generative-ai`, `bedrock-converse-stream`). |
| `Models` collection | pi-ai's provider registry that routes model lookups and streams by owning provider. |
| `streamFn` / `StreamFn` | The injectable function the agent calls to reach the LLM; `streamSimple` is the default. |
| `AssistantMessageEventStream` | pi-ai's async-iterable event queue (push queue + result promise) carrying start/*_delta/done/error events. |
| `Agent` | pi-agent-core's stateful class owning the transcript and lifecycle (prompt, continue, abort). |
| `AgentHarness` | pi-agent-core's higher-level orchestrator wrapping `Agent` with sessions, compaction, skills, and provider hooks. |
| `agentLoop` | The low-level prompt-stream-tool-continue loop in pi-agent-core. |
| `AgentMessage` | pi-agent-core's app-extensible message union (via declaration merging); `convertToLlm` bridges to pi-ai `Message`. |
| TypeBox | The schema library used for tool parameter definitions (serializable JSON, self-validating). |
| Differential rendering | pi-tui's technique of diffing a new line array against the previous frame and writing minimal escape sequences. |
| Synchronized output | Terminal escape sequence (`\x1b[?2026h..l`) used by pi-tui for atomic, flicker-free rendering. |
| Kitty keyboard protocol | Terminal input protocol pi-tui negotiates for richer key reporting. |
| Lockstep versioning | All packages share one version and release together. |
| Trusted publishing | npm publish via GitHub Actions OIDC (environment `npm-publish`); no local credentials required. |
| Shrinkwrap | `packages/coding-agent/npm-shrinkwrap.json`, generated from the root lockfile to pin transitive deps for npm users. |
Acronyms and Abbreviations
| Abbreviation | Expansion |
|---|---|
| TUI | Terminal User Interface (the interactive mode, and the `pi-tui` library). |
| CLI | Command-Line Interface (the `pi` binary). |
| LLM | Large Language Model. |
| MCP | Model Context Protocol (not built into core; extensions may add it). |
| SDK | Software Development Kit (the embeddable programmatic API). |
| RPC | Remote Procedure Call (the JSONL stdin/stdout protocol mode). |
| OAuth | Open Authorization (used for subscription-based provider login: Claude Pro/Max, ChatGPT Plus/Pro, GitHub Copilot). |
| PKCE | Proof Key for Code Exchange (OAuth flow used by pi-ai). |
| API key | Application Programming Interface key (ambient provider authentication). |
| AC | Acceptance Criterion / Acceptance Criteria. |
| FR | Functional Requirement. |
| NFR | Non-Functional Requirement. |
| ADR | Architecture Decision Record. |
| RICE | Reach, Impact, Confidence, Effort (issue prioritization scoring). |
| SLO / SLI | Service Level Objective / Service Level Indicator. |
| OIDC | OpenID Connect (used for npm trusted publishing identity). |
| CVE | Common Vulnerabilities and Exposures. |
| IME | Input Method Editor (pi-tui positions the hardware cursor for IME candidate windows). |
| CJK | Chinese, Japanese, Korean (terminal width handling for wide characters). |
| WASM | WebAssembly (photon-node used for image resizing). |
| AGENTS.md | Project-specific rules file for humans and agents, read automatically from the repo root. |
Requirements: Session Persistence and Branching
Overview
Sessions make pi's conversations durable, resumable, and branchable.
SessionManager persists tree-structured JSONL logs (messages, model/thinking changes, compaction summaries, branch summaries); users can resume, fork, clone, and navigate session trees; compaction reclaims context lossily while preserving the original log.
Stakeholders
| Stakeholder | Interest |
|---|---|
| End users | Resume past work, branch explorations without losing history, manage context length |
| Extension authors | Session lifecycle hooks (compact, fork, switch, tree) to observe and augment |
| Maintainers | A stable, migratable on-disk format with clear versioning |
Functional Requirements
Order rows by priority: Must first, then Should, then May.
| ID | Priority | Requirement |
|---|---|---|
| FR-01 | Must | The system shall persist sessions as JSONL with tree-structured entries (`id`/`parentId` per entry) under a versioned format. |
| FR-02 | Must | The system shall support session entry types: messages, thinking-level changes, model changes, compaction summaries, branch summaries, labels, custom messages, and session info. |
| FR-03 | Must | The system shall support resume (`-r`, `-c`, `--session`, `/resume`), new (`/new`), fork (`/fork`, `--fork`), clone (`/clone`), and tree navigation (`/tree`). |
| FR-04 | Must | The system shall support branching where forking from a previous user message creates a new session file and the original log is preserved. |
| FR-05 | Must | The system shall support compaction (manual via `/compact [prompt]` or automatic on threshold/overflow) that summarizes older messages while keeping recent ones. |
| FR-06 | Must | The system shall preserve the original JSONL file through compaction (compaction is lossy but non-destructive to the source log). |
| FR-07 | Must | The system shall hot-swap the runtime when switching sessions or cwds, tearing down and recreating cwd-bound services. |
| FR-08 | Must | The system shall emit session lifecycle hooks (`session_before_compact/compact`, `session_before_fork/fork`, `session_before_switch/switch`, `session_before_tree/tree`) that extensions can observe. |
| FR-09 | Should | The system shall provide compaction post-token estimates and branch summarization. |
| FR-10 | Should | The system shall support session export to HTML. |
| FR-11 | Should | The system shall migrate older session format versions to the current version on load. |
| FR-12 | May | The system shall support session labels and bookmarks for navigation. |
Non-Functional Requirements
Order rows by priority: Must first, then Should, then May.
| ID | Priority | Category | Requirement |
|---|---|---|---|
| NFR-01 | Must | Reliability | The original session JSONL shall never be destructively modified by compaction or branching. |
| NFR-02 | Must | Compatibility | The format version shall be explicit (`CURRENT_SESSION_VERSION`) with documented migration steps. |
| NFR-03 | Should | Performance | Session append and read operations shall remain efficient for large session files. |
| NFR-04 | Should | Observability | Compaction events shall carry reason and retry metadata for extension consumers. |
Constraints
- Sessions are local JSONL files; no server-side session store.
- `progress.md` and `state.yml` (workflow state) are never committed; session JSONL is user data.
- Compaction is lossy by design.
Acceptance Criteria
Every FR and NFR shall have at least one acceptance criterion.
Order criteria by FRs first (sorted by ID), then NFRs (sorted by ID).
- FR-01
  - Given a session with multiple turns and a branch
  - When serialized to disk
  - Then the JSONL contains tree-structured entries, each with `id` and `parentId`, forming a valid tree.
- FR-05
  - Given a session exceeding the compaction threshold
  - When compaction runs
  - Then older messages are summarized, recent messages are retained, and a compaction summary entry is appended.
- FR-06
  - Given a session that has been compacted
  - When inspecting the JSONL file
  - Then the original message entries remain present alongside the compaction summary.
- FR-07
  - Given an interactive session
  - When the user runs `/tree` and switches branches
  - Then the runtime tears down cwd-bound services and recreates them for the target branch without leaking state.
- NFR-01
  - Given any compaction or fork operation
  - When it completes
  - Then the source session JSONL file is byte-for-byte unchanged except for appended entries.
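
The NFR-01 criterion ("byte-for-byte unchanged except for appended entries") amounts to a prefix check on the file contents. The following is a minimal sketch of such a check, not the project's actual test; the function name `isAppendOnly` is hypothetical, and the caller is assumed to have read the file bytes before and after the operation.

```typescript
// Hypothetical helper: true when `after` begins with every byte of `before`,
// i.e. the only difference is data appended at the end.
function isAppendOnly(before: Uint8Array, after: Uint8Array): boolean {
  if (after.length < before.length) return false; // truncation is destructive
  for (let i = 0; i < before.length; i++) {
    if (before[i] !== after[i]) return false; // an existing byte was rewritten
  }
  return true;
}
```

A test for compaction or fork could snapshot the session file, run the operation, and assert `isAppendOnly(snapshot, current)`.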
Conflicts
None identified yet.
Open Questions
- Is there a plan to support remote session sync, or will sessions remain strictly local files?
- What is the policy for pruning or archiving very large session files over time?
Specification: Session Persistence and Branching
Overview
Sessions are JSONL files with a tree-structured entry stream.
SessionManager owns append/read/migration; AgentSessionRuntime hot-swaps the active session and its cwd-bound services on fork/switch/tree navigation; compaction runs in AgentSession with hooks for extensions.
Architecture
    AgentSession
      prompt loop appends entries ------+
      compaction summarizes             |
      branch summarization              |
        |                               v
        |                     +-------------------+
        +-------------------->| SessionManager    |
                              | JSONL append/read |
                              | tree traversal    |
                              | migration (v1->N) |
                              +---------+---------+
                                        | hot-swap
                                        v
                            +------------------------+
                            | AgentSessionRuntime    |
                            | owns active session +  |
                            | cwd-bound services     |
                            | fork()/switch()/       |
                            | navigate()             |
                            +------------------------+
Data Models
Session JSONL entry (union, tree-structured)
| Field | Type | Constraints | Description |
|---|---|---|---|
| id | string | required | Unique entry id |
| parentId | string | nullable | Parent entry id (null = root) |
| type | enum | required | message, thinkingChange, modelChange, compaction, branchSummary, label, custom, sessionInfo |
| ... | varies | per type | Type-specific payload |
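
As a rough illustration of the table above, the entry can be modelled as a TypeScript type and checked for tree validity. This is a sketch under assumptions: only `id`, `parentId`, and the `type` values come from the table; the `payload` field, the `validateTree` name, and the "exactly one root" rule are illustrative choices, not confirmed behavior.

```typescript
// Entry kinds as listed in the table above.
type EntryType =
  | "message" | "thinkingChange" | "modelChange" | "compaction"
  | "branchSummary" | "label" | "custom" | "sessionInfo";

interface SessionEntry {
  id: string;
  parentId: string | null; // null marks the root entry
  type: EntryType;
  payload?: unknown; // hypothetical stand-in for the type-specific fields
}

// Returns a list of problems; an empty list means the entries form one valid tree.
function validateTree(entries: SessionEntry[]): string[] {
  const problems: string[] = [];
  const byId = new Map<string, SessionEntry>();
  for (const e of entries) {
    if (byId.has(e.id)) problems.push(`duplicate id: ${e.id}`);
    byId.set(e.id, e);
  }
  // Assumption: a session has exactly one root.
  const roots = entries.filter((e) => e.parentId === null);
  if (roots.length !== 1) problems.push(`expected 1 root, found ${roots.length}`);
  for (const e of entries) {
    if (e.parentId !== null && !byId.has(e.parentId)) {
      problems.push(`missing parent ${e.parentId} for ${e.id}`);
    }
  }
  // Walk upward from each entry; revisiting an id means a cycle.
  for (const e of entries) {
    const seen = new Set<string>();
    let cur: SessionEntry | undefined = e;
    while (cur && cur.parentId !== null) {
      if (seen.has(cur.id)) {
        problems.push(`cycle through ${cur.id}`);
        break;
      }
      seen.add(cur.id);
      cur = byId.get(cur.parentId);
    }
  }
  return problems;
}
```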
SessionHeader
| Field | Type | Description |
|---|---|---|
| version | number | Format version (CURRENT_SESSION_VERSION) |
| sessionId | string | Stable session identifier |
| cwd | string | Working directory at creation |
Compaction summary entry
| Field | Type | Description |
|---|---|---|
| reason | enum | manual, threshold, overflow, retry |
| willRetry | boolean | Whether compaction will retry |
| summary | string | Summarized older context |
| cutPoint | string | Entry id where cut occurred |
API Contracts
SessionManager.append(entry) / read()
`append` writes one entry as a single JSONL line; `read` loads the file and indexes the tree for traversal, listing, and migration.
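
To make the append/read contract concrete, here is an in-memory sketch that treats the log as a string instead of a file. The names `appendEntry`, `readSession`, and `SessionIndex` are hypothetical, the `SessionHeader` line is omitted, and the real `SessionManager` may be structured quite differently.

```typescript
// Hypothetical minimal entry; real entries carry type-specific payloads.
interface Entry {
  id: string;
  parentId: string | null;
  type: string;
}

// Append: serialize one entry as one line. JSON.stringify escapes newlines
// inside strings, so each entry always occupies exactly one JSONL line.
function appendEntry(log: string, entry: Entry): string {
  return log + JSON.stringify(entry) + "\n";
}

interface SessionIndex {
  byId: Map<string, Entry>;
  children: Map<string | null, string[]>; // parentId -> child ids, in file order
}

// Read: parse every non-empty line and index the tree for traversal.
function readSession(log: string): SessionIndex {
  const byId = new Map<string, Entry>();
  const children = new Map<string | null, string[]>();
  for (const line of log.split("\n")) {
    if (line.trim() === "") continue;
    const entry = JSON.parse(line) as Entry;
    byId.set(entry.id, entry);
    const siblings = children.get(entry.parentId) ?? [];
    siblings.push(entry.id);
    children.set(entry.parentId, siblings);
  }
  return { byId, children };
}
```

Keeping a `children` map alongside `byId` lets a caller walk down a branch as well as up from any entry, which is what tree navigation needs.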
Runtime: fork() / switch(id) / navigate(direction)
Each emits the corresponding session_before_* event, tears down the current runtime, and recreates services bound to the target session/cwd; the matching session_* event follows once the swap completes.
/compact [prompt]
Triggers compaction: emits session_before_compact, summarizes older messages, retains recent ones, appends a compaction summary, emits session_compact.
Sequences
Compaction flow
threshold exceeded OR /compact
-> session_before_compact event (extensions may modify/inject)
-> findCutPoint (token estimation)
-> summarize messages before cut
-> keep recent messages
-> append compaction summary entry to JSONL
-> post-token estimate
-> session_compact event (with reason, willRetry)
-> if willRetry: agent.continue()
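
The cut-point step in the flow above can be sketched as a backward walk over the messages under a token budget. Everything here is an assumption for illustration: the 4-characters-per-token estimate, the `keepBudget` parameter, the `Msg` shape, and the choice to report the first retained message id as the cut point. The actual `findCutPoint` may work differently.

```typescript
interface Msg {
  id: string;
  text: string;
}

// Assumption: a rough heuristic of about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Walk backwards from the newest message, keeping messages while they fit in
// `keepBudget`. Returns the index of the first message to keep; everything
// before that index is summarized.
function findCutPoint(messages: Msg[], keepBudget: number): number {
  let used = 0;
  let cut = messages.length;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i].text);
    if (used + cost > keepBudget) break;
    used += cost;
    cut = i;
  }
  return cut;
}

// Split at the cut point and summarize the older part with a caller-supplied function.
function compact(messages: Msg[], keepBudget: number, summarize: (older: Msg[]) => string) {
  const cut = findCutPoint(messages, keepBudget);
  const older = messages.slice(0, cut);
  const recent = messages.slice(cut);
  return {
    summary: older.length > 0 ? summarize(older) : "",
    cutPoint: recent.length > 0 ? recent[0].id : null,
    recent,
  };
}
```

Because `compact` only slices, the input array is left untouched, in line with the non-destructive requirement on the source log.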
Branching flow
/fork <message-id>
-> session_before_fork event
-> create new session file from entries up to message-id
-> runtime.switch() to new session
-> session_fork event
(original session file unchanged)
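
The "create new session file from entries up to message-id" step can be pictured as following `parentId` links from the fork point back to the root. The sketch below makes several assumptions: the fork point itself is included, forked entries keep their original ids, the input is a valid tree (no cycles), and `pathToEntry` / `forkSession` are hypothetical names.

```typescript
interface Entry {
  id: string;
  parentId: string | null;
  type: string;
}

// Collect the chain root -> ... -> target by following parentId links upward.
function pathToEntry(entries: Entry[], targetId: string): Entry[] {
  const byId = new Map(entries.map((e) => [e.id, e] as const));
  let cur = byId.get(targetId);
  if (!cur) throw new Error(`unknown entry: ${targetId}`);
  const path: Entry[] = [];
  while (cur) {
    path.push(cur);
    cur = cur.parentId === null ? undefined : byId.get(cur.parentId);
  }
  return path.reverse();
}

// Build the JSONL content of the new session file. The source array is only
// read, never mutated, so the original session stays unchanged.
function forkSession(entries: Entry[], targetId: string): string {
  return pathToEntry(entries, targetId)
    .map((e) => JSON.stringify(e) + "\n")
    .join("");
}
```

Entries on sibling branches are left out of the new file because they are not ancestors of the fork point.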
Technical Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Format | JSONL, tree-structured | Append-only, diff/stream friendly, supports branching |
| Compaction | Lossy, non-destructive | Reclaims context while preserving source history |
| Migration | Explicit version + on-load migration | Supports evolving format without breaking old sessions |
| Hot-swap | Runtime recreation | Clean state across sessions/cwds |
| Hook surface | `session_before_*`/`session_*` pairs | Extensions observe and augment lifecycle |
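
The "explicit version + on-load migration" decision is commonly implemented as a chain of single-step upgrades. The sketch below is illustrative only: the version number 3, the field defaults, and the names `migrations` and `migrate` are invented for the example and do not describe the project's real migrations.

```typescript
// Assumption: the value 3 is made up for this example.
const CURRENT_SESSION_VERSION = 3;

type RawSession = { version: number; [key: string]: unknown };

// One step per prior version: migrations[n] upgrades version n to n + 1.
// The field changes below are hypothetical placeholders.
const migrations: Record<number, (s: RawSession) => RawSession> = {
  1: (s) => ({ ...s, version: 2, entries: s.entries ?? [] }),
  2: (s) => ({ ...s, version: 3, cwd: s.cwd ?? "" }),
};

// Apply steps in order until the session reaches the current version.
function migrate(session: RawSession): RawSession {
  let current = session;
  while (current.version < CURRENT_SESSION_VERSION) {
    const step = migrations[current.version];
    // Failing loudly avoids silently loading a session that skipped a step.
    if (!step) throw new Error(`no migration from version ${current.version}`);
    current = step(current);
  }
  if (current.version > CURRENT_SESSION_VERSION) {
    throw new Error(`session version ${current.version} is newer than supported`);
  }
  return current;
}
```

Each step returns a new object instead of editing its input, so a failed migration leaves the loaded data as it was.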
Risks and Unknowns
- Large session files (multi-MB) affect resume and read performance; pruning policy is undefined.
- Migration logic must handle every prior version; a missed step corrupts old sessions.
- Compaction quality (what gets summarized) directly affects agent performance and is hard to evaluate automatically.
Out of Scope
- The interactive TUI rendering of sessions (FEAT-0004).
- The agent loop and tool execution (FEAT-0002).
- Extension API internals beyond session hooks (FEAT-0005).