Data Models
Overview
The data models provide structured representations of core domain objects: GitHub issues, implementation plans, and repositories. Built using dataclasses, they support serialization, validation, and conversion between different formats (JSON, YAML, markdown).
Models
Issue
Represents a GitHub issue with all metadata, located in src/gh_worker/models/issue.py.
Fields:
- number: int - Issue number
- title: str - Issue title
- body: str - Issue description/content
- state: str - Issue state ("open", "closed")
- created_at: datetime - Creation timestamp
- updated_at: datetime - Last update timestamp
- author: str - Issue author username
- labels: list[str] - List of label names
- assignees: list[str] - List of assignee usernames
- url: str - GitHub URL for the issue
- repository: str - Repository in "owner/repo" format
- milestone: str | None - Milestone title (optional)
Methods:
- from_gh_json(data, repository) - Create from GitHub CLI JSON
- to_markdown() - Convert to markdown format
JSON Conversion: from_gh_json() parses GitHub CLI JSON output as follows:
- Handles missing fields (body defaults to "")
- Converts ISO 8601 timestamps (with "Z" suffix)
- Extracts author login (handles missing author)
- Extracts label names from label objects
- Requires repository parameter (not in GitHub JSON)
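The conversion rules above can be sketched as a small normalisation function. This is a minimal illustration, not the actual from_gh_json() implementation: the helper name issue_fields_from_gh_json is hypothetical, and using "" for a missing author is an assumption (the source only says a missing author is handled).

```python
from datetime import datetime
from typing import Any


def issue_fields_from_gh_json(data: dict[str, Any], repository: str) -> dict[str, Any]:
    """Hypothetical helper: normalise GitHub CLI JSON into Issue field values."""
    author = data.get("author") or {}  # author may be absent or null
    return {
        "number": data["number"],
        "title": data["title"],
        "body": data.get("body") or "",  # missing or null body becomes ""
        "state": data["state"],
        # "Z" is rewritten to an explicit UTC offset before parsing
        "created_at": datetime.fromisoformat(data["createdAt"].replace("Z", "+00:00")),
        "updated_at": datetime.fromisoformat(data["updatedAt"].replace("Z", "+00:00")),
        "author": author.get("login", ""),  # assumption: "" when there is no author
        "labels": [label["name"] for label in data.get("labels") or []],
        "url": data["url"],
        "repository": repository,  # supplied by the caller, not present in the JSON
    }


fields = issue_fields_from_gh_json(
    {
        "number": 7,
        "title": "Crash on start",
        "state": "open",
        "createdAt": "2024-01-15T10:00:00Z",
        "updatedAt": "2024-01-15T10:00:00Z",
        "author": None,
        "labels": [{"name": "bug"}],
        "url": "https://github.com/octocat/hello-world/issues/7",
    },
    repository="octocat/hello-world",
)
print(fields["body"] == "", fields["author"] == "", fields["labels"])
```

The input deliberately omits "body" and sets "author" to null to exercise both fallbacks.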
Markdown Format:
# {title}
**Issue**: #{number}
**Repository**: {repository}
**State**: {state}
**Author**: {author}
**Created**: {created_at}
**Updated**: {updated_at}
**URL**: {url}
**Labels**: {label1, label2, ...}
**Assignees**: {assignee1, assignee2, ...}
**Milestone**: {milestone}
---
{body}
PlanMetadata
Tracks implementation plan status and results, located in src/gh_worker/models/plan.py.
Fields:
- issue_number: int - Associated issue number
- repository: str - Repository in "owner/repo" format
- created_at: datetime - Plan creation timestamp
- status: PlanStatus - Current status (enum)
- session_id: str | None - Agent session ID (optional)
- branch_name: str | None - Implementation branch (optional)
- pr_url: str | None - Pull request URL (optional)
- error_message: str | None - Error details if failed (optional)
- completed_at: datetime | None - Completion timestamp (optional)
- merged_at: datetime | None - PR merge timestamp (optional)
- agent: str | None - Agent name used (optional)
- model: str | None - Model used by agent (optional)
- commit_hash: str | None - Repository commit when plan was generated (optional)
- plan_file: Path | None - Path to plan file (not serialized)
Methods:
- to_dict() - Convert to dictionary for YAML
- from_dict(data) - Create from dictionary
- save(path) - Save to YAML file
- load(path) - Load from YAML file
Status Lifecycle:
- PENDING - Plan created, waiting for review/approval
- APPROVED - Plan approved, ready for implementation
- IN_PROGRESS - Implementation in progress
- COMPLETED - Successfully implemented (may have PR or be waiting for review)
- FAILED - Implementation failed
PlanStatus
Enumeration for plan status values.
Values:
- PENDING = "pending"
- APPROVED = "approved"
- IN_PROGRESS = "in_progress"
- COMPLETED = "completed"
- FAILED = "failed"
Serialization:
- Uses enum value (string) for YAML
- Parsed back to enum on deserialization
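As a small illustration of that round trip, the enum below mirrors the values listed above. It is a standalone sketch rather than an import of the real class.

```python
from enum import Enum


class PlanStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"


# Serialization stores the plain string value.
serialized = PlanStatus.IN_PROGRESS.value

# Deserialization looks the member up by value.
restored = PlanStatus(serialized)
print(serialized, restored is PlanStatus.IN_PROGRESS)

# An unknown string raises ValueError, which gives enum validation for free.
try:
    PlanStatus("unknown")
except ValueError as exc:
    print(f"rejected: {exc}")
```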
Repository
Simple repository identifier with parsing and formatting, located in src/gh_worker/models/repository.py.
Fields:
- owner: str - Repository owner (user or organization)
- name: str - Repository name
Methods:
- from_string(repo_string) - Parse "owner/repo" string
- __str__() - Format as "owner/repo"
- full_name - Property returning "owner/repo" format
Validation:
- Requires exactly one "/" separator
- Owner and name must be non-empty
- Strips whitespace from owner and name
Serialization
Issue → Markdown
Uses to_markdown() method:
- Creates title header
- Adds metadata fields
- Includes labels if present
- Separates metadata from body with "---"
- Appends full issue body
Use Cases:
- Storage in description.md files
- Human-readable issue archives
- Issue templates
GitHub JSON → Issue
Uses from_gh_json() classmethod:
- Parses required fields (number, title, state, timestamps, URL)
- Handles optional fields (body, author, labels)
- Converts ISO 8601 timestamps with timezone
- Extracts nested data (author login, label names)
- Adds repository context
Timestamp Handling:
- Replaces "Z" suffix with "+00:00" (UTC)
- Uses datetime.fromisoformat()
- Results in timezone-aware datetime objects
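The "Z" replacement matters because datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 onward; rewriting it to "+00:00" keeps parsing portable to earlier versions. A minimal sketch, with parse_gh_timestamp as a hypothetical helper name:

```python
from datetime import datetime, timezone


def parse_gh_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as "2024-01-15T10:00:00Z"."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"  # explicit UTC offset
    return datetime.fromisoformat(value)


created = parse_gh_timestamp("2024-01-15T10:00:00Z")
print(created.isoformat())         # 2024-01-15T10:00:00+00:00
print(created.tzinfo is not None)  # True: the result is timezone-aware
```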
PlanMetadata → YAML
Uses to_dict() and save():
- Converts all fields to dictionary
- Serializes datetimes to ISO 8601 strings
- Converts enum to value (string)
- Handles None values
- Writes YAML with human-readable formatting
YAML Example:
issue_number: 42
repository: octocat/hello-world
created_at: '2024-01-15T14:30:22.123456+00:00'
status: completed
session_id: abc123
branch_name: fix-issue-42
pr_url: https://github.com/octocat/hello-world/pull/43
error_message: null
completed_at: '2024-01-15T15:45:30.987654+00:00'
merged_at: null
agent: claude-code
model: claude-sonnet-4
commit_hash: abc123def456
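The dictionary behind a file like this could be produced along the following lines. PlanMetadataSketch is a hypothetical, trimmed stand-in for PlanMetadata that carries only a few of its fields; the real to_dict() may differ in detail.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from pathlib import Path
from typing import Optional


class PlanStatus(Enum):
    PENDING = "pending"
    COMPLETED = "completed"


@dataclass
class PlanMetadataSketch:
    """Hypothetical subset of PlanMetadata, for illustration only."""
    issue_number: int
    repository: str
    created_at: datetime
    status: PlanStatus
    pr_url: Optional[str] = None
    completed_at: Optional[datetime] = None
    plan_file: Optional[Path] = None  # runtime-only, left out of the dict

    def to_dict(self) -> dict:
        return {
            "issue_number": self.issue_number,
            "repository": self.repository,
            "created_at": self.created_at.isoformat(),  # ISO 8601 string
            "status": self.status.value,                # enum -> plain string
            "pr_url": self.pr_url,                      # None is kept as-is
            "completed_at": (
                self.completed_at.isoformat() if self.completed_at else None
            ),
        }


meta = PlanMetadataSketch(
    issue_number=42,
    repository="octocat/hello-world",
    created_at=datetime(2024, 1, 15, 14, 30, 22, 123456, tzinfo=timezone.utc),
    status=PlanStatus.COMPLETED,
    plan_file=Path("issue-42/plan.md"),
)
payload = meta.to_dict()
print(payload)
```

If PyYAML is the YAML library in use (an assumption; the source does not name one), yaml.safe_dump(payload, sort_keys=False) would write this dictionary while preserving field order.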
YAML → PlanMetadata
Uses from_dict() and load():
- Reads YAML file
- Parses dictionary
- Converts timestamps from ISO 8601
- Parses enum from string value
- Handles missing optional fields (defaults to None)
- Creates PlanMetadata instance
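The reverse direction can be sketched as a function that turns a loaded dictionary back into typed values. The name plan_kwargs_from_dict is hypothetical and only a subset of fields is shown. It assumes timestamps arrive as strings, which holds for quoted values like those in the YAML example above.

```python
from datetime import datetime
from enum import Enum
from typing import Any, Optional


class PlanStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"


def _parse_optional_ts(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 string, passing None through unchanged."""
    return datetime.fromisoformat(value) if value else None


def plan_kwargs_from_dict(data: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: typed constructor arguments from a loaded dict."""
    return {
        "issue_number": data["issue_number"],          # required
        "repository": data["repository"],              # required
        "created_at": datetime.fromisoformat(data["created_at"]),
        "status": PlanStatus(data["status"]),          # string -> enum
        "session_id": data.get("session_id"),          # optional -> None
        "pr_url": data.get("pr_url"),
        "completed_at": _parse_optional_ts(data.get("completed_at")),
    }


loaded = {
    "issue_number": 42,
    "repository": "octocat/hello-world",
    "created_at": "2024-01-15T14:30:22.123456+00:00",
    "status": "pending",
}
kwargs = plan_kwargs_from_dict(loaded)
print(kwargs["status"], kwargs["completed_at"])
```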
Repository Parsing
Uses from_string():
- Split on "/" delimiter
- Validate exactly 2 parts
- Validate non-empty owner and name
- Strip whitespace
- Create Repository instance
Valid Formats:
- "owner/repo"
- "owner / repo" (whitespace stripped)
Invalid Formats:
- "repo" (missing owner)
- "owner/repo/subpath" (too many parts)
- "/repo" (empty owner)
- "owner/" (empty name)
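A minimal sketch of the parsing steps, run against the valid and invalid formats listed above. RepositorySketch is a hypothetical stand-in for the real class. Stripping whitespace before the emptiness check is an assumption made here so that whitespace-only parts are rejected.

```python
from dataclasses import dataclass


@dataclass
class RepositorySketch:
    """Hypothetical stand-in for Repository."""
    owner: str
    name: str

    @classmethod
    def from_string(cls, repo_string: str) -> "RepositorySketch":
        message = f"Repository must be in 'owner/repo' format, got: {repo_string}"
        parts = repo_string.split("/")
        if len(parts) != 2:  # zero or several "/" separators
            raise ValueError(message)
        owner, name = (part.strip() for part in parts)
        if not owner or not name:  # "/repo", "owner/", " / "
            raise ValueError(message)
        return cls(owner=owner, name=name)

    def __str__(self) -> str:
        return f"{self.owner}/{self.name}"


candidates = ["owner/repo", "owner / repo", "repo",
              "owner/repo/subpath", "/repo", "owner/"]
for candidate in candidates:
    try:
        print(f"{candidate!r} -> {RepositorySketch.from_string(candidate)}")
    except ValueError as exc:
        print(f"{candidate!r} rejected: {exc}")
```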
Requirements
Issue Model
MUST:
- Include all fields from GitHub API (number, title, body, state, timestamps, author, labels, URL)
- Support creation from GitHub CLI JSON
- Support conversion to markdown
- Use timezone-aware datetime for timestamps
- Handle missing or null fields (body, author)
- Store repository association
SHOULD:
- Preserve all GitHub metadata
- Use human-readable markdown format
- Handle empty label lists gracefully
- Strip unnecessary whitespace
MAY:
- Support markdown → Issue parsing
- Provide HTML conversion
- Support custom fields or extensions
- Implement issue comparison or diffing
PlanMetadata Model
MUST:
- Track issue association (issue_number, repository)
- Support status lifecycle (PENDING → APPROVED → IN_PROGRESS → COMPLETED/FAILED)
- Store agent results (session_id, branch_name, pr_url)
- Record timestamps (created_at, completed_at)
- Support YAML serialization and deserialization
- Handle optional fields gracefully
SHOULD:
- Use enum for status values
- Serialize timestamps in ISO 8601 format
- Create parent directories on save
- Validate status transitions
- Log metadata changes
MAY:
- Support metadata versioning or history
- Implement computed properties (duration, etc.)
- Provide metadata validation
- Support custom metadata fields
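The "validate status transitions" item could be implemented with a lookup table derived from the lifecycle PENDING → APPROVED → IN_PROGRESS → COMPLETED/FAILED. The sketch below is one possible shape. The names ALLOWED_TRANSITIONS and check_transition are hypothetical, and treating COMPLETED and FAILED as terminal is an assumption (a real workflow might allow retrying a failed plan).

```python
from enum import Enum


class PlanStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumption: COMPLETED and FAILED are terminal states.
ALLOWED_TRANSITIONS = {
    PlanStatus.PENDING: {PlanStatus.APPROVED},
    PlanStatus.APPROVED: {PlanStatus.IN_PROGRESS},
    PlanStatus.IN_PROGRESS: {PlanStatus.COMPLETED, PlanStatus.FAILED},
    PlanStatus.COMPLETED: set(),
    PlanStatus.FAILED: set(),
}


def check_transition(current: PlanStatus, new: PlanStatus) -> None:
    """Raise ValueError if moving from `current` to `new` is not allowed."""
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(
            f"Invalid status transition: {current.value} -> {new.value}"
        )


check_transition(PlanStatus.PENDING, PlanStatus.APPROVED)  # passes silently
try:
    check_transition(PlanStatus.PENDING, PlanStatus.COMPLETED)
except ValueError as exc:
    print(exc)
```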
Repository Model
MUST:
- Store owner and name separately
- Support parsing from "owner/repo" string
- Format as "owner/repo" string
- Validate format (exactly one "/", non-empty parts)
- Strip whitespace from owner and name
SHOULD:
- Raise ValueError for invalid formats
- Provide descriptive error messages
- Support equality comparison
- Implement hash for use in sets/dicts
MAY:
- Support alternative formats (URLs, SSH)
- Validate owner and name patterns
- Provide repository metadata (description, stars, etc.)
- Support organization vs. user distinction
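Equality comparison and hashing come almost for free with a frozen dataclass, which generates both __eq__ and __hash__ from the fields. Whether the real Repository is declared frozen is not stated in the source, so RepositorySketch below is a hypothetical illustration of the approach.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepositorySketch:
    """Hypothetical stand-in: frozen=True makes instances immutable and hashable."""
    owner: str
    name: str

    @property
    def full_name(self) -> str:
        return f"{self.owner}/{self.name}"

    def __str__(self) -> str:
        return self.full_name


a = RepositorySketch("octocat", "hello-world")
b = RepositorySketch("octocat", "hello-world")

print(a == b)       # field-by-field equality
print(len({a, b}))  # equal instances collapse to one set member
plans_by_repo = {a: ["issue-42"]}
print(plans_by_repo[b])  # an equal instance works as a dict key
```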
Serialization
MUST:
- Use appropriate formats for each model (JSON, YAML, markdown)
- Handle datetime serialization consistently (ISO 8601)
- Support bidirectional conversion where applicable
- Preserve all data during round-trip
- Handle missing or null values
SHOULD:
- Use human-readable formats where possible
- Validate data on deserialization
- Provide clear error messages for invalid data
- Support partial deserialization (optional fields)
MAY:
- Support multiple serialization formats per model
- Implement schema validation
- Provide serialization hooks or callbacks
- Support compression or encoding
Validation
MUST:
- Validate required fields are present
- Check data types match declarations
- Validate format constraints (repository format, enum values)
- Raise appropriate exceptions for invalid data
SHOULD:
- Validate field values (non-negative numbers, valid URLs)
- Check timestamp ordering (created_at <= updated_at; an issue that has never been updated can carry identical timestamps)
- Validate state transitions (plan status)
- Provide validation error details
MAY:
- Implement custom validators per field
- Support validation policies or rules
- Provide validation summaries
- Support relaxed validation modes
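A few of the value checks above (non-negative numbers, valid URLs, timestamp ordering, error details) can be sketched as a function that collects problems instead of failing on the first one. The function name and message wording are hypothetical, and equal timestamps are treated as valid here.

```python
from datetime import datetime, timezone
from urllib.parse import urlparse


def validate_issue_values(number: int, url: str,
                          created_at: datetime, updated_at: datetime) -> list[str]:
    """Return human-readable problems; an empty list means the values are valid."""
    errors = []
    if number < 0:
        errors.append(f"number must be non-negative, got {number}")
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        errors.append(f"url is not an absolute http(s) URL: {url!r}")
    # Both datetimes must be timezone-aware (or both naive) to be comparable.
    if updated_at < created_at:
        errors.append("updated_at is earlier than created_at")
    return errors


earlier = datetime(2024, 1, 15, 10, 0, tzinfo=timezone.utc)
later = datetime(2024, 1, 15, 14, 30, tzinfo=timezone.utc)

ok = validate_issue_values(
    42, "https://github.com/octocat/hello-world/issues/42", earlier, later)
bad = validate_issue_values(-1, "not-a-url", later, earlier)
print(ok)
for problem in bad:
    print(problem)
```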
Usage Examples
Create Issue from GitHub JSON
from gh_worker.models.issue import Issue

# GitHub CLI JSON output
gh_json = {
    "number": 42,
    "title": "Add authentication",
    "body": "We need to add user authentication...",
    "state": "open",
    "createdAt": "2024-01-15T10:00:00Z",
    "updatedAt": "2024-01-15T14:30:00Z",
    "author": {"login": "octocat"},
    "labels": [{"name": "feature"}, {"name": "priority-high"}],
    "url": "https://github.com/octocat/hello-world/issues/42"
}

issue = Issue.from_gh_json(gh_json, repository="octocat/hello-world")
print(issue.title)   # "Add authentication"
print(issue.labels)  # ["feature", "priority-high"]
Convert Issue to Markdown
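The sketch below shows roughly what calling to_markdown() produces, following the Markdown Format template. Because the real Issue class is not reproduced here, IssueSketch is a hypothetical stand-in. Rendering timestamps with isoformat(), omitting the assignee and milestone lines when empty, and the exact blank-line spacing are all assumptions. With the real model the call is simply issue.to_markdown().

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class IssueSketch:
    """Hypothetical stand-in for Issue with the documented fields."""
    number: int
    title: str
    body: str
    state: str
    created_at: datetime
    updated_at: datetime
    author: str
    url: str
    repository: str
    labels: list[str] = field(default_factory=list)
    assignees: list[str] = field(default_factory=list)
    milestone: Optional[str] = None

    def to_markdown(self) -> str:
        lines = [
            f"# {self.title}",
            f"**Issue**: #{self.number}",
            f"**Repository**: {self.repository}",
            f"**State**: {self.state}",
            f"**Author**: {self.author}",
            f"**Created**: {self.created_at.isoformat()}",
            f"**Updated**: {self.updated_at.isoformat()}",
            f"**URL**: {self.url}",
        ]
        if self.labels:  # labels are included only if present
            lines.append(f"**Labels**: {', '.join(self.labels)}")
        if self.assignees:  # assumption: same rule as labels
            lines.append(f"**Assignees**: {', '.join(self.assignees)}")
        if self.milestone:  # assumption: same rule as labels
            lines.append(f"**Milestone**: {self.milestone}")
        lines += ["---", self.body]  # separator, then the full body
        return "\n".join(lines)


issue = IssueSketch(
    number=42,
    title="Add authentication",
    body="We need to add user authentication...",
    state="open",
    created_at=datetime(2024, 1, 15, 10, 0, tzinfo=timezone.utc),
    updated_at=datetime(2024, 1, 15, 14, 30, tzinfo=timezone.utc),
    author="octocat",
    url="https://github.com/octocat/hello-world/issues/42",
    repository="octocat/hello-world",
    labels=["feature", "priority-high"],
)
markdown = issue.to_markdown()
print(markdown)
```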
Create and Save Plan Metadata
from datetime import datetime, timezone
from pathlib import Path

from gh_worker.models.plan import PlanMetadata, PlanStatus

metadata = PlanMetadata(
    issue_number=42,
    repository="octocat/hello-world",
    # Use a timezone-aware timestamp so it serializes with a UTC offset
    created_at=datetime.now(timezone.utc),
    status=PlanStatus.PENDING
)
metadata.save(Path("issue-42/plan.yaml"))
Update Plan Status
from datetime import datetime, timezone
from pathlib import Path

from gh_worker.models.plan import PlanMetadata, PlanStatus

plan_path = Path("issue-42/plan.yaml")

# Load existing metadata
metadata = PlanMetadata.load(plan_path)

# Update status
metadata.status = PlanStatus.IN_PROGRESS
metadata.session_id = "abc123"

# Save changes
metadata.save(plan_path)

# Mark as completed
metadata.status = PlanStatus.COMPLETED
metadata.branch_name = "fix-issue-42"
metadata.pr_url = "https://github.com/octocat/hello-world/pull/43"
metadata.completed_at = datetime.now(timezone.utc)  # timezone-aware
metadata.save(plan_path)
Parse Repository
from gh_worker.models.repository import Repository
# From string
repo = Repository.from_string("octocat/hello-world")
print(repo.owner) # "octocat"
print(repo.name) # "hello-world"
print(repo.full_name) # "octocat/hello-world"
# Create directly
repo = Repository(owner="octocat", name="hello-world")
print(str(repo)) # "octocat/hello-world"
Handle Invalid Repository
from gh_worker.models.repository import Repository

try:
    repo = Repository.from_string("invalid-format")
except ValueError as e:
    print(f"Error: {e}")
    # Error: Repository must be in 'owner/repo' format, got: invalid-format
Extension Points
The data models can be extended to support:
- Additional GitHub entities (pull requests, comments, reviews)
- Custom metadata fields
- Validation frameworks (pydantic, marshmallow)
- Alternative serialization formats (JSON, protobuf, msgpack)
- Schema evolution and migration
- Computed or derived fields
- Relationships between models
- Event sourcing or audit trails