Specification: Channel Message Fetching
Overview
The fetch --channel command uses SlackClient.iter_channel_history() to fetch top-level messages via conversations.history. When --full-threads is set, it then fetches the full replies for each threaded message via iter_thread_replies().
Architecture
CLI (cli.py)
|
+-- fetch --channel C1 [--full-threads] [--oldest 7d]
    |
    +-- cache.fetch_channel_messages(conn, client, channel, full_threads)
        |
        +-- slack_api.iter_channel_history(channel)
        |     -> paginated top-level messages via conversations.history
        |
        +-- For each message: storage.record_thread_refresh() + upsert_messages()
        |
        +-- If full_threads:
              For each message with reply_count > 0 or latest_reply:
                +-- slack_api.iter_thread_replies(channel, thread_ts)
                +-- storage.record_thread_refresh() + upsert_messages()
Data Models
ChannelFetchResult
| Field | Type | Constraints | Description |
|---|---|---|---|
| channel | str | not null | Channel ID |
| fetched_messages | int | not null | Messages written in this fetch |
| total_messages | int | not null | Total cached messages for the channel |
| threads_with_replies_fetched | int | not null | Threads that had full replies fetched |
No schema changes. Reuses the existing threads and messages tables.
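As an illustration, the fields in the ChannelFetchResult table could be modelled as a small dataclass. This is a minimal sketch: the use of a frozen dataclass is an assumption, and only the field names and types come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen is an assumption; the spec only lists the fields
class ChannelFetchResult:
    channel: str                       # channel ID
    fetched_messages: int              # messages written in this fetch
    total_messages: int                # total cached messages for the channel
    threads_with_replies_fetched: int  # threads that had full replies fetched


# Example values are made up for illustration.
result = ChannelFetchResult(
    channel="C1",
    fetched_messages=12,
    total_messages=40,
    threads_with_replies_fetched=3,
)
```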
API Contracts
fetch --channel
- Input: --channel (required), --full-threads (optional), --oldest (optional duration), --db, --api-base-url, -v
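- Output (stderr): "cached N messages for CHANNEL (M fetched[, T threads with replies fetched])", where N is total_messages, M is fetched_messages, and T is threads_with_replies_fetched; the bracketed part is optional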
- Exit code: 0 on success
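The summary line in the contract above could be produced along these lines. This is a sketch only: format_summary is a hypothetical helper, and it assumes that N is the total cached count, M is the count written in this run, and that the thread clause is emitted only when a thread count is supplied.

```python
import sys
from typing import Optional


def format_summary(channel: str, total: int, fetched: int,
                   threads: Optional[int] = None) -> str:
    """Build the summary line; format_summary is a hypothetical name."""
    detail = f"{fetched} fetched"
    if threads is not None:
        # Assumption: the optional clause is included only when a count is given.
        detail += f", {threads} threads with replies fetched"
    return f"cached {total} messages for {channel} ({detail})"


# The contract sends the summary to stderr, keeping stdout free for other output.
print(format_summary("C1", total=40, fetched=12, threads=3), file=sys.stderr)
```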
Sequences
Channel fetch with full threads
User -> cli fetch --channel C1 --full-threads
  -> slack_api.iter_channel_history(channel="C1")
     -> paginated top-level messages via cursor
  -> For each top-level message:
     -> storage.record_thread_refresh(channel, thread_ts, None)
     -> storage.upsert_messages(channel, thread_ts, [msg])
  -> Identify messages with reply_count > 0 or latest_reply
  -> For each such thread:
     -> slack_api.iter_thread_replies(channel, thread_ts)
     -> storage.record_thread_refresh(channel, thread_ts, latest)
     -> storage.upsert_messages(channel, thread_ts, replies)
  -> storage.count_channel_messages(channel) -> total
  -> return ChannelFetchResult
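The sequence above can be sketched in Python with in-memory stand-ins. FakeClient and FakeStorage are hypothetical test doubles, not the real SlackClient or storage module. The sketch also assumes that a storage object replaces the conn argument, that upsert_messages returns the number of rows written, and that "latest" is the greatest reply ts.

```python
from dataclasses import dataclass


@dataclass
class ChannelFetchResult:
    channel: str
    fetched_messages: int
    total_messages: int
    threads_with_replies_fetched: int


def is_threaded(msg: dict) -> bool:
    # Thread identification rule from the spec: reply_count > 0 or latest_reply.
    return msg.get("reply_count", 0) > 0 or "latest_reply" in msg


class FakeClient:
    """Hypothetical stand-in for SlackClient that serves canned data."""

    def __init__(self, history, replies):
        self._history, self._replies = history, replies

    def iter_channel_history(self, channel):
        yield from self._history

    def iter_thread_replies(self, channel, thread_ts):
        yield from self._replies.get(thread_ts, [])


class FakeStorage:
    """Hypothetical in-memory stand-in for the storage module."""

    def __init__(self):
        self.messages = {}   # (channel, ts) -> message
        self.refreshes = []  # (channel, thread_ts, latest)

    def record_thread_refresh(self, channel, thread_ts, latest):
        self.refreshes.append((channel, thread_ts, latest))

    def upsert_messages(self, channel, thread_ts, msgs):
        for m in msgs:
            self.messages[(channel, m["ts"])] = m
        return len(msgs)  # assumption: returns rows written

    def count_channel_messages(self, channel):
        return sum(1 for (c, _ts) in self.messages if c == channel)


def fetch_channel_messages(storage, client, channel, full_threads=False):
    fetched, threaded = 0, []
    for msg in client.iter_channel_history(channel):
        thread_ts = msg.get("thread_ts", msg["ts"])
        storage.record_thread_refresh(channel, thread_ts, None)
        fetched += storage.upsert_messages(channel, thread_ts, [msg])
        if is_threaded(msg):
            threaded.append(thread_ts)

    threads_fetched = 0
    if full_threads:
        for thread_ts in threaded:
            # In the real design each thread would get its own transaction.
            replies = list(client.iter_thread_replies(channel, thread_ts))
            latest = max((r["ts"] for r in replies), key=float, default=None)
            storage.record_thread_refresh(channel, thread_ts, latest)
            fetched += storage.upsert_messages(channel, thread_ts, replies)
            threads_fetched += 1

    total = storage.count_channel_messages(channel)
    return ChannelFetchResult(channel, fetched, total, threads_fetched)


client = FakeClient(
    history=[{"ts": "100.0"}, {"ts": "200.0", "reply_count": 2}],
    replies={"200.0": [{"ts": "201.0"}, {"ts": "202.0"}]},
)
storage = FakeStorage()
result = fetch_channel_messages(storage, client, "C1", full_threads=True)
```

Collecting the threaded timestamps during the first pass keeps the top-level pagination separate from the reply fetches, which matches the two phases in the sequence.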
Technical Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Separate transactions per thread | Each reply fetch gets its own transaction | Avoids a single large transaction that holds a write lock for the entire channel fetch |
| Thread identification | reply_count > 0 or presence of latest_reply | Matches how Slack's conversations.history flags threaded messages |
| storage.count_channel_messages() | Counts across all threads for a channel | Provides total cached message count for the summary |
| --full-threads is opt-in | Default fetches only top-level messages | Fetching all replies can be expensive for active channels; user chooses |
| --oldest duration filtering | Converts duration string to epoch, passes as oldest param | Avoids re-fetching ancient history on repeated runs |
| Duration format | Nh (hours), Nd (days), Nw (weeks) | Simple, memorable syntax for common lookback periods |
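The --oldest conversion described in the table could look roughly like the following. This is a sketch under assumptions: parse_duration and oldest_param are hypothetical names, and the six-decimal string format for the oldest value is a guess at what the client passes to conversations.history.

```python
import re
import time
from typing import Optional

# Seconds per unit for the Nh / Nd / Nw syntax.
_UNIT_SECONDS = {"h": 3600, "d": 86400, "w": 604800}


def parse_duration(text: str) -> int:
    """Convert a duration such as '12h', '7d' or '2w' to seconds."""
    match = re.fullmatch(r"(\d+)([hdw])", text.strip())
    if match is None:
        raise ValueError(f"invalid duration {text!r}; expected Nh, Nd or Nw")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def oldest_param(text: str, now: Optional[float] = None) -> str:
    """Return an epoch timestamp string that lies `text` before `now`."""
    now = time.time() if now is None else now
    return f"{now - parse_duration(text):.6f}"


# With a fixed clock the result is deterministic:
# 1_700_000_000 - 7 * 86400 = 1_699_395_200
print(oldest_param("7d", now=1_700_000_000))  # 1699395200.000000
```

Accepting now as a parameter keeps the conversion testable without patching the clock.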
Risks and Unknowns
- Very active channels may have thousands of threads, making --full-threads slow
- No incremental channel refresh; every run re-fetches the full history, or the full --oldest window when that option is given
- Rate limiting may cause long run times for large channels with --full-threads
Out of Scope
- Incremental channel history refresh
- Filtering by message type or author