Querying Jobs
Silo includes a built-in SQL query engine powered by Apache DataFusion. You can run SQL queries to inspect jobs, monitor queue health, and debug issues — from the TypeScript client, the siloctl CLI, or the web UI.
Running Queries
Section titled “Running Queries”From the TypeScript Client
Section titled “From the TypeScript Client”// Rows are automatically deserialized from MessagePack — no manual decoding needed.const result = await client.query( "SELECT id, status_kind, priority FROM jobs WHERE tenant = 'customer-123' LIMIT 10", "customer-123" // tenant for shard routing);
for (const row of result.rows) { console.log(row); // { id: "job-abc", status_kind: "Running", priority: 10 }}
// You can pass a generic type parameter for typed rows:interface JobRow { id: string; status_kind: string; count: number;}
const typed = await client.query<JobRow>( "SELECT id, status_kind, COUNT(*) as count FROM jobs GROUP BY id, status_kind", "customer-123");
for (const row of typed.rows) { console.log(row.id, row.status_kind, row.count); // fully typed}The second argument is the tenant used for routing the query to the correct shard. If you’re not using multi-tenancy, you can omit it.
You can also pass positional bind parameters as a third argument. Placeholders use $1, $2, … and are matched by position.
const tenant = "customer-123";const minPriority = 10;
const result = await client.query( "SELECT id, priority FROM jobs WHERE tenant = $1 AND priority >= $2 ORDER BY priority ASC LIMIT 25", tenant, [tenant, minPriority]);Supported bind parameter types in the TypeScript client are boolean, number, string, Uint8Array, and null.
The result includes:
columns— schema information (nameanddataTypefor each column)rows— deserialized JavaScript objects, one per rowrowCount— total number of rows returned
From siloctl
Section titled “From siloctl”# Query a specific shardsiloctl query <shard-id> "SELECT id, status_kind FROM jobs LIMIT 10"
# With a tenantsiloctl -t customer-123 query <shard-id> "SELECT id, status_kind FROM jobs WHERE tenant = 'customer-123'"
# JSON output for scriptingsiloctl --json query <shard-id> "SELECT id, status_kind FROM jobs WHERE status_kind = 'Failed'"Tables
Section titled “Tables”Silo exposes two tables: jobs and queues.
jobs Table
Section titled “jobs Table”The primary table containing all job data.
| Column | Type | Description |
|---|---|---|
shard_id | String | UUID of the shard storing this job |
tenant | String | Tenant identifier |
id | String | Job ID |
priority | UInt8 | Priority (0 = highest, 99 = lowest) |
enqueue_time_ms | Int64 | Unix timestamp in milliseconds when the job was enqueued |
payload | String (nullable) | JSON-encoded job payload |
status_kind | String (nullable) | Current job status (see Status Values) |
status_changed_at_ms | Int64 (nullable) | Unix timestamp in milliseconds of last status change |
task_group | String | Task group the job belongs to |
current_attempt | UInt32 (nullable) | Current attempt number (1-indexed). Present for Waiting/Scheduled jobs, null for Running and terminal statuses |
next_attempt_starts_after_ms | Int64 (nullable) | Unix timestamp in milliseconds when the next attempt is scheduled to start. Present for Waiting/Scheduled jobs, null otherwise |
metadata | Map<String, String> (nullable) | Key-value metadata attached to the job |
queues Table
Section titled “queues Table”Shows the current state of concurrency queues — which jobs hold concurrency tickets and which are waiting.
| Column | Type | Description |
|---|---|---|
shard_id | String | UUID of the shard |
tenant | String | Tenant identifier |
queue_name | String | Name of the concurrency queue (the limit key) |
entry_type | String | "holder" (has a ticket) or "requester" (waiting for one) |
task_id | String | Task or request ID |
job_id | String (nullable) | Job ID (present for requesters) |
priority | UInt8 (nullable) | Job priority (present for requesters) |
timestamp_ms | Int64 | When the ticket was granted or the request was made |
Status Values
Section titled “Status Values”The status_kind column has the following values:
| Status | Meaning |
|---|---|
Waiting | Ready to run — the job’s start time has arrived and it’s in the task queue |
Scheduled | Scheduled for the future — the job’s start time hasn’t arrived yet, or it’s waiting for a retry backoff |
Running | Currently being processed by a worker |
Succeeded | Completed successfully |
Failed | Failed after exhausting all retries |
Cancelled | Cancelled by the user |
Example Queries
Section titled “Example Queries”Listing Jobs
Section titled “Listing Jobs”-- All jobs for a tenantSELECT id, status_kind, priority, task_groupFROM jobsWHERE tenant = 'customer-123'ORDER BY enqueue_time_ms DESCLIMIT 50
-- Jobs in a specific task groupSELECT id, status_kind, enqueue_time_msFROM jobsWHERE tenant = 'customer-123' AND task_group = 'emails'ORDER BY enqueue_time_ms DESCFiltering by Status
Section titled “Filtering by Status”-- All currently running jobsSELECT id, task_group, priorityFROM jobsWHERE tenant = 'customer-123' AND status_kind = 'Running'
-- Failed jobsSELECT id, task_group, status_changed_at_msFROM jobsWHERE tenant = 'customer-123' AND status_kind = 'Failed'ORDER BY status_changed_at_ms DESC
-- Jobs waiting to be picked upSELECT id, priority, task_groupFROM jobsWHERE tenant = 'customer-123' AND status_kind = 'Waiting'ORDER BY priority ASC
-- Future-scheduled jobs (including retry backoffs)SELECT id, task_groupFROM jobsWHERE tenant = 'customer-123' AND status_kind = 'Scheduled'
-- New future-scheduled jobs (never attempted)SELECT id, task_group, next_attempt_starts_after_msFROM jobsWHERE tenant = 'customer-123' AND status_kind = 'Scheduled' AND current_attempt = 1
-- Jobs waiting to retry after a failure (attempted at least once)SELECT id, task_group, current_attempt, next_attempt_starts_after_msFROM jobsWHERE tenant = 'customer-123' AND status_kind = 'Scheduled' AND current_attempt > 1Aggregations
Section titled “Aggregations”-- Count jobs by statusSELECT status_kind, COUNT(*) as countFROM jobsWHERE tenant = 'customer-123'GROUP BY status_kind
-- Count by task group and statusSELECT task_group, status_kind, COUNT(*) as countFROM jobsWHERE tenant = 'customer-123'GROUP BY task_group, status_kindORDER BY task_group, status_kind
-- Average priority by task groupSELECT task_group, AVG(CAST(priority AS DOUBLE)) as avg_priorityFROM jobsWHERE tenant = 'customer-123'GROUP BY task_groupQuerying Metadata
Section titled “Querying Metadata”Jobs can have arbitrary key-value metadata. Querying metadata uses DataFusion’s map functions.
Exact match — find jobs where a metadata key equals a specific value:
SELECT id, status_kindFROM jobsWHERE tenant = 'customer-123' AND array_contains(element_at(metadata, 'env'), 'production')Prefix match — find jobs where a metadata value starts with a prefix:
SELECT id, status_kindFROM jobsWHERE tenant = 'customer-123' AND starts_with(array_any_value(element_at(metadata, 'region')), 'us-')You can also use LIKE for prefix matching:
SELECT id, status_kindFROM jobsWHERE tenant = 'customer-123' AND array_any_value(element_at(metadata, 'region')) LIKE 'us-%'Combining metadata filters with status:
SELECT id, status_kind, priorityFROM jobsWHERE tenant = 'customer-123' AND array_contains(element_at(metadata, 'env'), 'production') AND status_kind = 'Running'Inspecting Concurrency Queues
Section titled “Inspecting Concurrency Queues”-- See all holders and requesters for a queueSELECT queue_name, entry_type, task_id, job_id, timestamp_msFROM queuesWHERE tenant = 'customer-123' AND queue_name = 'user:456'
-- Count holders vs requesters per queueSELECT queue_name, entry_type, COUNT(*) as countFROM queuesWHERE tenant = 'customer-123'GROUP BY queue_name, entry_type
-- Find queues with waiting requesters (indicates jobs are blocked on concurrency)SELECT queue_name, COUNT(*) as waitingFROM queuesWHERE tenant = 'customer-123' AND entry_type = 'requester'GROUP BY queue_nameORDER BY waiting DESCLooking Up a Specific Job
Section titled “Looking Up a Specific Job”-- Get full details for a job by IDSELECT *FROM jobsWHERE tenant = 'customer-123' AND id = 'job-abc-123'Writing Performant Queries
Section titled “Writing Performant Queries”Silo uses predicate pushdown to select the most efficient scan strategy based on your WHERE clause. Understanding how this works helps you write fast queries.
Always filter by tenant
Section titled “Always filter by tenant”When multi-tenancy is enabled, always include a WHERE tenant = '...' condition. Without it, Silo must scan all tenants on the shard, which is much slower.
-- Fast: scans only one tenant's dataSELECT id FROM jobs WHERE tenant = 'customer-123' AND status_kind = 'Running'
-- Slow: scans all tenants on the shardSELECT id FROM jobs WHERE status_kind = 'Running'Scan strategy priority
Section titled “Scan strategy priority”Silo picks the best scan strategy from your filters in this priority order:
- Exact ID lookup —
WHERE id = '...'— direct key lookup, fastest - Metadata exact match —
WHERE array_contains(element_at(metadata, 'key'), 'value')— uses the metadata index - Metadata prefix —
WHERE starts_with(array_any_value(element_at(metadata, 'key')), 'prefix')— uses the metadata index with prefix scan - Status filter —
WHERE status_kind = '...'— uses the status/time index - Full scan — no indexed filter — scans all jobs for the tenant
Only one index-backed filter is used per query. If you combine filters (e.g., status + metadata), the highest-priority one drives the scan and the rest are applied as post-filters.
-- Uses ExactId scan (priority 1), even though status filter is also presentSELECT * FROM jobs WHERE tenant = 'customer-123' AND id = 'job-abc' AND status_kind = 'Running'
-- Uses MetadataExact scan (priority 2)SELECT * FROM jobs WHERE tenant = 'customer-123' AND array_contains(element_at(metadata, 'env'), 'prod')
-- Uses Status scan (priority 4)SELECT * FROM jobs WHERE tenant = 'customer-123' AND status_kind = 'Failed'Use LIMIT
Section titled “Use LIMIT”Always add LIMIT to exploratory queries. Without it, Silo will scan and return all matching rows.
-- Good: bounded result setSELECT id, status_kind FROM jobs WHERE tenant = 'customer-123' LIMIT 100
-- Risky with large tenants: returns everythingSELECT id, status_kind FROM jobs WHERE tenant = 'customer-123'Cross-shard queries
Section titled “Cross-shard queries”The TypeScript client routes queries to a single shard based on the tenant. If you need to query across all shards (e.g., for cluster-wide aggregations), use siloctl to query each shard, or use the cluster query engine if available.
# Query each shard separatelysiloctl query <shard-0> "SELECT COUNT(*) as count FROM jobs"siloctl query <shard-1> "SELECT COUNT(*) as count FROM jobs"Supported SQL
Section titled “Supported SQL”Silo’s query engine supports standard SQL via Apache DataFusion:
SELECTwith column projection and*WHEREwithAND,OR, comparisons (=,!=,>,<,>=,<=),LIKE,INORDER BY(single or multiple columns,ASC/DESC)LIMITandOFFSETGROUP BYwith aggregate functions (COUNT,SUM,AVG,MIN,MAX)CASTfor type conversionsCASEexpressions- Arithmetic expressions
- Map functions:
element_at(),array_contains(),array_any_value(),starts_with()
Next Steps
Section titled “Next Steps”- Use siloctl for command-line query access and common workflows
- Set up observability for continuous monitoring alongside ad-hoc queries
- Learn about concurrency limits to understand the
queuestable