
RPC Reference

This page documents Silo’s gRPC API. The protobuf definitions are the source of truth for all RPC interfaces.

All RPCs below belong to the Silo job queue service.

Get cluster topology for client-side routing. Returns shard ownership information so clients can route requests to the correct node.

rpc GetClusterInfo(GetClusterInfoRequest) returns (GetClusterInfoResponse)

Request: GetClusterInfoRequest

Response: GetClusterInfoResponse

Get information about this node and all the shards it owns.

rpc GetNodeInfo(GetNodeInfoRequest) returns (GetNodeInfoResponse)

Request: GetNodeInfoRequest

Response: GetNodeInfoResponse

Enqueue a new job for processing. The job will be scheduled according to its start_at_ms and processed when limits allow.

rpc Enqueue(EnqueueRequest) returns (EnqueueResponse)

Request: EnqueueRequest

Response: EnqueueResponse

Get full details of a job including its current status.

rpc GetJob(GetJobRequest) returns (GetJobResponse)

Request: GetJobRequest

Response: GetJobResponse

Get the result of a completed job. Returns NOT_FOUND if the job doesn’t exist. Returns FAILED_PRECONDITION if the job hasn’t reached a terminal state.

rpc GetJobResult(GetJobResultRequest) returns (GetJobResultResponse)

Request: GetJobResultRequest

Response: GetJobResultResponse

Permanently delete a job and all its data. Running jobs should be cancelled first.

rpc DeleteJob(DeleteJobRequest) returns (DeleteJobResponse)

Request: DeleteJobRequest

Response: DeleteJobResponse

Cancel a job. Running jobs will be notified via heartbeat. Jobs in any state can be cancelled.

rpc CancelJob(CancelJobRequest) returns (CancelJobResponse)

Request: CancelJobRequest

Response: CancelJobResponse

Restart a cancelled or failed job for another attempt. Returns FAILED_PRECONDITION if the job is not in a restartable state. Returns NOT_FOUND if the job doesn’t exist.

rpc RestartJob(RestartJobRequest) returns (RestartJobResponse)

Request: RestartJobRequest

Response: RestartJobResponse

Expedite a future-scheduled job to run immediately. Useful for pulling forward scheduled jobs or skipping retry backoff delays. Returns FAILED_PRECONDITION if the job is not in an expeditable state (already running, terminal, cancelled, or task already ready to run). Returns NOT_FOUND if the job doesn’t exist.

rpc ExpediteJob(ExpediteJobRequest) returns (ExpediteJobResponse)

Request: ExpediteJobRequest

Response: ExpediteJobResponse

Lease a specific job’s task directly, putting it into Running state. Test-oriented helper: workers should use LeaseTasks for normal processing. Returns FAILED_PRECONDITION if the job is running, terminal, or cancelled. Returns NOT_FOUND if the job doesn’t exist.

rpc LeaseTask(LeaseTaskRequest) returns (LeaseTaskResponse)

Request: LeaseTaskRequest

Response: LeaseTaskResponse

Lease tasks for a worker to process. Workers should call this periodically to get work. Returns both job tasks and floating limit refresh tasks.

rpc LeaseTasks(LeaseTasksRequest) returns (LeaseTasksResponse)

Request: LeaseTasksRequest

Response: LeaseTasksResponse

Report the outcome of a completed job task. Must be called before the task lease expires.

rpc ReportOutcome(ReportOutcomeRequest) returns (ReportOutcomeResponse)

Request: ReportOutcomeRequest

Response: ReportOutcomeResponse

Report the outcome of a floating limit refresh task. Workers compute new max_concurrency and report here.

rpc ReportRefreshOutcome(ReportRefreshOutcomeRequest) returns (ReportRefreshOutcomeResponse)

Request: ReportRefreshOutcomeRequest

Response: ReportRefreshOutcomeResponse

Extend a task lease and check for cancellation. Workers must heartbeat before lease expires to keep tasks. Returns cancelled=true if the job was cancelled.

rpc Heartbeat(HeartbeatRequest) returns (HeartbeatResponse)

Request: HeartbeatRequest

Response: HeartbeatResponse
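Taken together, LeaseTasks, Heartbeat, and ReportOutcome form the worker loop. The sketch below shows only the control flow; `FakeStub`, the keyword-argument call shapes, and the handler are hypothetical stand-ins for illustration, not Silo's generated gRPC client.

```python
# Control-flow sketch of a Silo worker loop (hypothetical stand-in objects,
# not Silo's generated gRPC client code).
def run_worker_once(stub, worker_id, task_group, handler):
    # Lease a batch of tasks, heartbeat each one, run it, and report outcomes.
    resp = stub.LeaseTasks(worker_id=worker_id, max_tasks=10, task_group=task_group)
    outcomes = []
    for task in resp["tasks"]:
        hb = stub.Heartbeat(shard=task["shard"], worker_id=worker_id, task_id=task["id"])
        if hb["cancelled"]:
            # Job was cancelled: stop work and acknowledge with a Cancelled outcome.
            outcomes.append((task["id"], "cancelled", None))
            continue
        try:
            result = handler(task["payload"])
            outcomes.append((task["id"], "success", result))
        except Exception as e:
            outcomes.append((task["id"], "failure", str(e)))
    for task_id, kind, data in outcomes:
        stub.ReportOutcome(task_id=task_id, outcome=kind, data=data)
    return outcomes

class FakeStub:  # in-memory stand-in so the flow can be exercised locally
    def LeaseTasks(self, **kw):
        return {"tasks": [{"id": "t1", "shard": "s1", "payload": 2},
                          {"id": "t2", "shard": "s1", "payload": 3}]}
    def Heartbeat(self, **kw):
        return {"cancelled": kw["task_id"] == "t2"}  # pretend t2 was cancelled
    def ReportOutcome(self, **kw):
        pass

outcomes = run_worker_once(FakeStub(), "w1", "default", handler=lambda p: p * 10)
```

In a real worker, Heartbeat would run on a timer shorter than `lease_ms` rather than once per task.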

Execute an SQL query against shard data. Returns results as JSON rows.

rpc Query(QueryRequest) returns (QueryResponse)

Request: QueryRequest

Response: QueryResponse

Execute an SQL query with Arrow IPC streaming response. More efficient for large result sets. First message contains schema, subsequent messages contain record batches.

rpc QueryArrow(QueryArrowRequest) returns (stream ArrowIpcMessage)

Request: QueryArrowRequest

Response: ArrowIpcMessage

Capture a CPU profile from this node. Returns pprof protobuf data that can be analyzed with pprof or go tool pprof. The profile captures CPU usage for the specified duration.

rpc CpuProfile(CpuProfileRequest) returns (CpuProfileResponse)

Request: CpuProfileRequest

Response: CpuProfileResponse

Request a shard split operation. Initiates splitting a shard into two child shards at the specified split point. Returns FAILED_PRECONDITION if a split is already in progress. Returns NOT_FOUND if the shard doesn’t exist on this node. Returns INVALID_ARGUMENT if the split point is outside the shard’s range.

rpc RequestSplit(RequestSplitRequest) returns (RequestSplitResponse)

Request: RequestSplitRequest

Response: RequestSplitResponse

Get the status of a shard split operation. Returns the current phase and child shard IDs if a split is in progress. If no split is in progress, returns with in_progress=false.

rpc GetSplitStatus(GetSplitStatusRequest) returns (GetSplitStatusResponse)

Request: GetSplitStatusRequest

Response: GetSplitStatusResponse

Configure a shard’s placement ring. Changes which placement ring the shard belongs to, affecting which nodes can own it. The shard will be handed off to a node that participates in the new ring. Returns the previous and current ring assignments.

rpc ConfigureShard(ConfigureShardRequest) returns (ConfigureShardResponse)

Request: ConfigureShardRequest

Response: ConfigureShardResponse

Import jobs from another system with historical attempts. Unlike Enqueue, ImportJobs accepts completed attempt records and lets Silo take ownership going forward. Used for migrating workloads from other job queues. Each job is imported independently; per-job errors are returned in the response.

rpc ImportJobs(ImportJobsRequest) returns (ImportJobsResponse)

Request: ImportJobsRequest

Response: ImportJobsResponse

Reset all shards owned by this server. WARNING: Destructive operation. Only available in dev mode. Clears all jobs, tasks, queues, and other data.

rpc ResetShards(ResetShardsRequest) returns (ResetShardsResponse)

Request: ResetShardsRequest

Response: ResetShardsResponse

Force-release a shard lease regardless of the current holder. Operator escape hatch for recovering from permanently lost nodes. After force-release, any live node that desires the shard can acquire it.

rpc ForceReleaseShard(ForceReleaseShardRequest) returns (ForceReleaseShardResponse)

Request: ForceReleaseShardRequest

Response: ForceReleaseShardResponse

Container for arbitrary serialized data with support for multiple encoding formats. Currently only MessagePack is supported, but structured as a union for forward compatibility to add new serialization formats in the future (e.g., JSON, Protobuf). Used for job payloads, results, error data, and query response rows.

Oneof encoding: One of the following:

Field | Type | ID | Description
msgpack | optional bytes | 1 | Raw MessagePack bytes. Callers should serialize/deserialize using MessagePack.

Configuration for automatic job retry on failure. When a job attempt fails, Silo will automatically retry according to this policy.

Field | Type | ID | Description
retry_count | optional uint32 | 1 | Maximum number of retry attempts after the initial attempt fails.
initial_interval_ms | optional int64 | 2 | Initial delay in milliseconds before the first retry.
max_interval_ms | optional int64 | 3 | Maximum delay between retries (caps exponential backoff).
randomize_interval | optional bool | 4 | If true, adds jitter to retry intervals to prevent thundering herd.
backoff_factor | optional double | 5 | Multiplier for exponential backoff (e.g., 2.0 doubles delay each retry).
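The fields above combine into a delay schedule. A minimal sketch, assuming the conventional exponential-backoff interpretation of these fields (the authoritative behavior is Silo's server implementation); with randomize_interval set, the server would additionally apply jitter:

```python
# Illustrative only: how RetryPolicy fields yield per-retry delays.
def retry_delays_ms(retry_count, initial_interval_ms, max_interval_ms, backoff_factor):
    """Delay before each retry: exponential growth capped at max_interval_ms."""
    delays = []
    delay = initial_interval_ms
    for _ in range(retry_count):
        delays.append(min(delay, max_interval_ms))
        delay = int(delay * backoff_factor)
    return delays

# retry_count=4, initial 1s, cap 5s, factor 2.0
schedule = retry_delays_ms(4, 1000, 5000, 2.0)  # → [1000, 2000, 4000, 5000]
```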

Static concurrency limit that restricts how many jobs with the same key can run simultaneously. Jobs sharing the same key will queue up if max_concurrency is reached.

Field | Type | ID | Description
key | optional string | 1 | Grouping key - jobs with the same key share this limit.
max_concurrency | optional uint32 | 2 | Maximum number of jobs with this key that can run at once.

Dynamic concurrency limit where the max_concurrency value is computed by workers. Useful when concurrency should be based on external factors like API rate limits. Workers periodically receive refresh tasks to update the limit.

Field | Type | ID | Description
key | optional string | 1 | Grouping key - jobs with the same key share this limit.
default_max_concurrency | optional uint32 | 2 | Initial max concurrency used until first worker refresh.
refresh_interval_ms | optional int64 | 3 | How often workers receive refresh tasks (milliseconds).
metadata | optional map<string, string> | 4 | Arbitrary data passed to workers during refresh (e.g., API credentials).

Retry policy for when a job is blocked by a rate limit. Configures how long workers should wait before retrying the rate limit check.

Field | Type | ID | Description
initial_backoff_ms | optional int64 | 1 | Initial wait time in milliseconds when rate limited.
max_backoff_ms | optional int64 | 2 | Maximum wait time between retries.
backoff_multiplier | optional double | 3 | Multiplier for exponential backoff (default 2.0).
max_retries | optional uint32 | 4 | Max retries before failing the job (0 = infinite until reset_time).

Rate limit backed by the Gubernator distributed rate limiting service. Allows controlling job throughput based on request rates.

Field | Type | ID | Description
name | optional string | 1 | Human-readable name for debugging and metrics.
unique_key | optional string | 2 | Unique identifier for this rate limit instance (e.g., user ID).
limit | optional int64 | 3 | Maximum number of requests allowed within the duration.
duration_ms | optional int64 | 4 | Time window in milliseconds for the rate limit.
hits | optional int32 | 5 | Number of hits this job consumes (usually 1).
algorithm | optional GubernatorAlgorithm | 6 | Algorithm for rate limiting (token bucket or leaky bucket).
behavior | optional int32 | 7 | Behavior flags - combine GubernatorBehavior values with OR.
retry_policy | optional RateLimitRetryPolicy | 8 | Policy for retrying when rate limited.

Union type representing any kind of limit that can be applied to a job. Jobs can have multiple limits; all must be satisfied before execution.

Oneof limit: One of the following:

Field | Type | ID | Description
concurrency | optional ConcurrencyLimit | 1 | Static concurrency limit.
rate_limit | optional GubernatorRateLimit | 2 | Gubernator-based rate limit.
floating_concurrency | optional FloatingConcurrencyLimit | 3 | Dynamic worker-computed concurrency limit.

Request to enqueue a new job for processing.

Oneof _retry_policy: One of the following:

Field | Type | ID | Description
retry_policy | optional RetryPolicy | 5 | Retry configuration. If absent, job fails on first error.

Oneof _tenant: One of the following:

Field | Type | ID | Description
tenant | optional string | 8 | Tenant ID for multi-tenant deployments. Optional.

Fields:

Field | Type | ID | Description
shard | optional string | 1 | Shard ID (UUID) where the job should be stored.
id | optional string | 2 | Optional job ID. If empty, a random UUID is generated.
priority | optional uint32 | 3 | Priority from 0-99. Lower is higher priority (0 = highest).
start_at_ms | optional int64 | 4 | Unix timestamp (ms) for future scheduling. 0 = run immediately.
payload | optional SerializedBytes | 6 | Opaque serialized payload passed to workers.
limits | repeated Limit | 7 | Ordered list of limits checked before execution.
metadata | optional map<string, string> | 9 | Arbitrary key/value metadata stored with the job.
task_group | optional string | 10 | Task group for organizing tasks. Required. Tasks are enqueued into this group.
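The request shape above can be sketched as follows. This is a hypothetical illustration using a plain dict that mirrors the field layout, not Silo's generated protobuf classes; the shard UUID and limit key are invented values.

```python
# Hypothetical sketch of an EnqueueRequest's shape (plain dict, not
# generated protobuf code). Keys mirror the field table above.
import time
import uuid

def make_enqueue_request(shard, payload_msgpack, task_group, priority=50, delay_s=0):
    return {
        "shard": shard,                           # shard UUID where the job is stored
        "id": str(uuid.uuid4()),                  # omit to let the server generate one
        "priority": priority,                     # 0 = highest, 99 = lowest
        "start_at_ms": int((time.time() + delay_s) * 1000) if delay_s else 0,
        "payload": {"msgpack": payload_msgpack},  # SerializedBytes oneof
        "limits": [                               # checked in order before execution
            {"concurrency": {"key": "per-user:42", "max_concurrency": 3}},
        ],
        "task_group": task_group,                 # required
    }

req = make_enqueue_request("00000000-0000-0000-0000-0000000000aa",
                           b"\x81", "default", priority=10)
```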

Response after successfully enqueueing a job.

Field | Type | ID | Description
id | optional string | 1 | The job’s ID (either provided or auto-generated).

Request to retrieve details about a specific job.

Oneof _tenant: One of the following:

Field | Type | ID | Description
tenant | optional string | 3 | Tenant ID if multi-tenancy is enabled.

Fields:

Field | Type | ID | Description
shard | optional string | 1 | Shard ID (UUID) where the job is stored.
id | optional string | 2 | The job’s unique ID.
include_attempts | optional bool | 4 | If true, include all attempts in the response. Defaults to false.

A single execution attempt of a job.

Oneof _finished_at_ms: One of the following:

Field | Type | ID | Description
finished_at_ms | optional int64 | 6 | Unix timestamp (ms) when attempt finished. Present if completed.

Oneof _result: One of the following:

Field | Type | ID | Description
result | optional SerializedBytes | 7 | Result data if attempt succeeded.

Oneof _error_code: One of the following:

Field | Type | ID | Description
error_code | optional string | 8 | Error code if attempt failed.

Oneof _error_data: One of the following:

Field | Type | ID | Description
error_data | optional SerializedBytes | 9 | Error details if attempt failed.

Fields:

Field | Type | ID | Description
job_id | optional string | 1 | The job’s unique ID.
attempt_number | optional uint32 | 2 | Which attempt this is (1 = first attempt).
task_id | optional string | 3 | Unique task ID for this attempt.
status | optional AttemptStatus | 4 | Current status of the attempt.
started_at_ms | optional int64 | 5 | Unix timestamp (ms) when attempt started. Present for all attempts.

Full details of a job including its current state.

Oneof _retry_policy: One of the following:

Field | Type | ID | Description
retry_policy | optional RetryPolicy | 5 | Retry policy if configured.

Oneof _next_attempt_starts_after_ms: One of the following:

Field | Type | ID | Description
next_attempt_starts_after_ms | optional int64 | 11 | Unix timestamp (ms) when the next attempt will start. Present for scheduled jobs, absent for running or terminal jobs.

Oneof _result: One of the following:

Field | Type | ID | Description
result | optional SerializedBytes | 13 | Result data from the last attempt, if the job succeeded.

Fields:

Field | Type | ID | Description
id | optional string | 1 | The job’s unique ID.
priority | optional uint32 | 2 | Job priority (0 = highest, 99 = lowest).
enqueue_time_ms | optional int64 | 3 | Unix timestamp (ms) when job was enqueued.
payload | optional SerializedBytes | 4 | The job’s payload data.
limits | repeated Limit | 6 | Limits declared on this job.
metadata | optional map<string, string> | 7 | Metadata key/value pairs.
status | optional JobStatus | 8 | Current job status.
status_changed_at_ms | optional int64 | 9 | Unix timestamp (ms) of last status change.
attempts | repeated JobAttempt | 10 | All attempts for this job. Only populated if include_attempts was true in the request.
task_group | optional string | 12 | Task group this job’s tasks are enqueued into.

Request to get the result of a completed job. Only succeeds if the job has reached a terminal state (succeeded, failed, or cancelled).

Oneof _tenant: One of the following:

Field | Type | ID | Description
tenant | optional string | 3 | Tenant ID if multi-tenancy is enabled.

Fields:

Field | Type | ID | Description
shard | optional string | 1 | Shard ID (UUID) where the job is stored.
id | optional string | 2 | The job’s unique ID.

Result of a completed job. Check status to determine which result field is populated.

Oneof result: The result depends on the terminal status.

Field | Type | ID | Description
success_data | optional SerializedBytes | 4 | Present if status == SUCCEEDED. Contains job result.
failure | optional JobFailure | 5 | Present if status == FAILED. Contains error details.
cancelled | optional JobCancelled | 6 | Present if status == CANCELLED. Contains cancellation info.

Fields:

Field | Type | ID | Description
id | optional string | 1 | The job’s unique ID.
status | optional JobStatus | 2 | Terminal status: SUCCEEDED, FAILED, or CANCELLED.
finished_at_ms | optional int64 | 3 | Unix timestamp (ms) when job reached terminal state.

Error information for a failed job.

Field | Type | ID | Description
error_code | optional string | 1 | Application-defined error code.
error_data | optional SerializedBytes | 2 | Serialized error details.

Information about a cancelled job.

Field | Type | ID | Description
cancelled_at_ms | optional int64 | 1 | Unix timestamp (ms) when cancellation was requested.

Request to permanently delete a job and all its data.

Oneof _tenant: One of the following:

Field | Type | ID | Description
tenant | optional string | 3 | Tenant ID if multi-tenancy is enabled.

Fields:

Field | Type | ID | Description
shard | optional string | 1 | Shard ID (UUID) where the job is stored.
id | optional string | 2 | The job’s unique ID.

Response confirming job deletion.

No fields

Request to cancel a job. Running jobs will be notified via heartbeat response.

Oneof _tenant: One of the following:

Field | Type | ID | Description
tenant | optional string | 3 | Tenant ID if multi-tenancy is enabled.

Fields:

Field | Type | ID | Description
shard | optional string | 1 | Shard ID (UUID) where the job is stored.
id | optional string | 2 | The job’s unique ID.

Response confirming cancellation was requested.

No fields

Restart a cancelled or failed job, allowing it to be processed again. The job will get a fresh set of retries according to its retry policy.

Oneof _tenant: One of the following:

Field | Type | ID | Description
tenant | optional string | 3 | Tenant ID if multi-tenancy is enabled.

Fields:

Field | Type | ID | Description
shard | optional string | 1 | Shard ID (UUID) where the job is stored.
id | optional string | 2 | The job’s unique ID.

No fields

Expedite a future-scheduled job to run immediately. This is useful for pulling forward a job that was scheduled for the future, or for skipping retry backoff delays on a mid-retry job.

Oneof _tenant: One of the following:

Field | Type | ID | Description
tenant | optional string | 3 | Tenant ID if multi-tenancy is enabled.

Fields:

Field | Type | ID | Description
shard | optional string | 1 | Shard ID (UUID) where the job is stored.
id | optional string | 2 | The job’s unique ID.

No fields

Lease a specific job’s task directly, putting it into Running state. This is a test-oriented helper: workers should use LeaseTasks for normal processing. Bypasses concurrency/rate-limit processing and creates a RunAttempt lease directly. Returns FAILED_PRECONDITION if the job is running, terminal, or cancelled. Returns NOT_FOUND if the job doesn’t exist.

Oneof _tenant: One of the following:

Field | Type | ID | Description
tenant | optional string | 3 | Tenant ID if multi-tenancy is enabled.

Fields:

Field | Type | ID | Description
shard | optional string | 1 | Shard ID (UUID) where the job is stored.
id | optional string | 2 | The job’s unique ID.
worker_id | optional string | 4 | Worker ID to assign the lease to.

Response containing the leased task.

Field | Type | ID | Description
task | optional Task | 1 | The leased task.

Lease tasks for processing from this server. By default, leases from all shards this server owns (fair distribution). If shard is specified, filters to only that shard.

Oneof _shard: One of the following:

Field | Type | ID | Description
shard | optional string | 1 | Optional filter - if set, only lease from this shard (UUID).

Fields:

Field | Type | ID | Description
worker_id | optional string | 2 | ID of the worker requesting tasks.
max_tasks | optional uint32 | 3 | Maximum number of tasks to lease in this call.
task_group | optional string | 4 | Required. Task group to poll tasks from.

A task representing a single job attempt leased to a worker.

Oneof _tenant_id: One of the following:

Field | Type | ID | Description
tenant_id | optional string | 3 | Tenant ID if multi-tenancy is enabled.

Fields:

Field | Type | ID | Description
id | optional string | 1 | Unique task ID (different from job ID).
job_id | optional string | 2 | ID of the job this task belongs to.
attempt_number | optional uint32 | 4 | Which attempt this is (1 = first attempt). Monotonically increasing across restarts.
relative_attempt_number | optional uint32 | 5 | Attempt within current run (1 = first attempt since last restart). Resets on restart.
is_last_attempt | optional bool | 6 | True if this is the final attempt (no more retries after this run).
metadata | optional map<string, string> | 7 | Metadata key/value pairs from the job.
limits | repeated Limit | 8 | Limits declared on this job (concurrency, rate, floating).
payload | optional SerializedBytes | 9 | The job’s payload for the worker to process.
priority | optional uint32 | 10 | Job priority (for informational purposes).
shard | optional string | 11 | Shard ID (UUID) this task came from (needed for reporting outcome).
task_group | optional string | 12 | Task group this task belongs to.
lease_ms | optional int64 | 13 | How long the lease lasts. Heartbeat before this expires.

Task for refreshing a floating concurrency limit. Workers compute the new max_concurrency and report back.

Oneof _tenant_id: One of the following:

Field | Type | ID | Description
tenant_id | optional string | 9 | Tenant ID if multi-tenancy is enabled.

Fields:

Field | Type | ID | Description
id | optional string | 1 | Unique task ID for this refresh.
queue_key | optional string | 2 | The floating limit key being refreshed.
current_max_concurrency | optional uint32 | 3 | Current max concurrency value.
last_refreshed_at_ms | optional int64 | 4 | Unix timestamp (ms) of last refresh.
metadata | optional map<string, string> | 5 | Metadata from the limit definition.
lease_ms | optional int64 | 6 | How long the lease lasts. Heartbeat before this expires.
shard | optional string | 7 | Shard ID (UUID) this task came from (needed for reporting outcome).
task_group | optional string | 8 | Task group this refresh task belongs to.
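A worker handling a refresh task computes a new max_concurrency from whatever external signal the limit tracks and reports it back as a RefreshSuccess. A minimal sketch; the `api_quota_per_min` and `cost_per_job` metadata keys are invented for illustration (Silo passes metadata through opaquely):

```python
# Hypothetical refresh computation for a floating concurrency limit.
# The metadata keys here are invented; Silo treats metadata as opaque.
def compute_new_max_concurrency(task):
    quota = int(task["metadata"].get("api_quota_per_min", "60"))
    cost = int(task["metadata"].get("cost_per_job", "1"))
    # Never drop below 1 so the queue can still drain.
    return max(1, quota // cost)

task = {"metadata": {"api_quota_per_min": "120", "cost_per_job": "8"}}
new_max = compute_new_max_concurrency(task)  # → 15; report via RefreshSuccess.new_max_concurrency
```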

Response containing tasks leased to a worker.

Field | Type | ID | Description
tasks | repeated Task | 1 | Job execution tasks.
refresh_tasks | repeated RefreshFloatingLimitTask | 2 | Floating limit refresh tasks.

Request to report the outcome of a completed task. Note: tenant is determined from the task lease, not from the request.

Oneof outcome: One of the following:

Field | Type | ID | Description
success | optional SerializedBytes | 3 | Job succeeded. Contains result data from executing the job.
failure | optional Failure | 4 | Job failed. Contains error details.
cancelled | optional Cancelled | 6 | Worker acknowledges job was cancelled.

Fields:

Field | Type | ID | Description
shard | optional string | 1 | Shard ID (UUID) the task came from.
task_id | optional string | 2 | The task’s unique ID.

Error details for a failed task.

Field | Type | ID | Description
code | optional string | 1 | Application-defined error code.
data | optional SerializedBytes | 2 | Serialized error details.

Marker indicating the worker acknowledges the job was cancelled.

No fields

Response confirming outcome was recorded.

No fields

Request to report the outcome of a floating limit refresh task. Note: tenant is determined from the task lease, not from the request.

Oneof outcome: One of the following:

Field | Type | ID | Description
success | optional RefreshSuccess | 4 | Refresh succeeded with new max concurrency.
failure | optional RefreshFailure | 5 | Refresh failed.

Fields:

Field | Type | ID | Description
shard | optional string | 1 | Shard ID (UUID) the task came from.
task_id | optional string | 2 | The task’s unique ID.

Successful floating limit refresh with the new computed value.

Field | Type | ID | Description
new_max_concurrency | optional uint32 | 1 | New max concurrency computed by the worker.

Error during floating limit refresh.

Field | Type | ID | Description
code | optional string | 1 | Error code.
message | optional string | 2 | Human-readable error message.

Response confirming refresh outcome was recorded.

No fields

Request to extend a task lease and check for cancellation. Workers must heartbeat before lease_ms expires to keep the task. Note: tenant is determined from the task lease, not from the request.

Field | Type | ID | Description
shard | optional string | 1 | Shard ID (UUID) the task came from.
worker_id | optional string | 2 | Worker ID that holds the lease.
task_id | optional string | 3 | The task’s unique ID.

Response indicating if the lease was extended and if the job was cancelled.

Oneof _cancelled_at_ms: One of the following:

Field | Type | ID | Description
cancelled_at_ms | optional int64 | 2 | Unix timestamp (ms) when cancellation was requested, if cancelled.

Fields:

Field | Type | ID | Description
cancelled | optional bool | 1 | True if job was cancelled. Worker should stop and report Cancelled.

Marker for a SQL NULL parameter value (referenced by QueryParameter below).

No fields

A single bind parameter value for SQL query placeholders.

Oneof value: One of the following:

Field | Type | ID
bool_value | optional bool | 1
int64_value | optional int64 | 2
uint64_value | optional uint64 | 3
float64_value | optional double | 4
string_value | optional string | 5
bytes_value | optional bytes | 6
null_value | optional QueryNull | 7

Request to execute an arbitrary SQL query against shard data. Useful for ad-hoc inspection and debugging.

Oneof _tenant: One of the following:

Field | Type | ID | Description
tenant | optional string | 3 | Tenant ID to scope results, if multi-tenancy is enabled.

Fields:

Field | Type | ID | Description
shard | optional string | 1 | Shard ID (UUID) to query.
sql | optional string | 2 | SQL query string.
parameters | repeated QueryParameter | 4 | Optional SQL bind parameters ($1, $2, …).

Metadata about a column in query results.

Field | Type | ID | Description
name | optional string | 1 | Column name.
data_type | optional string | 2 | Arrow/DataFusion type as string (e.g., “Utf8”, “Int64”).

Query results with rows as serialized objects.

Field | Type | ID | Description
columns | repeated ColumnInfo | 1 | Schema information for the result columns.
rows | repeated SerializedBytes | 2 | Each row as a serialized object.
row_count | optional int32 | 3 | Total number of rows returned.

Request to execute SQL query with Arrow IPC streaming response. More efficient than QueryResponse for large result sets.

Oneof _tenant: One of the following:

Field | Type | ID | Description
tenant | optional string | 3 | Tenant ID to scope results, if multi-tenancy is enabled.

Fields:

Field | Type | ID | Description
shard | optional string | 1 | Shard ID (UUID) to query.
sql | optional string | 2 | SQL query string.
parameters | repeated QueryParameter | 4 | Optional SQL bind parameters ($1, $2, …).

Arrow IPC encoded message. Part of a streaming response.

Field | Type | ID | Description
ipc_data | optional bytes | 1 | Arrow IPC stream data. First message is schema, subsequent are record batches.

Request to get cluster topology for client-side routing.

No fields

Information about which node owns a specific shard.

Oneof _placement_ring: One of the following:

Field | Type | ID | Description
placement_ring | optional string | 6 | Placement ring this shard belongs to (empty = default ring).

Fields:

Field | Type | ID | Description
shard_id | optional string | 1 | The shard ID (UUID).
grpc_addr | optional string | 2 | gRPC address of the node owning this shard.
node_id | optional string | 3 | Unique identifier of the owning node.
range_start | optional string | 4 | Inclusive start of the tenant_id range owned by this shard.
range_end | optional string | 5 | Exclusive end of the tenant_id range owned by this shard.

Information about a cluster member node.

Field | Type | ID | Description
node_id | optional string | 1 | Unique identifier of this node.
grpc_addr | optional string | 2 | gRPC address of this node.
placement_rings | repeated string | 3 | Placement rings this node participates in.

Cluster topology information.

Field | Type | ID | Description
num_shards | optional uint32 | 1 | Total number of shards in the cluster.
shard_owners | repeated ShardOwner | 2 | Mapping of each shard to its owner.
this_node_id | optional string | 3 | Node ID of the server responding.
this_grpc_addr | optional string | 4 | gRPC address of the server responding.
members | repeated ClusterMember | 5 | All cluster members with their ring participation.
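The shard_owners list is what enables client-side routing: a tenant_id maps to the shard whose [range_start, range_end) interval contains it. A sketch of that lookup with plain dicts standing in for the protobuf messages; treating an empty range_end as unbounded is an assumption for illustration:

```python
# Hypothetical client-side routing over GetClusterInfo's shard_owners.
def route_tenant(shard_owners, tenant_id):
    """Return the gRPC address owning tenant_id via lexicographic range match.
    range_start is inclusive, range_end exclusive; empty range_end is treated
    as unbounded here (an assumption, not stated by the API)."""
    for o in shard_owners:
        if o["range_start"] <= tenant_id and (not o["range_end"] or tenant_id < o["range_end"]):
            return o["grpc_addr"]
    raise LookupError(f"no shard covers tenant {tenant_id!r}")

owners = [
    {"shard_id": "a", "grpc_addr": "node1:7000", "range_start": "", "range_end": "m"},
    {"shard_id": "b", "grpc_addr": "node2:7000", "range_start": "m", "range_end": ""},
]
addr = route_tenant(owners, "tenant-42")  # "t" sorts after "m" → node2:7000
```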

Request to reset all shards owned by this server. WARNING: Destructive operation. Only available in dev mode.

No fields

Response confirming shards were reset.

Field | Type | ID | Description
shards_reset | optional uint32 | 1 | Number of shards that were cleared.

Request to capture a CPU profile from this node. Used for production debugging and performance analysis.

Field | Type | ID | Description
duration_seconds | optional uint32 | 1 | How long to profile (1-300 seconds). Default 30.
frequency | optional uint32 | 2 | Sampling frequency in Hz (1-1000). Default 100.

CPU profile data in pprof protobuf format. Can be analyzed with pprof or go tool pprof.

Field | Type | ID | Description
profile_data | optional bytes | 1 | pprof protobuf bytes (not gzip compressed).
duration_seconds | optional uint32 | 2 | Actual duration profiled.
samples | optional uint64 | 3 | Number of samples collected.

Request to initiate a shard split operation.

Field | Type | ID | Description
shard_id | optional string | 1 | Shard ID (UUID) of the shard to split.
split_point | optional string | 2 | Tenant ID where to split the keyspace.

Response after initiating a shard split.

Field | Type | ID | Description
left_child_id | optional string | 1 | UUID of the left child shard [parent_start, split_point).
right_child_id | optional string | 2 | UUID of the right child shard [split_point, parent_end).
phase | optional string | 3 | Current split phase (e.g., “SplitRequested”).
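The half-open ranges above determine where a tenant lands after the split: tenant IDs below split_point belong to the left child, and split_point itself (and everything above it) to the right child. A toy illustration:

```python
# Toy illustration of the half-open split convention:
# left child covers [parent_start, split_point), right covers [split_point, parent_end).
def child_for_tenant(tenant_id, split_point, left_child_id, right_child_id):
    return left_child_id if tenant_id < split_point else right_child_id

left = child_for_tenant("alpha", "m", "left-uuid", "right-uuid")
right = child_for_tenant("m", "m", "left-uuid", "right-uuid")  # boundary goes right
```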

Request to get the status of a shard split operation.

Field | Type | ID | Description
shard_id | optional string | 1 | Parent shard ID (UUID) of the split operation.

Response with the current split status. Returns empty if no split is in progress for the shard.

Field | Type | ID | Description
in_progress | optional bool | 1 | True if a split is in progress for this shard.
phase | optional string | 2 | Current split phase (empty if not in progress).
left_child_id | optional string | 3 | UUID of the left child shard (empty if not in progress).
right_child_id | optional string | 4 | UUID of the right child shard (empty if not in progress).
split_point | optional string | 5 | Tenant ID at which the split occurs (empty if not in progress).
initiator_node_id | optional string | 6 | Node ID that initiated the split.
requested_at_ms | optional int64 | 7 | Unix timestamp (ms) when split was requested.

Information about a shard owned by a node, including counters and cleanup status.

Field | Type | ID | Description
shard_id | optional string | 1 | The shard ID (UUID).
total_jobs | optional int64 | 2 | Total number of jobs in the shard (not deleted).
completed_jobs | optional int64 | 3 | Number of jobs in terminal states (Succeeded, Failed, Cancelled).
cleanup_status | optional string | 4 | Cleanup status: “CompactionDone”, “CleanupPending”, “CleanupRunning”, “CleanupDone”.
created_at_ms | optional int64 | 5 | Unix timestamp (ms) when this shard was first created/initialized.
cleanup_completed_at_ms | optional int64 | 6 | Unix timestamp (ms) when cleanup completed (0 if not applicable or not completed).

Request to get node information including owned shards with their counters and cleanup status.

No fields

Response with node information and details for all shards owned by this node.

Field | Type | ID | Description
node_id | optional string | 1 | Unique identifier of this node.
owned_shards | repeated OwnedShardInfo | 2 | Information for each shard owned by this node.
placement_rings | repeated string | 3 | Placement rings this node participates in.

Request to configure a shard’s placement ring.

Oneof _placement_ring: One of the following:

Field | Type | ID | Description
placement_ring | optional string | 2 | The placement ring to assign (empty/null = default ring).

Oneof _tenant: One of the following:

Field | Type | ID | Description
tenant | optional string | 100 | Optional tenant ID (for multi-tenant mode).

Fields:

Field | Type | ID | Description
shard | optional string | 1 | The shard ID (UUID) to configure.

Response after configuring a shard’s placement ring.

Field | Type | ID | Description
previous_ring | optional string | 1 | The placement ring before the change (empty = default ring).
current_ring | optional string | 2 | The placement ring after the change (empty = default ring).

A historical attempt record for job import. All attempts must be in terminal states (no Running).

Oneof _result: One of the following:

Field | Type | ID | Description
result | optional SerializedBytes | 4 | Present if succeeded.

Oneof _error_code: One of the following:

Field | Type | ID | Description
error_code | optional string | 5 | Present if failed.

Oneof _error_data: One of the following:

Field | Type | ID | Description
error_data | optional SerializedBytes | 6 | Present if failed.

Fields:

Field | Type | ID | Description
status | optional AttemptStatus | 1 | Must be terminal (SUCCEEDED, FAILED, or CANCELLED).
started_at_ms | optional int64 | 2 | When the attempt started (epoch ms).
finished_at_ms | optional int64 | 3 | When the attempt finished (epoch ms).

Request to import a single job from another system. Unlike Enqueue, ImportJob accepts historical attempts and lets Silo take ownership going forward.

Oneof _retry_policy: One of the following:

Field | Type | ID | Description
retry_policy | optional RetryPolicy | 6 | Retry configuration.

Oneof _tenant: One of the following:

Field | Type | ID | Description
tenant | optional string | 9 | Tenant ID for multi-tenant deployments.

Fields:

Field | Type | ID | Description
shard | optional string | 1 | Shard ID (UUID) where the job should be stored.
id | optional string | 2 | Required job ID (migration preserves IDs).
priority | optional uint32 | 3 | Priority from 0-99. Lower is higher priority.
enqueue_time_ms | optional int64 | 4 | Original enqueue time from source system (0 = now).
start_at_ms | optional int64 | 5 | When the next attempt should start (0 = now, only for non-terminal).
payload | optional SerializedBytes | 7 | Opaque serialized payload.
limits | repeated Limit | 8 | Ordered list of limits.
metadata | optional map<string, string> | 10 | Arbitrary key/value metadata.
task_group | optional string | 11 | Task group for organizing tasks.
attempts | repeated ImportAttempt | 12 | Historical attempts, all terminal.

Batch request to import multiple jobs.

Field | Type | ID | Description
jobs | repeated ImportJobRequest | 1 | Jobs to import.

Result of importing a single job.

Oneof _error: One of the following:

Field | Type | ID | Description
error | optional string | 3 | Error message if import failed.

Fields:

Field | Type | ID | Description
id | optional string | 1 | The job’s ID.
success | optional bool | 2 | Whether the import succeeded.
status | optional JobStatus | 4 | The determined status of the imported job.

Response containing results for each imported job.

Field | Type | ID | Description
results | repeated ImportJobResult | 1 | One result per imported job.

Request to force-release a shard’s ownership lease. Operator escape hatch for permanently lost nodes.

Field | Type | ID | Description
shard | optional string | 1 | The shard ID (UUID) to force-release.

Response after force-releasing a shard lease.

Field | Type | ID | Description
released | optional bool | 1 | True if the lease was released.

Rate limiting algorithm for Gubernator-based limits.

Value | Number | Description
GUBERNATOR_ALGORITHM_TOKEN_BUCKET | 0 | Token bucket: tokens refill at steady rate, requests consume tokens.
GUBERNATOR_ALGORITHM_LEAKY_BUCKET | 1 | Leaky bucket: requests processed at fixed rate, excess queued.

Behavior flags for Gubernator rate limits. Can be combined via bitwise OR.

Value | Number | Description
GUBERNATOR_BEHAVIOR_BATCHING | 0 | Default: batch rate limit checks to peers for efficiency.
GUBERNATOR_BEHAVIOR_NO_BATCHING | 1 | Send each rate limit check immediately (lower latency, higher load).
GUBERNATOR_BEHAVIOR_GLOBAL | 2 | Synchronize rate limit globally across all Gubernator peers.
GUBERNATOR_BEHAVIOR_DURATION_IS_GREGORIAN | 4 | Reset duration on calendar boundaries (minute, hour, day).
GUBERNATOR_BEHAVIOR_RESET_REMAINING | 8 | Force reset the remaining counter on this request.
GUBERNATOR_BEHAVIOR_DRAIN_OVER_LIMIT | 16 | Set remaining to zero on first over-limit event.
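Because the behavior field is an int32 of bit flags, a client combines behaviors with bitwise OR. A small sketch using the flag values from the table above (the constant names are local shorthand, not generated enum identifiers):

```python
# Combine GubernatorBehavior flag values (from the table above) into the
# int32 `behavior` field of GubernatorRateLimit.
NO_BATCHING = 1
GLOBAL = 2
DURATION_IS_GREGORIAN = 4
RESET_REMAINING = 8
DRAIN_OVER_LIMIT = 16

# Unbatched checks, synchronized globally across Gubernator peers.
behavior = NO_BATCHING | GLOBAL  # == 3

def has_flag(behavior, flag):
    """Check whether a behavior value includes a given flag."""
    return behavior & flag != 0
```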

Current state of a job in its lifecycle.

Value | Number | Description
JOB_STATUS_SCHEDULED | 0 | Job is waiting to be executed (queued or scheduled for future).
JOB_STATUS_RUNNING | 1 | Job is currently being processed by a worker.
JOB_STATUS_SUCCEEDED | 2 | Job completed successfully.
JOB_STATUS_FAILED | 3 | Job failed after exhausting all retry attempts.
JOB_STATUS_CANCELLED | 4 | Job was cancelled before completion.

Status of a job attempt in its lifecycle.

Value | Number | Description
ATTEMPT_STATUS_RUNNING | 0 | Attempt is currently running.
ATTEMPT_STATUS_SUCCEEDED | 1 | Attempt completed successfully.
ATTEMPT_STATUS_FAILED | 2 | Attempt failed.
ATTEMPT_STATUS_CANCELLED | 3 | Attempt was cancelled.