
Server Configuration

Silo is configured using a TOML configuration file. You can specify the configuration file path using the -c or --config CLI flag:

```sh
silo -c /path/to/config.toml
```

If no configuration file is specified, Silo uses sensible defaults suitable for local development.

| Argument | Description |
| --- | --- |
| `-c, --config <path>` | Path to a TOML configuration file |
| `-v` | Enable verbose output |

Validate your configuration without starting the server using siloctl validate-config:

```sh
siloctl validate-config --config config.toml
```

This command parses the configuration file and reports any errors.

The [server] section configures the main gRPC server.

```toml
[server]
grpc_addr = "127.0.0.1:7450"
dev_mode = false
statement_timeout_ms = 5000
```

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `grpc_addr` | string | `"127.0.0.1:7450"` | Address and port for the gRPC server to listen on |
| `dev_mode` | bool | `false` | Enable development mode features like the ResetShards RPC. Never enable in production. |
| `statement_timeout_ms` | number | `5000` | Maximum SQL statement execution time in milliseconds. Query execution is aborted when this timeout is hit. Set to `0` to disable the statement timeout. |
| `auth_token` | string | (none) | Shared secret for gRPC authentication. When set, all incoming gRPC requests must include this token as a Bearer token in the `authorization` metadata header. When unset (the default), authentication is disabled. |
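
For example, to disable the statement timeout entirely, as described above (use with care, since runaway queries will no longer be aborted):

```toml
[server]
# 0 disables the statement timeout, so queries may run indefinitely
statement_timeout_ms = 0
```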

Silo supports optional shared-secret authentication for gRPC requests. When auth_token is set in the [server] section, all incoming RPCs must include an authorization: Bearer <token> metadata header matching the configured value. Requests without a valid token are rejected with UNAUTHENTICATED.

When auth_token is not set (the default), authentication is disabled and all clients can connect freely.

```toml
[server]
grpc_addr = "0.0.0.0:7450"
auth_token = "${SILO_AUTH_TOKEN}"
```

When authentication is enabled, node-to-node cluster communication and WebUI remote queries automatically use the configured token. External clients (workers, siloctl) must provide the token themselves.

siloctl supports the token via the --auth-token flag or the SILO_AUTH_TOKEN environment variable:

```sh
siloctl --auth-token <token> cluster info
# or
SILO_AUTH_TOKEN=<token> siloctl cluster info
```

The [database] section configures how Silo stores job data. Silo uses SlateDB as its embedded database, which stores data in object storage.

```toml
[database]
backend = "gcs"
path = "gs://my-bucket/silo/%shard%"
apply_wal_on_close = true

# Optional: periodic self-healing scan for pending concurrency requests
# concurrency_reconcile_interval_ms = 5000

# Optional: separate WAL storage
[database.wal]
backend = "fs"
path = "/var/lib/silo/wal/%shard%"
```

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `backend` | string | `"fs"` | Storage backend type (see below) |
| `path` | string | `"/tmp/silo/%shard%"` | Path or URL for data storage. Use `%shard%` as a placeholder for the shard number. |
| `apply_wal_on_close` | bool | `true` | Flush the WAL to object storage before closing shards (recommended for durability) |
| `concurrency_reconcile_interval_ms` | number | `5000` | Optional interval for periodic pending-request reconciliation in the concurrency manager |

| Backend | Description | Path Format |
| --- | --- | --- |
| `fs` | Local filesystem | `/var/lib/silo/%shard%` |
| `s3` | Amazon S3 | `s3://bucket-name/prefix/%shard%` |
| `gcs` | Google Cloud Storage | `gs://bucket-name/prefix/%shard%` |
| `memory` | In-memory (testing only) | Any string |
| `url` | Generic URL-based object store | A URL understood by SlateDB |

For S3 and GCS backends, Silo uses the standard credential chain:

  • S3: AWS credential chain (AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY, instance profiles, etc.)
  • GCS: GOOGLE_APPLICATION_CREDENTIALS environment variable or GKE Workload Identity
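
For example, an S3-backed data store might be configured like this (the bucket name and prefix are hypothetical; credentials come from the AWS credential chain):

```toml
[database]
backend = "s3"
# %shard% is replaced with the shard number at runtime
path = "s3://my-bucket/silo/%shard%"
```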

By default, the Write-Ahead Log (WAL) uses the same backend and location as the main data store. For better write performance, you can configure a separate local WAL:

```toml
[database]
backend = "gcs"
path = "gs://my-bucket/silo/%shard%"

[database.wal]
backend = "fs"
path = "/var/lib/silo/wal/%shard%"
```

| Option | Type | Description |
| --- | --- | --- |
| `backend` | string | Storage backend for the WAL |
| `path` | string | Path for WAL storage (supports the `%shard%` placeholder) |

When using a local WAL with cloud object storage:

  • Writes are faster because they go to local disk first
  • On graceful shard close (or node shutdown), WAL is flushed to object storage to ensure durability
  • The local WAL directory is deleted after successful flush
  • On crash, shard leases are permanent and persist until the node restarts and recovers the WAL. Set node_id to a stable value (e.g., "${POD_NAME}") to enable automatic WAL recovery after restarts. See the Internals guide for details.
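
For example, on Kubernetes you might derive a stable node identity from the pod name (assuming a POD_NAME environment variable injected via the Downward API):

```toml
[coordination]
# A stable ID lets the node reclaim its shard leases and recover the WAL after a crash
node_id = "${POD_NAME}"
```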

When running on pure object storage, Silo instances don’t strictly need to apply the WAL to the main data store before shutting down. With a split WAL, however, where the WAL is not on object storage, the WAL must be applied to the main store before shutdown to ensure durability. apply_wal_on_close defaults to true, which triggers this behavior.

Note that apply_wal_on_close only applies during graceful shutdowns. If a node crashes, the WAL is not flushed. Silo’s permanent shard leases ensure the crashed node retains ownership of its shards, so when the node restarts, it can recover the unflushed WAL from local disk.

Silo runs a background reconciliation loop while each shard is open. This loop periodically scans pending concurrency request records and re-signals grant processing. It is a self-healing mechanism for cases where durable requests exist but in-memory notifications were missed (for example, due to a crash between separate write/notify phases).

The concurrency_reconcile_interval_ms option is optional; if omitted, Silo uses the default value of 5000 milliseconds.
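
For example, to run the reconciliation scan every second instead of every five (an illustrative value; tune it to your workload):

```toml
[database]
concurrency_reconcile_interval_ms = 1000
```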

Silo uses SlateDB as its embedded storage engine. You can configure SlateDB-specific options via the [database.slatedb] section. All SlateDB configuration options are passed directly to SlateDB, so you can refer to the SlateDB Configuration Documentation for the full list of available options.

```toml
[database]
backend = "gcs"
path = "gs://my-bucket/silo/%shard%"

[database.slatedb]
flush_interval = "100ms"
l0_sst_size_bytes = 67108864
l0_max_ssts = 8

[database.slatedb.compactor_options]
poll_interval = "5s"
max_sst_size = 1073741824
```

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `flush_interval` | duration | `"100ms"` | How often to flush the memtable to SST files |
| `l0_sst_size_bytes` | number | `67108864` | Target size for L0 SST files (64 MB default) |
| `l0_max_ssts` | number | `8` | Maximum number of L0 SSTs before compaction triggers |
| `max_unflushed_bytes` | number | `536870912` | Maximum unflushed data in memory (512 MB default) |

Configure compaction behavior via [database.slatedb.compactor_options]:

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `poll_interval` | duration | `"5s"` | How often to check for compaction work |
| `max_sst_size` | number | `1073741824` | Maximum size of compacted SST files (1 GB default) |
| `max_concurrent_compactions` | number | `4` | Maximum concurrent compaction jobs |
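
For example, to cap compaction parallelism on a small node (an illustrative value, not a recommendation):

```toml
[database.slatedb.compactor_options]
max_concurrent_compactions = 2
```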

Configure garbage collection via [database.slatedb.garbage_collector_options]:

```toml
[database.slatedb.garbage_collector_options.manifest_options]
interval = "300s"
min_age = "86400s"

[database.slatedb.garbage_collector_options.wal_options]
interval = "60s"
min_age = "60s"
```

You can specify only the SlateDB options you want to customize—unspecified options will use SlateDB’s defaults. For example, to only configure the flush interval and object store cache:

```toml
[database.slatedb]
flush_interval = "1ms"

[database.slatedb.object_store_cache_options]
root_folder = "/var/silo-cache"
cache_puts = true
```

All other SlateDB settings (like l0_sst_size_bytes, manifest_poll_interval, etc.) will automatically use their default values. This allows you to tune specific parameters without needing to specify the entire configuration.


The [coordination] section configures how Silo nodes discover each other and coordinate shard ownership in a cluster.

```toml
[coordination]
backend = "etcd"
cluster_prefix = "silo-prod"
num_shards = 8
lease_ttl_secs = 10
```

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `backend` | string | `"none"` | Coordination backend: `"none"`, `"etcd"`, or `"k8s"` |
| `cluster_prefix` | string | `"silo"` | Prefix for namespacing coordination keys/leases |
| `num_shards` | number | `8` | Total number of shards in the cluster |
| `lease_ttl_secs` | number | `10` | TTL for the membership lease in seconds. This controls how quickly the cluster detects that a node has crashed. Shard ownership leases are permanent and not affected by this TTL. |
| `advertised_grpc_addr` | string | (none) | Address other nodes use to connect to this node |
| `node_id` | string | (random UUID) | Stable node identity for this instance. If set, the node will reclaim shard leases from a previous run on startup, enabling WAL recovery after crashes. See Permanent shard leases. |

For local development or single-node deployments:

```toml
[coordination]
backend = "none"
```

In this mode, a single Silo instance owns all shards and no coordination is needed.

In clustered deployments, the advertised_grpc_addr tells other nodes how to connect to this node. This is important when:

  • You bind to 0.0.0.0 but need to advertise a specific IP
  • You’re running in Kubernetes and need to advertise the pod IP

```toml
[server]
grpc_addr = "0.0.0.0:7450" # Bind to all interfaces

[coordination]
backend = "k8s"
# Advertise the pod IP (injected via the Downward API)
advertised_grpc_addr = "${POD_IP}:7450"
```

The [webui] section configures the built-in web dashboard.

```toml
[webui]
enabled = true
addr = "127.0.0.1:8080"
```

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | bool | `true` | Enable the web UI server |
| `addr` | string | `"127.0.0.1:8080"` | Address and port for the web UI |

The web UI provides:

  • Cluster overview and health status
  • Queue inspection and job browsing
  • SQL query interface for debugging
  • Configuration viewer

The [metrics] section configures the Prometheus metrics endpoint.

```toml
[metrics]
enabled = true
addr = "127.0.0.1:9090"
```

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | bool | `true` | Enable the Prometheus metrics endpoint |
| `addr` | string | `"127.0.0.1:9090"` | Address and port for the metrics server |

Metrics are exposed in Prometheus format at /metrics. See the Observability Guide for available metrics.


The [logging] section configures log output format.

```toml
[logging]
format = "json"
```

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `format` | string | `"text"` | Log format: `"text"` (human-readable) or `"json"` (structured) |

Use json format for production deployments to enable log aggregation and analysis.


The [gubernator] section configures integration with Gubernator for distributed rate limiting.

```toml
[gubernator]
address = "http://gubernator:9991"
coalesce_interval_ms = 5
max_batch_size = 100
connect_timeout_ms = 5000
request_timeout_ms = 10000
```

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `address` | string | (none) | Gubernator server URL. If not set, rate limiting is disabled. |
| `coalesce_interval_ms` | number | `5` | Max time to wait before sending a batch |
| `max_batch_size` | number | `100` | Max requests to batch together |
| `connect_timeout_ms` | number | `5000` | Connection timeout in milliseconds |
| `request_timeout_ms` | number | `10000` | Request timeout in milliseconds |

The [tenancy] section enables multi-tenant features.

```toml
[tenancy]
enabled = true
```

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | bool | `false` | Enable multi-tenancy support |

Silo supports environment variable substitution in configuration values using shell-like syntax:

  • ${VAR} - Expands to the value of VAR, or empty string if not set
  • ${VAR:-default} - Expands to the value of VAR, or "default" if not set

For example:

```toml
[database]
# If DATABASE_PATH is set to "/data/silo", this becomes "/data/silo/%shard%"
# If DATABASE_PATH is not set, this becomes "/var/lib/silo/%shard%"
path = "${DATABASE_PATH:-/var/lib/silo}/%shard%"
```
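
The ${VAR:-default} form mirrors standard shell parameter expansion, so you can check the resulting value directly in a shell:

```shell
# Unset: the default applies
unset DATABASE_PATH
echo "${DATABASE_PATH:-/var/lib/silo}/%shard%"   # prints /var/lib/silo/%shard%

# Set: the variable's value wins
DATABASE_PATH=/data/silo
echo "${DATABASE_PATH:-/var/lib/silo}/%shard%"   # prints /data/silo/%shard%
```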

Substituting pod IPs into advertised_grpc_addr in Kubernetes


You can use environment variable substitution in Kubernetes to inject values via the Downward API. This is necessary for dynamic values that aren’t known in advance, such as advertised_grpc_addr:

```toml
[coordination]
# Inject pod IP from the Kubernetes Downward API
advertised_grpc_addr = "${POD_IP}:7450"
# Use a default if the env var isn't set
cluster_prefix = "${CLUSTER_NAME:-silo-default}"
```

For example, you can pass the POD_IP environment variable to your Silo pod via the Downward API like so:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: silo
spec:
  containers:
    - name: silo
      image: ghcr.io/gadget-inc/silo:latest
      env:
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
      args:
        - "-c"
        - "/etc/silo/config.toml"
      volumeMounts:
        - name: config
          mountPath: /etc/silo
      # ...
```