Cancel, Restart, Delete
Jobs in Silo can be cancelled to stop processing, restarted to retry after failure, deleted to permanently remove them from the system, or expedited to run immediately if they were scheduled for the future.
Cancelling Jobs
Cancellation requests that a job stop processing. The behavior depends on the job’s current state:
- Scheduled jobs: The job is immediately marked as Cancelled and will never run.
- Running jobs: A cancellation flag is set, and the worker discovers it on its next heartbeat. The worker’s task should then stop processing and report a cancelled outcome. The stop is cooperative: the JS task handler can take arbitrarily long to react, for as long as it continues to hold the lease.
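Because the handler decides when to stop, a handler that awaits one long operation may not notice a cancellation for some time. One way to bound that delay is to race the work against the handler's cancellation signal (described under How Workers Discover Cancellation). The sketch below assumes that signal is a standard AbortSignal; the helper name raceWithSignal is hypothetical and not part of the Silo client.

```typescript
// Hypothetical helper, not part of the Silo client: settles with the work's
// result, or rejects as soon as the signal aborts, whichever happens first.
function raceWithSignal<T>(work: Promise<T>, signal: AbortSignal): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const onAbort = () => reject(new Error("cancelled"));
    const cleanup = () => signal.removeEventListener("abort", onAbort);

    if (signal.aborted) {
      onAbort();
    } else {
      signal.addEventListener("abort", onAbort, { once: true });
    }

    // Always observe the work so a late rejection is never left unhandled.
    work.then(
      (value) => {
        cleanup();
        resolve(value);
      },
      (error) => {
        cleanup();
        reject(error);
      },
    );
  });
}
```

The helper only stops waiting; the underlying operation keeps running unless it also receives the signal, so pass the signal to the operation itself whenever its API accepts one.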
Using the Client
You can cancel a job directly using the client:

    import { SiloGRPCClient, JobNotFoundError } from "@silo-ai/client";

    const client = new SiloGRPCClient({
      servers: ["localhost:7450"],
    });

    // Cancel a job by ID
    try {
      await client.cancelJob("job-123");
      console.log("Job cancelled");
    } catch (error) {
      if (error instanceof JobNotFoundError) {
        console.log("Job not found");
      }
      throw error;
    }

If you’re using tenancy, include the tenant:

    await client.cancelJob("job-123", "customer-456");

Using Job Handles
Job handles provide a convenient cancel() method:

    // From enqueue
    const handle = await client.enqueue({
      payload: { task: "process-data" },
    });

    // Cancel anytime later
    await handle.cancel();

Or create a handle for an existing job:

    // Create a handle from a known job ID
    const handle = client.handle("job-123");
    await handle.cancel();

    // Or with a tenant
    const tenantHandle = client.handle("job-456", "customer-123");
    await tenantHandle.cancel();

Cancellation Errors
The cancelJob() method can throw several errors:
| Error | Condition |
|---|---|
| JobNotFoundError | The job ID does not exist |
| RpcError (FAILED_PRECONDITION) | The job is already cancelled |
| RpcError (FAILED_PRECONDITION) | The job is already in a terminal state (Succeeded or Failed) |
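Callers that only care that a job ends up stopped often treat the FAILED_PRECONDITION rows in the table as a no-op. The following is a minimal sketch of that pattern, not a definitive implementation. It takes the cancel call and the not-found check as functions so that it makes no assumptions about the client's classes, and it reads the status name from the error's code property in the same way the restart and expedite error examples in this guide do. The names tryCancel and CancelOutcome are hypothetical.

```typescript
// Hypothetical names: CancelOutcome and tryCancel are not part of the Silo client.
type CancelOutcome = "cancelled" | "not_found" | "already_stopped";

async function tryCancel(
  cancel: () => Promise<void>,
  isNotFound: (error: unknown) => boolean,
): Promise<CancelOutcome> {
  try {
    await cancel();
    return "cancelled";
  } catch (error) {
    if (isNotFound(error)) return "not_found";
    // Assumption: the RPC error carries the status name on `code`.
    const code = (error as { code?: unknown } | null)?.code;
    if (code === "FAILED_PRECONDITION") return "already_stopped";
    throw error; // anything else is unexpected, so surface it
  }
}
```

With the real client this could be called as tryCancel(() => client.cancelJob("job-123"), (e) => e instanceof JobNotFoundError).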
How Workers Discover Cancellation
The SiloWorker handles heartbeats automatically. When a job is cancelled, the worker detects it on the next heartbeat and aborts the cancellationSignal passed to your handler. Your handler should check this signal and stop work:

    const worker = new SiloWorker({
      client,
      workerId: "worker-1",
      taskGroup: "data-processing",
      handler: async (ctx) => {
        for (const item of ctx.task.payload.items) {
          // Check for cancellation between units of work
          if (ctx.cancellationSignal.aborted) {
            return { type: "cancelled" };
          }
          await processItem(item);
        }
        return { type: "success", result: { processed: true } };
      },
    });

You can also pass the signal to APIs that accept AbortSignal:

    handler: async (ctx) => {
      const response = await fetch(ctx.task.payload.url, {
        signal: ctx.cancellationSignal,
      });
      // ...
    }

Deleting Jobs
Deletion permanently removes a job and all its data from Silo. Unlike cancellation, deletion completely erases the job from storage.
Using the Client

    // Delete a job by ID
    await client.deleteJob("job-123");

    // With tenant
    await client.deleteJob("job-456", "customer-123");

Using Job Handles

    const handle = client.handle("job-123");
    await handle.delete();

Deletion Requirements
A job can only be deleted once it is no longer in progress. Deleting a job that is still Scheduled or Running fails with the error listed under Deletion Errors.
Deletion Errors
| Error | Condition |
|---|---|
| JobNotFoundError | The job ID does not exist |
| RpcError (INTERNAL) | The job is still in progress (Scheduled or Running) |
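The in-progress condition in the table means that a delete issued right after cancelling a running job can fail, because the job stays in progress until its worker acknowledges the cancellation. The sketch below shows one way to cancel, poll until the job leaves the in-progress states, and only then delete. The DeletableJob interface, the string status names, and the helper cancelThenDelete are hypothetical stand-ins for a job handle and the client's JobStatus values. The helper assumes the job is still in progress when it is called, since cancelling an already finished job raises an error.

```typescript
// Hypothetical shapes for illustration; a real handle comes from client.handle().
type Status = "Scheduled" | "Running" | "Succeeded" | "Failed" | "Cancelled";

interface DeletableJob {
  cancel(): Promise<void>;
  getStatus(): Promise<Status>;
  delete(): Promise<void>;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Cancels the job, waits until it is no longer in progress, then deletes it.
// Returns false if the job is still in progress after maxPolls checks.
async function cancelThenDelete(
  job: DeletableJob,
  pollMs = 500,
  maxPolls = 20,
): Promise<boolean> {
  await job.cancel();
  for (let i = 0; i < maxPolls; i++) {
    const status = await job.getStatus();
    if (status !== "Scheduled" && status !== "Running") {
      await job.delete();
      return true;
    }
    await sleep(pollMs);
  }
  return false;
}
```

Bounding the number of polls matters because a handler may hold its lease for a long time before reporting the cancelled outcome.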
To delete a running job, first cancel it, then delete:

    import { JobStatus } from "@silo-ai/client";

    const handle = client.handle("job-123");

    // First cancel the job
    await handle.cancel();

    // Wait for cancellation to complete if needed
    // (Running jobs need time for the worker to acknowledge)
    const status = await handle.getStatus();
    if (status === JobStatus.Cancelled) {
      await handle.delete();
    }

Restarting Jobs
Restarting allows you to re-run a job that has stopped, either because it was cancelled or because it failed after exhausting its retries. The job is re-queued with a fresh retry counter, giving it another full set of attempts to complete.
When to Restart
Restart is useful in several scenarios:
- Accidental cancellation: A job was cancelled by mistake and needs to run
- Transient failures: A job failed due to temporary issues (service outage, rate limits) that have since been resolved, and an operator wants to manually give it more retries
- Manual retry: You want to give a failed job another attempt outside of its automatic retry policy because its outcome matters enough to justify a manual attempt
Using the Client
You can restart a job directly using the client:

    import { SiloGRPCClient, JobNotFoundError } from "@silo-ai/client";

    const client = new SiloGRPCClient({
      servers: ["localhost:7450"],
    });

    // Restart a job by ID
    await client.restartJob("job-123");
    console.log("Job restarted and re-queued");

If you’re using tenancy, include the tenant:

    await client.restartJob("job-123", "customer-456");

Using Job Handles
Job handles provide a convenient restart() method:

    // Create a handle for an existing job
    const handle = client.handle("job-123");
    await handle.restart();

    // Or with a tenant
    const tenantHandle = client.handle("job-456", "customer-123");
    await tenantHandle.restart();

What Restart Does
When you restart a job, Silo:
- Clears the cancellation flag (if the job was cancelled)
- Creates a new task with attempt_number = 1, resetting the retry counter
- Sets the status to Scheduled, placing the job back in the queue for immediate processing
- Preserves the original job data including payload, priority, limits, and metadata
The job will be picked up by the next available worker and processed as if it were newly enqueued.
Restart Requirements
Only jobs in terminal-but-recoverable states can be restarted:
| Status | Can Restart? | Reason |
|---|---|---|
| Cancelled | ✅ Yes | Job was stopped before completion |
| Failed | ✅ Yes | Job failed but can be retried |
| Succeeded | ❌ No | Job completed successfully—nothing to retry |
| Scheduled | ❌ No | Job is already queued to run |
| Running | ❌ No | Job is currently being processed |
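The table can be expressed as a small guard that checks a job's status before attempting a restart. This is a sketch under stated assumptions: the string status names mirror the table rather than the client's JobStatus enum, and RestartableJob, canRestart, and restartIfStopped are hypothetical names.

```typescript
// Hypothetical status names mirroring the table above.
type Status = "Scheduled" | "Running" | "Succeeded" | "Failed" | "Cancelled";

// Hypothetical subset of a job handle.
interface RestartableJob {
  getStatus(): Promise<Status>;
  restart(): Promise<void>;
}

// True only for the two stopped states the table marks as restartable.
function canRestart(status: Status): boolean {
  return status === "Cancelled" || status === "Failed";
}

// Restarts the job if its current status allows it; returns whether it did.
async function restartIfStopped(job: RestartableJob): Promise<boolean> {
  const status = await job.getStatus();
  if (!canRestart(status)) return false;
  await job.restart();
  return true;
}
```

The status can change between the check and the restart call, so the guard reduces avoidable errors but does not replace error handling around restart().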
Restart Errors
The restartJob() method can throw several errors:
| Error | Condition |
|---|---|
| JobNotFoundError | The job ID does not exist |
| RpcError (FAILED_PRECONDITION) | Job already succeeded (truly terminal) |
| RpcError (FAILED_PRECONDITION) | Job is still in progress (Scheduled or Running) |

    import { RpcError } from "@protobuf-ts/runtime-rpc";

    try {
      await handle.restart();
      console.log("Job restarted successfully");
    } catch (error) {
      if (error instanceof RpcError && error.code === "FAILED_PRECONDITION") {
        // Check the message to understand why
        console.log("Cannot restart job:", error.message);
        // e.g., "job already succeeded" or "job is still in progress"
      }
      throw error;
    }

Restarting Failed Jobs
A common pattern is to monitor for failed jobs and restart them after fixing the underlying issue:

    import { JobStatus } from "@silo-ai/client";

    // Check if a job failed
    const handle = client.handle("job-123", "customer-456");
    const status = await handle.getStatus();

    if (status === JobStatus.Failed) {
      // Get job details to understand the failure
      const job = await handle.getJob();
      console.log(`Job failed at ${job.statusChangedAtMs}`);

      // After fixing the issue, restart the job
      await handle.restart();
      console.log("Job restarted");
    }

Restarting Cancelled Jobs
If a job was cancelled by mistake, you can restart it to allow processing:

    import { JobStatus } from "@silo-ai/client";

    const handle = client.handle("job-123");
    const status = await handle.getStatus();

    if (status === JobStatus.Cancelled) {
      // Restart the cancelled job
      await handle.restart();
      console.log("Cancelled job has been restarted");
    }

Expediting Jobs
Expediting allows you to make a future-scheduled job or attempt run immediately, skipping any scheduled delay. This is useful for pulling forward jobs that were scheduled for later or for bypassing retry backoff delays.
When to Expedite
Expedite is useful in several scenarios:
- User-initiated urgency: A user requests immediate processing of a scheduled job
- Skip retry delays: A job is waiting for retry backoff, but you’ve fixed the issue and want it to run now
- Testing scheduled jobs: You want to test a future-scheduled job without waiting
- Priority escalation: Business needs change and a scheduled job needs to run immediately
Using the Client
You can expedite a job directly using the client:

    import { SiloGRPCClient, JobNotFoundError } from "@silo-ai/client";

    const client = new SiloGRPCClient({
      servers: ["localhost:7450"],
    });

    // Expedite a job by ID
    try {
      await client.expediteJob("job-123");
      console.log("Job expedited and ready to run immediately");
    } catch (error) {
      if (error instanceof JobNotFoundError) {
        console.log("Job not found");
      }
      throw error;
    }

If you’re using tenancy, include the tenant:

    await client.expediteJob("job-123", "customer-456");

Using Job Handles
Job handles provide a convenient expedite() method:

    // Create a handle for an existing job
    const handle = client.handle("job-123");
    await handle.expedite();

    // Or with a tenant
    const tenantHandle = client.handle("job-456", "customer-123");
    await tenantHandle.expedite();

What Expedite Does
When you expedite a job, Silo:
- Finds the future-scheduled task in the task queue
- Updates the task timestamp to the current time, making it immediately ready
- Wakes up the task broker to pick up the newly available task
- Preserves all other job data including attempt number, priority, limits, and metadata
The job becomes immediately available for workers to lease and process.
Expedite Requirements
Only jobs with future-scheduled tasks can be expedited:
| Condition | Can Expedite? | Reason |
|---|---|---|
| Future-scheduled task | ✅ Yes | Task timestamp is in the future |
| Mid-retry with backoff | ✅ Yes | Retry is scheduled for future due to exponential backoff |
| Ready to run now | ❌ No | Task is already at current time or earlier |
| Running | ❌ No | Job is currently being processed |
| Terminal (Succeeded/Failed) | ❌ No | Job has finished processing |
| Cancelled | ❌ No | Job was cancelled |
| No pending task | ❌ No | Job has no task in the queue |
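The conditions in the table can be read as a single predicate over a job's status and the start time of its pending task. The sketch below is an illustrative model only: the JobSnapshot shape and the canExpedite name are hypothetical, and it assumes that a job waiting on retry backoff reports the Scheduled status, as the mid-retry example in this guide suggests. The server remains the authority on whether an expedite succeeds.

```typescript
// Hypothetical model of a job's queue state, for illustration only.
interface JobSnapshot {
  status: "Scheduled" | "Running" | "Succeeded" | "Failed" | "Cancelled";
  // Start time of the pending task in epoch milliseconds, or null if none is queued.
  pendingTaskStartAtMs: number | null;
}

// Mirrors the table: only a queued task whose start time is still in the future qualifies.
function canExpedite(job: JobSnapshot, nowMs: number): boolean {
  if (job.status !== "Scheduled") return false; // running, terminal, or cancelled
  if (job.pendingTaskStartAtMs === null) return false; // no pending task
  return job.pendingTaskStartAtMs > nowMs; // a task that is already ready does not qualify
}
```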
Expedite Errors
The expediteJob() method can throw several errors:
| Error | Condition |
|---|---|
| JobNotFoundError | The job ID does not exist |
| RpcError (FAILED_PRECONDITION) | Job is currently running |
| RpcError (FAILED_PRECONDITION) | Job is terminal (Succeeded or Failed) |
| RpcError (FAILED_PRECONDITION) | Job is cancelled |
| RpcError (FAILED_PRECONDITION) | Task is already ready to run (not future-scheduled) |
| RpcError (FAILED_PRECONDITION) | Job has no pending task in queue |

    import { RpcError } from "@protobuf-ts/runtime-rpc";

    try {
      await handle.expedite();
      console.log("Job expedited successfully");
    } catch (error) {
      if (error instanceof RpcError && error.code === "FAILED_PRECONDITION") {
        // Check the message to understand why
        console.log("Cannot expedite job:", error.message);
        // e.g., "job is already running" or "task is already ready to run"
      }
      throw error;
    }

Expediting Scheduled Jobs
The most common use case is expediting jobs that were enqueued with a future startAtMs:

    import { JobStatus } from "@silo-ai/client";

    // Enqueue a job to run 1 hour from now
    const handle = await client.enqueue({
      payload: { task: "process-data" },
      startAtMs: BigInt(Date.now() + 3_600_000), // 1 hour
    });

    // Check that it's scheduled
    const status = await handle.getStatus();
    console.log(status); // JobStatus.Scheduled

    // Business needs changed - run it now!
    await handle.expedite();

    // Job is now immediately available for workers

Expediting Mid-Retry Jobs
When a job is retrying with exponential backoff, you can skip the waiting period:

    import { JobStatus } from "@silo-ai/client";

    // A job failed and is scheduled to retry in 5 minutes
    const handle = client.handle("failed-job-123");
    const status = await handle.getStatus();

    if (status === JobStatus.Scheduled) {
      // You fixed the underlying issue and want to retry immediately
      await handle.expedite();
      console.log("Retry backoff skipped - job will run now");
    }

Expediting vs Higher Priority
If you want a job to run sooner but it’s not necessarily urgent, consider using priority instead of expediting:

    // During enqueue, use higher priority
    const handle = await client.enqueue({
      payload: { task: "process-data" },
      taskGroup: "data-processing",
      priority: 0, // 0 is highest priority, processed sooner
    });

    // Expedite is for jobs that must run NOW
    // Priority is for jobs that should run SOONER

Expedite is an immediate operation: it moves a job’s scheduled start time to now. Priority adjusts ordering among jobs that are already ready to run, and it can’t be changed once a job has been enqueued.
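The difference between the two mechanisms can be illustrated with a toy queue model. This is not Silo's task broker: the QueuedTask shape, the function names, and the tie-break on start time are assumptions made for the example. It shows that priority only orders tasks that are already ready, while expediting changes when a task becomes ready and leaves its priority untouched.

```typescript
// Illustrative model only; this is not Silo's task broker.
interface QueuedTask {
  id: string;
  startAtMs: number; // when the task becomes ready
  priority: number; // 0 is the highest priority
}

// Tasks a worker could lease at nowMs, highest priority first.
// Assumption: ties are broken by earlier start time.
function readyTasks(tasks: QueuedTask[], nowMs: number): QueuedTask[] {
  return tasks
    .filter((task) => task.startAtMs <= nowMs)
    .sort((a, b) => a.priority - b.priority || a.startAtMs - b.startAtMs);
}

// Expediting moves a future task's start time to now; its priority is unchanged.
function expedite(task: QueuedTask, nowMs: number): QueuedTask {
  return task.startAtMs > nowMs ? { ...task, startAtMs: nowMs } : task;
}
```

In this model a task scheduled for the future never appears in readyTasks regardless of its priority, which is why raising priority cannot substitute for expediting.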
Next Steps
- Learn about running workers to handle job execution and cancellation
- Set up observability to monitor cancellations and failures
- Explore concurrency limits to control job execution