# GitHub Actions scheduler
## Purpose
Use GitHub Actions as the agent-invokable scheduling substrate: schedules declared as `on: schedule` in workflow files on the default branch, executed on ephemeral self-hosted runners with just-in-time (JIT) configuration tokens, credentialed via OIDC federation to cloud targets where applicable. Closes the structural gap where seven rubric criteria — `PL5-signal-driven-tasks`, `PL2-test-quality`, `PL2-ui-test-coverage`, `PL2-load-stress-testing`, `PL4-release-strategy`, `PL5-pipeline-reliability`, `PL5-outcome-input-loop` — silently presupposed scheduling infrastructure that wasn't named (resolved in rubric v0.18 by tightening `PL5-signal-driven-tasks`'s level-2 anchor to make the scheduling prerequisite explicit).
The recipe is substrate-specific by design. Earlier deliberation considered a substrate-agnostic framing, but the structure that matters — delay tail, public-repo self-hosted runner prohibition, OIDC federation shape, workflow-file lifecycle mechanics — is specific enough to GitHub Actions that a generic framing would hide the parts that matter. Additional substrates (Temporal, Cloudflare Cron, pg_cron, etc.) warrant their own sibling recipes if and when a workload forces the question; see `research/scheduler-substrate-github-actions.md` for the boundary analysis that justifies this specialisation.
## Architecture
The scheduler is a GitHub repository running GitHub Actions, configured as follows:
- Schedules declared in workflow files on the default branch using `on: schedule` with POSIX five-field cron syntax. Minimum cadence 5 minutes. Non-round-minute offsets (`17 * * * *`) are preferred over round-hour expressions (`0 * * * *`) to avoid the platform's peak-queue delay tail.
- Ephemeral self-hosted runners with the `--ephemeral` flag, provisioned via just-in-time (JIT) configuration tokens from `POST /orgs/{org}/actions/runners/generate-jitconfig` (or the repo-scoped variant). Every runner processes exactly one job and de-registers; no state persists between runs. On Kubernetes, Actions Runner Controller (ARC) is the reference implementation and passes `--ephemeral` automatically.
- Agent tool surface via the `gh` CLI (shell access) or a thin MCP wrapper. Canonical operations map onto native GitHub Actions mechanisms:

  | Operation | Mechanism |
  |---|---|
  | Create | PR adding a workflow file with `on: schedule` on the default branch |
  | Edit | PR modifying the cron expression in the workflow file |
  | Cancel | `gh workflow disable <id>` (pause) or file deletion (permanent) |
  | List | `gh workflow list` filtered to workflows using `on: schedule` |
  | Status | `gh run list --workflow <id> --event schedule` |
  | Ad-hoc fire | `gh workflow run <id>` (requires `workflow_dispatch` in the workflow) |
  | Cancel run | `gh run cancel <run-id>` (or `rerun-failed-jobs` for recovery) |

- Credentials via OIDC federation for cloud targets (AWS STS, GCP Workload Identity Federation, Azure federated identity, HCP, Databricks). Workflows declare `permissions: id-token: write` and exchange GitHub's short-lived JWT for cloud credentials at fire time. No long-lived secrets stored in repo secrets, the scheduler, or the runner host.
- For write paths, composition with GitOps JIT privilege elevation is load-bearing: the PR that creates a scheduled writer routes through the same elevation gate as code changes, and the scheduled job itself elevates at fire time rather than carrying standing write credentials.
- Log forwarding from ephemeral runners to external storage (Loki, CloudWatch, S3, vendor) — non-optional because runner-local logs evaporate on de-registration and silent-failure investigation is otherwise impossible.
- Observability via the `GET /repos/{owner}/{repo}/actions/runs` endpoints and the workflow-run UI; structured logs queryable through the project's existing observability surface (rubric `PL3-agent-queryability`) once forwarding is in place.
- Job metadata conventions. Each scheduled workflow file carries metadata as YAML comments parsed by agent tooling and a dedicated sweeper workflow:
  - `# owner:` — agent or human slug (reinforces the native `actor` context).
  - `# expires:` — TTL date; past this date the sweeper disables or deletes the workflow.
  - `# risk-tier: low | medium | high` — feeds branch-protection tiers; high-risk workflow-file changes require additional reviewers on create and edit.
  - `# tags:` — list form, for aggregation and querying.
  - `# cost-budget:` — per-run or per-month ceiling; the sweeper alerts when exceeded.

  These are repo-level conventions (not platform features); the sweeper enforces them on the scheduler's own cadence. Without them, workflow-file proliferation is latent — see Failure modes.
Substrate limits to engineer around: 5-minute minimum cadence, ±30-minute delay tail under platform load, 60-day inactivity disable on public repos (N/A for private), plan-level concurrent-job ceilings (20 Free / 40 Pro / 60 Team / 500 Enterprise), and the `GITHUB_TOKEN` 1,000 req/hr/repo rate limit for API-chatty jobs. Workloads outside these boundaries need a different substrate — see the evaluation research for the full suitability envelope.
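Tying these conventions together, a minimal scheduled workflow might look like the following sketch. The runner labels, metadata values, role ARN, and script path are illustrative assumptions, and `aws-actions/configure-aws-credentials` is one of several OIDC exchange actions:

```yaml
# owner: example-agent             # illustrative metadata values, per the
# expires: 2026-06-30              # job-metadata conventions above
# risk-tier: low
# tags: [nightly, mutation-testing]
# cost-budget: 30min/run
name: nightly-mutation-tests
on:
  schedule:
    - cron: "17 3 * * *"           # off-round-minute offset to dodge the peak-queue delay tail
  workflow_dispatch: {}            # enables ad-hoc fire via `gh workflow run`
permissions:
  id-token: write                  # required for the OIDC credential exchange
  contents: read
jobs:
  run:
    runs-on: [self-hosted, ephemeral]   # hypothetical labels; ARC supplies its own
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/scheduler-example   # hypothetical role
          aws-region: us-east-1
      - run: ./scripts/run-mutation-tests.sh   # job logic lives in scripts, not in the YAML
```

Note the job logic is a one-line script invocation: the YAML stays a trigger wrapper, which is what makes the substrate-swap mitigation under Failure modes workable.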
## Criteria advanced
- `PL5-signal-driven-tasks` — direct unlock. Rubric v0.18's level-2 anchor explicitly caps this criterion at level 1 without an agent-invokable scheduler. Deploying this recipe removes the cap; level 2 becomes reachable on the criterion's own merits (reactive-source coverage still needs its own work).
- `PL2-test-quality` — enables "mutation testing run periodically, not per-PR" (the level-2 anchor's explicit language). Without scheduling, mutation testing is either per-PR (too expensive) or ad-hoc (signal quality collapses).
- `PL2-ui-test-coverage` — enables "coverage across critical flows, run daily." Without scheduling, the daily run is aspirational.
- `PL2-load-stress-testing` — enables "run on production-mirrored env, scheduled." Without scheduling, load tests run ad-hoc and drift out of representativeness.
- `PL4-release-strategy` — enables metric-gated stage promotion over time windows rather than immediate-or-never promotion. Caveat: the ±30-minute delay tail makes tight release-window promotion marginal on this substrate; sub-window-sensitive release strategies need a tighter scheduler.
- `PL5-pipeline-reliability` — enables self-healing pipelines that retry, backfill, or alert on schedule drift. Without scheduling, self-healing is reactive-only.
- `PL5-outcome-input-loop` — enables "metric thresholds trigger automated next-cycle tasks" on time windows, not just immediate-event triggers.
On all criteria except PL5-signal-driven-tasks, this recipe is an unlock / prerequisite rather than a full mechanism — each criterion still needs its own domain work (mutation tooling, UI test authoring, load-test rigs, canary promotion policy). But without the scheduler, each is structurally blocked from reaching level 2.
## Prerequisites
- `PL3-structured-state-read` ≥ 2 (structured state read access). Jobs that observe state need read access to the state they're observing. Scheduled jobs without queryable state can only do blind work.
- `PL3-agent-queryability` ≥ 2 (agent queryability). Jobs that react to telemetry (health checks, finding-rate trends) need the same queryability the agent has. Without this, jobs can act but can't inspect.
- `PL5-pipeline-reliability` ≥ 2 (pipeline reliability). Flaky pipelines compound pain catastrophically when scheduled work is layered on top — the schedule keeps firing, failures accumulate, alerting drowns. Don't deploy this recipe onto an unstable pipeline.
- Implicit: GitHub as the project's VCS, with Actions enabled. Projects not on GitHub need a different scheduling substrate; this recipe does not cover that case. Projects on GitHub where Actions is disabled by the organisation have no path here without that policy change.
## Failure modes
- Delay-tail-sensitive downstream consumers. A consumer that assumes the schedule fires within a minute of the stated time breaks when it fires 20–40 minutes later. Mitigation: treat scheduler timing as best-effort; downstream consumers tolerate the drift or use a different substrate; document the drift budget in the workflow file itself.
- 60-day inactivity disable on public repos. Scheduled workflows auto-disable after 60 days with no repository activity. A reported bug extends the disablement to `on: push` / `on: pull_request` paths on the same workflow file. Mitigation: prefer private repos for scheduled agent work; if public, adopt a keepalive workflow or synthetic commit cadence — but recognise these are brittle and themselves affected by the disablement rule.
- Silent failure with lost logs. Ephemeral runners lose their local logs on de-registration. Without external log forwarding, scheduled jobs fail, no one notices, and trust in the signal erodes. Mitigation: log forwarding to external storage is non-optional; alerting on job failure is non-optional; observability on the schedule itself (is it firing? is the runner registering?) is distinct from observability on its outputs.
- `GITHUB_TOKEN` rate-limit exhaustion. Scheduled jobs chatty against the GitHub API (listing runs, posting statuses, updating issues) exhaust the 1,000 req/hr/repo budget quickly. Mitigation: purpose-scoped PATs or GitHub App tokens for chatty jobs; batch API calls; cache listings within the job.
- Public-repo self-hosted runner compromise. Public repos with self-hosted runners are vulnerable to fork-PR attacks that execute arbitrary code on the runner host. GitHub's own guidance is effectively prohibitive: "Self-hosted runners should almost never be used for public repositories." Mitigation: do not deploy this recipe on public repos with self-hosted runners; private repos only, or GitHub-hosted runners for public-repo scheduling.
- Credentials at rest on the runner host. A persistent self-hosted runner with secrets on disk becomes a compounding attack surface. Mitigation: ephemeral + JIT is non-optional; OIDC federation for cloud credentials; environment secrets with required reviewers for non-federated targets.
- Privilege escalation via scheduled writes. If the agent can schedule arbitrary writes, the GitOps JIT elevation gate is bypassed by timing — schedule the write, come back later. Mitigation: scheduled writes elevate at fire time, not schedule time; the scheduled job opens a PR that the elevation gate reviews, rather than executing the write directly.
- Timezone / DST bugs. Cron expressions behave surprisingly around DST boundaries; GitHub advances schedules forward across skipped hours but fall-back creates potential duplicate fires. Mitigation: schedules expressed and reasoned about in UTC; DST-affected hours (roughly 01:00–03:00 in transitioning zones) avoided where possible; tests around DST boundaries explicit.
- Workflow-file proliferation. The agent creates scheduled workflows and never removes them; the repo accumulates abandoned workflow files over months. GitHub Actions has no native TTL on workflow files. Mitigation: a required `# expires:` metadata comment on every scheduled workflow (see Architecture → Job metadata conventions); a dedicated sweeper workflow processes the comment and disables or deletes expired entries; quarterly review of scheduled workflows for drift and relevance.
- Runaway jobs spawning more jobs. A scheduled job that adds further scheduled jobs (by committing new workflow files) can cascade. GitHub Actions does not cap chain depth or spawn rate natively. Mitigation: PR review on any workflow-file additions (standard code review handles this); per-agent compute budget caps on the runner host; rate limiting on the ingestion surface that writes workflow files.
- Scope regression on substrate swap. Schedules expressed as GitHub workflow files need a migration path if this recipe is ever replaced by a different substrate (Temporal, Cloudflare Cron, etc.). Mitigation: keep the agent tool surface thin (the `gh`-CLI wrapper or MCP) and job logic in scripts invoked by the workflow — the workflow YAML is the trigger wrapper, not the mechanism, so only the wrapper changes on substrate swap.
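Keeping the tool surface thin can be as small as a table that maps canonical operations onto `gh` argument vectors, so a substrate swap replaces only the table. A sketch under stated assumptions — the operation names and `invoke` helper are illustrative, not an existing API:

```python
import subprocess

# Canonical scheduler operations mapped onto `gh` CLI argument vectors.
# Only this table changes on a substrate swap; job logic stays in scripts.
OPERATIONS = {
    "list":       lambda: ["gh", "workflow", "list"],
    "status":     lambda wf: ["gh", "run", "list", "--workflow", wf, "--event", "schedule"],
    "cancel":     lambda wf: ["gh", "workflow", "disable", wf],
    "fire":       lambda wf: ["gh", "workflow", "run", wf],
    "cancel_run": lambda run_id: ["gh", "run", "cancel", run_id],
}

def build_command(op: str, *args: str) -> list[str]:
    """Return the argv for a canonical operation; raise on unknown ops."""
    try:
        return OPERATIONS[op](*args)
    except KeyError:
        raise ValueError(f"unknown operation: {op}") from None

def invoke(op: str, *args: str) -> str:
    """Execute the operation via the gh CLI (requires an authenticated gh)."""
    result = subprocess.run(build_command(op, *args),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Create and Edit are deliberately absent: those are PR-shaped operations that go through the normal code-review path, not through a direct tool call.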
## Cost estimate
Medium. First deployment: 1–2 engineer-weeks including runner-host substrate (ARC on K8s, or scripted VMs with systemd), OIDC-federation wiring for at least one cloud target, log-forwarding pipeline, `gh` CLI wrapper or MCP surface, and a policy layer for write-path schedules. Per-project incremental cost drops sharply after the first deployment — the runner infrastructure and OIDC-federation template are reusable. Ongoing maintenance is moderate: runner-host upkeep, log-storage costs, quarterly review of scheduled workflows for drift and relevance.
Compute costs are substrate-dependent (K8s cluster, VM fleet, physical hardware). GitHub Actions minute billing is zero on self-hosted runners regardless of repo visibility.
## Open design questions
These gate promotion from proposed to proven. Each needs an answer from the first integration.
- Runner-host substrate. ARC on Kubernetes (if K8s is already in play) vs. scripted VMs with systemd (simpler, fewer moving parts, weaker at scale) vs. a managed runner service. The first integration picks one and documents the trade-off.
- Log-forwarding destination. Loki / CloudWatch / S3 / vendor product. Non-optional but not yet picked; blocks the silent-failure mitigation.
- Action-type policy. What's allowed in a scheduled workflow step? Options span a risk gradient:
  - Pre-registered playbooks only — narrow, safe, every job type manually added.
  - MCP tool invocations — broader, bounded by tool registration discipline.
  - Claude Code remote-agent runs — broadest leverage, highest risk.
  - Arbitrary shell steps — GitHub's default, weakest containment.

  Pick an answer or establish tiered allow-lists by risk class.
- Human-in-the-loop for high-risk schedules. Workflow-file changes go through standard PR review. Is that sufficient, or do production-touching / high-cadence schedules need additional approval to schedule (not just to execute)? Risk-tier labels on workflow files with matching branch-protection rules are a candidate answer.
- Observability loop into signal-driven tasks. Scheduled jobs produce signal (scan results, test failures, metric alerts); that signal must feed `PL5-signal-driven-tasks` for the compounding loop to close. Direct MCP integration from the workflow? Filesystem drop + the Ingestion as PR recipe? Still unresolved.
- Sweeper implementation. The Job-metadata-conventions architecture names the `# expires:` mechanism; the sweeper that processes it on each scheduler cadence is not yet written. A first-integration concern rather than a design question, but noted here until there's a reference implementation.
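As a reference point for that first integration, the expiry check itself is small. A sketch assuming the `# expires:` comment carries an ISO date — the comment grammar and function name are assumptions, not a settled convention:

```python
import re
from datetime import date

# Matches a metadata comment line like "# expires: 2026-06-30".
EXPIRES_RE = re.compile(r"^#\s*expires:\s*(\d{4}-\d{2}-\d{2})\s*$", re.MULTILINE)

def is_expired(workflow_text: str, today: date) -> bool:
    """True if the workflow's `# expires:` date is in the past.

    Workflows missing the comment are treated as expired, so the sweeper
    flags them for metadata backfill rather than letting them run untracked.
    """
    m = EXPIRES_RE.search(workflow_text)
    if m is None:
        return True  # missing metadata: flag, don't silently allow
    return date.fromisoformat(m.group(1)) < today

# The sweeper itself would walk .github/workflows/, call is_expired on each
# file, and run `gh workflow disable` on hits — that enforcement half is the
# unresolved part, not this check.
```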
## Related recipes
- Composes with: GitOps JIT privilege elevation — scheduled jobs requiring writes route through the elevation mechanism at fire time. The PR-shaped `create_job` operation composes naturally with branch-protection-based elevation gates on workflow-file additions.
- Composes with: Bot-token credential tenancy — non-OIDC targets (third-party APIs, legacy systems) that need long-lived-ish credentials should use bot tokens scoped to the scheduler's service identity, not user PATs.
- Composes with: Indexed per-entry registry — workflow-run history is a corpus worth indexing; querying "what scheduled jobs have ever run against production" is the analytic value of treating runs as structured data.
- Composes with: Ingestion as PR — signals produced by scheduled jobs (scan reports, test failures) ingested through the PR-shaped ingestion path close the loop back into `PL5-signal-driven-tasks`.
- Prerequisite for: future recipes that depend on time-based execution — automated rubric re-scoring, periodic stakeholder-context sweeps, drift detection against design specs, periodic backup / migration validation. None of these recipes have been written yet; several are latent in the repo backlog.
- Alternatives to: other scheduling substrates for workloads outside this recipe's envelope — Temporal, Cloudflare Workers Cron, Durable Object alarms, pg_cron, APScheduler / BullMQ, AWS EventBridge Scheduler. None evaluated yet. Workloads that fall outside the boundary (sub-5-min cadence, tight-SLA timing, public repos with privileged secrets, non-GitHub VCS) need a separate substrate evaluation; see `research/scheduler-substrate-github-actions.md` for the boundary analysis that would anchor such a comparison. Also specifically rejected: claude.ai-hosted remote triggers — they cannot reach the project's local MCP tool surface, forcing a compromise between agent-invocability and boundary discipline that this recipe is designed to preserve.
## References
- GitHub Actions as scheduling substrate — full substrate evaluation: scheduling primitive, reliability, ephemeral runners, REST API surface, credential model, suitability envelope, negative findings.
- GitHub Actions platform docs — primary-source extracts for schedule event, ephemeral runners, security hardening, REST API, usage limits.