Keystone Implementation Spec

1. Product Scope

Keystone is a Laravel Forge-like deployment platform that runs applications and services with Docker. The v1 product is intentionally narrow:

  • Laravel is the only first-class application framework.
  • Application containers use a Keystone-managed Dockerfile based on serversideup/php with FrankenPHP.
  • Services are explicitly coded drivers, not arbitrary Docker images.
  • v1 is agentless and executes operations over SSH.
  • Docker Compose is used as the generated runtime artifact.
  • Caddy 2 is the default and only gateway for v1.
  • The Keystone database is the source of truth. Server files are generated artifacts.

V1 should make the simple path robust before adding generic Docker support, distributed agents, HA databases, edge routing, or additional frameworks.

2. Core Domain Model

Organisation

Owns users, providers, registries, applications, servers, services, and environments.

Application

A source-code project. In v1, first-class applications are Laravel repositories.

Recommended fields:

  • organisation_id
  • name
  • repository_url
  • repository_type
  • default_branch
  • deploy_key_public
  • deploy_key_private encrypted
  • deploy_key_fingerprint
  • deploy_key_installed_at nullable

Environment

The primary application deployment unit. An application has environments such as production, staging, or dev.

Recommended fields:

  • application_id
  • name
  • branch
  • status
  • scheduler_enabled
  • scheduler_target_service_id nullable
  • scheduler_mode: single or every_replica
  • build_config json

Default for Laravel environments:

  • Scheduler enabled.
  • Scheduler target is the primary web service.
  • Scheduler mode is single.

Service

Every deployable thing is represented as a Service.

Examples:

  • Laravel web runtime
  • Laravel worker runtime
  • Laravel websocket runtime
  • Caddy gateway
  • Postgres
  • Valkey
  • Future standalone services

Recommended fields:

  • organisation_id
  • environment_id nullable
  • server_id nullable, kept only as a legacy single-placement convenience; long term, placement belongs to replicas
  • name
  • category
  • type
  • version_track
  • driver_name
  • status
  • desired_replicas
  • desired_revision
  • deploy_policy
  • process_roles json
  • current_image_digest nullable
  • available_image_digest nullable
  • update_status
  • default_cpu_limit nullable
  • default_memory_limit_mb nullable
  • config json

Deploy policy defaults:

  • Laravel web: with_environment
  • Laravel worker: with_environment
  • Laravel websocket: with_environment
  • Database/cache/storage: dependency_only
  • Gateway: manual_or_on_route_change
  • Standalone services: manual

The user should not need to configure these defaults during normal setup.

ServiceReplica

A running instance of a service on a server. A service is logical; a replica is runtime placement.

Recommended fields:

  • service_id
  • server_id
  • operation_id nullable
  • container_name
  • container_id nullable
  • image_digest
  • internal_host
  • internal_port
  • public_port nullable
  • status
  • health_status
  • cpu_limit nullable
  • memory_limit_mb nullable
  • config json

Replica resource limits override service defaults. A null limit means the container is unrestricted apart from host capacity.
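
A minimal sketch of the override rule, assuming a null replica limit falls back to the service default (the spec leaves this fallback implicit). The function name effective_limit is hypothetical.

```python
from typing import Optional


def effective_limit(replica_limit: Optional[float],
                    service_default: Optional[float]) -> Optional[float]:
    """Resolve the limit to apply to one replica.

    Assumption: a null replica limit falls back to the service default.
    A None result means "unrestricted apart from host capacity".
    """
    if replica_limit is not None:
        return replica_limit
    return service_default


# memory_limit_mb examples
print(effective_limit(512, 1024))   # replica override wins
print(effective_limit(None, 1024))  # falls back to the service default
print(effective_limit(None, None))  # unrestricted
```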

ServiceSlice

A logical sub-resource inside a service. Slices belong to Service, not ServiceReplica.

Examples:

  • Database and user inside Postgres
  • Logical database or namespace inside Valkey
  • Route inside Caddy
  • Future bucket, topic, vhost, etc.

Recommended fields:

  • service_id
  • environment_id nullable
  • name
  • type
  • status
  • config json
  • credentials encrypted json nullable

Slices are not containers and should not be used for scaling. They are stable logical resources that survive service replica replacement.

EnvironmentAttachment

Connects an environment to managed service slices.

Recommended fields:

  • environment_id
  • service_id
  • service_slice_id nullable
  • role: database, cache, queue, storage, gateway, custom
  • env_prefix nullable
  • is_primary

Attachments should point to slices whenever a slice exists. For example, a Laravel environment attaches to a Postgres database/user slice, not merely to the Postgres service.

EnvironmentVariable

Represents user-defined and Keystone-managed runtime environment values.

Recommended fields:

  • environment_id
  • key
  • value encrypted
  • source: user, managed_attachment, system
  • service_slice_id nullable
  • overridable boolean

Managed values should be regenerated from attachments and slices.

3. Operations Model

Rename Deployment to Operation.

An operation is the generic audit and execution object for all state-changing work.

Operation

Recommended fields:

  • id
  • parent_id nullable
  • hash
  • kind
  • target_type
  • target_id
  • status
  • started_at
  • finished_at
  • timestamps

Operation kinds:

  • server_provision
  • service_deploy
  • replica_deploy
  • slice_provision
  • slice_configure
  • environment_deploy
  • gateway_cutover
  • config_change
  • credential_rotation

OperationStep

Rename Step to OperationStep.

Recommended fields:

  • operation_id
  • name
  • order
  • status
  • script
  • logs
  • error_logs
  • secrets encrypted json nullable
  • started_at
  • finished_at
  • timestamps

Parent-Child Operations

Environment deploys are parent operations that create child operations.

Example:

  • environment_deploy
  • child service_deploy for web
  • child replica_deploy for each web replica
  • child slice_configure for Caddy route updates
  • child gateway_cutover

Standalone service deploys and slice operations can also run independently.
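
The parent-child relationship can be sketched as a small tree. This is an illustration only: the status values (pending, running, finished, failed), the rollup rule, and the choice to nest replica_deploy under service_deploy are assumptions, not part of the spec.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Operation:
    kind: str
    target: str
    status: str = "pending"  # assumed values: pending, running, finished, failed
    parent: Optional["Operation"] = field(default=None, repr=False, compare=False)
    children: List["Operation"] = field(default_factory=list)

    def add_child(self, kind: str, target: str) -> "Operation":
        child = Operation(kind=kind, target=target, parent=self)
        self.children.append(child)
        return child

    def rollup_status(self) -> str:
        """Derive a status from the children; leaf operations report their own."""
        if not self.children:
            return self.status
        statuses = [child.rollup_status() for child in self.children]
        if "failed" in statuses:
            return "failed"
        if all(s == "finished" for s in statuses):
            return "finished"
        if all(s == "pending" for s in statuses):
            return "pending"
        return "running"


deploy = Operation("environment_deploy", "environment:production")
web = deploy.add_child("service_deploy", "service:web")
replica = web.add_child("replica_deploy", "replica:web-1")
route = deploy.add_child("slice_configure", "slice:caddy-route")
cutover = deploy.add_child("gateway_cutover", "service:caddy")
```

A standalone service deploy is simply an Operation with no parent.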

4. Server Provisioning

V1 remains agentless over SSH.

Provisioning flow:

  1. Create server through provider API.
  2. Wait for root SSH to become available.
  3. Execute provisioning script over SSH.
  4. Create Keystone management user.
  5. Install Docker Engine, Docker Compose plugin, UFW, fail2ban, and required runtime packages.
  6. Install Keystone SSH public key.
  7. Disable password login.
  8. Enable UFW with SSH open.
  9. Callback or SSH verification marks server active.

Server permanent keys are for Keystone management only. Repository deploy keys must not be permanently installed on servers.

5. Source Providers And Repository Access

V1 source support:

  • Self-hosted Gitea
  • GitHub
  • Generic Git over SSH

Repository access uses a Keystone-generated deploy key per application/repository.

V1 flow:

  1. User enters repo SSH URL.
  2. Keystone generates an ed25519 deploy key.
  3. UI shows the public key.
  4. User adds it to Gitea/GitHub as read-only.
  5. Keystone verifies access with git ls-remote.

During build operations, Keystone writes the deploy private key (stored encrypted in Keystone) into a temporary operation directory and points Git at it through GIT_SSH_COMMAND. The key is removed after the build. Repo keys are never permanently stored on target servers or builder services.
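
The temporary-key handling could look roughly like the sketch below. It runs Git locally for simplicity, whereas Keystone would execute the equivalent commands over SSH on the build host. The helper names deploy_key_env and verify_access and the "keystone-op-" directory prefix are hypothetical; the ssh options used (-i, IdentitiesOnly, StrictHostKeyChecking=accept-new) are standard OpenSSH options.

```python
import os
import shlex
import subprocess
import tempfile
from contextlib import contextmanager


@contextmanager
def deploy_key_env(private_key: str):
    """Yield an environment in which git authenticates with a temporary key.

    The key lives in a temporary directory that is deleted when the
    context exits, so it never outlives the operation.
    """
    with tempfile.TemporaryDirectory(prefix="keystone-op-") as op_dir:
        key_path = os.path.join(op_dir, "deploy_key")
        # Create the file with 0600 permissions; ssh rejects looser modes.
        fd = os.open(key_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        with os.fdopen(fd, "w") as handle:
            handle.write(private_key)
        ssh_command = (
            f"ssh -i {shlex.quote(key_path)} "
            "-o IdentitiesOnly=yes -o StrictHostKeyChecking=accept-new"
        )
        yield {**os.environ, "GIT_SSH_COMMAND": ssh_command}


def verify_access(repo_url: str, private_key: str) -> bool:
    """Return True if `git ls-remote` succeeds with the given deploy key."""
    with deploy_key_env(private_key) as env:
        result = subprocess.run(
            ["git", "ls-remote", "--heads", repo_url],
            env=env, capture_output=True, timeout=30,
        )
        return result.returncode == 0
```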

6. Registry And Build Artifacts

An external registry is required for multi-server application deployments.

Single-server deployments may build and run a local image without a registry.

Multi-server deployments must:

  1. Build once.
  2. Push the image to the configured external registry.
  3. Pull the exact same image digest on each target server.

Supported registry types:

  • Generic Docker registry
  • Gitea registry
  • GHCR
  • Docker Hub

Build Service

Building is a service capability, not a server type.

A dedicated builder is represented as a Service with category builder. If no builder service exists, Keystone may build on the target server for single-server deployments.

Build strategies:

  • target_server: build on selected target server. Valid for single-server.
  • dedicated_builder: build on builder service, then push/export artifact.
  • external_registry: pull prebuilt image from registry.

For v1:

  • Single-server default: build on target server.
  • Multi-server: require configured registry and build once.
  • Do not rebuild independently on each server.
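
The registry rule above can be expressed as a small guard. This is a sketch under stated assumptions: the function name, the return values, and the RegistryRequired exception are hypothetical, and optional registry use for single-server deployments is not modeled.

```python
class RegistryRequired(Exception):
    """Raised when a multi-server deployment has no external registry."""


def image_distribution(server_count: int, has_registry: bool) -> str:
    """Decide how one built image reaches the target servers."""
    if server_count < 1:
        raise ValueError("an environment needs at least one target server")
    if server_count == 1:
        # Single-server: build and run a local image, no registry needed.
        return "local_image"
    if not has_registry:
        raise RegistryRequired(
            "configure an external registry before deploying to multiple servers"
        )
    # Multi-server: build once, push, then pull the same digest everywhere.
    return "push_then_pull_by_digest"
```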

BuildArtifact

Recommended fields:

  • environment_id
  • commit_sha
  • image_tag
  • image_digest
  • registry_ref nullable
  • built_by_operation_id
  • built_by_service_id nullable
  • status
  • metadata json

7. Managed Laravel Runtime

V1 uses Keystone-managed Dockerfile templates only. Custom Dockerfiles are deferred.

Laravel runtime defaults:

  • Base: serversideup/php FrankenPHP image
  • PHP version configurable
  • Document root default: public
  • Health path default: /up, fallback /
  • Composer install with production defaults
  • JS build step configurable
  • Bun/Node strategy configurable

The same build artifact is used by web, worker, and websocket services. Runtime services differ by entrypoint/command.

Default topology:

  • One web service.
  • No worker service by default.
  • Scheduler enabled on the web service by default.
  • Dedicated worker service is recommended when queues are used, but created only when the user opts in.

Worker options:

  • Dedicated worker service, recommended.
  • Embedded worker in web service, allowed for low-throughput apps but not recommended for production.
  • No workers, default.

Keystone should warn when a deployed environment uses QUEUE_CONNECTION=sync, but it should not automatically create worker services.

8. Scheduler Model

Mirror Laravel Cloud's scheduler model.

Scheduler is not a standalone service by default. It is a role/capability attached to a selected web or worker service.

Defaults:

  • scheduler_enabled: true for Laravel templates.
  • scheduler_target_service_id: primary web service.
  • scheduler_mode: single.

Runtime behavior:

  • single: run schedule:run every minute on exactly one selected replica.
  • every_replica: run on each replica. This is advanced and explicit.

Keystone should enforce one scheduler runner per environment by default. Users may still use Laravel's onOneServer() for application-level safety.
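
One way to pick scheduler runners for the two modes is sketched below. The selection rule for single mode (the healthy replica with the lowest id) and the "healthy" status value are assumptions; the spec only requires that exactly one replica is chosen.

```python
from typing import Dict, List


def scheduler_runners(mode: str, replicas: List[Dict]) -> List[Dict]:
    """Return the replicas that should run `schedule:run` every minute."""
    healthy = sorted(
        (r for r in replicas if r["health_status"] == "healthy"),
        key=lambda r: r["id"],
    )
    if mode == "every_replica":
        return healthy
    if mode == "single":
        # Assumption: a deterministic pick keeps the runner stable across runs.
        return healthy[:1]
    raise ValueError(f"unknown scheduler_mode: {mode}")


replicas = [
    {"id": 3, "health_status": "healthy"},
    {"id": 1, "health_status": "unhealthy"},
    {"id": 2, "health_status": "healthy"},
]
print([r["id"] for r in scheduler_runners("single", replicas)])
print([r["id"] for r in scheduler_runners("every_replica", replicas)])
```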

9. Service Drivers

V1 services are explicitly coded drivers only. Arbitrary Docker image services are not part of the v1 happy path.

Driver contract should define:

  • service type and version track
  • default image policy
  • ports
  • volumes
  • environment schema
  • health checks
  • resource defaults
  • supported slice types
  • Compose rendering
  • operation steps
  • env var exports
  • firewall requirements
  • update behavior

V1 driver list:

  • Caddy 2 gateway
  • Laravel managed runtime using serversideup/php FrankenPHP
  • Postgres 18
  • Valkey 8

For new service deploy/update operations, use the latest minor version on the service's version track and resolve the image tag to a digest. Store the resolved digest on the operation/service/replica for reproducible rollbacks.

Do not silently update managed service images. Show updates in the UI and require an explicit service update/redeploy operation.
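
A small sketch of digest pinning and update detection follows. The repository@sha256:... reference form is standard Docker syntax; the function names and the update_status values (up_to_date, update_available) are assumptions. Resolving a tag to a digest requires a registry call and is not shown.

```python
from typing import Optional


def pinned_reference(repository: str, digest: str) -> str:
    """Build an image reference that always resolves to the same content."""
    if not digest.startswith("sha256:"):
        raise ValueError("expected a sha256 digest")
    return f"{repository}@{digest}"


def update_status(current_digest: Optional[str],
                  available_digest: Optional[str]) -> str:
    """Compare the running digest with the newest one seen for the track."""
    if available_digest is None or available_digest == current_digest:
        return "up_to_date"
    # Surfaced in the UI only; applying it needs an explicit operation.
    return "update_available"
```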

10. Persistent Storage

Use named Docker volumes for persistent service-local data.

Examples:

  • Postgres: keystone_service_<id>_postgres_data
  • Valkey: named volume when persistence is enabled
  • Caddy: named volumes for /data and /config

Avoid distributed storage in v1. Moving a stateful service to another server requires an explicit migration operation.

11. Stateful Service Updates

V1 accepts downtime for single-node stateful updates.

Postgres/Valkey update flow:

  1. User explicitly triggers update/redeploy.
  2. Keystone warns about downtime and data risk.
  3. Optional backup checkbox appears only if backup capability exists.
  4. Stop container.
  5. Preserve named volume.
  6. Start new container with updated image digest.
  7. Health check.
  8. Mark operation complete.

Rolling stateful updates and HA clusters are v2.

12. Slices And Attachments

Attaching a managed service to an environment should create sensible default slices automatically.

Postgres attachment:

  • Create database/user slice by default.
  • Generate credentials.
  • Wire DB_* environment variables.

Valkey attachment:

  • Create/select logical slice if supported.
  • Wire REDIS_*.
  • Recommend CACHE_STORE=redis, SESSION_DRIVER=redis, or QUEUE_CONNECTION=redis depending on role.
  • Do not silently change queue behavior without confirmation.

Caddy/domain attachment:

  • Create route slice.
  • Wire gateway route to environment web service.

Advanced users can select existing slices or create slices manually from service detail pages.

Slice operations should be independent from service container deployments. Creating a Postgres database/user should run as a slice operation against an existing Postgres replica, not redeploy the Postgres container.

13. Environment Variables

Keystone manages env vars from attachments and slices.

Postgres slice should export:

  • DB_CONNECTION=pgsql
  • DB_HOST
  • DB_PORT=5432
  • DB_DATABASE
  • DB_USERNAME
  • DB_PASSWORD

Valkey slice/service should export:

  • REDIS_HOST
  • REDIS_PORT=6379
  • optional CACHE_STORE=redis
  • optional SESSION_DRIVER=redis
  • optional QUEUE_CONNECTION=redis

User-defined variables remain editable. Managed variables should show their source and whether they are overridable.
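
The export and merge behavior might be sketched as follows. The merge rule (a user value replaces a managed value only when that key is marked overridable) is an assumption about how the overridable flag is meant to work, and both function names are hypothetical.

```python
from typing import Dict, Set, Tuple


def postgres_exports(host: str, database: str, username: str,
                     password: str, port: int = 5432) -> Dict[str, str]:
    """Variables a Postgres database/user slice exports to an environment."""
    return {
        "DB_CONNECTION": "pgsql",
        "DB_HOST": host,
        "DB_PORT": str(port),
        "DB_DATABASE": database,
        "DB_USERNAME": username,
        "DB_PASSWORD": password,
    }


def resolve_env(managed: Dict[str, str], user: Dict[str, str],
                overridable: Set[str]) -> Dict[str, Tuple[str, str]]:
    """Merge managed and user values, recording each value's source."""
    resolved = {key: (value, "managed_attachment") for key, value in managed.items()}
    for key, value in user.items():
        if key in managed and key not in overridable:
            continue  # assumption: non-overridable managed values win
        resolved[key] = (value, "user")
    return resolved


managed = postgres_exports("postgres-1", "app_production", "app", "s3cret")
user = {"DB_HOST": "10.0.0.9", "DB_PORT": "6432", "APP_NAME": "Shop"}
env = resolve_env(managed, user, overridable={"DB_PORT"})
```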

14. Networking And Internal Aliases

Support both same-server Docker networking and cross-server private networking.

Routing preference:

  1. Same server: Docker network aliases/container DNS.
  2. Same provider private network: private IP and internal port.
  3. Public fallback only if explicitly allowed.

V1 should not build distributed DNS. Use deterministic internal hostnames and generated env vars. Where Keystone controls Docker networks, use network aliases. For cross-server communication, inject private IP/port endpoints.

Future agent/DNS systems should be possible, but are out of scope for v1.

Recommended endpoint model:

  • service_id
  • service_replica_id nullable
  • scope: docker_network, private_network, public
  • hostname
  • ip_address nullable
  • port
  • priority
  • health_status
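
The routing preference can be applied to this endpoint model roughly as shown below. This is a sketch: the "healthy" status value, the convention that a lower priority number wins, and the same_server flag are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Lower rank is preferred, mirroring the routing preference order.
SCOPE_RANK = {"docker_network": 0, "private_network": 1, "public": 2}


@dataclass
class Endpoint:
    scope: str
    hostname: str
    port: int
    priority: int = 0
    health_status: str = "healthy"
    ip_address: Optional[str] = None


def select_endpoint(endpoints: List[Endpoint], same_server: bool,
                    allow_public: bool = False) -> Optional[Endpoint]:
    """Pick the best reachable endpoint for a caller, or None."""
    def usable(endpoint: Endpoint) -> bool:
        if endpoint.health_status != "healthy":
            return False
        if endpoint.scope == "docker_network":
            return same_server  # container DNS only works on the same host
        if endpoint.scope == "public":
            return allow_public  # public fallback must be explicitly allowed
        return endpoint.scope == "private_network"

    candidates = sorted(
        (e for e in endpoints if usable(e)),
        key=lambda e: (SCOPE_RANK[e.scope], e.priority),
    )
    return candidates[0] if candidates else None


endpoints = [
    Endpoint("public", "db.example.com", 5432, ip_address="203.0.113.10"),
    Endpoint("private_network", "db-private", 5432, ip_address="10.0.0.5"),
    Endpoint("docker_network", "postgres-1", 5432),
]
```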

15. Gateway And Cutover

There must be exactly one gateway service per server for v1.

Caddy owns public ports 80 and 443. Application runtime containers should bind only to internal Docker networks or assigned internal ports.

Zero-downtime deployment happens at the gateway layer:

  1. Render/start new service replica with unique container/project name.
  2. Health check new replica.
  3. Update Caddy upstreams to include the new healthy replica.
  4. Reload Caddy.
  5. Drain/remove old replica from Caddy upstreams.
  6. Stop old container after the drain window.

For same-server upstreams, Caddy can use Docker network names. For cross-server upstreams, Caddy uses private IP and assigned internal port.
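
The two Caddy configurations applied during a cutover can be sketched like this. The Caddyfile site block with a reverse_proxy directive listing several upstreams is standard Caddy syntax; the function names, the domain, and the upstream addresses are hypothetical.

```python
from typing import List


def render_route(domain: str, upstreams: List[str]) -> str:
    """Render one Caddyfile site block that proxies to the given upstreams."""
    if not upstreams:
        raise ValueError("a route needs at least one upstream")
    return f"{domain} {{\n    reverse_proxy {' '.join(upstreams)}\n}}\n"


def cutover_configs(domain: str, old: List[str], new: List[str]) -> List[str]:
    """Return the configs to apply in order, with a Caddy reload after each.

    1. Old and new replicas together, once the new ones are healthy.
    2. New replicas only, which drains the old ones.
    """
    return [render_route(domain, old + new), render_route(domain, new)]


# Same-server upstream by Docker network name, cross-server by private IP.
for config in cutover_configs("app.example.com", ["web-old:8080"], ["10.0.0.7:18080"]):
    print(config)
```

The old container is stopped only after the second configuration is live and the drain window has passed.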

Web services may span multiple servers in v1. Keystone provides load balancing through Caddy upstreams but does not optimize global latency or regional placement.

A future v2 doctor page could flag:

  • cross-region upstreams
  • public-network fallbacks
  • missing workers for async queues
  • scheduler every-replica risks
  • inefficient database/cache placement

16. Docker Compose Runtime

Use generated Docker Compose files, not raw docker run, for v1 runtime management.

Suggested server layout:

  • /home/keystone/services/<service-id>/compose.yml
  • /home/keystone/services/<service-id>/.env
  • /home/keystone/gateway/Caddyfile
  • /home/keystone/operations/<operation-hash>/

Compose files are generated artifacts. The Keystone database is canonical.

Compose should be used for:

  • container definitions
  • env files
  • named volumes
  • networks
  • health checks
  • restart policies
  • resource limits
  • labels

Resource controls:

  • Use plain Docker runtime constraints such as cpus, mem_limit, and memswap_limit.
  • Avoid relying on Swarm-only deploy.resources semantics for v1.

Example:

services:
  web:
    image: registry.example.com/app:abc123
    cpus: "1.0"
    mem_limit: 1024m
    memswap_limit: 1024m
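
As a sketch of how a renderer could derive these keys from the replica's resolved limits, the function below builds the service mapping and omits constraints that are null. The function names are hypothetical, and setting memswap_limit equal to mem_limit (which leaves the container no extra swap) simply follows the example above.

```python
from typing import Dict, Optional


def compose_service(image: str, cpu_limit: Optional[float],
                    memory_limit_mb: Optional[int]) -> Dict[str, str]:
    """Build the Compose mapping for one service from database state."""
    service = {"image": image}
    if cpu_limit is not None:
        service["cpus"] = str(cpu_limit)
    if memory_limit_mb is not None:
        service["mem_limit"] = f"{memory_limit_mb}m"
        # Same value as mem_limit, as in the example above.
        service["memswap_limit"] = f"{memory_limit_mb}m"
    return service


def compose_file(name: str, service: Dict[str, str]) -> Dict:
    """Wrap a service mapping in the top-level Compose structure."""
    return {"services": {name: service}}


web = compose_service("registry.example.com/app:abc123", 1.0, 1024)
print(compose_file("web", web))
```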

17. Environment Deployment Flow

Environment deployment creates a parent environment_deploy operation.

High-level flow:

  1. Resolve target commit.
  2. Create or reuse build artifact.
  3. Compute desired service changes.
  4. Include only services with deploy_policy=with_environment and changed revision/config.
  5. Check dependency-only services and attached slices.
  6. Run pre-switch service steps.
  7. Run application migrations according to service migration policy.
  8. Deploy new web/worker/websocket replicas.
  9. Health check new replicas.
  10. Update gateway routes.
  11. Reload Caddy.
  12. Drain and stop old replicas.
  13. Mark operation complete.

Database/cache services attached to the environment are checked but not redeployed unless the user explicitly deploys or updates them.
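
Steps 3 to 5 of the flow amount to partitioning the environment's services. In the sketch below, deployed_revision is a hypothetical field, and config changes are assumed to bump desired_revision so that one comparison covers both revision and config.

```python
from typing import Dict, List, Tuple


def plan_environment_deploy(services: List[Dict]) -> Tuple[List[str], List[str]]:
    """Split services into those to redeploy and those to only check."""
    to_deploy: List[str] = []
    to_check: List[str] = []
    for service in services:
        policy = service["deploy_policy"]
        if policy == "with_environment":
            if service["desired_revision"] != service["deployed_revision"]:
                to_deploy.append(service["name"])
        elif policy == "dependency_only":
            to_check.append(service["name"])
        # Other policies (manual, manual_or_on_route_change) are left alone.
    return to_deploy, to_check


services = [
    {"name": "web", "deploy_policy": "with_environment",
     "desired_revision": "abc123", "deployed_revision": "9f8e7d"},
    {"name": "worker", "deploy_policy": "with_environment",
     "desired_revision": "abc123", "deployed_revision": "abc123"},
    {"name": "postgres", "deploy_policy": "dependency_only",
     "desired_revision": "r1", "deployed_revision": "r1"},
    {"name": "caddy", "deploy_policy": "manual_or_on_route_change",
     "desired_revision": "r4", "deployed_revision": "r3"},
]
print(plan_environment_deploy(services))
```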

18. Migrations

Database migrations are owned by the application runtime service deployment.

Recommended fields on service config:

  • migration_mode: auto, manual, disabled
  • migration_timing: pre_switch, post_switch
  • migration_command: default php artisan migrate --force

Default for Laravel web services:

  • migration_mode=auto
  • migration_timing=pre_switch
  • migration_command=php artisan migrate --force

Manual mode should let the user run a migration operation explicitly.
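
How a deploy operation might consult these settings is sketched below. The function name and the phase argument are hypothetical; the defaults are the ones listed above for Laravel web services.

```python
from typing import Dict, Optional

DEFAULTS = {
    "migration_mode": "auto",
    "migration_timing": "pre_switch",
    "migration_command": "php artisan migrate --force",
}


def migration_command_for(phase: str,
                          config: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Return the command to run in this deploy phase, or None to skip."""
    settings = {**DEFAULTS, **(config or {})}
    if settings["migration_mode"] != "auto":
        return None  # manual and disabled never run during a deploy
    if settings["migration_timing"] != phase:
        return None
    return settings["migration_command"]


print(migration_command_for("pre_switch"))
print(migration_command_for("post_switch"))
print(migration_command_for("pre_switch", {"migration_mode": "manual"}))
```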

19. Onboarding

Onboarding should guide users through:

  1. Organisation creation.
  2. Server provider setup, Hetzner first.
  3. Source provider/repository setup, including Gitea/GitHub/generic Git.
  4. Deploy key installation and verification.
  5. Registry setup. Optional for single-server, required for multi-server.
  6. Server creation/provisioning.
  7. Application/environment creation.
  8. Optional service attachments: Postgres, Valkey, domain/gateway.

If an environment spans more than one server and no registry exists, deployment should be blocked with a registry setup prompt.

20. Current Code Migration Plan

The current code already has useful pieces:

  • Provider abstraction
  • Hetzner server creation
  • Server provisioning jobs
  • Service drivers
  • Polymorphic deployments
  • Step execution over SSH

Refactor in phases.

Phase 1: Schema Alignment

  • Add environments table.
  • Rename deployments to operations.
  • Rename steps to operation_steps.
  • Add operations.parent_id.
  • Add operations.kind.
  • Add service_replicas.
  • Add service_slices.
  • Add environment_attachments.
  • Add environment_variables.
  • Add registry/source/build artifact tables.

Phase 2: Model Cleanup

  • Replace Application::instances() as the primary deployment path with Application::environments().
  • Keep or migrate Instance into ServiceReplica depending on implementation cost.
  • Replace Service::slices references with real ServiceSlice relationship.
  • Replace Deployment references with Operation.
  • Replace deployment step jobs with operation step jobs.

Phase 3: Driver Contract

  • Define formal driver interfaces for service deployment, replica rendering, slices, health checks, and env exports.
  • Implement Caddy 2 driver.
  • Implement Postgres 18 driver with database/user slice provisioning.
  • Implement Valkey 8 driver.
  • Implement Laravel runtime driver/template.

Phase 4: Compose Renderer

  • Render Compose files from DB state.
  • Upload generated files over SSH.
  • Run docker compose operations.
  • Capture container IDs and health state into ServiceReplica.

Phase 5: Environment Deploy

  • Build application artifact.
  • Deploy web replicas.
  • Run migrations.
  • Health check.
  • Cut over Caddy.
  • Stop old replicas.

Phase 6: UI Simplification

  • Present environments as the primary application surface.
  • Present services under an environment with sensible defaults.
  • Hide deploy policies by default.
  • Provide one-click add worker.
  • Provide managed attachment flows for Postgres/Valkey/Caddy.

21. Explicit V2 Deferrals

Out of scope for v1:

  • Server agent.
  • Distributed internal DNS.
  • Edge routing or anycast.
  • Automatic regional topology optimization.
  • Custom Dockerfiles.
  • Arbitrary Docker image services.
  • Non-Laravel first-class app frameworks.
  • Managed Docker registry.
  • HA Postgres/Valkey.
  • Rolling stateful updates.
  • Distributed storage.
  • Full backup orchestration.
  • Automatic deploy key installation via Gitea/GitHub API.