Keystone Implementation Spec
1. Product Scope
Keystone is a Laravel Forge-like deployment platform that runs applications and services with Docker. The v1 product is intentionally narrow:
- Laravel is the only first-class application framework.
- Application containers use a Keystone-managed Dockerfile based on serversideup/php with FrankenPHP.
- Services are explicitly coded drivers, not arbitrary Docker images.
- v1 is agentless and executes operations over SSH.
- Docker Compose is used as the generated runtime artifact.
- Caddy 2 is the default and only gateway for v1.
- The Keystone database is the source of truth. Server files are generated artifacts.
V1 should make the simple path robust before adding generic Docker support, distributed agents, HA databases, edge routing, or additional frameworks.
2. Core Domain Model
Organisation
Owns users, providers, registries, applications, servers, services, and environments.
Application
A source-code project. In v1, first-class applications are Laravel repositories.
Recommended fields:
- organisation_id
- name
- repository_url
- repository_type
- default_branch
- deploy_key_public
- deploy_key_private (encrypted)
- deploy_key_fingerprint
- deploy_key_installed_at (nullable)
Environment
The primary application deployment unit. An application has environments such as production, staging, or dev.
Recommended fields:
- application_id
- name
- branch
- status
- scheduler_enabled
- scheduler_target_service_id (nullable)
- scheduler_mode: single or every_replica
- build_config (json)
Default for Laravel environments:
- Scheduler enabled.
- Scheduler target is the primary web service.
- Scheduler mode is single.
Service
Every deployable thing is represented as a Service.
Examples:
- Laravel web runtime
- Laravel worker runtime
- Laravel websocket runtime
- Caddy gateway
- Postgres
- Valkey
- Future standalone services
Recommended fields:
- organisation_id
- environment_id (nullable)
- server_id (nullable; for single-placement legacy convenience only; long term use replicas)
- name
- category
- type
- version_track
- driver_name
- status
- desired_replicas
- desired_revision
- deploy_policy
- process_roles (json)
- current_image_digest (nullable)
- available_image_digest (nullable)
- update_status
- default_cpu_limit (nullable)
- default_memory_limit_mb (nullable)
- config (json)
Deploy policy defaults:
- Laravel web: with_environment
- Laravel worker: with_environment
- Laravel websocket: with_environment
- Database/cache/storage: dependency_only
- Gateway: manual_or_on_route_change
- Standalone services: manual
The user should not need to configure these defaults during normal setup.
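As a rough illustration, the defaults above could be held in a small lookup. This is a sketch only: the dictionary keys and the function name are hypothetical labels chosen for the example, not identifiers defined by this spec.

```python
# Hypothetical lookup of the default deploy policy per service kind.
# The keys are illustrative labels, not identifiers defined by this spec.
DEFAULT_DEPLOY_POLICY = {
    "laravel_web": "with_environment",
    "laravel_worker": "with_environment",
    "laravel_websocket": "with_environment",
    "database": "dependency_only",
    "cache": "dependency_only",
    "storage": "dependency_only",
    "gateway": "manual_or_on_route_change",
}


def default_deploy_policy(service_kind: str) -> str:
    """Return the default policy; unknown kinds are treated as standalone services."""
    return DEFAULT_DEPLOY_POLICY.get(service_kind, "manual")
```

Treating every unlisted kind as a standalone service (policy manual) is an assumption of the sketch; it keeps the safest behavior as the fallback.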
ServiceReplica
A running instance of a service on a server. A service is logical; a replica is runtime placement.
Recommended fields:
- service_id
- server_id
- operation_id (nullable)
- container_name
- container_id (nullable)
- image_digest
- internal_host
- internal_port
- public_port (nullable)
- status
- health_status
- cpu_limit (nullable)
- memory_limit_mb (nullable)
- config (json)
Replica resource limits override service defaults. Null means unrestricted except host capacity.
ServiceSlice
A logical sub-resource inside a service. Slices belong to Service, not ServiceReplica.
Examples:
- Database and user inside Postgres
- Logical database or namespace inside Valkey
- Route inside Caddy
- Future bucket, topic, vhost, etc.
Recommended fields:
- service_id
- environment_id (nullable)
- name
- type
- status
- config (json)
- credentials (encrypted json, nullable)
Slices are not containers and should not be used for scaling. They are stable logical resources that survive service replica replacement.
EnvironmentAttachment
Connects an environment to managed service slices.
Recommended fields:
- environment_id
- service_id
- service_slice_id (nullable)
- role: database, cache, queue, storage, gateway, custom
- env_prefix (nullable)
- is_primary
Attachments should point to slices whenever a slice exists. For example, a Laravel environment attaches to a Postgres database/user slice, not merely to the Postgres service.
EnvironmentVariable
Represents user-defined and Keystone-managed runtime environment values.
Recommended fields:
- environment_id
- key
- value (encrypted)
- source: user, managed_attachment, system
- service_slice_id (nullable)
- overridable (boolean)
Managed values should be regenerated from attachments and slices.
3. Operations Model
Rename Deployment to Operation.
An operation is the generic audit and execution object for all state-changing work.
Operation
Recommended fields:
- id
- parent_id (nullable)
- hash
- kind
- target_type
- target_id
- status
- started_at
- finished_at
- timestamps
Operation kinds:
- server_provision
- service_deploy
- replica_deploy
- slice_provision
- slice_configure
- environment_deploy
- gateway_cutover
- config_change
- credential_rotation
OperationStep
Rename Step to OperationStep.
Recommended fields:
- operation_id
- name
- order
- status
- script
- logs
- error_logs
- secrets (encrypted json, nullable)
- started_at
- finished_at
- timestamps
Parent-Child Operations
Environment deploys are parent operations that create child operations.
Example:
- environment_deploy
  - child service_deploy for web
  - child replica_deploy for each web replica
  - child slice_configure for Caddy route updates
  - child gateway_cutover
Standalone service deploys and slice operations can also run independently.
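The parent-child relationship can be pictured as a small tree. The sketch below is illustrative: the Operation class here is an in-memory stand-in rather than the database model, and plan_environment_deploy, the target strings, and the flat child layout are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Operation:
    kind: str
    target: str
    # parent is excluded from repr/compare to avoid walking the tree in cycles.
    parent: Optional["Operation"] = field(default=None, repr=False, compare=False)
    children: List["Operation"] = field(default_factory=list)

    def add_child(self, kind: str, target: str) -> "Operation":
        """Create a child operation linked back to this one (parent_id in the schema)."""
        child = Operation(kind=kind, target=target, parent=self)
        self.children.append(child)
        return child


def plan_environment_deploy(environment: str, web_replicas: List[str]) -> Operation:
    """Build the example tree: one parent environment_deploy with its children."""
    root = Operation("environment_deploy", environment)
    root.add_child("service_deploy", "web")
    for replica in web_replicas:
        root.add_child("replica_deploy", replica)
    root.add_child("slice_configure", "caddy-route")  # hypothetical target label
    root.add_child("gateway_cutover", "caddy")        # hypothetical target label
    return root
```

A standalone slice or service operation would simply be an Operation with no parent.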
4. Server Provisioning
V1 remains agentless over SSH.
Provisioning flow:
- Create server through provider API.
- Wait for root SSH to become available.
- Execute provisioning script over SSH.
- Create Keystone management user.
- Install Docker Engine, Docker Compose plugin, UFW, fail2ban, and required runtime packages.
- Install Keystone SSH public key.
- Disable password login.
- Enable UFW with SSH open.
- Callback or SSH verification marks server active.
Server permanent keys are for Keystone management only. Repository deploy keys must not be permanently installed on servers.
5. Source Providers And Repository Access
V1 source support:
- Self-hosted Gitea
- GitHub
- Generic Git over SSH
Repository access uses a Keystone-generated deploy key per application/repository.
V1 flow:
- User enters repo SSH URL.
- Keystone generates an ed25519 deploy key.
- UI shows the public key.
- User adds it to Gitea/GitHub as read-only.
- Keystone verifies access with git ls-remote.
During build operations, Keystone injects the encrypted private key into a temporary operation directory and uses GIT_SSH_COMMAND. The key is removed after the build. Repo keys are never permanently stored on target servers or builder services.
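A minimal sketch of the temporary-key handling, shown as a local process for brevity. The helper names are hypothetical, the key is assumed to be already decrypted when passed in, and host key verification is deliberately left out of the example. The ssh options used (-i and IdentitiesOnly=yes) are standard OpenSSH options.

```python
import os
import shlex
import tempfile
from contextlib import contextmanager


def git_ssh_command(key_path: str) -> str:
    """Build a GIT_SSH_COMMAND value that offers only the given deploy key."""
    return " ".join(["ssh", "-i", shlex.quote(key_path), "-o", "IdentitiesOnly=yes"])


@contextmanager
def temporary_deploy_key(private_key: str):
    """Write the key into a temporary operation directory; remove it on exit."""
    with tempfile.TemporaryDirectory(prefix="keystone-op-") as op_dir:
        key_path = os.path.join(op_dir, "deploy_key")
        # Create the file with 0600 permissions so ssh accepts it.
        fd = os.open(key_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        with os.fdopen(fd, "w") as handle:
            handle.write(private_key)
        env = dict(os.environ, GIT_SSH_COMMAND=git_ssh_command(key_path))
        yield key_path, env
    # Leaving the block deletes the directory and the key with it.
```

Inside the with block, a caller could run something like subprocess.run(["git", "ls-remote", repo_url], env=env, check=True), where repo_url is the SSH URL the user entered.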
6. Registry And Build Artifacts
An external registry is required for multi-server application deployments.
Single-server deployments may build and run a local image without a registry.
Multi-server deployments must:
- Build once.
- Push the image to the configured external registry.
- Pull the exact same image digest on each target server.
Supported registry types:
- Generic Docker registry
- Gitea registry
- GHCR
- Docker Hub
Build Service
Building is a service capability, not a server type.
A dedicated builder is represented as a Service with category builder. If no builder service exists, Keystone may build on the target server for single-server deployments.
Build strategies:
- target_server: build on selected target server. Valid for single-server.
- dedicated_builder: build on builder service, then push/export artifact.
- external_registry: pull prebuilt image from registry.
For v1:
- Single-server default: build on target server.
- Multi-server: require configured registry and build once.
- Do not rebuild independently on each server.
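The v1 rules above could be checked with a small planning function. This is a sketch under assumptions: plan_image_distribution, RegistryRequired, and the returned dictionary shape are hypothetical, and the sketch does not decide where the single multi-server build runs.

```python
from typing import Optional


class RegistryRequired(Exception):
    """Raised when a multi-server deployment has no external registry."""


def plan_image_distribution(server_count: int, registry: Optional[str]) -> dict:
    """Return how many builds run, where the image is pushed, and how it is pulled."""
    if server_count < 1:
        raise ValueError("an environment needs at least one target server")
    if server_count == 1:
        # Single-server default: build on the target server, run the local image.
        return {"builds": 1, "push_to": None, "pull_by_digest": False}
    if registry is None:
        raise RegistryRequired("multi-server deployments need an external registry")
    # Multi-server: build once, push, and pull the same digest on every server.
    return {"builds": 1, "push_to": registry, "pull_by_digest": True}
```

Raising an exception for the missing registry is one way to surface the blocked state to the caller; the UI prompt itself is outside this sketch.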
BuildArtifact
Recommended fields:
- environment_id
- commit_sha
- image_tag
- image_digest
- registry_ref (nullable)
- built_by_operation_id
- built_by_service_id (nullable)
- status
- metadata (json)
7. Managed Laravel Runtime
V1 uses Keystone-managed Dockerfile templates only. Custom Dockerfiles are deferred.
Laravel runtime defaults:
- Base: serversideup/php FrankenPHP image
- PHP version configurable
- Document root default: public
- Health path default: /up, fallback /
- Composer install with production defaults
- JS build step configurable
- Bun/Node strategy configurable
The same build artifact is used by web, worker, and websocket services. Runtime services differ by entrypoint/command.
Default topology:
- One web service.
- No worker service by default.
- Scheduler enabled on the web service by default.
- Dedicated worker service is recommended when queues are used, but created only when the user opts in.
Worker options:
- Dedicated worker service, recommended.
- Embedded worker in web service, allowed for low-throughput apps but not recommended for production.
- No workers, default.
Keystone should warn when a deployed environment uses QUEUE_CONNECTION=sync, but it should not automatically create worker services.
8. Scheduler Model
Mirror Laravel Cloud's scheduler model.
Scheduler is not a standalone service by default. It is a role/capability attached to a selected web or worker service.
Defaults:
- scheduler_enabled: true for Laravel templates.
- scheduler_target_service_id: primary web service.
- scheduler_mode: single.
Runtime behavior:
- single: run schedule:run every minute on exactly one selected replica.
- every_replica: run on each replica. This is advanced and explicit.
Keystone should enforce one scheduler runner per environment by default. Users may still use Laravel's onOneServer() for application-level safety.
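The two modes might be resolved to a set of runner replicas as follows. This is a sketch: the function name is hypothetical, and picking the lowest replica id in single mode is an assumption, since the spec only requires that exactly one selected replica runs the scheduler.

```python
from typing import List


def scheduler_replicas(replica_ids: List[int], mode: str, enabled: bool = True) -> List[int]:
    """Return the replica ids that should run schedule:run every minute."""
    if not enabled or not replica_ids:
        return []
    if mode == "single":
        # Assumption: the lowest replica id is the selected runner.
        return [min(replica_ids)]
    if mode == "every_replica":
        return sorted(replica_ids)
    raise ValueError(f"unknown scheduler_mode: {mode}")
```

For example, scheduler_replicas([12, 7, 9], "single") yields [7], while the same call with "every_replica" yields all three ids.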
9. Service Drivers
V1 services are explicitly coded drivers only. No arbitrary Docker image service in the v1 happy path.
Driver contract should define:
- service type and version track
- default image policy
- ports
- volumes
- environment schema
- health checks
- resource defaults
- supported slice types
- Compose rendering
- operation steps
- env var exports
- firewall requirements
- update behavior
V1 driver list:
- Caddy 2 gateway
- Laravel managed runtime using serversideup/php FrankenPHP
- Postgres 18
- Valkey 8
Use latest minor versions for new service deploy/update operations by resolving image tags to digests. Store the resolved digest on the operation/service/replica for reproducible rollbacks.
Do not silently update managed service images. Show updates in the UI and require an explicit service update/redeploy operation.
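A small sketch of how the stored digests could drive the "update available" indicator and a pinned image reference. Both function names are hypothetical; the repository@digest form is the standard way to reference an image by digest.

```python
from typing import Optional


def update_available(current_digest: Optional[str], available_digest: Optional[str]) -> bool:
    """True when a newer digest was resolved but has not been deployed yet."""
    return available_digest is not None and available_digest != current_digest


def pinned_image(repository: str, digest: str) -> str:
    """Reference an image by digest so redeploys and rollbacks are reproducible."""
    return f"{repository}@{digest}"
```

The check only reports a difference; applying it still requires the explicit update or redeploy operation described above.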
10. Persistent Storage
Use named Docker volumes for persistent service-local data.
Examples:
- Postgres: keystone_service_<id>_postgres_data
- Valkey: named volume when persistence is enabled
- Caddy: named volumes for /data and /config
Avoid distributed storage in v1. Moving a stateful service to another server requires an explicit migration operation.
11. Stateful Service Updates
V1 accepts downtime for single-node stateful updates.
Postgres/Valkey update flow:
- User explicitly triggers update/redeploy.
- Keystone warns about downtime and data risk.
- Optional backup checkbox appears only if backup capability exists.
- Stop container.
- Preserve named volume.
- Start new container with updated image digest.
- Health check.
- Mark operation complete.
Rolling stateful updates and HA clusters are v2.
12. Slices And Attachments
Attaching a managed service to an environment should create sensible default slices automatically.
Postgres attachment:
- Create database/user slice by default.
- Generate credentials.
- Wire DB_* environment variables.
Valkey attachment:
- Create/select logical slice if supported.
- Wire REDIS_*.
- Recommend CACHE_STORE=redis, SESSION_DRIVER=redis, or QUEUE_CONNECTION=redis depending on role.
- Do not silently change queue behavior without confirmation.
Caddy/domain attachment:
- Create route slice.
- Wire gateway route to environment web service.
Advanced users can select existing slices or create slices manually from service detail pages.
Slice operations should be independent from service container deployments. Creating a Postgres database/user should run as a slice operation against an existing Postgres replica, not redeploy the Postgres container.
13. Environment Variables
Keystone manages env vars from attachments and slices.
Postgres slice should export:
- DB_CONNECTION=pgsql
- DB_HOST
- DB_PORT=5432
- DB_DATABASE
- DB_USERNAME
- DB_PASSWORD
Valkey slice/service should export:
- REDIS_HOST
- REDIS_PORT=6379
- optional CACHE_STORE=redis
- optional SESSION_DRIVER=redis
- optional QUEUE_CONNECTION=redis
User-defined variables remain editable. Managed variables should show their source and whether they are overridable.
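One possible way to combine managed and user-defined values is sketched below. The ManagedVar class and resolve_env function are hypothetical, and the rule that a non-overridable managed value wins over a user value with the same key is an assumption; the spec only says managed variables carry an overridable flag.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ManagedVar:
    key: str
    value: str
    source: str        # "managed_attachment" or "system"
    overridable: bool


def resolve_env(managed: List[ManagedVar], user: Dict[str, str]) -> Dict[str, str]:
    """Merge managed exports with user variables into the final runtime env."""
    resolved = {var.key: var.value for var in managed}
    locked = {var.key for var in managed if not var.overridable}
    for key, value in user.items():
        if key in locked:
            continue  # Assumption: a non-overridable managed value wins.
        resolved[key] = value
    return resolved
```

Because managed values are rebuilt from attachments and slices on every call, regenerating them never touches the user-defined entries.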
14. Networking And Internal Aliases
Support both same-server Docker networking and cross-server private networking.
Routing preference:
- Same server: Docker network aliases/container DNS.
- Same provider private network: private IP and internal port.
- Public fallback only if explicitly allowed.
V1 should not build distributed DNS. Use deterministic internal hostnames and generated env vars. Where Keystone controls Docker networks, use network aliases. For cross-server communication, inject private IP/port endpoints.
Future agent/DNS systems should be possible, but are out of scope for v1.
Recommended endpoint model:
- service_id
- service_replica_id (nullable)
- scope: docker_network, private_network, public
- hostname
- ip_address (nullable)
- port
- priority
- health_status
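The routing preference could be applied to a list of endpoints roughly like this. The sketch assumes the list already contains only endpoints reachable from the consumer, that health_status uses the value "healthy", and that a lower priority number is preferred; none of these are fixed by the spec, and the Endpoint class is a trimmed stand-in for the model above.

```python
from dataclasses import dataclass
from typing import List, Optional

# Preference order from the routing rules: Docker network, then private, then public.
SCOPE_ORDER = {"docker_network": 0, "private_network": 1, "public": 2}


@dataclass
class Endpoint:
    scope: str
    hostname: str
    port: int
    priority: int = 0
    health_status: str = "healthy"  # assumed status value


def pick_endpoint(endpoints: List[Endpoint], allow_public: bool = False) -> Optional[Endpoint]:
    """Pick the most local healthy endpoint; public ones only when explicitly allowed."""
    candidates = [
        e for e in endpoints
        if e.health_status == "healthy" and (allow_public or e.scope != "public")
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda e: (SCOPE_ORDER[e.scope], e.priority))
```

The selected endpoint's hostname and port would then be written into the generated environment variables.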
15. Gateway And Cutover
There must be exactly one gateway service per server for v1.
Caddy owns public ports 80 and 443. Application runtime containers should bind only to internal Docker networks or assigned internal ports.
Zero-downtime deployment happens at the gateway layer:
- Render/start new service replica with unique container/project name.
- Health check new replica.
- Update Caddy upstreams to include the new healthy replica.
- Reload Caddy.
- Drain/remove old replica from Caddy upstreams.
- Stop old container after the drain window.
For same-server upstreams, Caddy can use Docker network names. For cross-server upstreams, Caddy uses private IP and assigned internal port.
Web services may span multiple servers in v1. Keystone provides load balancing through Caddy upstreams but does not optimize global latency or regional placement.
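The cutover sequence can be sketched as a pure function over the upstream list. This is illustrative only: cutover, is_healthy, and reload_gateway are hypothetical names, the callbacks stand in for the real health check and Caddy reload, and waiting out the drain window and stopping the old container are left to the caller.

```python
from typing import Callable, List


def cutover(
    upstreams: List[str],
    old: str,
    new: str,
    is_healthy: Callable[[str], bool],
    reload_gateway: Callable[[List[str]], None],
) -> List[str]:
    """Add the new replica, reload, then drop the old replica and reload again."""
    if not is_healthy(new):
        # Abort before touching the gateway; the old replica keeps serving.
        raise RuntimeError(f"{new} failed its health check")
    with_new = list(upstreams) if new in upstreams else upstreams + [new]
    reload_gateway(with_new)   # old and new both receive traffic
    drained = [u for u in with_new if u != old]
    reload_gateway(drained)    # only the new replica remains
    return drained
```

Because the gateway is reloaded with both upstreams before the old one is removed, there is no point at which the route has zero healthy upstreams.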
Future v2 doctor page can flag:
- cross-region upstreams
- public-network fallbacks
- missing workers for async queues
- scheduler every-replica risks
- inefficient database/cache placement
16. Docker Compose Runtime
Use generated Docker Compose files, not raw docker run, for v1 runtime management.
Suggested server layout:
- /home/keystone/services/<service-id>/compose.yml
- /home/keystone/services/<service-id>/.env
- /home/keystone/gateway/Caddyfile
- /home/keystone/operations/<operation-hash>/
Compose files are generated artifacts. The Keystone database is canonical.
Compose should be used for:
- container definitions
- env files
- named volumes
- networks
- health checks
- restart policies
- resource limits
- labels
Resource controls:
- Use plain Docker runtime constraints such as cpus, mem_limit, and memswap_limit.
- Avoid relying on Swarm-only deploy.resources semantics for v1.
Example:
services:
  web:
    image: registry.example.com/app:abc123
    cpus: "1.0"
    mem_limit: 1024m
    memswap_limit: 1024m
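A renderer might derive those keys from the replica's resource limits along these lines. The function names are hypothetical, and the output is emitted as JSON purely to keep the sketch dependency-free (JSON documents are also valid YAML); a real renderer would likely write YAML directly. Setting memswap_limit equal to mem_limit, as in the example above, prevents the container from using swap.

```python
import json
from typing import Optional


def compose_service(image: str, cpu_limit: Optional[float], memory_limit_mb: Optional[int]) -> dict:
    """Build one Compose service entry; null limits are simply omitted."""
    service = {"image": image}
    if cpu_limit is not None:
        service["cpus"] = str(cpu_limit)
    if memory_limit_mb is not None:
        service["mem_limit"] = f"{memory_limit_mb}m"
        service["memswap_limit"] = f"{memory_limit_mb}m"  # equal to mem_limit: no swap
    return service


def render_compose(name: str, service: dict) -> str:
    """Serialize a single-service Compose document."""
    return json.dumps({"services": {name: service}}, indent=2)
```

Calling compose_service("registry.example.com/app:abc123", 1.0, 1024) produces the same four keys as the example above.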
17. Environment Deployment Flow
Environment deployment creates a parent environment_deploy operation.
High-level flow:
- Resolve target commit.
- Create or reuse build artifact.
- Compute desired service changes.
- Include only services with deploy_policy=with_environment and changed revision/config.
- Check dependency-only services and attached slices.
- Run pre-switch service steps.
- Run application migrations according to service migration policy.
- Deploy new web/worker/websocket replicas.
- Health check new replicas.
- Update gateway routes.
- Reload Caddy.
- Drain and stop old replicas.
- Mark operation complete.
Database/cache services attached to the environment are checked but not redeployed unless the user explicitly deploys or updates them.
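The selection rule can be sketched as two filters over the environment's services. ServiceState is a hypothetical in-memory view, and running_revision is an assumed field used here to detect a changed revision; the spec does not name how the currently deployed revision is tracked.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServiceState:
    name: str
    deploy_policy: str
    desired_revision: str
    running_revision: str  # hypothetical: revision currently deployed


def services_to_deploy(services: List[ServiceState]) -> List[str]:
    """Services redeployed by an environment deploy: with_environment and changed."""
    return [
        s.name for s in services
        if s.deploy_policy == "with_environment"
        and s.desired_revision != s.running_revision
    ]


def services_to_check(services: List[ServiceState]) -> List[str]:
    """Dependency-only services are verified but never redeployed here."""
    return [s.name for s in services if s.deploy_policy == "dependency_only"]
```

Gateway and standalone services fall into neither list, matching their manual policies.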
18. Migrations
Database migrations are owned by the application runtime service deployment.
Recommended fields on service config:
- migration_mode: auto, manual, disabled
- migration_timing: pre_switch, post_switch
- migration_command: default php artisan migrate --force
Default for Laravel web services:
- migration_mode=auto
- migration_timing=pre_switch
- command php artisan migrate --force
Manual mode should let the user run a migration operation explicitly.
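A deploy step could consult the service config like this to decide whether to run migrations at a given phase. The function name and the phase argument are hypothetical; the defaults mirror the Laravel web defaults listed above.

```python
from typing import Optional


def migration_step(config: dict, phase: str) -> Optional[str]:
    """Return the migration command to run at this phase, or None to skip."""
    mode = config.get("migration_mode", "auto")
    timing = config.get("migration_timing", "pre_switch")
    command = config.get("migration_command", "php artisan migrate --force")
    if mode != "auto" or timing != phase:
        # manual and disabled never run automatically during a deploy.
        return None
    return command
```

With an empty config, the command is returned for the pre_switch phase and skipped for post_switch.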
19. Onboarding
Onboarding should guide users through:
- Organisation creation.
- Server provider setup, Hetzner first.
- Source provider/repository setup, including Gitea/GitHub/generic Git.
- Deploy key installation and verification.
- Registry setup. Optional for single-server, required for multi-server.
- Server creation/provisioning.
- Application/environment creation.
- Optional service attachments: Postgres, Valkey, domain/gateway.
If an environment spans more than one server and no registry exists, deployment should be blocked with a registry setup prompt.
20. Current Code Migration Plan
The current code already has useful pieces:
- Provider abstraction
- Hetzner server creation
- Server provisioning jobs
- Service drivers
- Polymorphic deployments
- Step execution over SSH
Refactor in phases.
Phase 1: Schema Alignment
- Add environments table.
- Rename deployments to operations.
- Rename steps to operation_steps.
- Add operations.parent_id.
- Add operations.kind.
- Add service_replicas.
- Add service_slices.
- Add environment_attachments.
- Add environment_variables.
- Add registry/source/build artifact tables.
Phase 2: Model Cleanup
- Replace Application::instances() as the primary deployment path with Application::environments().
- Keep or migrate Instance into ServiceReplica depending on implementation cost.
- Replace Service::slices references with real ServiceSlice relationship.
- Replace Deployment references with Operation.
- Replace deployment step jobs with operation step jobs.
Phase 3: Driver Contract
- Define formal driver interfaces for service deployment, replica rendering, slices, health checks, and env exports.
- Implement Caddy 2 driver.
- Implement Postgres 18 driver with database/user slice provisioning.
- Implement Valkey 8 driver.
- Implement Laravel runtime driver/template.
Phase 4: Compose Renderer
- Render Compose files from DB state.
- Upload generated files over SSH.
- Run docker compose operations.
- Capture container IDs and health state into ServiceReplica.
Phase 5: Environment Deploy
- Build application artifact.
- Deploy web replicas.
- Run migrations.
- Health check.
- Cut over Caddy.
- Stop old replicas.
Phase 6: UI Simplification
- Present environments as the primary application surface.
- Present services under an environment with sensible defaults.
- Hide deploy policies by default.
- Provide one-click add worker.
- Provide managed attachment flows for Postgres/Valkey/Caddy.
21. Explicit V2 Deferrals
Out of scope for v1:
- Server agent.
- Distributed internal DNS.
- Edge routing or anycast.
- Automatic regional topology optimization.
- Custom Dockerfiles.
- Arbitrary Docker image services.
- Non-Laravel first-class app frameworks.
- Managed Docker registry.
- HA Postgres/Valkey.
- Rolling stateful updates.
- Distributed storage.
- Full backup orchestration.
- Automatic deploy key installation via Gitea/GitHub API.