Runbook: Service Startup Failure¶
Use this runbook when one or more services fail to start or repeatedly restart.
Detection Signals¶
- `docker compose ps` shows exited or restarting services.
- App endpoint unavailable after stack startup.
- Logs show repeated boot failures.
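The first signal can be checked programmatically. The following is a minimal sketch, assuming a Compose v2 CLI whose `docker compose ps --format json` output carries `Service` and `State` fields; the function names are illustrative and not part of this stack.

```python
import json
import subprocess

# States treated as failure signals; both come from the detection list above.
FAILING_STATES = {"exited", "restarting"}


def parse_ps_output(raw: str) -> list[dict]:
    """Parse `docker compose ps --format json` output.

    Depending on the Compose version, the output is either a single JSON
    array or one JSON object per line, so both shapes are accepted.
    """
    raw = raw.strip()
    if not raw:
        return []
    if raw.startswith("["):
        return json.loads(raw)
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def failing_services(raw: str) -> dict[str, str]:
    """Map service name -> state for services that are exited or restarting."""
    return {
        entry.get("Service", entry.get("Name", "unknown")): entry.get("State", "")
        for entry in parse_ps_output(raw)
        if entry.get("State", "").lower() in FAILING_STATES
    }


def current_failures() -> dict[str, str]:
    """Run the real command; --all also lists stopped containers."""
    result = subprocess.run(
        ["docker", "compose", "ps", "--all", "--format", "json"],
        capture_output=True, text=True, check=True,
    )
    return failing_services(result.stdout)
```

A non-empty result from `current_failures()` corresponds to the first detection signal. Field names may differ on older Compose releases, so verify them against the installed version.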
Response Flow¶
flowchart TD
A[Detect startup failure] --> B[Check service state]
B --> C[Inspect app, db, redis, celery logs]
C --> D[Classify root cause]
D --> E[Apply targeted fix]
E --> F[Restart affected services]
F --> G[Validate stack health]
Steps¶
- Run `docker compose ps`.
- Capture logs: `docker compose logs -f app db redis celery`.
- Validate env values in `.env` and service connectivity.
- Rebuild if required: `docker compose up -d --build`.
- Re-check health and run `python manage.py check`.
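The env and connectivity step can be partly automated. This is a rough sketch only: it parses simple `KEY=VALUE` lines (no `export` prefixes or inline comments), and the list of required keys, hosts, and ports are assumptions you would supply for your own stack.

```python
import socket
from pathlib import Path


def load_env(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(env: dict[str, str], required: list[str]) -> list[str]:
    """Return the required keys that are absent or empty, in the given order."""
    return [key for key in required if not env.get(key)]


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `missing_keys(load_env(), ["DATABASE_URL", "REDIS_URL"])` would list unset values, where both key names are hypothetical. `can_connect` only proves that a port accepts TCP connections, not that the service behind it is healthy.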
Validation¶
- No restart loop in `docker compose ps`.
- App logs show clean startup.
- Health checks and basic page load succeed.
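Services often need a few seconds after a restart before they answer, so validation is usually a poll rather than a single check. A minimal sketch, assuming the app exposes an HTTP endpoint; the URL in the usage note is a placeholder, not a documented endpoint of this stack.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers refused connections, timeouts, and 4xx/5xx responses.
        return False


def wait_until(check: Callable[[], bool], attempts: int = 10,
               delay: float = 3.0,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call check up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Usage might look like `wait_until(lambda: http_ok("http://localhost:8000/"))`. A single passing check does not rule out a restart loop, so it is worth re-running `docker compose ps` a minute later as well.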
Collapsed rollback
If the root cause is still unclear after one remediation cycle, restore the last known-good configuration and escalate with the captured logs.
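For escalation, it helps to snapshot logs to a file instead of following them live. The sketch below assumes the four service names used in the Steps section; the output directory and file name pattern are made up for illustration.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path

# Service names as listed in the log-capture step of this runbook.
SERVICES = ["app", "db", "redis", "celery"]


def logs_command(services: list[str], tail: int = 500) -> list[str]:
    """Build a non-following logs command so the call returns on its own."""
    return ["docker", "compose", "logs", "--no-color", "--timestamps",
            "--tail", str(tail), *services]


def capture_logs(out_dir: str = "incident-logs", tail: int = 500) -> Path:
    """Write recent logs of all services to a UTC-timestamped file."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = Path(out_dir) / f"startup-failure-{stamp}.log"
    target.parent.mkdir(parents=True, exist_ok=True)
    # check=False: a failing command should still leave its stderr in the file.
    result = subprocess.run(logs_command(SERVICES, tail),
                            capture_output=True, text=True, check=False)
    target.write_text(result.stdout + result.stderr)
    return target
```

Calling `capture_logs()` before rolling back preserves evidence that the rollback itself might otherwise overwrite.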