Skip to content

Runbook: Service Startup Failure

Use this runbook when one or more services fail to start or repeatedly restart.

Detection Signals

  • docker compose ps shows exited or restarting services.
  • App endpoint unavailable after stack startup.
  • Logs show repeated boot failures.

Response Flow

flowchart TD
    A[Detect startup failure] --> B[Check service state]
    B --> C[Inspect app, db, redis, celery logs]
    C --> D[Classify root cause]
    D --> E[Apply targeted fix]
    E --> F[Restart affected services]
    F --> G[Validate stack health]

Steps

  1. Run docker compose ps.
  2. Capture logs: docker compose logs -f app db redis celery.
  3. Validate env values in .env and service connectivity.
  4. Rebuild if required: docker compose up -d --build.
  5. Re-check health and run python manage.py check.

Validation

  • No restart loop in docker compose ps.
  • App logs show clean startup.
  • Health checks and basic page load succeed.
Collapsed rollback

If root cause is unclear after one remediation cycle, restore last known-good configuration and escalate with captured logs.