1. Triage the failure pattern first
Separate deterministic failures (always same step) from intermittent failures (infra timing, env drift, permissions). This determines whether you need design correction or runtime hardening.
2. Validate rollback and release safety
- Ensure each environment has a tested rollback path.
- Validate artifact immutability and release traceability.
- Prevent concurrent pipeline runs that overwrite deployment context.
3. Standardize environment assumptions
Pipeline scripts often depend on undocumented variables, credentials, or manually prepared agents. Normalize environment setup to reduce silent drift.
4. Add quality and policy gates
Gate production promotion on test pass criteria, config checks, and approval rules. Rescue work should end with guardrails, not temporary fixes only.
5. Finalize with handover docs
Document release flow, approval owners, rollback runbook, and known risk triggers so teams can sustain reliability.
Related service: Azure DevOps pipeline setup and rescue