Incident Manager Jobs in San Francisco
Robert Half | San Francisco
roles with deep AWS expertise.
• Strong command of AWS operational services and architectural patterns.
• Proven experience in monitoring, observability, and incident management, including SLOs, SLAs, and error budgets.
• Proficiency with Datadog...
Key Technology | San Francisco
Strong observability and on-call practices (metrics, tracing, alerting, incident management).
• Excellent collaboration with AI/ML and product teams; clear communication of risk and trade-offs.
• Must be authorised to work in the U.S. and able to work...
Crusoe Energy Systems LLC | San Francisco | appcast.io
capacity growth, and incident management.
• Contribute to long-term site strategy, expansion roadmaps, and scaling models to support 300k+ GPU growth.
• Serve as a thought leader for sustainable AI infrastructure, ensuring Crusoe remains at the forefront...
Fluidstack | San Francisco | appcast.io
and efficiency.
• Drive continuous improvement initiatives to optimize workflows, reduce toil, and ensure consistent execution across sites.
• Lead stability improvement and incident management programs, including post-incident reviews (PIR) with root cause...
Clipboard | San Francisco | appcast.io
Maintain accuracy in forecasting, resource planning, incident management, and reporting.
• Drive continuous improvement through structured analysis and strong operational controls.
Qualifications
• Minimum 8 years of experience in operations leadership...
Blue Signal Search | San Francisco
programs to monitor operational health, flag issues, and enable rapid incident resolution.
• Own incident management processes, including PIRs (post-incident reviews), root cause analysis, and CAPA follow-through.
• Champion preventive maintenance, physical...
Air Apps, Inc. | San Francisco | appcast.io
with incident management, debugging, and root cause analysis.
• Proficiency in scripting (Bash, Python, or Go) for automation and system monitoring.
• Knowledge of load balancing, failover strategies, and distributed systems.
• Understanding of security best...
Relevance AI | San Francisco | appcast.io
record with Infrastructure as Code (Terraform, Kubernetes/EKS, CDK, or CloudFormation).
• Hands-on with observability stacks (CloudWatch, Grafana, Prometheus, Datadog).
• Incident management experience in production SaaS systems, including on-call...
harvey.ai | San Francisco | appcast.io
monitoring, alerting, and infrastructure resources (compute, storage, networking) across 50+ global regions
• Lead incident management processes, including postmortems, root cause analyses, and driving actionable improvements
• Automate operational tasks...
Sierra | San Francisco | appcast.io
tooling, and incident management processes to reduce downtime and response time.
• Define the foundation of SRE practices at Sierra, influencing culture, tooling, and best practices across the engineering org.
What you'll bring
• 5+ years of hands‑on...
EOS IT Company | San Francisco
Experience with strategic planning, process improvement, and leveraging automation and scripting to optimize workflows
• Familiarity with incident management and escalation processes, including major incident resolution
• Strong organizational skills...
Resolve.Ai | San Francisco | appcast.io
in observability, incident-management, monitoring, or GenAI tooling.
• Excellent communicator — able to present to technical and non-technical audiences, including executives.
• Metrics-driven mindset — experience defining, tracking, and improving KPIs (e.g. WAU...
Asana | San Francisco | appcast.io
incident management process – we’re investing here, and you’ll help shape how it works.
• Build internal platforms and frameworks that help other teams improve the reliability of their services.
• Be part of (and help shape) a sustainable on‑call rotation...
Charles Schwab Corporation | San Francisco | appcast.io
reviews and guidance.
• Strong understanding of observability, incident management and reliability engineering principles.
• Mindset of continuous learning and improvement, adept at both giving and receiving feedback.
• Ability to troubleshoot complex...