About Autonomize AI
Autonomize AI is revolutionizing healthcare by streamlining knowledge workflows with AI. We reduce administrative burdens and elevate outcomes, empowering professionals to focus on what truly matters — improving lives. We're growing fast and looking for bold, driven teammates to join us.
The Opportunity
We’re looking for a CloudOps / Site Reliability Engineer to lead the charge in building a fully automated, secure, and scalable multi-cloud infrastructure for our AI-powered healthcare platform. Your mission: keep our deployments lightning-fast, reliable, and invisible. You’ll own the orchestration of services across AWS, Azure, and GCP, automating everything from infra provisioning to rollbacks — with security and uptime built in. This is a builder role — ideal for someone who can go deep into CI/CD, lives for IaC, and thinks deployment velocity is just as important as resiliency.
Key Responsibilities
- Multi-Cloud Infra Management: Design and manage highly available, scalable, and secure infrastructure across AWS, Azure, and GCP
- End-to-End Automation: Build deployment workflows using Terraform, Ansible, Helm, ArgoCD, GitHub Actions, or equivalent tools
- CI/CD at Scale: Own automated delivery pipelines for infrastructure and applications across staging and production
- Reliability Engineering: Define and uphold SLAs/SLOs; own incident management, blameless postmortems, and error budgets
- Security & Compliance: Implement and continuously harden controls for HIPAA, SOC 2, and zero-trust environments
- Monitoring & Observability: Deploy and maintain logs, metrics, and alerting systems using Prometheus, Grafana, Datadog, etc.
- Documentation & Process: Create robust runbooks, architectural diagrams, and continuous improvement loops
- Customer Deployments: Install and configure the AI platform and solutions in customer environments
- Customer Security Reviews: Support IT and InfoSec discussions and reviews with customers
- Team Guidance: Guide the offshore team as needed and help automate deployments
Must-Have Qualifications
- 5+ years in SRE/CloudOps roles with production-grade infrastructure experience
- Expertise in AWS and solid hands-on experience with Azure and GCP
- Proven track record with Infrastructure as Code (Terraform preferred) and modern deployment frameworks
- Deep CI/CD experience, including automated rollbacks and blue/green or canary deployments
- Skilled in Kubernetes, Docker, and container orchestration
- Experience with secure cloud architectures, RBAC, IAM, and secrets management
- Bias for automation — scripting in Python, Bash, or Go
- Culture fit: you take full ownership, run toward complexity, and operate in the final mile
Bonus
- Prior experience supporting healthtech, life sciences, or other regulated domains
- Experience implementing policy-as-code tools such as OPA/Gatekeeper
- Experience running GPU workloads, ML pipelines, or scalable microservices
- Contributions to open-source DevOps/SRE communities
What We Offer
- A chance to make a real impact in the future of healthcare
- Autonomy, ownership, and the ability to chart your own growth path
- Competitive compensation and benefits
- 100% employer-paid health, vision, and dental insurance
- Retirement plans (401(k)), disability insurance, and employee assistance programs
How to Apply
Please send your resume and a brief cover letter explaining why you are the ideal candidate for this role to careers@autonomize.ai. We are excited to meet someone who is eager to bring their skills, enthusiasm, and creativity to our team!