Role Overview
This is a dedicated platform ownership role. You will take full end-to-end responsibility for the
DevOps and infrastructure layer across all products. You will be the single point of accountability
for pipelines, environments, credentials, deployment reliability, cloud cost governance, and
compliance posture. Beyond keeping the lights on, you will proactively leverage AI-driven tooling to
improve DevOps workflows and continuously raise the bar on what reliable infrastructure means.
About the role
You will own the complete infrastructure and DevOps layer for a growing SaaS company
running on multi-cloud infrastructure spanning major cloud providers, partnering directly
with the Lead Engineer and a distributed offshore engineering team. This is not a support role.
You are the primary point of accountability for pipeline reliability, credential hygiene,
observability, cloud cost governance, and compliance posture. You also serve as the company’s
infrastructure voice in financial and compliance conversations, owning the data that informs
leadership decisions on cloud investment and risk.
What you'll do
CI/CD & Deployment
• Own all CI/CD pipelines across every application: consistent, documented, and not dependent on tribal knowledge
• Manage multi-cloud resource provisioning with Terraform or equivalent IaC tooling, version-controlled and fully reproducible
• Own secrets, credentials, and access management across all environments using enterprise secrets management platforms
• Set up repo creation, branching standards, and tooling for the distributed engineering team
• Own cross-platform mobile application build pipelines (iOS and Android) for both app stores
• Support the development, deployment, and maintenance of our active-active multi-cloud platform
Observability & Uptime (SLA: 99.99%)
• Own SLA commitments for all production systems: a 99.99% uptime target with documented contracts and breach escalation paths
• Define and maintain the on-call rotation and incident severity matrix
• Maintain and evolve runbooks for all critical failure scenarios, executable by anyone on the team, not just the author
• Implement and operate the full observability stack: enterprise observability tooling, centralized log aggregation, real-time alerting, and synthetic monitoring for all customer-critical paths
• Own customer-facing and internal system health dashboards that surface uptime, error rates, latency, and throughput in real time
Financial Operations & Cost Governance
• Build and maintain FinOps dashboards surfacing cloud spend, resource utilization, and cost-per-environment breakdowns
• Implement financial controls: budget alerts, tagging enforcement, reserved instance planning, and rightsizing across major cloud providers
• Own the monthly cloud cost review process: flag anomalies, model savings scenarios, and present findings to leadership
• Track and report FinOps KPIs: cloud cost as a percentage of revenue, waste percentage, and savings realized
Security & Compliance (SOC 2 / ISO 27001)
• Own the infrastructure contribution to SOC 2 Type II and ISO 27001: evidence collection, control mapping, and audit readiness at all times
• Conduct regular access reviews and enforce least-privilege IAM across all cloud environments
• Work directly with the compliance team to respond to auditor requests, close findings, and maintain continuous compliance posture
• Implement and monitor infrastructure security controls: network segmentation, encryption at rest and in transit, vulnerability scanning, and drift detection
AI-Driven Process Improvements
• Identify DevOps workflows where AI tooling can drive measurable efficiency gains: pipeline generation, intelligent incident detection, and automated runbook execution
• Evaluate and pilot AI-assisted tools for infrastructure: anomaly detection on metrics streams, LLM-assisted root cause analysis, and AI-generated IaC scaffolding
• Document and share findings with the engineering team, serving as the infrastructure voice in the company’s AI-first engineering culture
Qualifications
• Hands-on experience with major cloud providers across multi-cloud environments: provisioning, networking, IAM, and cost management
• Terraform or equivalent IaC tooling: you write it, maintain it, and own it
• CI/CD pipeline design and ownership using modern CI/CD platforms: you design it, maintain it, and own it
• Confident with .NET / C# application deployments
• Secrets and credential management at scale using enterprise secrets management platforms
• Multi-app, multi-environment deployment pipelines with consistent standards
• Mobile build pipeline ownership for cross-platform mobile applications (iOS and Android), including app store deployment automation
• Production observability engineering: APM tooling, log aggregation, alerting pipelines, and synthetic monitoring
• SLA ownership experience: on-call processes, incident management, and runbook authorship
• Demonstrated experience with FinOps practices: cloud cost dashboards, tagging, rightsizing, and budget governance
• Familiarity with SOC 2 or ISO 27001 control environments: evidence collection, access reviews, and audit support
• Comfortable working directly with a distributed offshore engineering team
• Hands-on experience designing or operating active-active, multi-cloud production environments (this is a core platform requirement, not a stretch goal)
• Proficiency in Go for infrastructure tooling, automation, and custom operators/controllers
• Strong Linux shell scripting (bash/sh) for automation, system administration, and pipeline scripting
Nice to have
• Exposure to AI-assisted DevOps tooling: anomaly detection, LLM-assisted incident response, or AI-generated IaC
• SaaS product background, ideally in a multi-tenant environment
• Experience with FinOps tooling, such as the cost management features in Azure and AWS
• Certifications such as AWS Certified DevOps Engineer – Professional, Microsoft Certified: DevOps Engineer Expert, or Certified Kubernetes Administrator (CKA)
