Senior Software Engineer - ML Infrastructure

Claryo
Posted 4 months ago, valid for a month
Location

San Francisco, CA 94102, US

Salary

$170,000 - $190,000 per year

Contract type

Full Time

By applying, a Sonicjobs account will be created for you. Sonicjobs's Privacy Policy and Terms & Conditions will apply.


Sonic Summary

  • We are seeking a Senior Software Engineer - ML Infrastructure to enhance our AI-driven warehouse intelligence platform.
  • The role requires 7+ years of professional software engineering experience, including at least 3 years focused on machine learning infrastructure.
  • You will be responsible for developing and maintaining distributed cloud GPU infrastructure and optimizing machine learning models for low-latency inference.
  • The position offers a competitive salary and benefits package, including top-tier medical coverage and unlimited vacation.
  • Collaboration is key, as you will work closely with research scientists and engineers in a hybrid work environment based in San Francisco.

We're looking for a Senior Software Engineer - ML Infrastructure to build and scale the infrastructure that powers our AI-driven warehouse intelligence platform. You'll own the end-to-end lifecycle of computer vision models, from training pipelines through optimized cloud deployment, ensuring our cutting-edge computer vision and multi-modal AI systems run reliably and efficiently in production. Your work will directly enable the real-time perception and autonomous decision-making capabilities at the core of our platform.

This is a deeply technical role at the intersection of machine learning, distributed systems, and cloud infrastructure. You'll design scalable GPU compute clusters, build robust orchestration pipelines, and optimize model serving for low-latency inference at scale. You'll work closely with our research scientists, computer vision engineers, and product teams to bridge the gap between experimental models and production-ready systems that operate across diverse warehouse environments. We've found tremendous value in collaborative problem-solving, so our team works from our SF office three days a week.

Responsibilities

  • Develop and maintain distributed cloud GPU infrastructure for large-scale world model training and low-latency inference.

  • Build end-to-end computer vision pipelines, from data ingestion and preprocessing through model training, evaluation, and deployment, and integrate them into core product workflows.

  • Deploy and optimize state-of-the-art machine learning models, including VLMs and VLAs, in the cloud using model serving platforms and inference optimization techniques.

  • Design and operate orchestration systems that enable both engineers and non-engineers to build and manage data and ML pipelines.

  • Establish monitoring, benchmarking, and evaluation frameworks to ensure model performance and reliability in production environments.

Required Experience

  • B.S. / M.S. in Computer Science, Robotics, or similar technical field, or equivalent practical experience.

  • 7+ years of professional software engineering experience, with at least 3 years in machine learning infrastructure: developing, scaling, training, deploying, and optimizing large-scale ML systems from data to model.

  • Track record of deploying machine learning models in production environments with real-world constraints.

  • Experience with distributed messaging and compute systems (Kafka, gRPC, ROS2, or similar).

  • Strong programming skills in Python with solid software engineering practices.

Preferred Experience

  • Experience with training and/or deployment of machine learning models in the computer vision domain.

  • Experience developing, running, and managing orchestration systems (Flyte, Temporal, Airflow, or similar) for ML and data pipelines.

  • Proficiency with ML frameworks (PyTorch, TensorFlow, DeepSpeed) and model serving platforms (TorchServe, TensorFlow Serving, NVIDIA Triton Inference Server, or similar).

  • Deep understanding of state-of-the-art machine learning models such as auto-regressive transformers and familiarity with inference optimization techniques (TensorRT, quantization, custom kernels).

  • Experience with C++ or CUDA programming for GPU acceleration.

  • Prior experience working at autonomous vehicle or robotics companies.

Equal Opportunity Statement

We're an equal opportunity employer that values diversity and inclusion. We welcome teammates of all backgrounds and don't discriminate based on race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.

Benefits

At Claryo, we offer a competitive benefits package that supports your health and well-being, including top-tier medical, dental, and vision coverage, 401(k) with employer matching, parental leave, and unlimited vacation.



