- We develop mission-critical solutions across the land, sea, air, space, and cyber domains.
- Career advancement
- Autonomy in the workplace
- Well-liked management
What you will be doing: Dev Ops System Administrator
- Design, implement, and maintain scalable and robust infrastructure for AI/ML model training and inference.
- Develop and manage CI/CD pipelines for automated building, testing, and deployment of AI applications and machine learning models.
- Administer and optimize Linux-based systems and virtualized environments.
- Manage containerization and orchestration platforms (e.g., Docker, Kubernetes) to deploy and scale ML services.
- Automate infrastructure provisioning, configuration management, and deployment processes using Infrastructure as Code (IaC) tools like Ansible or Terraform.
- Manage and allocate GPU resources efficiently for model training and other high-performance computing tasks.
- Implement and maintain monitoring, logging, and alerting systems to ensure platform health and performance.
- Collaborate with development teams to support their infrastructure needs and troubleshoot issues.
Experience you will need: Dev Ops System Administrator
- Bachelor’s degree in Computer Science or a related field (or equivalent experience) plus a minimum of 8 years of relevant experience; or a Master’s degree plus 6 years of relevant experience.
- Department of Defense TS/SCI with Polygraph security clearance is required at time of hire.
- Advanced understanding of server-based operating systems.
- Strong Linux, container, and AI skills.
- Subject matter expert (SME) with the ability to mentor others on administering the server environment.
- Advanced troubleshooting skills across the server OS, networking, and storage technologies.
- Hands-on experience developing, deploying, and supporting large-scale enterprise server solutions.
- Experience or familiarity with AI/ML models is preferred.
