Description
As a Manager for Site Reliability and Operations (SRE), you will lead a team of Site Reliability Engineers to ensure the reliability, scalability, and performance of production systems. This role combines technical expertise with leadership skills to drive operational excellence and foster a culture of collaboration and continuous improvement. You will work with the team to automate operations, optimize infrastructure, and troubleshoot issues in an exciting, fast-paced environment.

This role is designed for driven individuals who:
- Love learning new technologies and thrive on solving complex challenges.
- Are comfortable in a fast-paced, changing environment and able to manage competing priorities.
- Work effectively across teams and influence without authority.
- Are independent, motivated, and excited to take on ambitious projects.
- Excel at collaborating with engineering teams and stay calm under pressure.
- Have a passion for delivering quality, reliable solutions in a dynamic, high-energy workplace.
Minimum Qualifications
- BS degree or higher in Computer Science or a related field.
- 5+ years in a site reliability engineering, DevOps, or related role, with at least 2 years in a lead capacity.
- Strong understanding of systems architecture, cloud infrastructure, and monitoring tools.
- Proficiency in one or more programming languages, Java in particular.
- Proven experience in leading and mentoring engineering teams.
- Strong analytical skills and the ability to troubleshoot complex systems.
- Knowledge of the fundamentals of networking, databases, system administration, version control, and CI/CD automation. Machine learning experience is a plus.
- Strong problem-solving and communication skills.
Preferred Qualifications
- Knowledge of container-based technologies such as Docker, Kubernetes, or EKS.
- Knowledge of modern web services architectures and cloud platforms such as AWS and GCP.
- Exceptional analytical and troubleshooting skills in complex Unix/Linux system environments and application implementations.
- Ability to build tools from scratch.
- Ability to work in a collaborative environment.
