
Reliability Engineer

KOHLS
Posted 16 days ago, valid for 16 days
Location

Menomonee Falls, Waukesha County 53051, WI

Salary

Competitive

Contract type

Full Time

By applying, a Sonicjobs account will be created for you. Sonicjobs's Privacy Policy and Terms & Conditions will apply.


Sonic Summary

  • The Reliability Engineer role at Kohl's focuses on ensuring the resilience and availability of systems and applications.
  • Candidates are required to have a Bachelor's Degree in MIS, Computer Science, or a related field, along with a minimum of 2 years of experience in software development.
  • The position offers a salary of $100,000 per year and requires strong programming skills in languages such as Java, Python, Go, or Node.js.
  • Responsibilities include driving incident response efforts, implementing monitoring mechanisms, and collaborating with product teams to enhance operational excellence.
  • Preferred qualifications include experience with cloud platforms and monitoring tools, as well as knowledge of containerization and orchestration.

About the Role

As a Reliability Engineer, you will ensure the resilience and availability of Kohl’s systems and applications. You will collaborate closely with development teams to review designs, conduct risk assessments and implement robust monitoring and failover mechanisms.

What You’ll Do

  • Drive incident response efforts, perform root cause analysis and implement preventative measures to enhance system reliability

  • Establish consistent practices that elevate Kohl’s operational excellence through automation and process improvements

  • Follow the software lifecycle and drive reliability, observability and efficiency across product teams within an assigned domain

  • Identify repeated toil and find opportunities for automation and risk reduction

  • Participate in an on-call rotation to respond to production incidents, and conduct blameless retros and root-cause analyses (RCAs) to drive a culture of continuous improvement

  • Proactively identify potential failures before they cause outages, using chaos engineering, analysis of edge cases and failure modes, and design reviews

  • Advise on capacity planning and provide continuous assessments of system behavior and resource consumption

  • Work with product managers to identify and prioritize reliability work, applying best practices such as SLIs, SLOs and error budgets

  • Additional tasks may be assigned

What Skills You Have

Required

  • Bachelor's Degree or equivalent in MIS, Computer Science or related field

  • 2+ years of experience in software development

  • Strong programming skills in one or more languages (Java, Python, Go or Node.js)

  • Working knowledge of systems architecture, operating system internals and network fundamentals 

  • Experience working with at least one cloud platform (e.g., GCP, AWS, or Azure)

Preferred

  • Experience with monitoring techniques and tools (e.g., CloudWatch, Grafana, Prometheus, OpenTelemetry, distributed tracing)

  • Working knowledge of containerization and container orchestration (e.g., Docker, Kubernetes, Rancher)



