Responsibilities
- Perform hands-on installation, configuration, troubleshooting, and replacement of server hardware components including CPUs, memory, storage, and network interface cards across large-scale data center environments
- Execute rack-level deployments and decommissions in alignment with capacity planning schedules, ensuring accurate asset tracking and documentation
- Respond to hardware failure alerts and production incidents, diagnosing root causes and coordinating repairs to minimize mean time to recovery
- Partner with hardware engineering and network operations teams to validate new server configurations and support qualification testing of next-generation infrastructure
- Develop and refine standard operating procedures for recurring operational tasks, leveraging automation and AI-assisted workflows to improve efficiency and reduce manual effort
- Monitor data center infrastructure health using operational dashboards and tooling, identifying trends that indicate systemic hardware or environmental issues
- Collaborate with facilities and power and cooling teams to ensure physical infrastructure conditions remain within operational thresholds
- Contribute to capacity readiness reviews by tracking hardware inventory, flagging deployment blockers, and communicating status to cross-functional stakeholders
- Support continuous improvement initiatives by analyzing operational data, identifying recurring failure patterns, and proposing process or tooling enhancements
- Maintain accurate records of hardware assets, incident tickets, and change management activities in accordance with data center operational standards
Minimum Qualifications
- 2+ years of experience in data center operations, including hands-on server hardware installation, troubleshooting, and replacement in a production environment
- 2+ years of experience diagnosing and resolving hardware failures across server components such as CPUs, DIMMs, HDDs, SSDs, and network interface cards
- Experience following and contributing to standard operating procedures, change management processes, and asset tracking practices in a large-scale data center
- Experience collaborating with cross-functional teams including network engineering, facilities, and hardware engineering to resolve production issues and support infrastructure deployments
Preferred Qualifications
- Familiarity with data center infrastructure management systems for asset tracking, capacity planning, or incident management
- Experience with scripting or automation tools (such as Python or Bash) to streamline repetitive data center operational tasks
- Exposure to power and cooling systems, including working knowledge of how environmental factors affect server hardware reliability and performance
- Experience supporting hardware qualification or bring-up testing for new server platforms in a production or lab environment
$83,990/year to $130,000/year + bonus + equity + benefits
Learn more about this Employer on their Career Site
