
HPC Systems Engineer

SAIC
Posted 3 months ago, valid for 10 days
Location: Ivy, VA 22945, US
Salary: Competitive
Contract type: Full Time

By applying, a SAIC account will be created for you. SAIC's Privacy Policy and Terms & Conditions will apply.

SonicJobs' Terms & Conditions and Privacy Policy also apply.

Sonic Summary

  • SAIC is seeking a highly qualified HPC Systems Engineer to support the Army's Golden Dome initiative, focusing on Linux-based High Performance Computing cluster environments.
  • The role requires candidates to have experience with multi-node Linux cluster environments, workload scheduling platforms, and distributed compute workloads, with a preference for RHEL-based systems.
  • Responsibilities include cluster platform configuration, scheduler administration, performance analysis, and GPU compute workload support.
  • Candidates should have a minimum of 5 years of experience in relevant fields and be comfortable working in secure research environments.
  • The position offers a competitive salary of $130,000 to $150,000 annually, depending on experience.

SAIC is looking for a highly qualified HPC Systems Engineer to support the Army’s Golden Dome initiative. The engineer will support the deployment and sustainment of Linux-based High Performance Computing (HPC) cluster environments used for distributed compute workloads, simulation environments, and GPU-enabled processing.

The environment will include:

  • multi-node Linux compute clusters
  • workload scheduling platforms such as Slurm or PBS
  • cluster provisioning frameworks (e.g., xCAT, Warewulf)
  • high-performance networking technologies including RDMA / InfiniBand
  • distributed parallel compute workloads utilizing MPI or OpenMP
  • GPU-enabled compute resources supporting CUDA-based processing

 

The system will be used to support scientific computing, simulation workloads, and other distributed compute operations within a secure research environment.

Candidates should be comfortable working within cluster-scale computing environments where performance, scheduler configuration, and distributed workload execution are critical operational factors.

The HPC Systems Engineer will support the build-out, configuration, and sustainment of HPC cluster platforms.


The role focuses on:

  • cluster platform configuration
  • scheduler administration
  • distributed compute troubleshooting
  • performance analysis across compute, storage, and network layers
  • GPU compute workload support
  • automation and operational tooling

 

Candidates should have experience working with multi-node Linux cluster environments and distributed compute workloads.

Core Technical Capabilities

Candidates should demonstrate capability in most of the following areas.

HPC Cluster Platforms

Experience supporting multi-node Linux compute clusters, including node integration, configuration, and operational sustainment.

Experience with cluster provisioning tools such as xCAT, Warewulf, or similar node deployment systems is beneficial.

Workload Scheduling Platforms

Experience supporting distributed compute workloads using schedulers such as:

  • Slurm
  • PBS / PBS Pro
  • Torque
  • Grid Engine

 

Candidates should understand queue configuration, job submission workflows, and scheduler troubleshooting.

Candidates should understand how workload schedulers interact with distributed compute workloads and containerized execution environments.

Linux Systems Administration


Strong Linux administration experience including:

  • command-line system administration
  • server and compute node configuration
  • system troubleshooting in distributed compute environments

 

Experience with RHEL-based environments is preferred.

Distributed and Containerized Workloads


Experience supporting distributed compute workloads utilizing parallel computing frameworks such as:

  • MPI
  • OpenMP
  • GPU compute frameworks

Familiarity with container technologies commonly used in HPC environments such as:

  • Docker
  • Podman
  • Singularity / Apptainer

 

Candidates should understand how containerized workloads interact with schedulers, GPU resources, and distributed compute environments.

Experience supporting containerized HPC workloads or integrating container platforms with cluster infrastructure is desirable.

HPC Networking


Familiarity with high-performance networking technologies including:

  • RDMA networking
  • InfiniBand
  • high-throughput cluster networking architectures

 

Candidates should be comfortable assisting with troubleshooting cluster communication or performance issues.

GPU Compute Environments

Experience supporting GPU-enabled compute environments and workloads utilizing CUDA frameworks is desirable.


Automation and Operational Tooling

Experience writing scripts or operational tooling using languages such as:

  • Bash
  • Python

Automation experience supporting system administration or cluster operations is beneficial.
 


SAIC® is a premier Fortune 500® mission integrator focused on advancing the power of technology and innovation to serve and protect our world. Our robust portfolio of offerings across the defense, space, civilian and intelligence markets includes secure high-end solutions in mission IT, enterprise IT, engineering services and professional services. We integrate emerging technology, rapidly and securely, into mission critical operations that modernize and enable critical national imperatives.

We are approximately 24,000 strong; driven by mission, united by purpose, and inspired by opportunities. SAIC is an Equal Opportunity Employer. Headquartered in Reston, Virginia, SAIC has annual revenues of approximately $7.5 billion. For more information, visit saic.com. For ongoing news, please visit our newsroom.


