
Applied Machine Learning Research Engineer - Multimodal LLMs for Human Understanding

Apple
Posted 2 months ago, valid for 21 days
Location

Sunnyvale, CA 94086, US

Salary

Competitive

Contract type

Full Time

By applying, a Sonicjobs account will be created for you. Sonicjobs's Privacy Policy and Terms & Conditions will apply.


Sonic Summary

  • We are seeking an Applied Machine Learning Research Engineer to join our Video Computer Vision group at Apple.
  • The ideal candidate will have a Master's degree and a minimum of 3 years of relevant industry experience.
  • This role involves developing and tuning multimodal large language models and requires strong programming skills in Python.
  • Preferred qualifications include a PhD in a related field and expertise in areas such as computer vision and generative AI.
  • Salary details are not provided, but the position offers the opportunity to work on groundbreaking AI and computer vision projects.
We’re starting to see the incredible potential of multimodal foundation and large language models, and many applications in the computer vision and machine learning domain that previously appeared infeasible are now within reach. We are looking for a highly motivated and skilled Applied Machine Learning Research Engineer to join our team in the Video Computer Vision group and help us push the boundaries of human understanding. The Video Computer Vision org has pioneered human-centric, real-time features such as FaceID, FaceKit, and Gaze and Hand gesture control, which have changed the way millions of users interact with their devices. We balance research and product requirements to deliver Apple-quality, pioneering experiences, innovating through the full stack and partnering with HW, SW, and AI teams to shape Apple's products and bring our vision to life.

Description


You’ll work on groundbreaking research projects to advance our AI and computer vision capabilities, contribute to both foundational research and practical applications of multimodal large language models, and design, implement, and evaluate algorithms and models for human understanding. You have a strong background in developing and exploring multimodal large language models that integrate diverse data modalities such as text, image, video, and audio. You’ll have the opportunity to collaborate with multi-functional teams, including researchers, data scientists, software engineers, human interface designers, and application domain experts. You’ll stay up to date on the latest advancements in AI, machine learning, and computer vision, and apply this knowledge to drive innovation within the company.

Minimum Qualifications


Experience in developing and training/tuning multimodal LLMs. Strong programming skills in Python. Master's degree with a minimum of 3 years of relevant industry experience.

Preferred Qualifications


Expertise in one or more of: computer vision, NLP, multimodal fusion, or generative AI. Experience with at least one deep learning framework, such as JAX, PyTorch, or similar. Publication record in relevant venues. PhD in Computer Science, Electrical Engineering, or a related field with a focus on AI, machine learning, or computer vision.



Learn more about this Employer on their Career Site

Apply now in a few quick clicks
