Overview

As a Senior Research Engineer at Microsoft, you will advance Microsoft’s mission to empower every person and every organization to achieve more. You will help build and integrate cutting-edge AI into Microsoft products and services within the Experience + Devices (E+D) organization, ensuring solutions are inclusive, ethical, and impactful. This role blends applied research, machine learning engineering, and product innovation. You will lead efforts to ship reliable, production-grade AI systems across the stack, from model development and fine-tuning to performance optimization and deployment.

We are in an era of unprecedented AI innovation. As Microsoft leads the way in foundation models, multimodal systems, and AI agents, our goal is to build an open architecture platform where users can interact with tailored AI agents that drive tangible, real-world outcomes. As a Senior Research Engineer, you will:

Bridge the gap between state-of-the-art research and customer-facing features

Drive systems-level innovation across models, infrastructure, and deployment

Champion responsible AI by embedding fairness, safety, privacy, and performance from the ground up

Responsibilities

Bringing State-of-the-Art Research to Products

Design and implement AI systems using foundation models, prompt engineering, retrieval-augmented generation, multi-agent architectures, and classic ML

Fine-tune large language models on domain-specific data and evaluate via offline and online methods such as A/B testing, telemetry, and shadow deployments

Build and harden prototypes into production-ready services using robust software engineering and MLOps practices

Drive original research and thought leadership (whitepapers, internal notes, patents); convert insights into shipped capabilities

Research Translation: Continuously review emerging work; identify high-potential methods and adapt them to Microsoft problem spaces

End-to-End System Development

ML Design & Architecture: Own the end-to-end pipeline, from data preparation and training through evaluation, deployment, and feedback loops

Identify and resolve model quality gaps, latency issues, and scale bottlenecks using PyTorch or TensorFlow

Operate CI/CD and MLOps workflows including model versioning, retraining, evaluation, and monitoring

Integrate AI components into Microsoft products in close partnership with engineering and product teams

Data-Driven Innovation

Evaluation & Instrumentation: Build robust offline/online evals, experimentation frameworks, and telemetry for model/system performance.

Learning Loop Creation: Operationalize continuous learning from user feedback and system signals; close the loop from experimentation to deployment.

Experimentation & E2E Validation: Design controlled experiments, analyze results, and drive product/model decisions with data.

Develop proofs of concept that validate ideas quickly at realistic scales

Curate high-signal datasets, including synthetic and red-team corpora, and establish labeling protocols and data quality checks tied to evaluation KPIs

Cross-Functional Collaboration

Partner with software engineers, scientists, designers, and product managers to deliver high-impact AI features

Translate research breakthroughs into scalable applications aligned with product priorities

Communicate findings and decisions through internal forums, demos, and documentation

Responsible AI & Ethics

Identify and mitigate risks related to fairness, privacy, safety, security, hallucination, and data leakage

Uphold Microsoft’s Responsible AI principles throughout the lifecycle

Contribute to internal policies, auditing practices, and tools for responsible AI

Operating Altitudes

Paper level (ideas and math): Read, critique, and adapt the latest research; identify gaps; design methods with clear trade-offs and guarantees; communicate complex ideas clearly.
Example: “This objective is brittle under our data regime. Here is a tighter analysis and a revised loss we can test this sprint.”

Code level (implementation): Turn ideas into robust, tested, maintainable modules; integrate with CI/CD; profile and optimize for latency and throughput.
Example: “Refactored the prototype into a reusable PyTorch component, added unit tests and benchmarks, and cut P95 inference latency by 30%.”

Specialty Technical Areas

Large-scale training and fine-tuning of LLMs, vision-language, or multimodal models

Multi-agent systems, dialogue agents, and copilots

Optimization of inference speed, accuracy, reliability, and cost in production

Retrieval systems and hybrid architectures using RAG and vector databases

ML for real-world data constraints such as missing data, noisy labels, and class imbalance

Qualifications

Required Qualifications

Bachelor’s degree in Computer Science, Engineering, Mathematics, Statistics, Physics, or a related field and 4 or more years in applied ML or AI research and product engineering,
- OR Master’s degree and 3 or more years in applied ML or AI research and product engineering,
- OR PhD in a relevant field and 2 or more years with generative AI, LLMs, or related ML algorithms

Other Requirements

Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Preferred Qualifications

PhD in AI/ML or related field with top-venue publications and/or patents

Experience with Microsoft’s LLMOps stack: Azure AI Foundry, Azure Machine Learning, Semantic Kernel, Azure OpenAI Service, and Azure AI Search for vector/RAG

Familiarity with responsible AI evaluation frameworks and bias mitigation methods

Experience across the product lifecycle from ideation to shipping

Proficiency in Python and at least one deep learning framework such as PyTorch, JAX, or TensorFlow

Experience deploying fine-tuned LLMs or multimodal models in live production environments

Experience shipping and maintaining production AI systems

Software Engineering IC4 - The typical base pay range for this role across the U.S. is USD $119,800 - $234,700 per year. A different range applies to specific work locations within the San Francisco Bay Area and New York City metropolitan area; the base pay range for this role in those locations is USD $158,400 - $258,000 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
https://careers.microsoft.com/us/en/us-corporate-pay

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.
