Software Engineer, Large Scale Pre-Training Performance

Place of work Mountain View
Start date 4 days ago

Job details

Job description, work day and responsibilities

At Google DeepMind, we value diversity of experience, knowledge, backgrounds and perspectives and harness these qualities to create extraordinary impact. We are committed to equal employment opportunity regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy, or related condition (including breastfeeding) or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.

Snapshot

We are seeking a software engineer to define, drive, and critically contribute to the next generation of state-of-the-art ML models on TPUs. As part of the Pre-Training team, you will co-design the model and implement critical components across model architecture, ML frameworks, custom kernels and platform to deliver frontier models with maximum efficiency.

About Us

Artificial Intelligence could be one of humanity’s most useful inventions. At Google DeepMind, we’re a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.

The Role

We’re looking for a Software Engineer to redefine efficient training of frontier LLMs at massive scale. This role offers an opportunity to influence the design of frontier LLMs and to drive an effort to ensure efficient training and inference.

Key responsibilities:
• Being responsible for Pre-Training efficiency and optimizing the performance of the latest models on Google’s fleet of hardware accelerators throughout the entire LLM research, training and deployment lifecycle.
• Guiding model design to ensure inference efficiency.
• Greatly improving the performance of LLMs on hardware accelerators by optimizing at all levels, including developing custom kernels when necessary.
• Collaborating with the compiler, framework, and platform teams to ensure efficient training at the industry's largest scale.
• Profiling models to identify performance bottlenecks and opportunities for optimization.
• Developing low-level custom kernels for maximum performance of the most critical operators.
• Collaborating with research teams by enabling new critical operators in advance of their availability in frameworks and compilers.

About You

You're an engineer looking to redefine efficient training of frontier LLMs at massive scale and have:
• A proven track record of critical contributions to the distributed training of LLMs at the 1e25 FLOPs scale on modern GPU/TPU clusters
• Experience in programming hardware accelerators (GPUs/TPUs) via ML frameworks (e.g. JAX, PyTorch) and low-level programming models (e.g. CUDA, OpenCL)
• Experience in leveraging custom kernels and compiler infrastructure to improve performance on hardware
• Experience with Python and neural network training (publications, open-source projects, relevant work experience, etc.)

The US base salary range for this full-time position is $235,000 to $350,000 + bonus + equity + benefits. Your recruiter can share more about the specific salary range for your targeted location during the hiring process.

Application deadline: March 12, 2025

Note: In the event your application is successful and an offer of employment is made to you, any offer of employment will be conditional on the results of a background check, performed by a third party acting on our behalf. For more information on how we handle your data, please see our Applicant and Candidate Privacy Policy.

Company address

United States
California
Mountain View
Company Name: Google DeepMind
Offer ID: #1035503, Published: 4 days ago, Company registered: 7 months ago
