News and Announcements

AMD Competition Success: 30K+ Submissions and Recognition at Advancing AI (June 2025)

We are thrilled to share that GPU MODE was recognized on stage by Dr. Lisa Su at the Advancing AI closing ceremony, where she said "I wanted to thank the GPU MODE team formed by talented developers from Meta, Hugging Face and MIT, they have been great partners throughout and we could not have done this without them." Back when GPU MODE was just a humble reading group, we never imagined we would be recognized on stage by one of the greatest CEOs of our time.

Lisa Su recognizing GPU MODE at Advancing AI

We were missing the giga cracked Erik (ngc92)

Our team built the infrastructure for the AMD $100K kernel competition, which ran for 2 months and saw remarkable participation: over 30,000 submissions from 163+ teams. This volume exceeds the total number of kernels collected in KernelBook from crawling all of GitHub, and it represents a significant milestone in aggregating higher-quality kernel data.

The results have been outstanding: the best competition kernels are faster than AMD's AITER baselines, and each is implemented in a single file. It was an absolute pleasure meeting some of the top teams in person, including Seb, hatoo, Snektron, and the grand prize winners, ColorsWind.

You can see the full results here.

Several top competitors have generously shared their techniques:

We're planning to release all submissions as a permissively licensed dataset, with each solution representing unique tradeoffs between usability and performance. We're working closely with ROCm engineers to upstream the best kernels to PyTorch, leveraging its position as the premier distribution vehicle for kernels.

In exciting academic news, our KernelBot platform has been accepted to the ICML CodeML workshop with two strong accepts! Reviewer #2 highlighted the virtuous loop we created: "The paper presents KernelBot, a platform for hosting code optimization competitions, specifically for GPU kernels. Users can submit their implementations and let the system rank them. This serves to (i) educate users how to write efficient GPU kernels, (ii) improve the efficiency of existing GPU kernels, and (iii) collect high quality data for GPU programs that can be used to train generative models."

A big thank you to everyone involved in Popcorn for the inspiration, to the discord.gg/gpumode community, and of course to our amazing collaborators at AMD for making this possible.

AMD Developer Challenge 2025: Inference Sprint (April 15, 2025)

We are excited to announce the $100K competition hosted by @AMD as part of the first round of our GPU kernel leaderboard! The theme is writing LLM inference kernels on AMD MI300s, provided for free through the GPU MODE Discord.

You (and anyone around the world) can participate for FREE by signing up here. The format consists of three kernels that are core to DeepSeek's LLM inference: FP8 GEMM, Multi-Head Latent Attention, and Fused MoE!

Competitors will target the AMD MI300 using Triton, but other DSLs and languages that target AMD hardware are permitted! Participants can form teams of three when competing, and prizes will be awarded based on team rankings averaged over the three kernels.
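The announcement does not spell out the exact scoring formula. As a rough sketch only, assuming each team receives a rank on each of the three kernels (1 = fastest) and the final standing is the plain mean of those ranks, the aggregation might look like the following. The team names, rank values, and the `average_rank_standings` helper are hypothetical and exist purely for illustration; the real leaderboard may weight or break ties differently.

```python
from statistics import mean

# Kernel names follow the announcement; the dictionary keys are our own labels.
KERNELS = ("fp8_gemm", "mla", "fused_moe")


def average_rank_standings(ranks_by_team):
    """Return (team, mean rank) pairs sorted from best to worst.

    Assumption: lower rank is better and all three kernels count equally.
    """
    averaged = {
        team: mean(ranks[kernel] for kernel in KERNELS)
        for team, ranks in ranks_by_team.items()
    }
    # Sort by mean rank, then by team name so ties are ordered deterministically.
    return sorted(averaged.items(), key=lambda item: (item[1], item[0]))


# Hypothetical per-kernel ranks for three made-up teams.
example_ranks = {
    "team_a": {"fp8_gemm": 1, "mla": 4, "fused_moe": 2},
    "team_b": {"fp8_gemm": 3, "mla": 1, "fused_moe": 2},
    "team_c": {"fp8_gemm": 2, "mla": 2, "fused_moe": 4},
}

standings = average_rank_standings(example_ranks)
for team, avg in standings:
    print(f"{team}: {avg:.2f}")
```

Under this assumed scheme a team that is consistently near the top across all three kernels can place ahead of a team that wins one kernel but ranks poorly on another.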

Special thanks to @AMD, @indianspeedster, and the amazing GPU MODE Project Popcorn core devs, @a1zhang, @m_sirovatka, @marksaroufim, ngc92 (Erik S.), and @b9r5 for making this competition possible.

We're very excited to accelerate AI research by building on kernels, and we're super grateful for all of the support from the community!

If you're interested in collaborating with us on future competitions / kernels you think are important, definitely reach out to any one of us on the Popcorn team!

Get started now! The first kernel is the FP8 Groupwise GEMM.

2025 AMD Developer Challenge: Inference Sprint