This page provides an overview of past editions of the MLIR school. When teaching materials are available, they can be accessed by clicking on the corresponding course title.

🇫🇷 2025 Paris Winter School

In recent years, MLIR-based compilers have seen widespread adoption in both academia and industry. MLIR has become a de facto standard in many sectors and a shared compiler infrastructure, allowing new projects to build upon sound and reliable existing solutions. This winter school aims to facilitate the discovery and adoption of this growing, dynamic and innovative software ecosystem.

This event is funded by DeepGreen, a France 2030 project that aims to develop the “Eclipse AIDGE” open-source platform for embedded AI. The platform provides high-level transformations (compression) on a graph-based IR and aims to provide MLIR-based optimised backends for various architectures.

January 27: ML Compilation Infrastructure Workshop at Google Paris

Timetable Speaker Talk
10am Hugo Pompougnac Opening words
10am – 10:40am Saday Sadayappan (U. of Utah) Machine Learning and Compiler Optimization
A fundamental challenge in many aspects of compiler performance optimization is the development of effective performance models. The space of alternative transformed code versions is often explosively large, and determining which of them would perform best is very challenging because accurate analytical performance models are difficult to develop. There has therefore been interest in using machine learning for performance modeling in optimizing compilers. This talk will discuss some current directions being pursued.
10:40am – 11am Kunwar Grover (AMD) Demystifying Different Attention Variants
Transformer-based models have received unprecedented attention lately. One of the main bottlenecks for these models is the Attention layer, which can sometimes dominate the entire execution time of a model. A number of recent research papers have focused on algorithms for generating code for different Attention layer variants efficiently. The aim of this talk is to show how most of these proposed algorithms can be thought of as kernel fusions or known matrix multiplication optimizations. The talk will derive a base implementation of FlashAttention and use existing matrix multiplication techniques such as split-k as a model to derive different algorithms for attention variants. The talk will also cover how the MLIR-based compiler IREE generates code for these kernels.
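The core trick behind FlashAttention-style kernels is the online softmax: processing keys and values in a stream while maintaining a running maximum and denominator, so the full score row never has to be materialized. The following stdlib-only Python sketch (not IREE's or any speaker's code; dimensions and names are illustrative) contrasts the naive formulation with the streaming one for a single query vector:

```python
import math

def attention_reference(q, ks, vs):
    """Naive attention for one query vector: softmax(q . K^T) @ V."""
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in ks]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    denom = sum(exps)
    dim = len(vs[0])
    return [sum(e * v[d] for e, v in zip(exps, vs)) / denom for d in range(dim)]

def attention_online(q, ks, vs):
    """Same result, but streaming over K/V with a running max and denominator,
    rescaling the accumulator whenever the max grows (the online-softmax idea
    that FlashAttention applies block by block)."""
    m = float("-inf")          # running max of scores seen so far
    denom = 0.0                # running softmax denominator
    acc = [0.0] * len(vs[0])   # running unnormalized output
    for k, v in zip(ks, vs):
        s = sum(qi * ki for qi, ki in zip(q, k))
        m_new = max(m, s)
        scale = math.exp(m - m_new) if denom > 0.0 else 0.0
        e = math.exp(s - m_new)
        denom = denom * scale + e
        acc = [a * scale + e * vd for a, vd in zip(acc, v)]
        m = m_new
    return [a / denom for a in acc]
```

Both functions compute the same output; only the memory behaviour differs, which is what makes the streaming form fusable into a single kernel.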
11am – 11:20am   Coffee break
11:20am – 12pm Jorn Tuyls (AMD) MLIR-based Data Tiling Design for Ryzen AI
This workshop talk presents code generation for the NPU in Ryzen AI laptops and focuses mainly on the data tiling and data packing aspects of it. It shows how data tiling maps to the NPU’s cores and caches and how it can describe granular data movement patterns through the NPU’s data streams.
12pm – 12:40pm Sylvain Noiry (INRIA CORSE/Kalray) Translate MLIR into efficient C code for both a Host and an Accelerator
There are many reasons to generate code from MLIR to standard programming languages like C. In the context of offloading, it can be necessary to handle the host and accelerator code in different ways, with different compilers. Accelerator-specific compilers are often out of date, or not based on llvm-project at all.
The use case will be the Kalray MPPA accelerator, which comes with a low-level offloading library that can be exposed in MLIR. The kernels are compiled using a GCC-based compiler and can rely on hardware-specific micro-kernels or builtins to fully exploit the machine. This presentation will focus on the major changes made to the MLIR-to-C code generator already present in MLIR-AIE in order to target the Kalray accelerator. These include separating the host-related and offloaded parts from a single MLIR input, generating code for hardware-specific operations, and a discussion of the trade-off between transparency and complexity.
12:40pm – 1pm Sasha Lopoukhine (U. of Cambridge) Hardware-Informed Domain-Specific Transformations
1pm – 2:30pm   Lunch and informal discussions
2:30pm – 3:10pm Olivier Bichler (CEA) Towards AI model and architecture co-optimization for ultra-specialized hardware generation with Aidge
Aidge is a generic, multi-paradigm tool for compute graph manipulation, quantization, mapping, scheduling, and code generation. Its primary targets are embedded systems with specialized hardware accelerators, especially dataflow or restricted-instruction-set architectures. It is highly interoperable thanks to built-in ONNX import/export and a direct PyTorch interface, and its modularity allows any of its features to be used standalone or in conjunction with other tools along the path of deploying a model to an embedded system. In this presentation, we will go over some of the main differentiating features of the framework and the contexts in which they may shine!
3:10pm – 3:50pm Christophe Guillon (INRIA CORSE) For Transparent and Modular Compiler Autotuning
Automatic tuning of compiler optimizations spans various levels of abstraction and multiple fields of computer science. Programming the underlying software components demands not only expertise in each area but also a deep understanding of compiler internals. As a result, fully leveraging these compilers can be challenging, often giving the impression of dealing with a black box.
In this talk, we present several ideas aimed at enhancing the ergonomics and modularity of these compilers in the areas of search-space definition, statistical model construction, and transformation-language definition (and its integration with the code generator). The talk will conclude with a demonstration in the context of the Aidge framework.
3:50pm – 4:10pm Mathieu Fehr (U. of Edinburgh) Formal Semantics as MLIR Dialects
MLIR is designed to be modular and extensible, allowing for the definition of custom IRs. However, MLIR is primarily focused on syntax and does not provide a way to formally define the semantics of operations. This makes it difficult to reason about the correctness of transformations and analyses, and is a barrier to the development of formal verification tools for MLIR-based compilers. In this talk, we will introduce a set of semantics dialects, based on SMT-LIB, which allow the semantics of MLIR dialects to be defined as a compiler transformation. We will show how these semantics dialects can give semantics to core MLIR dialects such as arith, comb, and memref, and how this new abstraction can be used to build formal verification tooling such as a translation validation tool, a peephole rewrite verifier and synthesizer, and a dataflow analysis verifier.
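To give a feel for what a translation validation tool checks, here is a stdlib-only Python miniature. The real tooling described above discharges such queries with an SMT solver over SMT-LIB bitvectors; this sketch (my illustration, not the speakers' tool) substitutes exhaustive enumeration over a small bit width, which it states plainly:

```python
from itertools import product

WIDTH = 8
MASK = (1 << WIDTH) - 1

def validate(before, after, n_args=1):
    """Check that two operations agree on all WIDTH-bit inputs, returning
    (True, None) or (False, counterexample). A real translation-validation
    tool would ask an SMT solver instead; brute force suffices for a toy
    bit width."""
    for args in product(range(1 << WIDTH), repeat=n_args):
        if (before(*args) & MASK) != (after(*args) & MASK):
            return False, args
    return True, None

# A sound rewrite under modular arithmetic: x + x  ==>  x << 1.
ok, _ = validate(lambda x: x + x, lambda x: x << 1)

# An unsound rewrite: x * 2  ==>  x | 1 has a counterexample (e.g. x = 0).
bad, cex = validate(lambda x: x * 2, lambda x: x | 1)
```

The same query shape (equivalence of a `before` and `after` program over all inputs) is what the semantics dialects make expressible directly on MLIR IR.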
4:10pm – 4:30pm   Coffee break
4:30pm – 5:10pm Dumitru Potop-Butucaru (INRIA KAIROS) Reactive programming is all you need
ML programming today still involves three different levels, using different formalisms and practices: that of layers, that of models, and that of the driver logic controlling the inference, training, or reinforcement learning process. […] In this presentation we will show that (dataflow) reactive languages provide a more natural programming paradigm encompassing layer, model, driver, and also I/O processing and possibly the low-level scheduling of operations.
Reactive control primitives can be included as a dialect inside MLIR and then used jointly with tensor functions to allow modular, hierarchical, and stateful specification covering all three levels: layer, model, and driver. The specification can then be seamlessly compiled into efficient code. Seen as a high-level specification language, the same reactive primitives support modular automatic differentiation and fully automatic synthesis of the parameter-update code without the need to expose parameters at the top level.
5:10pm – 6:10pm Everyone Community Roundtable: Why are you here?
Each participant, one minute each.

January 28, Winter School Day 1

Timetable Speaker Talk
9am – 10am Mehdi Amini MLIR is SSA + Regions + Dialects
10am – 10:30am   Coffee break
10:30am – 12pm Mathieu Fehr & Sasha Lopoukhine Interacting with MLIR/xDSL!
This session will include a hands-on tutorial to learn how to interact with xDSL to optimize and compile programs. While this tutorial will be focused on xDSL, the concepts and tools presented will be applicable to MLIR as well. We will showcase a few MLIR IR files, and show how to interpret them using the xdsl-opt tool. We will then present how to build a pipeline of passes to optimize these files and to generate low-level code.
12pm – 2pm   Lunch and informal discussions
2pm – 3:30pm Mathieu Fehr & Sasha Lopoukhine Defining dialects & rewrites with xDSL
This session will be a hands-on tutorial on how to write new dialects and passes in xDSL. In particular, we will present how a pass composed of simple peephole rewrites (local optimizations) can be written in xDSL. We will task the participants with extending a high-level dialect with a new operation, and then with extending an optimization and a transformation to a low-level dialect to support that new operation. While these tasks will be done in xDSL for simplicity, the concepts will be applicable to MLIR as well.
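As a taste of the kind of peephole rewrite the tutorial targets, here is a dependency-free Python miniature (deliberately not the xDSL API, whose pattern classes evolve between releases; `Op` and `fold_add_zero` are made-up names for illustration). It folds `addi(x, const 0)` down to `x` over a tiny SSA-like expression tree:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Op:
    """A miniature SSA operation: a name, operand Ops, and an optional
    constant payload (used only by "const" ops)."""
    name: str
    operands: Tuple["Op", ...] = ()
    value: Optional[int] = None

def fold_add_zero(op: Op) -> Op:
    """Peephole rewrite addi(x, const 0) -> x (and the symmetric case),
    applied bottom-up over the use-def tree."""
    op = Op(op.name, tuple(fold_add_zero(o) for o in op.operands), op.value)
    if op.name == "addi":
        lhs, rhs = op.operands
        if rhs.name == "const" and rhs.value == 0:
            return lhs
        if lhs.name == "const" and lhs.value == 0:
            return rhs
    return op
```

In xDSL (as in MLIR), the same match-and-replace logic is expressed as a rewrite pattern object handed to a pattern applier rather than a hand-rolled traversal, but the shape of the task is identical.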
3:30pm – 4pm   Coffee break
4pm – 5pm   Install the MLIR LLVM distribution
7pm   Dinner

January 29, Winter School Day 2

Timetable Speaker Talk
9am – 10am   Install the MLIR LLVM distribution: fix the last issues
10am – 10:30am   Coffee break
10:30am – 12pm Mehdi Amini End-to-end MLIR compilation
12pm – 2pm   Lunch and informal discussions
2pm – 3:30pm William Moses Automatic Differentiation in MLIR
Automatic differentiation (AD) is key to training neural networks, Bayesian inference, and scientific computing. Applying these techniques requires rewriting code in a specific machine learning framework or manually providing derivatives. This talk presents Enzyme, a high-performance automatic differentiation compiler plugin for the LLVM and MLIR compiler frameworks. Enzyme differentiates programs in any language whose compiler targets LLVM/MLIR, including C/C++, Fortran, Julia, Rust, Swift, JAX, etc., thereby providing native AD capabilities in these languages with state-of-the-art performance. Unlike traditional tools, Enzyme performs AD on optimized IR. On a combined machine-learning and scientific computing benchmark suite, AD on optimized IR achieves a geometric mean speedup of 4.2x over AD on IR before optimization.
This talk will also include work that makes Enzyme the first fully automatic reverse-mode AD tool to generate gradients of existing GPU kernels as well as the benefits of operating within high-level structured representations, like MLIR.
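For readers new to reverse-mode AD, the following stdlib-only Python sketch shows the underlying mechanism: record each operation with its local derivatives, then sweep the graph in reverse topological order accumulating gradients. This is my toy illustration of the technique in general, not Enzyme's implementation, which operates on compiler IR rather than Python objects:

```python
class Var:
    """A value in a tiny tape-based reverse-mode AD sketch."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent Var, local gradient)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def backward(out):
    """Accumulate d(out)/d(v) into v.grad for every Var reachable from out,
    visiting nodes in reverse topological order."""
    topo, seen = [], set()
    def visit(v):
        if id(v) not in seen:
            seen.add(id(v))
            for parent, _ in v.parents:
                visit(parent)
            topo.append(v)
    visit(out)
    out.grad = 1.0
    for v in reversed(topo):
        for parent, local in v.parents:
            parent.grad += v.grad * local
```

For example, for `z = x * y + x * x` with `x = 3` and `y = 4`, `backward(z)` yields `dz/dx = y + 2x = 10` and `dz/dy = x = 3`.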
3:30pm – 4pm   Coffee break
4pm – 5:30pm Marius Brehler Fluent Machine Learning with torch-mlir
7pm   Dinner

January 30, Winter School Day 3

Timetable Speaker Talk
9am – 10:30am Alex Zinenko Structured code generation
Native high-level code generation support in MLIR is largely based on the idea of structured code generation, which is often mistaken for being synonymous with the linear algebra (Linalg) dialect. Instead, the structured code generation approach evolved hand-in-hand with the progressive lowering philosophy of MLIR and permeates most of its dialects involved in code generation. This talk attempts to demystify structured code generation in MLIR by introducing the relevant concepts bottom-up, from individual arithmetic operations on scalars, to single-instruction multiple-data (SIMD) operations on vectors, to manipulations of multi-dimensional tensors. Using small examples and illustrations, it demonstrates that this approach boils down to a handful of concepts largely present in modern hardware, though under slightly different terminology. It does not require a deep understanding of MLIR or of any specific dialect.
10:30am – 11am   Coffee break
11am – 12:30pm Alex Zinenko Using MLIR from C and Python
MLIR, like the rest of LLVM, is primarily written in C++. However, the C++ API is known to be complex and unstable. Moreover, both quick prototyping and deep integration with client frameworks call for using different languages to work with MLIR, most often Python for its simplicity and C for its ubiquity. This talk will present the MLIR C API and demonstrate how it is used to construct Python bindings. Attendees of this talk will learn how to expose custom dialects in both C and Python, as well as how to leverage the C API to interact with MLIR from different languages.
12:30pm – 2pm   Lunch and informal discussions
2pm – 3:30pm Alex Zinenko Controllable transformations in MLIR
MLIR features support for declaratively specifying and controlling compiler transformations via the transform dialect. It allows one to request compiler transformations using compiler IR itself, which can be embedded into the original IR being transformed (similarly to pragmas) or supplied separately (similarly to scheduling languages). This talk presents the concepts of the MLIR transform dialect and related infrastructure. It will be accompanied by a practical demonstration of three use scenarios. After completing the tutorial, attendees will be able to apply the transform dialect in their work and extend it when necessary. Basic familiarity with MLIR is a prerequisite.
3:30pm – 4pm   Coffee break
4pm – 5:30pm Matthias Springer The pattern rewrite infrastructure in MLIR
Pattern-based IR rewriting through the greedy pattern rewriter and the dialect conversion framework is widely used and one of the core mechanisms of MLIR. This session is a hands-on introduction to the pattern API and the pattern drivers, along with some best practices that programmers can follow when designing pattern-based rewrites. Topics that will be covered include: the rewrite pattern API, the greedy pattern rewrite driver, the walk pattern driver, the conversion pattern API, the type converter API, dialect conversion, 1:N conversions, declarative pattern definition with PDL, the canonicalizer pass, transform dialect integration, and debugging strategies for pattern-based rewrites.
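The essence of a greedy pattern driver is easy to state: rewrite operands first, then retry every pattern on the result until a fixpoint is reached. The stdlib-only Python sketch below (my illustration; MLIR's driver additionally maintains worklists, benefit ordering, and in-place IR mutation) makes that loop concrete on expressions encoded as tuples:

```python
def fold_mul_one(op):
    """Pattern: mul(x, const 1) -> x. Returns the replacement or None."""
    if op[0] == "mul" and op[2] == ("const", 1):
        return op[1]
    return None

def fold_add_consts(op):
    """Pattern: add(const a, const b) -> const (a + b)."""
    if op[0] == "add" and op[1][0] == "const" and op[2][0] == "const":
        return ("const", op[1][1] + op[2][1])
    return None

PATTERNS = [fold_mul_one, fold_add_consts]

def greedy_rewrite(op):
    """A miniature greedy driver: rewrite operands bottom-up, then retry
    every pattern on the rebuilt op until no pattern applies."""
    if op[0] in ("const", "arg"):       # leaves carry no operands
        return op
    op = (op[0],) + tuple(greedy_rewrite(o) for o in op[1:])
    for pattern in PATTERNS:
        replacement = pattern(op)
        if replacement is not None:     # a rewrite fired: restart on the result
            return greedy_rewrite(replacement)
    return op
```

For instance, `("mul", ("add", ("const", 2), ("const", 3)), ("const", 1))` folds first to `mul(const 5, const 1)` and then, via the second pattern firing on the rebuilt op, to `("const", 5)`.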
7pm   Dinner

January 31, Winter School Day 4

Timetable Speaker Talk
9am – 10:30am Mathieu Fehr Defining dialects: ODS/TableGen and C++
After learning the day before how to define operations, rewrites, and passes in xDSL, we will now present how to transfer this knowledge to MLIR. We will present how an MLIR dialect is structured in C++ and TableGen (a metaprogramming tool used by MLIR), and how to define new operations, attributes, and types. We will also present how to write new passes in C++, and how to write peephole rewrites using the MLIR pattern rewrite infrastructure.
10:30am – 11am   Coffee break
11am – 12:30pm Lorenzo Chelini C/C++ and abstraction raising with Polygeist
The MLIR ecosystem has been rapidly evolving, offering powerful abstractions for building domain-specific compilers and optimizing intermediate representations. However, it currently lacks a robust C and C++ frontend. While Clang provides excellent support for targeting LLVM IR, targeting MLIR directly from C and C++ opens up new opportunities for innovation. This talk will introduce Polygeist and demonstrate how it bridges the gap between C or C++ and MLIR, enabling better integration with higher-level abstractions, preserving high-level semantics such as structured control flow and parallelism (e.g., OpenMP/GPU), and supporting the lowering or raising of constructs to user-defined custom operations. Attendees will gain valuable insights into how to use Polygeist and learn about ongoing research directions in this area.
12:30pm – 2pm   Lunch and informal discussions
2pm – 3:30pm Sasha Lopoukhine Hardware-specific lowering in MLIR
MLIR’s design makes it easy to extend existing compiler pipelines with custom transformations and abstractions. Most existing MLIR-based compilers lower their code via LLVM, benefitting from extensive compiler infrastructure. However, LLVM’s backends may be optimised for best general performance, and may not be suitable for scenarios where precise control and extensibility are desired. This workshop covers an alternative flow, leveraging assembly dialects in MLIR to output assembly for linear algebra micro-kernels, with a mix of standard ISAs as well as custom extensions.
3:30pm – 4pm   Coffee break
4pm – 5:30pm Kunwar Grover IREE, its runtime and its dialects
5:30pm Fabrice Rastello Closing words

Teachers

  • Mehdi Amini
After a PhD at MINES Paris on automatic parallelisation for accelerators like GPUs, Mehdi joined Apple to work on Clang/LLVM. One of his main contributions to LLVM has been scaling link-time optimizations with ThinLTO. He then joined the Tesla Autopilot group, taking a break from working on compilers, but not for long: he soon went back to them in order to build MLIR at Google and later drive the launch of the OpenXLA initiative. He’s now a Distinguished Engineer at Nvidia working on Deep Learning Frameworks.
  • Marius Brehler
  • Lorenzo Chelini
    Lorenzo Chelini is a compiler engineer at NVIDIA. He holds a Ph.D. in Computer Engineering from the Technical University of Eindhoven, a Master’s degree from the Polytechnic of Turin, and a Bachelor’s from the University of Pisa. Lorenzo actively contributes to the LLVM ecosystem, mainly MLIR and Polygeist.
  • Mathieu Fehr
    Mathieu Fehr is a final-year PhD student at the University of Edinburgh, currently visiting at the University of Cambridge. A large part of his research focuses on improving the accessibility of compiler technology, which includes the design and development of xDSL, a smoother entry-point for MLIR. His broader research interests encompass advancing declarative approaches in compiler design to facilitate formal reasoning and enable an ecosystem of compilation tools, including verifiers, fuzzers, and superoptimizers.
  • Kunwar Grover
  • Sasha Lopoukhine
Sasha Lopoukhine is a PhD student at the University of Cambridge, researching how to make machine learning compilers more approachable and extensible. His recent work has been to leverage xDSL to implement a backend for linear algebra micro-kernels targeting ETH’s Snitch core, outperforming the state-of-the-art LLVM backend by a factor of 20.
  • William Moses
William Moses is an Assistant Professor at the University of Illinois in the Computer Science and Electrical and Computer Engineering departments and a Researcher at Google. He received a Ph.D. in Computer Science from MIT, where he also received his M.Eng in electrical engineering and computer science (EECS) and B.S. in EECS and physics. William’s research involves creating compilers and program representations that enable performance and use-case portability, thus enabling non-experts to leverage the latest in high-performance computing and ML. He is known as the lead developer of Enzyme (NeurIPS ‘20, SC ‘21, SC ‘22), an automatic differentiation tool for LLVM capable of differentiating code in a variety of languages, after optimization, and for a variety of architectures, and as the lead developer of Polygeist (PACT ‘21, PPoPP ‘23), a polyhedral compiler and C++ frontend for MLIR. He has also worked on the Tensor Comprehensions framework for synthesizing high-performance GPU kernels of ML code, the Tapir compiler for parallel programs (best paper at PPoPP ‘17), and compilers that use machine learning to better optimize (AutoPhase/TransformLLVM). He is a recipient of the ACM SIGHPC Doctoral Dissertation Award, a U.S. Department of Energy Computational Science Graduate Fellowship, and the Karl Taylor Compton Prize, MIT’s highest student award.
  • Matthias Springer
    Matthias is a software engineer at NVIDIA Switzerland. He received a Ph.D. in Mathematical and Computing Sciences from the Tokyo Institute of Technology. He has been contributing to MLIR and other MLIR-based open source projects over the last three years.
  • Alex Zinenko
Alex Zinenko is the Chief Scientist at Brium Inc., a young innovative company in the domain of high-performance AI. Previously, he worked as a staff research engineer at Google DeepMind and a research engineer at Inria. Alex obtained his PhD from the University Paris Saclay (Paris Sud XI) for his work on “Interactive Program Restructuring”. His research interests span from compilation to high-performance systems to interactive software visualization, united by the common goal of programming efficient programs effectively.