Dylan Hadfield-Menell

Associate Professor of Electrical Engineering and Computer Science, MIT

Dylan Hadfield-Menell is an Associate Professor of EECS at MIT. He runs the Algorithmic Alignment Group in the Computer Science and Artificial Intelligence Laboratory (CSAIL). His research develops methods to ensure that AI systems' behavior aligns with the goals and values of their human users and of society as a whole, a concept known as 'AI alignment'. His group works to address alignment challenges in multi-agent systems, human-AI teams, and societal oversight of machine learning. Their goal is to enable the safe, beneficial, and trustworthy deployment of AI in real-world settings.
Although there is no fixed research agenda for CBAI fellows at the moment, fellows are encouraged to review the Algorithmic Alignment Group’s most recent research output to get a sense of its research directions.
Strong candidates have:
- Interest in theoretically grounded approaches to AI alignment.
- Familiarity with modern machine learning methods and infrastructure: you should feel comfortable configuring and running an experiment that fine-tunes a language model on a custom dataset with Direct Preference Optimization.
- Research maturity: the ability to read a research paper and implement the core methods for comparison; the ability to identify related work for a specific research topic in alignment or interpretability; and the ability to articulate a clear hypothesis, identify a relevant experiment, and analyze the results.