Sara Fish

PhD Student, Harvard

Sara Fish is a PhD Candidate at Harvard University and a Graduate Fellow at the Harvard Kempner Institute. Her research interests center on economics and artificial intelligence. Her current research focuses on algorithmic collusion by large language models and on evaluation methodologies for understanding the "preferences" of large language models. She also has prior work studying how AI can be reliably integrated into democratic decision-making processes. Prior to her PhD, she earned her B.S. in Mathematics from Caltech.
Working title: "Behavioral Evals for LLM Musical Self-Expression"
Description:
Behavioral evals aim to inform our understanding of how LLMs behave "in the wild" by measuring aspects of LLM behavior, such as capabilities, values, and preferences, "in the lab". However, the usefulness of behavioral evals is degraded by phenomena such as benchmaxxing, saturation, and eval awareness. In this project, we will explore (classical) music composition as a novel LLM evaluation domain. Research questions include: (1) How should we quantify the capabilities and tendencies of different LLMs in this domain? (2) To what extent does a given LLM exhibit consistent musical preferences? (3) Can we uncover robust connections between LLM behavior in musical and non-musical contexts? This project draws inspiration from this paper, which tackles similar questions in economic domains. For a thrown-together (unscientific) demo, see this website.
Skills:
- Comfort with reading academic literature, especially but not exclusively ML papers, is a hard requirement. For example, you should be able to read papers like these (paper1, paper2) and gain a deep understanding of what is going on from the text alone, without needing help from an external resource such as an LLM.
- Basic music knowledge, such as the ability to read sheet music, is a hard requirement. Deeper music knowledge, e.g. theory or history, is a plus.
- Good Python programming skills are a hard requirement. Deeper knowledge of modern ML methods and infrastructure is a plus. Familiarity with AI coding tools is a plus.
Taste:
- Enthusiasm for "basic science" approaches to AI safety, and specifically for LLM behavioral science. An excellent example of LLM basic science is this paper. In particular, if the style of research in this paper excites you, you are likely to be a good fit.
- A preference for slow, careful, and methodical work.
- A preference for in-person meetings, and/or frequent async communication.
