Like many AI researchers, I've been intrigued by recent advances in large language models (LLMs), and generative AI more broadly. At the same time – and also in common with many researchers – I'm concerned about their reliability, robustness, and amenability to human control. This cluster of projects looks into what we can do about that.
The general strategy I've been pursuing, with a number of collaborators, is to put the LLM inside a larger AI system, rather than using the LLM as the top-level "AI" itself. There are several ways to do this: the LLM can be called upon for very specific tasks with a highly constrained prompt; the LLM can be used merely to propose solutions or partial solutions that are externally verified and assembled; and more. A sketch of the second approach follows below.
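To make the propose-and-verify idea concrete, here is a minimal sketch in Python. The `propose_candidates` and `verify` functions are hypothetical stand-ins for an LLM call and a domain-specific checker; this illustrates the pattern, not code from these projects.

```python
from typing import Callable, Iterable, Optional

def solve_with_llm(
    task: str,
    propose_candidates: Callable[[str], Iterable[str]],
    verify: Callable[[str, str], bool],
    max_attempts: int = 5,
) -> Optional[str]:
    """Return the first LLM-proposed candidate that passes the
    external verifier. The LLM is a component inside the system,
    not the top-level decision maker."""
    for _ in range(max_attempts):
        for candidate in propose_candidates(task):
            if verify(task, candidate):  # trusted, external check
                return candidate
    return None  # no verified solution found

# Toy usage: a stand-in "LLM" proposes answers; the verifier checks them.
answer = solve_with_llm(
    "2 + 2",
    propose_candidates=lambda task: ["5", "4"],
    verify=lambda task, cand: eval(task) == int(cand),
)
print(answer)  # prints "4"
```

The key design point is that correctness rests entirely on the external verifier: the LLM's output is never trusted directly, only accepted once it passes the check.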
I've also grown concerned about the reproducibility of published results that use generative AI systems, especially those built on closed-weight, gated models like ChatGPT, so I have been doing some work on that as well.
Publications:
Funding provided by:
Collaborators (current): Adam Gaier, Amy K. Hoover, Ioannis Koutis, Joel Lehman, Elliot Meyerson, Arash Moradi Karkaj, Ben Samuel, Mike Treanor