
Hi! I’m Yanhong Li. I am currently a pre-doctoral researcher at the Allen Institute for AI (Ai2), advised by Luca Soldaini on data-efficient pretraining for LMs and mentored by Will Merrill on the theory of linear RNNs and hybrid models. I’m broadly interested in LM efficiency (spanning data efficiency, model architectures, and inference) and in theoretical questions about the expressivity of different LLM architectures. Always happy to chat about research (yanhongl@allenai.org)! :)

This past summer, I was a visiting student at MIT CSAIL, supervised by Prof. Yoon Kim and mentored by the wonderful Songlin Yang (I distilled so much from her!!).

I graduated from the University of Chicago in 2025. During undergrad, I was fortunate to work with Prof. David McAllester (TTIC), Prof. Karen Livescu (TTIC), Prof. Michael Maire (UChicago), Prof. Jiawei Zhou (Stony Brook), and Prof. Allyson Ettinger (Ai2), and I’m deeply grateful for their invaluable insights and support. A huge thank-you as well to my early mentors: David Yunis (TTIC), Chenghao Yang (UChicago), and Marcelo Sandoval-Castañeda (TTIC), who taught me so much when I knew nothing about research. Be sure to check out their interesting work!

Selected Papers
(* equal contribution)
Yanhong Li*, Songlin Yang*, Shawn Tan, Mayank Mishra, Rameswar Panda, Jiawei Zhou, Yoon Kim. Distilling to Hybrid Attention Models via KL-Guided Layer Selection. ICLR 2026.
Yanhong Li, Ming Li, Karen Livescu, Jiawei Zhou. On the Predictive Power of Representation Dispersion in Language Models. ICLR 2026.
Yanhong Li*, Zixuan Lan*, Jiawei Zhou. Text or Pixels? Evaluating Efficiency and Understanding of LLMs with Visual Text Inputs. EMNLP 2025 Findings.
Yanhong Li, Karen Livescu, Jiawei Zhou. Chunk-Distilled Language Modeling. ICLR 2025.
For the full list, please see my Google Scholar page.