
Selected Papers
(* equal contribution)

Hi! I’m Yanhong Li. I am currently a pre-doctoral researcher at the Allen Institute for AI (Ai2), advised by Will Merrill, working on theoretical and empirical analyses of model architectures, especially hybrid models and linear RNNs. I am also mentored by Luca Soldaini (now at Microsoft AI) on data-efficient pretraining for LMs. Always happy to chat about research (yanhongl@allenai.org)! :)

In summer 2025, I was a visiting student at MIT CSAIL, supervised by Prof. Yoon Kim and mentored by the wonderful Songlin Yang (I distilled so much from her!!).

I graduated from the University of Chicago in 2025. During undergrad, I was extremely grateful to work with Prof. David McAllester (TTIC), Prof. Karen Livescu (TTIC), Prof. Michael Maire (UChicago), Prof. Jiawei Zhou (Stony Brook), and Prof. Allyson Ettinger (AI2). I couldn’t appreciate their invaluable insights and support more. A huge thank-you as well to my early mentors: David Yunis (TTIC), Chenghao Yang (UChicago), and Marcelo Sandoval-Castañeda (TTIC). They taught me so much when I knew nothing about research. Be sure to check out their interesting work!

William Merrill, Hongjian Jiang, Yanhong Li, Anthony Lin, Ashish Sabharwal. Why Are Linear RNNs More Parallelizable? ICML 2026.

William Merrill*, Yanhong Li*, Tyler Romero*, Anej Svete*, Caia Costello*, Pradeep Dasigi, Dirk Groeneveld, David Heineman, Bailey Kuehl, Nathan Lambert, Chuan Li, Kyle Lo, Saumya Malik, DJ Matusz, Benjamin Minixhofer, Jacob Morrison, Luca Soldaini, Finbarr Timbers, Pete Walsh, Noah A. Smith, Hannaneh Hajishirzi, Ashish Sabharwal*. Olmo Hybrid: From Theory to Practice and Back. Ai2 technical report.

Yanhong Li*, Songlin Yang*, Shawn Tan, Mayank Mishra, Rameswar Panda, Jiawei Zhou, Yoon Kim. Distilling to Hybrid Attention Models via KL-Guided Layer Selection. ICLR 2026.


Yanhong Li, Ming Li, Karen Livescu, Jiawei Zhou. On the Predictive Power of Representation Dispersion in Language Models. ICLR 2026.

Yanhong Li*, Zixuan Lan*, Jiawei Zhou. Text or Pixels? Evaluating Efficiency and Understanding of LLMs with Visual Text Inputs. EMNLP 2025 Findings.

Yanhong Li, Karen Livescu, Jiawei Zhou. Chunk-Distilled Language Modeling. ICLR 2025.

For the full list, please see my Google Scholar page.
