I am Xinyi Wang (王心怡), a final-year computer science PhD candidate at the University of California, Santa Barbara (UCSB), advised by Professor William Yang Wang. I have also worked with Yi Yang, Kun Zhang, Alessandro Sordoni, Yikang Shen, and Rameswar Panda. I am honored to have received a J.P. Morgan AI PhD Fellowship and a UCSB Computer Science Outstanding Publication Award. My research focuses on developing a principled understanding of large foundation models, especially LLMs, with the goal of improving their capabilities, addressing their limitations, and optimizing their application across diverse domains. My CV can be downloaded here.
[News] I’m joining Princeton Language and Intelligence as a Postdoctoral Researcher in July 2025.
[News] I will join the CSE department at the University at Buffalo, SUNY as an Assistant Professor in Fall 2026, and I’m recruiting PhD students in the upcoming cycle.
I’m attending ICLR 2025. Please feel free to reach out to me if you want to chat :)
* indicates equal contribution
Preprints
-
Do Larger Language Models Imply Better Reasoning? A Pretraining Scaling Law for Reasoning
Xinyi Wang, Shawn Tan, Mingyu Jin, William Yang Wang, Rameswar Panda, Yikang Shen
arXiv preprint [paper]
-
Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement
Xunjian Yin, Xinyi Wang, Liangming Pan, Xiaojun Wan, William Yang Wang
arXiv preprint [paper]
-
Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models
Sitao Cheng, Liangming Pan, Xunjian Yin, Xinyi Wang, William Yang Wang
arXiv preprint [paper]
(Co)-First authored publications
-
Generalization v.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data
Xinyi Wang*, Antonis Antoniades*, Yanai Elazar, Alfonso Amayuelas, Alon Albalak, Kexun Zhang, William Yang Wang
-
Guiding Language Model Math Reasoning with Planning Tokens
Xinyi Wang, Lucas Caccia, Oleksiy Ostapenko, Xingdi Yuan, William Yang Wang, Alessandro Sordoni
Proceedings of COLM 2024, Philadelphia (poster) [paper][code]
-
Understanding the Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation
Xinyi Wang, Alfonso Amayuelas, Kexun Zhang, Liangming Pan, Wenhu Chen, William Yang Wang
-
Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning
Xinyi Wang, Wanrong Zhu, Michael Saxon, Mark Steyvers, William Yang Wang
Proceedings of NeurIPS 2023, New Orleans (poster) [paper][code]
-
Causal Balancing for Domain Generalization
Xinyi Wang, Michael Saxon, Jiachen Li, Hongyang Zhang, Kun Zhang, William Yang Wang
-
Counterfactual Maximum Likelihood Estimation for Training Deep Networks
Xinyi Wang, Wenhu Chen, Michael Saxon, William Yang Wang
-
RefBERT: Compressing BERT by Referencing to Pre-computed Representations
Xinyi Wang*, Haiqin Yang*, Liang Zhao, Yang Mo, Jianping Shen
Proceedings of IJCNN 2021, Virtual (oral) [paper]
-
Neural Topic Model with Attention for Supervised Learning
Xinyi Wang, Yi Yang
Coauthored publications
-
T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback
Jiachen Li, Weixi Feng, Tsu-Jui Fu, Xinyi Wang, Sugato Basu, Wenhu Chen, William Yang Wang
Proceedings of NeurIPS 2024, Vancouver (poster) [paper][project]
-
A Survey on Data Selection for Language Models
Alon Albalak, Yanai Elazar, Sang Michael Xie, Shayne Longpre, Nathan Lambert, Xinyi Wang, Niklas Muennighoff, Bairu Hou, Liangming Pan, Haewon Jeong, Colin Raffel, Shiyu Chang, Tatsunori Hashimoto, William Yang Wang
-
Position: AI/ML Influencers Have a Place in the Academic Process
Iain Xie Weissburg, Mehir Arora, Xinyi Wang, Liangming Pan, William Yang Wang
Proceedings of ICML 2024, Vienna (poster) [paper]
-
Automatically Correcting Large Language Models: Surveying the Landscape of Diverse Self-Correction Strategies
Liangming Pan, Michael Saxon, Wenda Xu, Deepak Nathani, Xinyi Wang, William Yang Wang
-
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks
Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen
-
Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning
Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang
-
TheoremQA: A Theorem-driven Question Answering dataset
Wenhu Chen, Ming Yin, Max Ku, Elaine Wan, Xueguang Ma, Jianyu Xu, Tony Xia, Xinyi Wang, Pan Lu
-
Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation
Wanrong Zhu, Xinyi Wang, Yujie Lu, Tsu-Jui Fu, Xin Eric Wang, Miguel Eckstein, William Yang Wang
Proceedings of EMNLP 2023, Singapore (poster) [paper]
-
PECO: Examining Single Sentence Label Leakage in Natural Language Inference Datasets through Progressive Evaluation of Cluster Outliers
Michael Saxon, Xinyi Wang, Wenda Xu, William Yang Wang
-
A Dataset for Answering Time-Sensitive Questions
Wenhu Chen, Xinyi Wang, William Yang Wang
Proceedings of NeurIPS 2021 Datasets and Benchmarks Track, Virtual (poster) [paper][code]
-
Modeling Disclosive Transparency in NLP Application Descriptions
Michael Saxon, Sharon Levy, Xinyi Wang, Alon Albalak, William Yang Wang
* indicates equal contribution
Talks
- My PhD major area exam presentation in March 2023: [slides]
- Talk at Hong Kong University of Science and Technology in May 2023: [slides]
- Talk at Tsinghua University on October 19, 2023 and at Peking University on October 23, 2023: [slides]
- My PhD proposal presentation in March 2024: [slides]
Services
- Reviewer: NeurIPS Datasets and Benchmarks Track (2021), AAAI (2022, 2023), NeurIPS (2023, 2024), ICLR (2024, 2025), ICML (2024), COLM (2024), AISTATS (2025), TPAMI (2024)
- Organizer: ICLR 2025 Open Science for Foundation Models Workshop