I am Xinyi Wang (王心怡), a final-year computer science PhD candidate at the University of California, Santa Barbara (UCSB). I am advised by Professor William Yang Wang. I have also worked with Yi Yang, Kun Zhang, Alessandro Sordoni, Yikang Shen, and Rameswar Panda. I interned at MSR Montreal in summer 2023 and at the MIT-IBM Watson AI Lab in summer 2024. I’m honored to have been awarded a J.P. Morgan AI PhD Fellowship. My research focuses on developing a principled understanding of deep learning models, especially large language models, with the goal of improving their capabilities, addressing their limitations, and optimizing their application across diverse domains. My CV can be downloaded here.
I’m on the job market right now. Please feel free to reach out to me if you think I could be a good fit!
Education
- University of California, Santa Barbara, Oct 2020 - Present
- Ph.D. in Computer Science
- Hong Kong University of Science and Technology, Sep 2016 - Jul 2020
- B.Sc. in Applied Mathematics and Computer Science
* indicates equal contribution
Preprints
-
Generalization v.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data
Xinyi Wang*, Antonis Antoniades*, Yanai Elazar, Alfonso Amayuelas, Alon Albalak, Kexun Zhang, William Yang Wang
Arxiv Preprint [paper]
-
Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement
Xunjian Yin, Xinyi Wang, Liangming Pan, Xiaojun Wan, William Yang Wang
Arxiv Preprint [paper]
-
Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models
Sitao Cheng, Liangming Pan, Xunjian Yin, Xinyi Wang, William Yang Wang
Arxiv Preprint [paper]
(Co-)First authored publications
-
Guiding Language Model Math Reasoning with Planning Tokens
Xinyi Wang, Lucas Caccia, Oleksiy Ostapenko, Xingdi Yuan, William Yang Wang, Alessandro Sordoni
-
Understanding the Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation
Xinyi Wang, Alfonso Amayuelas, Kexun Zhang, Liangming Pan, Wenhu Chen, William Yang Wang
-
Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning
Xinyi Wang, Wanrong Zhu, Michael Saxon, Mark Steyvers, William Yang Wang
Proceedings of NeurIPS 2023, New Orleans (poster) [paper][code]
-
Causal Balancing for Domain Generalization
Xinyi Wang, Michael Saxon, Jiachen Li, Hongyang Zhang, Kun Zhang, William Yang Wang
-
Counterfactual Maximum Likelihood Estimation for Training Deep Networks
Xinyi Wang, Wenhu Chen, Michael Saxon, William Yang Wang
-
RefBERT: Compressing BERT by Referencing to Pre-computed Representations
Xinyi Wang*, Haiqin Yang*, Liang Zhao, Yang Mo, Jianping Shen
Proceedings of IJCNN 2021, Virtual (oral) [paper]
-
Neural Topic Model with Attention for Supervised Learning
Xinyi Wang, Yi Yang
Coauthored publications
-
T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback
Jiachen Li, Weixi Feng, Tsu-Jui Fu, Xinyi Wang, Sugato Basu, Wenhu Chen, William Yang Wang
Proceedings of NeurIPS 2024, Vancouver [paper]
-
A Survey on Data Selection for Language Models
Alon Albalak, Yanai Elazar, Sang Michael Xie, Shayne Longpre, Nathan Lambert, Xinyi Wang, Niklas Muennighoff, Bairu Hou, Liangming Pan, Haewon Jeong, Colin Raffel, Shiyu Chang, Tatsunori Hashimoto, William Yang Wang
TMLR 2024 [paper]
-
Position Paper: Understanding the Role of Social Media Influencers in AI Research Visibility
Iain Xie Weissburg, Mehir Arora, Xinyi Wang, Liangming Pan, William Yang Wang
Proceedings of ICML 2024, Vienna [paper]
-
Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies
Liangming Pan, Michael Saxon, Wenda Xu, Deepak Nathani, Xinyi Wang, William Yang Wang
TACL 2024 [paper]
-
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks
Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen
-
Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning
Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang
-
TheoremQA: A Theorem-driven Question Answering dataset
Wenhu Chen, Ming Yin, Max Ku, Elaine Wan, Xueguang Ma, Jianyu Xu, Tony Xia, Xinyi Wang, Pan Lu
-
Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation
Wanrong Zhu, Xinyi Wang, Yujie Lu, Tsu-Jui Fu, Xin Eric Wang, Miguel Eckstein, William Yang Wang
Proceedings of EMNLP 2023, Singapore [paper]
-
PECO: Examining Single Sentence Label Leakage in Natural Language Inference Datasets through Progressive Evaluation of Cluster Outliers
Michael Saxon, Xinyi Wang, Wenda Xu, William Yang Wang
-
A Dataset for Answering Time-Sensitive Questions
Wenhu Chen, Xinyi Wang, William Yang Wang
Proceedings of NeurIPS 2021 Datasets and Benchmarks Track, Virtual (poster) [paper][code]
-
Modeling Disclosive Transparency in NLP Application Descriptions
Michael Saxon, Sharon Levy, Xinyi Wang, Alon Albalak, William Yang Wang
Talks
- My PhD major area exam presentation in March 2023: [slides]
- Talk at Hong Kong University of Science and Technology in May 2023: [slides]
- Talk at Tsinghua University on October 19, 2023 and at Peking University on October 23, 2023: [slides]
- My PhD proposal presentation in March 2024: [slides]
Services
- Reviewer: NeurIPS Datasets and Benchmarks Track (2021), AAAI (2022, 2023), NeurIPS (2023, 2024), ICLR (2024), ICML (2024), COLM (2024), TPAMI (2024)