I am Xinyi Wang (王心怡), a fourth-year computer science PhD candidate at the University of California, Santa Barbara (UCSB). I am advised by Professor William Yang Wang. I have also worked with Yi Yang, Kun Zhang, and Alessandro Sordoni. My current research interest lies in improving and making better use of foundation models by developing a principled understanding of them. My CV can be downloaded here.

Education

  • University of California, Santa Barbara, Oct 2020 - Present
    • Ph.D. in Computer Science
  • Hong Kong University of Science and Technology, Sep 2016 - Jul 2020
    • B.Sc. in Applied Mathematics and Computer Science

Preprints

  • A Survey on Data Selection for Language Models

    Alon Albalak, Yanai Elazar, Sang Michael Xie, Shayne Longpre, Nathan Lambert, Xinyi Wang, Niklas Muennighoff, Bairu Hou, Liangming Pan, Haewon Jeong, Colin Raffel, Shiyu Chang, Tatsunori Hashimoto, William Yang Wang

    Preprint 2024 [paper]

  • Understanding the Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation

    Xinyi Wang, Alfonso Amayuelas, Kexun Zhang, Liangming Pan, Wenhu Chen, William Yang Wang

    Preprint 2024 [paper]

  • Tweets to Citations: Unveiling the Impact of Social Media Influencers on AI Research Visibility

    Iain Xie Weissburg, Mehir Arora, Xinyi Wang, Liangming Pan, William Yang Wang

    Preprint 2024 [paper]

  • Guiding Language Model Reasoning with Planning Tokens

    Xinyi Wang, Lucas Caccia, Oleksiy Ostapenko, Xingdi Yuan, Alessandro Sordoni

    Preprint 2023 [paper]

First-authored publications

  • Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning

    Xinyi Wang, Wanrong Zhu, Michael Saxon, Mark Steyvers, William Yang Wang

    Proceedings of NeurIPS 2023, New Orleans (poster) [paper][code]

  • Causal Balancing for Domain Generalization

    Xinyi Wang, Michael Saxon, Jiachen Li, Hongyang Zhang, Kun Zhang, William Yang Wang

    Proceedings of ICLR 2023, Rwanda (poster) [paper][code]

  • Counterfactual Maximum Likelihood Estimation for Training Deep Networks

    Xinyi Wang, Wenhu Chen, Michael Saxon, William Yang Wang

    Proceedings of NeurIPS 2021, Virtual (poster) [paper][code]

  • RefBERT: Compressing BERT by Referencing to Pre-computed Representations

    Xinyi Wang*, Haiqin Yang*, Liang Zhao, Yang Mo, Jianping Shen

    Proceedings of IJCNN 2021, Virtual (oral) [paper]

  • Neural Topic Model with Attention for Supervised Learning

    Xinyi Wang, Yi Yang

    Proceedings of AISTATS 2020, Virtual (poster) [paper][code]

Coauthored publications

  • Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies

    Liangming Pan, Michael Saxon, Wenda Xu, Deepak Nathani, Xinyi Wang, William Yang Wang

    TACL 2024 [paper]

  • Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks

    Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen

    TMLR 2023 [paper][code]

  • Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning

    Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang

    Findings of EMNLP 2023, Singapore [paper][code]

  • TheoremQA: A Theorem-driven Question Answering dataset

    Wenhu Chen, Ming Yin, Max Ku, Elaine Wan, Xueguang Ma, Jianyu Xu, Tony Xia, Xinyi Wang, Pan Lu

    Proceedings of EMNLP 2023, Singapore [paper][code]

  • Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation

    Wanrong Zhu, Xinyi Wang, Yujie Lu, Tsu-Jui Fu, Xin Eric Wang, Miguel Eckstein, William Yang Wang

    Proceedings of EMNLP 2023, Singapore [paper]

  • PECO: Examining Single Sentence Label Leakage in Natural Language Inference Datasets through Progressive Evaluation of Cluster Outliers

    Michael Saxon, Xinyi Wang, Wenda Xu, William Yang Wang

    Proceedings of EACL 2023, Croatia [paper][code]

  • A Dataset for Answering Time-Sensitive Questions

    Wenhu Chen, Xinyi Wang, William Yang Wang

    Proceedings of NeurIPS 2021 Datasets and Benchmarks Track, Virtual (poster) [paper][code]

  • Modeling Disclosive Transparency in NLP Application Descriptions

    Michael Saxon, Sharon Levy, Xinyi Wang, Alon Albalak, William Yang Wang

    Proceedings of EMNLP 2021, Virtual (oral) [paper][code]

* indicates equal contribution

Talks

  • My PhD major area exam presentation in March 2023: [slides]
  • Talk at Hong Kong University of Science and Technology in May 2023: [slides]
  • Talk at Tsinghua University on October 19, 2023 and at Peking University on October 23, 2023: [slides]
  • My PhD proposal presentation in March 2024: [slides]

Services

  • Reviewer: NeurIPS Datasets and Benchmarks Track (2021), AAAI (2022, 2023), NeurIPS (2023), ICLR (2024), ICML (2024), COLM (2024), TPAMI (2024)