Zixuan Ke
I am a research scientist at Salesforce AI Research. I earned my Ph.D. at the University of Illinois Chicago,
where I was fortunate to be advised by Bing Liu (we continue to work closely together).
Before that, I received my M.Sc. in Computer Science from the University of Texas at Dallas,
under the guidance of Vincent Ng.
During the summers, I was a research intern at Google DeepMind, Meta AI, and Amazon Science.
My research studies how to adapt foundation models, particularly large language models (LLMs), to an ever-changing world characterized by emerging domains, events, topics, and information.
This includes (but is not limited to):
- Large Language Model (pre-training, post-training, and frontiers, e.g., retrieval-augmented generation)
- Continual and Lifelong Learning (of tasks, classes, and domains)
- Natural Language Processing (classification, generation, and extraction)
- Argument Mining
Email / Google Scholar / Github / Twitter / LinkedIn / Blog
If you'd like to chat with me about research or anything,
please feel free to reach out via email or schedule a chat here.
Selected Publications & Preprints (full list in Google Scholar)
(* indicates equal contribution)
Large Language Model
Bridging the Preference Gap between Retrievers and LLMs
Zixuan Ke,
Weize Kong,
Cheng Li,
Mingyang Zhang,
Qiaozhu Mei,
Michael Bendersky
ACL, 2024
arxiv /
talk /
poster
Continual Pre-training of Language Models
TL;DR
Our study examines the continual pre-training of language models (LMs) in various settings, with a specific focus on continual domain-adaptive pre-training. To preserve both pre-trained/general knowledge and domain knowledge, we propose a novel soft-masking mechanism that also enables knowledge transfer, thereby improving end-task performance. Evaluations on 6 different domains demonstrate the effectiveness of this approach (a toy sketch of the soft-masking idea appears after this entry).
Zixuan Ke*,
Yijia Shao*,
Haowei Lin*,
Tatsuya Konishi,
Gyuhak Kim,
Bing Liu
ICLR, 2023
arxiv /
poster /
model-card /
code
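To give a flavor of the soft-masking idea above, here is a minimal, hypothetical PyTorch sketch; it is my own illustration under simplifying assumptions, not the paper's implementation. Per-parameter importance scores are assumed to be precomputed, and each gradient is scaled by (1 - importance) so that units important for general or previously acquired knowledge are updated less during domain-adaptive pre-training.

```python
# Toy sketch (not the paper's code): soft-masking gradients during
# continual domain-adaptive pre-training. Importance scores are assumed
# to be precomputed; important units receive smaller updates so that
# general/previous-domain knowledge is preserved.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a language model: a single linear layer.
model = nn.Linear(16, 16)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

# Hypothetical importance scores in [0, 1], one per parameter.
# (Random placeholders here; the actual method would estimate them
# from the model and data rather than draw them at random.)
importance = {name: torch.rand_like(p) for name, p in model.named_parameters()}

def soft_masked_step(x, y):
    """One update in which each gradient is scaled by (1 - importance)."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.grad is not None:
                p.grad.mul_(1.0 - importance[name])  # protect important units
    optimizer.step()
    return loss.item()

# Dummy "new domain" data for illustration.
x, y = torch.randn(8, 16), torch.randn(8, 16)
print(soft_masked_step(x, y))
```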
Adapting a Language Model While Preserving its General Knowledge
TL;DR
We argue that an effective method for domain-adaptive pre-training of language models (LMs) should satisfy two requirements: (1) preserve the LM's general knowledge, and (2) specialize the LM to the target domain (e.g., to handle polysemy). To meet both, we propose a novel informed adaptation method, which we evaluate across 6 different domains and show to be effective.
Zixuan Ke,
Yijia Shao,
Haowei Lin,
Hu Xu,
Lei Shu,
Bing Liu
EMNLP, 2022a
arxiv /
poster /
code
Continual Training of Language Models for Few-Shot Learning
TL;DR
We study the challenge of continual domain-adaptive pre-training of language models (LMs) and how it differs from conventional continual end-task learning. To address it, we propose a novel task-masking method, which we evaluate across 4 different domains and find to be effective.
Zixuan Ke,
Haowei Lin,
Yijia Shao,
Hu Xu,
Lei Shu,
Bing Liu
EMNLP, 2022b
arxiv /
poster /
model-card /
code
Continual Learning
Sub-network Discovery and Soft-masking for Continual Learning of Mixed Tasks
Zixuan Ke,
Bing Liu,
Wenhan Xiong,
Asli Celikyilmaz,
Haoran Li
EMNLP, 2023
arxiv /
code
Continual Learning of Natural Language Processing Tasks: A Survey
Zixuan Ke,
Bing Liu
arXiv, 2023
arxiv
A Theoretical Study on Solving Continual Learning
Gyuhak Kim,
Changnan Xiao,
Zixuan Ke,
Bing Liu
NeurIPS, 2022
arxiv
Achieving Forgetting Prevention and Knowledge Transfer in Continual Learning
TL;DR
Many existing continual learning methods focus solely on mitigating forgetting and lack a mechanism for promoting knowledge transfer. We propose a capsule-based method that addresses both challenges and evaluate its effectiveness across 4 different datasets.
Zixuan Ke,
Bing Liu,
Nianzu Ma,
Hu Xu,
Lei Shu
NeurIPS, 2021
arxiv /
talk /
poster /
code
Continual Learning of A Mixed Sequence of Similar and Dissimilar Tasks
TL;DR
Existing research on continual learning has focused on dealing with forgetting, where the tasks are assumed to be dissimilar and to share little knowledge. No prior technique learns a sequence of mixed similar and dissimilar tasks while both dealing with forgetting and transferring knowledge forward and backward. This paper proposes such a technique, and an empirical evaluation on sequences of mixed tasks demonstrates the effectiveness of the proposed model.
Zixuan Ke,
Bing Liu,
Xingchang Huang
NeurIPS, 2020
arxiv /
talk /
poster /
code
Argument Mining
Automated Essay Scoring: A Survey of the State of the Art
Zixuan Ke,
Vincent Ng
IJCAI, 2019
Learning to Give Feedback: Modeling Attributes Affecting Argument Persuasiveness in Student Essays
Zixuan Ke, Winston Carlile, Nishant Gurrapadi,
Vincent Ng
IJCAI, 2018
dataset /
dataset-persuasive /
dataset-thesis-strength
Recent Talks & Classes
- Adapting Large Language Models for the Dynamic World (slides)
- Continual Learning in NLP (slides), Tutorial at DEIM23, Remote, March 6, 2023.
- Lifelong and Continual Learning (Part 1, Part 2). A Short PhD Course (8 hours), Aalborg University, June 14-16, 2022.
- Conference talks (please refer to the Selected Publications section, and you can find more here)
Research Services
- Area Chair/Action Editor (2024-):
- Program Committee/Reviewer (2021-): ICLR, NeurIPS, ICML, ACL, EMNLP, NAACL, IJCAI, ARR, COLING, CoLLAs, NLPCC
- Journal Reviewer (2021-): TPAMI, TKDE, Neural Networks, Neurocomputing, Artificial Intelligence, TALLIP
Awards
- Exceptional Research Premise (the highest honor for CoE PhD students at UIC), 2023
Collaborators
I have had the privilege of working with and learning from great mentors and mentees, including:
- Mentors:
  - Bing Liu, Distinguished Professor at UIC
  - Hu Xu, research scientist at Facebook AI Research (FAIR)
  - Lei Shu, research scientist at Google Research
- Mentees (they're making great achievements and I couldn't be more thrilled and proud of them):
  - Yijia Shao, B.S. at Peking University -> Ph.D. at Stanford
Template modified from here.