How Statistics Can Advance Large Language Models: Fairness Alignment and Watermarking
Abstract: Large language models (LLMs) have rapidly emerged as a transformative innovation in machine learning. However, their increasing influence on human decision-making processes raises critical societal questions. In this talk, we will demonstrate how statistics can help address two key challenges: ensuring fairness for minority groups through alignment and combating misinformation through watermarking. First, we tackle the challenge of creating fair LLMs that equitably represent and serve diverse populations. We derive a regularization term that is both necessary and sufficient for aligning LLMs with human preferences, ensuring equitable outcomes across different demographics. Second, we introduce a general statistical framework to analyze the efficiency of watermarking schemes for LLMs. We develop optimal detection rules for an important watermarking scheme recently developed at OpenAI and empirically demonstrate its superiority over the existing detection method. Throughout the talk, we will showcase how statistical insights can not only address pressing challenges posed by LLMs but also unlock substantial opportunities for the field of statistics to drive responsible generative AI development. This talk is based on arXiv:2405.16455 and arXiv:2404.01245.
Bio: Weijie Su is an Associate Professor in the Wharton Statistics and Data Science Department and, by courtesy, in the Departments of Computer and Information Science and Mathematics at the University of Pennsylvania. He is a co-director of the Penn Research in Machine Learning Center. Prior to joining Penn, he received his Ph.D. in statistics from Stanford University in 2016 and his bachelor's degree in mathematics from Peking University in 2011. His research interests span the statistical foundations of AI, privacy-preserving machine learning, high-dimensional statistics, and optimization. He serves as an associate editor of the Journal of Machine Learning Research, Journal of the American Statistical Association, Foundations and Trends in Statistics, and Operations Research. His work has been recognized with several awards, such as the Stanford Anderson Dissertation Award, NSF CAREER Award, Sloan Research Fellowship, IMS Peter Hall Prize, SIAM Early Career Prize in Data Science, ASA Noether Early Career Award, and the ICBS Frontiers of Science Award in Mathematics.