Graduate Student Seminar: Localized Detection of Authenticity in Mixed Source Texts via Epidemic Change-point Perspective

Subhrajyoty Roy, Postdoctoral Research Associate in Statistics & Data Science at Washington University in St. Louis

Abstract: With the increasing popularity of large language models, concerns over content authenticity have led to the development of various watermarking schemes. These schemes allow machine-generated text to be detected with an appropriate key, while remaining imperceptible to readers without the key. The corresponding detection mechanisms usually take the form of statistical hypothesis tests for the existence of watermarks, which has spurred extensive research in this direction. However, the finer-grained problem of identifying which segments of a mixed-source text are actually watermarked is much less explored; existing approaches lack either scalability or theoretical guarantees that are robust to paraphrasing and post-editing. In this work, we introduce a unique perspective on such watermark segmentation problems through the lens of epidemic change-point analysis. By highlighting the similarities as well as the differences between these two problems, we motivate and propose WISER: a novel, computationally efficient watermark segmentation algorithm. Complementing various theoretical results on consistency, we also find through extensive numerical simulations that WISER outperforms state-of-the-art baseline methods in both computational speed and accuracy, for diverse watermarking schemes and diverse large language models. This work also shows how insights from a classical statistical problem can lead to a theoretically valid and computationally efficient solution to a modern and pertinent problem.
