GitHub
Google Scholar
LinkedIn

Eylon Caplan




Hello! I am a third-year Ph.D. student in the Department of Computer Science at Purdue University, advised by Prof. Dan Goldwasser. My work centers on using natural language processing to understand and draw conclusions from large amounts of unstructured data. In particular, I've built scalable and interpretable NLP systems that reason about human behavior, beliefs, emotions, and values as expressed in noisy real-world corpora—especially social media.

During my Ph.D., I developed ConceptCarve, a framework for identifying how abstract social concepts are expressed across communities by combining language model reasoning with scalable retrieval. I introduced Splits!, a 9.6M-post Reddit sandbox for investigating how different demographic groups communicate about shared topics. I also investigated social concepts in a multimodal setting, developing VIBE, a benchmark for evaluating how well VLMs interpret visual cues in videos, and TAIGR, a framework for modeling influencer content on social media via structured, pragmatic inference.

From a technical standpoint, my work has extensively involved large-scale text and video data processing, retrieval, reranking, and text clustering, as well as dataset design, collection, annotation, and validation across multiple modalities, all on real, messy data sources.

Before coming to Purdue, I earned my B.Sc. in Computer Science and Mathematics from the University of Nebraska-Lincoln, where I was first introduced to ML while working with Prof. Stephen Scott and Prof. M. R. Hasan.




News