Eylon Caplan

Hello! I am a third-year Ph.D. student in the Department of Computer Science at Purdue University, advised by Prof. Dan Goldwasser. My work centers on using natural language processing to understand and draw conclusions from large amounts of unstructured data. In particular, I've built scalable and interpretable NLP systems that reason about human behavior, beliefs, emotions, and values as expressed in noisy real-world corpora—especially social media.

During my Ph.D., I developed ConceptCarve, a framework for identifying how abstract social concepts are expressed across communities by combining language model reasoning with scalable retrieval. I introduced Splits!, a 9.6M-post Reddit sandbox for investigating how different demographic groups communicate about shared topics. I also investigated social concepts in a multimodal setting, developing VIBE, a benchmark for evaluating how well vision-language models (VLMs) interpret visual cues in videos, and TAIGR, a framework for modeling influencer content on social media via structured, pragmatic inference.

From a technical standpoint, my work has extensively involved large-scale text and video data processing, retrieval, reranking, text clustering, and the design, collection, annotation, and validation of datasets across multiple modalities, all on real, messy data sources.

Before coming to Purdue, I earned my B.Sc. in Computer Science and Mathematics from the University of Nebraska-Lincoln, where I was first introduced to ML while working with Prof. Stephen Scott and Prof. M. R. Hasan.

News

[Jul 2026] ✈️ Looking forward to attending ACL 2026 in San Diego!
[May 2026] 💼 Excited to share that I'll be joining AWS AI as an Applied Science intern in Santa Clara, CA for the summer!
[Apr 2026] 🎉 Our paper, Splits! Flexible Sociocultural Linguistic Investigation at Scale, was accepted to ACL 2026 Main Conference as an oral presentation!
[Apr 2026] 🎉 Our paper, TAIGR: Towards Modeling Influencer Content on Social Media via Structured, Pragmatic Inference, was accepted to ACL 2026 Main Conference.
[Nov 2025] ✈️ Looking forward to attending EMNLP 2025 in Suzhou, China!
[Aug 2025] 🎉 Our paper, VIBE: Can a VLM Read the Room?, has been accepted to Findings of EMNLP 2025.
[Jul 2025] ✈️ Looking forward to attending ACL 2025 in Vienna, Austria!
[May 2025] 🎉 Our paper, ConceptCarve: Dynamic Realization of Evidence, has been accepted to ACL 2025 Main Conference.
[Jan 2025] 💻 I created and released ACL Searcher, an open-source semantic search tool for ACL paper abstracts using ColBERT.
[Aug 2023] 🎓 Started my Ph.D. in Computer Science at Purdue University (advisor: Dan Goldwasser).