Eylon Caplan




Hello! I am a second-year Ph.D. student in the Department of Computer Science at Purdue University, advised by Prof. Dan Goldwasser. My work centers on using natural language processing to understand and draw conclusions from large amounts of unstructured data. In particular, I build scalable and interpretable NLP systems that reason about human behavior, beliefs, and values as expressed in noisy real-world corpora—especially social media.

Recently, I developed ConceptCarve, a framework for identifying how abstract social concepts are expressed across communities by combining language model reasoning with scalable retrieval. I also introduced the Splits! dataset, a large Reddit-based dataset with demographic and topical annotations that supports investigating how different demographic groups communicate about shared topics. On the technical side, my work has involved large-scale data processing, retrieval, reranking, and text clustering, as well as dataset design, collection, and validation.

Before coming to Purdue, I earned my B.Sc. in Computer Science and Mathematics from the University of Nebraska–Lincoln. There, I worked with Prof. Stephen Scott on continuous-layered neural architectures guided by integral equations, and with Prof. M. R. Hasan on improving classification performance for rare classes.




News

  • May 2025 — Submitted VIBE: Can a VLM Read the Room? to EMNLP 2025 (under review) (arXiv).
  • May 2025 — Our paper, ConceptCarve: Dynamic Realization of Evidence, has been accepted to the ACL 2025 Main Conference! (link).
  • Apr 2025 — Submitted Splits! A Flexible Dataset for Evaluating a Model’s Demographic Social Inference (under review) (arXiv).
  • Jan 2025 — Released ACL Searcher, an open-source semantic search tool for ACL paper abstracts using ColBERT (GitHub).
  • Dec 2024 — Submitted ConceptCarve: Dynamic Realization of Evidence to ACL 2025 (under review) (arXiv).
  • Aug 2023 — Started my Ph.D. in Computer Science at Purdue University (advisor: Dan Goldwasser).
  • Aug 2020 — Completed a second UCARE undergraduate research fellowship at UNL with Prof. Stephen Scott. Presented Continuous-Layered Dense Artificial Neural Networks at the 2020 Virtual UCARE Symposium.
  • Aug 2019 — Completed a UCARE undergraduate research fellowship at UNL with Prof. M. R. Hasan. Presented Improving Accuracy of Rare Classes in Machine Learning Classifiers at the 2019 Virtual UCARE Symposium, and was featured in a UNL article.