Research

Open Loop Lab studies problems where cognition, social behavior, evidence, and evaluation intersect. Each project is built through iterative human-AI collaboration, with human judgment governing the research question, assumptions, analysis, and interpretation.

The Violence Project

The Violence Project is a multi-level evidence synthesis on interpersonal violence and conflict. It organizes research across individual, relational, community, and societal levels of analysis, with an emphasis on mechanisms rather than isolated findings.

The project began as a structured literature and evidence dashboard. That dashboard remains the living knowledge base: sources are organized by ecological level, tagged by mechanism, and designed to support cumulative synthesis rather than one-time summary.

The current modeling work extends that synthesis into historical conflict analysis. Using path integral conflict modeling, the project represents conflict as movement through a multidimensional state space, where different trajectories approach different outcome basins: military victory, negotiated settlement, frozen conflict, external coercion, conflict expansion, and related endpoints.
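As a rough illustration of how Boltzmann-weighted basin probabilities can be computed, the sketch below assigns each candidate trajectory a weight of exp(-action / temperature) and sums those weights by the basin where the trajectory ends. This is a minimal sketch under stated assumptions, not the project's implementation: the function name, the (basin, action) input format, and the temperature parameter are all hypothetical, and the action values would have to come from whatever cost function the model defines.

```python
import math
from collections import defaultdict


def basin_probabilities(trajectories, temperature=1.0):
    """Estimate basin probabilities from (basin_label, action) pairs.

    Assumption: each trajectory has already been scored with a scalar
    "action" (lower means more plausible). The scoring itself is not
    shown here and is not specified by the source text.
    """
    if not trajectories:
        raise ValueError("at least one trajectory is required")
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Subtract the smallest action before exponentiating so the weights
    # stay numerically stable; the shift cancels out after normalization.
    min_action = min(action for _, action in trajectories)

    weights = defaultdict(float)
    for basin, action in trajectories:
        weights[basin] += math.exp(-(action - min_action) / temperature)

    total = sum(weights.values())
    return {basin: weight / total for basin, weight in weights.items()}


# Illustrative input only: the action values are placeholders, not results.
example = [
    ("negotiated settlement", 0.0),
    ("frozen conflict", math.log(3.0)),
]
print(basin_probabilities(example))
```

In this reading, a higher temperature flattens the distribution across basins, while a lower temperature concentrates probability on the basins reached by the lowest-action trajectories.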

Initial retrodictive analyses focus on Bosnia and Kosovo. The models evaluate how combinations of military, diplomatic, proxy, and external-intervention variables alter the probability of each outcome. Counterfactual analyses identify load-bearing variables and test whether particular events were necessary, sufficient, or merely contingent within the modeled trajectory.
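One simple way to picture counterfactual sensitivity analysis is to flip one variable at a time and record how far the modeled outcome probability moves. The sketch below assumes binary variables and an arbitrary model function that maps a state to a probability. Everything in it is hypothetical: the variable names, the toy logistic model, and its weights are placeholders chosen for illustration, not estimates from the Bosnia or Kosovo analyses.

```python
import math


def counterfactual_shifts(model, baseline):
    """Flip each binary variable in turn and measure the probability shift.

    model: any callable mapping a dict of 0/1 variables to a probability.
    baseline: the observed (factual) variable settings.
    Returns a dict ordered from largest to smallest absolute shift.
    """
    base_probability = model(baseline)
    shifts = {}
    for name, value in baseline.items():
        altered = dict(baseline)
        altered[name] = 1 - value  # the counterfactual setting
        shifts[name] = model(altered) - base_probability
    ranked = sorted(shifts.items(), key=lambda item: abs(item[1]), reverse=True)
    return dict(ranked)


def toy_model(state):
    """Toy logistic model; the weights are placeholders, NOT estimates."""
    weights = {
        "external_intervention": 2.0,
        "diplomatic_pressure": 1.0,
        "proxy_support": -0.5,
    }
    score = sum(weights[name] * state[name] for name in weights) - 1.0
    return 1.0 / (1.0 + math.exp(-score))


baseline = {
    "external_intervention": 1,
    "diplomatic_pressure": 1,
    "proxy_support": 0,
}
for name, shift in counterfactual_shifts(toy_model, baseline).items():
    print(f"{name}: {shift:+.3f}")
```

Variables whose removal produces the largest shift are candidates for "load-bearing" status in this sense. Testing necessity or sufficiency would need additional criteria, such as a probability threshold, which this sketch does not attempt.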

Methods: systematic evidence synthesis, ecological coding, path integral modeling, Boltzmann-weighted basin probability, counterfactual sensitivity analysis, UCDP event data, interactive dashboard development.

Hallucitations

Hallucitations studies AI-generated citation errors in scholarly and professional writing.

The term names a specific failure mode: fabricated or distorted citations that look plausible enough to survive into drafts, submissions, or published work. The problem is not just that AI systems sometimes invent sources. The larger issue is that scholarly trust depends on citation practices that can be quietly degraded when references are not checked against real documents.

The pilot study examines when citation hallucinations occur, how they can be detected, and whether risk varies by domain, author expertise, and paper section. The project combines automated citation extraction, source verification, LLM-assisted flagging, and manual review.
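The extraction and verification steps can be pictured with a small sketch. The code below pulls author-year citations of the form "(Name, 2020)" or "(Name et al., 2020)" out of a passage and flags any that do not appear in a set of already verified sources. It is an illustration under narrow assumptions, not the project's pipeline: the regular expression covers only that one citation style, the function names are hypothetical, and a real verification step would check candidates against actual documents or bibliographic records rather than a hand-built set.

```python
import re

# Assumption: parenthetical author-year citations with a single surname,
# optionally followed by "et al.". Other styles are not handled.
CITATION_PATTERN = re.compile(
    r"\(([A-Z][A-Za-z\-]+)(?: et al\.)?,\s*(\d{4})\)"
)


def extract_citations(text):
    """Return (surname, year) pairs found in the text, in order."""
    return [
        (match.group(1), int(match.group(2)))
        for match in CITATION_PATTERN.finditer(text)
    ]


def flag_unverified(text, verified_sources):
    """Return citations that are absent from the verified set.

    A flag means "needs manual review", not "fabricated".
    """
    return [
        citation
        for citation in extract_citations(text)
        if citation not in verified_sources
    ]


# The names below are invented placeholders, not real references.
sample = "Prior work (Smith, 2020) and (Jones et al., 2019) disagree."
verified = {("Smith", 2020)}
print(flag_unverified(sample, verified))
```

Treating a flag as a prompt for manual review, rather than as a verdict, matches the combination of automated flagging and human checking described above.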

The broader question is epistemological: what happens to scholarly communication when writing systems can produce fluent claims faster than people can verify their evidentiary basis?

Methods: text analysis, citation extraction, source verification, LLM-assisted detection, manual coding, expertise assessment.

Evaluation Methods

Evaluation Methods compares frameworks from educational assessment and AI system evaluation.

The two fields have developed largely in parallel, but they face structurally similar problems. Both ask whether a system does what it claims to do, whether performance is reliable, whether scores or outputs are valid for their intended use, and whether results generalize beyond the evaluation setting.

Educational assessment has a long tradition of work on validity theory, reliability, construct representation, fairness, and consequential validity. AI evaluation has developed rapidly around benchmarks, task performance, human preference ratings, adversarial testing, and model behavior under distribution shift.

This project maps the overlap between the two domains, identifies blind spots in each, and asks how evaluation practice might improve if the fields were read together.

Methods: framework analysis, conceptual synthesis, comparative methodology, validity analysis.