All datasets on this page are experimental panel data generated by the semi-synthetic simulators CausalMP and LLM-SocioPol.
Each dataset contains complete treatment assignments, observed outcomes over time, and the ground-truth counterfactual trajectories for the all-treated and all-control scenarios. This makes the datasets useful for benchmarking causal estimators under network interference and for validating counterfactual predictions.
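As a rough illustration of how the ground-truth counterfactuals can serve as a benchmark, the sketch below computes the per-period gap between the all-treated and all-control mean outcomes. The function name and the nested-list layout (one row per unit, one column per period) are assumptions made for this example, not the released file format, and the numbers are made up.

```python
def total_treatment_effect(y_all_treated, y_all_control):
    """Per-period mean outcome under all-treated minus all-control.

    Both panels are assumed to be lists of rows, one row per unit and
    one column per period (a hypothetical layout for illustration).
    """
    if len(y_all_treated) != len(y_all_control):
        raise ValueError("panels must contain the same number of units")
    n_units = len(y_all_treated)
    n_periods = len(y_all_treated[0])
    effects = []
    for t in range(n_periods):
        mean_treated = sum(row[t] for row in y_all_treated) / n_units
        mean_control = sum(row[t] for row in y_all_control) / n_units
        effects.append(mean_treated - mean_control)
    return effects


# Toy panel with 3 units and 2 periods (made-up numbers).
y_treated = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y_control = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
tte = total_treatment_effect(y_treated, y_control)
```

An estimator applied to the observed panel can then be scored against this per-period series, for example by its absolute error in each period.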
Below are short descriptions of the seven experimental settings currently available.
This dataset comes from our LLM-SocioPol simulator, which mirrors a large-scale online mobilization experiment similar in spirit to the 61-million-user Facebook voting study. The experiment assigns each user to either an informational message or a social message that displays peer civic behavior. The main outcomes are daily voting intentions generated by LLM-based agents whose personas are grounded in realistic demographic, behavioral, and activity data drawn from US Census microdata and the Twitter network. Interference arises through the social network, since each user’s behavior evolves in response to the posts and actions of their friends. We release five independent runs of the panel outcomes together with ground-truth counterfactual trajectories for the all-treated and all-control scenarios. Detailed logs of user-level interactions, including generated posts, exposures, and histories, are also available; interested researchers can email us to request access to these richer interaction datasets. [Data] [Related paper]
This environment simulates a social media platform where AI agents interact through content feeds. Treatments change the feed-ranking algorithm: the control ranking shows posts in random order, while the treated ranking prioritizes posts with high friend engagement. Outcomes measure user engagement, defined as the number of likes and replies per user in each round. Interference arises through the directed follower graph, generated via a preferential-attachment model, since ranking changes for one user alter what their neighbors see and how they respond. Agent personas and interests come from demographic profiles based on US Census data. Ten independent runs include full panels and ground-truth counterfactuals for the all-treated and all-control trajectories. [Data] [Related paper]
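The two ranking arms described above can be sketched as follows. This is a minimal illustration, not the simulator's code: the function `rank_feed`, the `friend_engagement` dictionary (post id mapped to a count of friends' likes and replies), and the toy post ids are all hypothetical.

```python
import random


def rank_feed(posts, friend_engagement, treated, rng):
    """Order a user's candidate posts under the control or treated ranking.

    posts: list of post ids (hypothetical).
    friend_engagement: dict mapping post id to the number of likes and
        replies from the user's friends (assumed to be precomputed).
    treated: True for engagement-based ranking, False for random order.
    rng: a random.Random instance, used only in the control arm.
    """
    if treated:
        # Treated arm: posts with the most friend engagement come first.
        return sorted(posts, key=lambda p: friend_engagement.get(p, 0), reverse=True)
    # Control arm: random order.
    shuffled = list(posts)
    rng.shuffle(shuffled)
    return shuffled


posts = ["a", "b", "c"]
engagement = {"a": 0, "b": 5, "c": 2}
treated_feed = rank_feed(posts, engagement, treated=True, rng=random.Random(0))
control_feed = rank_feed(posts, engagement, treated=False, rng=random.Random(0))
```

Because a user's likes and replies feed into the engagement counts seen by their followers, changing one user's ranking can change what ranks highly for others, which is the interference channel the paragraph describes.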
This environment implements an opinion-diffusion process based on a network coordination game. Treatments are promotional interventions encouraging adoption of Opinion A, while outcomes record whether each user adopts Opinion A or sticks to Opinion B in each period. Interference arises because adoption decisions depend on neighbors’ current opinions through payoff-based coordination incentives. The underlying social graphs come from three real Pokec community networks with demographic attributes. Ten independent runs for each network (Krupina: 3,366 users, Topolcany: 18,246 users, and Zilina: 42,971 users) include panel outcomes and ground-truth counterfactuals for the all-treated and all-control scenarios. [Data] [Related paper]
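To make the payoff-based coordination incentive concrete, here is a minimal sketch of one synchronous update in a two-opinion coordination game. The payoff values, the additive `promo_bonus` for treated users, the tie-breaking rule in favor of Opinion B, and the toy graph are all assumptions chosen for illustration; the simulator's actual payoff specification may differ.

```python
def best_response(neighbor_opinions, treated=False,
                  payoff_a=1.0, payoff_b=1.0, promo_bonus=0.5):
    """Pick the opinion with the higher coordination payoff.

    Each neighbor holding the same opinion contributes one unit of payoff
    (payoff_a or payoff_b). A treated user gets an assumed additive bonus
    for adopting Opinion A. Ties go to Opinion B.
    """
    n_a = sum(1 for o in neighbor_opinions if o == "A")
    n_b = len(neighbor_opinions) - n_a
    utility_a = payoff_a * n_a + (promo_bonus if treated else 0.0)
    utility_b = payoff_b * n_b
    return "A" if utility_a > utility_b else "B"


def step(graph, opinions, treated_users):
    """Update every user simultaneously from last period's opinions."""
    return {
        user: best_response([opinions[v] for v in neighbors],
                            treated=user in treated_users)
        for user, neighbors in graph.items()
    }


# Toy triangle network (hypothetical): user 2 receives the promotion.
graph = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
opinions = {1: "A", 2: "B", 3: "B"}
next_opinions = step(graph, opinions, treated_users={2})
```

In this toy run the promotion tips user 2 toward Opinion A, while user 1 switches to B because both of its neighbors held B in the previous period.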
This environment models interference among urban taxi routes in New York City. Treatments represent pricing interventions applied to randomly selected routes. Outcomes are route-level trip counts measured every six hours. Interference arises because adjacent or functionally related routes shift demand toward or away from one another, captured through a route-adjacency network derived from geographic and operational proximity. Baseline temporal patterns come from 58 million real high-volume taxi trips from January to March 2024. Ten independent runs contain full observed panels and ground-truth counterfactuals under all-treated and all-control scenarios. [Data] [Related paper]
This environment captures a digital health experiment where treatments are motivational messages encouraging physical activity. Outcomes are binary exercise decisions in each round. Interference occurs through Twitter-based social circles, since an individual’s activity influences their peers’ probability of exercising. User demographics and weekly activity cycles follow patterns constructed from US Census microdata. Ten independent runs provide panel outcomes and ground-truth counterfactual paths for the all-treated and all-control settings. [Data] [Related paper]
This environment simulates a network of interconnected servers handling stochastic job arrivals. Treatments increase processing power for a subset of servers. Outcomes record server utilization, that is, the fraction of each interval during which the server is busy. Interference arises because join-the-shortest-queue routing links servers: enhancing one server’s capacity alters downstream loads elsewhere in the network. The workload uses realistic, time-varying Poisson arrival rates with heterogeneous job types. Ten independent runs include full panels and counterfactual trajectories for the all-treated and all-control cases. [Data] [Related paper]
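The interference created by join-the-shortest-queue routing can be seen in a small discrete-time sketch. Everything here is a simplifying assumption for illustration: arrivals are a fixed list rather than Poisson draws, each server completes up to `capacity` jobs per step, ties go to the lowest server index, and utilization is approximated as the fraction of steps in which a server has work.

```python
def route_job(queue_lengths):
    """Join-the-shortest-queue: return the index of the shortest queue.

    Ties are broken in favor of the lowest index (an assumption).
    """
    return min(range(len(queue_lengths)), key=lambda i: queue_lengths[i])


def simulate_utilization(arrivals_per_step, capacities):
    """Fraction of steps in which each server has at least one job."""
    queues = [0] * len(capacities)
    busy_steps = [0] * len(capacities)
    for n_arrivals in arrivals_per_step:
        # Route each arriving job to the currently shortest queue.
        for _ in range(n_arrivals):
            queues[route_job(queues)] += 1
        # Each server works if it has jobs, completing up to its capacity.
        for i, capacity in enumerate(capacities):
            if queues[i] > 0:
                busy_steps[i] += 1
            queues[i] = max(0, queues[i] - capacity)
    return [b / len(arrivals_per_step) for b in busy_steps]


arrivals = [3, 1, 1, 1]  # made-up arrival counts
control = simulate_utilization(arrivals, capacities=[1, 1])
treated = simulate_utilization(arrivals, capacities=[2, 1])  # server 0 upgraded
```

In this toy run, upgrading only server 0 lowers the utilization of the untreated server 1 from 0.5 to 0.25, because server 0 clears its backlog faster and attracts the next arrival.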
This environment models repeated auctions where bidders compete over heterogeneous objects. Treatments raise valuations for selected objects by a fixed percentage. Outcomes measure realized object values in each auction round. Interference arises through strategic bidding and market-clearing prices: shifting valuations for one object affects bidding incentives and price formation for others. The simulator includes diverse bidder types with different valuation patterns, mimicking real market heterogeneity. Ten independent runs include observed panels and ground-truth counterfactuals for the all-treated and all-control scenarios. [Data] [Related paper]