SEA @ NeurIPS 2025

Scaling Environments for Agents (SEA)

NeurIPS 2025 Workshop

Date: Sat Dec 6th | Location: San Diego Convention Center, San Diego, USA

The development of intelligent agents – particularly those powered by large language models (LLMs) – has emphasized the critical role of environments in shaping agent behavior and capabilities, especially for achieving end-to-end autonomy. Environments are not merely testing grounds; they are dynamic, interactive contexts that serve as the essential "data" for agents to learn adaptive behavior, complex reasoning, and long-term decision-making skills. Just as scaling model size has led to emergent capabilities in LLMs, we posit that scaling the structure, fidelity, and diversity of environments is essential for unlocking new forms of agent intelligence. Moreover, recent advances in end-to-end reinforcement learning (RL), particularly when paired with LLM-based agents, have made it increasingly viable to train agents through sustained interaction. These agents can now acquire skills, strategies, and planning abilities through environmental feedback, rather than relying solely on imitation learning or static prompt engineering. As we move toward more autonomous, general-purpose agents, the need for scalable, richly interactive, and diverse environments has become both urgent and foundational.

Call for Papers: Scaling Environments for Agents

We invite submissions to two complementary tracks:

[Special Track] Agent Environment Design and Evaluation

The special track focuses on our core theme: agent environment design and evaluation. We welcome work that advances how environments are specified, generated, measured, and shared.

Task & World Specification: formalization, compositionality, affordance modeling, procedural generation, simulator integration.
Evaluation Methodologies: multi‑step interaction metrics, generalization tests, open‑ended benchmarks, curriculum scaling, human‑in‑the‑loop assessments.
Environment Exemplars (illustrative, non‑exhaustive):
- CodeArena: multi‑language software‑engineering sandbox for agent tool‑use benchmarking.
- HouseWorld: 3‑D embodied household simulator with spatial reasoning tasks.
- WebShop‑X: dynamic e‑commerce website emulator for goal‑conditioned browsing and checkout.
- SocialTown: multi‑agent social environment for coordination, negotiation, and role‑play evaluation.
Artifacts & Reproducibility: dataset/specification releases, leaderboards, reproducibility studies.

[General Track] Other Relevant Topics

The general track welcomes research broadly related to scaling environments for agents, including but not limited to:

LLMs in Interactive Environments: policy learning, planning, reward shaping, hybrid training (e.g. RLHF, PPO), interaction‑based fine‑tuning.
Tool‑Use and Software Environments: agents as programmers, API orchestration, agentic debugging, self‑healing code, software manipulation, web navigation.
Multi‑Agent & Social Environments: population scaling, emergent behaviors, communication, coordination, competition, social alignment and safety.
Embodiment & Grounding: perception‑action loops, physical simulation, spatial reasoning, robotics integration, sim‑to‑real transfer.
Sim2Real & Deployment: domain adaptation, real‑world API integration, robustness under scale, safety, large‑scale deployment.

Awards

All accepted papers will be presented in a poster session. Up to four outstanding papers (two per track) will be invited for oral presentations. Each track will confer its own Best Paper Award.

Submission Guidelines

We manage paper submissions through OpenReview. The review process is double‑blind, so submissions must be anonymized. We welcome work that is (1) original and unpublished, (2) recently published, or (3) work‑in‑progress. The workshop is non‑archival: accepted submissions will not be indexed, and there will be no archival proceedings.

Please use the NeurIPS 2025 LaTeX style file; it includes a preprint option for non‑anonymous preprints posted online (see additional formatting details here). Submissions should be PDFs of at most 9 pages, excluding references and appendices.

Date	Milestone
Aug 22	Paper Submission Deadline
Sep 22	Notification of Acceptance
Oct 10	Camera‑ready Paper Submission
Dec 6	Workshop at NeurIPS

Time	Session	Duration
09:00	Opening Remarks	10 min
09:10	Contributed Talks 1 & 2	50 min
10:00	Coffee Break	20 min
10:20	Poster Session 1	50 min
11:10	Invited Talks 1 & 2	60 min
12:10	Lunch Break	50 min
13:00	Invited Talks 3 & 4	60 min
14:00	Coffee Break	20 min
14:20	Contributed Talks 3 & 4	50 min
15:10	Poster Session 2	50 min
16:00	Invited Talks 5 & 6	60 min
17:00	Panel Discussion	50 min
17:50	Closing Remarks	10 min

Important Dates (Anywhere on Earth)

Invited Speakers

Panelists