Scaling Environments for Agents (SEA)

NeurIPS 2025 Workshop

Date: Sat Dec 6th   |   Location: San Diego Convention Center, San Diego, USA

The development of intelligent agents – particularly those powered by large language models (LLMs) – has emphasized the critical role of environments in shaping agent behavior and capabilities, especially for achieving end-to-end autonomy. Environments are not merely testing grounds; they are dynamic, interactive contexts that serve as the essential "data" for agents to learn adaptive behavior, complex reasoning, and long-term decision-making skills. Just as scaling model size has led to emergent capabilities in LLMs, we posit that scaling the structure, fidelity, and diversity of environments is essential for unlocking new forms of agent intelligence. Moreover, recent advances in end-to-end reinforcement learning (RL), particularly when paired with LLM-based agents, have made it increasingly viable to train agents through sustained interaction. These agents can now acquire skills, strategies, and planning abilities through environmental feedback, rather than relying solely on imitation learning or static prompt engineering. As we move toward more autonomous, general-purpose agents, the need for scalable, richly interactive, and diverse environments has become both urgent and foundational.

Call for Papers

We invite submissions to two complementary tracks:

[Special Track] Agent Environment Design and Evaluation

The special track focuses on our core theme: agent environment design and evaluation. We welcome work that advances how environments are specified, generated, measured, and shared.

  • Task & World Specification: formalization, compositionality, affordance modeling, procedural generation, simulator integration.
  • Evaluation Methodologies: multi‑step interaction metrics, generalization tests, open‑ended benchmarks, curriculum scaling, human‑in‑the‑loop assessments.
  • Environment Exemplars (illustrative, non‑exhaustive):
    • CodeArena: multi‑language software‑engineering sandbox for agent tool‑use benchmarking.
    • HouseWorld: 3‑D embodied household simulator with spatial reasoning tasks.
    • WebShop‑X: dynamic e‑commerce website emulator for goal‑conditioned browsing and checkout.
    • SocialTown: multi‑agent social environment for coordination, negotiation, and role‑play evaluation.
  • Artifacts & Reproducibility: dataset/specification releases, leaderboards, reproducibility studies.

[General Track] Other Relevant Topics

The general track welcomes research broadly related to scaling environments for agents, including but not limited to:

  • LLMs in Interactive Environments: policy learning, planning, reward shaping, hybrid training (e.g. RLHF, PPO), interaction‑based fine‑tuning.
  • Tool‑Use and Software Environments: agents as programmers, API orchestration, agentic debugging, self‑healing code, software manipulation, web navigation.
  • Multi‑Agent & Social Environments: population scaling, emergent behaviors, communication, coordination, competition, social alignment and safety.
  • Embodiment & Grounding: perception‑action loops, physical simulation, spatial reasoning, robotics integration, sim‑to‑real transfer.
  • Sim2Real & Deployment: domain adaptation, real‑world API integration, robustness under scale, safety, large‑scale deployment.

Awards

All accepted papers will be presented in a poster session. Up to four outstanding papers (two per track) will be invited for oral presentations. Each track will confer its own Best Paper Award.

Submission Guidelines

We manage paper submissions through OpenReview. The review process is double‑blind, so submissions must be anonymized. We welcome work that is (1) original and unpublished, (2) recently published, or (3) work‑in‑progress. The workshop is non‑archival: accepted submissions will not be indexed or published in archival proceedings.

Please use the NeurIPS 2025 LaTeX style file; it includes a preprint option for non‑anonymous preprints posted online (see additional formatting details here). Submissions should be PDFs of ≤ 9 pages (excluding references and appendices).

Important Dates (Anywhere on Earth)

Paper Submission Deadline
Notification of Acceptance
Camera‑ready Paper Submission
Workshop at NeurIPS

Schedule (Tentative)

Time   Session                    Duration  Note
09:00  Opening Remarks            10 min
09:10  Contributed Talks 1 & 2    50 min
10:00  Coffee Break               20 min
10:20  Poster Session 1           50 min
11:10  Invited Talks 1 & 2        60 min
12:10  Lunch Break                50 min
13:00  Invited Talks 3 & 4        60 min
14:00  Coffee Break               20 min
14:20  Contributed Talks 3 & 4    50 min
15:10  Poster Session 2           50 min
16:00  Invited Talks 5 & 6        60 min
17:00  Panel Discussion           50 min
17:50  Closing Remarks            10 min

Support Team

  • Web Chair: Douglas Lai
  • Logistics Coordinators: Some people here...

We thank our support team for their dedication and behind-the-scenes work that made this workshop possible.