Shaping Responsible Synthetic Data in the Era of Foundation Models

A Workshop at AAAI 2026

27th January 2026

Singapore Expo, Singapore

RSD @ AAAI 2026

Welcome to the AAAI'26 Workshop on Shaping Responsible Synthetic Data in the Era of Foundation Models (RSD).

Foundation models (LLMs and multimodal FMs) are increasingly supplemented with synthetic and LLM-generated data to extend training corpora, fill coverage gaps, and navigate privacy and fairness requirements. At the same time, these models serve as powerful generators of synthetic data for downstream applications, such as training specialized ML models, augmenting datasets in privacy-sensitive domains, and generating test cases for system validation.

But synthetic datasets, whether used to train FMs or generated by them for other uses, introduce a plethora of risks: legal (copyright, consent), security- and privacy-related (leakage, membership inference), ethical (bias amplification), and technical (model collapse, quality degradation).

This workshop examines how synthetic data can be responsibly generated and used to fuel, test, and govern foundation models across their lifecycle (pre-training, fine-tuning, evaluation, auditing), as well as how FM-generated synthetic data impacts downstream applications and systems, and what technical, ethical, and regulatory guardrails are needed across this synthetic data ecosystem.

Topics of Interest (but not limited to)

Lifecycle Uses & LLM‑Driven Generation

Synthetic data for pre-training and fine-tuning (RLHF/RLAIF), continual evaluation, and self-training or bootstrapping loops—along with their limitations.

Explainability, Interpretability & Uncertainty

Synthetic counterfactuals and narratives for debugging; quantifying uncertainty; cross-domain benchmarks spanning tabular, time-series, text, vision, and multimodal data.

Safety, Robustness & Red‑Teaming

Synthetic adversarial probes, jailbreak tests, and edge-case simulations; preventing model collapse or shortcut learning through robust real/synthetic mixes.

Fairness, Bias & Representation

Critical Perspectives on Synthetic Data

Cross-disciplinary examinations from law, policy, ethics, and social sciences; defining what synthetic data is and questioning its appropriateness and societal impacts; tensions between technological possibility and human authenticity; critical assessments of synthetic data's role in shaping future AI systems.

Standards, Metrics & Tooling for Trustworthy Use

Metric suites for fidelity, utility, and privacy; responsible generation protocols (e.g., source curation, prompt filtering, DP noise, provenance/watermarking, disclosure “data cards”); validation pipelines and audit checklists; open-source vs. commercial generators.

Privacy, Security & Data Governance

Differential privacy and other PETs; leakage and membership-inference risks; consent, copyright, and provenance concerns; comparative perspectives on regulation.

Call for Papers

Important Dates

  • Submission Due Date: October 20th, 2025, AoE
  • Notification of Acceptance: November 10th, 2025, AoE
  • Workshop Date: January 27th, 2026, Singapore

Submission Instructions

  • Papers should be no more than 4 pages in length, though additional pages for references and appendices are permitted. Note that reviewers are not required to read appendices during their evaluation.
  • Submissions must be in a single PDF, formatted with the official AAAI LaTeX template, and uploaded via OpenReview.
  • Submissions are double-blind. Please ensure your paper is fully anonymized. Papers that are not anonymized or that exceed the page limit will be automatically rejected.
  • There is no rebuttal period. Final decisions are based only on the initial submission and reviewer feedback. Rejected or withdrawn papers will remain private. Reviews will not be published.

This is a non-archival workshop. While accepted papers will be accessible on the workshop website, we do not publish formal proceedings. You may submit the following types of work:

  • Previously presented conference/workshop papers: Significant updates are required.
  • Journal papers not previously presented at conferences/workshops: Submissions are welcome if they offer novel value for the community.
  • Dual submission to other workshops: Generally allowed.

Call for Reviewers

We are looking for reviewers for the workshop. If you would like to volunteer as a reviewer, please fill out the Call for Reviewers Google Form.

Details on Program, Invited Speakers and Panelists

Coming soon

Workshop Organizers

Contact Us

Email us at aaai26-responsiblesyntheticdata@googlegroups.com