AISci Brief

AI research agents need provenance before they need more polish.

Research agents can read, code, summarize, and propose experiments. The scientific value comes only when their outputs preserve sources, assumptions, failed attempts, data, code, and uncertainty.

Problem: Reproducibility
Users: Labs + reviewers
Output: Evidence objects

The search problem

People searching for reproducible AI research agents are not looking for another chatbot. They are looking for a way to make AI-generated scientific claims inspectable, executable, and reviewable by another researcher.

reproducible AI research agents, AI for science provenance, research agent workflow, scientific source trails, AI replication benchmark, open science automation

What every agent output should preserve

  • Claim: the exact scientific statement or hypothesis, with uncertainty and scope.
  • Evidence: source links, quoted methods, data provenance, and missing counter-evidence.
  • Execution: notebook, code, environment, parameters, seeds, and run instructions.
  • Failure log: rejected hypotheses, failed runs, negative evidence, and reviewer disagreement.
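
One way to make the four items above concrete is a small record type that a reviewer can serialize and check for gaps. The sketch below is only an illustration: the class names, field names, and the `empty_fields` helper are assumptions made for this example, not a schema defined by AISci, and the example values are placeholders.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass, field


@dataclass
class Evidence:
    source_links: list[str]
    quoted_methods: list[str]
    data_provenance: str
    missing_counter_evidence: list[str] = field(default_factory=list)


@dataclass
class Execution:
    notebook: str
    code: str
    environment: str
    parameters: dict
    seeds: list[int]
    run_instructions: str


@dataclass
class EvidenceObject:
    # Hypothetical layout: one field group per bullet in the list above.
    claim: str
    uncertainty: str
    scope: str
    evidence: Evidence
    execution: Execution
    failure_log: list[str] = field(default_factory=list)


def empty_fields(record: dict, prefix: str = "") -> list[str]:
    """Return dotted names of fields that are None or empty."""
    missing = []
    for name, value in record.items():
        path = prefix + name
        if isinstance(value, dict) and value:
            # Non-empty nested record: check its fields in turn.
            missing.extend(empty_fields(value, path + "."))
        elif value is None or (
            isinstance(value, (str, list, dict)) and len(value) == 0
        ):
            missing.append(path)
    return missing


# Placeholder values, not taken from any real paper.
example = EvidenceObject(
    claim="Hypothetical claim: method A improves metric B on dataset C.",
    uncertainty="single run, no confidence interval",
    scope="dataset C only",
    evidence=Evidence(
        source_links=["https://example.org/paper"],
        quoted_methods=[],
        data_provenance="public release of dataset C",
    ),
    execution=Execution(
        notebook="analysis.ipynb",
        code="src/",
        environment="environment.yml",
        parameters={"learning_rate": 0.01},
        seeds=[0],
        run_instructions="see README",
    ),
)

gaps = empty_fields(asdict(example))
print(gaps)
```

Here an empty failure log is reported as a gap rather than silently accepted. That is a design choice for this sketch: it forces the author to state explicitly that nothing failed instead of leaving the field blank.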

Scientists and institutions AISci should keep mapping

  • Open-science communities and reproducibility editors who define review norms.
  • AI-for-science teams building agents that run executable workflows instead of only summaries.
  • Data stewards and benchmark builders who can convert messy research artifacts into durable evidence objects.

Proof-of-work task for young researchers

Choose one important AI-for-science paper and turn it into a reproducible packet: claim map, source list, executable notebook, environment file, failed-run notes, and a one-page limitation memo.
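
A packet like this can be checked for completeness before submission. The sketch below assumes one file per component. The brief names the six components but not a directory layout, so every file name here is hypothetical.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Hypothetical file names for the six components named in the task.
REQUIRED_ITEMS = {
    "claim map": "claim_map.md",
    "source list": "sources.md",
    "executable notebook": "notebook.ipynb",
    "environment file": "environment.yml",
    "failed-run notes": "failed_runs.md",
    "limitation memo": "limitations.md",
}


def missing_items(packet_dir: Path) -> list[str]:
    """Return the labels of packet components with no file on disk."""
    return [
        label
        for label, filename in REQUIRED_ITEMS.items()
        if not (packet_dir / filename).is_file()
    ]


# Demonstration on a throwaway directory holding only two components.
with tempfile.TemporaryDirectory() as tmp:
    packet = Path(tmp)
    (packet / "claim_map.md").write_text("claim map placeholder\n")
    (packet / "environment.yml").write_text("name: placeholder\n")
    demo_missing = missing_items(packet)

print(demo_missing)
```

The check confirms only that each file exists. Whether the notebook actually runs in the stated environment still has to be verified by executing it.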


Why capital should care

Pharma, materials, climate, and enterprise R&D teams all want faster discovery, but regulated and high-stakes domains also need proof. The commercial wedge is provenance infrastructure for serious labs.

Sources and next reading