Research Tools & Code · arXiv cs.LG · Apr 17

Sketching the Readout of Large Language Models for Scalable Data Attribution and Valuation

Researchers propose RISE, a scalable method for attributing and valuing training data in large language models by focusing on influence hotspots at the output layer rather than computing gradients across entire models. The technique uses dual-channel sketching to reduce computational overhead, addressing a major bottleneck in understanding which data drives LLM behavior.
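The summary gives no algorithmic detail, so what follows is a minimal sketch of the general idea, not the paper's implementation. It assumes a linear readout trained with softmax cross-entropy, where each example's gradient with respect to the readout matrix is an outer product of an error vector and a hidden state. It also assumes that "dual-channel" means sketching those two factors separately, and it uses a plain gradient dot product as a stand-in influence score. All function names, sizes, and sketch widths are hypothetical.

```python
import numpy as np

def make_countsketch(dim, width, rng):
    """Draw a random bucket index and a random sign for each of `dim` coordinates."""
    buckets = rng.integers(0, width, size=dim)
    signs = rng.choice([-1.0, 1.0], size=dim)
    return buckets, signs

def countsketch(x, buckets, signs, width):
    """Compress x to `width` numbers by summing signed entries per bucket."""
    out = np.zeros(width)
    np.add.at(out, buckets, signs * x)  # unbuffered add handles repeated buckets
    return out

def readout_factors(hidden, logits, target):
    """Return (e, h) such that the readout-matrix gradient is outer(e, h).

    e = softmax(logits) - onehot(target); h is the hidden state itself.
    """
    p = np.exp(logits - logits.max())
    p /= p.sum()
    p[target] -= 1.0
    return p, hidden

# Hypothetical sizes, chosen only so the demo runs quickly.
rng = np.random.default_rng(0)
vocab, d_model = 2000, 128
k_err, k_hid = 256, 64  # sketch widths for the two channels (assumed)

W = rng.normal(scale=0.05, size=(vocab, d_model))  # stand-in readout matrix
err_sketch = make_countsketch(vocab, k_err, rng)
hid_sketch = make_countsketch(d_model, k_hid, rng)

def factors_and_sketches(hidden, target):
    e, h = readout_factors(hidden, W @ hidden, target)
    se = countsketch(e, *err_sketch, k_err)
    sh = countsketch(h, *hid_sketch, k_hid)
    return e, h, se, sh

# One query example and a handful of stand-in "training" examples.
query = factors_and_sketches(rng.normal(size=d_model), target=7)
train = [factors_and_sketches(rng.normal(size=d_model), int(rng.integers(vocab)))
         for _ in range(5)]

for i, (e, h, se, sh) in enumerate(train):
    eq, hq, seq, shq = query
    # <outer(eq, hq), outer(e, h)> factorizes into a product of two dot products,
    # so the full vocab x d_model gradient never has to be materialized.
    exact = (eq @ e) * (hq @ h)
    approx = (seq @ se) * (shq @ sh)
    print(f"train example {i}: exact={exact:+.4f} sketched={approx:+.4f}")
```

Under these assumptions each training example is stored as k_err + k_hid numbers rather than vocab × d_model gradient entries. Because the two sketches are drawn independently, the product of the two sketched dot products is an unbiased estimate of the exact score, with variance that shrinks as the sketch widths grow.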

Modelwire context

Explainer

The real significance isn't just speed: data attribution at scale is a prerequisite for meaningful data governance, so RISE is as much a compliance and auditing tool as an efficiency one. Most coverage of influence functions treats them as a research curiosity, but the ability to cheaply ask 'which training examples caused this output' has direct implications for copyright disputes, data poisoning detection, and model audits.

This connects most directly to the reliability and interpretability thread running through recent Modelwire coverage. The 'Diagnosing LLM Judge Reliability' piece from April 16 exposed how aggregate consistency metrics can mask per-instance failures, and RISE addresses a structurally similar problem: aggregate model behavior obscures which specific data is responsible for specific outputs. Both papers are pushing toward finer-grained accountability in LLM pipelines. The 'Context Over Content' evaluation-faking story is also relevant here, since a method that can attribute outputs to training data could, in principle, help diagnose why a judge model behaves the way it does under stakes signaling. That connection is speculative, but the tooling gap both papers point at is the same one.

Watch whether RISE gets adopted in any data marketplace or licensing audit context within the next six months. If a legal or compliance use case cites it before an ML efficiency use case does, that confirms the governance angle is the actual driver of interest here.

Coverage we drew on

Diagnosing LLM Judge Reliability: Conformal Prediction Sets and Transitivity Violations · arXiv cs.LG

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: RISE · CountSketch · LLM

Modelwire Editorial

This synthesis and analysis was prepared by the Modelwire editorial team. We use advanced language models to read, ground, and connect the day’s most significant AI developments, providing original strategic context that helps practitioners and leaders stay ahead of the frontier.

Modelwire summarizes, we don’t republish. The full content lives on arxiv.org. If you’re a publisher and want a different summarization policy for your work, see our takedown page.