Research Tools & Code·arXiv cs.CL·4d ago

Learning Whom to Trust: Market-Feedback Adaptive Retrieval for Frozen LLMs in Event-Driven Financial RAG

Researchers propose a frozen-LLM architecture for financial event prediction that decouples retrieval ranking from language understanding. Rather than relying on static textual similarity, the system learns which information sources matter most through market feedback, updating a Bayesian memory layer as predictions mature against actual returns. This approach addresses a core RAG limitation: relevance varies by context and time horizon, yet most systems rank evidence by textual similarity alone. The work suggests that production LLM systems can remain static while adaptive retrieval layers capture domain-specific signal patterns, potentially reducing retraining costs in high-stakes applications.
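To make the feedback loop concrete, here is a minimal sketch of how a Bayesian memory layer could credit or debit a source once a prediction matures. The summary does not give the paper's actual update rule, so everything here is an assumption for illustration: the Beta-Bernoulli model, the `SourceMemory` and `BetaBelief` names, the source labels, and the rule that a prediction counts as correct when its direction matches the sign of the realized return.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Beta(alpha, beta) belief about how often a source's evidence proves right."""
    alpha: float = 1.0  # prior pseudo-count of confirmed predictions
    beta: float = 1.0   # prior pseudo-count of refuted predictions

    @property
    def mean(self) -> float:
        # Posterior mean of the Beta distribution.
        return self.alpha / (self.alpha + self.beta)


class SourceMemory:
    """Hypothetical memory layer: one Beta belief per information source."""

    def __init__(self) -> None:
        # Unseen sources start at the uniform prior Beta(1, 1).
        self.beliefs = defaultdict(BetaBelief)

    def update(self, source: str, predicted_up: bool, realized_return: float) -> None:
        """Credit or debit a source once a prediction it supported has matured."""
        # Assumed correctness rule: predicted direction matches the return's sign.
        correct = predicted_up == (realized_return > 0)
        belief = self.beliefs[source]
        if correct:
            belief.alpha += 1.0
        else:
            belief.beta += 1.0

    def trust(self, source: str) -> float:
        return self.beliefs[source].mean


# Made-up outcomes for illustration only; these are not results from the paper.
memory = SourceMemory()
memory.update("sec_filing", predicted_up=True, realized_return=0.021)
memory.update("sec_filing", predicted_up=False, realized_return=-0.008)
memory.update("social_post", predicted_up=True, realized_return=-0.015)

print(round(memory.trust("sec_filing"), 3))   # 0.75
print(round(memory.trust("social_post"), 3))  # 0.333
```

Nothing in this loop touches the language model: only the per-source counts change, which is the property the summary highlights.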

Modelwire context

Explainer

The paper's actual contribution is narrower than it appears: the frozen LLM isn't new, but using realized returns as a feedback signal to update only the retrieval ranker is. This sidesteps the assumption that textual similarity predicts relevance in finance, where the same document matters differently depending on time horizon and market regime.
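The horizon dependence can be illustrated with a small re-ranking sketch. It is not the paper's ranker: the linear blend, the `weight` and `prior` parameters, the `(source, horizon)` trust table, and all numeric values are hypothetical and chosen only to show how the same two documents can swap order between horizons.

```python
from typing import NamedTuple


class Doc(NamedTuple):
    doc_id: str
    source: str
    similarity: float  # textual similarity to the query, in [0, 1]


def rerank(docs, trust, horizon, weight=0.5, prior=0.5):
    """Order docs by a blend of similarity and learned trust for this horizon.

    trust maps (source, horizon) to a value in [0, 1]; unseen pairs get `prior`.
    """
    def score(doc):
        learned = trust.get((doc.source, horizon), prior)
        return (1.0 - weight) * doc.similarity + weight * learned

    return sorted(docs, key=score, reverse=True)


# Illustrative, made-up trust values; these are not results from the paper.
trust = {
    ("newswire", "1d"): 0.70,
    ("newswire", "20d"): 0.45,
    ("sec_filing", "1d"): 0.40,
    ("sec_filing", "20d"): 0.80,
}

docs = [
    Doc("headline-1", "newswire", 0.82),
    Doc("10q-excerpt", "sec_filing", 0.74),
]

print([d.doc_id for d in rerank(docs, trust, "1d")])   # ['headline-1', '10q-excerpt']
print([d.doc_id for d in rerank(docs, trust, "20d")])  # ['10q-excerpt', 'headline-1']
```

Setting `weight=0.0` recovers pure similarity ranking, which gives a simple baseline to compare the adaptive ordering against.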

This connects directly to the on-device learning survey from late May, which mapped how systems must adapt after deployment when real-world conditions diverge from training data. Here, the adaptation happens not in the language model itself but in the retrieval layer, learning which sources actually predict returns as market conditions shift. The frozen-LLM constraint also echoes the fixed-point masked generative modeling work from the same period, which showed that keeping core model weights static while optimizing auxiliary components (solvers there, rankers here) can reduce retraining friction in production systems.

If FinRL-DeepSeek publishes live trading results showing that the Bayesian retrieval layer's signal decays predictably over quarters (and so needs periodic re-ranking), that would be evidence the approach holds up beyond backtests. If the same team or competitors report that a static LLM plus adaptive retrieval outperforms end-to-end fine-tuning on held-out event windows, the cost argument holds; otherwise it is just a different trade-off.

Coverage we drew on

What changes after deployment? A survey on On-device Learning in TinyML · arXiv cs.LG

This analysis is generated by Modelwire’s editorial layer from our archive and the summary above. It is not a substitute for the original reporting.

Mentions: FinRL-DeepSeek · FNSPID · Nasdaq · SEC


