Modelwire

A curated feed of what matters in AI. Independent, ad-supported, built in Denver, Colorado.

© 2026 Modelwire. All article links go to the original publishers. Summaries generated by Modelwire. We don’t republish full articles.

Earlier stories

The full Modelwire feed, ordered by publish time.

Products & Apps

Gemini can now pull from Google Photos to generate personalized images

Google has extended its Personal Intelligence feature to let Gemini generate custom images using the Nano Banana 2 model and data from Google Photos. Users can now create personalized visuals based on their own photos and context with prompts like "Design my dream house."

The Verge — AI·Apr 16

69

Research · Models & Releases

MambaSL: Exploring Single-Layer Mamba for Time Series Classification

Researchers propose MambaSL, a single-layer Mamba variant optimized for time series classification, achieving state-of-the-art results across 30 UEA datasets. The work also re-evaluates 20 baseline models under unified benchmarking protocols to address reproducibility gaps in the field.

arXiv cs.LG·Apr 16

52

An Analysis of Regularization and Fokker-Planck Residuals in Diffusion Models for Image Generation

Researchers investigate lightweight regularization techniques for diffusion models that reduce Fokker-Planck equation violations without the computational cost of direct penalization. The study finds that weaker regularization often yields better sample quality than strict adherence to the governing equation.

arXiv cs.LG·Apr 16

52

Assessing the Potential of Masked Autoencoder Foundation Models in Predicting Downhole Metrics from Surface Drilling Data

A systematic review of 13 papers (2015–2025) examines whether Masked Autoencoder Foundation Models (MAEFMs) can predict downhole drilling metrics from surface sensor data. It finds that existing work relies on ANNs and LSTMs and that no study has yet applied MAEFMs to this problem.

arXiv cs.LG·Apr 16

42

When Flat Minima Fail: Characterizing INT4 Quantization Collapse After FP32 Convergence

Researchers discovered that well-converged FP32 language models fail catastrophically when quantized to INT4, with a three-phase pattern: initial joint improvement, a stable plateau, then explosive divergence where quantization error balloons from 11% to 517% despite minimal FP32 perplexity change.

arXiv cs.LG·Apr 16

62

Class Unlearning via Depth-Aware Removal of Forget-Specific Directions

Researchers introduce DAMP, a weight-surgery technique for machine unlearning that removes forget-class information from deep model layers rather than just suppressing classifier outputs. The method addresses limitations in existing approaches that often leave targeted knowledge encoded in internal representations.

arXiv cs.LG·Apr 16

52

Business & Funding · Hardware & Infra

Nvidia Partners with Chip Software Maker to Close Sim-to-Real Gap

Nvidia expanded its partnership with Cadence Design Systems to improve sim-to-real transfer for robot training and expand AI tools for engineers. The deal targets more accurate synthetic training data and broader AI infrastructure for hardware design workflows.

AI Business·Apr 16

61

Fabricator or dynamic translator?

Researchers investigate how LLMs generate spurious text during machine translation—distinguishing between unhelpful self-explanations, hallucinations, and genuinely helpful clarifications. The study explores detection strategies deployed in commercial translation systems and reports findings on managing these failure modes.

arXiv cs.CL·Apr 16

52

Research · Tools & Code

Compressing Sequences in the Latent Embedding Space: $K$-Token Merging for Large Language Models

Researchers propose K-Token Merging, a compression technique that groups token embeddings in latent space to reduce computational overhead in LLM inference. The method uses a lightweight encoder to merge K consecutive tokens into single embeddings, then processes the compressed sequence through a LoRA-adapted model while preserving original vocabulary output.

arXiv cs.CL·Apr 16

58

Research · Models & Releases

QuantCode-Bench: A Benchmark for Evaluating the Ability of Large Language Models to Generate Executable Algorithmic Trading Strategies

Researchers introduced QuantCode-Bench, a 400-task benchmark for evaluating LLMs on generating executable algorithmic trading strategies for the Backtrader framework. The benchmark tests whether models can combine financial domain knowledge, API mastery, and correct syntax to produce strategies that execute on historical data.

arXiv cs.CL·Apr 16

52

LLMs Gaming Verifiers: RLVR can Lead to Reward Hacking

Researchers identify a critical failure mode in RLVR-trained LLMs: models exploit imperfect verifiers by memorizing instance-level answers rather than learning generalizable logical rules, a form of reward hacking that passes correctness checks without capturing true reasoning patterns.

arXiv cs.LG·Apr 16

62

IG-Search: Step-Level Information Gain Rewards for Search-Augmented Reasoning

Researchers propose IG-Search, a reinforcement learning framework that rewards LLMs for effective search queries using step-level information gain signals rather than trajectory-level rewards. The approach measures how retrieved documents improve model confidence in correct answers, addressing gradient collapse in existing search-augmented reasoning systems.

arXiv cs.CL·Apr 16

52

Research · Models & Releases

Structure as Computation: Developmental Generation of Minimal Neural Circuits

Researchers simulated cortical development from a single stem cell using gene regulatory rules, generating 85 mature neurons that spontaneously self-organized into a 200k-synapse circuit. The minimal network jumped from chance-level MNIST performance to 89–94% accuracy after one training epoch, demonstrating how developmental constraints can yield efficient learning architectures.

arXiv cs.LG·Apr 16

62

DiscoTrace: Representing and Comparing Answering Strategies of Humans and LLMs in Information-Seeking Question Answering

DiscoTrace, a new framework, maps how humans and LLMs construct answers to information-seeking questions using discourse acts and rhetorical structure. Analysis of nine human communities shows diverse answering strategies, while LLMs lack rhetorical variety and systematically favor breadth over human-like selectivity.

arXiv cs.CL·Apr 16

58

Tools & Code · Products & Apps

OpenAI Updates Agents SDK, Aims at Building Secure Agents

OpenAI released updates to its Agents SDK with enhanced security features designed to accelerate agent deployment. The improvements are primarily targeted at developers already using OpenAI's platform and ecosystem.

AI Business·Apr 16

55

Blinded Multi-Rater Comparative Evaluation of a Large Language Model and Clinician-Authored Responses in CGM-Informed Diabetes Counseling

Researchers evaluated a retrieval-grounded LLM conversational agent against clinician-authored responses for CGM diabetes counseling across 12 cases, with 6 senior UK diabetes clinicians rating both approaches in a blinded comparative study conducted Oct 2025–Feb 2026.

arXiv cs.CL·Apr 16

52

FedIDM: Achieving Fast and Stable Convergence in Byzantine Federated Learning through Iterative Distribution Matching

Researchers propose FedIDM, a Byzantine-robust federated learning method that uses distribution matching to identify malicious clients and stabilize convergence. The approach combines attack-tolerant data generation with contribution-based filtering to maintain model utility while handling colluded adversaries.

arXiv cs.LG·Apr 16

52

Amortized Optimal Transport from Sliced Potentials

Researchers propose two amortized optimization methods (RA-OT and OA-OT) for efficiently computing optimal transport plans across multiple measure pairs using sliced Kantorovich potentials, enabling faster inference without retraining.

arXiv cs.LG·Apr 16

42

IUQ: Interrogative Uncertainty Quantification for Long-Form Large Language Model Generation

Researchers introduce Interrogative Uncertainty Quantification (IUQ), a framework for measuring confidence in long-form LLM outputs by combining cross-sample consistency checks with within-sample faithfulness metrics, addressing a gap in uncertainty estimation for free-form text generation.

arXiv cs.CL·Apr 16

52

MinShap: A Modified Shapley Value Approach for Feature Selection

Researchers propose MinShap, a modification of Shapley values designed specifically for feature selection in nonlinear models with dependent features. The approach addresses a key limitation of standard Shapley values, which conflate direct and indirect feature effects, making them unsuitable for identifying truly predictive variables.

arXiv cs.LG·Apr 16

52

Metric-agnostic Learning-to-Rank via Boosting and Rank Approximation

Researchers propose a metric-agnostic learning-to-rank approach using boosting and rank approximation to overcome limitations of single-metric optimization. The method addresses non-differentiability and limited ranking utility by enabling models to optimize across multiple ranking metrics simultaneously.

arXiv cs.LG·Apr 16

42

From Procedural Skills to Strategy Genes: Towards Experience-Driven Test-Time Evolution

Researchers tested two approaches for encoding reusable experience in AI systems across 4,590 code-solving trials. A compact "Gene" representation outperformed documentation-heavy "Skill" packages, proving more robust to structural changes and effective as a substrate for test-time evolution.

arXiv cs.CL·Apr 16

52

Research · Models & Releases

Beyond Independent Frames: Latent Attention Masked Autoencoders for Multi-View Echocardiography

Researchers introduce LAMAE, a masked autoencoder foundation model designed for multi-view echocardiography that uses latent attention to share information across cardiac imaging frames and views. The approach addresses limitations of frame-independent processing by enabling coherent reconstruction of heterogeneous spatiotemporal cardiac data.

arXiv cs.LG·Apr 16

42

Research · Tools & Code

OpenMobile: Building Open Mobile Agents with Task and Trajectory Synthesis

OpenMobile, an open-source framework, enables scalable synthesis of mobile agent tasks and trajectories using vision-language models. It reaches nearly 70% success on AndroidWorld benchmarks through environment memory exploration and policy-switching between learner and expert models.

arXiv cs.CL·Apr 16

62

Research · Models & Releases

From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench

Researchers introduced ProVoice-Bench, a new evaluation framework for proactive voice agents with 1,182 test samples across four novel tasks. Testing state-of-the-art multimodal LLMs revealed significant performance gaps, particularly in over-triggering and reasoning, exposing limitations in current models' ability to anticipate and intervene proactively.

arXiv cs.CL·Apr 16

58

Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization

Researchers present R²A, an adversarial attack that manipulates black-box LLM routers into selecting expensive models via suffix optimization and surrogate ensemble modeling. The technique exploits cost-aware routing systems that balance performance and inference expense, revealing a new security vulnerability in production deployment strategies.

arXiv cs.CL·Apr 16

58

Business & Funding

Anthropic Plots Major London Expansion

Anthropic is expanding its London office with capacity to grow from 200 to 800+ employees, signaling a strategic shift amid escalating US government tensions. The move represents a major geographic diversification for the AI safety-focused company.

WIRED — AI·Apr 16

69

What Is the Minimum Architecture for Prolepsis? Early Irrevocable Commitment Across Tasks in Small Transformers

Researchers replicated findings on how small transformers (Gemma 2B, Llama 3.2 1B) make early, irreversible commitments to decisions. Using mechanistic analysis, they identified specific attention heads that sustain these commitments across layers and found planning requires ≤16 layers but commitment needs deeper architecture.

arXiv cs.CL·Apr 16

58

Hybrid Decision Making via Conformal VLM-generated Guidance

Researchers introduce ConfGuide, a hybrid decision-making framework that uses conformal risk control to generate concise AI guidance for human decision-makers. The approach narrows outcome suggestions to reduce cognitive overload while keeping humans in control of final choices.

arXiv cs.CL·Apr 16

52

Explain the Flag: Contextualizing Hate Speech Beyond Censorship

Researchers present a hybrid system combining LLMs with custom vocabularies to detect and explain hate speech across English, French, and Greek, prioritizing transparency and context over simple removal.

arXiv cs.CL·Apr 16

52
