Modelwire

Learning to Think Like a Cartoon Captionist: Incongruity-Resolution Supervision for Multimodal Humor Understanding

Researchers introduce Incongruity-Resolution Supervision (IRS), a framework grounded in the incongruity-resolution theory of humor that decomposes humor understanding into incongruity detection, resolution modeling, and preference alignment. The framework is tested on the New Yorker Cartoon Caption Contest benchmark.

Mentions: New Yorker Cartoon Caption Contest · IRS · incongruity-resolution theory

Modelwire summarizes — we don’t republish. The full article lives on arxiv.org. If you’re a publisher and want a different summarization policy for your work, see our takedown page.
