Learning to Think Like a Cartoon Captionist: Incongruity-Resolution Supervision for Multimodal Humor Understanding

Researchers introduce Incongruity-Resolution Supervision (IRS), a framework that decomposes humor understanding into three components: incongruity detection, resolution modeling, and preference alignment. The framework is grounded in cognitive theory and is evaluated on the New Yorker Cartoon Caption Contest benchmark.
Mentions: New Yorker Cartoon Caption Contest · IRS · incongruity-resolution theory
Full story: arXiv cs.CL (arxiv.org)