Call Me A Jerk: Persuading AI to Comply with Objectionable Requests

Lennart Meincke, Dan Shapiro, Angela Duckworth, Ethan R. Mollick, Lilach Mollick, Robert Cialdini — 2025-07-18 — University of Pennsylvania, The Wharton School, WHU - Otto Beisheim School of Management, Glowforge, Inc — SSRN / The Wharton School Research Paper

Summary

Tests whether seven established persuasion principles can induce GPT-4o mini to comply with objectionable requests (insulting the user and explaining how to synthesize a regulated drug), systematically evaluating their effectiveness as jailbreaks across 28,000 conversations.

Key Result

Prompts employing a persuasion principle more than doubled compliance with objectionable requests: 72.0% compliance versus 33.3% for matched control prompts (p < .001).
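
The "more than doubled" claim can be checked directly from the two reported rates. The sketch below computes the ratio and the percentage-point gap, then runs a rough two-proportion z-test. The even split of the 28,000 conversations into 14,000 treatment and 14,000 control is an assumption made here for illustration, and the z-test is a generic check, not necessarily the analysis the authors used.

```python
import math

# Compliance rates reported in the summary, as fractions.
treatment_rate = 0.720  # prompts using a persuasion principle
control_rate = 0.333    # matched control prompts

# Relative and absolute effect sizes.
ratio = treatment_rate / control_rate
diff_pp = (treatment_rate - control_rate) * 100  # percentage points

# ASSUMPTION (not stated in this summary): the 28,000 conversations
# are split evenly between treatment and control.
n_treatment = 14_000
n_control = 14_000


def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> float:
    """Pooled two-proportion z statistic for H0: p1 == p2."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


z = two_proportion_z(treatment_rate, n_treatment, control_rate, n_control)

print(f"ratio: {ratio:.2f}x")
print(f"difference: {diff_pp:.1f} percentage points")
print(f"z statistic (under the assumed split): {z:.1f}")
```

A ratio above 2 is what "more than doubled" means here, and a |z| above roughly 3.29 corresponds to a two-sided p < .001 under the normal approximation.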

Source