September 24, 2024
Mitigating Reverse Preference Attacks in Large Language Models through Modality Fusio...
Yoshiki Nishikado, Souta Uemura, Haruto Matsushige, et al.