Trapdoor Prompts and the Hidden Behaviors of Language Models
A trapdoor prompt is an input designed to trigger a specific output from a
language model, without using any of the words in that output. It’s not a
gues...
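
The word-avoidance condition in that definition can be checked mechanically. Below is a minimal sketch in Python, assuming a "word" means a case-insensitive run of letters, digits, or apostrophes. The names `words` and `avoids_output_words` are hypothetical and do not come from the post. The check covers only the lexical constraint: it says nothing about whether a given model actually produces the target output.

```python
import re


def words(text: str) -> set[str]:
    """Return the set of lowercased word tokens in text.

    Assumption: a word is a run of letters, digits, or apostrophes.
    """
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def avoids_output_words(prompt: str, target_output: str) -> bool:
    """True if the prompt shares no word with the target output."""
    return words(prompt).isdisjoint(words(target_output))


if __name__ == "__main__":
    # The prompt never mentions the word it is meant to elicit.
    print(avoids_output_words("What is the capital of France?", "Paris"))
    # The prompt contains the target word, so it fails the constraint.
    print(avoids_output_words("Please say Paris", "Paris"))
```

A stricter variant might also compare word stems or subword tokens, since a prompt could avoid an exact word while still containing a close variant of it.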
4 days ago
1 comment:
I remember when I got Ratchet & Clank 1 for my 11th birthday. I'm 18 now, and I'm glad I didn't choose Jak and Daxter 1 instead. R&C for life.