Analysis: A Google scientist has demonstrated that OpenAI's GPT-4 large language model (LLM), despite its widely cited capacity to err, can help smash at least some safeguards put around other machine learning models – a capability that demonstrates the value of chatbots as research assistants.

In a paper titled "A LLM Assisted Exploitation of AI-Guardian," Nicholas Carlini, a research scientist at Google DeepMind, explores how AI-Guardian, a defense against adversarial attacks on models, can be undone by directing the GPT-4 chatbot to devise an attack method and to author text explaining how the attack works. Carlini's paper includes Python code suggested by GPT-4 for defeating AI-Guardian's efforts to block adversarial attacks.

AI-Guardian is designed to detect when images have likely been manipulated to trick a classifier, and GPT-4 was tasked with evading that detection. Specifically, GPT-4 emits scripts (and explanations) for tweaking images to fool a classifier – for example, making it think a photo of someone holding a gun is a photo of someone holding a harmless apple – without triggering AI-Guardian's suspicions. A sketch of that kind of script appears below.

Zhang said he and his co-authors worked with Carlini, providing him with their defense model and source code, and later helped verify the attack results and discussed possible defenses in the interest of helping the security community.

"[The] approach starts by recovering the mask of the patch-based trigger, which definitely is possible and smart since the 'key' space of the mask is limited, thus suffering from a simple brute force attack," Zhang said. "That is where the approach begins to break our provided prototype in the paper."
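To make the idea concrete, here is a minimal, hypothetical sketch of the sort of image-tweaking script an LLM might emit: a targeted fast-gradient-sign (FGSM) perturbation in PyTorch. The stand-in model, input image, target label, and epsilon budget are all illustrative assumptions, not the classifier or the exact code from Carlini's paper.

```python
import torch
import torch.nn as nn

# Stand-in classifier and input; the real attack targeted AI-Guardian's model.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
model.eval()

image = torch.rand(1, 3, 32, 32, requires_grad=True)  # placeholder image
target = torch.tensor([7])    # class we want the model to report
epsilon = 8 / 255             # perturbation budget (assumed)

# Targeted FGSM: step *against* the gradient of the loss toward the target
# class, nudging the prediction that way while keeping the change small.
loss = nn.functional.cross_entropy(model(image), target)
loss.backward()
adversarial = (image - epsilon * image.grad.sign()).clamp(0, 1).detach()

print("prediction on tweaked image:", model(adversarial).argmax(dim=1).item())
```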
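Zhang's point about the limited "key" space can be illustrated with a toy brute-force loop: for a fixed-size square patch on a small image there are only a few hundred possible placements to try. The image size, patch size, and scoring oracle below are hypothetical placeholders; the paper's actual mask-recovery procedure is more involved.

```python
import itertools
import numpy as np

IMG, PATCH = 32, 4  # hypothetical 32x32 image and 4x4 trigger patch

def candidate_masks():
    """Enumerate every square placement: only (32 - 4 + 1)**2 = 841 keys."""
    for top, left in itertools.product(range(IMG - PATCH + 1), repeat=2):
        mask = np.zeros((IMG, IMG), dtype=bool)
        mask[top:top + PATCH, left:left + PATCH] = True
        yield (top, left), mask

def score(mask):
    """Placeholder oracle. A real attack would measure how strongly patching
    this region changes the defended model's outputs."""
    return float(np.random.rand())

# Exhaustively test every key and keep the best-scoring candidate.
best_key, best_score = max(
    ((key, score(mask)) for key, mask in candidate_masks()),
    key=lambda pair: pair[1],
)
print(f"best candidate mask at {best_key} (score {best_score:.3f})")
```

Even at realistic image sizes the search stays small enough to exhaust, which is why a limited mask space leaves the prototype open to simple brute force.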