About me
I research methods to detect and understand unexpected phenomena in the behavior of machine learning models and in scientific data.
I am now a post-doc at New York University, working with Prof. Chinmay Hegde. Prior to that, I received my PhD from the Hebrew University, supervised by Prof. Yedid Hoshen.
You can also check out my personal site, which has some nice riddles and more.
News
Three papers I co-authored have been accepted to NeurIPS 2024 (on Tabular Data - Main Track, Data Curation - D & B, and Jailbreaking - D & B Spotlight)
Our workshop Red Teaming GenAI: What Can We Learn from Adversaries will take place at NeurIPS 2024. We are accepting submissions until September 14, 2024!
Hestia, my team with Yuval Lemberg, won 1st place (Defense track) and 3rd place (Attack track) in the Large Language Model Capture-the-Flag Competition at SaTML 2024.
Our work Circumventing concept erasure methods for text-to-image generative models will be presented at ICLR 2024.