This is Swadesh Swain
Incoming PhD Student @ UMD | IIT Roorkee Undergraduate Researcher
BTech Student
Electronics and Communication Engineering
IIT Roorkee, Uttarakhand, India
Hey there! I’m an incoming PhD student in ECE at the University of Maryland, College Park, and currently an undergrad at IIT Roorkee studying Electronics and Communication Engineering. I spend most of my time thinking about how to make AI systems both powerful and safe. My research interests lie at the intersection of Mechanistic Interpretability, AI Safety, and Adversarial Robustness of Vision-Language Models and LLMs.
My first foray into interpretability research led to a paper on improving theoretical guarantees of Integrated Gradients attribution methods, accepted at the NeurIPS 2024 Interpretable AI Workshop. This was followed by an extensive study on adversarial vulnerabilities in VLMs – our reproducibility and enhancement work on Cross-Prompt Attacks was published in TMLR and received the Best Paper Award at MLRC 2025 (presented at Princeton). We further extended this into CroPA++, accepted at the NeurIPS 2025 Reliable ML Workshop, introducing three-fold enhancements that made attacks transferable across images and models.
Currently, I’m collaborating with Dr. Koustuv Sinha at META AI (FAIR) on benchmarking world model understanding and anticipation mechanisms in Video Language Models. I’m also working with Dr. Sanghamitra Dutta at the University of Maryland, College Park on investigating suppressed safety-critical features in LLM reasoning circuits and their causal impact on jailbreaks. At Virginia Tech, I’m developing user-intervenable LLM pipelines with Dr. Nagender Aneja, applying circuit-tracing interpretability methods for real-time model steering.
On the generative AI front, I developed RIGS – a lightweight Riemannian-guided diffusion framework for synthetic signal data generation in collaboration with BOSCH India (under review at IJCAI 2026). During my internship at AuraML, I built text-to-3D scene generation frameworks using Graph Diffusion Models for industrial simulation, contributing directly to their product AuraSim. I’ve also worked on multi-task RL with Diffusion Models at IIIT Hyderabad’s Robotics Research Centre, and on medical AI pipelines at IIT Bombay’s Koita Centre for Digital Health as part of the BharatGen consortium.
Beyond research, I lead the Data Science Group at IIT Roorkee as Joint Secretary, heading the research division. During my tenure, our members have published 15+ papers at venues like NeurIPS, CVPR, and ICLR – most led solely by undergraduate teams. I also serve as a reviewer for TMLR.
When I’m not debugging code or reading papers, you’ll probably find me at campus chai spots discussing the latest ML papers, or exploring Roorkee’s food scene. Feel free to reach out if you want to chat about research, collaborate on projects, or just grab a cup of chai!
news
| Date | News |
|---|---|
| Apr 19, 2026 | Excited to share our new paper “The Encoding-Behavior Dissociation: How Distributed Safety Representations Yield Single-Direction Vulnerabilities in Vision-Language Models” – uncovering how VLM safety is encoded in high-dimensional representations yet gated by a single direction, exposing a structural limitation of current alignment. Submitted to TMLR 2026, currently under review. |
| Mar 31, 2026 | Our new paper “CAP: Counterfactual Activation Potential for Quantifying Suppressed Safety Features in Language Models” has been submitted to COLM 2026! |
| Mar 20, 2026 | Received offers of admission to NYU’s Master’s in CSE from both the Courant and Tandon schools, each with a scholarship of $5,000 per year! |
| Feb 27, 2026 | Received offer of admission into the ECE PhD program at the University of Maryland, College Park! |
| Jan 19, 2026 | Our new paper “Riemannian-Guided Diffusion for Scalable Synthetic Signal Data Generation” has been submitted to IJCAI 2026! |
| Jan 09, 2026 | Received offers of admission to Northeastern University’s MSc programs in AI and CS, with 2 merit scholarship awards! |
| Dec 02, 2025 | I am at NeurIPS 2025! Meet up to talk all things AI Safety or just to hang out! |
| Nov 01, 2025 | Excited to finally start my collaboration with Prof. Sanghamitra Dutta from the University of Maryland, College Park! Our work will focus on investigating reasoning mechanisms inside LLMs that can help guard against jailbreaks. |
| Sep 30, 2025 | My paper “CroPA++: Exposing Vulnerabilities in Vision Language Models and Enhancing Adversarial Transferability of Cross-Prompt Attacks” has been accepted at the NeurIPS 2025 Reliable ML Workshop! |
| Sep 15, 2025 | Excited to start our research with Dr. Koustuv Sinha from META AI (FAIR), on evaluating world model understanding of VideoLMs! |
| Sep 09, 2025 | Fortunate to be accepted by Dr. Nagender Aneja at Virginia Tech to pursue applied interpretability research under his guidance. Our work involves designing user-intervenable reasoning agents for on-the-fly steering of LLMs. |
| Aug 21, 2025 | Attending MLRC at Princeton University! Super excited to give our first oral presentation! |
| Aug 11, 2025 | Our work “Revisiting CroPA: A Reproducibility Study and Enhancements for Cross-Prompt Adversarial Transferability in Vision-Language Models” received the Best Paper Award at the Machine Learning Reproducibility Challenge at Princeton University! |
| Jun 27, 2025 | Revisiting CroPA has also been accepted at the Machine Learning Reproducibility Challenge. |
| Jun 16, 2025 | Our work “Revisiting CroPA: A Reproducibility Study and Enhancements for Cross-Prompt Adversarial Transferability in Vision-Language Models” has been accepted to TMLR! |
| Jun 02, 2025 | I will be joining AuraML as a Research Intern to work on cutting-edge Generative 3D Vision for industrial simulation applications. |
| Apr 15, 2025 | I will be joining the Robotics Research Centre, IIIT Hyderabad, as an undergraduate research intern for the summer. |
| Dec 10, 2024 | Here at NeurIPS to attend my first ever in-person conference! |
| Oct 10, 2024 | My debut paper “Riemann Sum Optimization for Accurate Integrated Gradients Computation” has been accepted to the NeurIPS 2024 Interpretable AI Workshop! |
| Apr 04, 2024 | I will be joining the BharatGen Team at the Indian Institute of Technology (IIT) Bombay as a Machine Learning Research Intern this summer. |
selected publications
- Revisiting CroPA: A Reproducibility Study and Enhancements for Cross-Prompt Adversarial Transferability in Vision-Language Models. Transactions on Machine Learning Research, Jun 2025