publications

publications by category in reverse chronological order. * denotes equal contribution.

2026

  1. Under Review
    The Encoding-Behavior Dissociation: How Distributed Safety Representations Yield Single-Direction Vulnerabilities in Vision-Language Models
    Swadesh Swain and Sparsh Mittal
    Under Review at TMLR 2026, Apr 2026
  2. Under Review
    CAP: Counterfactual Activation Potential for Quantifying Suppressed Safety Features in Language Models
    Swadesh Swain and Sanghamitra Dutta
    Under Review at COLM 2026, Feb 2026
  3. Under Review
    Riemannian-Guided Diffusion for Scalable Synthetic Signal Data Generation
    Swadesh Swain, Aakash Kumar Singh, and Sparsh Mittal
    Under Review at IJCAI 2026, Jan 2026

2025

  1. CroPA++: Exposing Vulnerabilities in Vision Language Models and Enhancing Adversarial Transferability of Cross-Prompt Attacks
    Atharv Mittal*, Agam Pandey*, Swadesh Swain*, and 2 more authors
    In NeurIPS 2025 Workshop on Reliable ML, Sep 2025
  2. Revisiting CroPA: A Reproducibility Study and Enhancements for Cross-Prompt Adversarial Transferability in Vision-Language Models
    Atharv Mittal*, Agam Pandey*, Swadesh Swain*, and 2 more authors
    Transactions on Machine Learning Research, Jun 2025

2024

  1. Riemann Sum Optimization for Accurate Integrated Gradients Computation
    Shree Singhi* and Swadesh Swain*
    In NeurIPS 2024 Workshop on Interpretable AI: Past, Present and Future, Dec 2024