Publications

* denotes equal contributions

2025

  1. Divide-and-Conquer Attacks on LLM Agents: Orchestrating Multi-Step Jailbreaks in Tool-Enabled Systems
    Xiaofeng Lin*, Yukai Yang*, Daniel Guo, Sahil Arun Nale, Charles Fleming, and Guang Cheng
    Preprint, 2025
    Under Review

2024

  1. Weak-to-Strong Confidence Prediction
    Yukai Yang*, Tracy Zhu*, Marco Morucci, and Tim G. J. Rudner
    In NeurIPS 2024 Workshop on Statistical Foundations of LLMs, 2024
    Workshop Paper