About Me

My interdisciplinary approach integrates reinforcement learning, control theory, and optimization, bridging fundamental AI research with critical applications in infrastructure, cybersecurity, and human-AI interaction.

Education: PhD in EECS from UC Berkeley | BEng (honors) from HKUST
Previously: Postdoc in IEOR at UC Berkeley

Selected Publications

For the full list, see my Google Scholar.

2025

Position: AI Safety Must Embrace an Antifragile Perspective
Ming Jin, Hyunin Lee
International Conference on Machine Learning (ICML), Position Paper Track, 2025

TL;DR: This paper promotes an antifragile approach to AI safety, highlighting the need for systems to evolve and enhance their handling of unexpected events beyond static tests, ensuring long-term AI safety.

Abstract: This position paper contends that modern AI research must adopt an antifragile perspective on safety—one in which the system's capacity to handle rare or out-of-distribution (OOD) events adapts and expands over repeated exposures. Conventional static benchmarks and single-shot robustness tests overlook the reality that environments evolve and that models, if left unchallenged, can drift into maladaptation (e.g., reward hacking, over-optimization, or atrophy of broader capabilities). We argue that an antifragile approach—rather than striving to rapidly reduce current uncertainties, the emphasis is on leveraging those uncertainties to better prepare for potentially greater, more unpredictable uncertainties in the future—is pivotal for the long-term reliability of open-ended ML systems. In this position paper, we first identify key limitations of static testing, including scenario diversity, reward hacking, and over-alignment. We then explore the potential of dynamic, antifragile solutions to manage rare events. Crucially, we advocate for a fundamental recalibration of the methods used to measure, benchmark, and continually improve AI safety over the long term, complementing existing robustness approaches by providing ethical and practical guidelines towards fostering an antifragile AI safety community.

Lay Summary: Problem: Current AI safety approaches test systems once and declare them robust, but real-world environments constantly evolve with new threats, such as new attack methods, unexpected user behaviors, and environmental changes that weren't anticipated during development.

Solution: We propose "antifragile" AI safety, inspired by biological immune systems that get stronger after exposure to threats. Instead of hoping our initial safety tests cover everything, we design AI systems that continuously learn from new failures and stress-test themselves in safe environments. When a system encounters an unexpected problem, it doesn't just patch that specific issue—it uses the experience to become more robust against similar future threats.

Impact: This approach could prevent catastrophic AI failures by ensuring systems improve from every new challenge they encounter, rather than becoming brittle over time. Instead of playing an endless game of whack-a-mole with new vulnerabilities, we can build AI that evolves to handle tomorrow's unknown threats. This is crucial as AI systems become more powerful and are deployed in critical areas like healthcare, infrastructure, and finance where unexpected failures could have severe consequences.

Safe and balanced: A framework for constrained multi-objective reinforcement learning
Shangding Gu*, Bilgehan Sel*, Yuhao Ding*, Lu Wang, Qingwei Lin, Alois Knoll, Ming Jin
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

In numerous reinforcement learning (RL) problems involving safety-critical systems, a key challenge lies in balancing multiple objectives while simultaneously meeting all stringent safety constraints. To tackle this issue, we propose a primal-based framework that orchestrates policy optimization between multi-objective learning and constraint adherence. Our method employs a novel natural policy gradient manipulation technique to optimize multiple RL objectives and overcome conflicting gradients, since a simple weighted-average gradient direction can harm specific objectives when their gradients are misaligned. When a hard constraint is violated, our algorithm steps in to rectify the policy so as to minimize the violation. In particular, we establish theoretical convergence and constraint violation guarantees, and our proposed method also outperforms prior state-of-the-art methods on challenging safe multi-objective RL tasks.
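
A minimal sketch of the gradient-manipulation idea, in the spirit of projection-based conflict resolution (this illustrates the general principle, not the paper's exact natural-policy-gradient update):

```python
import numpy as np

def resolve_conflict(g1, g2):
    """If two objective gradients conflict (negative inner product), project
    g1 onto the normal plane of g2 before averaging. Illustrative only."""
    if g1 @ g2 < 0:
        g1 = g1 - (g1 @ g2) / (g2 @ g2) * g2  # drop the component that opposes g2
    return 0.5 * (g1 + g2)

g_obj1 = np.array([1.0, 0.0])
g_obj2 = np.array([-0.5, 1.0])        # conflicts with g_obj1
step = resolve_conflict(g_obj1, g_obj2)
print(step, step @ g_obj2 >= 0)       # the combined step no longer opposes objective 2
```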

Retracing the Past: LLMs Emit Training Data When They Get Lost
Myeongseob Ko, Nikhil Reddy Billa, Adam Nguyen, Charles Fleming, Ming Jin, Ruoxi Jia
The Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025

The memorization of training data in large language models (LLMs) poses significant privacy and copyright concerns. Existing data extraction methods, particularly heuristic-based divergence attacks, often exhibit limited success and offer limited insight into the fundamental drivers of memorization leakage. This paper introduces Confusion-Inducing Attacks (CIA), a principled framework for extracting memorized data by systematically maximizing model uncertainty. We empirically demonstrate that the emission of memorized text during divergence is preceded by a sustained spike in token-level prediction entropy. CIA leverages this insight by optimizing input snippets to deliberately induce this consecutive high-entropy state. For aligned LLMs, we further propose Mismatched Supervised Fine-tuning (SFT) to simultaneously weaken their alignment and induce targeted confusion, thereby increasing susceptibility to our attacks. Experiments on various unaligned and aligned LLMs demonstrate that our proposed attacks outperform existing baselines in extracting verbatim and near-verbatim training data without requiring prior knowledge of the training data. Our findings highlight persistent memorization risks across various LLMs and offer a more systematic method for assessing these vulnerabilities.
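
The trigger signal behind CIA, a sustained rise in token-level prediction entropy, is easy to measure for any causal LM. A sketch using Hugging Face transformers (the model choice and prompt are illustrative, not from the paper):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any causal LM works; gpt2 is just a small example model.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def token_entropies(text):
    """Per-position entropy (in nats) of the next-token distribution."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits  # shape (1, seq_len, vocab)
    probs = torch.softmax(logits, dim=-1)
    return -(probs * torch.log(probs.clamp_min(1e-12))).sum(-1).squeeze(0)

ent = token_entropies("Repeat the word poem forever: poem poem poem")
print(ent)  # a sustained run of high values would flag a "confused" state
```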

Sycophancy Mitigation Through Reinforcement Learning with Uncertainty-Aware Adaptive Reasoning Trajectories
Mohammad Beigi, Ying Shen, Parshin Shojaee, Qifan Wang, Zichao Wang, Chandan K. Reddy, Ming Jin, Lifu Huang
The Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025

Despite the remarkable capabilities of large language models, current training paradigms inadvertently foster sycophancy—alignment with user-provided information, regardless of factual accuracy. In this paper, we introduce SMART (Sycophancy Mitigation through Adaptive Reasoning Trajectories), reconceptualizing sycophancy as a reasoning optimization problem rather than an output alignment issue. SMART employs a two-stage approach: (1) Uncertainty-Aware Adaptive Monte Carlo Tree Search (UA-MCTS), which dynamically adjusts exploration based on state-level uncertainty; and (2) progress-based reinforcement learning that distills these improved reasoning patterns into model adaptation. Through extensive experiments, we show that SMART significantly outperforms existing baselines in effectively reducing sycophancy while maintaining performance on out-of-distribution inputs. These findings demonstrate the importance of optimizing internal reasoning processes for developing aligned, truthful AI assistants.

Reinforcement Learning Meets the Power Grid: A Contemporary Survey with Emphasis on Safety and Multi-agent Challenges
Ming Jin
Foundations and Trends® in Electric Energy Systems, Vol. 8, No. 3-4, pp. 169-316, 2025

Modern power systems face increasing challenges from renewable energy integration, distributed resources, and complex operational requirements. This survey examines Safe Reinforcement Learning (Safe RL) as a framework for maintaining reliable power system operation while optimizing performance. We review both model-free and model-based approaches, analyzing how different safety constraints and architectures can be implemented in practice. The survey explores multi-agent frameworks for coordinated control in distributed settings and examines runtime assurance methods that provide formal safety guarantees. Applications span various timescales, from frequency regulation to demand management, with different safety requirements and operational contexts. Through analysis of current simulation environments and practical implementations, we identify remaining challenges in scaling safe RL to large power systems, handling uncertainty, and integration with existing infrastructure. https://doi.org/10.1561/3100000043

From Capabilities to Performance: Evaluating Key Functional Properties of LLM Architectures in Penetration Testing
Lanxiao Huang, Daksh Dave, Tyler Cody, Peter A. Beling, Ming Jin
The Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025

Large Language Models (LLMs) have been explored for automating or enhancing penetration testing tasks, but their effectiveness and reliability across diverse attack phases remain open questions. This study presents a comprehensive evaluation of multiple LLM-based agents—from singular to modular—across realistic penetration testing scenarios, analyzing their empirical performance and recurring failure patterns. We further investigate the impact of core functional capabilities on agent success, operationalized through five targeted augmentations: Global Context Memory (GCM), Inter-Agent Messaging (IAM), Context-Conditioned Invocation (CCI), Adaptive Planning (AP), and Real-Time Monitoring (RTM). These interventions respectively support the capabilities of Context Coherence & Retention, Inter-Component Coordination & State Management, Tool Usage Accuracy & Selective Execution, Multi-Step Strategic Planning & Error Detection & Recovery, and Real-Time Dynamic Responsiveness. Our findings reveal that while some architectures natively exhibit select properties, targeted augmentations significantly enhance modular agent performance—particularly in complex, multi-step, and real-time penetration testing scenarios.

DiPT: Enhancing LLM reasoning through diversified perspective-taking
Hoang Anh Just, Mahavir Dabas, Lifu Huang, Ming Jin, Ruoxi Jia
NAACL Findings, 2025

Existing work on improving language model reasoning typically explores a single solution path, which can be prone to errors. Inspired by perspective-taking in social studies, this paper introduces DiPT, a novel approach that complements current reasoning methods by explicitly incorporating diversified viewpoints. This approach allows the model to gain a deeper understanding of the problem's context and identify the most effective solution path during the inference stage. Additionally, it provides a general data-centric AI recipe for augmenting existing data to improve their quality for fine-tuning. Our empirical results demonstrate that DiPT can be flexibly integrated into existing methods that focus on a single reasoning approach, enhancing their reasoning performance and stability when presented with paraphrased problems. Furthermore, we illustrate improved context understanding by maintaining the model's safe outputs against "jailbreaking" prompts intentionally designed to bypass safeguards built into deployed models. Lastly, we show that fine-tuning with data enriched with diverse perspectives can boost the reasoning capabilities of the model compared to fine-tuning with raw data alone.

A Hypothesis on Black Swan in Unchanging Environments
Hyunin Lee, Chanwoo Park, David Abel, Ming Jin
International Conference on Learning Representations (ICLR), 2025

Black swan events are statistically rare occurrences that carry extremely high risks. The typical view assumes that black swan events originate from unpredictable, time-varying environments; moreover, the community lacks a comprehensive definition of black swan events. This paper argues that the standard view is incomplete and claims that high-risk, statistically rare events can also occur in unchanging environments due to human misperception of their value and likelihood, which we call spatial black swan events. We first carefully categorize black swan events, focusing on spatial black swans, and mathematically formalize the definition of black swan events. We hope these definitions can pave the way for the development of algorithms that prevent such events by rationally correcting human perception.

Distributed Optimization and Distributed Learning: A Paradigm Shift for Power Systems
Ahmad Al-Tawaha*, Elson Cibaku*, SangWoo Park, Javad Lavaei, Ming Jin
IEEE Systems Journal, 2025

This survey provides a comprehensive overview of recent advances in distributed optimization and machine learning for power systems, particularly focusing on optimal power flow (OPF) problems. We cover distributed algorithms for convex relaxations and nonconvex optimization, highlighting key algorithmic ingredients and practical considerations for their implementation. Furthermore, we explore the emerging field of distributed machine learning, including deep learning and (multi-agent) reinforcement learning, and their applications in areas such as OPF and voltage control. We investigate the synergy between optimization and learning, particularly in the context of learning-assisted distributed optimization, and provide the first comprehensive survey of distributed real-time OPF, addressing time-varying conditions and constraint handling. Throughout the survey, we emphasize practical considerations such as data efficiency, scalability, and safety, aiming to guide researchers and practitioners in developing and deploying effective solutions for a more efficient and resilient power grid.

Defense against Joint Poison and Evasion Attacks: A Case Study of DERMS
Zain ul Abdeen, Padmaksha Roy, Ahmad Al-Tawaha, Ruoxi Jia, Laura Freeman, Peter Beling, Chen-Ching Liu, Alberto Sangiovanni-Vincentelli, Ming Jin
The AAAI-25 Workshop on Artificial Intelligence for Cyber Security (AICS), 2025 Oral

There is an upward trend of deploying distributed energy resource management systems (DERMS) to control modern power grids. However, DERMS controller communication lines are vulnerable to cyberattacks that could potentially impact operational reliability. While a data-driven intrusion detection system (IDS) can potentially thwart attacks during deployment, also known as the evasion attack, the training of the detection algorithm may be corrupted by adversarial data injected into the database, also known as the poisoning attack. In this paper, we propose the first framework of IDS that is robust against joint poisoning and evasion attacks. We formulate the defense mechanism as a bilevel optimization, where the inner and outer levels deal with attacks that occur during training time and testing time, respectively. We verify the robustness of our method on the IEEE-13 bus feeder model against a diverse set of poisoning and evasion attack scenarios. The results indicate that our proposed method outperforms the baseline technique in terms of accuracy, precision, and recall for intrusion detection.

LLMs Tackle Meta-Analysis: Automating Scientific Hypothesis Generation with Statistical Rigor
Tung-Wei Lin, Runing Yang, Zain ul Abdeen, Alberto Sangiovanni-Vincentelli, Haibo Huang, Ming Jin
2nd AI4Research Workshop: Towards a Knowledge-grounded Scientific Research Lifecycle, 2025

We propose the use of Large Language Models (LLMs) for generating statistically supported hypotheses from scientific literature. We present a two-stage framework that effectively leverages LLMs’ capacity to analyze vast literature and extract pertinent information to formulate evidence-based hypotheses. Our method comprises two phases: 1) data extraction via decomposed zero-shot prompting, and 2) hypothesis generation by auto-formulating and solving an optimization problem. We demonstrate this framework in agricultural science, where field data is particularly limited.

Monte Carlo Grid Dynamic Programming: Almost Sure Convergence and Probability Constraints
Mohammad S Ramadan, Ahmad Al-Tawaha, Mohamed Shouman, Ahmed Atallah, Ming Jin
American Control Conference (ACC), 2025

Dynamic Programming (DP) suffers from the well-known "curse of dimensionality", further exacerbated by expectations in stochastic systems. This paper presents a Monte Carlo-based sampling approach over the state and input spaces and an interpolation procedure for the resulting value function in a "self-approximating" fashion, eliminating the need for ordering or set-membership tests. We provide a proof of almost sure convergence for the value iteration (and consequently, policy iteration) procedure. The proposed sampling and self-approximating algorithm alleviates the burden of gridding and interpolation traditionally required in DP. Moreover, we demonstrate that the proposed interpolation procedure is well-suited for handling probabilistic constraints by sampling both infeasible and feasible regions. The curse of dimensionality cannot be avoided; however, this approach offers a convenient framework for addressing it.
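
A toy sketch of the self-approximating scheme on a scalar stochastic system: value iteration over randomly sampled states with nearest-neighbor evaluation of the value function (dynamics, cost, and sample sizes are illustrative assumptions; the probabilistic-constraint handling is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.uniform(-2, 2, size=100)     # Monte Carlo samples of the state space
A = np.linspace(-1, 1, 11)           # sampled inputs
V = np.zeros_like(S)
gamma = 0.9

def interp(V, S, x):
    # "self-approximating" evaluation: nearest sampled state, no grid or ordering
    return V[np.argmin(np.abs(S - x[:, None]), axis=1)]

for _ in range(30):                  # value iteration on the samples
    V_new = np.empty_like(V)
    for i, s in enumerate(S):
        w = rng.normal(0, 0.1, size=(len(A), 10))   # Monte Carlo noise draws
        s_next = 0.9 * s + A[:, None] + w           # toy linear dynamics
        cost = s ** 2 + 0.1 * A ** 2
        Q = cost + gamma * interp(V, S, s_next.ravel()).reshape(len(A), -1).mean(axis=1)
        V_new[i] = Q.min()
    V = V_new
print(V[:5])                         # value estimates at the first few sampled states
```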

An Analytical Approach to Signal Denoising Based on Singular Value Decomposition (I)
Ahmad Al-Tawaha, Ahmad Alshorman, Ming Jin, Mohammad Al Janaideh, Khaled Aljanaideh
American Control Conference (ACC), 2025

Signal denoising is a fundamental task in signal processing that aims to extract the true underlying signal from noisy observations. Existing signal denoising methods, such as wavelet transform and Fourier-based filtering, suffer from low computational efficiency and potential loss of important signal components during reduction. Moreover, determining the optimal threshold for singular value selection remains a challenge in traditional techniques that are based on singular value decomposition. In this paper, we introduce an efficient, non-iterative algorithm for signal denoising that leverages two noisy observations. By constructing Hankel matrices from these observations, the proposed method establishes a threshold using the largest singular value of their difference, effectively separating true signal components from noise without the need for iterative optimization. We validate the approach on synthetic data and real-world measurements, including smartphone sensor readings and displacement data from a micro-positioning system with a piezoelectric actuator.
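
A compact numpy rendering of the described pipeline on a synthetic signal (the exact construction in the paper may differ in details such as matrix dimensions and averaging):

```python
import numpy as np

def hankel(x, L):
    # H[i, j] = x[i + j], shape (L, N - L + 1)
    return np.lib.stride_tricks.sliding_window_view(x, L).T

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 400)
clean = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)
y1 = clean + 0.3 * rng.standard_normal(400)   # two noisy observations
y2 = clean + 0.3 * rng.standard_normal(400)

L = 100
H1, H2 = hankel(y1, L), hankel(y2, L)
tau = np.linalg.norm(H1 - H2, 2)              # largest singular value of the difference
U, s, Vt = np.linalg.svd(0.5 * (H1 + H2), full_matrices=False)
k = int((s > tau).sum())                      # keep components above the noise threshold
H_hat = (U[:, :k] * s[:k]) @ Vt[:k]

# invert the Hankel map by averaging anti-diagonals
rec, den = np.zeros(400), np.zeros(400)
for i in range(H_hat.shape[0]):
    for j in range(H_hat.shape[1]):
        rec[i + j] += H_hat[i, j]
        den[i + j] += 1
rec /= den
print(np.linalg.norm(rec - clean) / np.linalg.norm(clean))  # relative error
```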

Data-centric human preference optimization with rationales
Hoang Anh Just, Ming Jin, Anit Sahu, Huy Phan, Ruoxi Jia
Conference on Language Modeling (COLM), 2025

Reinforcement learning from human feedback plays a crucial role in aligning language models towards human preferences, traditionally represented through comparisons between pairs or sets of responses within a given context. While many studies have enhanced algorithmic techniques to optimize learning from such data, this work shifts focus to improving preference learning through a data-centric approach. Specifically, we propose enriching existing preference datasets with machine-generated rationales that explain the reasons behind choices. We develop a simple and principled framework to augment current preference learning methods with rationale information. Our comprehensive analysis highlights how rationales enhance learning efficiency. Extensive experiments reveal that rationale-enriched preference learning offers multiple advantages: it improves data efficiency, accelerates convergence to higher-performing models, and reduces verbosity bias and hallucination. Furthermore, this framework is versatile enough to integrate with various preference optimization algorithms. Overall, our findings highlight the potential of re-imagining data design for preference learning, demonstrating that even freely available machine-generated rationales can significantly boost performance across multiple dimensions. The code repository is available at https://github.com/reds-lab/preference-learning-with-rationales

A Dynamic Penalization Framework for Online Rank-1 Semidefinite Programming Relaxations
Ahmad Al-Tawaha, Javad Lavaei, Ming Jin
7th Annual Learning for Dynamics & Control Conference (L4DC), 2025

We propose a unified framework for solving sequences of semidefinite programs (SDPs) with rank-one constraints—critical in domains such as combinatorial optimization and power systems. Enforcing rank−1 feasibility in SDP relaxations is especially challenging in dynamic environments where problem parameters evolve across time. To address this, our method operates on two complementary levels. At the per-task level, we introduce a two-phase optimization scheme: the decision phase solves a penalized SDP using a task-specific penalty matrix to approximate a rank−1 solution, while the learning phase updates this penalty matrix via subgradient-based feedback to progressively enforce rank−1 feasibility. At the meta level, we introduce a meta-learning framework that accelerates optimization across tasks by predicting effective initializations for the penalty matrix in each new task. The meta-model leverages historical solutions to produce informed penalty initializations, reducing the number of inner iterations needed per task. We provide theoretical guarantees showing that the task-averaged regret decreases with the number of tasks, with rates that improve under higher task similarity. Empirical results in real-world rank-constrained applications, including the Max-Cut problem and Optimal Power Flow (OPF), demonstrate that our method consistently recovers rank−1 solutions.
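
A hedged cvxpy sketch of the per-task two-phase loop on a Max-Cut-style SDP; the penalty update shown is a plain illustrative subgradient-flavored step, not the paper's exact rule, and the meta-level initialization is omitted:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n = 8
W = rng.uniform(0, 1, (n, n)); W = (W + W.T) / 2   # toy graph weights
C = -(np.diag(W.sum(1)) - W) / 4                   # Max-Cut SDP objective matrix
P = np.zeros((n, n))                               # penalty matrix (meta-initializable)

for it in range(10):
    # decision phase: solve the penalized SDP relaxation
    X = cp.Variable((n, n), PSD=True)
    prob = cp.Problem(cp.Minimize(cp.trace(C @ X) + cp.trace(P @ X)),
                      [cp.diag(X) == 1])
    prob.solve()
    Xv = X.value
    w, V = np.linalg.eigh(Xv)                      # eigenvalues ascending
    if w[:-1].sum() < 1e-6:                        # (near) rank-1: done
        break
    # learning phase: penalize mass outside the top eigendirection (illustrative)
    v = V[:, -1:]
    P += 0.5 * (Xv - w[-1] * v @ v.T)

print("residual rank mass:", w[:-1].sum())
```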


IP-FL: Incentive-driven Personalization in Federated Learning
A. Khan, X. Wang, Q. Le, Z. ul Abdeen, A. A. Khan, M. Jin, J. Ding, A. Butt, A. Anwar
39th IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2025

Federated Learning (FL) is an approach for privacy-preserving Machine Learning (ML), enabling model training across multiple clients without centralized data collection. Existing incentive solutions for traditional FL focus on individual contributions to a single global objective, neglecting the nuances of clustered personalization with multiple cluster-level models and non-monetary incentives such as personalized model appeal. In this paper, we first propose to treat incentivization and personalization as interrelated challenges and solve them with an incentive mechanism that fosters personalized learning. Additionally, current methods depend on an aggregator for client clustering, which is limited by a lack of access to clients' confidential information due to privacy constraints, leading to inaccurate clustering. To overcome this, we propose direct client involvement, allowing clients to indicate their cluster membership preferences based on data distribution and incentive-driven feedback. Our approach enhances the personalized model appeal for self-aware clients with high-quality data, leading to their active and consistent participation. Our evaluation demonstrates significant improvements in test accuracy (8–45%), personalized model appeal (3–38%), and participation rates (31–100%) over existing FL models, including those addressing data heterogeneity and personalization.

LLMs Can Plan Only If We Tell Them
Bilgehan Sel, Ruoxi Jia, Ming Jin
International Conference on Learning Representations (ICLR), 2025

Large language models (LLMs) have demonstrated significant capabilities in natural language processing and reasoning, yet their effectiveness in autonomous planning has been under debate. While existing studies have utilized LLMs with external feedback mechanisms or in controlled environments for planning, these approaches often involve substantial computational and development resources due to the requirement for careful design and iterative backprompting. Moreover, even the most advanced LLMs, such as GPT-4, struggle to match human performance on standard planning benchmarks like Blocksworld without additional support. This paper investigates whether LLMs can independently generate long-horizon plans that rival human baselines. Our novel enhancements to Algorithm-of-Thoughts (AoT), which we dub AoT+, achieve state-of-the-art results on planning benchmarks, outperforming prior methods and human baselines, all fully autonomously.

Robust Gymnasium: A Unified Modular Benchmark for Robust Reinforcement Learning
Shangding Gu, Laixi Shi, Muning Wen, Ming Jin, Eric Mazumdar, Yuejie Chi, Adam Wierman, Costas Spanos
International Conference on Learning Representations (ICLR), 2025

Driven by inherent uncertainty and the sim-to-real gap, robust reinforcement learning (RL) seeks to improve resilience against the complexity and variability in agent-environment sequential interactions. Despite the existence of a large number of RL benchmarks, there is a lack of standardized benchmarks for robust RL. Current robust RL policies often focus on a specific type of uncertainty and are evaluated in distinct, one-off environments. In this work, we introduce Robust-Gymnasium, a unified modular benchmark designed for robust RL that supports a wide variety of disruptions across all key RL components: agents' observed state and reward, agents' actions, and the environment. Offering over sixty diverse task environments spanning control and robotics, safe RL, and multi-agent RL, it provides an open-source and user-friendly tool for the community to assess current methods and foster the development of robust RL algorithms. In addition, we benchmark existing standard and robust RL algorithms within this framework, uncovering significant deficiencies in each and offering new insights.
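
The modular-disruption idea can be pictured as environment wrappers; a sketch against the standard Gymnasium API (this is not Robust-Gymnasium's actual interface):

```python
import gymnasium as gym
import numpy as np

class NoisyObservation(gym.ObservationWrapper):
    """Perturb the agent's observed state, one of the disruption channels
    (observation / action / reward / environment) such a benchmark targets."""
    def __init__(self, env, sigma=0.05):
        super().__init__(env)
        self.sigma = sigma

    def observation(self, obs):
        return obs + np.random.normal(0, self.sigma, size=obs.shape)

env = NoisyObservation(gym.make("Pendulum-v1"), sigma=0.1)
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
```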

2024

Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs
Bilgehan Sel, Priya Shanmugasundaram, Mohammad Kachuee, Kun Zhou, Ruoxi Jia, Ming Jin
The 62nd Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Large Language Models excel on many tasks but struggle with moral reasoning and ethical decision-making, particularly when multiple stakeholders are involved. We introduce SKIG, a framework that simulates accountability alongside empathy and risk assessment to improve decision-making. Across moral reasoning benchmarks using proprietary and open-source LLMs, SKIG yields stronger, more aligned decisions. Ablations highlight the importance of its core components for multi-stakeholder alignment.

Pausing Policy Learning in Non-stationary Reinforcement Learning
Hyunin Lee, Ming Jin, Javad Lavaei, Somayeh Sojoudi
International Conference on Machine Learning (ICML), 2024 Oral

Real-time inference is a challenge in real-world reinforcement learning due to temporal differences in time-varying environments: the system collects data from the past, updates the decision model in the present, and deploys it in the future. We challenge the common belief that continually updating the decision is optimal for minimizing this temporal gap. We propose a forecasting online reinforcement learning framework and show that strategically pausing decision updates yields better overall performance by effectively managing aleatoric uncertainty. Theoretically, we compute an optimal ratio between policy update and hold durations, and show that a non-zero policy hold duration provides a sharper upper bound on the dynamic regret. Our experimental evaluations on three different environments also reveal that a non-zero policy hold duration yields higher rewards compared to continuous decision updates.

Boosting Alignment for Post-Unlearning Text-to-Image Generative Models
Myeongseob Ko, Henry Li, Zhun Wang, Jonathan Patsenker, Jiachen T. Wang, Qinbin Li, Ming Jin, Dawn Song, Ruoxi Jia
Conference on Neural Information Processing Systems (NeurIPS), 2024

Large-scale generative models have shown impressive image-generation capabilities, propelled by massive data. However, this often inadvertently leads to the generation of harmful or inappropriate content and raises copyright concerns. Driven by these concerns, machine unlearning has become crucial to effectively purge undesirable knowledge from models. While existing literature has studied various unlearning techniques, these often suffer from either poor unlearning quality or degradation in text-image alignment after unlearning, due to the competitive nature of these objectives. To address these challenges, we propose a framework that seeks an optimal model update at each unlearning iteration, ensuring monotonic improvement on both objectives. We further derive the characterization of such an update. In addition, we design procedures to strategically diversify the unlearning and remaining datasets to boost performance improvement. Our evaluation demonstrates that our method effectively removes target classes from recent diffusion-based generative models and concepts from stable diffusion models while maintaining close alignment with the models' original trained states, thus outperforming state-of-the-art baselines.

Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation
Shangding Gu, Laixi Shi, Yuhao Ding, Alois Knoll, Costas Spanos, Adam Wierman, Ming Jin
Conference on Neural Information Processing Systems (NeurIPS), 2024

Safe reinforcement learning (RL) is crucial for deploying RL agents in real-world applications, as it aims to maximize long-term rewards while satisfying safety constraints. However, safe RL often suffers from sample inefficiency, requiring extensive interactions with the environment to learn a safe policy. We propose Efficient Safe Policy Optimization (ESPO), a novel approach that enhances the efficiency of safe RL through sample manipulation. ESPO employs an optimization framework with three modes: maximizing rewards, minimizing costs, and balancing the trade-off between the two. By dynamically adjusting the sampling process based on the observed conflict between reward and safety gradients, ESPO theoretically guarantees convergence, optimization stability, and improved sample complexity bounds. Experiments on the Safety-MuJoCo and Omnisafe benchmarks demonstrate that ESPO significantly outperforms existing primal-based and primal-dual-based baselines in terms of reward maximization and constraint satisfaction. Moreover, ESPO achieves substantial gains in sample efficiency, requiring 25-29% fewer samples than baselines, and reduces training time by 21-38%.

Fairness-Aware Meta-Learning via Nash Bargaining
Yi Zeng, Xuelin Yang, Li Chen, Cristian Canton Ferrer, Ming Jin, Michael I Jordan, Ruoxi Jia
Conference on Neural Information Processing Systems (NeurIPS), 2024

To address issues of group-level fairness in machine learning, it is natural to adjust model parameters based on specific fairness objectives over a sensitive-attributed validation set. Such an adjustment procedure can be cast within a meta-learning framework. However, naive integration of fairness goals via meta-learning can cause hypergradient conflicts for subgroups, resulting in unstable convergence and compromising model performance and fairness. To navigate this issue, we frame the resolution of hypergradient conflicts as a multi-player cooperative bargaining game. We introduce a two-stage meta-learning framework in which the first stage involves the use of a Nash Bargaining Solution (NBS) to resolve hypergradient conflicts and steer the model toward the Pareto front, and the second stage optimizes with respect to specific fairness goals. Our method is supported by theoretical results, notably a proof of the NBS for gradient aggregation free from linear independence assumptions, a proof of Pareto improvement, and a proof of monotonic improvement in validation loss. We also show empirical effects across various fairness objectives in six key fairness datasets and two image classification tasks.

Can We Trust the Performance Evaluation of Uncertainty Estimation Methods in Text Summarization?
Jianfeng He, Runing Yang, Linlin Yu, Changbin Li, Ruoxi Jia, Feng Chen, Ming Jin, Chang-Tien Lu
Empirical Methods in Natural Language Processing (EMNLP) Main Conference, 2024

Text summarization, a key natural language generation (NLG) task, is vital in various domains. However, the high cost of inaccurate summaries in risk-critical applications, particularly those involving human-in-the-loop decision-making, raises concerns about the reliability of uncertainty estimation on text summarization (UE-TS) evaluation methods. This concern stems from the dependency of uncertainty model metrics on diverse and potentially conflicting NLG metrics. To address this issue, we introduce a comprehensive UE-TS benchmark incorporating 31 NLG metrics across four dimensions. The benchmark evaluates the uncertainty estimation capabilities of two large language models and one pre-trained language model on three datasets, with human-annotation analysis incorporated where applicable. We also assess the performance of 14 common uncertainty estimation methods within this benchmark. Our findings emphasize the importance of considering multiple uncorrelated NLG metrics and diverse uncertainty estimation methods to ensure reliable and efficient evaluation of UE-TS techniques. Our code and data are available at https://github.com/he159ok/Benchmark-of-Uncertainty-Estimation-Methods-in-Text-Summarization.

Optimization Solution Functions as Deterministic Policies for Offline Reinforcement Learning
Vanshaj Khattar, Ming Jin
American Control Conference (ACC), 2024

Offline reinforcement learning (RL) faces challenges such as limited data coverage and value overestimation. We propose an implicit actor-critic (iAC) framework that uses optimization solution functions as a deterministic policy (actor) and a monotone function over the optimal value as a critic. By encoding optimality in the actor policy, the learned policies become robust to suboptimality of actor parameters through an exponentially decaying sensitivity (EDS) property. We provide performance guarantees and validate the framework on two real-world applications, showing notable gains over state-of-the-art offline RL methods.

The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes
Myeongseob Ko, Feiyang Kang, Weiyan Shi, Ming Jin, Zhou Yu, Ruoxi Jia
Conference on Computer Vision and Pattern Recognition (CVPR), 2024

Large-scale black-box models are ubiquitous, but understanding how individual training points influence predictions remains computationally challenging. We introduce the Mirrored Influence Hypothesis, revealing a reciprocity between training and test data influence that reformulates the estimation task. Building on this, we estimate training data influence by computing gradients only for selected test samples and using forward passes for each training point—yielding substantial efficiency gains. We demonstrate broad applicability, including data attribution for diffusion models, leakage detection, memorization analysis, mislabeled data detection, and tracing behavior in LMs.

Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models
Bilgehan Sel, Ahmad Al-Tawaha, Vanshaj Khattar, Lu Wang, Ruoxi Jia, Ming Jin
International Conference on Machine Learning (ICML), 2024

Many approaches beyond Chain-of-Thought (CoT) rely on external control of the generation process, incurring large query counts and compute. Algorithm of Thoughts (AoT) instead guides LLMs along algorithmic reasoning paths using fully in-context algorithmic exemplars, enabling broader idea exploration with one or few queries. AoT outperforms prior single-query methods and competitive multi-query tree-search strategies with fewer tokens, suggesting that instructing an LLM with an algorithm can even surpass the algorithm itself. Code and materials: https://algorithm-of-thoughts.github.io.
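
The essence of AoT is a single prompt that carries a worked algorithmic search trace as an in-context exemplar; an illustrative miniature (the paper's exemplars are considerably more elaborate):

```python
# An AoT-style prompt embeds a full algorithmic search trace (here, DFS with
# backtracking on a Game-of-24-like task) so the model continues in kind.
# Illustrative only; not the paper's actual prompt text.
prompt = """Solve 24 using 8 6 4 4 (each number once).
Try 8*6=48, left 48 4 4 -> 48-4-4=40, no. Backtrack.
Try 8-4=4, left 4 6 4 -> 4*6=24? left 24 4 -> 24*4 too big. Backtrack.
Try 6-4=2, left 8 2 4 -> 8*2=16, 16+4=20, no. Try 8+4=12, 12*2=24. Found!
Answer: (8 + 4) * (6 - 4) = 24

Solve 24 using 5 5 5 1 (each number once).
"""
# response = llm.generate(prompt)  # one query explores many branches in-context
```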

Zero-day Attack Detection in Digital Substations using In-Context Learning
Faizan Manzoor, Vanshaj Khattar, Chen-Ching Liu, Ming Jin
IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), 2024

We address the challenge of detecting zero-day attacks in digital substations that use the IEC-61850 protocol. While prior heuristic and ML-based methods struggle to generalize to unknown attacks, we leverage the in-context learning ability of transformer models to adapt from a few examples without retraining. On the IEC-61850 dataset, our method achieves >87% detection accuracy on zero-day attacks where existing baselines fail, showing promise for securing modern power systems.

Data-centric defense: Shaping loss landscape with augmentations to counter model inversion
Si Chen, Nikhil Abhyankar, Feiyang Kang, Ming Jin, Ruoxi Jia
Transactions on Machine Learning Research (TMLR), 2024

Machine Learning models have shown susceptibility to various privacy attacks, with model inversion (MI) attacks posing a significant threat. Current defense techniques are mostly model-centric, involving modifying model training or inference. However, these approaches require model trainers’ cooperation, are computationally expensive, and often result in a significant privacy-utility tradeoff. To address these limitations, we propose a novel data-centric approach to mitigate MI attacks. We introduce privacy-focused data augmentations that shape the resulting model’s loss landscape, making it challenging for attackers to generate private target samples. We provide theoretical analysis explaining why such augmentations can reduce MI risk and demonstrate effectiveness and robustness across models and datasets. On face recognition benchmarks, we reduce reconstruction success rates to ≤ 5% with only ~2% accuracy drop, surpassing model-centric defenses.

TeaMs-RL: Teaching LLMs to Teach Themselves Better Instructions via Reinforcement Learning
Shangding Gu, Alois Knoll, Ming Jin
Transactions on Machine Learning Research (TMLR), 2024

The development of Large Language Models (LLMs) often confronts challenges stemming from the heavy reliance on human annotators in reinforcement learning with human feedback (RLHF), or the frequent and costly external queries tied to self-instruct. We pivot to Reinforcement Learning (RL) with a twist: instead of refining LLMs post instruction training, we use RL to directly generate the foundational instruction dataset that alone suffices for fine-tuning. TeaMs-RL uses textual operations and rules to diversify training data, enabling high-quality data generation without heavy reliance on external advanced models—paving the way for a single fine-tuning step and negating subsequent RLHF stages. Results show reduced human involvement and far fewer model queries (~5.73% of a strong baseline), improved ability to craft and comprehend complex instructions, and stronger privacy protection.

InternalInspector I2: Robust Confidence Estimation in LLMs through Internal States
Mohammad Beigi, Ying Shen, Runing Yang, Zihao Lin, Qifan Wang, Ankith Mohan, Jianfeng He, Ming Jin, Chang-Tien Lu, Lifu Huang
Empirical Methods in Natural Language Processing (EMNLP), 2024

Despite their vast capabilities, Large Language Models (LLMs) often struggle with generating reliable outputs, frequently producing high-confidence inaccuracies known as hallucinations. Addressing this challenge, our research introduces InternalInspector, a novel framework designed to enhance confidence estimation in LLMs by leveraging contrastive learning on internal states including attention states, feed-forward states, and activation states of all layers. Unlike existing methods that primarily focus on the final activation state, InternalInspector conducts a comprehensive analysis across all internal states of every layer to accurately identify both correct and incorrect prediction processes. By benchmarking InternalInspector against existing confidence estimation methods across various natural language understanding and generation tasks, including factual question answering, commonsense reasoning, and reading comprehension, InternalInspector achieves significantly higher accuracy in aligning the estimated confidence scores with the correctness of the LLM's predictions and lower calibration error. Furthermore, InternalInspector excels at HaluEval, a hallucination detection benchmark, outperforming other internal-based confidence estimation methods in this task.

Balance reward and safety optimization for safe reinforcement learning: A perspective of gradient manipulation
Shangding Gu, Bilgehan Sel, Yuhao Ding, Lu Wang, Qingwei Lin, Ming Jin, Alois Knoll
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2024

Ensuring the safety of Reinforcement Learning (RL) is crucial for its deployment in real-world applications. Nevertheless, managing the trade-off between reward and safety during exploration presents a significant challenge. Improving reward performance through policy adjustments may adversely affect safety performance. In this study, we address this conflicting relationship by leveraging the theory of gradient manipulation. We analyze the conflict between reward and safety gradients and propose a soft switching policy optimization method, with convergence analysis. Based on our theoretical examination, we provide a safe RL framework to overcome the aforementioned challenge, and develop a Safety-MuJoCo Benchmark to assess the performance of safe RL algorithms. We evaluate the effectiveness of our method on Safety-MuJoCo and Safety Gymnasium. Experimental results demonstrate that our algorithms outperform several state-of-the-art baselines in terms of balancing reward and safety optimization.

Does online gradient descent (and variants) still work with biased gradient and variance?
Ahmad Al-Tawaha, Ming Jin
American Control Conference (ACC), 2024

Deterministic bias and stochastic unbiased noise in gradients can affect the performance of online learning algorithms. While existing studies provide bounds for dynamic regret under these uncertainties, they offer limited insight into the specific functionality of the algorithms. This paper investigates the efficacy of online gradient descent (OGD) and its variants with inexact gradients, quantifying the degree of tolerance to these uncertainties and identifying conditions for ensuring robustness. Our analysis reveals that bias and variance act independently, and the tolerance of OGD to inexactness depends on factors such as decision dimension, gradient norm, function variations, alignment of gradients, and function curvature. We verify the results numerically and experimentally, and introduce a general online optimization algorithm as a case study.
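
A quick numpy experiment in the spirit of the analysis: OGD tracking a moving quadratic under a deterministic gradient bias plus zero-mean noise (constants and dynamics are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
T, eta, bias, sigma = 2000, 0.1, 0.05, 0.2
x = np.zeros(2)
total_loss = 0.0
for t in range(T):
    target = np.array([np.sin(0.01 * t), np.cos(0.01 * t)])  # slowly moving optimum
    grad = 2 * (x - target)                                  # exact gradient of |x - target|^2
    grad_inexact = grad + bias + sigma * rng.standard_normal(2)
    x = x - eta * grad_inexact
    total_loss += np.sum((x - target) ** 2)                  # loss at the played point
print("average tracking loss:", total_loss / T)              # grows with bias and sigma
```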

Machine Learning-Assisted Surface-Enhanced Raman Spectroscopy Detection for Environmental Applications: A Review
Sonali Srivastava, Wei Wang, Wei Zhou, Ming Jin, Peter J. Vikesland
Environmental Science & Technology, 2024

Surface-enhanced Raman spectroscopy (SERS) has gained significant attention for its ability to detect environmental contaminants with high sensitivity and specificity. The cost-effectiveness and potential portability of the technique further enhance its appeal for widespread application. However, challenges such as managing high-dimensional data, detecting low-concentration targets amid environmental interferents, and navigating overlapping spectral peaks remain. In response, there is a growing trend toward using machine learning (ML) approaches that encompass multivariate tools for effective SERS data analysis. This comprehensive review details key steps for applying ML techniques to SERS, surveys environmental applications where ML tools are integrated for detecting pathogens and (in)organic pollutants, and discusses the considerations and benefits of ML in these contexts. The review also outlines opportunities for synergizing SERS with ML for real-world deployments.

CausalPrompt: Enhancing LLMs with weakly supervised causal reasoning for robust performance in non-language tasks
Tung-Wei Lin, Vanshaj Khattar, Yuxuan Huang, Junho Hong, Ruoxi Jia, Chen-Ching Liu, Alberto Sangiovanni-Vincentelli, Ming Jin
International Conference on Learning Representations (ICLR) Workshop: Tackling Climate Change with Machine Learning, 2024

We introduce CausalPrompt, a framework that equips LLMs with weakly supervised causal reasoning to improve robustness on non-language tasks. The method leverages causal cues and domain structure to guide prompting and decision-making, yielding stronger performance under distribution shifts pertinent to climate and infrastructure applications.

Simulation and Analysis of Cyber Attacks on Power and Energy Systems
Zachary A Ruttle, Baza Somda, Chen-Ching Liu, Ming Jin
2024 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT), 2024

The power grid has evolved over decades with cyber systems and communications such as SCADA; however, the cyber-power system can be infiltrated by malicious attackers due to this connectivity. Encryption alone is not sufficient. We propose a criminology- and fuzzy-logic-inspired attack algorithm to serve as a testbed for AI-based defenses, recognizing that real attackers may strike varying components in uncertain orders. The method provides a consistent-yet-diverse attack generator for evaluating defensive tools on cyber-physical system models.

Latent Space Correlation-Aware Autoencoder for Anomaly Detection in Skewed Data
Padmaksha Roy, Himanshu Singhal, Timothy J O’Shea, Ming Jin
Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2024

Unsupervised anomaly detection with autoencoders is effective because anomalies reconstruct differently from a well-regularized latent space. Real sensor data are often skewed and non-Gaussian, rendering mean-based estimators unreliable. Reconstruction error via Euclidean distance overlooks correlation structure in the latent space, weakening detection of near anomalies with similar feature distributions. We propose a correlation-aware latent distance that better preserves informative structure, improving anomaly detection on skewed data.
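
The correlation-aware idea can be illustrated by swapping a Euclidean latent score for a Mahalanobis-style distance built from the latent covariance (a conceptual sketch, not the paper's estimator for skewed data):

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], size=1000)  # correlated latents
mu = np.median(Z, axis=0)        # robust center (the mean is unreliable for skewed data)
prec = np.linalg.inv(np.cov(Z, rowvar=False))

def score(z):
    d = z - mu
    return d @ prec @ d          # correlation-aware (Mahalanobis-style) distance

z_along = np.array([1.0, 1.0])   # aligned with the latent correlation direction
z_off = np.array([1.0, -1.0])    # same Euclidean norm, off the correlation: a "near" anomaly
print(score(z_along), score(z_off))  # the off-correlation point scores much higher
```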

Enhancing Distribution System Resilience: A First-Order Meta-RL algorithm for Critical Load Restoration
Zain ul Abdeen, Xiangyu Zhang, Warris Gil, Ming Jin
IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), 2024

The increasing frequency of extreme events and the integration of distributed energy resources (DERs) elevate the need for resilient and efficient critical load restoration strategies. We propose a First-Order Meta-based RL (FOM-RL) algorithm within an online framework for adaptive, robust restoration. By leveraging local DERs, FOM-RL enables rapid adaptation to unseen situations with reduced tuning. Experiments indicate improved resilience under uncertainty compared to standard and warm-start RL baselines.

Democratizing Energy Management with LLM-Assisted Optimization Autoformalism
Ming Jin, Bilgehan Sel, Fnu Hardeep, Wotao Yin
IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), 2024

This paper introduces a method for personalizing energy optimization using large language models (LLMs) combined with an optimization solver. This approach, termed human-guided optimization autoformalism, translates natural language specifications into optimization problems, enabling LLMs to handle various user-specific energy-related tasks. It allows for nuanced understanding and nonlinear reasoning tailored to individual preferences. The research covers common energy sector tasks like electric vehicle charging, HVAC control, and long-term planning for renewable energy installations. This novel strategy represents a significant advancement in context-based optimization using LLMs, facilitating sustainable energy practices customized to individual needs.
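
The kind of program the pipeline might auto-formalize from a request like 'charge my EV to 40 kWh overnight as cheaply as possible, at most 7 kW' can be written in a few lines of cvxpy (prices and limits below are made-up illustration values):

```python
import cvxpy as cp
import numpy as np

price = np.array([0.30, 0.22, 0.15, 0.12, 0.12, 0.18, 0.25, 0.35])  # $/kWh per hour
p = cp.Variable(8, nonneg=True)            # charging power per hour (kW)
prob = cp.Problem(cp.Minimize(price @ p),
                  [cp.sum(p) == 40,        # energy target over 1-hour slots (kWh)
                   p <= 7])                # charger power limit (kW)
prob.solve()
print(prob.value, p.value.round(2))        # cost and the cheap-hour-heavy schedule
```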

2023

A CMDP-within-online framework for Meta-Safe Reinforcement Learning
Vanshaj Khattar, Yuhao Ding, Bilgehan Sel, Javad Lavaei, Ming Jin
International Conference on Learning Representations (ICLR), 2023

We study meta-safe reinforcement learning (Meta-SRL) via a CMDP-within-online framework and establish the first provable guarantees. Using gradient-based meta-learning, we derive task-averaged regret bounds for reward optimality and constraint violations that improve with task similarity/relatedness. We propose a practical meta-algorithm performing inexact online learning on upper bounds estimated via off-policy stationary distribution corrections, with per-task adaptive learning rates and an extension to a competing dynamic oracle. Experiments validate the approach.

Tempo Adaption in Non-stationary Reinforcement Learning
Hyunin Lee, Yuhao Ding, Jongmin Lee, Ming Jin, Javad Lavaei, Somayeh Sojoudi
Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), 2023

We raise and tackle a time synchronization issue between agent and environment in non-stationary RL, where environment changes occur over wall-clock time rather than episode count. We propose a Proactively Synchronizing Tempo (PST) framework that optimizes a suboptimal sequence of interaction/training durations by minimizing an upper bound on dynamic regret. The result trades off agent training tempo with environment change tempo, yielding a sublinear dynamic regret bound. Experiments on high-dimensional non-stationary environments show PST achieves higher online return at a non-zero optimal tempo compared to existing methods.

Winning the CityLearn Challenge: Adaptive Optimization with Evolutionary Search under Trajectory-based Guidance
Vanshaj Khattar, Ming Jin
AAAI Conference on Artificial Intelligence (AAAI), AI for Social Impact Track, 2023 Oral

We present a method that uses optimization solution functions as policies for sequential decision-making and adapts model parameters online via an evolutionary algorithm with trajectory-based guidance. The approach has formal global convergence guarantees and won the 2021 CityLearn Challenge, achieving superior performance across metrics while maintaining interpretability.

Learning-to-learn to guide random search: Derivative-free meta blackbox optimization on manifold
Bilgehan Sel, Ahmad Al-Tawaha, Yuhao Ding, Ruoxi Jia, Bo Ji, Javad Lavaei, Ming Jin
Learning for Dynamics and Control Conference (L4DC), 2023

Solving a sequence of high-dimensional, nonconvex, yet similar optimization problems is common in engineering. We propose a meta-learning framework that exploits shared structure across tasks to improve computational efficiency and sample complexity for derivative-free optimization. Assuming practical high-dimensional objectives lie on a shared low-dimensional manifold, we jointly learn a meta-initialization and a meta-manifold. We provide theoretical benefits and demonstrate effectiveness on two high-dimensional RL tasks.

Certifiably robust neural ODE with learning-based barrier function
Runing Yang, Ruoxi Jia, Xiangyu Zhang, Ming Jin
IEEE Control Systems Letters, 2023

Neural Ordinary Differential Equations (ODEs) have gained traction across applications, yet certified robustness remains limited. This letter proposes training a neural ODE using barrier functions, demonstrating improved robustness on classification tasks. We also provide a first generalization guarantee of robustness against adversarial attacks via a wait-and-judge scenario approach.

On Solution Functions of Optimization: Universal Approximation and Covering Number Bounds
Ming Jin, Vanshaj Khattar, Harshal Kaushik, Bilgehan Sel, Ruoxi Jia
AAAI Conference on Artificial Intelligence (AAAI), 2023 Oral

We analyze the expressivity and learnability of solution functions of convex optimization and their multi-layer extensions. Results include: (1) solution functions of LP/QP are universal approximators for smooth models (or restricted Sobolev spaces) with characterized rate–distortion; (2) a regression-error perspective where information is provided via data observations; (3) compositional architectures with optimization-as-a-layer can exactly reconstruct basic numerical-analysis functions, implying (4) substantial rate–distortion reduction with a universal architecture; and (5) empirical covering-number bounds for LP/QP and a generic (possibly nonconvex) problem via tame geometry. This provides a first rigorous analysis of approximation and learning-theoretic properties of solution functions with implications for algorithm design and guarantees.

Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study
Myeongseob Ko, Ming Jin, Chenguang Wang, Ruoxi Jia
IEEE/CVF International Conference on Computer Vision (ICCV), 2023

Membership inference attacks (MIAs) seek to infer whether a data point was used in training. We develop practical MIAs for large-scale multi-modal models like CLIP, overcoming computational challenges via (i) cosine-similarity thresholding between text and image features with augmentation aggregation, and (ii) a weakly supervised attack using ground-truth non-members. CLIP models are shown susceptible: simple baselines exceed 75% accuracy, and enhanced attacks improve average-case performance by 17% and are ≥7× more effective at low FPRs—highlighting privacy risks in foundational multi-modal models.
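
The first attack reduces to thresholding image-text cosine similarity aggregated over augmentations; a sketch with placeholder embedding and augmentation callables standing in for a real CLIP pipeline:

```python
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def membership_score(image, caption, embed_image, embed_text, augment, n_aug=8):
    """Average image-text cosine similarity over augmentations; a high score
    suggests the (image, caption) pair was seen in training. The embed_* and
    augment callables are placeholders for a real CLIP pipeline."""
    t = embed_text(caption)
    sims = [cosine(embed_image(augment(image)), t) for _ in range(n_aug)]
    return float(np.mean(sims))

# Declare membership if the score exceeds a threshold calibrated on known non-members:
# is_member = membership_score(img, cap, f_img, f_txt, aug) > tau
```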

LAVA: Data Valuation without Pre-Specified Learning Algorithms
Hoang Anh Just, Feiyang Kang, Tianhao Wang, Yi Zeng, Myeongseob Ko, Ming Jin, Ruoxi Jia
International Conference on Learning Representations (ICLR), 2023

We introduce LAVA, a framework for valuing data independent of any pre-specified learning algorithm. We derive a proxy for validation performance via a class-wise Wasserstein distance between training and validation sets and show it upper bounds validation performance under Lipschitz conditions. Sensitivity analysis of this distance yields pointwise values that can be obtained directly from solver outputs, avoiding repeated training runs. We evaluate LAVA across diverse settings, enabling algorithm-agnostic data valuation for acquisition and pricing.
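
As a 1D toy of the proxy (the paper uses a full class-wise optimal-transport distance with a label-aware ground cost), one can compare train and validation feature distributions class by class with scipy:

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

def classwise_w1(train_x, train_y, val_x, val_y):
    """Toy 1D stand-in for the class-wise Wasserstein proxy: compare the
    train/val feature distributions class by class. Conveys the flavor only."""
    total = 0.0
    for c in np.unique(val_y):
        total += wasserstein_distance(train_x[train_y == c], val_x[val_y == c])
    return total

train_x = rng.normal(0.0, 1, 500); train_y = rng.integers(0, 2, 500)
val_x = rng.normal(0.3, 1, 200); val_y = rng.integers(0, 2, 200)
print(classwise_w1(train_x, train_y, val_x, val_y))  # lower -> better expected transfer
```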

Towards Robustness Certification Against Universal Perturbations
Yi Zeng, Zhouxing Shi, Ming Jin, Feiyang Kang, Lingjuan Lyu, Cho-Jui Hsieh, Ruoxi Jia
International Conference on Learning Representations (ICLR), 2023

We study certifying neural network robustness against universal perturbations (UPs), which are shared across all samples and arise in universal adversarial and backdoor attacks. Sample-wise certification bounds are loose under the UP threat model as they ignore the shared-perturbation constraint. We combine linear relaxation–based analysis with MILP to establish the first certification method for UPs and develop a framework to compute population error bounds from certification on random batches. The certifications further enable efficient comparison of model robustness and defenses and accurate detection of backdoor target classes.

Non-stationary Risk-sensitive Reinforcement Learning: Near-optimal Dynamic Regret, Adaptive Detection, and Separation Design
Yuhao Ding, Ming Jin, Javad Lavaei
AAAI Conference on Artificial Intelligence (AAAI), 2023 Oral

We study risk-sensitive RL with entropic risk in episodic non-stationary MDPs where rewards and transitions vary over time under a variation budget. We propose restart-based algorithms (Restart-RSMB, Restart-RSQ) with dynamic regret guarantees and a meta-algorithm that adaptively detects non-stationarity without prior variation knowledge. We establish a dynamic regret lower bound, showing near-optimality. Results reveal that risk control and handling non-stationarity can be designed separately when the variation budget is known, while adaptive detection depends on the risk parameter.

TUNEOPT: An Evolutionary Reinforcement Learning HVAC Controller For Energy-Comfort Optimization Tuning
Mostafa Meimand, Vanshaj Khattar, Zahra Yazdani, Farrokh Jazizadeh, Ming Jin
Proceedings of the 10th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation (ACM BuildSys), 2023

HVAC systems dominate building energy use. Existing energy–comfort co-optimization often relies on manual tuning of the trade-off coefficient, limiting generalizability across scenarios. TUNEOPT proposes an implicit evolutionary RL approach that learns and adapts this trade-off online within a predictive comfort–energy co-optimization formulation for setpoint control, improving both energy efficiency and occupant comfort.

Decision-focused learning for inverse noncooperative games: Generalization bounds and convergence analysis
Ahmad Al-Tawaha, Harshal Kaushik, Bilgehan Sel, Ruoxi Jia, Ming Jin
IFAC-PapersOnLine, 2023

We study inverse noncooperative games: learning players' utilities from observed equilibrium actions and predicting future behavior. Rather than estimate-then-predict, we propose a decision-focused learning approach that embeds the game's equilibrium as a differentiable layer in an end-to-end system. We provide covering-number bounds for solution-function classes arising from parametric variational inequalities and derive generalization guarantees with smooth losses. Experiments highlight improved predictive accuracy over baselines.
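
A toy sketch of the differentiable-equilibrium idea: the equilibrium solve is embedded as a layer by unrolling its iterations, so the prediction loss backpropagates into the utility parameters. The paper differentiates through a variational-inequality solution; this stand-in uses a simple quadratic game and unrolled gradient play.

```python
# Decision-focused learning through an unrolled equilibrium layer (toy game).
import torch

def equilibrium_layer(theta, x, steps=50, lr=0.1):
    """Toy game: the joint action's marginal cost is theta * a + x;
    gradient play converges to the equilibrium a* = -x / theta."""
    a = torch.zeros_like(x)
    for _ in range(steps):
        grad = theta * a + x   # marginal cost at the current joint action
        a = a - lr * grad      # simultaneous gradient play
    return a                   # differentiable w.r.t. theta via unrolling

theta = torch.tensor(1.0, requires_grad=True)    # utility parameter to learn
x = torch.tensor([0.5, -0.2])                    # observed context
pred = equilibrium_layer(theta, x)
observed = torch.tensor([-0.4, 0.1])             # observed equilibrium actions
loss = ((pred - observed) ** 2).mean()
loss.backward()                                  # end-to-end, decision-focused
print(theta.grad)
```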

A theoretical analysis of using gradient data for Sobolev training in RKHS
Zain ul Abdeen, Ruoxi Jia, Vassilis Kekatos, Ming Jin
IFAC-PapersOnLine, 2023

Recent works show that incorporating target derivatives during training can improve accuracy and data efficiency. We provide a theoretical analysis of gradient-augmented (Sobolev) training in RKHS, highlighting (i) limitations and benefits in low-data regimes, and (ii) how gradients affect learning rates. We show thresholds on the target's Lipschitz constant and data size under which Sobolev training outperforms classical value-only training in sample efficiency.
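
A generic Sobolev-training sketch in PyTorch, fitting both target values and target gradients on a toy function; it illustrates the training objective analyzed above, not the RKHS machinery itself.

```python
# Sobolev training on a toy target: match values and gradients jointly.
import torch

def target(x):
    return torch.sin(x).sum(dim=1, keepdim=True)  # gradient is cos(x)

model = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
lam = 1.0  # weight on the gradient-matching term

for _ in range(2000):
    x = (torch.rand(128, 2) * 6 - 3).requires_grad_()
    y = target(x).detach()
    dy = torch.cos(x).detach()                 # analytic target gradient
    pred = model(x)
    grad_pred = torch.autograd.grad(pred.sum(), x, create_graph=True)[0]
    loss = ((pred - y) ** 2).mean() + lam * ((grad_pred - dy) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```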

2022

Adversarial Unlearning of Backdoors via Implicit Hypergradient
Yi Zeng, Si Chen, Won Park, Zhuoqing Mao, Ming Jin, Ruoxi Jia
International Conference on Learning Representations (ICLR), 2022

We propose a minimax formulation to remove backdoors from a poisoned model using only a small clean dataset. Our Implicit Backdoor Adversarial Unlearning (I-BAU) algorithm solves the minimax via implicit hypergradients, capturing the inner–outer dependency that prior methods ignore. We prove convergence and show that robustness attained on the clean-data minimax generalizes to unseen test data. Across seven attacks and two datasets, I-BAU matches or outperforms six state-of-the-art defenses, is robust to variations in trigger patterns, attack settings, and poison ratios, requires less compute (markedly faster in the single-target attack setting), and remains effective with as few as 100 clean samples.
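
A heavily simplified view of the minimax objective, using naive alternating gradient steps; I-BAU's contribution is to solve the inner problem and differentiate through it via implicit hypergradients rather than alternating like this.

```python
# Simplified backdoor-unlearning minimax: minimize over weights the
# worst-case loss over a bounded trigger perturbation (naive alternation).
import torch
import torch.nn.functional as F

def unlearn(model, clean_loader, inner_steps=5, inner_lr=0.1,
            outer_lr=1e-4, eps=8 / 255):
    opt = torch.optim.SGD(model.parameters(), lr=outer_lr)
    for x, y in clean_loader:
        delta = torch.zeros_like(x, requires_grad=True)  # candidate trigger
        for _ in range(inner_steps):                     # inner max over delta
            loss = F.cross_entropy(model(x + delta), y)
            g, = torch.autograd.grad(loss, delta)
            delta = (delta + inner_lr * g.sign()).clamp(-eps, eps)
            delta = delta.detach().requires_grad_()
        opt.zero_grad()                                  # outer min over weights
        F.cross_entropy(model(x + delta), y).backward()
        opt.step()
    return model
```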

Recurrent neural network controllers synthesis with stability guarantees for partially observed systems
Fangda Gu, He Yin, Laurent El Ghaoui, Murat Arcak, Peter Seiler, Ming Jin
AAAI Conference on Artificial Intelligence (AAAI), 2022

Neural network controllers are attractive for control tasks, yet safety-critical systems demand stability, especially under partial observability where long-term memory is needed. We consider RNNs as dynamic controllers for nonlinear, uncertain, partially observed systems and derive convex stability conditions via integral quadratic constraints, the S-lemma, and sequential convexification. To enforce stability during learning and control, we propose a projected policy gradient method in a reparameterized space that uses mild additional system information. Experiments show that stabilizing controllers are learned with fewer samples and achieve higher final performance than policy-gradient baselines.

Dynamic Regret Bounds for Constrained Online Nonconvex Optimization Based on Polyak-Lojasiewicz Regions
Julie Mulvaney-Kemp, SangWoo Park, Ming Jin, Javad Lavaei
IEEE Transactions on Control of Network Systems, 2022

We study constrained online nonconvex optimization where performance is measured by dynamic regret against the time-varying optimal decision. For losses that can be arbitrarily nonconvex yet admit slowly time-varying global solutions, we analyze regions around each time's optimizer to define time-varying target sets that satisfy proximal Polyak–Łojasiewicz and other properties under projected gradient descent. We design two algorithms and prove dynamic regret bounds scaling with problem variation.

Controlling Smart Inverters Using Proxies: A Chance-Constrained DNN-based Approach
Sarthak Gupta, Vassilis Kekatos, Ming Jin
IEEE Transactions on Smart Grid, 2022

Coordinating inverters at scale under uncertainty is essential for integrating renewables in distribution grids. When only proxies of grid conditions are available, we integrate DNN-based inverter policies directly into OPF and train them via two formulations that confine voltage deviations: an average-violation approach and a convex restriction of chance constraints. The trained DNNs can be driven by partial, noisy, or proxy descriptors, enabling operation on unobservable feeders.

Learning Neural Networks under Input-Output Specifications
Zain Ul Abdeen, He Yin, Vassilis Kekatos, Ming Jin
American Control Conference (ACC), 2022

We study learning neural networks that certifiably satisfy input–output specifications. The key idea is to convexify the verification condition by abstracting nonlinear specs and activations with quadratic constraints and introducing a loop-transformation-based reparameterization, yielding a convex condition enforceable during training. The resulting inner approximation of admissible parameters enables certification. We validate on reachability-style specs across input regions.

2021

Power Up! Robust Graph Convolutional Network via Graph Powering
Ming Jin, Heng Chang, Wenwu Zhu, Somayeh Sojoudi
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2021

We enhance adversarial robustness of GCNs by moving beyond spectral graph theory to robust graph theory. We introduce a novel convolution operator that is provably robust in the spectral domain and incorporate it into the GCN architecture to improve expressivity and interpretability. Extending the original graph to a sequence of graphs yields a robust training paradigm encouraging transferability across spatial and spectral characteristics. Extensive experiments show simultaneous gains in benign and adversarial settings.
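
For intuition, graph powering replaces the adjacency matrix with that of the r-th graph power, connecting all nodes within distance r; the operator the robust convolution builds on. A small scipy sketch (illustrative only):

```python
# r-th graph power: edge (i, j) iff 0 < dist(i, j) <= r in the original graph.
import numpy as np
import scipy.sparse as sp

def graph_power(adj, r):
    A = sp.csr_matrix(adj, dtype=np.int64)
    reach = (A + sp.identity(A.shape[0], dtype=np.int64, format="csr")) ** r
    out = (reach > 0).astype(np.int64)  # reachable within r hops
    out.setdiag(0)                      # drop self-loops
    out.eliminate_zeros()
    return out

path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]])
print(graph_power(path, 2).toarray())  # 2-hop neighbors become adjacent
```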

Imitation Learning with Stability and Safety Guarantees
He Yin, Peter Seiler, Ming Jin, Murat Arcak
IEEE Control Systems Letters, 2021

Presents a method to learn neural network controllers with certified stability and safety via imitation learning for LTI systems. By merging Lyapunov theory with local quadratic constraints on activations, convex conditions are derived and incorporated into the IL process to jointly minimize IL loss and maximize the certified region of attraction. An ADMM-based algorithm solves the resulting problem. Demonstrated on vehicle lateral control.

Diminishing Regret for Online Nonconvex Optimization
SangWoo Park, Julie Mulvaney-Kemp, Ming Jin, Javad Lavaei
American Control Conference (ACC), 2021

We introduce nonconvexity regret to evaluate local-search methods for online nonconvex optimization (ONO) against a global solver. We define the depth of a global minimum and show that memory and random exploration drive nonconvexity regret to zero when objective variability is small relative to depth. We provide probabilistic regret bounds depending on the evolution of time-varying landscapes, leveraging notions of missing mass and 1-occupancy.

2020

Stability-certified Reinforcement Learning: A Control-theoretic Perspective
Ming Jin, Javad Lavaei
IEEE Access, 2020

We study certifying stability of RL policies interconnected with nonlinear dynamical systems. By regulating partial policy gradients, robust stability can be certified via an SDP feasibility problem that exploits problem structure. We analyze (non)conservatism and empirically demonstrate high performance within the certified parameter space and stable long-run learning on multi-flight formation and power system frequency regulation.

Boundary Defense Against Cyber Threat for Power System State Estimation
Ming Jin, Javad Lavaei, Somayeh Sojoudi, Ross Baldick
IEEE Transactions on Information Forensics and Security, 2020

Power grid operation is increasingly data-centric, raising reliability concerns under data attacks. We quantify and visualize non-robust regions for state estimation—graph-structured quadratic sensing—where local data manipulation can induce global estimation errors. We propose an optimization-based graphical boundary defense to localize manipulated regions, preventing local attacks from having global effects and enhancing situational awareness. The framework reveals key geometric and algebraic factors impacting robustness and is applied to the U.S. grid.

A Survey on Conic Relaxations of Optimal Power Flow Problem
Fariba Zohrizadeh, Cedric Josz, Ming Jin, Ramtin Madani, Javad Lavaei, Somayeh Sojoudi
European Journal of Operational Research, 2020

Reviews the success of conic optimization—LP, SOCP, and SDP—in addressing optimal power flow (OPF) in modern power systems. Emphasizes scalability and reliability amidst increasing grid complexity due to renewables and EVs, and surveys recent advances with theoretical guarantees and practical implications.

Towards Off-Policy Evaluation as a Prerequisite for Real-World Reinforcement Learning in Building Control
Bingqing Chen, Ming Jin, Zhe Wang, Tianzhen Hong, Mario Bergés
Proceedings of the 1st International Workshop on Reinforcement Learning for Energy Management in Buildings & Cities (RL4EB), 2020

Off-policy evaluation (OPE) estimates a policy's performance without online interaction, enabling safety and performance checks before deployment in building control. We review OPE methods against the characteristics of building operation data—deterministic behavior policies and limited coverage—adopt an approximate model approach, and use bootstrapping to quantify uncertainty and correct for bias. Simulation results highlight practical considerations for real-world RL in buildings.
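
A minimal bootstrap sketch for quantifying OPE uncertainty with an approximate model; fit_model and rollout_value are hypothetical stand-ins for fitting a dynamics model on logged episodes and evaluating the target policy inside it.

```python
# Bootstrap uncertainty for model-based OPE (illustrative stand-ins).
import numpy as np

def bootstrap_ope(episodes, fit_model, rollout_value, policy, B=200, seed=0):
    rng = np.random.default_rng(seed)
    n = len(episodes)
    estimates = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)      # resample episodes with replacement
        model = fit_model([episodes[i] for i in idx])
        estimates.append(rollout_value(model, policy))
    estimates = np.asarray(estimates)
    lo, hi = np.percentile(estimates, [2.5, 97.5])
    return estimates.mean(), (lo, hi)         # point estimate and 95% interval
```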

Control of Superheat of Organic Rankine Cycle under Transient Heat Source Based on Deep Reinforcement Learning
Xuan Wang, Rui Wang, Ming Jin, Gequn Shu, Hua Tian, Jiaying Pan
Applied Energy, 2020

The organic Rankine cycle (ORC) recovers engine waste heat, yet transient operating conditions make superheat control challenging. This work proposes two DRL-based controllers for superheat regulation under a transient heat source, alleviating dependence on disturbance prediction that hampers MPC and DP methods. The DRL controllers are evaluated against baselines, showing strong tracking performance under real-world variability.

Deep Learning for Reactive Power Control of Smart Inverters under Communication Constraints
Sarthak Gupta, Vassilis Kekatos, Ming Jin
IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), 2020

Between cyber-intensive OPF and local droop control, we propose training deep neural networks (DNNs) to set inverter injections. The DNN embeds the feeder model and is trained to minimize a grid-wide objective subject to inverter/network constraints in expectation over uncertainty. Learning is posed as stochastic OPF with primal–dual updates. A master–slave architecture broadcasts a condensed utility signal to local inverter DNNs that combine it with local measurements, enabling operation under bandwidth constraints.
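
A generic primal-dual training sketch of this stochastic-OPF formulation; grid_cost and voltage_violation below are toy differentiable surrogates standing in for the embedded feeder model, not the paper's implementation.

```python
# Primal-dual training of a constraint-aware inverter policy (toy surrogates).
import torch

def grid_cost(scenario, q):              # toy stand-in for the grid objective
    return (q ** 2).sum(dim=1)

def voltage_violation(scenario, q):      # toy stand-in for voltage limits
    return torch.relu(q.abs().sum(dim=1) - 1.0)

policy = torch.nn.Sequential(
    torch.nn.Linear(8, 64), torch.nn.ReLU(), torch.nn.Linear(64, 4))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
lam, dual_lr = 0.0, 0.01                 # dual variable and its step size

for _ in range(1000):
    scenario = torch.rand(32, 8)                   # sampled loads/solar
    q = policy(scenario)                           # reactive power setpoints
    cost = grid_cost(scenario, q).mean()
    viol = voltage_violation(scenario, q).mean()
    opt.zero_grad()
    (cost + lam * viol).backward()                 # primal descent on Lagrangian
    opt.step()
    lam = max(0.0, lam + dual_lr * viol.item())    # projected dual ascent
```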

Pre-2020 Selected

Scalable and Robust State Estimation from Abundant but Untrusted Data
Ming Jin, Igor Molybog, Reza Mohammadi-Ghazi, Javad Lavaei
IEEE Transactions on Smart Grid, 2019

We propose a linear representation that captures grid topology and enables an efficient two-stage estimator for power system state estimation from abundant but potentially untrusted data. We derive an identifiability condition delineating when the unique global optimum can be efficiently recovered, and introduce a robustness metric—mutual incoherence—to analyze global recovery and statistical error bounds under dense noise and bad data. The method outperforms prior approaches in accuracy and robustness, scales to >13,000-bus systems, and achieves minute-level runtimes.

Power Grid AC-based State Estimation: Vulnerability Analysis Against Cyber Attacks
Ming Jin, Javad Lavaei, Karl Henrik Johansson
IEEE Transactions on Automatic Control, 2018

We analyze the vulnerability of AC-based power system state estimation (SE) to false data injection attacks (FDIA). A convexification framework based on semidefinite programming (SDP) solves the FDIA design efficiently despite nonlinear AC models and sparsity constraints. From optimal SDP solutions, we delineate attackable regions given measurement types and grid topology, prove stealthiness and sparsity properties, and derive performance bounds. Simulations on IEEE test cases validate the approach and inform protection via security metrics, redesigned bad data detection, and grid hardening.

Microgrid to Enable Optimal Distributed Energy Retail and End-User Demand Response
Ming Jin, Wei Feng, Chris Marnay, Costas Spanos
Applied Energy, 2018

Investigates pricing and operation with demand response for a microgrid (MG) retailer in an integrated energy system. Co-optimizes retail rates and MG dispatch via an MIQP, yielding dynamic pricing that reflects generation cost and promotes DR, alongside optimal dispatch that exploits flexibility. Demonstrates value of distributed generation and demand flexibility for sustainability and resilience.

Automated Mobile Sensing: Towards High-Granularity Agile Indoor Environmental Quality Monitoring
Ming Jin, Shichao Liu, Stefano Schiavon, Costas Spanos
Building and Environment, 2018 (Best Paper Award)

Proposes an automated mobile sensing system that dispatches a sensor-rich, navigation-capable robot for agile IEQ monitoring. To handle sparse spatio-temporal data, a tailored interpolation algorithm captures global trends and local variations, enabling efficient IEQ evaluation. Demonstrates high-granularity monitoring with adaptability to dynamic indoor environments.

Design Automation for Smart Building Systems
Ruoxi Jia, Baihong Jin, Ming Jin, Yuxun Zhou, Ioannis C Konstantakopoulos, Han Zou, Joyce Kim, Dan Li, Weixi Gu, Reza Arghandeh, Pierluigi Nuzzo, Stefano Schiavon, Alberto L Sangiovanni-Vincentelli, Costas J Spanos
Proceedings of the IEEE, 2018

Presents a platform-based methodology for smart building design. Platform-based design promotes reuse on shared infrastructures, rapid prototyping, and design-space exploration. The paper formalizes building components and provides a design flow mapping high-level application specs to physical implementations. A case study on on-demand HVAC systems demonstrates the methodology.

Virtual Occupancy Sensing: Using Smart Meters to Indicate Your Presence
Ming Jin, Ruoxi Jia, Costas Spanos
IEEE Transactions on Mobile Computing, 2017

Occupancy detection enables energy efficiency, comfort, and space utilization but typically requires custom sensors, calibration, and maintenance. Leveraging widely deployed electricity meters, we develop non-intrusive, cost-effective occupancy detection methods suited for limited or no labeled data. Evaluations on residential and commercial buildings show binary occupancy detection accuracy comparable to fully supervised models (≈78–93% in residences, ≈90% in offices).

Inverse Reinforcement Learning via Deep Gaussian Process
Ming Jin, Andreas Damianou, Pieter Abbeel, Costas Spanos
Conference on Uncertainty in Artificial Intelligence (UAI), 2017

We propose an IRL approach based on deep Gaussian processes (deep GPs) to learn complex reward structures from few demonstrations. Stacking latent GP layers yields abstract representations of state features, linked to demonstrations via maximum entropy IRL. We develop a non-standard variational approximation to enable tractable inference despite the embedded IRL engine, providing approximate Bayesian treatment and guarding against overfitting. The joint representation+IRL learning outperforms state-of-the-art baselines on standard and new benchmarks.

Sensing by Proxy: Occupancy Detection Based on Indoor CO2 Concentration
Ming Jin, Nikos Bekiaris-Liberis, Kevin Weekly, Costas Spanos, Alexandre M. Bayen
The 9th International Conference on Mobile Ubiquitous Computing, Systems, Services and Technologies (UBICOMM'15), 2015 (Best Paper Award)

We introduce a sensing-by-proxy paradigm that infers latent occupancy from CO2 measurements using constitutive models that capture the spatial and physical structure of the system. A link model relates the proxy signal to unknown human emissions via a coupled PDE–ODE system whose data-driven parameters are stable and robust across experiments. Field evaluations with both a CO2 pump and controlled occupancy show superior accuracy to machine-learning baselines for estimating room occupancy, enabling occupancy-aware HVAC, lighting, and services.

Social Game for Building Energy Efficiency: Incentive Design
Lillian J Ratliff, Ming Jin, Ioannis C Konstantakopoulos, Costas J Spanos, S Shankar Sastry
52nd Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2014

We design an incentive mechanism via a social game to encourage energy-efficient behavior by building occupants. Occupants earn points that determine their probability of winning a lottery. We estimate occupant utilities and model the interaction with the building manager as a reversed Stackelberg game with multiple non-cooperative followers. The resulting bilevel optimization is solved using particle swarm optimization; the leader's induced choice yields a Nash equilibrium derived from the player-state distribution.