Academic Seminar: Professor Li Bo (黎波)
Posted by: Shu Ni (殳妮)  Posted on: 2025-03-28
Time: April 2, 2025, 9:00-10:00 AM
Venue: Room 318, Caike Hall (财科馆), East Campus (formerly the EMBA classroom)
Title: Practical Performative Policy Learning with Strategic Agents
Abstract: This paper studies the performative policy learning problem, where agents adjust their features in response to a released policy to improve their potential outcomes, inducing an endogenous distribution shift. There has been growing interest in training machine learning models in strategic environments, including strategic classification [Hardt et al., 2016] and performative prediction [Perdomo et al., 2020]. However, existing approaches often rely on restrictive parametric assumptions: micro-level utility models in strategic classification and macro-level data distribution maps in performative prediction, which severely limit scalability and generalizability. We approach this problem as a complex causal inference task, relaxing parametric assumptions on both micro-level agent behavior and macro-level data distribution. Leveraging bounded rationality, we uncover a practical low-dimensional structure in distribution shifts and construct an effective mediator in the causal path from the deployed model to the shifted data. We then propose a gradient-based policy optimization algorithm with a differentiable classifier serving as a substitute for the high-dimensional distribution map. Our algorithm efficiently utilizes batch feedback and limited manipulation patterns, and achieves high sample efficiency compared to methods reliant on bandit feedback or zero-order optimization. We also provide theoretical guarantees for algorithmic convergence. Extensive and challenging experiments on high-dimensional settings demonstrate our method's practical efficacy.
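About the speaker: Li Bo (黎波) is a tenured associate professor in the Department of Management Science and Engineering, School of Economics and Management, Tsinghua University. He received his bachelor's degree from the Department of Mathematics, School of Mathematical Sciences, Peking University in 2002 and his Ph.D. from the Department of Statistics, University of California, Berkeley in 2006. His recent research interests include platform experiment design, causal learning and inference, data-driven decision making, trustworthy machine learning, and the intersection of machine learning and economics. He has published more than 60 papers in journals and conferences across economics, management, statistics, and artificial intelligence, including the journals Biometrika, JRSSB, Management Science, TKDE, and Journal of Management Sciences in China (管理科学学报), and the conferences NeurIPS, ICML, ICLR, AAAI, and KDD.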