Equalizer Zero-Determinant Strategy in Discounted Repeated Stackelberg Asymmetric Game

CHENG Zhaoyang, CHEN Guanpu, HONG Yiguang

Journal of Systems Science and Complexity, 2024, Vol. 37, Issue (1): 184-203. DOI: 10.1007/s11424-024-3408-5

Abstract

This paper focuses on the performance of equalizer zero-determinant (ZD) strategies in discounted repeated Stackelberg asymmetric games. In the leader-follower adversarial scenario, the strong Stackelberg equilibrium (SSE), which derives from the opponents’ best response (BR), is technically the optimal strategy for the leader. However, computing an SSE strategy may be difficult, since it requires solving a mixed-integer program and has exponential complexity in the number of states. To this end, the authors propose an equalizer ZD strategy, which can unilaterally restrict the opponent’s expected utility. The authors first study the existence of an equalizer ZD strategy in one-to-one situations and analyze an upper bound on its performance relative to the baseline SSE strategy. The authors then turn to multi-player models in which one player adopts an equalizer ZD strategy, give bounds on the weighted sum of the opponents’ utilities, and compare it with the SSE strategy. Finally, the authors give simulations on unmanned aerial vehicles (UAVs) and moving target defense (MTD) to verify the effectiveness of the proposed approach.
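
To make the idea of "unilaterally restricting the opponent's expected utility" concrete, the following is a minimal sketch of an equalizer strategy in a generic discounted repeated 2x2 game. It is not the paper's Stackelberg model: the payoff vector `u`, the discount factor, the target value, and the function names `equalizer_strategy` and `discounted_payoff` are all hypothetical choices made for illustration. The sketch relies on the standard identity for memory-one strategies in discounted games: if X's strategy `p` satisfies `delta*p - g + (1-delta)*p0 = beta*(u - target)`, then the opponent's discounted average payoff equals `target` regardless of the opponent's own strategy.

```python
import numpy as np

def equalizer_strategy(u, target, beta, delta, p0):
    """Memory-one strategy p for player X that pins the opponent's payoff.

    States are ordered (X action, Y action): 11, 12, 21, 22.
    p[s] is the probability that X plays action 1 after state s.
    Solves delta*p - g + (1-delta)*p0 = beta*(u - target) for p.
    """
    g = np.array([1.0, 1.0, 0.0, 0.0])  # 1 where X played action 1 last round
    p = (g + beta * (np.asarray(u, float) - target) - (1 - delta) * p0) / delta
    if np.any(p < 0) or np.any(p > 1):
        raise ValueError("no valid equalizer strategy for these parameters")
    return p

def discounted_payoff(p, p0, q, q0, u, delta):
    """Opponent's exact discounted average payoff when X plays (p, p0), Y plays (q, q0)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    # Row s of M holds the transition probabilities from state s to 11, 12, 21, 22.
    M = np.column_stack([p * q, p * (1 - q), (1 - p) * q, (1 - p) * (1 - q)])
    v0 = np.array([p0 * q0, p0 * (1 - q0), (1 - p0) * q0, (1 - p0) * (1 - q0)])
    # Discounted state distribution v = (1-delta) * v0 * (I - delta*M)^{-1}.
    v = np.linalg.solve((np.eye(4) - delta * M).T, (1 - delta) * v0)
    return float(v @ np.asarray(u, float))

if __name__ == "__main__":
    u = [3.0, 5.0, 0.0, 1.0]           # hypothetical opponent payoffs per state
    delta, p0, target = 0.9, 0.5, 2.0  # hypothetical discount, initial move, pinned value
    p = equalizer_strategy(u, target, beta=-0.2, delta=delta, p0=p0)
    rng = np.random.default_rng(0)
    for _ in range(3):
        q, q0 = rng.random(4), rng.random()  # arbitrary opponent strategy
        print(round(discounted_payoff(p, p0, q, q0, u, delta), 6))
```

Whatever memory-one strategy the opponent draws, the printed payoff should coincide with `target` up to rounding; feasible targets are limited by the requirement that every entry of `p` lies in [0, 1], which is why the helper raises an error otherwise.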

Key words

Discounted repeated Stackelberg asymmetric game / equalizer zero-determinant strategy / strong Stackelberg equilibrium strategy

Cite this article

CHENG Zhaoyang, CHEN Guanpu, HONG Yiguang. Equalizer Zero-Determinant Strategy in Discounted Repeated Stackelberg Asymmetric Game. Journal of Systems Science and Complexity, 2024, 37(1): 184-203 https://doi.org/10.1007/s11424-024-3408-5

Funding

This work was supported by the National Key Research and Development Program of China under Grant No. 2022YFA1004700, the National Natural Science Foundation of China under Grant No. 62173250, and the Shanghai Municipal Science and Technology Major Project under Grant No. 2021SHZDZX0100.