Elsevier

Automatica

Volume 69, July 2016, Pages 181-194
A class of interference induced games: Asymptotic Nash equilibria and parameterized cooperative solutions

https://doi.org/10.1016/j.automatica.2016.02.030

Abstract

We consider a multi-agent system with linear stochastic individual dynamics and individual linear quadratic ergodic cost functions. The agents partially observe their own states. Their cost functions and initial statistics are a priori independent, but they are coupled through an interference term (the mean of all agent states) entering each of their individual measurement equations. While in general, for a finite number of agents, the resulting optimal control law may be a nonlinear function of the available observations, we establish that for certain classes of cost and dynamic parameters, optimal separated control laws obtained by ignoring the interference coupling are asymptotically optimal when the number of agents goes to infinity, thus forming, for finite N, an ϵ-Nash equilibrium. More generally though, optimal separated control laws may not be asymptotically optimal, and can in fact result in unstable overall behavior. Thus we consider a class of parameterized decentralized control laws whereby the separated Kalman gain is treated as the arbitrary gain of a Luenberger-like observer. System stability regions are characterized and the nature of optimal cooperative control policies within the considered class is explored. Numerical results and an application example for wireless communications are reported.

Introduction

Over the past several years, there has been a surge of interest in the study and analysis of large population stochastic multi-agent systems, owing to their wide variety of applications. Many practical applications and examples of these systems arise in engineering, biological, social and economic fields, such as wireless sensor networks (Chong & Kumar, 2003), very large scale robotics (Reif & Wang, 1999), controlled charging of a large population of electric vehicles (Karfopoulos & Hatziargyriou, 2013), synchronization of coupled oscillators (Yin, Mehta, Meyn, & Shanbhag, 2012), swarming and flocking phenomena in biological systems (Grönbaum & Okubo, 1994; Passino, 2002), evacuation of large crowds in emergency situations (Helbing et al., 2000; Lachapelle, 2010), and sharing and competing for resources on the Internet (Altman et al., 2006), to cite a few. Large-scale stochastic games with unbounded costs were studied in Adlakha et al. (2008). Mean field game theory, which addresses a class of dynamic games with a large number of agents in which each agent interacts with the average or so-called mean field effect of other agents via couplings in their individual dynamics and cost functions, was studied in Huang, Caines, and Malhamé (2006), Huang, Caines, and Malhamé (2012), Lachapelle and Lions (2007), Nourian, Caines, Malhamé, and Huang (2012), Nourian, Caines, Malhamé, and Huang (2013), Wang and Zhang (2012) and Wang and Zhang (2014). In Li and Zhang (2008), the mean field linear quadratic Gaussian (LQG) framework was extended to systems of agents with Long Time Average (LTA) (i.e., ergodic) cost functions such that the set of control laws possesses an almost sure (a.s.) asymptotic Nash equilibrium property.

Stochastic Nash games with partial observation have been of interest since the late 1960s. LQG continuous-time zero-sum stochastic games with output measurements corrupted by additive independent white Gaussian noise were studied in Rhodes and Luenberger (1969a, 1969b) under the constraint that each player is limited to a linear state estimator for generating its optimal controls. These results were extended to nonzero-sum Nash games in Saksena and Cruz (2005). In these works the authors assumed that the separation principle holds. In Kian, Cruz, and Simaan (2002), discrete-time nonzero-sum LQG Nash games with constrained state estimators and two different information structures were investigated, where it is shown that the optimal control laws do not satisfy the separation principle and the estimator characteristics depend on the controller gains.

Distributed decision-making with partial observation for large population stochastic multi-agent systems was studied in Caines and Kizilkale (2013, 2014), Huang et al. (2006) and Wang and Zhang (2013), where the synthesis of Nash strategies is investigated for agents that are weakly coupled through either individual dynamics or costs. In Abedinpour Fallah et al. (2013a, 2013b, 2014), the authors studied a somewhat dual situation whereby large populations of partially observed stochastic agents, although a priori individually independent, are coupled only via their observation structure. The latter involves an interference term depending on the empirical mean of all agent states. The study of such measurement-coupled systems is inspired by a variety of applications, including for instance the communications model for power control in cellular telephone systems (Huang et al., 2004; Perreau and Anderson, 2006), where any conversation in a cell acts as interference on the other conversations in that cell. Indeed, despite the so-called signal processing gain achieved thanks to a user's specific coding advantage (and considered in our model to be of order 1/N, where N is the total number of agents), the ability of the base station to correctly decode the signals sent by a given mobile remains limited by interference formed by the superposition of all other in-cell user signals. Viewed in this light, the studied problem can be considered as a game over a noisy channel.

Individual agent dynamics are assumed to be linear and stochastic, with linear local state measurements. In the current paper, we focus on the case where the measurement interaction model depends only on the empirical mean of the agents' states, in a purely additive manner. In general, in such decentralized control problems, the measurement system could be used for some sort of signaling, and control and estimation are typically coupled (Witsenhausen, 1968). We assume that each agent is constrained to use a linear Kalman filter-like state estimator to generate its optimal strategies. We establish that for certain classes of cost and dynamic parameters, optimal separated control laws obtained by ignoring the interference coupling are asymptotically optimal when the number of agents goes to infinity, thus forming, for finite N, an ϵ-Nash equilibrium. More generally though, optimal separated control laws may not be asymptotically optimal, and can in fact result in unstable overall behavior. Thus we consider a class of parameterized decentralized control laws whereby the separated Kalman gain is treated as the arbitrary gain of a Luenberger-like observer. System stability regions are characterized and the nature of optimal cooperative control policies within the considered class is explored.

The rest of the paper is organized as follows. The problem is defined and formulated in Section 2. Section 3 presents the closed-loop dynamics model. In Section 4, a decentralized control and state estimation algorithm via stability analysis is described and a characterization of its optimality properties is given. Section 5 presents parameterized cooperative solutions. Both Sections 4 and 5 also provide some numerical simulation results. Section 6 presents an application example for wireless communications. Concluding remarks are stated in Section 7.

Section snippets

Problem formulation

Consider a system of N agents, with individual scalar dynamics for simplicity of computations. The evolution of the state component is described by $x_{k+1,i} = a x_{k,i} + b u_{k,i} + w_{k,i}$, with partial scalar state observations given by $y_{k,i} = c x_{k,i} + h \left( \frac{1}{N} \sum_{j=1}^{N} x_{k,j} \right) + v_{k,i}$ for $k \ge 0$ and $1 \le i \le N$, where $x_{k,i}, u_{k,i}, y_{k,i} \in \mathbb{R}$ are the state, the control input and the measured output of the ith agent, respectively. The random variables $w_{k,i} \sim \mathcal{N}(0, \sigma_w^2)$ and $v_{k,i} \sim \mathcal{N}(0, \sigma_v^2)$ represent independent Gaussian white noises at different times k
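As a rough illustration of the state and measurement equations above, the following sketch simulates the uncontrolled system (u = 0) and returns each agent's measurements together with the empirical mean that acts as interference. The function name, the zero initial states and the choice u = 0 are assumptions made for illustration only; they are not taken from the paper.

```python
import random

def simulate_measurements(N, T, a, b, c, h, sigma_w, sigma_v, seed=0):
    """Simulate x_{k+1,i} = a*x_{k,i} + b*u_{k,i} + w_{k,i} with u = 0, and
    y_{k,i} = c*x_{k,i} + h*mean_j(x_{k,j}) + v_{k,i}.

    Assumptions (not from the paper): zero initial states, zero control.
    Returns (ys, means): ys[k][i] is agent i's measurement at time k,
    means[k] is the empirical state mean at time k.
    """
    rng = random.Random(seed)
    x = [0.0] * N
    ys, means = [], []
    for _ in range(T):
        m = sum(x) / N  # interference term: empirical mean of all states
        ys.append([c * xi + h * m + rng.gauss(0.0, sigma_v) for xi in x])
        means.append(m)
        u = 0.0  # hypothetical choice: open-loop, no control applied
        x = [a * xi + b * u + rng.gauss(0.0, sigma_w) for xi in x]
    return ys, means
```

With measurement noise switched off (sigma_v = 0), averaging the measurements over agents gives (c + h) times the state mean, which makes the additive role of the interference gain h easy to see.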

Closed-loop agent dynamics

In this section, we first obtain the fourth-order model of the closed-loop agent dynamics. In particular, when local state estimate feedback (3) is included in the ith agent state equation (1), the result is as follows: $x_{k+1,i} = a x_{k,i} - b f \hat{x}_{k,i} + w_{k,i}$. In addition, anticipating the need to account for the influence of average states in the dynamics through the measurement equation, and letting a tilde ($\tilde{\cdot}$) indicate an averaging over agents operation, we define: $m_k = \frac{1}{N} \sum_{j=1}^{N} x_{k,j}$, $\tilde{m}_k = \frac{1}{N} \sum_{j=1}^{N} \hat{x}_{k,j}$, $\tilde{w}_k = \frac{1}{N} \sum_{j=1}^{N} w$

The race between N and T

It may appear obvious that as N goes to infinity, from Assumption 1 and (7) we have $E[m_k] = 0$, and as a result, at least asymptotically, the agent systems become essentially independent and individually optimal control laws are obtained via a Kalman filter K coupled with a gain f obtained from a Riccati equation. However, it turns out that while this is indeed correct if N is allowed to go to infinity before the length of the control horizon T is, it is no longer always true if instead T is
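To make the separated "Kalman gain plus Riccati gain" pair concrete, here is a minimal sketch that computes both for a scalar system by iterating the standard discrete-time Riccati recursions to a fixed point, with the interference term ignored. The per-stage cost q*x^2 + r*u^2, the feedback convention u = -f*x̂, the filter-form gain and the function names are assumptions for illustration; the paper's exact cost and estimator form are not visible in this excerpt.

```python
def lq_gain(a, b, q, r, tol=1e-12, max_iter=100000):
    """Steady-state scalar LQ gain f (u = -f * x_hat) and Riccati solution P.

    Assumed per-stage cost: q*x^2 + r*u^2 (hypothetical weights q, r).
    """
    P = q
    for _ in range(max_iter):
        P_next = q + a * a * P - (a * b * P) ** 2 / (r + b * b * P)
        converged = abs(P_next - P) < tol
        P = P_next
        if converged:
            break
    f = a * b * P / (r + b * b * P)
    return f, P

def kalman_gain(a, c, sigma_w, sigma_v, tol=1e-12, max_iter=100000):
    """Steady-state scalar Kalman gain K (filter form) and prediction
    error variance S, ignoring the interference term (h = 0)."""
    S = sigma_w ** 2
    for _ in range(max_iter):
        S_next = (a * a * S
                  - (a * c * S) ** 2 / (c * c * S + sigma_v ** 2)
                  + sigma_w ** 2)
        converged = abs(S_next - S) < tol
        S = S_next
        if converged:
            break
    K = c * S / (c * c * S + sigma_v ** 2)
    return K, S
```

For an unstable open-loop agent such as a = 1.2 with unit b, c, weights and noise variances, the resulting f places the closed-loop pole a - b*f inside the unit circle. Whether the pair (K, f) remains adequate once the interference is present is exactly the question the section raises.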

Cooperative decentralized separated policies

Definition 2

Cooperative decentralized separated policies are defined as decentralized separated policies (see Section 2) with common gains K, f such that the resulting social cost $J_{\mathrm{soc}}^{(N)} = \frac{1}{N} \sum_{j=1}^{N} J_j$ is minimized.

If $a > a_{\sup}$, then agents must cooperate, for otherwise they risk having to pay an infinite cost. This is a situation where the optimal Kalman–Riccati couple (K, f) is outside of the stability region. On the other hand, even when $a \le a_{\sup}$, agents may still be interested in achieving optimal cooperative
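The idea of a stability region in the (K, f) plane can be sketched numerically. The example below assumes a predictor-form Luenberger-like observer $\hat{x}_{k+1,i} = a \hat{x}_{k,i} + b u_{k,i} + K (y_{k,i} - c \hat{x}_{k,i})$ with $u_{k,i} = -f \hat{x}_{k,i}$; this observer form is an assumption, not the paper's stated estimator. Averaging over agents then gives a two-dimensional linear recursion for the state mean and the estimate mean, and the sketch tests whether its spectral radius is below one. It covers only the population-mean part of the dynamics, not the paper's full fourth-order model.

```python
import cmath

def mean_dynamics_spectral_radius(a, b, c, h, K, f):
    """Spectral radius of the averaged closed loop under the ASSUMED observer
        x_hat_{k+1} = a*x_hat_k + b*u_k + K*(y_k - c*x_hat_k),  u_k = -f*x_hat_k.
    Averaging over agents (noise terms dropped) gives
        m_{k+1}       = a*m_k - b*f*m_tilde_k
        m_tilde_{k+1} = K*(c + h)*m_k + (a - b*f - K*c)*m_tilde_k
    """
    a11, a12 = a, -b * f
    a21, a22 = K * (c + h), a - b * f - K * c
    tr = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = cmath.sqrt(tr * tr - 4.0 * det)  # may be complex
    return max(abs((tr + disc) / 2.0), abs((tr - disc) / 2.0))

def is_stable(a, b, c, h, K, f):
    """True if the averaged closed loop is asymptotically stable."""
    return mean_dynamics_spectral_radius(a, b, c, h, K, f) < 1.0
```

With h = 0 the eigenvalues reduce to a - b*f and a - K*c, as the separation principle would suggest. A large interference gain h can push the same (K, f) pair outside the stable region, which is the kind of situation in which a common, cooperatively chosen pair becomes necessary.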

Application to wireless communications

In this section, we present an application example for decentralized power control in code division multiple access (CDMA) cellular telephone systems (Aziz and Caines, 2014, Huang et al., 2004, Koskie and Gajic, 2006, Perreau and Anderson, 2006).

Following Tse and Hanly (1999) and Verdú and Shamai (1999), we consider a mobile system network in the context of a large number of users with a signal processing gain assumed to be proportional to 1/N. Let $p_{k,i}^{(m)}$ and $\alpha_{k,i}$ denote, respectively, the

Conclusion

We have studied a class of interference induced games in a system of uniform agents coupled via their distinct sets of partial observations, whereby each agent has noisy measurements of its own state. We have shown that interference-coupled agents can afford to act noncooperatively provided that their individual stability level, as characterized by a quantity called $a_{\mathrm{Nash}}$, is sufficient relative to the signal-to-noise ratio in their observations, and that the number of agents is sufficiently large.

References (40)

  • Caines, P. E., & Kizilkale, A. C. (2013). Recursive estimation of common partially observed disturbances in mfg systems...
  • Caines, P. E., & Kizilkale, A. C. (2014). Mean field estimation for partially observed lqg systems with major and minor...
  • C.Y. Chong et al. Sensor networks: evolution, opportunities, and challenges. Proceedings of the IEEE (2003)
  • D. Grönbaum et al. Modelling social animal aggregations
  • D. Helbing et al. Simulating dynamical features of escape panic. Nature (2000)
  • M. Huang et al. Uplink power adjustment in wireless communication systems: a stochastic control analysis. IEEE Transactions on Automatic Control (2004)
  • Huang, M., Caines, P. E., & Malhamé, R. P. (2006). Distributed multi-agent decision-making with partial observations:...
  • M. Huang et al. Large-population cost-coupled lqg problems with non-uniform agents: individual-mass behavior and decentralized epsilon-nash equilibria. IEEE Transactions on Automatic Control (2007)
  • M. Huang et al. Social optima in mean field lqg control: centralized and decentralized strategies. IEEE Transactions on Automatic Control (2012)
  • M. Huang et al. Nash equilibria for large-population linear stochastic systems of weakly coupled agents

    Mehdi Abedinpour Fallah received the B.Sc. degree in Electrical Engineering from University of Tabriz in 2007 and the M.Sc. degree in Mechanical Engineering from Concordia University, Montreal, Canada, in 2011. He is currently a Ph.D. candidate in the Department of Electrical Engineering at Polytechnique Montreal and a member of GERAD, the Group for Research on Decision Analysis. His current research interests include multi-agent systems, game theory, stochastic optimal control, estimation theory and Kalman filtering.

    Roland P. Malhamé received the Bachelor's, Master's and Ph.D. degrees in Electrical Engineering from the American University of Beirut, the University of Houston, and the Georgia Institute of Technology in 1976, 1978 and 1983 respectively. After single-year stays at University of Quebec and CAE Electronics Ltd (Montreal), he joined École Polytechnique de Montréal in 1985, where he is Professor of Electrical Engineering. In 1994, 2004, and 2012 he was on sabbatical leave respectively with LSS CNRS (France), École Centrale de Paris, and University of Rome Tor Vergata. His interest in statistical mechanics inspired approaches to the analysis and control of large scale systems has led to contributions in the area of aggregate electric load modeling, and to the early developments of the theory of mean field games. His current research interests are in collective decentralized decision making schemes, and the development of mean field based control algorithms in the area of smart grids. From June 2005 to June 2011, he headed GERAD, the Group for Research on Decision Analysis. He is an Associate Editor of International Transactions on Operations Research.

    Francesco Martinelli was born in Rome, Italy, in 1969. He received the Laurea degree cum laude in Electrical Engineering in 1994 and the Ph.D. degree in Computer Science and Automation Engineering in 1998, both from the University of Rome Tor Vergata, Italy, where he is currently an Associate Professor. He was a Visiting Scholar at the Department of Manufacturing Engineering at Boston University, MA, USA from January to September 1997. He was Associate Editor for the IEEE Control System Society Conference Editorial Board from 2002 to 2009. His research interests include mobile robot localization, dynamic scheduling of manufacturing systems, and filtering methods.

    The work of the first two authors was supported by Canada’s NSERC grant 6820_2011. The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Hyeong Soo Chang under the direction of Editor Ian R. Petersen.
