

Topics

This summer school aims to bring together senior and junior researchers to exchange ideas on promising mathematical subjects related to distributed control theory, namely mean field games, the principal-agent problem and reinforcement learning. Leading researchers will give lectures on the well-established theories, while young researchers, such as PhD students and postdoctoral fellows, will have the opportunity to share their results through presentations or posters. In addition, expert practitioners will be invited to present real-world challenges. Before presenting the different mathematical branches covered by this meeting, we first describe an example of application considered by practitioners.

Energy Transition

The European energy sector is undergoing a major transition towards a carbon-free system. This can only be achieved through massive deployment of renewable energy, at a scale that is not sustainable within the present structure of electricity markets, networks and incentives. At the same time, new market mechanisms are being introduced in Europe to guarantee network security under increased penetration of renewable energy and the multiplication of new electricity use patterns (electric vehicles, consumer-producers, demand response). As a result, a renewable producer who in the past sold all its production at a fixed price will, in the future, need to operate in several markets (day-ahead, intraday, balancing, possibly capacity) to ensure the profitability of the plant. These new opportunities for producers create feedback effects on market prices, affecting the business models of the agents and, in fine, network security and renewable penetration. From the point of view of mathematical modelling, it is first of interest to understand how the interacting agents (the numerous renewable producers) reach an equilibrium while optimizing their profits.
The main difficulties are:

• The (dynamic) game involves a large population of agents.
• The optimization involves both the short-term risk of the market and the long-term uncertainty of the climate.
• The optimization problem of each agent is high-dimensional, in particular because of the different markets that need to be taken into account.

Once we understand how the agents interact, we may go further and design the incentives. In the following, we present the mathematical tools that have the potential to treat these problems.

Mean Field Games

The machinery of mean-field games (MFG) appears to be a promising compromise between the complexity of computational agent-based models and the tractability of fully analytic approaches. MFG, introduced by Lasry and Lions [26], are stochastic differential games with an infinite number of identical agents and symmetric interactions. By the ‘law of large numbers’, each agent in a mean-field game can be seen as interacting with the average density of the other agents (the mean field) rather than with each individual agent. This considerably simplifies the resolution of the problem, leading sometimes to explicit solutions and more often to efficient numerical methods (see e.g. [1, 6]) for computing the equilibrium dynamics. Not only do MFG give access to quantitative results for particular models, but their fine analytic properties, such as the long time behavior [12, 14], also shed light on intrinsic qualities of the general setting. In recent years, MFG have been successfully used to model specific sectors of electricity markets, such as electric vehicles [18], demand dispatch [5] and storage [3].
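For orientation, the mean-field limit described above is often written, in the setting of Lasry and Lions [26], as a backward Hamilton-Jacobi-Bellman equation for the value function u of a representative agent coupled with a forward Fokker-Planck equation for the density m of the population. The notation below (Hamiltonian H, running cost f, terminal cost g, noise level ν, horizon T, initial density m_0) is assumed here for illustration and is not fixed by the text above:

```latex
\begin{aligned}
-\partial_t u - \nu \Delta u + H(x, \nabla u) &= f(x, m) && \text{in } (0,T) \times \mathbb{R}^d, \\
\partial_t m - \nu \Delta m - \operatorname{div}\bigl(m\, \partial_p H(x, \nabla u)\bigr) &= 0 && \text{in } (0,T) \times \mathbb{R}^d, \\
u(T, x) = g\bigl(x, m(T)\bigr), \qquad m(0, x) &= m_0(x).
\end{aligned}
```

An equilibrium is a pair (u, m) solving both equations at once: each agent optimizes against the flow m, and m is in turn the flow generated by the agents' optimal feedback.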
While the assumption of identical agents and symmetric interactions is too restrictive for applications involving heterogeneous agents, recent developments in the MFG literature, such as mean field games with common noise [15] and MFG with major and minor players [8, 17], have opened the way to more realistic models with, for example, explicit modelling of a large historical producer having market power. Other recent generalizations of MFG are both important and interesting. For example, the authors of [10, 16, 27] study MFG of optimal stopping, which can be used to describe technology transition and the entry/exit decisions of players, and rank-based MFG are introduced in [7, 28]. Last but not least, the progress on learning in MFG is noteworthy. In most of the MFG literature, all players are assumed to be rational, while it is important to understand how an MFG equilibrium can be attained among partially rational players. Though the theory of learning is well established in the context of static games, its counterpart in MFG has only started to attract attention, see e.g. [11, 13], and we believe the topic could become popular in the near future.

Principal-Agent Problem

The so-called Principal-Agent problem is the study of optimizing incentives. In particular, the optimal contracting problem between the two parties, Principal and Agent(s), is said to involve moral hazard when the Agent’s effort is not observable by the Principal. In applications related to the decentralized management of social welfare, one can consider the social planner as the Principal and the individual players as the Agents. The first paper on continuous-time Principal-Agent problems is by Holmstrom and Milgrom [23], where the Principal models the rational Agent’s behavior as a controlled Itô process and chooses the optimal contract (incentive) based on the Agent’s optimal response under the model.
In a more recent work by Sannikov [29], the Agent is allowed to retire at a random time (say, to embrace a new technology), and a general machinery is introduced to solve this type of problem through dynamic programming. This machinery became better understood in Cvitanić, Possamaï and Touzi [19], where the authors observe that the pair formed by the contract and the Agent’s response can be represented by the solution to a backward stochastic differential equation. In recent years, applications of the Principal-Agent problem have appeared in pricing problems for the electricity market, see e.g. [2, 4]. In the context of the summer school, we are particularly interested in optimizing the incentives for a large population. It turns out that combining MFG and the Principal-Agent problem is a natural strategy to tackle this problem, see [21], and there are already applications to optimal energy demand response management [20].

Reinforcement Learning

In most applications, we eventually need to solve numerically a high-dimensional optimization problem, for which the curse of dimensionality is the crucial obstacle. The approach of reinforcement learning offers a way out. In his seminal book [9], Bertsekas systematically introduced the machinery of reinforcement learning in the context of dynamic programming. Loosely speaking, the Bellman equation characterizes the value function of the control problem in a variational form, to which we may apply the Monte Carlo method in order to update a numerical approximation of the value function, typically parametrized by a neural network. This machinery apparently goes beyond the scope of dynamic programming. For example, in the seminal work of Jordan, Kinderlehrer and Otto [25], the authors give a variational formulation of the Fokker-Planck equation (the gradient flow) and, as a result, one may use the methods of reinforcement learning to compute the (stationary) solution to the Fokker-Planck equation.
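The Bellman-based Monte Carlo update mentioned above can be illustrated by a minimal sketch. Everything in it is an assumption made for illustration: the toy problem (a walker on six states who pays a cost of 1 per step until reaching the last state), the names `step` and `mc_value_iteration`, and the use of a plain table in place of the neural network parametrization mentioned in the text. The expectation in the Bellman operator is replaced by an average over sampled transitions.

```python
import random

# Hypothetical toy control problem, not taken from the text.
N_STATES = 6        # states 0..5; state 5 is the terminal goal
ACTIONS = (-1, +1)  # move left or move right
GAMMA = 0.9         # discount factor
SLIP = 0.2          # probability that the chosen move is reversed


def step(state, action, rng):
    """Sample one transition and return (next_state, reward)."""
    move = -action if rng.random() < SLIP else action
    next_state = min(max(state + move, 0), N_STATES - 1)
    return next_state, -1.0  # each step costs 1


def mc_value_iteration(n_sweeps=50, n_samples=200, seed=0):
    """Value iteration where the Bellman expectation is a Monte Carlo average.

    Returns the value table and the greedy action in each non-terminal state.
    """
    rng = random.Random(seed)
    values = [0.0] * N_STATES
    policy = [ACTIONS[0]] * (N_STATES - 1)
    for _ in range(n_sweeps):
        new_values = values[:]
        for s in range(N_STATES - 1):  # the terminal state keeps value 0
            q_estimates = []
            for a in ACTIONS:
                total = 0.0
                for _ in range(n_samples):
                    nxt, reward = step(s, a, rng)
                    total += reward + GAMMA * values[nxt]
                q_estimates.append(total / n_samples)
            best = max(q_estimates)
            new_values[s] = best
            policy[s] = ACTIONS[q_estimates.index(best)]
        values = new_values
    return values, policy


if __name__ == "__main__":
    values, policy = mc_value_iteration()
    print("estimated values:", [round(v, 2) for v in values])
    print("greedy policy:", policy)
```

In a high-dimensional problem the table would be replaced by a parametric approximation fitted to the same sampled Bellman targets, which is where the neural network parametrization enters.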
Recent papers also consider applications of deep learning to stochastic optimal control [22, 24]. To our knowledge, there are also groups of researchers working on similar applications to MFG.

References

[1] Y. Achdou and I. Capuzzo-Dolcetta. Mean field games: Numerical methods. SIAM J. Numer. Anal., 48(3):1136–1162, 2010.
[2] R. Aïd, D. Possamaï, and N. Touzi. Optimal electricity demand response contracting with responsiveness incentives. preprint arXiv:1810.09063, 2018.
[3] C. Alasseur, I. Ben Tahar, and A. Matoussi. An extended mean field game for storage in smart grids. preprint arXiv:1710.08991, 2017.
[4] C. Alasseur, I. Ekeland, R. Elie, N. Hernández Santibáñez, and D. Possamaï. An adverse selection approach to power pricing. preprint arXiv:1706.01934, 2017.
[5] F. Bagagiolo and D. Bauso. Mean-field games and dynamic demand management in power grids. Dyn Games Appl, 4(2):155–176, June 2014.
[6] E. Bayraktar, A. Budhiraja, and A. Cohen. A numerical scheme for a mean field game in some queueing systems based on Markov chain approximation method. SIAM J. Control Optim., 56(6):4017–4044, 2018.
[7] E. Bayraktar and Y. Zhang. A rank-based mean field game in the strong formulation. Electron. Commun. Probab., 21(72):1–12, 2016.
[8] A. Bensoussan, H. M. Chau, and S. C. P. Yam. Mean field games with a dominating player. Appl Math Optim, 74(1):91–128, August 2016.
[9] D. P. Bertsekas. Dynamic Programming and Optimal Control Vol. II. Athena Scientific, 4th edition, 2012.
[10] C. Bertucci. Optimal stopping in mean field games, an obstacle problem approach. Journal de Mathématiques Pures et Appliquées, 120:165–194, December 2017.
[11] P. Cardaliaguet and S. Hadikhanloo. Learning in mean field games: The fictitious play. ESAIM: COCV, 23(2):569–591, 2017.
[12] P. Cardaliaguet, J.-M. Lasry, P.-L. Lions, and A. Porretta. Long time average of mean field games. Networks and Heterogeneous Media, 7(2):279–301, 2012.
[13] P. Cardaliaguet and C.-A. Lehalle. Mean field game of controls and an application to trade crowding. Math Finan Econ, 12(3):335–363, 2018.
[14] P. Cardaliaguet and A. Porretta. Long time behavior of the master equation in mean field game theory. Anal. PDE, 12(6):1397–1453, 2019.
[15] R. Carmona, F. Delarue, and D. Lacker. Mean field games with common noise. Ann. Probab., 44(6):3740–3803, 2016.
[16] R. Carmona, F. Delarue, and D. Lacker. Mean field games of timing and models for bank runs. Appl Math Optim, 76(1):217–260, August 2017.
[17] R. Carmona and X. Zhu. A probabilistic approach to mean field games with major and minor players. Ann. Appl. Probab., 26(3):1535–1580, 2016.
[18] R. Couillet, S. M. Perlaza, H. Tembine, and M. Debbah. A mean field game analysis of electric vehicles in the smart grid. Proceedings IEEE INFOCOM Workshops, pages 79–84, March 2012.
[19] J. Cvitanić, D. Possamaï, and N. Touzi. Dynamic programming approach to principal–agent problems. Finance and Stochastics, 22(1):1–37, January 2018.
[20] R. Elie, E. Hubert, T. Mastrolia, and D. Possamaï. Mean-field moral hazard for optimal energy demand response management. preprint arXiv:1902.10405, 2019.
[21] R. Elie, T. Mastrolia, and D. Possamaï. A tale of a principal and many, many agents. Mathematics of Operations Research, to appear.
[22] J. Han and W. E. Deep learning approximation for stochastic control problems. Deep Reinforcement Learning Workshop, NIPS, 2016.
[23] B. Holmstrom and P. Milgrom. Aggregation and linearity in the provision of intertemporal incentives. Econometrica, 55(2):303–328, March 1987.
[24] C. Huré, H. Pham, A. Bachouch, and N. Langrené. Deep neural networks algorithms for stochastic control problems on finite horizon, part I: convergence analysis. preprint arXiv:1812.04300, 2018.
[25] R. Jordan, D. Kinderlehrer, and F. Otto. The variational formulation of the Fokker-Planck equation. SIAM Journal on Mathematical Analysis, 29(1):1–17, January 1998.
[26] J.-M. Lasry and P.-L. Lions. Mean field games. Jpn. J. Math., 2(1):229–260, 2007.
[27] M. Nutz. A mean field game of optimal stopping. SIAM Journal on Control and Optimization, 56(2):1206–1221, 2018.
[28] M. Nutz and Y. Zhang. A mean field competition. Mathematics of Operations Research, to appear.
[29] Y. Sannikov. A continuous-time version of the principal-agent problem. The Review of Economic Studies, 75(3):957–984, 2008.