Pricing Scheme Based Nash Q-Learning Flow Control for Multi-User Network
To address congestion in high-speed networks with multiple users, a pricing-scheme-based Nash Q-learning flow controller is proposed. The setting is a network with a single service provider and several non-cooperative users. The pricing scheme is built into the reward function used in the Q-learning process. Because high-speed networks are uncertain and highly time-varying, complete information about them is hard to obtain accurately. Nash Q-learning, which does not depend on a mathematical model of the network, is well suited to this setting: it obtains the Nash Q-values through trial and error and interaction with the environment, and uses them to improve its behavior policy. Through this learning process, the proposed controller learns to take the best actions for regulating source flow while maintaining a high quality of service. Simulation results show that the proposed controller improves network performance and effectively avoids congestion.
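The learning loop described above can be illustrated with a minimal sketch. It is not the authors' controller: the two-user setup, the discrete rate set, the queue-bucket state, the linear congestion price, the logarithmic utility, the toy queue dynamics, and all constants are assumptions made for illustration. The stage-game equilibrium is also simplified to a pure-strategy search with a fallback, whereas Nash Q-learning in general allows mixed-strategy equilibria.

```python
import itertools
import math
import random

# All names and constants below are hypothetical, chosen only for illustration.
RATES = [1.0, 2.0, 3.0]   # discrete sending rates available to each user
N_STATES = 4              # queue-occupancy buckets at the bottleneck
CAPACITY = 4.0            # service rate of the shared link
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2


def price(state):
    """Assumed congestion price per unit rate: grows with queue occupancy."""
    return 0.2 + 0.4 * state


def reward(rate, state):
    """Assumed reward: log utility of the sending rate minus the charge paid."""
    return math.log(1.0 + rate) - price(state) * rate


def next_state(state, a1, a2):
    """Toy queue dynamics: the bucket rises when offered load exceeds capacity."""
    load = RATES[a1] + RATES[a2]
    if load > CAPACITY:
        return min(state + 1, N_STATES - 1)
    if load < CAPACITY:
        return max(state - 1, 0)
    return state


def pure_nash(q1, q2, state):
    """Return a pure-strategy Nash equilibrium (a1, a2) of the stage game.

    If none exists, fall back to the joint action with the largest summed
    Q-value. This fallback is a simplification, not part of Nash Q-learning.
    """
    n = len(RATES)
    actions = list(itertools.product(range(n), repeat=2))
    for a1, a2 in actions:
        best1 = all(q1[state][a1][a2] >= q1[state][b][a2] for b in range(n))
        best2 = all(q2[state][a1][a2] >= q2[state][a1][b] for b in range(n))
        if best1 and best2:
            return a1, a2
    return max(actions, key=lambda a: q1[state][a[0]][a[1]] + q2[state][a[0]][a[1]])


def train(steps=5000, seed=0):
    """Run the simplified two-user Nash-Q update and return both Q-tables."""
    rng = random.Random(seed)
    n = len(RATES)
    # Each user keeps Q-values indexed by state and the joint action (a1, a2).
    q1 = [[[0.0] * n for _ in range(n)] for _ in range(N_STATES)]
    q2 = [[[0.0] * n for _ in range(n)] for _ in range(N_STATES)]
    state = 0
    for _ in range(steps):
        if rng.random() < EPSILON:          # explore: random joint action
            a1, a2 = rng.randrange(n), rng.randrange(n)
        else:                               # exploit: current equilibrium
            a1, a2 = pure_nash(q1, q2, state)
        r1, r2 = reward(RATES[a1], state), reward(RATES[a2], state)
        nxt = next_state(state, a1, a2)
        n1, n2 = pure_nash(q1, q2, nxt)     # equilibrium of the next stage game
        q1[state][a1][a2] += ALPHA * (r1 + GAMMA * q1[nxt][n1][n2] - q1[state][a1][a2])
        q2[state][a1][a2] += ALPHA * (r2 + GAMMA * q2[nxt][n1][n2] - q2[state][a1][a2])
        state = nxt
    return q1, q2
```

Calling `train()` returns the two learned Q-tables, and `pure_nash(q1, q2, state)` then gives the joint rate choice the sketch would select in a given queue state. Because the price rises with occupancy, sending at a high rate is rewarded less as the queue fills, which is the mechanism through which a pricing term in the reward can discourage congestion.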
X. Li and H. B. Yu, "Pricing Scheme Based Nash Q-Learning Flow Control for Multi-User Network", Key Engineering Materials, Vols. 467-469, pp. 847-852, 2011