An Intelligent Routing Algorithm in Wireless Sensor Networks Based on Reinforcement Learning

Wen Jing Guo; Cai Rong Yan; Yang Lan Gan; Ting Lu

doi:10.4028/www.scientific.net/AMM.678.487


Abstract:

Lifetime enhancement has been a hot issue in Wireless Sensor Networks (WSNs). To prolong the network lifetime of WSNs, this paper proposes an intelligent routing algorithm named RLLO. RLLO exploits the strengths of reinforcement learning (RL) and defines its reward function in terms of residual energy and hop count. It aims to distribute energy consumption uniformly and improve packet delivery without additional cost. The proposed algorithm is compared with Energy Aware Routing (EAR) and improved EAR (I-EAR). Simulation results show that RLLO achieves a significant improvement in network lifetime and packet delivery over both algorithms.

Info:

Periodical: Applied Mechanics and Materials (Volume 678)

Pages: 487-493

DOI: https://doi.org/10.4028/www.scientific.net/AMM.678.487

Online since: October 2014

Authors: Wen Jing Guo*, Cai Rong Yan, Yang Lan Gan, Ting Lu

Keywords: Intelligent Routing, Network Lifetime, Packet Delivery, Reinforcement Learning (RL), Wireless Sensor Network (WSN)

* - Corresponding Author

References

Step 1. Initially, a local flood is initiated from the sink node. In this way, each node obtains its hop count to the sink and the relevant information about each neighbor, including location, residual energy, and hop count. The Q-value of each neighboring node is then initialized from its current energy and hop count. The node id, location, residual energy, hop count, and Q-value are recorded in the neighbor table.


Step 2. Upon receiving or overhearing a packet, each node i extracts the sending neighbor's information and records or updates the corresponding entry in its local neighbor table.


Step 3. If the received packet is a data packet and its Next Forwarder field shows that the node is not the designated forwarder, the node simply drops the packet.


Step 4. Otherwise, the eligible node i selects a node as the next hop. If the sink node is within its communication range, it sends the packet to the sink. If not, it calculates the Q-value associated with each of its candidate neighbors by Eq. 2. Each candidate j must satisfy the following requirements:
(a) It is in the communication range of node i.
(b) Its residual energy is not below the threshold, and it is not an isolated node.
(c) It is closer to the sink than node i.
(d) The distance between nodes i and j is shorter than the distance from node i to the sink.
(e) Node i has enough energy to send the packet to node j.


In Eq. 2, α is the learning rate, and r_ij denotes the reward for sending a packet from node i to node j, which is computed by Eq. 3.


In Eq. 3, E_j is the residual energy of node j, and h_j is the hop count from node j to the sink. As Eq. 3 shows, the reward for sending a packet is determined by two factors: a neighboring node with more residual energy and a smaller hop count earns a larger reward. Residual energy is included to distribute energy consumption uniformly, and the hop count to the sink is included to improve packet delivery.


Step 5. The candidate node with the highest Q-value is chosen as the next forwarder.


Step 6. If the node cannot find a next forwarder, there are two cases. (a) If the node has enough energy to send directly to the sink, it sends the packet to the sink. (b) Otherwise, the node drops the packet and becomes an isolated node. The first measure addresses an issue analogous to the void problem in geographic routing. With the second measure, isolated nodes are no longer considered when choosing a next forwarder, which improves the efficiency of path selection.


Step 7. If a next forwarder k has been found, node i updates its Q-value and hop count by Eq. 4 and Eq. 5.


(Eq. 4 and Eq. 5 are not recoverable from the source text.)

Step 8. Before sending the packet to node k, node i replaces the old metadata in the packet header with its own information: its identity, residual energy, hop count, Q-value, and the next forwarder chosen in step 7.

Experiment and Discussion

To validate the proposed algorithm RLLO, we carry out simulations in NS2 and compare RLLO with EAR and I-EAR in terms of network lifetime and packet delivery. EAR is an appropriate baseline because it is a typical data-centric routing algorithm with its own inherent advantages, and it has been shown in [5] to be markedly more energy-efficient than the other data-centric routing algorithms in WSNs. To validate RLLO further, we also compare it with I-EAR, a more recent, improved version of EAR.

Experiment Setting. The simulated network consists of 100 sensor nodes, each with 2 J of initial energy, randomly distributed over a 100 m × 100 m field. One sink node is deployed at the center of the network. Each node can communicate with neighbors within a radius of 30 m. In the simulation, one packet is generated every 5 seconds.

Experiment Results. Under the same setting, we evaluate each metric for each algorithm over ten rounds and report the averages. We first examine network lifetime. Table 1 shows the average time at which the first node dies under each of the three algorithms. On average, RLLO yields a 174% longer lifetime than EAR and a 62% longer lifetime than I-EAR.

Table 1. The time when the first node dies

Routing algorithm    Time when the first sensor node dies (sec)
EAR                  62
I-EAR                105
RLLO                 170
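The lifetime percentages quoted for Table 1 can be checked directly from the reported first-node-death times. The short Python sketch below does only that arithmetic; the helper name `lifetime_gain_pct` is illustrative and does not come from the paper.

```python
# First-node-death times in seconds, as reported in Table 1.
first_death_s = {"EAR": 62, "I-EAR": 105, "RLLO": 170}


def lifetime_gain_pct(new: float, baseline: float) -> float:
    """Relative increase of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


gain_over_ear = lifetime_gain_pct(first_death_s["RLLO"], first_death_s["EAR"])
gain_over_iear = lifetime_gain_pct(first_death_s["RLLO"], first_death_s["I-EAR"])

print(f"RLLO vs EAR:   {gain_over_ear:.0f}%")   # 174%
print(f"RLLO vs I-EAR: {gain_over_iear:.0f}%")  # 62%
```

Both values agree with the figures stated in the text once rounded to the nearest percent.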

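As a rough illustration of the next-hop selection described in steps 4 to 6, the following Python sketch filters candidates by requirements (a) to (e), picks the one with the highest Q-value, and falls back to direct delivery or isolation. It rests on several assumptions that are not taken from the paper: the `Node` class, the `choose_next_hop` function, the `SINK_ID` marker, and the quadratic `tx_cost` energy model are all hypothetical, and each neighbor's `q_value` is supplied as an input because Eq. 2 is not available in this text.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional

SINK_ID = -1  # hypothetical marker meaning "deliver directly to the sink"


@dataclass
class Node:
    node_id: int
    x: float
    y: float
    energy: float          # residual energy in joules
    q_value: float = 0.0   # stand-in for the value Eq. 2 would produce
    isolated: bool = False


def dist(a: Node, b: Node) -> float:
    """Euclidean distance between two nodes."""
    return math.hypot(a.x - b.x, a.y - b.y)


def choose_next_hop(node: Node, neighbors: List[Node], sink: Node,
                    comm_range: float, energy_threshold: float,
                    tx_cost: Callable[[float], float]) -> Optional[int]:
    """Return the next forwarder's id, SINK_ID, or None if the packet is dropped."""
    d_sink = dist(node, sink)
    if d_sink <= comm_range:                  # step 4: sink is in range
        return SINK_ID
    candidates = [
        j for j in neighbors
        if dist(node, j) <= comm_range                        # (a)
        and j.energy >= energy_threshold and not j.isolated   # (b)
        and dist(j, sink) < d_sink                            # (c)
        and dist(node, j) < d_sink                            # (d)
        and node.energy >= tx_cost(dist(node, j))             # (e)
    ]
    if candidates:                            # step 5: highest Q-value wins
        return max(candidates, key=lambda j: j.q_value).node_id
    if node.energy >= tx_cost(d_sink):        # step 6(a): send directly
        return SINK_ID
    node.isolated = True                      # step 6(b): drop and isolate
    return None


# Usage with made-up values; only the 30 m range comes from the experiment setting.
sink = Node(SINK_ID, 50.0, 50.0, energy=math.inf)
me = Node(0, 0.0, 0.0, energy=1.0)
neighbors = [
    Node(1, 10.0, 10.0, energy=1.5, q_value=0.4),
    Node(2, 20.0, 15.0, energy=0.8, q_value=0.7),
    Node(3, 5.0, 25.0, energy=0.01, q_value=0.9),  # below the energy threshold
]
cost = lambda d: 1e-4 * d * d  # assumed energy model, not from the paper
next_hop = choose_next_hop(me, neighbors, sink, comm_range=30.0,
                           energy_threshold=0.05, tx_cost=cost)
print(next_hop)  # 2
```

Node 3 has the highest Q-value but fails requirement (b), so node 2 is selected. The Q-value and hop-count updates of step 7 are omitted because Eq. 4 and Eq. 5 are not available here.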