TY - JOUR
T1 - Joint Transmit and Jamming Power Optimization for Secrecy in Energy Harvesting Networks
T2 - A Reinforcement Learning Approach
AU - Tripathi, Shalini
AU - Kundu, Chinmoy
AU - Yadav, Animesh
AU - Bansal, Ankur
AU - Claussen, Holger
AU - Ho, Lester
N1 - Publisher Copyright:
© 1967-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - In this paper, we address the problem of jointly allocating transmit and jamming power at the source and destination, respectively, to enhance the long-term cumulative secrecy performance of an energy-harvesting wireless communication system until it stops functioning in the presence of an eavesdropper. The source and destination have energy-harvesting devices with limited battery capacities. The destination also has a full-duplex transceiver to transmit jamming signals for secrecy. We frame the problem as an infinite-horizon Markov decision process (MDP) and propose a reinforcement learning (RL)-based optimal joint power allocation (OJPA) algorithm that employs policy iteration (PI). Since the optimal algorithm is computationally expensive, we develop a low-complexity sub-optimal joint power allocation (SJPA) algorithm, namely, reduced state joint power allocation (RSJPA). Two other SJPA algorithms, the greedy algorithm (GA) and the naive algorithm (NA), are implemented as benchmarks. In addition, the OJPA algorithm outperforms the individual power allocation (IPA) algorithms, termed individual transmit power allocation (ITPA) and individual jamming power allocation (IJPA), in which the transmit and jamming powers, respectively, are optimized individually. The results show that the OJPA algorithm is also more energy efficient and that it significantly improves the secrecy performance compared to all SJPA algorithms. The OJPA algorithm also outperforms a genetic algorithm-based RL algorithm and a finite-horizon RL algorithm in secrecy performance. The proposed RSJPA algorithm achieves nearly optimal performance with significantly less computational complexity, making it a balanced choice between complexity and performance. We find that the computational time for the RSJPA algorithm, which considers only 50 percent of the total number of states, is around 75 percent less than that of the OJPA algorithm.
AB - In this paper, we address the problem of jointly allocating transmit and jamming power at the source and destination, respectively, to enhance the long-term cumulative secrecy performance of an energy-harvesting wireless communication system until it stops functioning in the presence of an eavesdropper. The source and destination have energy-harvesting devices with limited battery capacities. The destination also has a full-duplex transceiver to transmit jamming signals for secrecy. We frame the problem as an infinite-horizon Markov decision process (MDP) and propose a reinforcement learning (RL)-based optimal joint power allocation (OJPA) algorithm that employs policy iteration (PI). Since the optimal algorithm is computationally expensive, we develop a low-complexity sub-optimal joint power allocation (SJPA) algorithm, namely, reduced state joint power allocation (RSJPA). Two other SJPA algorithms, the greedy algorithm (GA) and the naive algorithm (NA), are implemented as benchmarks. In addition, the OJPA algorithm outperforms the individual power allocation (IPA) algorithms, termed individual transmit power allocation (ITPA) and individual jamming power allocation (IJPA), in which the transmit and jamming powers, respectively, are optimized individually. The results show that the OJPA algorithm is also more energy efficient and that it significantly improves the secrecy performance compared to all SJPA algorithms. The OJPA algorithm also outperforms a genetic algorithm-based RL algorithm and a finite-horizon RL algorithm in secrecy performance. The proposed RSJPA algorithm achieves nearly optimal performance with significantly less computational complexity, making it a balanced choice between complexity and performance. We find that the computational time for the RSJPA algorithm, which considers only 50 percent of the total number of states, is around 75 percent less than that of the OJPA algorithm.
KW - Energy harvesting
KW - full-duplex
KW - Markov decision process
KW - physical layer security
KW - policy iteration
KW - reinforcement learning
UR - https://www.scopus.com/pages/publications/105013055200
U2 - 10.1109/TVT.2025.3597089
DO - 10.1109/TVT.2025.3597089
M3 - Article
AN - SCOPUS:105013055200
SN - 0018-9545
JO - IEEE Transactions on Vehicular Technology
JF - IEEE Transactions on Vehicular Technology
ER -