Efficient routing of information packets in dynamically changing communication networks requires routing policies that adapt to changes in load levels, traffic patterns and network topologies. Reinforcement Learning (RL) is an area of artificial intelligence that studies algorithms that dynamically optimize their performance based on experience in an environment. RL, thus,...