Independent Q-Learning

Independent Q-learning is a decentralized multi-agent reinforcement learning approach in which each agent learns its own Q-function and policy from its own experience, treating the other agents as part of the environment rather than communicating or coordinating with them directly. Because all agents learn at the same time, each one faces a non-stationary environment, so the convergence guarantees of single-agent Q-learning do not carry over automatically. Current research focuses on improving the convergence and efficiency of independent learning, particularly in cooperative settings, through techniques like smoothed best-response dynamics, alternate Q-learning updates, and novel reward structures that address credit assignment problems. Because no agent has to represent the joint action space, the approach scales better than centralized methods, which makes it relevant for large-scale multi-agent systems in domains such as traffic control, swarm robotics, and distributed wireless networks.
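As a rough illustration of the idea, the sketch below runs two independent learners on a repeated cooperative matrix game. Everything in it is an assumption made for the example: the payoff table `PAYOFF`, the function names `train_iql` and `greedy_actions`, and the hyperparameters are hypothetical and not taken from any particular paper. Because the game has a single state, the Q-learning update is simplified to its stateless form, with no bootstrapped next-state term.

```python
import random

# Hypothetical shared payoff for a 2x2 cooperative game:
# PAYOFF[a0][a1] is the reward both agents receive.
PAYOFF = [[1.0, 0.0],
          [0.0, 0.5]]


def greedy_actions(q):
    """Return each agent's greedy action under its own Q-table."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


def train_iql(episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Train two independent, stateless Q-learners (illustrative only)."""
    rng = random.Random(seed)
    # One Q-table per agent, indexed only by that agent's OWN action.
    # Neither agent observes or models the other's action.
    q = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(episodes):
        greedy = greedy_actions(q)
        actions = []
        for i in range(2):
            # Epsilon-greedy exploration, decided separately by each agent.
            if rng.random() < epsilon:
                actions.append(rng.randrange(2))
            else:
                actions.append(greedy[i])
        reward = PAYOFF[actions[0]][actions[1]]
        for i in range(2):
            a = actions[i]
            # Stateless Q-learning update using only the agent's own
            # action and the shared reward.
            q[i][a] += alpha * (reward - q[i][a])
    return q


if __name__ == "__main__":
    q_tables = train_iql()
    print("Q-tables:", q_tables)
    print("Greedy joint action:", greedy_actions(q_tables))
```

Each agent's estimate for an action depends on what the other agent happens to be doing at the time, which is the non-stationarity that makes convergence hard to guarantee. Which joint action the learners settle on depends on initialization and exploration, so a different payoff table or seed may lead to a suboptimal equilibrium.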

Papers