Preprint [2023]

Heterogeneous Multi-Robot Reinforcement Learning

xeuron.com/p/heterogeneous-multi-robot-reinforcement-learning

AI Summary

Cooperative multi-robot tasks can benefit from heterogeneity in the robots' physical and behavioral traits. In spite of this, traditional Multi-Agent Reinforcement Learning (MARL) frameworks lack the ability to explicitly accommodate policy heterogeneity, and typically constrain agents to share neural network parameters. This enforced homogeneity limits application in cases where the tasks benefit from heterogeneous behaviors. In this paper, we crystallize the role of heterogeneity in MARL policies. Towards this end, we introduce Heterogeneous Graph Neural Network Proximal Policy Optimization (HetGPPO), a paradigm for training heterogeneous MARL policies that leverages a Graph Neural Network for differentiable inter-agent communication. HetGPPO allows communicating agents to learn heterogeneous behaviors while enabling fully decentralized training in partially observable environments. We complement this with a taxonomical overview that exposes more heterogeneity classes than previously identified. To motivate the need for our model, we present a characterization of techniques that homogeneous models can leverage to emulate heterogeneous behavior, and show how this "apparent heterogeneity" is brittle in real-world conditions. Through simulations and real-world experiments, we show that: (i) when homogeneous methods fail due to strong heterogeneous requirements, HetGPPO succeeds, and (ii) when homogeneous methods are able to learn apparently heterogeneous behaviors, HetGPPO achieves higher resilience to both training and deployment noise.
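
The contrast between parameter sharing and per-agent parameters can be illustrated with a small sketch. This is not the HetGPPO implementation: the layer shapes, the mean aggregation, and the names `init_params` and `act` are all assumptions made for illustration, and no training is performed. It only shows that a team sharing one parameter set must produce identical actions when its agents receive identical inputs, whereas agents with their own parameters need not.

```python
import numpy as np

def init_params(rng, obs_dim, hidden_dim, act_dim):
    """Random weights for one tiny policy (hypothetical layout, not from the paper)."""
    return {
        "enc": rng.normal(scale=0.1, size=(obs_dim, hidden_dim)),
        "msg": rng.normal(scale=0.1, size=(hidden_dim, hidden_dim)),
        "out": rng.normal(scale=0.1, size=(2 * hidden_dim, act_dim)),
    }

def act(params_per_agent, obs, adjacency):
    """One round of mean-aggregated message passing followed by an action head.

    params_per_agent: list with one parameter dict per agent. Repeating the
    same dict gives a parameter-shared (homogeneous) team; distinct dicts
    give a heterogeneous team.
    obs: array of shape (n_agents, obs_dim).
    adjacency: 0/1 array of shape (n_agents, n_agents); adjacency[i, j] == 1
    means agent i receives a message from agent j.
    """
    n = obs.shape[0]
    # Each agent encodes its own local observation.
    hidden = np.stack(
        [np.tanh(obs[i] @ params_per_agent[i]["enc"]) for i in range(n)]
    )
    # Each agent computes the message it sends, using its own weights.
    messages = np.stack(
        [hidden[i] @ params_per_agent[i]["msg"] for i in range(n)]
    )
    # Receivers average the messages of their neighbours.
    degree = np.maximum(adjacency.sum(axis=1, keepdims=True), 1)
    aggregated = adjacency @ messages / degree
    # The action depends on the agent's own encoding and the aggregate.
    return np.stack([
        np.tanh(
            np.concatenate([hidden[i], aggregated[i]])
            @ params_per_agent[i]["out"]
        )
        for i in range(n)
    ])

rng = np.random.default_rng(0)
n_agents, obs_dim, hidden_dim, act_dim = 3, 4, 8, 2

# Fully connected communication graph without self loops.
adjacency = np.ones((n_agents, n_agents)) - np.eye(n_agents)
# Every agent receives the same observation.
obs = np.tile(rng.normal(size=(1, obs_dim)), (n_agents, 1))

shared = init_params(rng, obs_dim, hidden_dim, act_dim)
homogeneous_team = [shared] * n_agents
heterogeneous_team = [
    init_params(rng, obs_dim, hidden_dim, act_dim) for _ in range(n_agents)
]

homogeneous_actions = act(homogeneous_team, obs, adjacency)
heterogeneous_actions = act(heterogeneous_team, obs, adjacency)
```

In this sketch, the rows of `homogeneous_actions` coincide because every agent applies the same weights to the same inputs, while the rows of `heterogeneous_actions` generally differ. A parameter-shared team can therefore only behave differently across agents when their inputs differ.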
