Suboptimal Demonstration
Research on suboptimal demonstrations in machine learning aims to improve algorithms' ability to learn effectively from imperfect or incomplete training data, a common scenario in real-world applications. Current work emphasizes robust algorithms, often built on techniques such as inverse reinforcement learning, contrastive learning, and actor-critic methods, that filter out noise, prioritize high-quality segments of demonstrations, and learn from diverse data sources that include both expert and suboptimal examples. This research is important for making machine learning practical: it enables more efficient and reliable training of robots, AI agents, and other systems in situations where perfect data is unavailable or costly to obtain.
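As a rough illustration of what "prioritizing high-quality segments" can mean in practice, the sketch below splits one demonstration into fixed-length segments, scores each segment, and keeps only the best-scoring fraction. It is not the method of any paper listed here. The function names, the fixed segment length, and the use of mean per-step reward as a quality score are all assumptions made for this example; real systems often have no reward signal and must estimate segment quality some other way.

```python
import math
from typing import List, Sequence, Tuple

Segment = Tuple[int, int]  # (start, end) indices, end exclusive


def split_segments(n_steps: int, seg_len: int) -> List[Segment]:
    """Cover a trajectory of n_steps with consecutive fixed-length segments."""
    if seg_len <= 0:
        raise ValueError("seg_len must be positive")
    return [(s, min(s + seg_len, n_steps)) for s in range(0, n_steps, seg_len)]


def select_top_segments(
    rewards: Sequence[float], seg_len: int, keep_fraction: float
) -> List[Segment]:
    """Keep the highest-scoring fraction of segments (hypothetical helper).

    Assumption: per-step rewards are available and their mean is a
    reasonable proxy for segment quality.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    segments = split_segments(len(rewards), seg_len)
    if not segments:
        return []
    # Score each segment by its mean per-step reward.
    scored = [(sum(rewards[s:e]) / (e - s), (s, e)) for s, e in segments]
    n_keep = max(1, math.ceil(keep_fraction * len(scored)))
    # sorted() is stable, so ties keep their original (temporal) order.
    best = sorted(scored, key=lambda item: item[0], reverse=True)[:n_keep]
    # Return the kept segments in temporal order.
    return sorted(seg for _, seg in best)


if __name__ == "__main__":
    demo_rewards = [1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
    print(select_top_segments(demo_rewards, seg_len=2, keep_fraction=0.5))
```

The kept segments could then be used as the training set for an imitation learner, while the discarded ones are dropped or down-weighted. Which of those is better depends on how noisy the quality score is.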
Papers
Inferring Preferences from Demonstrations in Multi-objective Reinforcement Learning
Junlin Lu, Patrick Mannion, Karl Mason
Towards Effective Utilization of Mixed-Quality Demonstrations in Robotic Manipulation via Segment-Level Selection and Optimization
Jingjing Chen, Hongjie Fang, Hao-Shu Fang, Cewu Lu