Manipulator related RL papers

  • T. Haarnoja, et al., "Soft Actor-Critic Algorithms and Applications", arXiv:1812.05905, 2018. [Paper] [Site]
    • Env(real): Quadrupedal locomotion, Dexterous hand manipulation
    • Algorithm: SAC, DDPG, TRPO, TD3, PPO
  • A. R. Mahmood, et al., "Setting up a Reinforcement Learning Task with a Real-World Robot", arXiv:1803.07067, 2018. [Paper] [Video]
    • Env(real): UR5 Reacher 6D(6-DOF)
    • Task: reaching
    • Algorithm: TRPO
  • X. B. Peng, et al., "Sim-to-Real Transfer of Robotic Control with Dynamics Randomization", ICRA, 2018. [Paper]
    • Env(real): 7-DOF Fetch Robotics arm
    • Env(sim): MuJoCo model(customized)
    • Task: puck pushing
    • Algorithm: HER + RDPG(Recurrent Deterministic Policy Gradient)
  • R. Houthooft, et al., "Evolved Policy Gradients", NeurIPS, 2018. [Paper]
    • Env(sim): RandomReacher, Fetcher
    • Task: reaching, fetching
    • Algorithm: PPO, EPG
  • M. Andrychowicz, et al., "Hindsight Experience Replay", NeurIPS, 2017. [Paper] [Video]
    • Env(real): 7-DOF Fetch Robotics arm
    • Task: pushing, sliding, pick-and-place
    • Algorithm: DDPG, DDPG + HER
  • D. Quillen, et al., "Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods", arXiv:1802.10264, 2018. [Paper] [Video]
    • Env(sim): 7-DOF grasp objects from a bin(PyBullet)
    • Task: regular grasping, targeted grasping in clutter
    • Algorithm: DQN, DDPG
  • H. Zhu, et al., "Dexterous Manipulation with Deep Reinforcement Learning: Efficient, General, and Low-Cost", arXiv:1810.06045, 2018. [Paper] [Video] [Blog]
    • Env(real): Dynamixel claw(9-DOF), Allegro hand(16-DOF)
    • Task: valve rotation, box flipping, door opening
    • Algorithm: NPG, DAPG(NPG + demonstration)
  • A. R. Mahmood, et al., "Benchmarking Reinforcement Learning Algorithms on Real-World Robots", CoRL, 2018. [Paper] [Video]
    • Env(real): UR5 robotic arm, Dynamixel(DXL) actuator
    • Task: UR-Reacher-2(reaching), UR-Reacher-6(reaching), DXL-Reacher(reaching), DXL-Tracker(tracking)
    • Algorithm: TRPO, PPO, DDPG, SQL
  • J. Matas, S. James and A. J Davison, "Sim-to-Real Reinforcement Learning for Deformable Object Manipulation", CoRL, 2018. [Paper] [Video]
    • Env(real): Kinova Mico(7-DOF)
    • Env(sim): PyBullet gripper(customized)
    • Task: hanging, diagonal folding, tape folding
    • Algorithm: DDPG, BC, DDPGfD
  • M. Vecerik, et al., "A Practical Approach to Insertion with Variable Socket Position Using Deep Reinforcement Learning", arXiv:1810.01531, 2018. [Paper] [Video]
    • Env(real): unknown
    • Task: peg insertion, clip insertion
    • Algorithm: DDPGfD
  • B. Kang, Z. Jie and J. Feng, "Policy Optimization with Demonstrations", ICML, 2018. [Paper]
    • Env(sim): Reacher
    • Algorithm: GAIL, TRPO, DQfD, POfD
  • M. Vecerik, et al., "Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards", arXiv:1707.08817, 2017. [Paper] [Video]
    • Env(real): Sawyer
    • Task: peg insertion, hard-drive insertion, clip insertion, cable insertion
    • Algorithm: DDPG, DDPGfD
  • A. Rajeswaran, et al., "Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations", RSS, 2018. [Paper] [Video]
    • Env(sim): ADROIT hand simulator in MuJoCo(24-DOF)
    • Task: object relocation, in-hand manipulation, door opening, tool use
    • Algorithm: NPG, DAPG(NPG + demonstration)
  • J. Schulman, et al., "Proximal Policy Optimization Algorithms", arXiv:1707.06347, 2017. [Paper]
    • Env(sim): Reacher-v1
    • Task: reaching
    • Algorithm: PPO, CEM, A2C, TRPO, Vanilla PG
  • Y. Wu, et al., "Scalable trust region method for deep reinforcement learning using Kronecker-factored approximation", NeurIPS, 2017. [Paper]
    • Env(sim): Reacher
    • Task: reaching
    • Algorithm: TRPO, ACKTR, A2C
  • T. Haarnoja, et al., "Composable Deep Reinforcement Learning for Robotic Manipulation", arXiv:1803.06773, 2018. [Paper] [Site]
    • Env(real): Sawyer(7-DOF)
    • Task: pushing, reaching, Lego block stacking
    • Algorithm: SQL, DDPG, NAF
  • S. Gu, et al., "Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates", ICRA, 2017. [Paper] [Video]
    • Env(real): 7-DOF lightweight arm, 6-DOF Kinova JACO arm + 3-DOF fingers
    • Env(sim): MuJoCo model(customized)
    • Task: reaching(7-DOF arm), door pushing and pulling(7-DOF arm), pick and place(JACO)
    • Algorithm: DDPG, NAF
  • A. Nair, et al., "Overcoming Exploration in Reinforcement Learning with Demonstrations", ICRA, 2018. [Paper] [Video]
    • Env(sim): 7-DOF Fetch Robotics arm
    • Task: pushing, sliding, pick-and-place
    • Algorithm: DDPG + HER, DDPGfD, BC
  • J. Hwangbo, et al., "Learning agile and dynamic motor skills for legged robots", Science Robotics, vol. 4, issue 26, eaau5872, Jan. 2019. [Article]
    • Env(real): ANYmal
    • Env(sim): unknown
    • Task: command-conditioned locomotion, high-speed locomotion, recovery from a fall
    • Algorithm: TRPO