Paper Notes

Enjoy yourself :D

RL Related

Tags: RL, IL, meta-learning, HRL, policy-based, value-based, model-based, model-free, on-policy, off-policy, etc.

Name	Conf	Arxiv	Tags
Trust Region Policy Optimization	ICML2015	1502.05477	policy-based
The Option-Critic Architecture	AAAI2017	1609.05140	HRL, option-critic
Learning to Act by Predicting the Future	ICLR2017	1611.01779	VizDoom
Meta Networks	ICML2017	1703.00837	meta-learning, MetaNet, few-shot classification
FeUdal Networks for Hierarchical Reinforcement Learning	ICML2017	1703.01161	FeUdalNet, HRL
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks	ICML2017	1703.03400	meta-learning, MAML
Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World	IROS2017	1703.06907	sim to real, domain randomization
One-Shot Imitation Learning	NIPS2017	1703.07326	imitation, demonstration
Multi-Level Discovery of Deep Options	-	1703.08294	DDO, HRL
DART - Noise Injection for Robust Imitation Learning	CoRL2017	1703.09327	imitation learning, add noise -> more robust
Stochastic Neural Networks for Hierarchical Reinforcement Learning	ICLR2017	1704.03012	HRL, StochasticNN
Deep Q-learning from Demonstrations	AAAI2018	1704.03732	DQfD : imitation + RL, discrete
Parameter Space Noise for Exploration	ICLR2018	1706.01905	OpenAI NoisyNet
Noisy Networks for Exploration	ICLR2018	1706.10295	DeepMind NoisyNet, part of Rainbow
Deep Reinforcement Learning from Human Preferences	NIPS2017	1706.03741	RL + human feedback (easier than demonstration)
Hindsight Experience Replay	NIPS2017	1707.01495	HER, goal-based env, sparse reward, learn from failure
Emergence of Locomotion Behaviours in Rich Environments	-	1707.02286	PPO
Robust Imitation of Diverse Behaviors	NIPS2017	1707.02747	imitation learning : VAE (behavioral cloning) + GAIL
Imitation from Observation - Learning to Imitate Behaviors from Raw Video via Context Translation	ICRA2018	1707.03374	imitation learning from obs, context translation
Reverse Curriculum Generation for Reinforcement Learning	CoRL2017	1707.05300	reverse curriculum
Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards	-	1707.08817	DDPGfD : DDPG + DQfD, off-policy imitation, continuous goal-based env
When Waiting is not an Option - Learning Options with a Deliberation Cost	AAAI2018	1709.04571	HRL, A2OC : A3C + OC + deliberation cost
Autonomous Extracting a Hierarchical Structure of Tasks in Reinforcement Learning and Multi-task Reinforcement Learning	-	1709.04579	HRL, association rule
One-Shot Visual Imitation Learning via Meta-Learning	CoRL2017	1709.04905	MIL : meta learning (MAML) + imitation learning (BC)
Overcoming Exploration in Reinforcement Learning with Demonstrations	ICRA2018	1709.10089	Similar to DDPGfD, imitation + DDPG + HER

Speech

[NIPS 2017 Keynotes] Deep Learning for Robotics - Pieter Abbeel

To read

[ICML 2017] 1703.02702 - Robust Adversarial Reinforcement Learning
[ICML 2017] 1706.05064 - Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning
1708.05866 - A Brief Survey of Deep Reinforcement Learning
[NIPS 2017] 1710.03592 - Meta Inverse Reinforcement Learning via Maximum Reward Sharing for Human Motion Analysis
1802.03596 - Deep Meta-Learning: Learning to Learn in the Concept Space
[ICLR 2018] 1802.09081 - Temporal Difference Models: Model-Free Deep RL for Model-Based Control
1802.10567 - Learning by Playing - Solving Sparse Reward Tasks from Scratch
[ICLR 2018] 1803.00933 - Distributed Prioritized Experience Replay
[ICLR 2018] Extending Robust Adversarial Reinforcement Learning Considering Adaptation and Diversity
[ICLR 2018] Learning to Teach
[ICLR 2018] Learning an Embedding Space for Transferable Robot Skills

Currently no notes

1706.09529 - Learning to Learn: Meta-Critic Networks for Sample Efficient Learning
1710.03463 - Learning to Generalize: Meta-Learning for Domain Generalization
1712.00948 - Hierarchical Actor-Critic

Skipped

[NIPS 2017] 1712.08266 - Federated Control with Hierarchical Multi-Agent Deep Reinforcement Learning
[ICLR 2018] 1801.08930 - Recasting Gradient-Based Meta-Learning as Hierarchical Bayes
1802.07245 - Meta-Reinforcement Learning of Structured Exploration Strategies
1802.09564 - Reinforcement and Imitation Learning for Diverse Visuomotor Skills
[ICLR 2018] Zero-Shot Visual Imitation

Name
md4pdf
reading_log
1502.05477 - Trust Region Policy Optimization.pdf
1506.01497 - Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.md
1512.03385 - Deep Residual Learning for Image Recognition.md
1608.06993 - Densely Connected Convolutional Networks.md
1609.05140 - The Option-Critic Architecture.pdf
1609.07769 - Deep Joint Rain Detection and Removal from a Single Image.md
1611.01779 - Learning to Act by Predicting the Future.pdf
1611.10012 - Speed-accuracy trade-offs for modern convolutional object detectors.md
1612.08242 - YOLO9000:Better, Faster, Stronger.md
1703.00837 - Meta Networks.pdf
1703.01161 - FeUdal Networks for Hierarchical Reinforcement Learning.pdf
1703.03400 - Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks.pdf
1703.06870 - Mask R-CNN.md
1703.06907 - Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World.pdf
1703.07326 - One-Shot Imitation Learning.pdf
1703.08294 Multi-Level Discovery of Deep Options.pdf
1703.09327 - DART- Noise Injection for Robust Imitation Learning.pdf
1704.03012 - Stochastic Neural Networks for Hierarchical Reinforcement Learning.pdf
1704.03732 - Deep Q-learning from Demonstrations.pdf
1704.05548 - Annotating Object Instances with a Polygon-RNN.md
1706.01905 - Parameter Space Noise for Exploration.pdf
1706.03741 - Deep Reinforcement Learning from Human Preferences.pdf
1706.10295 - Noisy Networks for Exploration.pdf
1707.01495 - Hindsight Experience Replay.pdf
1707.01629 - Dual Path Networks.md
1707.02286 - Emergence of Locomotion Behaviours in Rich Environments.pdf
1707.02747 - Robust Imitation of Diverse Behaviors.pdf
1707.03374 - Imitation from Observation- Learning to Imitate Behaviors from Raw Video via Context Translation.pdf
1707.05300 - Reverse Curriculum Generation for Reinforcement Learning.pdf
1707.06168 - Channel Pruning for Accelerating Very Deep Neural Networks.md
1707.08817 - Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards.pdf
1708.01241 - DSOD: Learning Deeply Supervised Object Detectors from Scratch.md
1709.04571 - When Waiting is not an Option - Learning Options with a Deliberation Cost.pdf
1709.04579 - Autonomous Extracting a Hierarchical Structure of Tasks in Reinforcement Learning and Multi-task Reinforcement Learning.pdf
1709.04905 - One-Shot Visual Imitation Learning via Meta-Learning.pdf
1709.10089 - Overcoming Exploration in Reinforcement Learning with Demonstrations.pdf
1710.01813 - Neural Task Programming- Learning to Generalize Across Hierarchical Tasks.pdf
1710.02298 - Rainbow: Combining Improvements in Deep Reinforcement Learning.pdf
1710.03641 - Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments.pdf
1710.09767 - Meta Learning Shared Hierarchies.pdf
1711.03817 - Learning with Options that Terminate Off-Policy.pdf
1711.06025 - Learning to Compare- Relation Network for Few-Shot Learning.pdf
1711.10314 - Crossmodal Attentive Skill Learner.pdf
1802.01557 - One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning.pdf
1802.04821 - Evolved Policy Gradients.pdf
1803.02999 - On First-Order Meta-Learning Algorithms.pdf
An Efficient Approach to Model-Based Hierarchical Reinforcement Learning.pdf
Deep Learning for Robotics - Pieter Abbeel - NIPS 2017 Keynotes.pdf
GAN_series.md
README.md
Reinforcement Learning From Imperfect Demonstration.pdf
Removing rain from single images via a deep detail network.md
Robust Hand Detection in Vehicles.md
Training Agent for First-Person Shooter Game with Actor-Critic Curriculum Learning.pdf

YunqiuXu/Readings
