- Email: muupan@gmail.com
- Research interests: reinforcement learning, machine learning
- Surface-Aligned Neural Radiance Fields for Controllable 3D Human Synthesis
- ChainerRL: A Deep Reinforcement Learning Library
- Yasuhiro Fujita, Prabhat Nagarajan, Toshiki Kataoka, Takahiro Ishikawa
- Journal of Machine Learning Research, 22(77), 1-14. arXiv code (Chainer) code (PyTorch)
- Distributed Reinforcement Learning of Targeted Grasping with Active Vision for Mobile Manipulators
- Yasuhiro Fujita, Kota Uenishi, Avinash Ummadisingu, Prabhat Nagarajan, Shimpei Masuda, Mario Ynocente Castro
- IROS, 2020. arXiv
- Learning Latent State Spaces for Planning through Reward Prediction
- Aaron Havens, Yi Ouyang, Prabhat Nagarajan, Yasuhiro Fujita
- NeurIPS Deep Reinforcement Learning Workshop, 2019. arXiv
- A Wrapped Normal Distribution on Hyperbolic Space for Gradient-Based Learning
- Yoshihiro Nagano, Shoichiro Yamaguchi, Yasuhiro Fujita, Masanori Koyama
- ICML, 2019. arXiv
- Toward Onboard Control System for Mobile Robots via Deep Reinforcement Learning
- Megumi Miyashita, Shirou Maruyama, Yasuhiro Fujita, Mitsuru Kusumoto, Tobias Pfeiffer, Eiichi Matsumoto, Ryosuke Okuta, Daisuke Okanohara
- NeurIPS Deep Reinforcement Learning Workshop, 2018. pdf
- Model-Based Reinforcement Learning via Meta-Policy Optimization
- Ignasi Clavera, Jonas Rothfuss, John Schulman, Yasuhiro Fujita, Tamim Asfour, Pieter Abbeel
- CoRL, 2018. arXiv
- Clipped Action Policy Gradient
- Experience Replay with Random Reshuffling
- Entropy Controllable Direct Preference Optimization
- Motoki Omura, Yasuhiro Fujita, Toshiki Kataoka
- Preprint, 2024. arXiv
- PLaMo-100B: A Ground-Up Language Model Designed for Japanese Proficiency
- Preferred Elements, Inc.
- Technical paper, 2024. arXiv
- Reinforcement Learning: An Introduction (second edition)
- Co-translated into Japanese: 強化学習 (第2版)
- Algorithms of Reinforcement Learning
- Co-translated into Japanese: 速習 強化学習 ―基礎理論とアルゴリズム―
- ゼロから始める深層強化学習 / Introduction of Deep Reinforcement Learning
- Tutorial at the 24th Annual Meeting of the Association for Natural Language Processing (NLP2018). slides
- ChainerRL: A deep RL library in Python and Chainer
- PFRL: A deep RL library in Python and PyTorch
- async-rl: An A3C implementation in Python and Chainer
- DQN-in-the-Caffe: A DQN implementation in C++ and Caffe
- Engineer at Preferred Elements, Inc. (November 2024 - Present)
- Research and development in post-training of large language models.
- Engineer at Preferred Networks, Inc. (April 2015 - Present)
- Program committee: Deep Reinforcement Learning Workshop at NeurIPS (2018-2022)
- Guest lecturer: reinforcement learning portion of the course 先端人工知能論II at the University of Tokyo (2016-2018)
- M.S. Information Science and Technology (April 2013 - March 2015)
- Graduate School of Information Science and Technology, The University of Tokyo
- Thesis: “Automatic Feature Generation and Model Learning for General Game Players Based on Reinforcement Learning”
- B.S. Engineering (April 2011 - March 2013)