Memory error when sampling from collector #184
Comments
Could you please provide a sample of your self-defined env data format? (obs, act, ...)
I will show part of the reset() function:

```python
def reset(self):
    ...
    self.goal_distance = self.getNavGoalDistance()
    realsense_data, rel_dis, roll, pitch, yaw, rel_theta, diff_angle, done, arrive = self.getState()
    realsense_data = [i / 12 for i in realsense_data]
    # Normalize the state
    state = realsense_data + [rel_dis / diagonal_dis, (roll + 180) / 360, (pitch + 180) / 360,
                              yaw / 360, rel_theta / 360, diff_angle / 180]
    return np.asarray(state)
```

For the action, I will take the step() function as an example:

```python
def step(self, action):
    linear_vel = self.action_space_discrete[action][0]
    ang_vel = self.action_space_discrete[action][1]
    # print(linear_vel, ang_vel)
    vel_cmd = Twist()
    vel_cmd.linear.x = linear_vel / 4
    vel_cmd.angular.z = ang_vel
    self.pub_cmd_vel.publish(vel_cmd)
    # Update sensor data
    # self.getSensor()
    # Update state observation
    realsense_data, rel_dis, roll, pitch, yaw, rel_theta, diff_angle, done, arrive = self.getState()
    # Normalize the state:
    #   Realsense:   [0, 12]      => [0, 1]
    #   LiDAR:       [0, 30]      => [0, 1]
    #   roll, pitch: [-180, 180]  => [0, 1]
    # scan_data = [i / 30 for i in scan_data]
    state = realsense_data + [rel_dis / diagonal_dis, (roll + 180) / 360, (pitch + 180) / 360,
                              yaw / 360, rel_theta / 360, diff_angle / 180]
    reward = self.setReward(done, arrive)
    return np.asarray(state), reward, done, {}
```

Besides, I defined the action space with
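(A side note on the snippets above, not part of the original exchange: `np.asarray` on a plain Python list of floats produces a float64 array, so the collector's buffer stores 8 bytes per value; an explicit float32 cast would halve that footprint. A minimal sketch, using a stand-in for the flat state built above:)

```python
import numpy as np

# Stand-in for realsense_data plus the six extra scalars (~54006 values).
state = [0.5] * 54006

print(np.asarray(state).dtype)                     # float64 -> 8 bytes per value
print(np.asarray(state, dtype=np.float32).dtype)   # float32 -> 4 bytes per value
print(np.asarray(state, dtype=np.float32).nbytes)  # 216024 bytes per observation
```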
Okay, so how much RAM does your machine have?
The machine has 16 GiB of RAM and a 16 GiB swap area.
So try opening
Thanks a lot, I will close the issue if there are no further problems.
Solved after upgrading the RAM to 24 GiB.
1. add policy.eval() in all test scripts' "watch performance"
2. remove dict return support for collector preprocess_fn
3. add `__contains__` and `pop` in Batch: `key in batch`, `batch.pop(key, deft)`
4. exact n_episode for a list of n_episode limitation and save fake data in cache_buffer when self.buffer is None (#184)
5. fix tensorboard logging: the horizontal axis stands for env step instead of gradient step; add test results into tensorboard
6. add test_returns (both GAE and nstep)
7. change the type-checking order in batch.py and converter.py in order to meet the most common case first
8. fix shape inconsistency for torch.Tensor in the replay buffer
9. remove `**kwargs` in ReplayBuffer
10. remove the default value in batch.split() and add the merge_last argument (#185)
11. improve nstep efficiency
12. add max_batchsize in on-policy algorithms
13. potential bugfix for subproc.wait
14. fix RecurrentActorProb
15. improve the code coverage (from 90% to 95%) and remove dead code
16. fix some incorrect type annotations

The above improvements also increase the training FPS: on my computer, the previous version reaches only ~1800 FPS, while this one can reach ~2050 (faster than v0.2.4.post1).
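(For illustration, a minimal sketch of the Batch helpers mentioned in items 3 and 10 above; the field names and sizes are made up, and it assumes `tianshou.data.Batch` behaves as described in those items:)

```python
import numpy as np
from tianshou.data import Batch

# A toy batch; the field names (obs, act) are illustrative only.
b = Batch(obs=np.zeros((4, 3)), act=np.arange(4))

print("obs" in b)         # __contains__: True
act = b.pop("act", None)  # pop with a default value, like dict.pop

# split with the new merge_last flag: a trailing mini-batch shorter than
# `size` is merged into the previous one instead of being yielded alone.
for mini in b.split(3, shuffle=False, merge_last=True):
    print(len(mini.obs))  # 4 (3 + 1 merged)
```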
The collector raised a memory error when training in a self-defined environment with the PPO algorithm. The full error message is below:

```
MemoryError: Unable to allocate 8.05 GiB for an array with shape (20000, 54006) and data type float64
```
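(For reference, the reported 8.05 GiB is exactly the observation array the buffer tries to allocate: 20000 transitions of a 54006-dimensional float64 observation.)

```python
# 20000 steps * 54006 values * 8 bytes (float64) ~= 8.05 GiB
print(20000 * 54006 * 8 / 2**30)  # ~8.05
```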
The self-defined environment uses a list state observation and a list of discrete actions. Is it necessary to change the observation type to a dict to fit tianshou, and to use a smaller buffer size?
I tried to change the on-policy sample size, but it raises a dimension-matching problem (of course). Any answer to this stupid problem is welcome.
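(A rough sizing sketch of the "smaller buffer" option; the 5000-step size and the float32 cast are illustrative choices, not taken from this thread, and it assumes tianshou's `ReplayBuffer(size=...)` constructor:)

```python
from tianshou.data import ReplayBuffer

obs_dim, buffer_size = 54006, 5000
# With float32 observations: 5000 * 54006 * 4 bytes ~= 1.01 GiB,
# which fits comfortably in 16 GiB of RAM.
print(f"{buffer_size * obs_dim * 4 / 2**30:.2f} GiB")

buf = ReplayBuffer(size=buffer_size)  # pass this buffer to the Collector
```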