
Memory error when sampling from collector #184

Closed
5 of 8 tasks
Tortes opened this issue Aug 18, 2020 · 8 comments · Fixed by #189
Labels
question Further information is requested

Comments

Tortes commented Aug 18, 2020

  • I have marked all applicable categories:
    • exception-raising bug
    • RL algorithm bug
    • documentation request (i.e. "X is missing from the documentation.")
    • new feature request
  • I have visited the source website
  • I have searched through the issue tracker for duplicates
  • I have mentioned version numbers, operating system and environment, where applicable:
    import tianshou, torch, sys
    print(tianshou.__version__, torch.__version__, sys.version, sys.platform)
    0.2.5 1.6.0 3.7.7
    [GCC 7.3.0] linux

The collector raised a memory error while I was training in my self-defined environment with the PPO algorithm. The full error message is below:
MemoryError: Unable to allocate 8.05 GiB for an array with shape (20000, 54006) and data type float64
The self-defined environment uses a list state observation and a list of discrete actions. Is it necessary to change the observation type to a dict to fit tianshou, and to use a smaller buffer size?
I tried to change the on-policy sample size, but that (of course) raises a dimension-mismatch problem. Any answer is welcome.
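
For reference, the 8.05 GiB figure corresponds exactly to one buffer-sized observation array of that shape and dtype; a quick check:

import numpy as np

# 20000 transitions x 54006-dimensional observations stored as float64 (8 bytes each)
bytes_needed = 20000 * 54006 * np.dtype(np.float64).itemsize
print(bytes_needed / 2**30)  # ~8.05 GiB, matching the error message

Since the buffer also stores obs_next by default, total observation storage is roughly twice that figure.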

Trinkle23897 (Collaborator) commented Aug 18, 2020

Could you please provide a sample of your self-defined env data format? (obs, act, ...)

Trinkle23897 added the "question: Further information is requested" label Aug 18, 2020
Tortes (Author) commented Aug 18, 2020

I will show part of the reset() function. getState() collects and returns the sensor data, which is then packed and normalized into the 54006-dimensional observation.

def reset(self):
    ...
    self.goal_distance = self.getNavGoalDistance()
    realsense_data, rel_dis, roll, pitch, yaw, rel_theta, diff_angle, done, arrive = self.getState()
    # Normalize Realsense readings from [0, 12] to [0, 1]
    realsense_data = [i/12 for i in realsense_data]

    # Normalize the remaining state components
    state = realsense_data + [rel_dis / diagonal_dis, (roll+180)/360, (pitch+180)/360, yaw / 360, rel_theta / 360, diff_angle / 180]

    return np.asarray(state)

For the action, take the step() function as an example.
The action space is a discrete list of 2-dimensional entries, (linear speed, angular speed), sampled uniformly.

    def step(self, action):
        linear_vel = self.action_space_discrete[action][0]
        ang_vel = self.action_space_discrete[action][1]
        # print(linear_vel, ang_vel)

        vel_cmd = Twist()
        vel_cmd.linear.x = linear_vel / 4
        vel_cmd.angular.z = ang_vel
        self.pub_cmd_vel.publish(vel_cmd)
        
        # Update sensor data
        # self.getSensor()

        # Update state observation
        realsense_data, rel_dis, roll, pitch, yaw, rel_theta, diff_angle, done, arrive = self.getState()

        # Normalize the state
        '''
        Realsense:  [0, 12] => [0,1]
        LiDAR:      [0, 30] => [0,1]
        roll, pitch:[-180, 180] => [0,1]
        '''
        # scan_data = [i/30 for i in scan_data]

        state = realsense_data + [rel_dis / diagonal_dis, (roll+180)/360, (pitch+180)/360, yaw / 360, rel_theta / 360, diff_angle / 180]
        reward = self.setReward(done, arrive)

        return np.asarray(state), reward, done, {}

Besides, I defined the action space as gym.spaces.Discrete(len(action_space)) and the observation space as gym.spaces.Box(low=0, high=1, shape=(54006,), dtype=np.float32).
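
For illustration, a minimal sketch of how such a (linear speed, angular speed) lookup table and these spaces could be built; the speed ranges and grid sizes below are assumptions, not the values used in this environment:

import itertools
import numpy as np
import gym

# Hypothetical (linear speed, angular speed) grid; the actual ranges and
# resolutions are not shown in this issue.
linear_speeds = np.linspace(0.0, 1.0, 5)
angular_speeds = np.linspace(-1.0, 1.0, 5)
action_space_discrete = list(itertools.product(linear_speeds, angular_speeds))

action_space = gym.spaces.Discrete(len(action_space_discrete))
observation_space = gym.spaces.Box(low=0, high=1, shape=(54006,), dtype=np.float32)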

Trinkle23897 (Collaborator) commented Aug 18, 2020

Okay, so how much RAM does your machine have?
Did you use ignore_obs_next=True in the replay buffer? (This can save half of the memory.)
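
In tianshou this is a constructor argument of the replay buffer; a minimal sketch with the buffer size reported in this issue:

from tianshou.data import ReplayBuffer

# ignore_obs_next=True stops the buffer from storing obs_next separately
# (it is recovered from the following transition's obs), roughly halving the
# memory used by the 20000 x 54006 observation arrays.
buf = ReplayBuffer(size=20000, ignore_obs_next=True)

Returning float32 observations from the environment, e.g. np.asarray(state, dtype=np.float32), should halve the remaining footprint again, since the error shows float64 arrays even though the Box is declared as float32.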

Tortes (Author) commented Aug 18, 2020

The machine has 16 GiB of RAM and a 16 GiB swap area.
I left the replay buffer at its defaults (I only set the buffer size to 20000, so ignore_obs_next is False by default).

Trinkle23897 (Collaborator) commented Aug 18, 2020

So try enabling ignore_obs_next first.

Tortes (Author) commented Aug 18, 2020

Thanks a lot. I will close the issue if there are no further problems.

Trinkle23897 (Collaborator) commented Aug 18, 2020

You may hit another issue: when the policy is being tested, the program can be killed due to lack of memory. This is because the collector still creates some cache_buffers even when the main buffer is None. It will be fixed soon.

I posted a hotfix here:

[screenshot: hotfix code, 2020-08-18 16:30:48]
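
The hotfix itself is only available as the screenshot above; as a rough illustration of the idea (keep only placeholder data in the cache buffer when there is no main buffer), the helper below is a hypothetical sketch, not the actual patch:

import numpy as np

def add_to_cache(cache_buffer, main_buffer, obs, act, rew, done):
    # Hypothetical sketch, not the actual hotfix: when the collector has no main
    # buffer (e.g. while only evaluating the policy), store a tiny placeholder
    # observation instead of the full 54006-dimensional state, so the per-env
    # cache buffers do not exhaust memory.
    obs_to_store = obs if main_buffer is not None else np.zeros(1)
    cache_buffer.add(obs=obs_to_store, act=act, rew=rew, done=done)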

Trinkle23897 linked a pull request Aug 19, 2020 that will close this issue
Tortes (Author) commented Aug 20, 2020

Solved after upgrading the RAM to 24 GiB.

Tortes closed this as completed Aug 20, 2020
Trinkle23897 linked a pull request Aug 22, 2020 that will close this issue
Trinkle23897 removed a link to a pull request Aug 22, 2020
Trinkle23897 added a commit that referenced this issue Aug 27, 2020
1. add policy.eval() in all test scripts' "watch performance"
2. remove dict return support for collector preprocess_fn
3. add `__contains__` and `pop` in batch: `key in batch`, `batch.pop(key, deft)`
4. exact n_episode for a list of n_episode limitation and save fake data in cache_buffer when self.buffer is None (#184)
5. fix tensorboard logging: h-axis stands for env step instead of gradient step; add test results into tensorboard
6. add test_returns (both GAE and nstep)
7. change the type-checking order in batch.py and converter.py in order to meet the most often case first
8. fix shape inconsistency for torch.Tensor in replay buffer
9. remove `**kwargs` in ReplayBuffer
10. remove default value in batch.split() and add merge_last argument (#185)
11. improve nstep efficiency
12. add max_batchsize in onpolicy algorithms
13. potential bugfix for subproc.wait
14. fix RecurrentActorProb
15. improve the code-coverage (from 90% to 95%) and remove the dead code
16. fix some incorrect type annotation

The above improvements also increase the training FPS: on my computer, the previous version reaches only ~1800 FPS, and after these changes it can reach ~2050 (faster than v0.2.4.post1).
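
Item 3 above, for example, enables usage along these lines (a minimal sketch based on the Batch API described in the commit message):

from tianshou.data import Batch

b = Batch(obs=[1, 2, 3], act=[0, 1, 0])
assert "obs" in b              # the new __contains__
act = b.pop("act")             # removes and returns the stored array
rew = b.pop("rew", None)       # falls back to the default when the key is absent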
BFAnas pushed a commit to BFAnas/tianshou that referenced this issue May 5, 2024