feat(2048): environment performance improvements #172

aar65537 · 2023-06-15T17:53:00Z

This PR improves the performance of the Game2048 environment. The improvements include:

- Minimizing logic inside of `jax.lax.cond` and `jax.lax.switch`
- Using `jax.vmap` over `jax.lax.scan` where possible
- A new `move` implementation
- A `can_move` implementation that validates an action without mutating the board
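To illustrate the ideas in this list, here is a minimal sketch rather than the code from this PR. It assumes a square board where `0` marks an empty cell and an action order of up, right, down, left. The helper names `can_move_row_left`, `orient`, and `action_mask` are hypothetical. The `jax.lax.switch` branches only reorient the board, the row check is vectorized with `jax.vmap`, and no moved board is ever built.

```python
import jax
import jax.numpy as jnp


def can_move_row_left(row: jnp.ndarray) -> jnp.ndarray:
    """Return True if a left move would change this row (no mutation)."""
    left, right = row[:-1], row[1:]
    # A tile can slide if an empty cell sits directly to its left.
    can_slide = jnp.any((left == 0) & (right != 0))
    # Two adjacent equal non-empty tiles can merge.
    can_merge = jnp.any((left != 0) & (left == right))
    return can_slide | can_merge


def can_move_left(board: jnp.ndarray) -> jnp.ndarray:
    # Check all rows in parallel with vmap instead of scanning over them.
    return jnp.any(jax.vmap(can_move_row_left)(board))


def orient(board: jnp.ndarray, action: jnp.ndarray) -> jnp.ndarray:
    """Reorient the board so that `action` becomes a left move.

    Assumed action order: 0=up, 1=right, 2=down, 3=left.
    The branches hold only cheap views; the real work happens outside.
    """
    branches = (
        lambda b: b.T,           # up: columns become rows
        lambda b: b[:, ::-1],    # right: mirror each row
        lambda b: b.T[:, ::-1],  # down: columns become mirrored rows
        lambda b: b,             # left: already oriented
    )
    return jax.lax.switch(action, branches, board)


def action_mask(board: jnp.ndarray) -> jnp.ndarray:
    # vmap over the four actions instead of looping over them.
    return jax.vmap(lambda a: can_move_left(orient(board, a)))(jnp.arange(4))
```

For a board whose only tiles are two equal tiles in the top-left corner of the first row, `action_mask` marks every action except up as valid.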

| | no vmap | vmap 10³ | vmap 10⁶ |
| --- | --- | --- | --- |
| cpu | 64.36% | 201.80% | 392.29% |
| cuda | 900.12% | 1923.08% | 706.87% |

The table above shows the total performance improvement, measured as the percent increase in steps/sec. For more detailed benchmarking, see here.
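As a reading aid, the sketch below converts between a percent increase and a speedup factor. It assumes "percent increase" means `(new - old) / old * 100`, which the PR does not state explicitly. The function names are hypothetical.

```python
def percent_increase(old_steps_per_sec: float, new_steps_per_sec: float) -> float:
    """Percent increase in throughput relative to the old version."""
    return 100.0 * (new_steps_per_sec - old_steps_per_sec) / old_steps_per_sec


def speedup_factor(percent: float) -> float:
    """Convert a percent increase into a multiplicative speedup."""
    return 1.0 + percent / 100.0


# Under this assumption, a 900.12% increase is roughly a 10x speedup,
# and a 64.36% increase is roughly a 1.64x speedup.
cuda_no_vmap = speedup_factor(900.12)
cpu_no_vmap = speedup_factor(64.36)
```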

clement-bonnet · 2023-06-16T10:37:13Z

Hi @aar65537, thanks a lot for your suggestions for speed improvement! We will look into them, check that the environment's behavior has not changed, and get back to you shortly.

clement-bonnet

Thank you very much for the code and speed improvements! I have left a few comments that I don't feel very strongly about.

jumanji/environments/logic/game_2048/utils.py

clement-bonnet · 2023-06-20T14:31:09Z

I have checked that this version is equivalent (in terms of environment behavior) to the current version. I obtained the same learning curves.
Moreover, on a TPU-v4 with 8 cores, I measured the performance below (orange is 2048 in main, pink is this updated version):

Environment steps per second (random agent): higher is better

x12 improvement when randomly rolling out.

Train epoch time (a2c agent): lower is better

x12 improvement when training.

The other curves (learning metrics or episode returns) are completely equivalent in both versions.

clement-bonnet · 2023-06-20T14:31:41Z

If that's okay, I will resolve the comments by applying suggestions and will merge.

aar65537 and others added 11 commits June 10, 2023 22:59

feat(2048): push switch statement inside move.

2e9f018

feat(2048): bring col mutations out of conditionals.

ca2e4ba

feat(2048): use vmap in move_left.

8f9a67b

feat(2048): use fori_loop where possible

e431a5e

feat(2048): implement single pass move

77fccd6

feat(2048): implement can_move without mutating board.

88a5128

docs(2048): Add comments

74c159c

feat(2048): vmap over actions in _get_action_mask

f5c34b3

feat(2048): capture board by closure in transform_board

d024cf7

fix(2048): args in wrong order

c7955dd

Merge branch 'instadeepai:main' into perf-improvements-2048

46789e7

Merge branch 'main' into perf-improvements-2048

49d4e1e

clement-bonnet mentioned this pull request Jun 19, 2023

build: bump version to 0.3.1 #177

Merged

Merge branch 'main' into perf-improvements-2048

e1dbdf2

clement-bonnet previously approved these changes Jun 20, 2023

Apply suggestions from code review

92cd2be

clement-bonnet dismissed their stale review via 92cd2be June 20, 2023 14:37

clement-bonnet previously approved these changes Jun 20, 2023

clement-bonnet enabled auto-merge (squash) June 20, 2023 14:38

clement-bonnet reviewed Jun 20, 2023

jumanji/environments/logic/game_2048/utils.py (outdated, resolved)

Apply suggestions from code review

d7710c8

clement-bonnet dismissed their stale review via d7710c8 June 20, 2023 14:44

clement-bonnet approved these changes Jun 20, 2023

clement-bonnet merged commit 32685cb into instadeepai:main Jun 20, 2023

aar65537 deleted the perf-improvements-2048 branch January 18, 2024 20:16
