What's Changed
- Automatically determine num_actions and num_chance_outcomes in stochastic_muzero_policy (thanks Carlos Martin).
- Explicitly use int32 for the argmax output even when using jax_enable_x64.
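
The int32 change can be illustrated with a minimal sketch in plain JAX. It is not the library's actual code; the `logits` array and variable names are assumptions made for illustration. With `jax_enable_x64` turned on, index-returning ops such as `jnp.argmax` typically produce int64, so an explicit cast keeps the output dtype stable:

```python
import jax
import jax.numpy as jnp

# Enable 64-bit mode before creating any arrays.
jax.config.update("jax_enable_x64", True)

# Hypothetical batch of logits: 2 rows, 3 candidate actions each.
logits = jnp.array([[0.1, 2.0, -1.0], [3.0, 0.5, 0.2]])

# Without a cast, the index dtype follows the x64 setting.
default_action = jnp.argmax(logits, axis=-1)

# An explicit cast pins the dtype to int32 regardless of the x64 flag.
action = jnp.argmax(logits, axis=-1).astype(jnp.int32)

print(default_action.dtype, action.dtype)
```

Pinning the dtype this way keeps downstream shapes and dtypes identical whether or not 64-bit mode is enabled.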
Full Changelog: v0.0.3...v0.0.5