I think this could be very valuable from the perspective of measuring the agent simulator's proclivity for modelling different agents in its training distribution.
A better version of this might be to write a script which takes the training data and compares the predictions of the RL policies against those of the agent simulator. We can then closely investigate examples with significant divergence and examine the underlying mechanisms.
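
A minimal sketch of what such a script could look like, assuming both the RL policies and the agent simulator can be queried for an action distribution on a given training state. The names `policy_predict` and `simulator_predict`, the use of KL divergence as the comparison metric, and the threshold value are all hypothetical choices for illustration, not anything taken from the repo.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: a predictor maps a state to a probability
# distribution over the same fixed set of discrete actions.
Predictor = Callable[[object], Sequence[float]]


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) in nats for two discrete distributions over the same actions."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def rank_divergent_examples(
    examples: Sequence[object],
    policy_predict: Predictor,
    simulator_predict: Predictor,
    threshold: float = 0.1,
) -> List[Tuple[int, float]]:
    """Return (example index, divergence) pairs at or above the threshold,
    sorted with the most divergent example first."""
    flagged = []
    for index, state in enumerate(examples):
        divergence = kl_divergence(policy_predict(state), simulator_predict(state))
        if divergence >= threshold:
            flagged.append((index, divergence))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for the real models: lookup tables keyed by state name.
policy_table = {"s0": [0.9, 0.1], "s1": [0.5, 0.5], "s2": [0.1, 0.9]}
simulator_table = {"s0": [0.9, 0.1], "s1": [0.5, 0.5], "s2": [0.9, 0.1]}

flagged = rank_divergent_examples(
    ["s0", "s1", "s2"],
    policy_table.__getitem__,
    simulator_table.__getitem__,
)
for index, divergence in flagged:
    print(f"example {index}: KL = {divergence:.3f}")
```

In the toy data only the third state is flagged, since that is the only one where the two tables disagree. The flagged indices would be the starting point for the closer investigation of underlying mechanisms.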
https://docs.google.com/document/d/1N1lVOXS5bLKYiXfoEeQoxxtI_0EfROi-JXcs-eYTCSA/edit?usp=sharing