The second half of two_diggers.py (currently on #302) is failing because the agents get rewards of 2 and 0, instead of 11 and 110.
Problem 1: In continuous mode, rewards for collecting an item are all being sent to the first client.
Problem 2: In continuous mode, rewards for discarding an item are not being triggered at all when an item is placed. (As in the discrete case, we need `use` to trigger the discarding reward; otherwise there is a reward loophole where an agent repeatedly earns the collecting reward by placing an item and digging it up again.)
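To make the loophole in Problem 2 concrete, here is a toy model of the arithmetic. It is not Malmo code: the function name `total_reward` and the reward values are hypothetical and chosen only for illustration, not taken from two_diggers.py or its mission XML.

```python
# Toy model only: this does not call Malmo. Reward values are assumptions.
COLLECT_REWARD = 10    # hypothetical reward for collecting an item
DISCARD_REWARD = -10   # hypothetical penalty for discarding (placing) it


def total_reward(cycles, discard_triggers):
    """Total reward for one agent after an initial collection followed by
    `cycles` rounds of placing the item and digging it up again."""
    total = COLLECT_REWARD  # the first, legitimate collection
    for _ in range(cycles):
        if discard_triggers:
            total += DISCARD_REWARD  # placing the item fires the discard reward
        total += COLLECT_REWARD      # digging it up fires the collect reward again
    return total


# Discard reward never fires (the behaviour reported in Problem 2):
print(total_reward(5, discard_triggers=False))  # 60
# Discard reward fires on placement: each cycle nets zero.
print(total_reward(5, discard_triggers=True))   # 10
```

If the discarding reward cancels the collecting reward, the place-and-dig cycle gains nothing; if it never fires, the total grows with every cycle.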
timhutton changed the title from "Rewards for picking item all being sent to the first client, in continuous mode." to "Continuous-mode use/attack don't give correct rewards in multi-agent missions." on Aug 25, 2016