[WIP] Non-adaptive Agent Comparisons #276

Merged: 17 commits, Aug 21, 2023

@TimotheeMathieu (Collaborator) commented Jan 24, 2023

Description

This PR introduces a new function, compare_agents.

Given n_agents agents that have each been fitted n_fit times, we evaluate these agents and compare them with a multiple-testing procedure to determine which agents are statistically different and which are not.

Two methods are implemented: Tukey HSD (parametric; assumes the evaluations are Gaussian) and a permutation test with the step-down method (non-parametric; assumes only a finite second moment). The results are illustrated with a boxplot and a heatmap. For Tukey HSD, adjusted p-values are also available to quantify the certainty of the test.
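
To make the non-parametric option concrete, here is a minimal sketch of a step-down permutation test on pairwise mean differences, using only the standard library. It is not the code from this PR: the function name `stepdown_permutation_test`, its parameters, and the choice to permute all evaluations jointly (resampling under the complete null, in the spirit of a max-statistic step-down) are assumptions made for illustration.

```python
import itertools
import random
from statistics import mean


def stepdown_permutation_test(evals, n_perm=2000, alpha=0.05, seed=0):
    """Hypothetical helper: step-down max-statistic permutation test.

    evals maps an agent name to its list of evaluation scores (one per fit).
    Returns a dict mapping each pair of names to "reject" or "accept".
    """
    rng = random.Random(seed)
    names = list(evals)
    sizes = [len(evals[name]) for name in names]
    pooled = [x for name in names for x in evals[name]]

    pairs = list(itertools.combinations(names, 2))
    observed = {
        (a, b): abs(mean(evals[a]) - mean(evals[b])) for a, b in pairs
    }
    # Test hypotheses from the largest observed difference to the smallest.
    remaining = sorted(pairs, key=observed.get, reverse=True)
    decisions = {pair: "accept" for pair in pairs}

    while remaining:
        exceed = 0
        for _ in range(n_perm):
            shuffled = pooled[:]
            rng.shuffle(shuffled)
            # Reassign the shuffled scores to agents, keeping group sizes.
            groups, start = {}, 0
            for name, size in zip(names, sizes):
                groups[name] = shuffled[start:start + size]
                start += size
            # Max statistic over the hypotheses that are still in play.
            max_stat = max(
                abs(mean(groups[a]) - mean(groups[b])) for a, b in remaining
            )
            if max_stat >= observed[remaining[0]]:
                exceed += 1
        p_value = (exceed + 1) / (n_perm + 1)
        if p_value > alpha:
            break  # step-down stops at the first non-rejection
        decisions[remaining.pop(0)] = "reject"
    return decisions
```

Calling it with a dict such as `{"A": [...], "B": [...], "C": [...]}` returns one decision per pair. Because the maximum is taken only over hypotheses not yet rejected, each step compares against a less extreme null distribution than a single-step correction would, which is the usual motivation for stepping down.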

Example:

EDIT: now with a simple text (dataframe) output:

       Agent1 vs Agent2  mean Agent1  mean Agent2   mean diff    std diff decisions     p-val significance
0  A2CAgent vs PPOAgent   213.600875   423.431500 -209.830625  144.600160    reject  0.002048           **
1  A2CAgent vs DQNAgent   213.600875   443.296625 -229.695750  152.368506    reject  0.000849          ***
2  PPOAgent vs DQNAgent   423.431500   443.296625  -19.865125  104.279024    accept  0.926234             
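
The decisions and significance columns above are consistent with the conventional thresholds (reject below 0.05; `*`, `**`, `***` below 0.05, 0.01, 0.001). The sketch below reproduces them from the p-values in the table; the thresholds and the helper names are assumptions inferred from these three rows, not taken from the PR's code.

```python
def significance_stars(p_value):
    """Hypothetical helper: map a p-value to the usual star notation."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


def decision(p_value, alpha=0.05):
    """Hypothetical helper: reject the null of equal means when p < alpha."""
    return "reject" if p_value < alpha else "accept"


# p-values copied from the table above.
p_values = {
    "A2CAgent vs PPOAgent": 0.002048,
    "A2CAgent vs DQNAgent": 0.000849,
    "PPOAgent vs DQNAgent": 0.926234,
}
for pair, p in p_values.items():
    print(f"{pair}: {decision(p)} {significance_stars(p)}")
```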

Still TODO:

  • Be able to use pickle files instead of a list of agent managers
  • Tests

@KohlerHECTOR KohlerHECTOR mentioned this pull request Jul 13, 2023
@TimotheeMathieu TimotheeMathieu merged commit cc84a0f into rlberry-py:main Aug 21, 2023