Feature: Implementing SyntheticContinuousBanditDataset #112

Merged
merged 37 commits into master on Aug 15, 2021

Conversation

usaito
Contributor

@usaito usaito commented Jul 6, 2021

new feature

from obp.dataset import (
    SyntheticContinuousBanditDataset,
    linear_reward_funcion_continuous,
    linear_behavior_policy_continuous,
)

dataset = SyntheticContinuousBanditDataset(
    dim_context=5,
    min_action_value=1,
    max_action_value=10,
    reward_function=linear_reward_funcion_continuous,
    behavior_policy_function=linear_behavior_policy_continuous,
    random_state=12345,
)
bandit_feedback = dataset.obtain_batch_bandit_feedback(n_rounds=10000)
  • Note that min_action_value and max_action_value control the action space (\mathcal{A} = [1, 10] in the example code above).
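
To illustrate what a bounded continuous action space means in practice, here is a minimal NumPy-only sketch. It does not reproduce the code in this PR: the helper `sample_bounded_actions`, its parameters, and the clip-after-noise approach are assumptions made for illustration, and the dataset class may bound actions differently.

```python
import numpy as np


def sample_bounded_actions(
    context, coef, min_action_value, max_action_value, noise_std, rng
):
    """Hypothetical helper: draw one continuous action per round in [min, max]."""
    # Assumed linear behavior policy: the expected action is linear in the context.
    expected_action = context @ coef
    # Add Gaussian noise so that the policy is stochastic.
    noisy_action = expected_action + rng.normal(
        scale=noise_std, size=expected_action.shape
    )
    # Clip so that every sampled action lies inside the action space.
    return np.clip(noisy_action, min_action_value, max_action_value)


rng = np.random.default_rng(12345)
context = rng.normal(size=(10000, 5))  # n_rounds=10000, dim_context=5
coef = rng.uniform(-1, 1, size=5)
action = sample_bounded_actions(context, coef, 1.0, 10.0, 1.0, rng)
```

Clipping piles probability mass onto the two boundaries, so a sketch like this is only suitable for checking shapes and ranges, not for computing action densities.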

tests

refactor

  • rename action_prob to pscore in test_synthetic.py
    pscore = linear_behavior_policy(context=context, action_context=action_context)
    assert pscore.shape[0] == n_rounds and pscore.shape[1] == n_actions
    assert np.all(0 <= pscore) and np.all(pscore <= 1)
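
The assertions above can be tried without the library. The sketch below uses a hypothetical softmax policy, `softmax_behavior_policy`, as a stand-in for `linear_behavior_policy`; the stand-in and its random coefficients are assumptions, not the function tested in this PR.

```python
import numpy as np


def softmax_behavior_policy(context, action_context, rng):
    """Hypothetical stand-in policy: softmax over linear context-action scores."""
    # action_context is used here only to determine the number of actions.
    n_actions = action_context.shape[0]
    # Random coefficients map each context vector to one score per action.
    coef = rng.uniform(size=(context.shape[1], n_actions))
    logits = context @ coef
    # Subtract the row-wise max before exponentiating for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    exp_logits = np.exp(logits)
    return exp_logits / exp_logits.sum(axis=1, keepdims=True)


n_rounds, n_actions, dim_context = 10, 5, 3
rng = np.random.default_rng(12345)
context = rng.normal(size=(n_rounds, dim_context))
action_context = np.eye(n_actions)  # one-hot action features
pscore = softmax_behavior_policy(context, action_context, rng)

# Same checks as in the test, plus a row-sum check for a valid distribution.
assert pscore.shape[0] == n_rounds and pscore.shape[1] == n_actions
assert np.all(0 <= pscore) and np.all(pscore <= 1)
assert np.allclose(pscore.sum(axis=1), 1.0)
```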

@usaito usaito changed the title from "Feature: Implement SyntheticContinuousBanditDataset" to "Feature: Implementing SyntheticContinuousBanditDataset" on Jul 8, 2021
@nomuramasahir0
Contributor

Other than the above minor points, LGTM!

@usaito
Contributor Author

usaito commented Jul 10, 2021

@nmasahiro Thanks!

usaito and others added 7 commits July 14, 2021 23:25

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
Feature: Implementing Continuous OPE Estimators

Feature: Implementing Continuous NN Policy Learner

@usaito usaito merged commit a4b61e9 into master Aug 15, 2021