1.1.3 release #115

Merged
merged 62 commits into from
Aug 2, 2023

Changes from all commits
Commits (62)
1a775c6
add support for 5.3.0
camfairchild Jul 5, 2023
de734f3
add requirement for bittensor_wallet
camfairchild Jul 5, 2023
7b03627
add test helpers
camfairchild Jul 5, 2023
e3a3163
patch wallet factory
camfairchild Jul 5, 2023
9a91bea
fix wandb weights logging
p-ferreira Jul 6, 2023
b6adb58
bump reqs
camfairchild Jul 11, 2023
beb5e84
fix test helpers
camfairchild Jul 11, 2023
6d1a001
add netuid tag to wandb
p-ferreira Jul 12, 2023
a7dd867
Merge pull request #88 from opentensor/fix/weights-wandb-log
p-ferreira Jul 13, 2023
b58df19
Merge pull request #95 from opentensor/fix/netuid-wandb-tag
p-ferreira Jul 13, 2023
86b9934
empty cache after saving
Eugene-hu Jul 13, 2023
219dd8e
Merge remote-tracking branch 'origin/main' into clear_cache
Eugene-hu Jul 14, 2023
96de39e
Use subtensor for meta sync (#79)
camfairchild Jul 15, 2023
b11c379
increase count limit
Eugene-hu Jul 20, 2023
5531e8e
comments
Eugene-hu Jul 24, 2023
d9b6550
Merge pull request #96 from opentensor/clear_cache
Eugene-hu Jul 24, 2023
4fc3a9c
Merge pull request #101 from opentensor/count_increase_limit
Eugene-hu Jul 24, 2023
af4d0e9
make reqs looser
camfairchild Jul 26, 2023
bacd9ee
pin below major instead
camfairchild Jul 26, 2023
c566b85
Merge branch 'staging' into add-bt-5.3.2-compat
camfairchild Jul 26, 2023
027044b
bump (and loosen) datasets to fix python compat
ifrit98 Jul 26, 2023
5c74113
add bittensor bump hotfix for this issue
ifrit98 Jul 26, 2023
1867dad
bump validators version and pin correct bittensor
ifrit98 Jul 26, 2023
9deeb91
fixing spelling
mrseeker Jul 27, 2023
288d76f
added historic diversity
isabella618033 Jul 28, 2023
b287c76
fix
isabella618033 Jul 28, 2023
d7596be
adds task_validator to masking pipeline
p-ferreira Jul 31, 2023
7dfabc4
adds unit tests for task validator
p-ferreira Jul 31, 2023
dbb4760
updates test_event with task validator
p-ferreira Jul 31, 2023
3ac87f7
Merge pull request #105 from opentensor/test_fixes
Eugene-hu Jul 31, 2023
746ca8d
removes comma from masking model initialization
p-ferreira Jul 31, 2023
06dc751
organize keyword order
p-ferreira Jul 31, 2023
327aa9d
complement prompt
p-ferreira Jul 31, 2023
b2ac365
historic similarity check at range 500 - 1500
isabella618033 Jul 31, 2023
33dfde6
adds extra verification for topic shifting on augment use cases
p-ferreira Jul 31, 2023
134769c
sets task validator to be case insensitive
p-ferreira Jul 31, 2023
7062e27
adds hotfix for empty entries in dataset + tests
p-ferreira Aug 1, 2023
c105533
longer history; removed normalization; setting bottom_k = 5
isabella618033 Aug 1, 2023
d861073
remove pip reference from README
p-ferreira Aug 1, 2023
de8d1ed
backwards compat
Eugene-hu Aug 1, 2023
ac309a9
requirements changes
Eugene-hu Aug 1, 2023
c4846ef
=
Eugene-hu Aug 1, 2023
483b9d4
adds changelog to repo
p-ferreira Aug 1, 2023
0dae02c
Merge pull request #110 from mrseeker/prompt-fixes
p-ferreira Aug 1, 2023
dc58c2e
Merge branch 'staging' into features/task-validator-filter
p-ferreira Aug 1, 2023
345e182
Merge pull request #107 from opentensor/add-bt-5.3.2-compat
Eugene-hu Aug 1, 2023
ecc7d5c
keep the original regularization
isabella618033 Aug 1, 2023
068e53c
Update requirements.txt
camfairchild Aug 1, 2023
492cfa3
dataset requirements
Eugene-hu Aug 1, 2023
ae6be1d
Merge pull request #114 from opentensor/hotfix/deprecate-pip-usage
p-ferreira Aug 1, 2023
d3e3e99
Merge pull request #113 from opentensor/hotfix/empty-dataset-entry
p-ferreira Aug 1, 2023
4e6a567
Merge pull request #116 from opentensor/v5.2_compat
Eugene-hu Aug 1, 2023
30c554d
removed init.py
isabella618033 Aug 1, 2023
771e0d7
Merge pull request #111 from opentensor/feature/historic_diversity
isabella618033 Aug 1, 2023
d0b8756
Merge branch 'staging' into hotfix/bump_datasets
ifrit98 Aug 1, 2023
5bf1d74
Merge pull request #109 from opentensor/hotfix/bump_datasets
ifrit98 Aug 1, 2023
10cb6de
Merge pull request #112 from opentensor/features/task-validator-filter
p-ferreira Aug 2, 2023
d34a88b
updates changelog
p-ferreira Aug 2, 2023
b5cac4c
bottom k 5 -> 2
isabella618033 Aug 2, 2023
445541d
Merge pull request #118 from opentensor/feature/historic_diversity
isabella618033 Aug 2, 2023
14e7db4
bug fix
isabella618033 Aug 2, 2023
96210b3
update release date on changelog
p-ferreira Aug 2, 2023
17 changes: 17 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,17 @@
# Changelog

## 1.1.3 / 2023-08-02

### What’s Changed

- Adds subtensor to metagraph sync by @camfairchild in #79
- Fix wandb weights format logging by @p-ferreira in #88
- Adds netuid tag to wandb runs by @p-ferreira in #95
- Implements GPU cleaning for optimization by @Eugene-hu in #96
- Adds compatibility with bittensor 5.3.3 by @camfairchild in #107
- Adds historic diversity component by @isabella618033 in #111
- Improvements on the diversity model by @isabella618033 and @Eugene-hu in #111
- Prompt improvements by @mrseeker in #110 and @p-ferreira in #112
- Adds Task Validator Filter to reward pipeline by @p-ferreira in #112
- Fix for empty data retrieval from datasets by @p-ferreira in #113
- Deprecates pip usage by @p-ferreira in #114
12 changes: 2 additions & 10 deletions README.md
@@ -2,7 +2,6 @@

# **Open Validators** <!-- omit in toc -->
[![Discord Chat](https://img.shields.io/discord/308323056592486420.svg)](https://discord.gg/bittensor)
[![PyPI version](https://badge.fury.io/py/openvalidators.svg)](https://badge.fury.io/py/openvalidators)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

---
@@ -19,7 +18,7 @@ It offers several functionalities, such as:

The main goal of this repository is to facilitate the interaction with the Bittensor network by providing a set of
open-source validators to the community. The current validator implementation queries the network for responses and
evaluations using carefully crafted prompts, that are later evaluated by a large foundation GPT-J reward model.
evaluations using carefully crafted CoT prompts that are later evaluated by a pipeline of reward functions, including diversity, relevance, and RLHF, among others.

Additionally, the repository provides an analysis and data toolkit that allows users to analyze the data generated from
the validator's interaction with the network. By default, the validator collects various data points, such as question
@@ -69,14 +68,7 @@ There are currently four main avenues for engaging with this repository:
- Serves individuals, researchers, and developers who seek to create datasets for the community's miners.

# Install
There are two ways to use OpenTensor validators:

1. With pip:
```bash
$ pip3 install openvalidators
```

2. From source:
From source:
```bash
$ git clone https://github.com/opentensor/validators.git
$ pip3 install -e openvalidators/
```
2 changes: 1 addition & 1 deletion openvalidators/__init__.py
@@ -28,6 +28,6 @@
from . import weights
from . import event

__version__ = "1.1.2"
__version__ = "1.1.3"
version_split = __version__.split(".")
__spec_version__ = (1000 * int(version_split[0])) + (10 * int(version_split[1])) + (1 * int(version_split[2]))
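For a quick sanity check, the spec-version formula above maps release 1.1.3 to the integer 1013; a minimal, self-contained sketch of the same arithmetic:

```python
# Sanity check of the __spec_version__ arithmetic from the diff above.
version_split = "1.1.3".split(".")
spec_version = (1000 * int(version_split[0])) + (10 * int(version_split[1])) + (1 * int(version_split[2]))
print(spec_version)  # 1013
```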
6 changes: 6 additions & 0 deletions openvalidators/config.py
@@ -241,6 +241,12 @@ def add_args(cls, parser):
action="store_true",
help="Dont apply the diversity reward model",
default=False,
)
parser.add_argument(
"--neuron.task_validator_off",
action="store_true",
help="Dont apply the task validator reward model",
default=False,
)

parser.add_argument(
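To illustrate how the new --neuron.task_validator_off switch behaves, here is a minimal standalone argparse sketch; the real flag is registered on the neuron's bittensor config parser, so the bare parser below is only an assumption for illustration:

```python
# Minimal sketch of the new flag; assumes a bare argparse parser rather than the neuron's full config.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--neuron.task_validator_off",
    action="store_true",
    help="Dont apply the task validator reward model",
    default=False,
)

config = parser.parse_args(["--neuron.task_validator_off"])
# argparse keeps the dotted option name, so the value is read back with getattr.
print(getattr(config, "neuron.task_validator_off"))  # True -> neuron.py swaps in a MockRewardModel instead of TaskValidator
```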
16 changes: 11 additions & 5 deletions openvalidators/dataset.py
@@ -27,11 +27,17 @@ def __init__(self):
self.openwebtext = iter( load_dataset("openwebtext", split="train", streaming=True).shuffle(seed=seed, buffer_size=10000) )
self.red_pajama = iter( load_dataset("togethercomputer/RedPajama-Data-1T", 'default', split='train', streaming=True).shuffle(seed=seed, buffer_size=10000) )

def __next__(self):
if random.random() < 0.5:
return {"text": next(self.openwebtext)["text"]}
else:
return {"text": next(self.red_pajama)["text"]}
def __next__(self):
while True:
bt.logging.debug('Retrieving data from dataset...')
if random.random() < 0.5:
text = next(self.openwebtext)["text"]
else:
text = next(self.red_pajama)["text"]

# Check if the text is not empty or does not consist only of newline characters
if text.strip():
return {"text": text}


class MockDataset(Iterator):
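The new __next__ loop simply keeps drawing samples until one survives text.strip(); a rough sketch of that guard, with a plain in-memory iterator standing in for the streaming HuggingFace datasets:

```python
# Sketch of the empty-entry guard added to Dataset.__next__ (assumption: a list stands in for the streamed datasets).
samples = iter([{"text": "\n\n"}, {"text": "   "}, {"text": "A real passage of text."}])

def next_non_empty(source):
    while True:
        text = next(source)["text"]
        # Skip entries that are empty or consist only of whitespace/newlines.
        if text.strip():
            return {"text": text}

print(next_non_empty(samples))  # {'text': 'A real passage of text.'}
```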
2 changes: 2 additions & 0 deletions openvalidators/event.py
@@ -44,6 +44,7 @@ class EventSchema:
rlhf_reward_model: Optional[List[float]] # Output vector of the rlhf reward model
prompt_reward_model: Optional[List[float]] # Output vector of the prompt reward model
relevance_filter: Optional[List[float]] # Output vector of the relevance scoring reward model
task_validator_filter: Optional[List[float]]

# Weights data
set_weights: Optional[List[List[float]]]
@@ -54,6 +55,7 @@ def from_dict(event_dict: dict, disable_log_rewards: bool) -> 'EventSchema':
rewards = {
'dahoas_reward_model': event_dict.get(RewardModelType.dahoas.value),
'blacklist_filter': event_dict.get(RewardModelType.blacklist.value),
'task_validator_filter': event_dict.get(RewardModelType.task_validator.value),
'nsfw_filter': event_dict.get(RewardModelType.nsfw.value),
'relevance_filter': event_dict.get(RewardModelType.relevance.value),
'reciprocate_reward_model': event_dict.get(RewardModelType.reciprocate.value),
12 changes: 5 additions & 7 deletions openvalidators/forward.py
@@ -52,15 +52,13 @@ def get_random_uids(self, k: int, exclude: List[int] = None) -> torch.LongTensor
avail_uids.append(uid)
if uid_is_not_excluded:
candidate_uids.append(uid)

# Check if candidate_uids contain enough for querying, if not grab all avaliable uids
if len(candidate_uids) > k:
available_uids = torch.tensor(candidate_uids, dtype=torch.int64).to(self.device)
else:
available_uids = torch.tensor(avail_uids, dtype=torch.int64).to(self.device)

available_uids = candidate_uids
if len(candidate_uids) < k:
available_uids += random.sample([uid for uid in avail_uids if uid not in candidate_uids], k-len(candidate_uids))

uids = torch.tensor(random.sample(available_uids.tolist(), k), dtype=torch.int64)
uids = torch.tensor(random.sample(available_uids, k), dtype=torch.int64)
return uids


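To make the revised sampling concrete: instead of falling back to the full avail_uids list, the new code tops candidate_uids up with randomly chosen non-candidate uids until k are available. A toy run of that logic (the UID values are invented for illustration):

```python
import random
import torch

# Toy illustration of the top-up behaviour in get_random_uids (hypothetical uid values).
k = 4
candidate_uids = [3, 7]            # serving and not excluded
avail_uids = [1, 2, 3, 5, 7, 8]    # every available uid, including excluded ones

available_uids = candidate_uids
if len(candidate_uids) < k:
    # Fill the shortfall with uids that are available but were not candidates.
    available_uids += random.sample(
        [uid for uid in avail_uids if uid not in candidate_uids], k - len(candidate_uids)
    )

uids = torch.tensor(random.sample(available_uids, k), dtype=torch.int64)
print(uids)  # e.g. tensor([7, 1, 3, 8])
```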
45 changes: 30 additions & 15 deletions openvalidators/neuron.py
@@ -34,6 +34,7 @@
# Load gating models
from openvalidators.reward import (
Blacklist,
TaskValidator,
NSFWRewardModel,
OpenAssistantRewardModel,
ReciprocateRewardModel,
@@ -62,6 +63,10 @@ def config(cls):
def run(self):
run(self)

subtensor: "bt.subtensor"
wallet: "bt.wallet"
metagraph: "bt.metagraph"

def __init__(self):
self.config = neuron.config()
self.check_config(self.config)
@@ -84,12 +89,15 @@ def __init__(self):
self.wallet = bt.wallet(config=self.config)
self.wallet.create_if_non_existent()
if not self.config.wallet._mock:
self.wallet.reregister(subtensor=self.subtensor, netuid=self.config.netuid)
if not self.subtensor.is_hotkey_registered_on_subnet(hotkey_ss58=self.wallet.hotkey.ss58_address, netuid=self.config.netuid):
raise Exception(f'Wallet not currently registered on netuid {self.config.netuid}, please first register wallet before running')

bt.logging.debug(str(self.wallet))

# Init metagraph.
bt.logging.debug("loading", "metagraph")
self.metagraph = bt.metagraph(netuid=self.config.netuid, network=self.subtensor.network)
self.metagraph = bt.metagraph(netuid=self.config.netuid, network=self.subtensor.network, sync=False) # Make sure not to sync without passing subtensor
self.metagraph.sync(subtensor=self.subtensor) # Sync metagraph with subtensor.
self.hotkeys = copy.deepcopy(self.metagraph.hotkeys)
bt.logging.debug(str(self.metagraph))

@@ -180,24 +188,31 @@ def __init__(self):

bt.logging.error(message)
raise Exception(message)


# Masking functions
self.blacklist = (
Blacklist() if not self.config.neuron.blacklist_off else MockRewardModel(RewardModelType.blacklist.value)
)
task_validator = (
TaskValidator() if not self.config.neuron.task_validator_off
else MockRewardModel(RewardModelType.task_validator.value)
)
relevance_model = (
RelevanceRewardModel(device=self.device) if not self.config.neuron.relevance_off
else MockRewardModel(RewardModelType.relevance.value)
)
diversity_model = (
DiversityRewardModel(device=self.device) if not self.config.neuron.diversity_off
else MockRewardModel(RewardModelType.diversity.value)
)
nsfw_model = (
NSFWRewardModel(device=self.device) if not self.config.neuron.nsfw_off
else MockRewardModel(RewardModelType.nsfw.value)
)

self.masking_functions = [
self.blacklist,
RelevanceRewardModel(device=self.device)
if not self.config.neuron.relevance_off
else MockRewardModel(RewardModelType.relevance.value),
DiversityRewardModel(device=self.device)
if not self.config.neuron.diversity_off
else MockRewardModel(RewardModelType.diversity.value),
NSFWRewardModel(device=self.device)
if not self.config.neuron.nsfw_off
else MockRewardModel(RewardModelType.nsfw.value),
]
self.masking_functions = [self.blacklist, task_validator, relevance_model, diversity_model, nsfw_model]
bt.logging.debug(str(self.reward_functions))
bt.logging.debug(str(self.masking_functions))

# Init the event loop.
self.loop = asyncio.get_event_loop()
10 changes: 5 additions & 5 deletions openvalidators/prompts.py
@@ -124,7 +124,7 @@ def find_unique_tags(input_text: str):


# Request a follow-up question given a preceding context.
followup_request_template = "Ask one relevant and insightful question about the preceding context"
followup_request_template = "Ask a single relevant and insightful question about the preceding context"

# Scores a summary on a scale from 0 to 10, given a context.
augment_scoring_template = """Score the relevance, succinctness, and quality of a summary given a context. The context is within <Context></Context> tags, and the question is within <Summary></Summary> tags. Give a score between 0 and 10 in the <Score></Score> tags, where 0 means the summary is irrelevant, and 10 means it's perfectly relevant and a good summary. Include a brief explanation for your score based solely on the context-summary relationship.
@@ -348,16 +348,16 @@ def find_unique_tags(input_text: str):

def followup_prompt( base_text:str, i:int = 0) -> str:
if i == 0:
return f"{base_text}\n\n{followup_request_template}\n"
return f"{base_text}\n\n{followup_request_template}\n. Do not try to return an answer or a summary:"
else:
return f"{base_text}\n\n{followup_request_template} and previous questions\n"
return f"{base_text}\n\n{followup_request_template} and previous questions. Do not try to return an answer or a summary:\n"


def answer_prompt( base_text:str, followup:str ) -> str:
return f"{base_text}\n Question:{followup}\n Answer the question step by step and explain your thoughts"
return f"{base_text}\n\nQuestion:{followup}\nAnswer the question step by step and explain your thoughts. Do not include questions or summaries in your answer."

augment_request_template = "Summarize the preceding context"

def augment_prompt( base_text:str ) -> str:
random_level = random.randint(4, 8)
return f"{base_text}\n\n{augment_request_template} in {random_level} sentences.\n\n"
return f"{base_text}\n\n{augment_request_template} in {random_level} sentences. Do not try to create questions or answers for your summarization.\n\n"
3 changes: 2 additions & 1 deletion openvalidators/reward/__init__.py
@@ -1,4 +1,5 @@
from .blacklist import Blacklist
from .task_validator import TaskValidator
from .nsfw import NSFWRewardModel
from .open_assistant import OpenAssistantRewardModel
from .reciprocate import ReciprocateRewardModel
@@ -8,4 +9,4 @@
from .dahoas import DahoasRewardModel
from .diversity import DiversityRewardModel
from .prompt import PromptRewardModel
from .config import RewardModelType, DefaultRewardFrameworkConfig
from .config import RewardModelType, DefaultRewardFrameworkConfig
1 change: 1 addition & 0 deletions openvalidators/reward/config.py
@@ -28,6 +28,7 @@ class RewardModelType(Enum):
blacklist = 'blacklist_filter'
nsfw = 'nsfw_filter'
relevance = 'relevance_filter'
task_validator = 'task_validator_filter'


@dataclass(frozen=True)
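The new enum member is the string key that EventSchema.from_dict uses when collecting reward-model outputs from an event dict. A small standalone sketch of that lookup (the enum here is a trimmed stand-in for the real RewardModelType):

```python
from enum import Enum

# Trimmed stand-in for openvalidators.reward.config.RewardModelType (illustration only).
class RewardModelType(Enum):
    blacklist = 'blacklist_filter'
    nsfw = 'nsfw_filter'
    relevance = 'relevance_filter'
    task_validator = 'task_validator_filter'

# EventSchema.from_dict reads each model's scores out of the event dict by enum value.
event_dict = {'task_validator_filter': [1.0, 0.0, 1.0]}
print(event_dict.get(RewardModelType.task_validator.value))  # [1.0, 0.0, 1.0]
```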
59 changes: 53 additions & 6 deletions openvalidators/reward/diversity.py
@@ -56,6 +56,9 @@ def __init__( self, device: str ):
self.tokenizer = AutoTokenizer.from_pretrained( DiversityRewardModel.diversity_model_path )
self.model = AutoModel.from_pretrained( DiversityRewardModel.diversity_model_path ).to(self.device)
self.reward_quantile = torch.tensor(0.1).to(self.device)
self.history_reward_bottom_k = 2
self.historic_embeddings = torch.tensor([]).to(self.device)
self.history_range = (500, 15500)

def get_embeddings( self, sentences: List[str] ) -> "torch.FloatTensor":
"""Runs a forward pass through the model.
@@ -86,20 +89,64 @@ def get_embeddings( self, sentences: List[str] ) -> "torch.FloatTensor":
sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)
return sentence_embeddings

def get_rewards( self, prompt: str, completions: List[str], name: str ) -> torch.FloatTensor:
def update_historic_embeddings( self, embeddings: torch.FloatTensor ):
def unique(embeddings):
unique_embeddings = [embeddings[0]]
last_emb = embeddings[0]
for emb in embeddings:
if not torch.all(torch.eq(emb, last_emb)):
unique_embeddings.append(emb)
last_emb = emb
return torch.stack(unique_embeddings)

embeddings_unique = unique(embeddings)
historic_embeddings = torch.cat([self.historic_embeddings, embeddings_unique])
self.historic_embeddings = historic_embeddings[-self.history_range[1]:, :]

def get_historic_rewards( self, embeddings: torch.FloatTensor ) -> torch.FloatTensor:
def regularise( rewards ):
# sigmoid function that cutoff at 0.05 approximately
return 1/(1 + torch.exp(-1000 * rewards + 50))

# Return None if history size is too small
if self.historic_embeddings.shape[0] < (self.history_range[0] + self.history_reward_bottom_k):
return None

# Calculate the pairwise cosine similarity.
similarity = pairwise_cosine_similarity( embeddings, self.historic_embeddings[self.history_range[0]:] )

# Reward to be at the bottom_k smallest of the 1 - similarity score.
rewards = torch.topk((1 - similarity), self.history_reward_bottom_k, largest = False)[0][:, -1]

return regularise(rewards)

def get_batch_rewards( self, embeddings: torch.FloatTensor ) -> torch.FloatTensor:
# Calculate the pairwise cosine similarity.
similarity = pairwise_cosine_similarity( embeddings, embeddings )

# Reward to be at the 10% quantile of the 1 - similarity score.
rewards = (1 - similarity).quantile(self.reward_quantile, dim = 1 )

return rewards

def get_rewards( self, prompt: str, completions: List[str], name: str ) -> torch.FloatTensor:
# Check if completions are empty, return 0 if so
if len(completions) == 0:
return torch.tensor([]).to(self.device)

# Get embeddings for all completions.
embeddings = self.get_embeddings( completions )

# Calculate the pairwise cosine similarity.
similarity = pairwise_cosine_similarity( embeddings, embeddings )
# Get batch rewards.
batch_rewards = self.get_batch_rewards(embeddings)

# Reward to be at the 10% quantile of the 1 - similarity score.
rewards = (1 - similarity).quantile(self.reward_quantile, dim = 1 )
# get historic rewards.
historic_rewards = self.get_historic_rewards(embeddings)

self.update_historic_embeddings(embeddings)

# Return all
return rewards
if historic_rewards != None:
return batch_rewards * historic_rewards
else:
return batch_rewards
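The regularise helper above is what makes the historic component nearly binary in practice: 1 / (1 + exp(-1000*r + 50)) is roughly 0 for bottom-k dissimilarity scores well below 0.05, exactly 0.5 at 0.05, and close to 1 above it. A quick numeric check (input values chosen for illustration):

```python
import torch

# Numeric check of the historic-diversity regulariser from the diff above.
def regularise(rewards: torch.Tensor) -> torch.Tensor:
    # Steep sigmoid centred at 0.05: scores below the cutoff are squashed towards 0.
    return 1 / (1 + torch.exp(-1000 * rewards + 50))

r = torch.tensor([0.03, 0.05, 0.07, 0.20])
print(regularise(r))  # approximately tensor([0.0000, 0.5000, 1.0000, 1.0000])
```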
2 changes: 1 addition & 1 deletion openvalidators/reward/reward.py
@@ -35,7 +35,7 @@ def __init__(self) -> None:
self.count = 0
self.mean = 0.0
self.var = 0.0
self.count_limit = 1000
self.count_limit = 3000

def normalize_rewards( self, rewards: torch.FloatTensor ) -> torch.FloatTensor:
"""