Some Extra Comments #1
Comments
Hey @nablabits, thanks for reaching out! And thanks for trying it out. I'm going to dig into the two mental models (num_beams and the search) and get back to you. Thanks for the contrib!
Hello there @nablabits. SOLVED.
If we go to the generate num_beams arg definition: if it is not declared as a parameter, there is no beam search. From the developer guide, what I'm taking away - please correct me if I'm wrong - is that beam search reduces the risk of missing high-probability word sequences and eventually chooses the one with the highest probability. The figure displayed at the end of the section shows that it is "less surprising"; the other methods may select another word, not the highest-probability one.
Then we go to the SequenceBiasLogitsProcessor documentation: the bias is applied to the last token of a sequence when the next generated token can complete it. Consequently, to get the most out of biasing sequences with more than one token, consider using beam methods (to gracefully work around partially completed sequences that have a negative bias) and applying the bias to their prefixes (to ensure the bias is applied earlier).
Also, at a high level, what I'm understanding is that the class we are exploring, NoBadWordsLogitsProcessor, is a subtype of SequenceBiasLogitsProcessor in which that bias is applied with the sequence bias set to -inf --- not really sure if it's because of the bad_words_ids or because of the beam search sampling, though. So what I think is happening is that beam search is selecting the "less surprising and most probable generation" and then the bad_words_ids argument removes the banned words from it.
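If it helps, here is a minimal sketch of how I picture that relationship (this assumes a recent transformers version where NoBadWordsLogitsProcessor is built on top of SequenceBiasLogitsProcessor; the token ids below are made up):
from transformers import NoBadWordsLogitsProcessor, SequenceBiasLogitsProcessor

bad_words_ids = [[2726], [6260]]  # hypothetical token ids for the banned words
eos_token_id = 50256              # GPT-2's eos token id, just as an example

# Banning words this way...
ban_processor = NoBadWordsLogitsProcessor(bad_words_ids, eos_token_id=eos_token_id)

# ...is conceptually the same as biasing every banned sequence to -inf
sequence_bias = {tuple(ids): float("-inf") for ids in bad_words_ids}
bias_processor = SequenceBiasLogitsProcessor(sequence_bias)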
Wow, thanks for sharing those resources. My understanding is that the beam method computes the joint probability for all the words in the beam, making some relevant tokens appear even if they are followed by banned ones (which fits your "less surprising and most probable generation" to some extent), whereas on a greedy search, once you have a sink in your sequence you are lost. Does this make sense to you? I tried your new example and it's still not working on my end, but I think I found why: it has to do with inputs = tokenizer(["Margaret is outstanding, well known as a"], return_tensors="pt")
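For completeness, this is roughly the setup the snippets below assume (the gpt2 checkpoint and the prefix-space tokenizer are my assumptions, following the HF docs example):
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
# A second tokenizer that prepends a space, so single words tokenize
# the way they appear mid-sentence
tokenizer_with_prefix_space = AutoTokenizer.from_pretrained("gpt2", add_prefix_space=True)

inputs = tokenizer(["Margaret is outstanding, well known as a"], return_tensors="pt")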
# SequenceBiasLogitsProcessor way
def get_tokens_as_tuple(word):
    """Return the token ids of `word` (tokenized with a leading space) as a tuple."""
    return tuple(tokenizer_with_prefix_space([word], add_special_tokens=False).input_ids[0])

# Biasing both words to -inf means they can never be generated
sequence_bias = {
    get_tokens_as_tuple("great"): float("-inf"),
    get_tokens_as_tuple("writer"): float("-inf"),
}
biased_ids = model.generate(inputs["input_ids"], max_new_tokens=3, sequence_bias=sequence_bias)
print(tokenizer.batch_decode(biased_ids, skip_special_tokens=True)[0])
# NoBadWordsLogitsProcessor flavour
def get_tokens_as_list(word_list):
    """Convert a string of space-separated words into a list of token id lists."""
    tokens_list = []
    for word in word_list.split(" "):
        tokenized_word = tokenizer_with_prefix_space([word], add_special_tokens=False).input_ids[0]
        tokens_list.append(tokenized_word)
    return tokens_list

word_list = "great writer"
bad_words_ids = get_tokens_as_list(word_list=word_list)
badwords_ids2 = model.generate(inputs["input_ids"], max_new_tokens=3, bad_words_ids=bad_words_ids, eos_token_id=tokenizer_with_prefix_space.eos_token_id)
print(tokenizer.batch_decode(badwords_ids2, skip_special_tokens=True)[0])
You're welcome!
Hey, thanks! Now it works. How would you feel if I add you to the PR as a contributor somehow, so we present this jointly? You solved the one-word sequence in the function (I think the cake example solves the multiple-word sequence), provided useful feedback, and kindly tried all the stuff. LMK if that works for you. If it does, let me find a way (a joint PR, or adding you as a reviewer; I have to check how to set it up in the PR).
" You come into my repo , the day we close the first good issue and you ask me to do a future collaboration - in the Open , hopefully " Count on it :) @nablabits |
I really liked the care you put into defining the problem (and the picture!), thanks for sharing.
Sadly, I didn't manage to get the example working on my side. Adding num_beams=6 like gante did in his example (see the sketch below) seems to remove the words; not really sure if it's because of the bad_words_ids or because of the beam search sampling, though. (I decided to open this thread here so that, if we further discuss/learn on this topic, we won't flood the main thread on HF; hope it makes sense.)
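For reference, by "adding num_beams=6" I mean something along these lines (reusing the setup and bad_words_ids from the snippets above; the exact values are my assumption, not gante's original example):
# Beam search keeps several candidate continuations alive, so it can route
# around the banned sequences instead of getting stuck on the single greedy path
beam_ids = model.generate(
    inputs["input_ids"],
    max_new_tokens=3,
    num_beams=6,
    bad_words_ids=bad_words_ids,
    eos_token_id=tokenizer_with_prefix_space.eos_token_id,
)
print(tokenizer.batch_decode(beam_ids, skip_special_tokens=True)[0])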