Error trying to train, WordCountVectorizer missing parameter $maxDocumentFrequency #3

Closed
bavamont opened this issue Jul 7, 2020 · 3 comments
Assignees
Labels
bug Something isn't working

Comments

@bavamont

bavamont commented Jul 7, 2020

I am getting this error when I try to train using your train.php example (https://github.com/RubixML/Sentiment/blob/master/train.php):
Fatal error: Uncaught TypeError: Argument 3 passed to Rubix\ML\Transformers\WordCountVectorizer::__construct() must be of the type int, object given....

In your example on Line 44 you have:
new WordCountVectorizer(10000, 3, new NGram(1, 2)),

But the constructor for WordCountVectorizer expects this:
public function __construct(
    int $maxVocabulary = PHP_INT_MAX,
    int $minDocumentFrequency = 1,
    int $maxDocumentFrequency = PHP_INT_MAX,
    ?Tokenizer $tokenizer = null
)
What would be your recommended parameters for WordCountVectorizer for your example to work best?

@andrewdalpino
Member

Good catch! Did you upgrade versions recently? We added the $maxDocumentFrequency parameter in 0.1.0-rc5. Thanks for the reminder, I am going to update the train script!

Let's try a setting of 5000 for maxDocumentFrequency. Let me know if you get better results with a different setting.
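
For anyone hitting the same error, here is a minimal sketch of the mismatch and the adjusted call. It uses hypothetical stand-in classes (`WordCountVectorizerStub`, plus a stub `Tokenizer` and `NGram`) that only mirror the constructor signature quoted in this issue, so it runs without Rubix ML installed. The real library classes do far more, and 5000 is only the starting value suggested in this thread, not a tuned setting.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins for the library types (assumption: they only
// mirror the constructor signature quoted in this issue).
interface Tokenizer
{
}

final class NGram implements Tokenizer
{
    public $min;
    public $max;

    public function __construct(int $min = 1, int $max = 2)
    {
        $this->min = $min;
        $this->max = $max;
    }
}

final class WordCountVectorizerStub
{
    public $maxVocabulary;
    public $minDocumentFrequency;
    public $maxDocumentFrequency;
    public $tokenizer;

    public function __construct(
        int $maxVocabulary = PHP_INT_MAX,
        int $minDocumentFrequency = 1,
        int $maxDocumentFrequency = PHP_INT_MAX,
        ?Tokenizer $tokenizer = null
    ) {
        $this->maxVocabulary = $maxVocabulary;
        $this->minDocumentFrequency = $minDocumentFrequency;
        $this->maxDocumentFrequency = $maxDocumentFrequency;
        $this->tokenizer = $tokenizer;
    }
}

// Old call shape from train.php: the tokenizer lands in the third
// argument, which is now declared as int, so PHP throws a TypeError.
$oldCallFailed = false;

try {
    new WordCountVectorizerStub(10000, 3, new NGram(1, 2));
} catch (TypeError $e) {
    $oldCallFailed = true;
}

// Adjusted call: pass maxDocumentFrequency (5000 here) as the third
// argument and move the tokenizer to the fourth position.
$vectorizer = new WordCountVectorizerStub(10000, 3, 5000, new NGram(1, 2));

echo $oldCallFailed ? "old call rejected\n" : "old call accepted\n";
```

In the real train.php the same reordering would apply to `new WordCountVectorizer(...)`; whether 5000 is the best cut-off depends on the dataset.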

Also, you're welcome to join our channel on Telegram: https://t.me/RubixML

@andrewdalpino andrewdalpino added the bug Something isn't working label Jul 8, 2020
@andrewdalpino andrewdalpino self-assigned this Jul 8, 2020
@andrewdalpino
Member

This should be fixed in the latest update (d076d86).

Thanks again @bavamont!

@andrewdalpino andrewdalpino changed the title Error trying to train using your train.php example Error trying to train, WordCountVectorizer missing parameter $maxDocumentFrequency Jul 8, 2020
@bavamont
Author

bavamont commented Jul 8, 2020

Thank you @andrewdalpino !
I’ll try it with 5000 for maxDocumentFrequency.
Thanks again!
