Royce branch #109
Open
xKimChip wants to merge 93 commits into Mondego:master from 26dre:royce_branch
…website token sets, the same pattern can be modeled for other global data structures; changes made are non-destructive in theory, as everything that should have been implemented has been commented out and is currently behind an if true block
Fixed mutexing issues
…wler4py into royce_branch
…nst whether the URL is similar and whether to evaluate a URL based on its path similarity
Added changes mostly for checking URL similarity and giving it a score
…tion that can be used anywhere and should likely be extracted out into its own separate file for testing
…firm that it works, raised the global URL similarity threshold from 0.8 to 0.85
Merging changes from link_similarity.py and the new test_suite.py into master branch, non-destructive change
put the ngrams in their own module to allow for easier code readability
… make the code base easier and more logical to read; will have to add changes in ngrams.py to allow reading and writing the ngrams globals there instead of in globals.py. Only ngrams alters and accesses these variables, so it's more logical to include them there
update logging
…ons and the like; the testing was done in main and can be uncommented. Two additional files are needed for testing, and there are two hash functions to choose between: my own string hashing function (extremely basic and does not take order into account) and Python's hash. The choice is declared as a global variable that is intended only to be read and never changed (treat as const)
fixed ngrams; turns out I was doing some wonky stuff with additi…
added some extra safety checks and made everything accessible via mul…
added some basic changes to the scraper so that it works through more regex
added one extra safety check inside of the worker
…y checking to make sure that the URLs it gets are not already being looked at concurrently
Andre branch
…e of readability. Also found the < operator that was breaking the code, so provided the fix
… no other changes made
…es a function with the same body but then utilizes a lock for thread safety
Several changes to globals.py, ngrams.py, scraper.py
…requires creating the main function, but almost everything is already in place such that it can work
…D or OR (no support yet for NOT)
…ed by sets and the like, no functional changes made to the code of the class; also makes the pickled file a global variable
M2 Completion
…d of a multiplier.
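Several commits above mention a lock-protected variant of an existing function and a check that the URLs a worker receives are not already being looked at by another worker. The branch's actual code is not shown in this conversation, so the following is only a minimal sketch of that pattern; the names `InFlightUrls`, `try_claim`, and `release` are hypothetical and do not come from this branch.

```python
import threading


class InFlightUrls:
    """Hypothetical helper: tracks URLs that some worker is currently processing."""

    def __init__(self):
        self._lock = threading.Lock()
        self._urls = set()

    def try_claim(self, url):
        """Return True if the caller now owns `url`, False if another worker has it."""
        # The membership test and the add happen under one lock,
        # so two workers can never both claim the same URL.
        with self._lock:
            if url in self._urls:
                return False
            self._urls.add(url)
            return True

    def release(self, url):
        """Mark `url` as finished so it may be claimed again later."""
        with self._lock:
            self._urls.discard(url)
```

A worker would call `try_claim(url)` before downloading and skip the URL when it returns False, then call `release(url)` in a `finally` block so a failed download does not leave the URL claimed forever.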
No description provided.
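Since the pull request has no description, here is a rough illustration of the URL similarity check that the commit messages refer to. Only the 0.85 threshold comes from the commits; scoring with `difflib.SequenceMatcher`, comparing paths only, and the function names `path_similarity` and `is_near_duplicate` are assumptions made for this sketch, not the method used in link_similarity.py.

```python
from difflib import SequenceMatcher
from urllib.parse import urlparse

# 0.85 is the threshold named in the commit message; the rest is assumed.
SIMILARITY_THRESHOLD = 0.85


def path_similarity(url_a, url_b):
    """Score two URLs in [0, 1] by comparing their path components as strings."""
    path_a = urlparse(url_a).path
    path_b = urlparse(url_b).path
    return SequenceMatcher(None, path_a, path_b).ratio()


def is_near_duplicate(url_a, url_b, threshold=SIMILARITY_THRESHOLD):
    """True when both URLs share a host and their paths score at or above the threshold."""
    if urlparse(url_a).netloc != urlparse(url_b).netloc:
        return False
    return path_similarity(url_a, url_b) >= threshold
```

With this scoring, calendar-style pages that differ only in a trailing date digit land well above the threshold, while unrelated paths on the same host fall far below it. Raising the threshold from 0.8 to 0.85 makes the check stricter, so fewer URLs are treated as similar.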