Register seeds sent in via AMQP #136

Merged on Jan 13, 2016 (1 commit)

Conversation

anjackson
Collaborator

The original implementation allows the submitted URLs to be marked as seeds, but they are not properly registered as such and so things like SURT-prefix scoping don't pick them up.

These modifications wire the seeds in and add new seeds properly rather than just popping them in the frontier.
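To illustrate the distinction described above, here is a toy sketch in Python. It is not Heritrix code: `ToyCrawler`, `schedule_only`, `add_seed`, and the simplified `surt` helper are all hypothetical names, and the SURT conversion is deliberately reduced to reversing the host labels. It only shows why a URL that is queued but never registered as a seed does not widen a prefix-based scope.

```python
from urllib.parse import urlsplit


def surt(url):
    """Very simplified SURT-style form (assumption: real SURT rules are richer)."""
    parts = urlsplit(url)
    host = ",".join(reversed(parts.hostname.split(".")))
    path = parts.path or "/"
    return f"{parts.scheme}://({host},){path}"


def surt_prefix(url):
    """Prefix implied by a seed: the SURT form cut after the last '/'."""
    form = surt(url)
    return form[: form.rfind("/") + 1]


class ToyCrawler:
    """Hypothetical stand-in for a crawler with a frontier and a prefix scope."""

    def __init__(self):
        self.frontier = []
        self.scope_prefixes = set()

    def schedule_only(self, url):
        # Old behaviour as described: the URL is queued, the scope is untouched.
        self.frontier.append(url)

    def add_seed(self, url):
        # New behaviour as described: register the seed, then queue it.
        self.scope_prefixes.add(surt_prefix(url))
        self.frontier.append(url)

    def in_scope(self, url):
        candidate = surt(url)
        return any(candidate.startswith(p) for p in self.scope_prefixes)


crawler = ToyCrawler()
crawler.schedule_only("http://example.com/a/index.html")
queued_only = crawler.in_scope("http://example.com/a/page.html")

crawler.add_seed("http://example.com/a/index.html")
after_seed = crawler.in_scope("http://example.com/a/page.html")
```

In this sketch the sibling page is out of scope after `schedule_only` and in scope after `add_seed`, which is the gap the pull request describes closing.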

@nlevitt
Contributor

nlevitt commented Jan 13, 2016

I don't remember being very aware of the support for isSeed and forceFetch parameters on incoming AMQP messages that snuck in as part of #128. ;) What are you guys using these for and how are you setting them?

@anjackson
Collaborator Author

Sorry! Didn't mean to sneak anything in! I'm moving towards using the AMQP channel as a general method for setting up and feeding long-running crawls. For example, we want to run regularly crawled seeds through the web renderer before passing them to H3 (and in general want to use queues more to manage chains of processes), so it's helpful if the AMQP hook lets us submit seeds (to extend scope) and force fetching (so we can get new content links from changes in seed pages).

We can just fork the module and keep it elsewhere if it's causing problems, but I thought it might be useful for others.
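As a rough illustration of the use case above, a message submitting a seed with forced fetching might be built like this. The `isSeed` and `forceFetch` names come from this thread; the JSON encoding, the `url` key, and the `build_seed_message` helper are assumptions made for the sketch. Publishing is left out, since the exchange and routing key depend on the deployment.

```python
import json


def build_seed_message(url, is_seed=True, force_fetch=True):
    """Build a message body for the AMQP hook (hypothetical helper).

    Assumption: the receiver reads a JSON object with a "url" key
    alongside the isSeed and forceFetch parameters named in this thread.
    """
    return json.dumps(
        {
            "url": url,
            "isSeed": is_seed,          # ask for the URL to be registered as a seed
            "forceFetch": force_fetch,  # ask for a refetch even if already crawled
        }
    )


body = build_seed_message("http://example.com/")
```

The resulting `body` string would then be handed to whatever AMQP client feeds the crawler's queue.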

nlevitt added a commit that referenced this pull request Jan 13, 2016
@nlevitt nlevitt merged commit df5748d into internetarchive:master Jan 13, 2016
@nlevitt
Contributor

nlevitt commented Jan 13, 2016

Cool! Glad the class is useful beyond its original intended purpose.
