
Explore use of Scrapy contracts for lightweight testing of spiders #270

Closed

jpmckinney opened this issue Jan 30, 2020 · 1 comment

@jpmckinney (Member)

https://docs.scrapy.org/en/latest/topics/contracts.html

We don't need to closely test the spiders, but it would be useful to be able to run scrapy check and quickly find out which spiders are broken in obvious ways (e.g. can't even complete the first request).
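For reference, a contract-annotated callback looks like the sketch below. The URL and expected counts are placeholders, not taken from any spider in this repo. scrapy check works by reading @-prefixed lines from the callback's docstring; the second function is a minimal, Scrapy-free illustration of that parsing step, not Scrapy's actual implementation.

```python
def parse(response):
    """Parse a publisher's release list (hypothetical example).

    @url https://example.com/releases.json
    @returns items 1
    @returns requests 0 0
    """
    # A real Scrapy callback would yield scraped items and/or follow-up requests.
    yield {"url": response.url}


def extract_contracts(callback):
    """Collect Scrapy-style contract lines ("@name args ...") from a docstring."""
    contracts = []
    for line in (callback.__doc__ or "").splitlines():
        line = line.strip()
        if line.startswith("@"):
            # "@returns items 1" -> ("returns", ["items", "1"])
            name, _, args = line[1:].partition(" ")
            contracts.append((name, args.split()))
    return contracts
```

Running scrapy check against a real spider with such a docstring fetches the @url and verifies the @returns constraints, which is enough to catch a spider that cannot even complete its first request.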

This would make it possible to prioritize fixes to spiders before there is urgency. For example: it's Monday, a data analyst needs to complete an analysis by Friday, the spider is broken, it takes two days for someone to become available to fix it, the spider takes another day to run, and the analyst is left with little time.

@jpmckinney added the framework (Relating to other common functionality) and testing labels and removed the framework label on Jan 30, 2020
@jpmckinney (Member Author)

We now run all (relevant) spiders regularly in the data registry, and once we close open-contracting/data-registry#29, we'll be able to monitor whether a spider is breaking or behaving differently.

Having reviewed Scrapy's contracts feature, I'm not sure how far it would get us (e.g. it can be used to check if at least one item is returned – but spiders can break in all kinds of ways, and it's hard to think of what custom contracts to write).
