
Explore use of Scrapy contracts for lightweight testing of spiders #270

Closed

jpmckinney opened this issue Jan 30, 2020 · 1 comment

@jpmckinney (Member)

https://docs.scrapy.org/en/latest/topics/contracts.html

We don't need to closely test the spiders, but it would be useful to be able to run scrapy check and quickly find out which spiders are broken in obvious ways (e.g. can't even complete the first request).
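For reference, a contract-annotated callback looks like the sketch below. The URL and expected counts are placeholders, not taken from any spider in this repo. scrapy check works by reading @-prefixed lines from the callback's docstring; the second function is a minimal, Scrapy-free illustration of that parsing step, not Scrapy's actual implementation.

```python
def parse(response):
    """Parse a publisher's release list (hypothetical example).

    @url https://example.com/releases.json
    @returns items 1
    @returns requests 0 0
    """
    # A real Scrapy callback would yield scraped items and/or follow-up requests.
    yield {"url": response.url}


def extract_contracts(callback):
    """Collect Scrapy-style contract lines ("@name args ...") from a docstring."""
    contracts = []
    for line in (callback.__doc__ or "").splitlines():
        line = line.strip()
        if line.startswith("@"):
            # "@returns items 1" -> ("returns", ["items", "1"])
            name, _, args = line[1:].partition(" ")
            contracts.append((name, args.split()))
    return contracts
```

Running scrapy check against a real spider with such a docstring fetches the @url and verifies the @returns constraints, which is enough to catch a spider that cannot even complete its first request.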

This would make it possible to prioritize fixes to spiders before there is urgency. For example: it's Monday, a data analyst needs to complete an analysis by Friday, the spider is broken, it takes two days for someone to become available to fix it, the spider takes another day to run, and the analyst is left with little time.

@jpmckinney added the framework (Relating to other common functionality) and testing labels and removed the framework label on Jan 30, 2020
@jpmckinney (Member Author)

We now run all (relevant) spiders regularly in the data registry, and once we close open-contracting/data-registry#29, we'll be able to monitor whether a spider is breaking or behaving differently.

Having reviewed Scrapy's contracts feature, I'm not sure how far it would get us (e.g. it can be used to check if at least one item is returned – but spiders can break in all kinds of ways, and it's hard to think of what custom contracts to write).
