Share a project #9
Comments
To piggyback off this sharing post, I made a web app to convert recipes from volumetric to metric units (mainly for the purpose of baking). See gif below for demo usage. Repo: https://github.com/justinmklam/recipe-converter Thanks again for creating this great library! It really opens up opportunities to create new projects with this as leverage. |
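For readers wondering what a volumetric-to-metric conversion involves, here is a minimal sketch. It is not taken from the recipe-converter repo; the function name `cups_to_grams` and the density table are hypothetical, and the flour and sugar densities are rough, illustrative approximations that vary with brand and packing.

```python
# Volume of one US customary cup in millilitres.
ML_PER_US_CUP = 236.588

# Approximate densities in grams per millilitre.
# ASSUMPTION: illustrative values only; real densities vary by brand and packing.
DENSITY_G_PER_ML = {
    "water": 1.0,
    "all-purpose flour": 0.53,
    "granulated sugar": 0.85,
}


def cups_to_grams(cups, ingredient):
    """Convert a volume in US cups to a weight in whole grams."""
    if ingredient not in DENSITY_G_PER_ML:
        raise KeyError(f"no density known for {ingredient!r}")
    millilitres = cups * ML_PER_US_CUP
    return round(millilitres * DENSITY_G_PER_ML[ingredient])


if __name__ == "__main__":
    for name in DENSITY_G_PER_ML:
        print(f"1 cup {name}: about {cups_to_grams(1, name)} g")
```

The key point is that the conversion needs a per-ingredient density, which is why a cup of flour and a cup of sugar come out at very different weights.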
I suppose I should contribute my quick script too! I'm more of a terminal guy so I wrote a quick python script to convert a recipe into markdown that can be |
Re-importing and re-indexing recipe content into https://www.reciperadar.com was a breeze yesterday, largely thanks to I'd like to add a big thanks to @hhursev and @bfcarpio in particular (although to everyone who has contributed to |
I've created recipe-crawler, which is a configurable web crawler for recipes. It uses recipe-scrapers for a couple of websites that don't have data structured in the schema.org/Recipe format. Please crawl responsibly. |
I've worked on a recipe book app for the last 3 years. Until recently, I had built my own massively overcomplicated recipe scraper, so when I found the recipe-scrapers project it was such a great day. Anyway, the installable web version of the app is nearly ready for 1.0 and folks can start using it at https://app.sharpcooking.net. The project is open source and available on GitHub as sharpcooking-web. Thanks for this great project! |
Since we don't have a mailing list for users of the library, I'm going to share this here, because hopefully people with related projects will find it useful: We now have a developer documentation section that should help to make it easier to develop and maintain scrapers. Many thanks to @strangetom for writing this up! |
First off, I love this repo so thanks to @hhursev and all the contributors! That being said, the first question I had when I found it was "so, where do I get the recipes?". So I made a quick tool, recipe-urls, to compile recipe-specific URLs from any given base URL, to then be fed into recipe-scrapers. Check it out if you'd like... or don't! It still requires some brute-force URL compiling, but it increased my output considerably. |
Very interesting! I've had people ask similar things about my own recipe book app. Question for @mkayeterry: could you improve the URL listing by leveraging the site's sitemap.xml? Virtually every site has one because of SEO, and it should list all of the site's URLs directly. Your current filtering would work well with that too. In any case, this is a cool and useful project! |
@jlucaspains Oh that's interesting! I'm pretty new to anything front end (over here frantically trying to figure out what a sitemap.xml is), so I'll definitely look into it more. Sounds promising and I'm very open to making the current setup a little more robust! |
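For anyone else wondering what a sitemap.xml looks like, here is a minimal sketch of reading one. It assumes the standard sitemaps.org `urlset` format; the sample XML, the `example.com` URLs, the function name `recipe_urls_from_sitemap`, and the `/recipes/` keyword filter are all hypothetical. A real sitemap would be downloaded from the site first, and it may be a sitemap index pointing to further sitemaps, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# ASSUMPTION: hypothetical sitemap content standing in for a downloaded file.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/recipes/banana-bread</loc></url>
  <url><loc>https://example.com/about</loc></url>
  <url><loc>https://example.com/recipes/chickpea-curry</loc></url>
</urlset>
"""


def recipe_urls_from_sitemap(xml_text, keyword="/recipes/"):
    """Return every <loc> URL in the sitemap that contains `keyword`."""
    root = ET.fromstring(xml_text)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [url for url in urls if keyword in url]


if __name__ == "__main__":
    for url in recipe_urls_from_sitemap(SAMPLE_SITEMAP):
        print(url)
```

The resulting list can then be passed through whatever URL filtering is already in place.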
I've put together an ingredient parsing Python package, ingredient-slicer, which will parse ingredient strings (e.g. "2 1/2 cups of tomato sauce") and do a best effort extraction of the I made ingredient-slicer because I needed a lightweight ingredient parser with It's by no means perfect for extracting An example to illustrate:
pip install ingredient-slicer
import ingredient_slicer
slicer = ingredient_slicer.IngredientSlicer("2 (15-ounces) cans chickpeas, rinsed and drained")
slicer.to_json()
{
'ingredient': '2 (15-ounces) cans chickpeas, rinsed and drained',
'standardized_ingredient': '2 cans chickpeas, rinsed and drained',
'food': 'chickpeas',
# primary quantity and units
'quantity': '30',
'unit': 'ounces',
'standardized_unit': 'ounce',
# any other secondary quantity and units found in the string
'secondary_quantity': '2',
'secondary_unit': 'cans',
'standardized_secondary_unit': 'can',
'gram_weight': '850.49',
'prep': ['drained', 'rinsed'],
'size_modifiers': [],
'dimensions': [],
'is_required': True,
'parenthesis_content': ['15 ounce']
}
It fixed a problem for me, so I thought it might be helpful for other people too! |
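The numbers in the output above are consistent with simple arithmetic: 2 cans of 15 ounces each is 30 ounces, and 30 ounces is about 850.49 grams. The sketch below only checks that arithmetic. It is not ingredient-slicer's implementation, and the name `total_weight` is hypothetical.

```python
# One avoirdupois ounce in grams.
GRAMS_PER_OUNCE = 28.349523125


def total_weight(count, ounces_each):
    """Return (total ounces, total grams rounded to 2 decimals)."""
    total_ounces = count * ounces_each
    total_grams = round(total_ounces * GRAMS_PER_OUNCE, 2)
    return total_ounces, total_grams


if __name__ == "__main__":
    ounces, grams = total_weight(2, 15)
    print(ounces, grams)  # prints: 30 850.49
```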
Hey, over the past year or so I wanted to dive deeper into Python development, so I used this project as a basis for my CLI app recipe2txt. This was my motivation to examine various aspects of the language and Python project management a little closer, so it may be unconventional in some parts, but as far as I know everything works. Features include asynchronous fetching, Jinja templating and local caching of recipes. And (maybe the most interesting part for recipe-scrapers) it generates formatted GitHub issues if any scraping errors are encountered during the process, so that the user can easily report any errors here. Thank you to all contributors here that made the hard part of recipe scraping easy! |
Hey, has anyone scraped all of the available data, or a large amount of it, and could share it? I have a research project I want to launch and need as much data as possible. |
Hey all! I'm working on a tool that maintains a database by scraping all recipe pages from a given website. It pulls the sitemap, selects all pages with recipes and then creates a dict or JSON file with all metadata scraped by recipe-scrapers. Feel free to check it out at recipe-database-scraper. @mkayeterry I realise there's a bit of overlap with the repo you shared earlier this year. Hope you don't mind. One of my goals was to continue finding new recipe pages added to a website. I couldn't figure out a good way to reconcile that with your repo, so I went in a different direction. |
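One way to keep finding newly added recipe pages is to compare the URLs in the latest sitemap against the URLs already stored. The sketch below is an assumption about how that could work, not code from recipe-database-scraper; the function name `find_new_urls`, the JSON state format, and the `example.com` URLs are hypothetical.

```python
import json


def find_new_urls(sitemap_urls, known_urls):
    """Return URLs from the sitemap that are not stored yet, keeping order."""
    known = set(known_urls)
    seen = set()
    new_urls = []
    for url in sitemap_urls:
        if url not in known and url not in seen:
            seen.add(url)
            new_urls.append(url)
    return new_urls


# ASSUMPTION: previously scraped URLs are kept as a JSON list.
stored_state = '["https://example.com/recipes/banana-bread"]'
known_urls = json.loads(stored_state)

# ASSUMPTION: URLs taken from the latest copy of the site's sitemap.
current_urls = [
    "https://example.com/recipes/banana-bread",
    "https://example.com/recipes/chickpea-curry",
    "https://example.com/recipes/lemon-tart",
]

new_urls = find_new_urls(current_urls, known_urls)

# Only the new pages need scraping; the stored state is then extended.
updated_state = json.dumps(sorted(set(known_urls) | set(new_urls)), indent=2)

if __name__ == "__main__":
    print(new_urls)
    print(updated_state)
```

Only the URLs in `new_urls` would need to be scraped on each run, which keeps repeated crawls light on the target site.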
@timsamart started working on that now, but it'll take a while to go through all websites. |
- Enforce usage of templates by disabling blank issues
- Add 4 streamlined templates:
  - New website requests
  - Website bug reports
  - Show and tell (with link to #9)
  - Generic template for other issues
- Collect optional feedback about package usage and discovery
I thought I'd share what I made with this: https://archive.org/details/recipes-en-201706
A full version of allrecipes, epicurious, cookstr, and bbc.co.uk, parsed into nice JSON with photos.
Sorry to abuse 'issues', there's no option to send a private message on GitHub as far as I know.