Reimplement full-text scraping #563
Do you plan to continue working on this?
I use it every day, so I at least plan to keep it working and fix the bugs.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
This has been marked as stale. Please let me know if you are no longer interested in it, and I will let it expire.
Oh, I'm interested. I thought you would fix the complaints from the bots first :)
Signed-off-by: Gioele Falcetti <thegio.f@gmail.com>
Oops... sorry, I hadn't noticed them.
I took a look at this and it looks good to me. Why did you go for readability? It seems like it's not well maintained anymore. Maybe https://github.com/j0k3r/graby would be worth considering.
I chose readability.php because it is a port of Mozilla's Readability.js, which works really well in Firefox, and when I started using it a few months ago it was still maintained. Honestly, I prefer readability.php because it's simpler, while graby has a huge number of dependencies. But I'll give graby a try to check how well it works and let you know.
I tried https://github.com/j0k3r/graby and noticed that readability.php works better on the feeds that I use. If you prefer to use graby in the official version of news, I have the code ready (https://github.com/DriverXX/news/tree/graby) and can edit my pull request to use it instead of readability. Please let me know if you want graby, and I will update my PR.
No, it's alright. You have checked it and made a reasonable decision, and I'm fine with it. It also doesn't look too complex to me, so I think we will go with it. If it breaks at some point and no one is willing to fix it, we might decide to remove it again.
Awesome. Many thanks.
I have reimplemented full-text scraping using this library:
https://github.com/andreskrey/readability.php
What do you think about this?