Changed and added files according to Telegram Updated Repo #4
tests/test_scraper.py
Outdated
@@ -0,0 +1,15 @@
import uuid
import telegram_scraper
import twitter scraper
Please make it command-line executable and fix the requirements. Leave server.py as it is (on second thought, I will fix it for all the data pipelines later). Think about how to ensure the pipelines scrape all data, handle rate limits, etc. Use geo features if possible.
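One possible shape for the command-line and rate-limit points is sketched below. It is only an illustration under assumptions: `RateLimitError`, `fetch_with_backoff`, the `--keyword` and `--max-retries` flags, and the stand-in `fetch_page` are hypothetical names, not code from this PR.

```python
import argparse
import time


class RateLimitError(Exception):
    """Hypothetical error a fetch function raises when the API throttles requests."""


def fetch_with_backoff(fetch, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); on RateLimitError wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_retries):
        try:
            return fetch()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay * 2 ** attempt)


def build_parser():
    """Build the (hypothetical) command-line interface for the scraper."""
    parser = argparse.ArgumentParser(description="Run the scraper from the command line.")
    parser.add_argument("--keyword", required=True, help="search keyword (assumed flag)")
    parser.add_argument("--max-retries", type=int, default=5)
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)

    def fetch_page():
        # Stand-in for the real scraping call, which is not shown in this PR.
        return [f"result for {args.keyword}"]

    results = fetch_with_backoff(fetch_page, max_retries=args.max_retries)
    for item in results:
        print(item)
    return results
```

To run this as a script, `main()` would be called under an `if __name__ == "__main__":` guard. The `sleep` parameter is injectable so the retry logic can be tested without real delays.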
app/scraper.py
Outdated
messages=[
    {
        "role": "user",
        "content": f"Examine this tweet: {tweet['snippet']}. Is this tweet describing {filter_keyword}? "
not defined
@@ -0,0 +1,154 @@
import logging
from logging import Logger
You need to include instructor, pandas, openai, pydantic, and requests in your requirements.txt.
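A quick way to confirm that an environment actually provides these packages is sketched below. The helper `missing_packages` and the `REQUIRED` list are hypothetical names for illustration; the check relies on the fact that these five packages are imported under the same name they are installed with.

```python
from importlib import util

# The packages named in the review comment above.
REQUIRED = ["instructor", "pandas", "openai", "pydantic", "requests"]


def missing_packages(names):
    """Return the names that cannot be located as importable top-level modules."""
    return [name for name in names if util.find_spec(name) is None]
```

Calling `missing_packages(REQUIRED)` after `pip install -r requirements.txt` in a fresh virtual environment should return an empty list if the file is complete.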
app/scraper.py
Outdated
current_data = media_data
output_data_list.append(media_data)

scraped_data_repository.register_scraped_data(
function does not exist
We still have doubts about how to use and import the scraping API key. Also, this version is not yet updated with OpenAI. We are having trouble replicating what is shown in the video.
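On the API key question, a common pattern is to read the key from an environment variable rather than importing it from a source file. The sketch below assumes a variable named `SCRAPER_API_KEY` and a helper `load_api_key`; both are hypothetical, since the PR does not show how the key is consumed.

```python
import os


def load_api_key(env_var="SCRAPER_API_KEY", environ=None):
    """Read the API key from the environment; fail loudly if it is missing."""
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running the scraper."
        )
    return key
```

This keeps the key out of the repository, and the same code works locally and inside a pipeline step as long as the variable is set there.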
Merging as is to unblock integration into Kubeflow Pipelines. Please test this and send follow-up PRs ASAP.
Make sure requirements.txt is up to date.