Changed and added files according to Telegram Updated Repo #4

Merged
maany merged 4 commits into dream-aim-deliver:main on Apr 14, 2024

Conversation

iamMihirT
Collaborator

No description provided.

@@ -0,0 +1,15 @@
import uuid
import telegram_scraper
Contributor

Import the Twitter scraper here, not telegram_scraper.

Contributor

@maany maany left a comment

Please make it command-line executable and fix the requirements. Leave server.py as it is (on second thought, I will fix it for all the data pipelines later). Think about how to ensure the pipelines scrape all the data, handle rate limits, etc. Use geo features if possible.
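
A minimal sketch of what a command-line entry point with rate-limit handling could look like. Everything named here is an assumption for illustration and not part of this PR: the `--query` and `--max-retries` flags, the `RateLimitError` exception, and the placeholder `fetch` function that stands in for the real scraping call.

```python
import argparse
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error raised by a fetch function when the API is rate limited."""


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the (hypothetical) command-line flags for the scraper."""
    parser = argparse.ArgumentParser(description="Run the scraper from the command line.")
    parser.add_argument("--query", required=True, help="Search query to scrape.")
    parser.add_argument("--max-retries", type=int, default=5,
                        help="How many times to retry after a rate-limit error.")
    return parser.parse_args(argv)


def call_with_backoff(fetch: Callable[[], T], max_retries: int = 5,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fetch(), waiting 1x, 2x, 4x, ... base_delay between rate-limited attempts."""
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except RateLimitError:
            if attempt == max_retries:
                raise  # Out of retries: surface the error to the caller.
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


def main(argv: Optional[Sequence[str]] = None) -> int:
    args = parse_args(argv)

    def fetch() -> dict:
        # Placeholder: the real scraping request would go here.
        return {"query": args.query, "results": []}

    print(call_with_backoff(fetch, max_retries=args.max_retries))
    return 0
```

To run it as a script, `main()` would be called under the usual `if __name__ == "__main__":` guard. Passing `sleep` as a parameter keeps the retry logic testable without real waiting.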

app/scraper.py Outdated
messages=[
{
"role": "user",
"content": f"Examine this tweet: {tweet['snippet']}. Is this tweet describing {filter_keyword}? "
Contributor

not defined
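
The comment does not say which name is undefined. Assuming it is `filter_keyword`, one possible fix is to pass it in explicitly instead of relying on an outer variable. The helper name `build_filter_messages` is hypothetical, and only the part of the prompt visible in the diff is reproduced.

```python
from typing import Dict, List


def build_filter_messages(tweet: Dict[str, str], filter_keyword: str) -> List[Dict[str, str]]:
    """Build the chat messages, taking filter_keyword as an explicit argument."""
    return [
        {
            "role": "user",
            "content": (
                f"Examine this tweet: {tweet['snippet']}. "
                f"Is this tweet describing {filter_keyword}? "
            ),
        }
    ]
```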

@@ -0,0 +1,154 @@
import logging
from logging import Logger
Contributor

You need to include instructor, pandas, openai, pydantic, and requests in your requirements.txt.
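
A small sketch for checking whether those five packages are importable in the current environment, which may help confirm that requirements.txt is complete. The helper name `missing_packages` is an assumption. It relies on each package's import name matching its install name, which holds for the five listed here.

```python
from importlib.util import find_spec
from typing import Iterable, List

# The packages named in the review comment above.
REQUIRED = ["instructor", "pandas", "openai", "pydantic", "requests"]


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the names that cannot be found as importable top-level modules."""
    return [name for name in names if find_spec(name) is None]
```

Calling `missing_packages(REQUIRED)` in a fresh virtual environment built from requirements.txt should return an empty list once the file is complete.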

app/scraper.py Outdated
current_data = media_data
output_data_list.append(media_data)

scraped_data_repository.register_scraped_data(
Contributor

This function does not exist.

@iamMihirT
Collaborator Author

We still have doubts about how to use and import the scraping API key. Also, this version is not yet updated with openai. We are having trouble replicating what is shown in the video.
But we have kept all of the code similar to the Telegram version.
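
On the API key question, one common approach is to read the key from an environment variable instead of importing it from a file in the repository. This is a sketch under assumptions: the variable name `SCRAPER_API_KEY` and the helper `load_api_key` are hypothetical and not taken from this project.

```python
import os


def load_api_key(env_var: str = "SCRAPER_API_KEY") -> str:
    """Read the scraping API key from the environment, failing loudly if it is unset."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running the scraper."
        )
    return key
```

The key would then be exported in the shell or set in the pipeline configuration before the scraper starts, so it never has to be committed.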

@maany
Contributor

maany commented Apr 14, 2024

Merging as is to unblock integration into Kubeflow Pipelines. Please test this and send follow-up PRs ASAP.

@maany
Contributor

maany commented Apr 14, 2024

Make sure requirements.txt is up to date

@maany maany merged commit 8b9b492 into dream-aim-deliver:main Apr 14, 2024

3 participants