verge-digest: AI-powered article summarizer of TheVerge🪄
This project summarizes the top tech news articles from The Verge using Google Gemini 1.5 Pro.
To access the live version of the app, click here.
The articles are fetched from the /v2/top-headlines endpoint of newsapi.org. If the request fails for any reason, it is retried against the same API with a second key. The endpoint returns 6 results in JSON format, each with key-value pairs such as article, date, author, link, etc.
For summarization, the app uses a Google Gemini API key from AI Studio.
GET https://newsapi.org/v2/top-headlines?sources=the-verge&apiKey={YOUR_KEY_HERE}&pageSize=6
Since I am using the free version of the API, it is limited to 100 requests over a 24 hour period (50 requests available every 12 hours).
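The fetch-with-fallback step described above could look roughly like the sketch below. It is not the code in app.py: the function names, the injectable get_json parameter, and the use of urllib from the standard library are all assumptions made for illustration. The only details taken from this README are the endpoint, the sources and pageSize parameters, and the idea of a second key.

```python
import json
import urllib.parse
import urllib.request

NEWSAPI_URL = "https://newsapi.org/v2/top-headlines"


def build_url(api_key, page_size=6):
    """Build the top-headlines request URL for The Verge."""
    query = urllib.parse.urlencode(
        {"sources": "the-verge", "pageSize": page_size, "apiKey": api_key}
    )
    return f"{NEWSAPI_URL}?{query}"


def http_get_json(url):
    """Fetch a URL and decode its JSON body (raises on HTTP or network errors)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def fetch_top_headlines(primary_key, fallback_key, get_json=http_get_json):
    """Try the primary key first, then the fallback key if that request fails."""
    last_error = None
    for key in (primary_key, fallback_key):
        try:
            payload = get_json(build_url(key))
        except (OSError, ValueError) as exc:  # network/HTTP errors or invalid JSON
            last_error = exc
            continue
        # NewsAPI marks a successful response with "status": "ok".
        if payload.get("status") == "ok":
            return payload.get("articles", [])
        last_error = RuntimeError(payload.get("message", "NewsAPI request failed"))
    raise RuntimeError("Both API keys failed") from last_error
```

Passing get_json as a parameter keeps the fallback logic testable without spending any of the daily request quota.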
The requirements.txt file lists the following libraries:
- google-generativeai to build with Gemini API.
- streamlit to transform Python scripts into interactive web apps.
- beautifulsoup4 to scrape information from web pages.
First, we GET the JSON output from newsapi.org.
I have set the API to return 6 results because one of them might be an article about The Verge itself, which isn't relevant. If such an article exists in the JSON response, we skip it and show the remaining 5 results. If not, we show only the first 5 results.
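A minimal sketch of this "request 6, show 5" rule is given below. The README does not say how the article about The Verge itself is recognized, so looks_like_verge_meta is a purely hypothetical heuristic based on the title, and pick_articles is an illustrative name rather than a function from app.py.

```python
def looks_like_verge_meta(article):
    """Hypothetical check: treat a title that mentions The Verge as self-referential."""
    return "the verge" in (article.get("title") or "").lower()


def pick_articles(articles, is_about_the_verge=looks_like_verge_meta, limit=5):
    """Drop self-referential articles, then keep at most `limit` of the rest."""
    relevant = [a for a in articles if not is_about_the_verge(a)]
    return relevant[:limit]
```

With 6 results and no self-referential article, the slice keeps the first 5. With one such article, the 5 remaining results are kept, which matches the behaviour described above.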
Next, we use beautifulsoup4 to fetch and extract the article text from the link key-value pair in the JSON response of the /v2/top-headlines endpoint of newsapi.org, as mentioned above.
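As an illustration of the extraction step, the sketch below pulls paragraph text out of an HTML string with beautifulsoup4. It assumes the article body sits in plain p tags, which may not match the real page layout, and it leaves out downloading the page. The function name extract_article_text is hypothetical.

```python
from bs4 import BeautifulSoup


def extract_article_text(html):
    """Return the paragraph text of an article page, one paragraph per line."""
    soup = BeautifulSoup(html, "html.parser")
    # Assumption: the body lives in <p> tags; a real page may need a narrower selector.
    paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
    # Skip paragraphs that contain only whitespace.
    return "\n".join(text for text in paragraphs if text)
```

The returned string is what would then be passed to the summarizer as the article content.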
Then, a for loop goes over the JSON response to extract the attributes of each article, such as title, link, date, author, and urlToImage.
Finally, each article has a button below it. Pressing the button calls the generate_gemini_content function, which takes prompt and article_content as arguments, combines them, and uses Google Gemini to produce a summary as the output.
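The summarization call might be sketched as follows. The README only states that generate_gemini_content takes prompt and article_content, so the extra model parameter, the make_gemini_model helper, and the blank-line separator between prompt and article are assumptions added to keep the example self-contained. The calls genai.configure, genai.GenerativeModel, and generate_content come from the google-generativeai package.

```python
def generate_gemini_content(prompt, article_content, model):
    """Combine the prompt with the article text and return the model's summary."""
    response = model.generate_content(f"{prompt}\n\n{article_content}")
    return response.text


def make_gemini_model(api_key, model_name="gemini-1.5-pro"):
    """Create a Gemini model handle (requires google-generativeai)."""
    # Imported lazily so the function above can be exercised without the package.
    import google.generativeai as genai

    genai.configure(api_key=api_key)
    return genai.GenerativeModel(model_name)
```

In the app, the button handler would call something like generate_gemini_content(prompt, article_content, make_gemini_model(api_key)) and display the returned text.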
Clone the project
git clone https://github.com/jaideep156/TheVerge-Summarizer.git
Go to the project directory
cd TheVerge-Summarizer
Install dependencies
pip install -r requirements.txt
Start the server
streamlit run app.py
P.S. Make sure to provide your own valid API credentials in the .streamlit/secrets.toml file to run the app locally on your machine. My secrets file is listed in .gitignore so the keys are not exposed.
For the deployed app, I added my API keys through the Streamlit Cloud UI, following these steps from the official documentation.
The app is deployed on Streamlit Community Cloud, and its main file is app.py.