Before using this tool, ensure you have the following:
- A Python environment set up on your system.
- The necessary libraries installed. You can do this by running the command:
pip3 install -r requirements.txt
- A Duolingo Plus account.
Follow these steps to download and process your Duolingo vocabulary:
Log in to your Duolingo account using your web browser.
Go to the Duolingo Practice Hub - Words page.
Right-click on any area of the page and select "Inspect" to open the Developer Tools.
Scroll down the list of words and click "Load more" until you reach the end of the list.
Right-click on the first <html> element in the Developer Tools, then select "Copy" > "Copy element".
Paste the copied HTML content into a text file. To use the default settings, save it as duolingo.txt; otherwise, choose any name and enter it when the script prompts for the filename.
Open a terminal and run the following command:
python3 main.py
The script will prompt you for some information:
- Filename: Enter the name of the file where you saved the HTML content (default is duolingo.txt).
- Language Code: Enter the language code for the words you want to download (default is fr-en). For more language codes, refer to the gTTS documentation.
- Output Folder: Enter the name of the folder where you want to save the audio files (default is audio).
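The prompts and their defaults can be pictured with a minimal sketch like the one below. It is not taken from main.py: the helper names `ask` and `collect_settings` are hypothetical, and the only assumption is that pressing Enter on an empty answer selects the default listed above.

```python
def ask(prompt, default, input_fn=input):
    """Return the typed answer, or the default when the answer is empty."""
    answer = input_fn(f"{prompt} [{default}]: ").strip()
    return answer or default


def collect_settings(input_fn=input):
    """Ask the three questions in order (hypothetical helper, not from main.py)."""
    return {
        "filename": ask("Filename", "duolingo.txt", input_fn),
        "langcode": ask("Language code", "fr-en", input_fn),
        "output_folder": ask("Output folder", "audio", input_fn),
    }

# Interactive use: settings = collect_settings()
```

Passing `input_fn` makes the defaults easy to check without typing anything; in normal use the built-in `input` is used.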
After running the script, you will get the following outputs:
- Vocabulary File: A text file named {current_date}_merged_vocabulary_for_{langcode}.txt containing the merged vocabulary pairs.
- Anki File: A CSV file named {current_date}_vocabulary_list_for_anki_{langcode}.csv compatible with Anki for flashcard creation.
- Audio Files: An output folder (default audio) containing TTS (Text-to-Speech) audio files generated using Google TTS for each word.
The files are named with the current date and the language code to help keep them organized.
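As a rough illustration of the naming scheme, the sketch below builds the two file names and writes word pairs to a CSV file. The function names `output_names` and `write_anki_csv` are hypothetical, the ISO date format (YYYY-MM-DD) is an assumption because the exact format of {current_date} is not stated here, and the two-column word/translation layout is likewise assumed.

```python
import csv
from datetime import date
from pathlib import Path


def output_names(langcode, today=None):
    """Build the vocabulary and Anki file names for a language code."""
    # Assumption: {current_date} is rendered as an ISO date (YYYY-MM-DD).
    stamp = (today or date.today()).isoformat()
    return (
        f"{stamp}_merged_vocabulary_for_{langcode}.txt",
        f"{stamp}_vocabulary_list_for_anki_{langcode}.csv",
    )


def write_anki_csv(path, pairs):
    """Write (word, translation) pairs as one CSV row each."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        csv.writer(fh).writerows(pairs)


# Example: write two pairs next to the script.
# _, anki_name = output_names("fr-en")
# write_anki_csv(Path(anki_name), [("bonjour", "hello"), ("chat", "cat")])
```

Using the `csv` module instead of joining strings by hand keeps words that contain commas or quotes correctly escaped.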