A project to quickly extract all email addresses from any file (except .sql dump files)
- Extract all addresses with regex
- Convert the domain extension to lowercase
- Order them alphabetically
- Remove duplicate email addresses
- You can specify domain addresses to exclude
- Save them to an output file
The regex I've used here: `r'([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,})'`
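To illustrate how this pattern behaves, here is a small sketch that runs it on a made-up string. It assumes the dot before the top-level domain is meant to be a literal dot (escaped as `\.`); the sample text and the `EMAIL_RE` name are illustrative only and not taken from the project.

```python
import re

# Pattern from this README, assuming the dot before the TLD is a literal dot.
EMAIL_RE = re.compile(r'([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,})')

# Made-up sample text for illustration.
sample = "Contact: Alice@Example.COM, bob.smith@mail.example.org; not-an-email@localhost"

# With two capture groups, findall returns (local_part, domain) tuples.
matches = EMAIL_RE.findall(sample)
print(matches)
# [('Alice', 'Example.COM'), ('bob.smith', 'mail.example.org')]
```

The last candidate is skipped because `localhost` has no dot followed by at least two letters, which the pattern requires.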
- Do not forget to specify the file paths (input_filename, output_filename, exclude_domains_file)
- It works with Python 3.12.0
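
The steps listed above could be sketched roughly as follows. This is a minimal sketch and not necessarily how the project's script is written: the function names `extract_emails` and `run` are hypothetical, the exclusion file is assumed to contain one domain per line, and "convert the domain extension to lowercase" is interpreted here as lowercasing everything after the `@`. Only the three path names (`input_filename`, `output_filename`, `exclude_domains_file`) come from this README.

```python
import re
from pathlib import Path

# Pattern from this README, assuming the dot before the TLD is a literal dot.
EMAIL_RE = re.compile(r'([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,})')


def extract_emails(text, exclude_domains=()):
    """Return sorted, de-duplicated addresses found in text (hypothetical helper)."""
    # Normalise the exclusion list so comparisons are case-insensitive.
    excluded = {d.strip().lower() for d in exclude_domains if d.strip()}
    found = set()  # a set removes duplicate addresses
    for local, domain in EMAIL_RE.findall(text):
        domain = domain.lower()  # assumption: lowercase the whole domain part
        if domain in excluded:
            continue
        found.add(f"{local}@{domain}")
    return sorted(found)  # alphabetical order


def run(input_filename, output_filename, exclude_domains_file=None):
    """Read the input file, extract addresses, and write one per line."""
    # errors="ignore" lets non-text bytes pass without raising.
    text = Path(input_filename).read_text(encoding="utf-8", errors="ignore")
    exclude = []
    if exclude_domains_file:
        # Assumed format: one domain per line, e.g. "example.com".
        exclude = Path(exclude_domains_file).read_text(encoding="utf-8").splitlines()
    emails = extract_emails(text, exclude)
    Path(output_filename).write_text("".join(e + "\n" for e in emails), encoding="utf-8")
    return emails
```

A call such as `run("input.txt", "emails.txt", "exclude.txt")` (file names chosen for illustration) would then write the filtered list to `emails.txt`.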