You can find links to data acquisition websites here, all on the clear web. Some sites charge for their service, although some also offer a small amount of data for free. From a CTI perspective, it is important for organisations, whether in the public or private sector, to understand what risks exist that might make them vulnerable. These resources are not just for OSINT investigations; individuals can also use them to check for their own data. You may have to check more than one resource to truly understand the risk and exposure, as they tend to hold different data.
Public data and publicly available data are two different concepts. Within the UK and the US, individual local authority areas and states have their own local public data searches available, too many for me to list here. The same is true of many other countries.
It is always important to understand and acknowledge that, for certain types of data, you have to consider the following: Legislation, Lawfulness, Regulations, Ethics, Morals and Policies. Be under no illusion: in some countries and jurisdictions certain types of data may be out of bounds for some OSINT practitioners. I recommend you understand the limitations placed upon you before embarking on acquiring data for OSINT, especially when it comes to the lawfulness of your activity.
It is important to remember that Data Acquisition OSINT is not just about one stream of data. It includes both Public Data and Publicly Available Data. If you acquire data for OSINT, you must use it responsibly and lawfully regardless of its origins. The same applies to any OSINT material you acquire.
Another stream of data worth considering is Data Broker data: data collected when you sign up for an online service or download an app, which is subsequently sold to Data Brokers. It is then aggregated with other Public Data and Publicly Available Data to provide a product to sell. You can find some of these sites on the People Search repo.
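As one illustration of checking your own exposure, the sketch below uses the publicly documented Pwned Passwords range endpoint from Have I Been Pwned (listed further down). Only the first five characters of the SHA-1 hash leave your machine. The function names `split_sha1`, `count_for_suffix` and `fetch_range`, and the User-Agent string, are illustrative assumptions, not part of any listed tool. Treat it as a minimal sketch and not a definitive client.

```python
import hashlib
import urllib.request
from typing import Tuple

# Publicly documented Pwned Passwords k-anonymity endpoint.
RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_sha1(password: str) -> Tuple[str, str]:
    """Return (first 5 hex chars, remaining 35 hex chars) of the SHA-1 hash."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_for_suffix(suffix: str, body: str) -> int:
    """Parse 'SUFFIX:COUNT' lines and return the count for our suffix, or 0."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix.upper():
            return int(count)
    return 0


def fetch_range(prefix: str) -> str:
    """Fetch all hash suffixes sharing the 5-character prefix (network call)."""
    # The User-Agent value here is a hypothetical placeholder.
    request = urllib.request.Request(
        RANGE_URL + prefix, headers={"User-Agent": "osint-readme-example"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def times_seen(password: str) -> int:
    """Return how many times the password appears in the returned range."""
    prefix, suffix = split_sha1(password)
    return count_for_suffix(suffix, fetch_range(prefix))
```

Calling `times_seen("your-password-here")` returns 0 when the hash suffix is absent from the response. A zero from one resource does not mean the data is absent elsewhere, which is why checking several resources matters.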
People Search |
When researching data sites, always be careful what you click or download, as malware may be present. Telegram is included; however, you will have to find groups and bots on the platform yourself, and these change frequently. You can check my Telegram repository for more resources.
Telegram |
If you find your data on a site, and it is a reputable site, it may offer you a means to remove it. You could also consider other ways to remove it, for example by exercising your rights under GDPR or CCPA. You will find some resources in my Privacy Opt-Out repository that may help too.
Privacy Opt-Out |
Table of Contents |
File / Directory OSINT |
Google CSE |
Publicly Available Data |
Public Data |
Paste Sites |
Ransomware / CTI |
Reverse Hash Sites |
News |
Videos |
File / Directory OSINT |
- DeDigger
- All You Can Read
- Amazon
- Dropbox
- Filemood
- File Pursuit
- Google Drive
- Google Drive CSE
- Lendx
- Mamont
- ODC Crawler
- Open Directory
- Open Directory Finder
- Open Directory Search
- Palined
- The Eye of Justice
- Yandex
Google CSE |
Publicly Available Data Acquisition Sites |
- Archive Datasets
- Breachbase
- Breach Directory
- Breach Forums
- Cryptome
- Cybernews
- DataBreaches
- DBpedia
- Ddosecrets
- DeepDark CTI
- Dehashed
- eBreached
- EmploLeaks
- Exposed
- FindPDF
- FiveThirtyEight
- Fuck Facebook
- Google Dataset Search
- Grep App
- H8mail
- Hacxx Underground
- Haveibeenpwned
- Haveibeensold
- Haveibeenzuckered
- Hotsheet
- Hudson Rock Cybercrime Intelligence Tools
- Intelius
- Id Strong
- Inteltechniques
- Intelx
- Kaggle
- Leakbase
- Leakcheck
- Leaked Domains
- Leaked Passwords
- Leakix
- Leaksx
- Leak-Lookup
- Leakpeek
- LeakSearch
- Lol Archiver
- LosePrivacy PWNED!
- Mozilla Monitor
- Myth
- Nuclear Leaks
- Offshore Leaks
- Online Newspapers
- Operation Archive
- OsintLeaks
- ATT Pentester
- NPD Pentester
- PeopleFinder
- Predicta Search
- Periksa Data
- Proxynova
- ScamSearch
- Scattered Secrets
- Search 0t Rocks (currently down)
- Sherlockeye
- Slideshare
- SnusBase
- SourceGraph
- SpyBot
- The Accountability Project
- The Eye
- TruffleHog
- WhatsMyIp-DataBreachTool
- Wikileaks
- Wikileaks V2
- WhiteIntel
- Whitepages
Public Data Acquisition Sites |
- Academic Torrents
- American Medical Association
- Australian Government Data Portal
- AWS Dataset Search
- Bielefeld Academic Search Engine
- Canadian Legal Database
- CDRC Data
- Consensus
- Core Research Papers
- European Open Data Monitor
- Google Copyright Explore
- General Medical Council
- Google Public Data Search
- OCCRP Aleph
- Open Data Impact Map
- Operation Archive
- RefSeek Academic Search Engine
- The British Newspaper Archive
- UCI Machine Learning Repository
- UK Land Registry
- UK Companies House
- UK Courts & Tribunals Judiciary
- UK Information Commissioner's Office
- UK GMC Medical Register
- UK Government Data Portal
- UK Library Search
- UK National Archives
- UK Planning Portal
- UK Phonebook
- UK Public Data Portal
- UK Supreme Court
- UK The Law Pages
- UK Trademark Search
- Unicef Open Data
- US Census Data
- US Federal Bureau of Prisons
- US Federal Science Information
- US Federal State Voting Links
- US Government Data Portal
- US National Archives
- US Phonebook
- US Securities & Exchange Commission
- US Search Systems
- WHO Open Data
- World Bank Open Data
- WorldCat Library Search
- Yelp Dataset
Paste Sites |
- CybDetective
- Google CSE for Pastebin
- Just Paste It
- Mozilla Community Pastebin
- Pastebin
- PasteCode
- Psbdmp
- Redhunt Labs
Ransomware / Cyber Threat Intelligence |
Reverse Hash Sites |
- Crackstation
- Decrypt Tools
- Dehash
- Hashcat
- HashMob
- HashPals
- Hash Ziggi
- John The Ripper
- MD5 Gromweb
- Nitrxgen
- Hash Crack
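To show the general idea behind a reverse hash lookup, here is a toy sketch: hash every word in a known wordlist once, then look up an unknown digest in the resulting table. This is an assumption about the basic principle only, not a description of how any site or tool listed above is implemented. The names `build_lookup` and `reverse_hash` are hypothetical.

```python
import hashlib
from typing import Dict, Iterable, Optional


def build_lookup(wordlist: Iterable[str], algorithm: str = "md5") -> Dict[str, str]:
    """Precompute digest -> plaintext for every candidate word."""
    table = {}
    for word in wordlist:
        digest = hashlib.new(algorithm, word.encode("utf-8")).hexdigest()
        table[digest] = word
    return table


def reverse_hash(digest: str, table: Dict[str, str]) -> Optional[str]:
    """Return the plaintext for a digest if it is in the table, else None."""
    return table.get(digest.strip().lower())


if __name__ == "__main__":
    # Tiny illustrative wordlist; real lookups rely on very large ones.
    lookup = build_lookup(["password", "letmein", "qwerty"])
    print(reverse_hash("5f4dcc3b5aa765d61d8327deb882cf99", lookup))
```

A lookup like this only succeeds for unsalted hashes of values already in the wordlist, and only use it on hashes you are lawfully permitted to work with.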
News |
Videos |
- SANS Breach Data Infrastructure. (2024)