OSINT

The Internet knows pretty much everything; we just need to ask the right questions.

OSINT Framework

A large collection of excellent resources for gathering intel.

Shodan & Censys

Services that continuously scan the Internet for reachable devices, perform banner grabbing, and publish their findings publicly. Great for seeing what attackers on the Internet will see for an IP you own.

Tesseract

Tesseract is a command-line OCR engine that understands 100+ languages. Very useful for quickly pulling text out of images!
tesseract important.png stdout | grep -v '^$' # OCR important.png, push the text to stdout, then remove blank lines
tesseract important.png stdout --psm 11 -l eng # Set the PSM (Page Segmentation Mode) to 11 (sparse text): find as much text as possible in no particular order. Tesseract 3.x spelled this flag -psm
for i in *.jpg; do tesseract "$i" stdout --psm 11 -l eng >> words.txt; done # Dirty bash loop to gather text from all JPGs in the directory
Filters to pipe the OCR output through:
grep -v '^$' # Remove blank lines
fmt -1 # One word per line
strings -n4 # Require min 4 ASCII printable characters
grep -i '[a-z]' # Require at least one letter
sort -u # Unique entries only
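
The filters above chain naturally into a single cleanup step. The following is a minimal sketch, not a definitive recipe: `clean_words` is a hypothetical helper name, and `tr` and `awk` stand in for `fmt -1` and `strings -n4` so that the sketch needs only POSIX tools. Unlike `strings`, the `awk` length check does not drop non-ASCII characters.

```shell
# clean_words (hypothetical name): read raw OCR text on stdin,
# print a de-duplicated wordlist on stdout.
clean_words() {
  tr -s '[:space:]' '\n' |   # one word per line (stand-in for fmt -1)
    awk 'length($0) >= 4' |  # keep words of 4+ characters (stand-in for strings -n4)
    grep -i '[a-z]' |        # require at least one letter
    LC_ALL=C sort -u         # unique entries only, sorted in byte order
}

# Assumed usage with Tesseract installed:
#   tesseract important.png stdout --psm 11 -l eng | clean_words > words.txt
```

Forcing `LC_ALL=C` on the sort keeps the output order stable across machines, at the cost of sorting uppercase before lowercase.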

Google Dorks

The GHDB (Google Hacking Database) contains a ton of premade dorks for finding info.
Dork                                                  Purpose
whales -bitcoin                                       Searches for "whales" while excluding any result that mentions "bitcoin"
site:m4lwhere.org                                     Restricts results to the site m4lwhere.org
cache:                                                Searches the Google cache only
ext:pdf or filetype:pdf                               Restricts results to the given file type (the two operators are equivalent)
intitle:"Index of "                                   Finds any page that has "Index of " in the title
inurl:"*.cgi"                                         Finds any page that has ".cgi" in the URL
site:m4lwhere.org intitle:"Index of" "last modified"  Finds directory listings on the site m4lwhere.org
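
Since the operators combine freely, it can help to generate a standard set of dorks per target. The following is a small sketch under that assumption: `dorks_for` is a hypothetical helper name, and the three dorks it prints are simply combinations of the operators listed above, not a recommended baseline.

```shell
# dorks_for (hypothetical name): print ready-to-paste dorks scoped to one domain.
dorks_for() {
  domain="$1"
  printf 'site:%s intitle:"Index of" "last modified"\n' "$domain"  # directory listings
  printf 'site:%s ext:pdf\n' "$domain"                             # PDF documents
  printf 'site:%s inurl:"*.cgi"\n' "$domain"                       # CGI scripts
}

# Example: dorks_for m4lwhere.org
```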

Certificate Transparency

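Publicly trusted CAs are required to publish every certificate they issue to public Certificate Transparency logs. Because each certificate lists the hostnames it covers, these logs can be useful for finding servers that are internal to a LAN or are not Internet accessible.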
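
One way to mine these logs is through a search front end such as crt.sh. The sketch below assumes crt.sh's JSON output, where each entry carries a `name_value` field with one or more hostnames separated by an escaped `\n`. The query URL in the comment and the field name are assumptions about that service, and `ct_hosts` is a hypothetical helper name. Where `jq` is available, `jq -r '.[].name_value'` is a more robust way to read the field than the `grep` used here.

```shell
# ct_hosts (hypothetical name): read crt.sh-style JSON on stdin,
# print the unique hostnames found in the name_value fields.
ct_hosts() {
  grep -o '"name_value":[[:space:]]*"[^"]*"' |  # isolate each name_value pair
    cut -d '"' -f 4 |                           # keep only the value
    awk '{ gsub(/\\n/, "\n"); print }' |        # split escaped \n into real lines
    sed 's/^\*\.//' |                           # drop a leading wildcard label
    LC_ALL=C sort -u                            # unique entries only
}

# Assumed usage (needs network access):
#   curl -s 'https://crt.sh/?q=%25.m4lwhere.org&output=json' | ct_hosts
```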

Credential Leaks

Passive DNS

Passive DNS listings occasionally show old or forgotten IPs for a site.

Data Aggregators

Hunter.io, compiles lists of org metadata, useful for identifying email addressing schemes.
haveibeenpwned.com, lists pwned email accounts.
dehashed.com, public data dumps available, requires paid access.
scylla.sh, indexed data dumps, free, currently down.
Public data dump forums
Torrents
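
Once an email addressing scheme has been identified (for example via Hunter.io), candidate addresses can be generated from a list of names. This is only an illustrative sketch: `email_for` is a hypothetical helper, and the `{first}`, `{last}` and `{f}` placeholders are an assumed pattern notation, not a confirmed format from any of the services above.

```shell
# email_for (hypothetical name): FIRST LAST PATTERN DOMAIN -> candidate address.
# Runs in a subshell so its variables do not leak into the caller.
email_for() (
  first=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  last=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  initial=$(printf '%s' "$first" | cut -c 1)
  # Substitute each placeholder, then append the domain.
  printf '%s\n' "$3" | sed \
    -e "s/{first}/$first/g" \
    -e "s/{last}/$last/g" \
    -e "s/{f}/$initial/g" \
    -e "s/\$/@$4/"
)

# Example: email_for Jane Doe '{f}{last}' example.com
```

Names containing `/` or `&` would need escaping before being passed to `sed`; the sketch leaves that out for brevity.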