
Belati: Collecting Public Data and Document for OSINT Purpose


Open Source Intelligence (OSINT) is intelligence produced from publicly available information that is collected, exploited, and disseminated for a specific intelligence purpose. The practice dates back to 1941, when the Foreign Broadcast Information Service (FBIS) was created to access and exploit open sources during World War II, and it grew rapidly after being formalized by a Director of Central Intelligence Directive in 1994.

OSINT is used in several sectors, including the following:

– Government
– Military
– Business
– Intelligence Community
– Homeland Security
– Law enforcement


Belati is a tool for collecting publicly available data and documents from a website or other services for OSINT needs. The name comes from a type of knife that is small but very dangerous. Belati is inspired by FOCA and Datasploit, especially Datasploit's approach to automated OSINT. Belati is intended as learning material.

What can Belati do?

  • Whois
  • Banner Grabbing
  • Subdomain Enumeration
  • Public GIT finder in domain/sub-domain
  • Public SVN finder in domain/sub-domain
  • robots.txt scraper in domain/sub-domain
  • Service Scanning for all Subdomain Machines
  • Web Appalyzer Support
  • DNS mapping / Zone Scanning
  • Fake and random User-Agent
  • Proxy support for harvesting emails and documents
  • Gather public company & employee information
  • SQLite3 database support for storing results
  • Mail Harvester from Website & Search Engine
  • Scraping Public Documents for Domain from Search Engine


Planned features for future versions of Belati:

  • Automatic OSINT with Username and Email support
  • Organization or Company OSINT Support
  • Collecting Data from Public services with Username and Email for
    LinkedIn and other services.
  • Setup Wizard for Token and setting up Belati
  • Token Support
  • Email Harvesting with multiple content
  • Scraping Public Documents with multiple search engines
  • Metadata Extractor
  • Database Support
  • Web version with Django
  • Proxy Support for Harvesting
  • Scanning Report export to PDF

That is a long to-do list for Belati's future development. For now, Belati only supports OSINT against a domain or website. As mentioned above, future development will also cover personal, organization, and company profiling.


Belati starts by gathering information through Whois to check the ownership of a website or domain. The record may contain sensitive data that can be put to use: a phone number, an address, an e-mail, the registrant's real name. Quite often a domain exposes this information because no Whois privacy protection is in place :D
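To make the Whois step concrete, here is a minimal sketch in Python. It is not Belati's own code: the function names, the default server `whois.iana.org`, and the list of field-name hints are assumptions chosen for illustration. It sends a plain WHOIS query over TCP port 43 and then filters the response for fields that look like contact details.

```python
import socket

# Field-name fragments that often mark contact details (illustrative, not exhaustive).
CONTACT_HINTS = ("registrant", "admin", "tech", "e-mail", "email", "phone", "address")


def whois_query(domain, server="whois.iana.org", port=43, timeout=10):
    """Send a raw WHOIS query (TCP port 43) and return the response as text."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def extract_contact_fields(whois_text):
    """Return (key, value) pairs whose key looks like contact information."""
    found = []
    for line in whois_text.splitlines():
        # Skip comment lines and lines that are not "key: value" pairs.
        if ":" not in line or line.lstrip().startswith(("%", "#")):
            continue
        key, value = line.split(":", 1)
        key, value = key.strip(), value.strip()
        if value and any(hint in key.lower() for hint in CONTACT_HINTS):
            found.append((key, value))
    return found
```

Calling `extract_contact_fields(whois_query("example.com"))` would list whatever contact fields the registry publishes. An empty result usually means the record is redacted or privacy-protected.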

After that, Belati performs HTTP banner grabbing to check which web server is running on the domain. Apache? Nginx? Which version? Which HTTP security headers are set?
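A banner grab of this kind can be sketched with Python's standard library alone. The helper names and the short list of security headers below are assumptions for illustration, not Belati's actual implementation. The sketch sends a HEAD request and reports the Server banner plus any common security headers that are absent.

```python
import http.client

# A few widely used security headers (illustrative selection).
SECURITY_HEADERS = (
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
)


def fetch_headers(host, use_tls=True, timeout=10):
    """Issue a HEAD request for "/" and return the response headers as a dict."""
    conn_cls = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    conn = conn_cls(host, timeout=timeout)
    try:
        conn.request("HEAD", "/", headers={"User-Agent": "banner-sketch/0.1"})
        return dict(conn.getresponse().getheaders())
    finally:
        conn.close()


def summarize_banner(headers):
    """Pick out the server banner and list the security headers that are missing."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "server": lowered.get("server"),
        "powered_by": lowered.get("x-powered-by"),
        "missing_security_headers": [
            name for name in SECURITY_HEADERS if name.lower() not in lowered
        ],
    }
```

`summarize_banner(fetch_headers("example.com"))` answers the questions above in one dictionary. Keep in mind that many servers hide or falsify the Server header, so the banner is a hint rather than proof.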

We also want to know how many subdomains exist and what they are. Belati collects this data from search engines and other services; for this it uses sublist3r as a plugin to make the work easier and more efficient, with GeoIP support as a bonus ;)

Once all the subdomains are collected, we can map the site's network, because the information obtained tells us more about which services exist on the domain. This information is generally public and discoverable. With the subdomain list in hand, mapping the network of a person, organization, or company becomes much easier.

For each subdomain, Belati checks ports 80/443 and runs a Web Appalyzer scan to collect data about which services and plugins that subdomain uses.
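The port check on 80/443 can be illustrated with a plain TCP connect test. This is a sketch under assumptions (the function name and timeout are made up here), and it covers only the reachability check, not the technology fingerprinting that follows.

```python
import socket


def open_web_ports(host, ports=(80, 443), timeout=3.0):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    reachable = []
    for port in ports:
        try:
            # A completed TCP handshake is enough to call the port open.
            with socket.create_connection((host, port), timeout=timeout):
                reachable.append(port)
        except OSError:
            # Refused, timed out, or unresolvable: treat as closed.
            pass
    return reachable
```

Running this over every collected subdomain tells you which hosts are worth fingerprinting and whether to talk to them over HTTP, HTTPS, or both.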

Beyond that, Belati also runs an Nmap scan against the IP addresses obtained from the subdomain list, because I am sure a lot of sensitive information can be obtained this way. This is useful later, during exploitation. It just takes some time, because it is a full scan with the nmap -sS -A -Pn options.
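As a rough illustration of how such a scan could be driven from Python, the sketch below builds the command line with the options named above and runs it through `subprocess`. The function names are hypothetical and this is not taken from Belati's source. Note that `-sS` (SYN scan) normally needs root privileges, and that scanning hosts without permission may be illegal.

```python
import shutil
import subprocess


def build_nmap_command(targets):
    """Build the argument list for a full scan of the given IP addresses."""
    unique_targets = list(dict.fromkeys(targets))  # drop duplicates, keep order
    # -sS: TCP SYN scan, -A: OS/version/script detection, -Pn: skip host discovery
    return ["nmap", "-sS", "-A", "-Pn", *unique_targets]


def run_nmap(targets, timeout=3600):
    """Run the scan and return nmap's text output."""
    if shutil.which("nmap") is None:
        raise RuntimeError("nmap is not installed or not on PATH")
    result = subprocess.run(
        build_nmap_command(targets),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
    return result.stdout
```

Deduplicating the targets first matters in practice, because many subdomains tend to resolve to the same IP address and a full scan of each one is slow.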

Do not forget that Belati also checks MX and NS records: there may be a mail server there, and the records also tell us where exactly the domain is hosted.
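One simple way to look at MX and NS records without extra Python packages is to call the `dig` command and parse its `+short` output. The sketch below assumes `dig` is installed; the helper names are invented for this example, and Belati itself may query DNS differently.

```python
import subprocess


def dig_short(domain, record_type):
    """Run `dig +short <type> <domain>` and return its raw output."""
    result = subprocess.run(
        ["dig", "+short", record_type, domain],
        capture_output=True,
        text=True,
        timeout=30,
        check=False,
    )
    return result.stdout


def parse_mx(output):
    """Parse lines like '10 mail.example.com.' into sorted (priority, host) pairs."""
    records = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].isdigit():
            records.append((int(parts[0]), parts[1].rstrip(".")))
    return sorted(records)


def parse_ns(output):
    """Parse one name server per line into a sorted list of host names."""
    return sorted(line.strip().rstrip(".") for line in output.splitlines() if line.strip())
```

For example, `parse_mx(dig_short("example.com", "MX"))` gives the mail servers in priority order, which often reveals whether mail is handled in-house or by a third-party provider.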

A website usually has its own email addresses, especially when it belongs to a company or organization of some size with its own domain. Belati collects public email addresses for a domain through search engines; currently only Google Search is supported.
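Whatever the source of the text (search results or a fetched page), the harvesting itself comes down to pattern matching. The following is a minimal sketch with a hypothetical `harvest_emails` helper; the regular expression is deliberately simple and will not cover every valid address format.

```python
import re


def harvest_emails(text, domain):
    """Return sorted, lowercased addresses in `text` that belong to `domain`."""
    pattern = re.compile(
        r"[A-Za-z0-9._%+-]+@"           # local part
        r"(?:[A-Za-z0-9-]+\.)*"         # optional subdomain labels
        + re.escape(domain)
        # Reject look-alikes such as user@example.com.evil.org
        + r"(?![A-Za-z0-9-]|\.[A-Za-z0-9])",
        re.IGNORECASE,
    )
    return sorted({match.group(0).lower() for match in pattern.finditer(text)})
```

Restricting matches to the target domain (and its subdomains) keeps unrelated addresses that happen to appear on the same page out of the result.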

Public documents? Yes! Belati can collect public documents that have been indexed by Google. Why? Because there are cases where a website is used as storage and its files end up indexed by Google. What if the data is important, has already spread, and has been downloaded by the public? In future development Belati may get a feature for fuzzing public documents on a website.
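Finding indexed documents typically relies on Google's `site:` and `filetype:` search operators. The sketch below only builds the queries and their search URLs; the function names and the list of file types are assumptions, and fetching and parsing the result pages (which is subject to Google's terms and rate limits) is left out.

```python
from urllib.parse import urlencode

# Document types commonly worth checking (illustrative list).
DOCUMENT_TYPES = ("pdf", "doc", "docx", "xls", "xlsx", "ppt", "pptx")


def document_queries(domain, filetypes=DOCUMENT_TYPES):
    """Build one 'site:... filetype:...' query per document type."""
    return [f"site:{domain} filetype:{ext}" for ext in filetypes]


def search_url(query):
    """Return the Google search URL for a query string."""
    return "https://www.google.com/search?" + urlencode({"q": query})
```

Pasting one of the generated queries, such as `site:example.com filetype:pdf`, into a browser is a quick manual way to see what a domain has exposed.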

So, what if the case above happens? What if public information that is considered trivial turns out to have a fatal impact?

Belati Project on Github

The Belati project is still in progress; the author, Aan Wahyu a.k.a. Petruknisme, says that Belati has a lot more improvement and development ahead. It is open source! If you are interested in contributing, you can mail the author at: cacaddv[at]

Or you can find the project on GitHub at
