The TLS-Crawler is designed to perform large-scale scans with TLS-Scanner. To improve performance, it supports distributing the workload across multiple machines.
The TLS-Crawler is split into two applications:
- the Controller, which is responsible for creating scan jobs, distributing them among multiple workers, and scheduling recurring scans
- the Worker, which is responsible for performing the scans
Additionally, two more components are required for distributing the work and persisting the results:
- a RabbitMQ instance, which handles the distribution of work among the workers
- a MongoDB database instance, which stores the results of the scans.
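For a quick local setup, both services can be started with Docker. The following commands are only a sketch; the image tags and credentials are placeholders (chosen to match the example commands below) and should be adapted to your environment.
# Sketch: start RabbitMQ and MongoDB locally with Docker
docker run -d --name rabbitmq -p 5672:5672 rabbitmq:3
docker run -d --name mongo -p 27017:27017 -e MONGO_INITDB_ROOT_USERNAME=mongoadmin -e MONGO_INITDB_ROOT_PASSWORD=mongoadminpw mongo:6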
Controller
java -jar tls-crawler.jar controller -tranco 1000 -denylist resources/denylist.txt -portToBeScanned 443 -mongoDbHost mongo -mongoDbPort 27017 -mongoDbUser mongoadmin -mongoDbPass mongoadminpw -mongoDbAuthSource admin -rabbitMqHost rabbitmq -rabbitMqPort 5672 -scanName scanResults -monitorScan
Worker
java -jar tls-crawler.jar worker -mongoDbHost mongo -mongoDbPort 27017 -mongoDbUser mongoadmin -mongoDbPass mongoadminpw -mongoDbAuthSource admin -rabbitMqHost rabbitmq -rabbitMqPort 5672 -numberOfThreads 30 -parallelProbeThreads 50
Scanner-specific Configuration:
-portToBeScanned    the port that should be scanned
-timeout            the timeout to use inside the TLS-Scanner
-reexecutions       the number of re-executions to use in the TLS-Scanner
-starttls           which STARTTLS protocol should be used (required when scanning email servers)
-scanDetail         level of detail for the scanner (ALL, DETAILED, NORMAL, QUICK)
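These options are appended to the controller invocation (as with -portToBeScanned in the example above). The following is only a sketch; the numeric values are placeholders, not recommended settings.
java -jar tls-crawler.jar controller -tranco 1000 -portToBeScanned 443 -scanDetail NORMAL -timeout 2000 -reexecutions 3 -mongoDbHost mongo -mongoDbPort 27017 -mongoDbUser mongoadmin -mongoDbPass mongoadminpw -mongoDbAuthSource admin -rabbitMqHost rabbitmq -rabbitMqPort 5672 -scanName scanResults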
General Configuration:
-scanName <string>      the name of the scan
-hostFile <path>        a file with the list of servers that should be scanned (see the Hostfile section below)
-denylistFile <path>    a file with hosts/IP addresses or IP ranges that should not be scanned
-notifyUrl <url>        URL to which an HTTP POST request is sent when a bulk scan is finished
-monitorScan            if set, the controller monitors the progress of the scan and periodically logs how many servers have been scanned
-tranco <X>             if set, the controller downloads the most recent Tranco list and scans the top X hosts
-numberOfThreads        number of worker threads the crawler worker should use
-parallelProbeThreads   number of parallel probe threads the crawler worker should use
-scanTimeout            timeout after which the crawler tries to stop a scan that is still running (should be lower than the RabbitMQ consumer acknowledgement timeout, because messages are only acknowledged at the end of a scan or when the scan timeout is reached)
-mongoDbHost            host of the MongoDB instance in which the crawler saves the results
-mongoDbPort            port of the MongoDB instance in which the crawler saves the results
-mongoDbUser            username used to authenticate with MongoDB
-mongoDbPass            password used to authenticate with MongoDB
-mongoDbPassFile        path to a file from which the password is read (for use with Docker secrets)
-mongoDbAuthSource      the database within the MongoDB instance in which the user credentials are defined
-rabbitMqHost           hostname or IP of the RabbitMQ instance
-rabbitMqPort           port of the RabbitMQ instance
-rabbitMqUser           username used to authenticate with RabbitMQ
-rabbitMqPass           password used to authenticate with RabbitMQ
-rabbitMqPassFile       path to a file from which the password is read (for use with Docker secrets)
-rabbitMqTLS            if set, the connection to the RabbitMQ instance is TLS encrypted
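When the worker runs as a Docker service, the plain-text password flags can be replaced with the *PassFile variants so that passwords are read from mounted Docker secrets. The following is only a sketch: the secret name mongo_pw is an assumption, while /run/secrets/ is the standard mount point for Docker secrets; -rabbitMqPassFile works analogously.
# Sketch: read the MongoDB password from a mounted Docker secret instead of passing it on the command line
java -jar tls-crawler.jar worker -mongoDbHost mongo -mongoDbPort 27017 -mongoDbUser mongoadmin -mongoDbPassFile /run/secrets/mongo_pw -mongoDbAuthSource admin -rabbitMqHost rabbitmq -rabbitMqPort 5672 -numberOfThreads 30 -parallelProbeThreads 50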
Hostfile
The hosts to be scanned can be specified as a domain or IP address, with or without a port. Each line must contain exactly one host.
If no port is specified, the port passed with the -portToBeScanned parameter is used (defaults to 443).
Example:
www.google.com:443
amazon.de
131.234.238.217:8080
131.234.238.217
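A controller run that scans the hosts from such a file could look like the following sketch; the file path and scan name are placeholders, and the connection parameters mirror the controller example above.
java -jar tls-crawler.jar controller -hostFile resources/hosts.txt -portToBeScanned 443 -mongoDbHost mongo -mongoDbPort 27017 -mongoDbUser mongoadmin -mongoDbPass mongoadminpw -mongoDbAuthSource admin -rabbitMqHost rabbitmq -rabbitMqPort 5672 -scanName hostFileScan -monitorScan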
Instead of building the crawler yourself and starting it from the command line, you can use docker-compose.yml and build.sh.
All parameters for the scan can be defined in docker-compose.yml. build.sh builds the crawler from selectable branches of its dependencies (TLS-Attacker etc.). Specify your GitHub username and access token so that Docker can download and build the dependencies. Running ./build.sh then builds your custom crawler environment as Docker images, and docker-compose up starts the scan. You can either use the MongoDB instance defined in docker-compose.yml or an external database.
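The overall workflow then looks roughly like this (a sketch; how build.sh expects the GitHub credentials to be provided is not shown here and may differ):
./build.sh           # build the crawler and its dependencies into Docker images
docker-compose up    # start the services defined in docker-compose.yml and begin the scan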