Debugging
Logging
Each run of the crawler writes its logs to a new file whose name looks like this: spider_output_2024-03-30T22:06:24.315Z.log.
The contents of the file look like:
[2024-03-30T22:03:48.249Z] [DEBUG] successfully registered search plugin algolia
[2024-03-30T22:03:50.371Z] [DEBUG] Scraper page settings (using settings group default) {"default":{"hierarchySelectors":{"l0":".content h3","l1":".content h4","content":".content p"}},"shared":{"onlyContentLevel":true}}
[2024-03-30T22:03:50.371Z] [DEBUG] scraper task: shouldAbort==false
[2024-03-30T22:03:50.371Z] [DEBUG] Scraping URL https://website.com
[2024-03-30T22:03:52.717Z] [DEBUG] Page title Title
[2024-03-30T22:03:52.718Z] [DEBUG] Page selector matches [...]
[2024-03-30T22:03:57.889Z] [DEBUG] Page metadata {}
[2024-03-30T22:03:57.895Z] [DEBUG] Page links [...]
[2024-03-30T22:03:58.140Z] [DEBUG] Page indexing done!
{
"remainingQueueSize": 49,
"totalScrapedPages": 3,
"totalIndexedRecords": 53
}
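Since every severity tag in the log appears in square brackets (the [DEBUG] tag is shown above; the exact [WARN] and [ERROR] spellings are an assumption), a run's log can be narrowed to the noteworthy lines with grep. The sample file below is fabricated for illustration:

```shell
# Build a tiny sample log (lines are illustrative, not real crawler output;
# the WARN/ERROR tag spellings are assumed to match the DEBUG tag style)
log=spider_output_example.log
printf '%s\n' \
  '[2024-03-30T22:03:48.249Z] [DEBUG] successfully registered search plugin algolia' \
  '[2024-03-30T22:03:50.100Z] [WARN] slow response from https://website.com' \
  '[2024-03-30T22:03:51.200Z] [ERROR] giving up on https://website.com' \
  > "$log"

# Keep only warnings and errors
grep -E '\[(WARN|ERROR)\]' "$log"
```

This prints only the [WARN] and [ERROR] lines of the sample file, which is handy when a long debug-level log buries the failures.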
Log level
The verbosity of the logs can be controlled with the logLevel property of the config.json file. It accepts three severity levels:
debug - logs of all severity levels will be included.
warn - logs of severity warn and higher will be included.
error - only logs of severity error will be included.
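Putting this together, a minimal config.json might set the level like so (a sketch: any other keys the crawler requires are omitted here; only logLevel comes from the text above):

```json
{
  "logLevel": "warn"
}
```

With warn, the [DEBUG] lines shown earlier would be suppressed, and only warnings and errors would reach the log file.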