Debugging

Logging

Each run of the crawler writes its logs to a new, timestamped file with a name such as spider_output_2024-03-30T22:06:24.315Z.log. The contents of the file look like this:

[2024-03-30T22:03:48.249Z] [DEBUG] successfully registered search plugin    algolia
[2024-03-30T22:03:50.371Z] [DEBUG] Scraper page settings (using settings group default) {"default":{"hierarchySelectors":{"l0":".content h3","l1":".content h4","content":".content p"}},"shared":{"onlyContentLevel":true}}
[2024-03-30T22:03:50.371Z] [DEBUG] scraper task: shouldAbort==false
[2024-03-30T22:03:50.371Z] [DEBUG] Scraping URL https://website.com
[2024-03-30T22:03:52.717Z] [DEBUG] Page title Title
[2024-03-30T22:03:52.718Z] [DEBUG] Page selector matches [...]
[2024-03-30T22:03:57.889Z] [DEBUG] Page metadata {}
[2024-03-30T22:03:57.895Z] [DEBUG] Page links [...]
[2024-03-30T22:03:58.140Z] [DEBUG] Page indexing done!
{
  "remainingQueueSize": 49,
  "totalScrapedPages": 3,
  "totalIndexedRecords": 53
}
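Each log line follows a "[timestamp] [LEVEL] message" pattern, while multi-line payloads such as the summary JSON above continue without a prefix. A minimal sketch of parsing these lines (the regex and helper name are illustrative, not part of the crawler):

```python
import re

# Matches the "[timestamp] [LEVEL] message" prefix shown in the sample log.
LINE_RE = re.compile(r"^\[([^\]]+)\] \[(DEBUG|WARN|ERROR)\] (.*)$")

def parse_line(line):
    """Return (timestamp, level, message) for a prefixed log line,
    or None for continuation lines such as pretty-printed JSON."""
    m = LINE_RE.match(line)
    return m.groups() if m else None

sample = "[2024-03-30T22:03:50.371Z] [DEBUG] Scraping URL https://website.com"
print(parse_line(sample))
```

Continuation lines return None, so a consumer can attach them to the most recent prefixed entry.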

Log level

The verbosity of the logs is controlled by the logLevel property in the config.json file. It accepts three severity levels:

  • debug - logs of all severity levels are included.
  • warn - only logs of severity warn and higher (warn, error) are included.
  • error - only logs of severity error are included.
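For example, to keep only warnings and errors, config.json would contain (other properties omitted):

```json
{
  "logLevel": "warn"
}
```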