
Page status checker

This Automation Job will create a Page Status Insight

Maintaining a website can be time-consuming, as you need to check for issues on a regular basis. This Automation Job helps by crawling your site every day and checking for pages that return a non-successful status code (anything outside the 2xx range).

Simply select the status codes you want to be notified about (a clear explanation appears beside each code so you know what it means), then set your notification configuration to be alerted via email and/or Slack. You can even use Zapier to create a customised workflow if you need to.
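
Conceptually, the daily check behaves like the sketch below. This is only an illustration, not our crawler: the URLs and the selected alert codes are hypothetical, and the real job discovers pages by crawling your site rather than reading a fixed list.

    import requests

    # Hypothetical pages discovered by the crawl
    pages = [
        "https://www.example.com/",
        "https://www.example.com/pricing",
    ]

    # Hypothetical selection of status codes to be alerted about
    alert_codes = {404, 410, 500, 503}

    for url in pages:
        try:
            response = requests.get(url, timeout=10)
        except requests.RequestException as exc:
            print(f"{url}: request failed ({exc})")
            continue
        if response.status_code in alert_codes:
            # The real job would raise a Page Status Insight and notify you here
            print(f"{url}: returned {response.status_code}")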

Advanced Settings

Exclude Pages - Sometimes there are technical reasons to exclude certain pages from crawls: for example, external links that cannot be accessed outside a geographic area, or pages that are known to return error codes you do not want to be alerted about.

To exclude one or more pages from generating an Insight, enter a regex pattern that matches them. There are numerous tools that can help you build the pattern, such as regex101. It's important to understand that you will need to escape some characters when entering them in the field.

Example 1: You wish to exclude 2 of your own pages
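
For instance, assuming the two pages live at example.com (a hypothetical domain used here for illustration), a pattern like the one below matches both. Note that the dots are escaped with a backslash; depending on the regex flavour the field expects, you may also need to escape the forward slashes (https:\/\/...).

    https://www\.example\.com/(pricing|about-us)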

Example 2: You wish to exclude an external site
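
For instance, to exclude every link pointing at a hypothetical external site, a pattern like this matches any URL on that domain:

    https://external-site\.example/.*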

Check External Links - External links are less critical to your website's performance, so they are ignored unless this option is checked.

Respect Robots.txt - Use this setting if you only want our crawler to visit pages that are allowed by the directives in your robots.txt file.
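
For example, with this setting enabled, a robots.txt like the following (the paths are hypothetical) would keep our crawler out of the /admin/ and /tmp/ sections of your site:

    User-agent: *
    Disallow: /admin/
    Disallow: /tmp/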

Links using the following schemes are always ignored by our crawler:

  • mailto:
  • tel:
  • javascript:
  • callto:
  • wtai:
  • sms:
  • market:
  • geopoint:
  • ymsgr:
  • msnim:
  • gtalk:
  • skype:
  • sip:
  • whatsapp:

Some website platforms, such as Shopify, are known to throw 430 errors (where a 429 would be the correct code) when many pages are requested in a short period of time; this can cause the Automation to slow down.
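
If you are diagnosing this behaviour on your own site, one common workaround is to treat the nonstandard 430 exactly like a standard 429 and back off before retrying. Below is a minimal sketch of that idea; the retry count and delay are arbitrary, and this is not necessarily how our crawler is implemented.

    import time
    import requests

    def fetch_with_backoff(url, retries=3, delay=5.0):
        # Treat the nonstandard 430 the same as a standard 429
        response = None
        for attempt in range(retries):
            response = requests.get(url, timeout=10)
            if response.status_code not in (429, 430):
                return response
            time.sleep(delay * (attempt + 1))  # wait longer on each retry
        return response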
