How it Works
The crawler automatically crawls your website, following the links it finds that match a certain pattern. By default, a pattern is created for you based on the starting link you provide. For example, if you provided:
The crawler will only follow links that start with
Picking a Good Starting Page
Because the crawler only follows links, choose a starting URL whose page links to the other subdirectories and pages you want crawled. For example:
https://support.example.com: A subdomain of the main website that contains all the desired content.
https://example.com/help: A path off the main website that contains all the desired content.
Support for Schema.org FAQPage
If the crawler finds an embedded schema.org FAQPage type on a page, it parses it and automatically adds the FAQs to the Answers content.
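As an illustration of what such embedded markup looks like and how it can be read, here is a minimal Python sketch. It is not Studio's implementation. It assumes the FAQPage is embedded as JSON-LD in a script tag (schema.org data can also be embedded as microdata, which this sketch ignores), and the names JsonLdCollector, extract_faqs, and SAMPLE_HTML are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_faqs(html):
    """Return (question, answer) pairs from any FAQPage JSON-LD found in html."""
    collector = JsonLdCollector()
    collector.feed(html)
    faqs = []
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole page
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict) or item.get("@type") != "FAQPage":
                continue
            questions = item.get("mainEntity", [])
            if isinstance(questions, dict):
                questions = [questions]
            for question in questions:
                answer = question.get("acceptedAnswer", {})
                if not isinstance(answer, dict):
                    answer = {}
                faqs.append((question.get("name", ""), answer.get("text", "")))
    return faqs


# Hypothetical page content used only for this example.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": [
  {"@type": "Question", "name": "How do I reset my password?",
   "acceptedAnswer": {"@type": "Answer",
                      "text": "Use the reset link on the sign-in page."}}
]}
</script>
</head><body></body></html>
"""

if __name__ == "__main__":
    for question, answer in extract_faqs(SAMPLE_HTML):
        print(question, "->", answer)
```

Each Question entry under mainEntity carries its text in name and its answer in acceptedAnswer.text, which is the structure schema.org defines for FAQPage.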
By default, Studio sets a pattern that crawls pages within the same domain as the starting page. The pattern generated for the starting page:
The patterns use a special notation (see the [] above) for defining which websites should be visited. Other web crawlers, such as Apify, use the same notation, and Apify's documentation describes it in more detail.
The notation allows you to put regular expressions inside the []. Looking back at the above example, the (www.)? within the brackets means the www. is optional, and the .+ means anything beyond the initial example.com is acceptable.
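To make the notation concrete, the following Python sketch converts a bracketed pattern into an ordinary regular expression and tests a few URLs against it. It is a rough approximation under stated assumptions, not the code Studio or Apify actually runs: text outside the brackets is treated as literal, text inside is treated as a regular expression, and matching is case-insensitive. The function names and the EXAMPLE_PATTERN value are hypothetical, and the pattern is illustrative only; it is not necessarily the exact pattern Studio generates.

```python
import re


def pattern_to_regex(pattern):
    """Compile a bracketed pattern: literal text outside [], regex inside []."""
    parts = []
    chunk = []
    depth = 0  # bracket nesting depth; 0 means we are in literal text
    for ch in pattern:
        if ch == "[" and depth == 0:
            # Entering a regex section: flush the literal text, escaped.
            parts.append(re.escape("".join(chunk)))
            chunk = []
            depth = 1
        elif ch == "[":
            # A nested bracket belongs to the regex itself, e.g. [[a-z]+].
            depth += 1
            chunk.append(ch)
        elif ch == "]" and depth == 1:
            # Leaving the regex section: keep its content unescaped.
            parts.append("(?:" + "".join(chunk) + ")")
            chunk = []
            depth = 0
        elif ch == "]" and depth > 1:
            depth -= 1
            chunk.append(ch)
        else:
            chunk.append(ch)
    if depth != 0:
        raise ValueError("unbalanced brackets in pattern: " + pattern)
    parts.append(re.escape("".join(chunk)))
    return re.compile("".join(parts), re.IGNORECASE)


def url_matches(pattern, url):
    """Return True if the whole URL matches the bracketed pattern."""
    return pattern_to_regex(pattern).fullmatch(url) is not None


# Illustrative pattern (an assumption, not taken from Studio).
EXAMPLE_PATTERN = "http[s?]://[(www.)?]example.com[.+]"

if __name__ == "__main__":
    for url in (
        "https://www.example.com/help",
        "http://example.com/help/article",
        "https://support.example.com/faq",
        "https://other.com/help",
    ):
        print(url, url_matches(EXAMPLE_PATTERN, url))
```

With this illustrative pattern, the first two URLs match and the last two do not. A starting page on a subdomain such as support.example.com would need its own pattern, because the (www.)? section only makes the www. prefix optional.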