How do you collect links from websites that try to block you from crawling them?

We do our best to crawl sites comprehensively, but in some rare cases we don't crawl a site. These include:
 - websites that specifically block our bot in robots.txt
 - CDNs that block all bots except Google from crawling
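
To illustrate the first case, here is a minimal sketch of how a robots.txt rule can block one specific bot while leaving the site open to others. It uses Python's standard `urllib.robotparser`. The bot name `ExampleBot`, the sample robots.txt, the URL, and the helper `is_allowed` are all hypothetical and only stand in for a real crawler and site; this is not a description of how our crawler is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks "ExampleBot" everywhere, allows all other bots.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/page"  # placeholder URL
    print(is_allowed(ROBOTS_TXT, "ExampleBot", page))
    print(is_allowed(ROBOTS_TXT, "OtherBot", page))
```

A crawler that respects robots.txt would skip the site whenever a check like this returns false for its own user agent.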

Again, these are very rare cases compared to the number of domains we crawl.

Updated on: 14/08/2019
