Block crawlers with Caddy
A quick tip on how to block crawlers from accessing your website by rejecting requests containing their user agent tokens in Caddy.
- If you have yet to install Caddy, please refer to the installation instructions.
- Go into the folder containing the Caddyfile.
cd /etc/caddy
- Edit Caddyfile.
sudo nano /etc/caddy/Caddyfile
- Add a named request matcher that lists the bots' user agent tokens, together with a handler that aborts every matching request.
@crawlers {
    header_regexp User-Agent (?i)(ChatGPT-User|cohere-ai|anthropic-ai|Bytespider|CCBot|FacebookBot|Google-Extended|GPTBot|omgili|Amazonbot|Applebot|PerplexityBot|YouBot)
}
handle @crawlers {
    abort
}
The above example list contains only AI-related crawlers. For a full list of bot user agents refer to Dark Visitors.
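To get a feel for what the pattern matches before touching the server, the same alternation can be tried locally against sample user agent strings. This is only a rough sketch: `would_block` is a hypothetical helper name, the sample user agent strings are made up for illustration, and `grep -E -i` stands in for Caddy's `(?i)` flag because POSIX extended regular expressions have no inline flags.

```shell
#!/bin/sh
# Same alternation as in the Caddyfile matcher, without the (?i) prefix.
PATTERN='(ChatGPT-User|cohere-ai|anthropic-ai|Bytespider|CCBot|FacebookBot|Google-Extended|GPTBot|omgili|Amazonbot|Applebot|PerplexityBot|YouBot)'

# would_block is a hypothetical helper: prints "blocked" when the given
# User-Agent string contains one of the tokens (case-insensitive),
# otherwise "allowed".
would_block() {
    if printf '%s\n' "$1" | grep -E -i -q "$PATTERN"; then
        echo "blocked"
    else
        echo "allowed"
    fi
}

would_block 'Mozilla/5.0 (compatible; GPTBot/1.0)'           # blocked
would_block 'mozilla/5.0 (compatible; gptbot/1.0)'           # blocked
would_block 'Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0'  # allowed
```

The pattern is not anchored, so a token is matched anywhere inside the header value, which is why a full browser-style string containing GPTBot is still caught.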
Example full config for a static website:
my-website.com {
    @crawlers {
        header_regexp User-Agent (?i)(ChatGPT-User|cohere-ai|anthropic-ai|Bytespider|CCBot|FacebookBot|Google-Extended|GPTBot|omgili|Amazonbot|Applebot|PerplexityBot|YouBot)
    }
    handle @crawlers {
        abort
    }
    file_server
    root * /var/www/my-website
}
- Reformat Caddyfile.
sudo caddy fmt --overwrite
- Validate Caddyfile. Make sure there are no errors before moving on to the next steps.
sudo caddy validate
- Restart Caddy.
sudo systemctl restart caddy
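The validate and restart steps can also be chained so that Caddy is only restarted when the Caddyfile passes validation. A minimal sketch, assuming the Caddyfile lives at /etc/caddy/Caddyfile as in this guide; `apply_caddyfile` is a hypothetical helper name, not part of Caddy.

```shell
#!/bin/sh
# Hypothetical helper: restart Caddy only when the Caddyfile validates.
# The default path is the one used in this guide; pass another path as $1.
apply_caddyfile() {
    config=${1:-/etc/caddy/Caddyfile}
    if sudo caddy validate --config "$config"; then
        sudo systemctl restart caddy
    else
        echo "validation failed, Caddy was not restarted" >&2
        return 1
    fi
}
```

Running `apply_caddyfile` without arguments uses the default path; a non-zero return value means the running server was left untouched.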
- Check if Caddy’s running correctly.
sudo systemctl status caddy
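- (Optional) Check that the request is rejected by using curl or another tool. Because abort closes the connection without sending a response, curl should report an error instead of printing the page.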
curl --request GET --url https://my-website.com/ --header 'User-Agent: ChatGPT-User'
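To check several user agents in one go, the curl call can be wrapped in a small function. This is a sketch under assumptions: `check_agent` is a hypothetical helper name, and it treats any curl failure as a likely block, so an unrelated network or TLS error would be reported the same way.

```shell
#!/bin/sh
# Hypothetical helper: requests a URL with the given User-Agent and reports
# whether curl received any HTTP response at all.
check_agent() {
    url=$1
    agent=$2
    # --silent hides progress output; --output /dev/null discards the body.
    if curl --silent --output /dev/null --user-agent "$agent" "$url"; then
        echo "$agent: response received (not blocked)"
    else
        # In the else branch, $? still holds curl's exit code.
        echo "$agent: no response, curl exit code $? (likely blocked)"
    fi
}
```

For example, `check_agent https://my-website.com/ GPTBot` should report a failure once the matcher is active, while the same call with an ordinary browser user agent should report a response.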