Block crawlers with Caddy

A quick tip on blocking crawlers from your website: configure Caddy to reject any request whose User-Agent header contains a known crawler token.

  1. If you have yet to install Caddy, please refer to the installation instructions.

  2. Change into the directory containing the Caddyfile.

cd /etc/caddy
  3. Edit the Caddyfile.
sudo nano /etc/caddy/Caddyfile
  4. Add a named request matcher listing the bots' user agent tokens, plus a handler that aborts matching requests.
@crawlers {
	header_regexp User-Agent (?i)(ChatGPT-User|cohere-ai|anthropic-ai|Bytespider|CCBot|FacebookBot|Google-Extended|GPTBot|omgili|Amazonbot|Applebot|PerplexityBot|YouBot)
}
handle @crawlers {
	abort
}

The example list above contains only AI-related crawlers. For a full list of bot user agents, refer to Dark Visitors.
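Before editing the token list, it can help to check which User-Agent strings the pattern would catch. The sketch below approximates the matcher with `grep -E`; the `classify_ua` helper and the sample User-Agent strings are made up for illustration. Caddy evaluates the expression with Go's regexp engine, not grep, so this is only a rough stand-in that should behave the same for a plain alternation like this one.

```shell
#!/bin/sh
# Same alternation as the @crawlers matcher, minus the (?i) prefix:
# grep -E does not understand (?i), so the -i flag supplies
# case-insensitivity instead.
CRAWLER_PATTERN='ChatGPT-User|cohere-ai|anthropic-ai|Bytespider|CCBot|FacebookBot|Google-Extended|GPTBot|omgili|Amazonbot|Applebot|PerplexityBot|YouBot'

# Hypothetical helper: prints "blocked" if the given User-Agent string
# contains one of the tokens anywhere, "allowed" otherwise.
classify_ua() {
	if printf '%s\n' "$1" | grep -Eiq "$CRAWLER_PATTERN"; then
		echo "blocked"
	else
		echo "allowed"
	fi
}
```

With these definitions, `classify_ua 'Mozilla/5.0 (compatible; GPTBot/1.0)'` prints `blocked`, while `classify_ua 'Mozilla/5.0 (X11; Linux x86_64) Firefox/125.0'` prints `allowed`. The pattern is not anchored, so a token matches wherever it appears in the header value.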

Example full config for a static website:

my-website.com {
	@crawlers {
		header_regexp User-Agent (?i)(ChatGPT-User|cohere-ai|anthropic-ai|Bytespider|CCBot|FacebookBot|Google-Extended|GPTBot|omgili|Amazonbot|Applebot|PerplexityBot|YouBot)
	}

	handle @crawlers {
		abort
	}

	file_server

	root * /var/www/my-website
}
  5. Reformat the Caddyfile.
sudo caddy fmt --overwrite
  6. Validate the Caddyfile. Make sure there are no errors before continuing.
sudo caddy validate
  7. Restart Caddy.
sudo systemctl restart caddy
  8. Check that Caddy is running correctly.
sudo systemctl status caddy
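The validate, restart, and status steps can also be chained so that Caddy is only restarted when the configuration is valid. This is a minimal sketch, not part of the original steps: the `validate_and_restart` function name is hypothetical, and it assumes the config path and the `caddy` systemd service name used above.

```shell
#!/bin/sh
# Hypothetical helper: restart Caddy only if the Caddyfile validates.
# Assumes the Caddyfile lives at /etc/caddy/Caddyfile unless a path is given.
validate_and_restart() {
	config="${1:-/etc/caddy/Caddyfile}"
	if sudo caddy validate --config "$config"; then
		# Restart, then report whether the service came back up.
		sudo systemctl restart caddy && sudo systemctl is-active caddy
	else
		echo "Validation failed; Caddy was not restarted." >&2
		return 1
	fi
}
```

Running `validate_and_restart` with no argument uses the default path. A failed validation leaves the running instance untouched.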
  9. (Optional) Check that the request is properly rejected, using curl or another tool.
curl --request GET --url https://my-website.com/ --header 'User-Agent: ChatGPT-User'
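To compare a blocked user agent against a regular one, the check above can be wrapped in a small function. This is a sketch under assumptions: `check_ua` is a hypothetical helper, and it relies on `abort` closing the connection without sending a response, which curl reports as a non-zero exit status.

```shell
#!/bin/sh
# Hypothetical helper: request a URL with a given User-Agent and report
# whether any HTTP response came back.
check_ua() {
	url="$1"
	ua="$2"
	# Without --fail, curl exits 0 for any HTTP response, whatever its
	# status code. A connection closed by `abort` makes curl exit non-zero.
	if curl --silent --output /dev/null --max-time 10 --user-agent "$ua" "$url"; then
		echo "allowed"
	else
		echo "rejected"
	fi
}
```

With the config above in place, `check_ua https://my-website.com/ 'ChatGPT-User'` should print `rejected` and `check_ua https://my-website.com/ 'Mozilla/5.0'` should print `allowed`. A `rejected` result can also come from unrelated failures such as DNS errors or timeouts, so compare both calls before drawing conclusions.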