
Detecting robots and webcrawlers by User-Agent string


The web is full of robots. If you manually collect pageview statistics, they'll definitely be full of requests made by webcrawlers. This post explains a simple but effective approach for detecting webcrawlers by User-Agent string.

The function below will detect many types of robots:

function is_robot(string $userAgent): bool
{
    // An empty User-Agent is suspicious: real browsers always send one.
    if (! $userAgent) {
        return true;
    }

    $userAgent = mb_strtolower($userAgent);

    // str_contains() requires PHP 8; on PHP 7 use strpos($userAgent, $word) !== false.
    foreach ([
        'bot', 'crawl', 'slurp', 'spider', 'facebook', 'mediapartners', 'ohdear', 'guzzlehttp',
        'anthill', 'apis-google', 'iframely', 'node-fetch', 'turnitin', 'pocketparser', 'java/1',
        'go-http-client', 'curl/',
    ] as $word) {
        if (str_contains($userAgent, $word)) {
            return true;
        }
    }

    return false;
}

The code above won't detect every robot (some crawlers spoof browser User-Agent strings), but it will catch most of the common ones.
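For example, the function flags typical crawler User-Agents while letting a regular browser string through. The function is repeated here so the snippet runs on its own; the sample User-Agent strings are just illustrative:

```php
<?php

function is_robot(string $userAgent): bool
{
    // An empty User-Agent is suspicious: real browsers always send one.
    if (! $userAgent) {
        return true;
    }

    $userAgent = mb_strtolower($userAgent);

    foreach ([
        'bot', 'crawl', 'slurp', 'spider', 'facebook', 'mediapartners', 'ohdear', 'guzzlehttp',
        'anthill', 'apis-google', 'iframely', 'node-fetch', 'turnitin', 'pocketparser', 'java/1',
        'go-http-client', 'curl/',
    ] as $word) {
        if (str_contains($userAgent, $word)) {
            return true;
        }
    }

    return false;
}

// Crawlers and HTTP clients are flagged...
var_dump(is_robot('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)')); // true (matches "bot")
var_dump(is_robot('curl/7.68.0'));                                                             // true (matches "curl/")
var_dump(is_robot(''));                                                                        // true (empty User-Agent)

// ...while a regular browser User-Agent passes through.
var_dump(is_robot('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36')); // false
```

Because the matching is a lowercase substring check, adding a new crawler is as simple as appending another keyword to the list.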
