Check if URL exists and is Online – PHP

Imagine you need to check whether a site is online. It seems easy, since there are plenty of tools for this, but the check can become a huge bottleneck for your app.

I tried three approaches (raw sockets, get_headers and cURL) to find out which one is the fastest.

Each snippet was tested against yahoo.com, with 10 attempts per approach, keeping the best time as the result:
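
The measurement itself can be sketched roughly as follows. This is a minimal best-of-N timer, not necessarily the harness used for the numbers in this post; the function name bestTime and its parameters are hypothetical.

```php
<?php
// Hypothetical helper: run $check $attempts times and keep the best (lowest) time.
function bestTime(callable $check, $attempts = 10)
{
    $best = INF;
    for ($i = 0; $i < $attempts; $i++) {
        $start = microtime(true);
        $check();
        $elapsed = microtime(true) - $start;
        if ($elapsed < $best) {
            $best = $elapsed;
        }
    }
    return $best; // seconds, as a float
}

// Usage: wrap any check in a closure, for example
// $seconds = bestTime(function () { return @get_headers('http://yahoo.com'); });
```

Keeping the best run rather than the average reduces the influence of one-off network hiccups, at the cost of hiding how unstable a method is.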

Sockets + Header:

$url = @parse_url($url);
if (!$url) return false;

$url = array_map('trim', $url);
$url['port'] = (!isset($url['port'])) ? 80 : (int)$url['port'];

$path = (isset($url['path'])) ? $url['path'] : '/';
$path .= (isset($url['query'])) ? "?$url[query]" : '';

// gethostbyname() returns the host name unchanged when it cannot be resolved
if (isset($url['host']) && $url['host'] != gethostbyname($url['host'])) {

    $fp = @fsockopen($url['host'], $url['port'], $errno, $errstr, 30);

    if (!$fp) return false; // socket not opened

    // socket opened: ask for the headers only
    fputs($fp, "HEAD $path HTTP/1.1\r\nHost: $url[host]\r\n\r\n");
    $headers = fread($fp, 4096);
    fclose($fp);

    // match the status code in the first line of the response
    if (preg_match('#^HTTP/\S+\s+(200|301|302)\s#i', $headers)) {
        return true;
    }
    else return false;

} // if parse url
else return false;

Time: 0.222 seconds, never more than 0.225s
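
The header check in the socket version boils down to reading the status code from the first line of the response. Here is a minimal sketch of that step on its own, assuming the response starts with a standard status line; the function name statusCodeFromResponse is hypothetical.

```php
<?php
// Hypothetical helper: extract the numeric status code from a raw HTTP response.
// Returns null when the text does not start with an HTTP status line.
function statusCodeFromResponse($raw)
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})(?:\s|$)#', $raw, $m)) {
        return (int) $m[1];
    }
    return null;
}

$code = statusCodeFromResponse("HTTP/1.1 301 Moved Permanently\r\nLocation: /\r\n\r\n");
// Same codes the socket snippet accepts as "online"
$online = in_array($code, array(200, 301, 302), true);
```

Working with the code as an integer makes it easy to change which responses count as "online" without touching the pattern.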

Curl:

$resURL = curl_init();
curl_setopt($resURL, CURLOPT_URL, $url);
curl_setopt($resURL, CURLOPT_RETURNTRANSFER, 1); // do not echo the response body
// a header callback must return the number of bytes it handled
curl_setopt($resURL, CURLOPT_HEADERFUNCTION, function ($ch, $header) { return strlen($header); });
curl_setopt($resURL, CURLOPT_FAILONERROR, 1);
curl_exec($resURL);
$intReturnCode = curl_getinfo($resURL, CURLINFO_HTTP_CODE);
curl_close($resURL);
if ($intReturnCode != 200 && $intReturnCode != 302 && $intReturnCode != 304) {
    return false;
}
else return true;

Time: 0.224 seconds, occasionally reaching 0.227s
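
Since only the status code matters, cURL can also be told to skip the response body. The sketch below is a variant that was not part of the timings in this post: it uses CURLOPT_NOBODY to send a HEAD request and adds timeouts. The function names and the timeout value are hypothetical choices, the accepted codes are the ones used by the socket and get_headers snippets, and some servers answer HEAD differently from GET.

```php
<?php
// Hypothetical helper: status codes treated as "online" (200, 301, 302).
function isOnlineStatus($code)
{
    return in_array((int) $code, array(200, 301, 302), true);
}

// Hypothetical variant of the cURL check that requests headers only.
function urlOnlineCurlHead($url, $timeout = 5)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD request, no body
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // never echo anything
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    curl_exec($ch);
    // CURLINFO_HTTP_CODE is 0 when no response was received
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return isOnlineStatus($code);
}

// Usage: $online = urlOnlineCurlHead('http://yahoo.com');
```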

Headers:

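$headers = @get_headers($url);
// get_headers() returns false on failure, so check before reading the status line
if ($headers && preg_match('/^HTTP\/\d\.\d\s+(200|301|302)/', $headers[0])) {
   return true;
}
else return false;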

Time: 0.891 seconds, occasionally more than 1.5s
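
By default get_headers sends a GET request. The sketch below shows how it could be switched to HEAD, with a timeout, through a stream context (the context parameter needs PHP 7.1 or later). Whether this closes the gap with the other two methods was not measured here, and the function names and timeout value are hypothetical.

```php
<?php
// Hypothetical helper: stream context that makes the HTTP wrapper send HEAD
// and give up after $timeout seconds.
function headRequestContext($timeout = 5)
{
    return stream_context_create(array(
        'http' => array(
            'method'  => 'HEAD',
            'timeout' => $timeout,
        ),
    ));
}

// Hypothetical variant of the get_headers check using that context.
function urlOnlineHeaders($url)
{
    $headers = @get_headers($url, false, headRequestContext());
    if (!$headers) {
        return false; // request failed
    }
    return (bool) preg_match('/^HTTP\/\d\.\d\s+(200|301|302)/', $headers[0]);
}

// Usage: $online = urlOnlineHeaders('http://yahoo.com');
```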

As you can see, for me the fastest way is Sockets + Header, although cURL is nearly as fast!

Also note that I didn't use other functions such as fopen or file_get_contents, since we don't need to retrieve the page; all we need is the header.

Extra: check that a URL is well formed:

function isURL($url){
	// scheme, host labels separated by literal dots, optional port, optional path
	$pattern = '|^http(s)?://[a-z0-9-]+(\.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i';
	return preg_match($pattern, $url) > 0;
}
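
PHP also ships a built-in validator that can serve as an alternative to a hand-written pattern. This is a minimal sketch, assuming only http and https URLs should be accepted. The name isHttpURL is hypothetical, and FILTER_VALIDATE_URL is not equivalent to the pattern above, so the two may disagree on edge cases.

```php
<?php
// Hypothetical alternative to isURL() based on PHP's built-in URL filter.
function isHttpURL($url)
{
    // filter_var() returns false when the string is not a valid URL
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    // the filter accepts any scheme, so restrict it to http/https here
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return $scheme === 'http' || $scheme === 'https';
}

// Usage: if (isHttpURL($url)) { /* run one of the online checks */ }
```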