
Thread: note 78046 added to ref.curl




user name
2007-09-25 12:52:57
<?php
/*
* Author: Ojas Ojasvi
* Released: September 25, 2007
* Description: An example of the disguise_curl() function in
order to grab contents from a website while remaining fully
camouflaged by using a fake user agent and fake headers.
*/

$url = 'http://www.php.net';

// disguises the curl using fake headers and a fake user agent.
function disguise_curl($url)
{
  $curl = curl_init();

  // Setup headers - I used the same headers from Firefox
  // version 2.0.0.6
  // below was split up because php.net said the line was
  // too long. :/
  $header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
  $header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
  $header[] = "Cache-Control: max-age=0";
  $header[] = "Connection: keep-alive";
  $header[] = "Keep-Alive: 300";
  $header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
  $header[] = "Accept-Language: en-us,en;q=0.5";
  $header[] = "Pragma: "; // browsers keep this blank.

  curl_setopt($curl, CURLOPT_URL, $url);
  curl_setopt($curl, CURLOPT_USERAGENT, 'Googlebot/2.1 (+http://www.google.com/bot.html)');
  curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
  curl_setopt($curl, CURLOPT_REFERER, 'http://www.google.com');
  curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
  curl_setopt($curl, CURLOPT_AUTOREFERER, true);
  curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
  curl_setopt($curl, CURLOPT_TIMEOUT, 10);

  $html = curl_exec($curl); // execute the curl command
  curl_close($curl); // close the connection

  return $html; // and finally, return $html
}

// uses the function and displays the text off the website
$text = disguise_curl($url);
echo $text;
?>

Ojas Ojasvi
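
The function in the note returns whatever curl_exec() gives back, so a failed request (timeout, DNS error) comes back as false with no explanation. A minimal sketch of a variant that also reports the error, assuming the same headers and options as the note: the names browser_headers() and disguise_curl_checked() are hypothetical and not part of the original note or of the curl extension.

```php
<?php
// Hypothetical helper (not in the original note): builds the header
// list on its own so it can be inspected without making a request.
function browser_headers()
{
    return array(
        'Accept: text/xml,application/xml,application/xhtml+xml,'
            . 'text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5',
        'Cache-Control: max-age=0',
        'Connection: keep-alive',
        'Keep-Alive: 300',
        'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7',
        'Accept-Language: en-us,en;q=0.5',
    );
}

// Hypothetical variant of disguise_curl() that reports failures.
// Returns the response body as a string, or false on error; in the
// error case $error receives the message from curl_error().
function disguise_curl_checked($url, &$error = null)
{
    $curl = curl_init();

    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_USERAGENT, 'Googlebot/2.1 (+http://www.google.com/bot.html)');
    curl_setopt($curl, CURLOPT_HTTPHEADER, browser_headers());
    curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl, CURLOPT_TIMEOUT, 10);

    $html = curl_exec($curl);
    if ($html === false) {
        // curl_exec() returns false on failure; keep the reason.
        $error = curl_error($curl);
    }
    curl_close($curl);

    return $html;
}
```

A caller would then check the result with a strict comparison, for example: call disguise_curl_checked('http://www.php.net', $err) and, if the return value === false, print $err instead of echoing the page.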
----
Server IP: 203.199.69.133
Probable Submitter: 59.176.111.201
----
Manual Page -- http://www.php.net/manual/en/ref.curl.php

-- 
PHP Notes Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

