PHP Consuming RESTful APIs
Fetching a URL with the GET Method
Problem
You want to retrieve the contents of a URL. For example, you want to include part of one site in another site’s content.
Solution
Provide the URL to file_get_contents():
$page = file_get_contents('http://www.example.com/robots.txt');
Or you can use the cURL extension:
$c = curl_init('http://www.example.com/robots.txt');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
$page = curl_exec($c);
curl_close($c);
You can also use the HTTP_Request2 class from PEAR:
require_once 'HTTP/Request2.php';
$r = new HTTP_Request2('http://www.example.com/robots.txt');
$page = $r->send()->getBody();
Discussion
file_get_contents(), like all PHP file-handling functions, uses PHP’s streams feature. This means that it can handle local files as well as a variety of network resources, including HTTP URLs. There’s a catch, though: the allow_url_fopen configuration setting must be turned on (which it usually is).
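Because a disabled allow_url_fopen setting makes URL retrieval fail with a warning, it can help to check the setting up front. The following is a minimal sketch, not a definitive test; the function name canFetchUrls() is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper: reports whether file_get_contents() can open http:// URLs.
function canFetchUrls()
{
    // ini_get() returns the setting as a string such as "1", "", "On", or "Off"
    $allowed = filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);

    // The http stream wrapper must also be registered
    return $allowed && in_array('http', stream_get_wrappers(), true);
}

if (canFetchUrls()) {
    print "URL-aware file functions are available\n";
} else {
    print "allow_url_fopen is off; use cURL or another HTTP client instead\n";
}
```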
The streams feature makes retrieving remote documents extremely easy. You can use the same technique to grab a remote XML document:
$url = 'http://rss.news.yahoo.com/rss/oddlyenough';
$rss = simplexml_load_file($url);
print '<ul>';
foreach ($rss->channel->item as $item) {
    print '<li><a href="' .
          htmlentities($item->link) .
          '">' .
          htmlentities($item->title) .
          '</a></li>';
}
print '</ul>';
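The same traversal can be tried without a network connection. This sketch assumes a made-up two-item feed (the titles and links are invented for illustration) and parses it with simplexml_load_string() instead of simplexml_load_file(); it also checks for a parse failure, which the remote version would report the same way.

```php
<?php
// Invented sample feed, used only so the example runs offline
$xml = <<<'XML'
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First &amp; foremost</title><link>http://www.example.com/1</link></item>
    <item><title>Second</title><link>http://www.example.com/2</link></item>
  </channel>
</rss>
XML;

$rss = simplexml_load_string($xml);
if ($rss === false) {
    die("Could not parse the feed\n");
}

// Build the list in a string so it can be inspected before printing
$html = '<ul>';
foreach ($rss->channel->item as $item) {
    // Cast the SimpleXMLElement nodes to strings before escaping them
    $html .= '<li><a href="' . htmlspecialchars((string) $item->link) . '">' .
             htmlspecialchars((string) $item->title) . '</a></li>';
}
$html .= '</ul>';
print $html . "\n";
```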
To retrieve a page that includes query string variables, use http_build_query() to create the query string. It accepts an array of key/value pairs and returns a single string with everything properly escaped. You’re still responsible for the ? in the URL that sets off the query string. For example:
$vars = array('page' => 4, 'search' => 'this & that');
$qs = http_build_query($vars);
$url = 'http://www.example.com/search.php?' . $qs;
$page = file_get_contents($url);
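To see what "properly escaped" means here, it may help to print the generated string. This sketch reuses the array from the example above; passing the separator and encoding type explicitly is a choice made for this illustration so that the output does not depend on local configuration.

```php
<?php
$vars = array('page' => 4, 'search' => 'this & that');

// Pass the separator explicitly so the result does not depend on
// the arg_separator.output configuration setting.
$qs = http_build_query($vars, '', '&');
print $qs . "\n";       // page=4&search=this+%26+that

// PHP_QUERY_RFC3986 encodes spaces as %20 instead of +
$qs3986 = http_build_query($vars, '', '&', PHP_QUERY_RFC3986);
print $qs3986 . "\n";   // page=4&search=this%20%26%20that
```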
To retrieve a protected page, put the username and password in the URL. Here the username is david, and the password is hax0r:
$url = 'http://david:hax0r@www.example.com/secrets.php';
$page = file_get_contents($url);
Or with cURL:
$c = curl_init('http://www.example.com/secrets.php');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
curl_setopt($c, CURLOPT_USERPWD, 'david:hax0r');
$page = curl_exec($c);
curl_close($c);
Likewise with HTTP_Request2:
require 'HTTP/Request2.php';
$r = new HTTP_Request2('http://www.example.com/secrets.php');
$r->setAuth('david', 'hax0r', HTTP_Request2::AUTH_DIGEST);
$page = $r->send()->getBody();
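With the http stream wrapper, credentials can also be sent as a request header rather than embedded in the URL. The sketch below assumes the server accepts HTTP Basic authentication; basicAuthContext() is a hypothetical helper name, and the request itself is left commented out so the example runs offline.

```php
<?php
// Hypothetical helper: build a stream context that sends HTTP Basic credentials
function basicAuthContext($user, $password)
{
    // Basic authentication is the base64 encoding of "user:password"
    $header = 'Authorization: Basic ' . base64_encode($user . ':' . $password);

    return stream_context_create(array('http' => array('header' => $header)));
}

$context = basicAuthContext('david', 'hax0r');

// Inspect the header that would be sent
$options = stream_context_get_options($context);
print $options['http']['header'] . "\n";

// Network call shown for illustration only:
// $page = file_get_contents('http://www.example.com/secrets.php', false, $context);
```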
PHP’s http stream wrapper automatically follows redirects. The file_get_contents() and fopen() functions support a stream context argument that allows for specifying options about how the stream is retrieved. One of those options is max_redirects, the maximum number of redirects to follow.
This example sets max_redirects to 1, which turns off redirect following:
$url = 'http://www.example.com/redirector.php';
// Define the options
$options = array('max_redirects' => 1 );
// Create a context with options for the http stream
$context = stream_context_create(array('http' => $options));
// Pass the options to file_get_contents. The second
// argument is whether to use the include path, which
// we don't want here.
print file_get_contents($url, false, $context);
The max_redirects stream wrapper option really indicates not how many redirects should be followed, but the maximum number of requests that should be made when following the redirect chain. That is, a value of 1 tells PHP to make at most one request—follow no redirects. A value of 2 tells PHP to make at most two requests—follow no more than one redirect. (A value of 0, however, behaves like a value of 1—PHP makes just one request.)
If the redirect chain would have PHP make more requests than are allowed by max_redirects, PHP issues a warning.
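If the goal is to inspect a redirect rather than be warned about it, one option is the http wrapper's follow_location context option, which turns redirect following off entirely. The sketch below assumes that approach; findLocationHeader() is a hypothetical helper, and the header lines are invented sample data so the example runs without a network connection.

```php
<?php
// Hypothetical helper: return the value of the last Location header in a
// list of raw response header lines, or null if there is none.
function findLocationHeader(array $headers)
{
    $location = null;
    foreach ($headers as $line) {
        // Header names are case-insensitive
        if (stripos($line, 'Location:') === 0) {
            $location = trim(substr($line, strlen('Location:')));
        }
    }
    return $location;
}

// follow_location => 0 tells the http wrapper not to follow redirects
$context = stream_context_create(array('http' => array('follow_location' => 0)));

// Network call shown for illustration only. After the call, PHP populates
// $http_response_header with the raw response header lines.
// $body = file_get_contents('http://www.example.com/redirector.php', false, $context);
// $target = findLocationHeader($http_response_header);

// Offline demonstration with invented header lines
$sample = array(
    'HTTP/1.1 302 Found',
    'Location: http://www.example.com/new.php',
    'Content-Length: 0',
);
print findLocationHeader($sample) . "\n";
```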
cURL only follows redirects when the CURLOPT_FOLLOWLOCATION option is set:
$c = curl_init('http://www.example.com/redirector.php');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
curl_setopt($c, CURLOPT_FOLLOWLOCATION, true);
$page = curl_exec($c);
curl_close($c);
To set a maximum number of redirects that cURL should follow, set CURLOPT_FOLLOWLOCATION to true and then set the CURLOPT_MAXREDIRS option to that maximum number.
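A sketch of that combination follows. The function name fetchFollowingRedirects() and the shape of the returned array are assumptions made for this example; the cURL options and curl_getinfo() constants are standard. When the limit is exceeded, curl_exec() returns false and curl_error() describes the problem.

```php
<?php
// Hypothetical helper: fetch a URL, following at most $maxRedirects redirects
function fetchFollowingRedirects($url, $maxRedirects)
{
    $c = curl_init($url);
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($c, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($c, CURLOPT_MAXREDIRS, $maxRedirects);

    $body = curl_exec($c);

    // Collect the outcome before the handle goes out of scope
    return array(
        'body'      => $body,
        'error'     => ($body === false) ? curl_error($c) : null,
        'redirects' => curl_getinfo($c, CURLINFO_REDIRECT_COUNT),
        'final_url' => curl_getinfo($c, CURLINFO_EFFECTIVE_URL),
    );
}

// Usage, shown for illustration only because it needs network access:
// $result = fetchFollowingRedirects('http://www.example.com/redirector.php', 3);
// if ($result['error'] !== null) {
//     print "Request failed: " . $result['error'] . "\n";
// } else {
//     print "Followed " . $result['redirects'] . " redirect(s) to " . $result['final_url'] . "\n";
// }
```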
HTTP_Request2 follows redirects if the follow_redirects parameter is set to true, as shown here:
require 'HTTP/Request2.php';
$r = new HTTP_Request2('http://www.example.com/redirector.php');
$r->setConfig(array(
    'follow_redirects' => true,
    'max_redirects' => 1
));
$page = $r->send()->getBody();
print $page;
cURL can do a few different things with the page it retrieves. As you’ve seen in previous examples, if CURLOPT_RETURNTRANSFER is set, curl_exec() returns the body of the page requested. If CURLOPT_RETURNTRANSFER is not set, curl_exec() prints the response body.
To write the retrieved page to a file, open a file handle for writing with fopen() and set the CURLOPT_FILE option to that file handle. This example uses cURL to copy a remote web page to a local file:
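// $php_errormsg is no longer available in PHP 8, so use error_get_last()
$fh = fopen('local-copy-of-files.html', 'w') or die(error_get_last()['message']);
$c = curl_init('http://www.example.com/files.html');
curl_setopt($c, CURLOPT_FILE, $fh);
curl_exec($c);
curl_close($c);
// Close the file handle so the data is flushed to disk
fclose($fh);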
To pass the cURL resource and the contents of the retrieved page to a function, set the CURLOPT_WRITEFUNCTION option to a callback for that function (either a function name or an array whose first element is an object instance or a string containing a class name and whose second element is a method name). The “write function” must return the number of bytes it was passed. Note that with large responses, the write function might get called more than once because cURL processes the response in chunks. This example uses a cURL write function to save page contents in a database:
class PageSaver {
    protected $db;
    protected $page = '';

    public function __construct() {
        $this->db = new PDO('sqlite:./pages.db');
    }

    public function write($curl, $data) {
        $this->page .= $data;
        return strlen($data);
    }

    public function save($curl) {
        $info = curl_getinfo($curl);
        $st = $this->db->prepare('INSERT INTO pages '.
                                 '(url,page) VALUES (?,?)');
        $st->execute(array($info['url'], $this->page));
    }
}
// Create the saver instance
$pageSaver = new PageSaver();
// Create the cURL resource
$c = curl_init('http://www.example.com/');
// Set the write function
curl_setopt($c, CURLOPT_WRITEFUNCTION, array($pageSaver,'write'));
// Execute the request
curl_exec($c);
// Save the accumulated data
$pageSaver->save($c);
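The reason write() appends to $this->page instead of assigning to it is that each call may carry only part of the response. The sketch below simulates that behavior without making a request; ChunkCollector and the three sample chunks are invented for illustration, and null stands in for the cURL handle.

```php
<?php
// Hypothetical stand-in for an object whose method is used as a write function
class ChunkCollector
{
    public $calls = 0;
    public $page = '';

    public function write($curl, $data)
    {
        $this->calls++;
        $this->page .= $data;    // append, because this is only one piece
        return strlen($data);    // report that the whole chunk was handled
    }
}

$collector = new ChunkCollector();

// Simulate cURL delivering one response in three chunks
foreach (array('<html>', '<body>Hello</body>', '</html>') as $chunk) {
    $handled = $collector->write(null, $chunk);
    // With a real transfer, returning a different byte count makes cURL abort
    if ($handled !== strlen($chunk)) {
        die("Write function reported the wrong length\n");
    }
}

print $collector->calls . " calls, " . strlen($collector->page) . " bytes\n";
```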