Parsing Responses

Once you’ve received a response, the next step is obtaining the data you need from it. Taking the response from the last example, let’s examine what this might look like.

<?php
// Split the headers and body into separate variables
list($headers, $body) = explode("\r\n\r\n", $response, 2);

// Remove the status line from the headers
list($status, $headers) = explode("\r\n", $headers, 2);

// Parse the headers segment into individual headers
preg_match_all(
    "/(?P<name>[^:]+): (?P<value>[^\r]+)(?:$|\r\n[^ \t]*)/U",
    $headers,
    $headers,
    PREG_SET_ORDER
);
?>

Logic to separate individual headers must account for the ability of header values to span multiple lines, as per RFC 2616 Section 2.2. As such, preg_match_all is used here to separate individual headers. See the later chapter on PCRE for more information on regular expressions. If a situation necessitates parsing data contained in URLs and query strings, check out the parse_url and parse_str functions. As with the request, it is generally desirable to parse response data into a data structure for ease of reference.
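
As a rough illustration of parsing a response into a data structure, the sketch below wraps the steps in a single function. The name parse_response, the array keys it returns, and the sample response are assumptions made for this example and do not come from the original text. It also swaps in a different technique for multi-line values: continuation lines are unfolded with preg_replace before the headers are split, rather than handled in one pattern.

```php
<?php
// Hypothetical helper (not part of the original example): split a raw
// HTTP response into its status line, headers, and body.
function parse_response($response)
{
    // Separate the header block from the body at the first blank line
    $parts = explode("\r\n\r\n", $response, 2);
    $head = $parts[0];
    $body = isset($parts[1]) ? $parts[1] : '';

    // Unfold continuation lines: a line break followed by spaces or tabs
    // continues the value of the previous header
    $head = preg_replace("/\r\n[ \t]+/", ' ', $head);

    // The first line is the status line; the rest are header lines
    $lines = explode("\r\n", $head);
    $status = array_shift($lines);

    $headers = array();
    foreach ($lines as $line) {
        // Skip anything that is not a "Name: value" pair
        if (strpos($line, ':') === false) {
            continue;
        }
        list($name, $value) = explode(':', $line, 2);
        // Header names are case-insensitive, so normalize them to lowercase
        $headers[strtolower(trim($name))] = trim($value);
    }

    return array(
        'status'  => $status,
        'headers' => $headers,
        'body'    => $body,
    );
}

// Made-up sample response with one folded header, for illustration only
$response = "HTTP/1.1 200 OK\r\n"
    . "Content-Type: text/html;\r\n"
    . "\tcharset=UTF-8\r\n"
    . "Content-Length: 5\r\n"
    . "\r\n"
    . "hello";

$parsed = parse_response($response);
```

With this structure, a header can be looked up as $parsed['headers']['content-type']. Note that this sketch keeps only the last value when a header name repeats (as Set-Cookie often does); storing values in a nested array would be one way around that.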

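For URLs and query strings, a minimal sketch of how parse_url and parse_str might be combined is shown here. The URL and variable names are made up for illustration.

```php
<?php
// Made-up URL for illustration only
$url = 'http://example.com/search.php?q=web+scraping&page=2';

// parse_url breaks the URL into components such as scheme, host,
// path, and query, returned as an associative array
$parts = parse_url($url);

// parse_str decodes a query string into an associative array; passing
// the second argument avoids creating variables in the current scope
$params = array();
if (isset($parts['query'])) {
    parse_str($parts['query'], $params);
}
```

After this runs, $parts['host'] holds the host name and $params holds the decoded query parameters keyed by name, with URL encoding such as + for spaces already reversed.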

© Rolling Your Own — Web Scraping

Category: Article | Added by: Marsipan (01.09.2014)