Parsing Responses
Once you’ve received a response, the next step is obtaining the data you need from it. Taking the response from the last example, let’s examine what this might look like.

<?php
// Split the headers and body into separate variables
list($headers, $body) = explode("\r\n\r\n", $response, 2);

// Remove the status line from the headers
list($status, $headers) = explode("\r\n", $headers, 2);

// Parse the headers segment into individual headers
preg_match_all(
    "/(?P<name>[^:]+): (?P<value>[^\r]+)(?:$|\r\n[^ \t]*)/U",
    $headers,
    $headers,
    PREG_SET_ORDER
);
?>

Logic to separate individual headers must account for the ability of header values to span multiple lines as per RFC 2616 Section 2.2. As such, preg_match_all is used here to separate individual headers. See the later chapter on PCRE for more information on regular expressions. If a situation necessitates parsing data contained in URLs and query strings, check out the parse_url and parse_str functions.

As with the request, it is generally desirable to parse response data into a data structure for ease of reference.
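One way to arrive at such a data structure is sketched below. It is only an illustration under stated assumptions: the parse_response function, the sample response, and the X-Note header are all made up for this example and are not part of the original code. Instead of a single regular expression, the sketch first unfolds continuation lines (a line break followed by spaces or tabs) and then splits each header at its first colon. It finishes by passing a Location header through parse_url and parse_str.

```php
<?php
// Hypothetical helper (not from the original text): turns a raw HTTP
// response string into an array holding the status line, headers, and body.
function parse_response($response)
{
    // Split the headers and body at the first blank line
    $parts = explode("\r\n\r\n", $response, 2);
    $head = $parts[0];
    $body = isset($parts[1]) ? $parts[1] : '';

    // Unfold header values that span multiple lines: a CRLF followed
    // by spaces or tabs continues the previous header
    $head = preg_replace("/\r\n[ \t]+/", ' ', $head);

    // The first line is the status line; the rest are header lines
    $lines = explode("\r\n", $head);
    $status = array_shift($lines);

    $headers = array();
    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip lines that are not "Name: value"
        }
        list($name, $value) = explode(':', $line, 2);
        // Header names are case-insensitive, so store them in lowercase
        $name = strtolower(trim($name));
        $value = trim($value);
        // Simplification: repeated headers are joined with commas,
        // which is not appropriate for every header (e.g. Set-Cookie)
        $headers[$name] = isset($headers[$name])
            ? $headers[$name] . ', ' . $value
            : $value;
    }

    return array(
        'status'  => $status,
        'headers' => $headers,
        'body'    => $body,
    );
}

// Made-up response used only for illustration
$response = "HTTP/1.1 302 Found\r\n"
    . "Location: http://example.com/search?q=php&page=2\r\n"
    . "X-Note: first part,\r\n second part\r\n"
    . "\r\n"
    . "Redirecting";

$parsed = parse_response($response);

// parse_url splits a URL into its components;
// parse_str decodes a query string into an array
$url = parse_url($parsed['headers']['location']);
parse_str($url['query'], $query);
```

With this sample input, $parsed['headers']['x-note'] holds the unfolded value "first part, second part", and $query contains the keys q and page. Lowercasing the header names is a design choice that makes lookups predictable, since servers vary in how they capitalize them.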