Transfer Encoding

Before parsing the body, the headers should be checked for a few things. If a Transfer-Encoding header is present and has a value of chunked, it means that the server is sending the response back in chunks rather than all at once. The advantage to this is that the server does not have to wait until the entire response is composed before starting to return it (in order to determine and include its length in the Content-Length header), which can increase overall server throughput.
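The header check described above could be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name is_chunked is hypothetical, and $headers is assumed to be an associative array of header names to values that has already been parsed from the response.

```php
<?php
// Hypothetical helper: $headers is assumed to map header names to their
// values, e.g. array('Transfer-Encoding' => 'chunked').
function is_chunked(array $headers)
{
    foreach ($headers as $name => $value) {
        // Header names are case-insensitive
        if (strcasecmp($name, 'Transfer-Encoding') === 0) {
            // The value may list several codings, e.g. "gzip, chunked"
            $codings = array_map('trim', explode(',', strtolower($value)));
            return in_array('chunked', $codings, true);
        }
    }
    return false;
}
```

If the function returns false, the body can be read using the Content-Length header or until the connection closes instead.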

When each chunk is sent, it is preceded by a hexadecimal number indicating the size of the chunk in bytes, followed by a CRLF sequence. The end of each chunk is also denoted by a CRLF sequence. The end of the body is denoted by a chunk with a size of 0, which is particularly important when using a persistent connection since the client must know where one response ends and the next begins.
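To make the wire format concrete, the sketch below builds a chunked body from a plain string. It is shown only for illustration, since a scraper normally decodes this format rather than producing it; the function name chunk_encode and the $chunkSize parameter are hypothetical.

```php
<?php
// Hypothetical helper that produces a chunked body from $data.
function chunk_encode($data, $chunkSize = 5)
{
    // str_split('') yields a single empty piece on older PHP versions,
    // so an empty string is handled separately.
    $pieces = ($data === '') ? array() : str_split($data, $chunkSize);
    $out = '';
    foreach ($pieces as $piece) {
        // Size in hexadecimal, CRLF, chunk data, CRLF
        $out .= dechex(strlen($piece)) . "\r\n" . $piece . "\r\n";
    }
    // A chunk size of 0 marks the end of the body, followed by a final
    // CRLF (this sketch sends no trailer headers).
    return $out . "0\r\n\r\n";
}

// chunk_encode('Hello, world') returns
// "5\r\nHello\r\n5\r\n, wor\r\n2\r\nld\r\n0\r\n\r\n"
```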

The strstr function can be used to obtain the characters in a string that precede a CRLF sequence (pass true as its third parameter, which is available as of PHP 5.3). To convert a string containing a hexadecimal number to its integer equivalent, see the hexdec function. An example of what these two might look like in action is included below. The example assumes that the response body has been written to a string, $body.

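<?php
$unchunked = '';
do {
    // Read the chunk size, a hexadecimal number ending at the first CRLF
    if ($length = hexdec(strstr($body, "\r\n", true))) {
        // Skip only the CRLF after the size; ltrim() would also strip
        // leading whitespace that belongs to the chunk data
        $body = substr(strstr($body, "\r\n"), 2);
        $unchunked .= substr($body, 0, $length);
        // Skip the chunk data and the CRLF that ends it
        $body = substr($body, $length + 2);
    }
} while ($length > 0);
?>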
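The example above does not account for chunk extensions (optional parameters that may follow the chunk size after a semicolon) or for a truncated body. A somewhat more defensive variant might look like the sketch below. The function name decode_chunked is hypothetical, and stopping silently at a malformed size line is an assumption made for brevity; a real client may prefer to report an error.

```php
<?php
// Hypothetical helper: decodes a chunked body held in the string $body.
function decode_chunked($body)
{
    $decoded = '';
    $offset = 0;
    $total = strlen($body);
    while ($offset < $total
        && ($lineEnd = strpos($body, "\r\n", $offset)) !== false) {
        // The size line may carry extensions after a semicolon; ignore them
        $sizeLine = substr($body, $offset, $lineEnd - $offset);
        list($sizeHex) = explode(';', $sizeLine, 2);
        $sizeHex = trim($sizeHex);
        if (!ctype_xdigit($sizeHex)) {
            break; // malformed size line: stop rather than guess
        }
        $length = hexdec($sizeHex);
        if ($length == 0) {
            break; // a chunk size of 0 marks the end of the body
        }
        // Chunk data starts right after the CRLF that ends the size line
        $decoded .= substr($body, $lineEnd + 2, $length);
        // Skip the size line's CRLF, the data, and the CRLF ending the chunk
        $offset = $lineEnd + 2 + $length + 2;
    }
    return $decoded;
}

$sample = "5\r\nHello\r\n7;ext=1\r\n, world\r\n0\r\n\r\n";
$text = decode_chunked($sample); // "Hello, world"
```

Tracking an offset instead of repeatedly reassigning $body avoids copying the remaining body on every iteration, which may matter for large responses.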

See Section 3.6.1 and Appendix 19.4.6 of RFC 2616 for more information on chunked transfer encoding. RFC 2616 has since been superseded; the current definition is in Section 7.1 of RFC 9112.


© Rolling Your Own — Web Scraping

Category: Article | Added by: Marsipan (01.09.2014)