Some web scraping applications must push data to the target application. This is generally accomplished with HTTP POST requests that simulate the submission of HTML forms. Before such requests can be sent, however, a few things generally have to happen. First, if the web scraping application is presented to a user, that user must be shown a form at least somewhat similar to the one in the target application. Next, the data the user submits through that form should be validated to ensure that the target application is likely to accept it.

One technique that can help here is to scrape the markup of the form in the target application and use the scraped data to generate something like a metadata file or a PHP source code file that can be dropped directly into the web scraping application project. Its applicability will vary by project, depending on requirements and on how the forms are structured. It can expedite development for target applications that have multiple forms, or complex forms, for which POST requests must be simulated.

For the purposes of formulating a POST request, you will want to query for input, select, textarea, and possibly button elements that have a name attribute. Beyond that, here are a few element-specific considerations to take into account.
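As a minimal sketch of the query described above, the following uses PHP's DOM extension (DOMDocument and DOMXPath) to collect every named form control and emit the result as a PHP source file. The text does not prescribe a particular library, so that choice is an assumption. The function name extractFormFields, the sample markup, the /search action, and the field names are also hypothetical and stand in for a page fetched from a real target application. The sketch records only the element name, the name attribute, and the type attribute, and does not attempt any element-specific handling.

```php
<?php
// Hypothetical form markup standing in for a page fetched from the target application.
$html = <<<'HTML'
<html><body>
<form action="/search" method="post">
  <input type="text" name="q" value="php">
  <input type="hidden" name="token" value="abc123">
  <input type="text" id="unnamed">
  <select name="lang">
    <option value="en" selected>English</option>
    <option value="fr">French</option>
  </select>
  <textarea name="notes">hello</textarea>
  <button type="submit" name="go" value="1">Search</button>
</form>
</body></html>
HTML;

/**
 * Return one metadata entry per named form control found in the given HTML.
 * (Illustrative helper; not part of any library.)
 */
function extractFormFields(string $html): array
{
    // Real-world markup is often invalid, so collect parser warnings
    // internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Select input, select, textarea, and button elements inside a form,
    // keeping only those that carry a name attribute.
    $xpath = new DOMXPath($doc);
    $query = '//form//*[self::input or self::select or self::textarea or self::button][@name]';

    $fields = [];
    foreach ($xpath->query($query) as $element) {
        $fields[] = [
            'element' => $element->nodeName,
            'name'    => $element->getAttribute('name'),
            'type'    => $element->hasAttribute('type')
                ? $element->getAttribute('type')
                : null,
        ];
    }

    return $fields;
}

$fields = extractFormFields($html);

// Generate PHP source that a scraping project could include to get the metadata back.
$source = "<?php\nreturn " . var_export($fields, true) . ";\n";
echo $source;
```

The generated source could be saved to a file and loaded with include, which returns the array. The input without a name attribute is skipped because it would not contribute to a POST request.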