How to deal with missing items when creating a list?

Friday, August 17, 2018

Why do some list items get left out?

Octoparse detects the items that belong to a list by their coding pattern in the underlying HTML source code.

When building a list, we usually start by selecting any 2 items from the list to define a coding pattern for Octoparse to refer to. If some list items are not included as expected, they most likely have a coding pattern that differs from the one defined.
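As a rough illustration of why this happens, the sketch below uses Python's standard-library ElementTree on a made-up page fragment. The markup, the `results` id and the class names are all assumptions for the example, and the XPath is only a guess at the kind of pattern that could be inferred from two selected items, not the exact expression Octoparse generates. ElementTree supports only a subset of XPath and needs a leading `.`, which a full XPath engine does not require.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment: the third item has a different class attribute,
# so its "coding pattern" differs from the first two.
HTML = """
<div>
  <ul id="results">
    <li class="item">Alpha</li>
    <li class="item">Beta</li>
    <li class="item sponsored">Gamma</li>
  </ul>
</div>
"""

root = ET.fromstring(HTML)

# A pattern that fits the two selected items exactly (assumed for illustration).
narrow_xpath = ".//ul[@id='results']/li[@class='item']"

matched = [li.text for li in root.findall(narrow_xpath)]
print(matched)  # ['Alpha', 'Beta']
```

"Gamma" is left out because its class attribute is not exactly `item`, even though it sits in the same list.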

 

How to tell Octoparse I need those items as well?

To include the omitted items, we need to replace the old pattern with a new one. In Octoparse, this means modifying or rewriting the XPath expression that was auto-generated during the earlier detection.
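A minimal sketch of what "replacing the pattern" can look like, again with Python's standard-library ElementTree and an invented page fragment (the markup, id and class names are assumptions, not taken from a real site). The idea is to drop the condition that only some items satisfy, so that the expression matches every item in the list. In a full XPath engine the expressions would typically be written without the leading `.`.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment with one item that uses a different class.
HTML = """
<div>
  <ul id="results">
    <li class="item">Alpha</li>
    <li class="item">Beta</li>
    <li class="item sponsored">Gamma</li>
  </ul>
</div>
"""

root = ET.fromstring(HTML)

# Old pattern (assumed): requires an exact class match, so it misses "Gamma".
old_xpath = ".//ul[@id='results']/li[@class='item']"
# New pattern: every <li> directly under the list, regardless of class.
new_xpath = ".//ul[@id='results']/li"

old_items = [li.text for li in root.findall(old_xpath)]
new_items = [li.text for li in root.findall(new_xpath)]

print(old_items)  # ['Alpha', 'Beta']
print(new_items)  # ['Alpha', 'Beta', 'Gamma']
```

Comparing the number of matches against the number of items visible on the page is a quick way to check a rewritten expression before using it.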

If you are new to XPath, you may want to learn the basics of HTML and XPath first. Here are some tutorials for reference: HTML basic | XPath basic

 

Where to input the new XPath expression?

Step 1. Select the Loop Item step from the workflow

Step 2. Check the Loop mode option

           · If Variable list mode is on, go to Step 3

           · If Fixed list mode is on, switch to Variable list mode, then go to Step 3

Step 3. Enter the modified XPath expression in the text box
