How to deal with missing items when creating a list?
Thursday, June 9, 2022
A newer version of this tutorial is available. Check it for the latest instructions.
Why do some list items get left out?
Octoparse identifies the items that belong to a list by their coding pattern in the underlying HTML source code.
When building a list, we usually start by selecting any 2 items from the list. These two items define the coding pattern that Octoparse uses to find the rest. If some list items are not included as expected, they most likely have a coding pattern that differs from the one defined.
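To make the idea concrete, here is a minimal sketch of how a pattern taken from two sample items can leave out a third. It is an illustration only, not Octoparse's own detection code: the sample markup, the class names, and the expression are all invented, and it uses Python's standard xml.etree.ElementTree, which supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment: one item carries an extra class name.
PAGE = """
<div>
  <ul id="products">
    <li class="item">Laptop</li>
    <li class="item">Phone</li>
    <li class="item featured">Tablet</li>
    <li class="item">Camera</li>
  </ul>
</div>
""".strip()

root = ET.fromstring(PAGE)

# Pattern derived from two "plain" items: class must equal exactly "item".
narrow_xpath = ".//li[@class='item']"
matched = [li.text for li in root.findall(narrow_xpath)]

# Every list item on the page, regardless of class.
all_items = [li.text for li in root.findall(".//li")]

# Items on the page that the narrow pattern did not pick up.
missing = [text for text in all_items if text not in matched]

print(matched)  # ['Laptop', 'Phone', 'Camera']
print(missing)  # ['Tablet']
```

The "Tablet" item is skipped because its class attribute is "item featured", which does not equal the "item" value the pattern expects.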
How to tell Octoparse I need those items as well?
To include the omitted items, we need to replace the old pattern with a new one. In Octoparse, this means modifying or rewriting the XPath expression that was auto-generated during the earlier detection.
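As a rough sketch of what rewriting the expression can look like, the example below compares an overly specific XPath with a broader one on a made-up page fragment. The markup, the id "products", and both expressions are assumptions for illustration. The expression Octoparse generates for a real page will differ, and Python's xml.etree.ElementTree (used here so the example runs without extra packages) handles only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment with one item that has a different class value.
PAGE = """
<div>
  <ul id="products">
    <li class="item">Laptop</li>
    <li class="item">Phone</li>
    <li class="item featured">Tablet</li>
    <li class="item">Camera</li>
  </ul>
</div>
""".strip()

root = ET.fromstring(PAGE)

# Old pattern: ties each item to an exact class value, so "Tablet" is skipped.
old_xpath = ".//ul[@id='products']/li[@class='item']"

# New pattern: drops the class condition and relies on the parent list instead.
new_xpath = ".//ul[@id='products']/li"

old_items = [li.text for li in root.findall(old_xpath)]
new_items = [li.text for li in root.findall(new_xpath)]

print(len(old_items), old_items)  # 3 ['Laptop', 'Phone', 'Camera']
print(len(new_items), new_items)  # 4 ['Laptop', 'Phone', 'Tablet', 'Camera']
```

Loosening the condition that differs between items (here, the class value) while keeping a stable anchor (here, the parent list) is one common way to widen a pattern without also matching unrelated elements.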
Where to input the new XPath expression?
Step 1. Select the Loop Item step from the workflow
Step 2. Check the Loop mode option
· If Variable list mode is on, go to Step 3
· If Fixed list mode is on, switch to Variable list mode, then go to Step 3
Step 3. Input the modified XPath expression into the textbox