How to deal with missing items when creating a list?
Wednesday, November 24, 2021
Note: a newer version of this tutorial is available.
Why do some list items get left out?
Octoparse detects the items that belong to a list by their coding pattern in the underlying HTML source code.
When building a list, we usually start by selecting any two items from the list to define a coding pattern for Octoparse to follow. If some list items are not included as expected, they most likely have a coding pattern that differs from the defined one.
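To make the idea concrete, here is a minimal sketch of how a pattern inferred from two selected items can skip a third. The HTML fragment, the class names, and the XPath are all hypothetical, and the sketch uses Python's standard-library `xml.etree.ElementTree` (which supports only a subset of XPath) rather than Octoparse itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment: the third item carries a different class value.
HTML = """
<ul id="products">
  <li class="item">Keyboard</li>
  <li class="item">Mouse</li>
  <li class="item featured">Monitor</li>
  <li class="item">Webcam</li>
</ul>
"""

root = ET.fromstring(HTML)

# Pattern such as might be inferred from two items that both have class="item".
narrow_xpath = ".//li[@class='item']"

matched = [li.text for li in root.findall(narrow_xpath)]
all_items = [li.text for li in root.iter("li")]
missing = [name for name in all_items if name not in matched]

print("matched:", matched)
print("missing:", missing)
```

The item whose `class` attribute differs from the inferred value does not satisfy the pattern, so it ends up in `missing` even though it looks like part of the same list on the page.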
How to tell Octoparse I need those items as well?
To include the omitted items, we need to replace the old pattern with a new one. In Octoparse, this means modifying or rewriting the XPath expression that was auto-generated during the earlier detection.
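As a rough illustration of what "replacing the pattern" can look like, the sketch below compares a narrow XPath with a broader one on a made-up fragment. The markup, the `row` and `ad` class names, and both expressions are assumptions for illustration only. The code runs on Python's standard-library `xml.etree.ElementTree`, which implements a limited XPath subset, so the expression you would actually enter in Octoparse may be written differently (for example, as an absolute path such as `//div[@id='results']/div`).

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: one row has an extra class and is skipped by the old pattern.
PAGE = """
<div id="results">
  <div class="row">Result A</div>
  <div class="row ad">Result B</div>
  <div class="row">Result C</div>
</div>
"""

root = ET.fromstring(PAGE)  # root is the <div id="results"> container

old_xpath = "./div[@class='row']"  # exact class match: too strict
new_xpath = "./div"                # every direct child of the container


def texts(xpath):
    """Return the text of each element matched by the given XPath."""
    return [el.text for el in root.findall(xpath)]


old_items = texts(old_xpath)
new_items = texts(new_xpath)

print("old pattern:", old_items)
print("new pattern:", new_items)
```

Anchoring the new expression on the list container instead of on an attribute of the individual items is one common way to widen a pattern. It can also match elements you do not want, so check the resulting item list after the change.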
Where to input the new XPath expression?
Step 1. Select the Loop Item step from the workflow
Step 2. Check the Loop mode option
· If Variable list mode is on, go to Step 3
· If Fixed list mode is on, switch to Variable list mode
Step 3. Input the modified XPath expression into the textbox