Scrape Web Data from a Drop-Down Menu
Sunday, October 9, 2016 9:49 AM
A drop-down menu is a list of items that appears when you click on a button or text selection.
This tutorial will show you how to select options in a drop-down menu in Octoparse.
You may need the following sample URL to follow through:
Let's say we have created a task with the sample URL. Find the drop-down menu on the webpage.
1. Click on the drop-down menu, then click "Loop through options in the dropdown".
2. A Loop Item is created and added to the workflow automatically to loop through the options in the drop-down menu.
3. Check if all the options we need have been included in the Loop Item
- Click on the Loop Item for the drop-down, then refer to the looped items in the list
- Check whether every item added to the loop is wanted. If not, refine the list using the XPath function position().
position() gives the position of an option within the drop-down menu, counting from 1.
For example, in this case the first option in the drop-down menu is "-Select-", which is a header rather than a real option, so we want to remove it from the list.
To do so, add "[position()>1]" to the end of the current XPath. The Loop Item will then include every option whose position is greater than 1; in other words, it excludes only the first option.
When a drop-down menu is detected and created in Octoparse, all available options will be selected by default.
Besides adding [position()>1], there are other ways to refine the list with the XPath function position().
Add [position()=x], where x is a number, to the end of the XPath to include only the option at that position, e.g. position()=1, position()=2, etc. In this example, to choose the year 1996, add [position()=27] to the XPath.
To learn more tricks, please refer to "How to select a specific option from the drop-down list?"
4. We are now done configuring the drop-down menus. Click on the confirmation button to complete the search.
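To see what the position() predicates from step 3 select, here is a minimal Python sketch that runs outside Octoparse. The markup, the option values, and the helper names (option_texts, drop_first, pick) are hypothetical and only for illustration. Python's standard-library ElementTree does not evaluate [position()>1], so that case is emulated with a list slice; [position()=n] is expressed with the equivalent XPath shorthand [n], which ElementTree does support.

```python
import xml.etree.ElementTree as ET

# Hypothetical drop-down markup; the options on a real page will differ.
MENU = """
<select id="year">
  <option>-Select-</option>
  <option>1998</option>
  <option>1997</option>
  <option>1996</option>
</select>
""".strip()


def option_texts(select):
    """Return the text of every <option> child in document order."""
    return [opt.text for opt in select.findall("option")]


def drop_first(options):
    """Emulate option[position()>1]: XPath positions start at 1,
    so this keeps everything after the first item."""
    return options[1:]


def pick(select, n):
    """Emulate option[position()=n] using the shorthand option[n].
    Returns None when there is no option at that position."""
    found = select.find(f"option[{n}]")
    return None if found is None else found.text


select = ET.fromstring(MENU)
all_options = option_texts(select)

print(drop_first(all_options))  # every option except the "-Select-" header
print(pick(select, 4))          # only the fourth option
```

Because positions count from 1, the header occupies position 1 and the first real option sits at position 2. That offset is worth keeping in mind when working out which number to put in [position()=x].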
When there are multiple drop-downs on the web page and we want to loop through all of them, i.e. get the results for the different combinations of options, we can follow the steps for looping through one drop-down menu introduced above and repeat them for each drop-down.
Each newly built Loop Item should be placed inside the previous one, so that the loops are nested.
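The effect of nesting one Loop Item inside another can be pictured with ordinary nested loops. The Python sketch below is only an analogy, not how Octoparse is implemented; the option lists and the function name nested_combinations are hypothetical.

```python
# Hypothetical option lists, with the "-Select-" header already excluded.
years = ["1996", "1997"]
makes = ["Audi", "BMW"]


def nested_combinations(outer, inner):
    """Visit every (outer, inner) pair in the order nested loops produce."""
    pairs = []
    for outer_choice in outer:        # outer loop: first drop-down
        for inner_choice in inner:    # inner loop: second drop-down
            pairs.append((outer_choice, inner_choice))
    return pairs


combos = nested_combinations(years, makes)
for year, make in combos:
    print(year, make)
```

The inner loop runs to completion for each choice of the outer loop, so two drop-downs with m and n options yield m × n combinations. Trimming unwanted options with position() before nesting keeps that number down.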
You might also want to record which options in the different drop-down menus produced each set of results. You can check the tutorials below to see how to achieve it:
If you need further assistance with your data project, feel free to submit a request here.
Happy Data Hunting!
Author: The Octoparse Team