Scrape Web Data from A Drop-Down Menu 2
Thursday, October 13, 2016 3:30 AM
A drop-down menu is a list of items that appears when you click on a button or a text selection. This tutorial shows you how to select options in a drop-down menu in Octoparse.
You may need this example link to follow through:
1. Click on the drop-down menu and click "Loop through options in the dropdown"
2. A Loop Item has been created and added to the workflow automatically to loop through options in the drop-down menu.
3. Check if all the options we need have been included in the Loop Item
- Click on the Loop Item for the drop-down, then review the looped items in the list
- Check if all the items added to the loop are desired. If not, refine the list using the XPath function: position().
The position() function refers to where an option sits in the drop-down menu, counting from 1.
For example, in this case, the first option in the drop-down menu is "-Select-", which is not a real option but a header, and we want to remove it from the list.
To do that, add "[position()>1]" to the end of the current XPath. The Loop Item will then include every option with a position greater than 1; in other words, it excludes only the first option.
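To see what the predicate does outside Octoparse, here is a minimal Python sketch. It only emulates the filter with the standard library and is not how Octoparse evaluates XPath internally. The markup, the `options_where` helper, and the year values are made up for illustration and are not taken from the example page.

```python
import xml.etree.ElementTree as ET

# Hypothetical drop-down markup; the options on the real page will differ.
SELECT_XML = """
<select id="year">
  <option>-Select-</option>
  <option>2016</option>
  <option>2015</option>
  <option>2014</option>
</select>
""".strip()


def options_where(select_xml, keep):
    """Return the option texts whose 1-based position satisfies keep(position)."""
    root = ET.fromstring(select_xml)
    options = root.findall("option")
    # XPath positions start at 1, so enumerate from 1 rather than 0.
    return [opt.text for pos, opt in enumerate(options, start=1) if keep(pos)]


# Same effect as appending [position()>1]: the "-Select-" header is dropped.
real_options = options_where(SELECT_XML, lambda pos: pos > 1)
print(real_options)  # prints ['2016', '2015', '2014']
```

Swapping the condition for `lambda pos: pos == 1` would keep only the header, which mirrors what `[position()=1]` selects.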
When Octoparse detects a drop-down menu and creates the loop, all available options are selected by default. Besides adding [position()>1] to trim the list, there are other ways to use the XPath function position(). Add [position()=x] to the end of the XPath, where x is a number, to include only the option at that position, i.e. position()=1, position()=2, etc. For this example, if you want to choose the year 1996, append [position()=27] to the XPath.
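If you know the label of the option you want but not its position, you can count it out once and build the predicate from that. The sketch below does this in Python. The `position_predicate` helper is hypothetical, and the option list is invented, chosen only so that 1996 lands at position 27 as in the example above; the real menu's contents are not listed in this tutorial.

```python
def position_predicate(option_labels, wanted):
    """Build an XPath predicate that selects the option labelled `wanted`."""
    # list.index is 0-based while XPath position() is 1-based, hence the +1.
    position = option_labels.index(wanted) + 1
    return f"[position()={position}]"


# Hypothetical list: a "-Select-" header followed by years in descending order.
labels = ["-Select-"] + [str(year) for year in range(2021, 1989, -1)]

predicate = position_predicate(labels, "1996")
print(predicate)  # prints [position()=27]
```

Note that the header counts as position 1, so every real option is shifted by one. If the list on the page changes, the position has to be counted again.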
To learn more tricks, please refer to How to select a specific option from the drop-down list?
4. We are now done configuring the drop-down menus. Click on the confirmation button to complete the search.
As you can see from the GIF above, a web page may have multiple drop-downs that we want to loop through, i.e. to get the results for the different combinations of options. In that case, follow the steps for looping through one drop-down menu as introduced above, and repeat them for each additional menu. Each newly built Loop Item should sit inside the previous one.
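Nesting Loop Items behaves much like nested for-loops in code: for each option of the outer drop-down, the inner drop-down runs through all of its options. Here is a rough Python sketch of that idea. The two option lists and the `loop_dropdowns` helper are assumptions for illustration, not values from the example page or part of Octoparse.

```python
def loop_dropdowns(outer_options, inner_options):
    """Collect every (outer, inner) pair, as nested Loop Items would visit them."""
    visited = []
    for outer in outer_options:        # outer Loop Item: first drop-down
        for inner in inner_options:    # inner Loop Item: second drop-down
            visited.append((outer, inner))
    return visited


# Hypothetical options, with the "-Select-" headers already excluded.
years = ["2016", "2015"]
makes = ["Audi", "BMW", "Ford"]

pairs = loop_dropdowns(years, makes)
for year, make in pairs:
    print(year, make)
```

With 2 options in the first menu and 3 in the second, the workflow visits 2 x 3 = 6 combinations, so trimming headers and unwanted options from each loop keeps the run short.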
You might want to know which options in the different drop-down menus give us back the results accordingly. You can check the tutorials below to see how to achieve it:
If you need any assistance with your data project, please feel free to submit a request here to contact us.
Happy Data Hunting!
Author: The Octoparse Team