Web Scraping - How to Store Cookies in Octoparse
Sunday, March 5, 2017 11:11 PM
In this web scraping tutorial, we will show you how to store cookies when scraping a site. We use Twitter as the example website. Follow the steps below to build a scraping task (see "What is an OTD file") that scrapes information from Twitter.
Step 1. Set up basic information.
Click "Quick Start" ➜ Choose "New Task (Advanced Mode)" ➜ Complete basic information ➜ Click "Next".
Step 2. Open Twitter so that you can log in first.
Enter www.twitter.com in the built-in browser ➜ Click the "Go" icon to open the web page.
Step 3. Enter login information
Click on the "Log in" button ➜ Choose "Click an item" and a "Click Item" action will be created in the workflow.
Step 3-1. Enter authorization information such as username and password.
Click the input field for "Phone, email or username" on the web page ➜ Choose "Enter text value" ➜ Enter your email, phone number or username in the "Enter text" textbox under "Customize Current Action" ➜ Click "Save". The value you entered now appears in the field on the web page. Enter the password in the same way.
Note: If you want to uncheck the "Remember me" option, you can click the option, choose "Click an item" and uncheck it.
Step 4. Click the Login button.
Click the "Login" button ➜ Choose "Click an item" ➜ Click "Save".
Step 5. Store the login cookie.
After you have logged into your Twitter account, go straight to the "Go To Web Page" action and check the option "Use specified Cookie" under "Cache Settings" to load and store the login cookie. You can then either delete every action you created except the "Go To Web Page" action, or keep them all unchanged. Don't forget to click the "Save" button to save the configuration.
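To make the idea behind Step 5 concrete, here is a minimal Python sketch of the general technique: save a login cookie once, load it back later, and attach it to a page request so the login steps do not need to be repeated. It is not how Octoparse stores cookies internally, which this tutorial does not describe. The cookie name "session_id", its value, the domain ".example.com" and the helper function names are all hypothetical placeholders; only the standard-library modules http.cookiejar and urllib.request are real.

```python
import http.cookiejar
import os
import tempfile
import time
import urllib.request


def make_cookie(name, value, domain, lifetime_seconds=3600):
    """Build a cookie object similar to one a site sets after login.

    The name, value and domain passed in are placeholders, not real
    Twitter or Octoparse values.
    """
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path="/", path_specified=True,
        secure=True,
        expires=int(time.time()) + lifetime_seconds,
        discard=False,
        comment=None, comment_url=None,
        rest={},
    )


def save_cookies(cookies, path):
    """Write cookies to a Netscape-format cookies.txt file."""
    jar = http.cookiejar.MozillaCookieJar(path)
    for cookie in cookies:
        jar.set_cookie(cookie)
    jar.save(ignore_discard=True, ignore_expires=True)


def load_cookies(path):
    """Read previously stored cookies back into a cookie jar."""
    jar = http.cookiejar.MozillaCookieJar(path)
    jar.load(ignore_discard=True, ignore_expires=True)
    return jar


def cookie_header_for(jar, url):
    """Return the Cookie header the jar would send for url, or None."""
    request = urllib.request.Request(url)
    jar.add_cookie_header(request)  # no network traffic happens here
    return request.get_header("Cookie")


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "cookies.txt")
    save_cookies([make_cookie("session_id", "abc123", ".example.com")], path)
    stored = load_cookies(path)
    print(cookie_header_for(stored, "https://www.example.com/"))
```

The sketch only builds the request header and never contacts a server. It mirrors the reason the other actions can be deleted in Step 5: once the cookie is stored, loading the page with that cookie is enough to appear logged in, for as long as the site still accepts the cookie.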
Author: The Octoparse Team