Scrape Job Postings from Indeed.com
Wednesday, September 14, 2016 3:03 AM
Indeed is one of the most popular job posting websites. Scraping it lets you collect large numbers of job listings in a structured form for analysis. In this tutorial, we will show you how to use Octoparse to scrape job posts from Indeed.com.
Before we get started, we need the URL of the target results page, which you get by searching Indeed for a keyword and a location.
Below is an example URL for demonstration:
https://www.indeed.com/jobs?q=devops&l=Dallas-Fort%20Worth%2C%20TX&radius=50
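If you need result-page URLs for several keywords or locations, they can be assembled in code instead of searching by hand. The following is a minimal Python sketch, assuming the query parameters `q`, `l`, and `radius` behave as they appear in the example URL above. The function name `build_indeed_search_url` is hypothetical and not part of Octoparse or Indeed.

```python
from urllib.parse import urlencode, quote


def build_indeed_search_url(keyword, location, radius=None):
    """Build a search URL shaped like the example above.

    The parameter names (q, l, radius) are taken from the example URL;
    treat them as an assumption about how Indeed's search page works.
    """
    params = {"q": keyword, "l": location}
    if radius is not None:
        params["radius"] = radius
    # quote_via=quote encodes spaces as %20 instead of the default "+",
    # which matches the encoding used in the example URL.
    return "https://www.indeed.com/jobs?" + urlencode(params, quote_via=quote)


url = build_indeed_search_url("devops", "Dallas-Fort Worth, TX", radius=50)
print(url)
```

The printed URL can then be pasted into Octoparse in the first step of the tutorial.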
The easiest way to scrape the website is to go to "Task Templates" on the main screen of Octoparse and start with a ready-to-use Indeed template. Just enter the URL into the template and wait for the data to come out. For further details, see Task Templates.
If you would like to know how to build the task from scratch, you may continue reading the following tutorial.
Here are the main steps in this tutorial: [Download task file here]
1. Go to Web Page - open the targeted web page
2. Auto-detect the web page - create the workflow
3. Set up the wait time for "Extract Data" - control scraping speed
4. Start extraction - run the task and get data
1) Go to Web Page - open the targeted web page
- Enter the URL on the home page and click "Start"
2) Auto-detect the web page - create the workflow
- Click "Auto-detect the web page data" on the Tips panel and wait for the detection to complete
- Go to "Data preview" to see if you are satisfied with the current data output
- You can delete unnecessary data fields directly by clicking the icon
- You can also modify the data field names here directly by clicking the icon
- Uncheck the option of "Add a page scroll"
- Click "Create workflow"
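Deleting and renaming fields in "Data preview" can also be mirrored on exported data if you decide later that a column is unnecessary. The following is a minimal Python sketch of that cleanup on rows held as dictionaries. The helper `clean_rows`, the field names (`Title`, `Title_URL`, `Field3`), and the sample row are all hypothetical and do not come from an actual Octoparse export.

```python
def clean_rows(rows, drop=(), rename=None):
    """Return copies of rows without the dropped fields and with fields renamed."""
    rename = rename or {}
    cleaned = []
    for row in rows:
        cleaned.append(
            # Skip dropped fields, then apply the new name if one is given.
            {rename.get(key, key): value for key, value in row.items() if key not in drop}
        )
    return cleaned


# Hypothetical row; real field names depend on what auto-detect finds.
rows = [
    {"Title": "DevOps Engineer", "Title_URL": "https://example.com/job/1", "Field3": "Dallas, TX"},
]
print(clean_rows(rows, drop={"Title_URL"}, rename={"Field3": "Location"}))
```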
3) Set up the wait time for "Extract Data" - control scraping speed
- Open the action settings of "Extract Data"
- Click "Options"
- Tick "Wait before action"
- Set the wait time to 1-2 seconds
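The idea behind "Wait before action" is to pause briefly before each extraction so that pages are not requested too quickly. The following is a minimal Python sketch of that general technique, not of how Octoparse implements it. The function names `wait_before_action` and `extract_all` are hypothetical, and `extract` stands in for whatever per-page extraction you run.

```python
import random
import time


def wait_before_action(min_seconds=1.0, max_seconds=2.0, sleep=time.sleep):
    """Pause for a random time between min_seconds and max_seconds, and return it."""
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay


def extract_all(pages, extract, sleep=time.sleep):
    """Run extract on each page, waiting 1-2 seconds before every extraction."""
    results = []
    for page in pages:
        wait_before_action(sleep=sleep)  # throttle before the action, as in the setting above
        results.append(extract(page))
    return results
```

The `sleep` parameter exists only so the pause can be swapped out when trying the sketch; in normal use the default `time.sleep` applies.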
4) Start extraction - run the task and get data
- Click the run button on the upper left side
- Select "Run on your device" to run the task on your computer, or select "Run task in the Cloud" to run the task in the Cloud (for premium users only)
Here is sample data for your reference.
Happy Data Hunting!
Author: The Octoparse Team