
How to Extract Google Maps Coordinates

Wednesday, October 09, 2019

Have you ever thought that you could make money by knowing how many restaurants there are in a square mile? There is no free lunch, but if you know how to use Google Maps, you can extract the GPS coordinates of each restaurant and store them in your own database. With that information on hand and some math, you are on your way to building a big data online service.

 

In this article, I will show you how to quickly extract Google Maps coordinates with a simple and easy method. 

  

The coordinates are easy to miss because they are hidden inside the URL. To get them, we extract the page URL and use a regular expression to pull out the exact text string we are looking for. Let’s take the Space Needle landmark in Seattle as an example.

 

 

First, open Google Maps in your browser and type “Space Needle” in the search bar.

 


 

After the page finishes loading, look for the coordinates in the URL. They appear right after the “@” sign, as latitude and longitude separated by a comma.
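To make the URL layout concrete, here is a minimal Python sketch that reads the numbers after the “@” sign. The function name `parse_at_segment` is my own, and the pattern (latitude, longitude, then a zoom level ending in “z”) is an assumption based only on the Space Needle URL used in this article, so other Google Maps URLs may look different.

```python
import re

# Assumed layout, inferred from the example URL: "@<latitude>,<longitude>,<zoom>z"
AT_SEGMENT = re.compile(r"@(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?),(\d+(?:\.\d+)?)z")

SPACE_NEEDLE_URL = (
    "https://www.google.com/maps/place/Space+Needle/"
    "@47.6205099,-122.3514661,17z/"
    "data=!4m5!3m4!1s0x5490151f4ed5b7f9:0xdb2ba8689ed0920d"
    "!8m2!3d47.6205063!4d-122.3492774"
)


def parse_at_segment(url):
    """Return (latitude, longitude, zoom) as floats, or None if no match."""
    match = AT_SEGMENT.search(url)
    if match is None:
        return None
    latitude, longitude, zoom = (float(group) for group in match.groups())
    return latitude, longitude, zoom


print(parse_at_segment(SPACE_NEEDLE_URL))  # (47.6205099, -122.3514661, 17.0)
```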

 


 

Next, we can extract the URL. The tool we use here is Octoparse, but you can use whatever tool you feel comfortable with. Octoparse is the best web scraping tool I have encountered: its intuitive user interface is easy to pick up, especially for beginners. It is best if you already have it on your computer; otherwise, you can download it.

 

1. Build a new task in Advanced Mode by clicking the “+” sign.

2. Input the URL into the box: https://www.google.com/maps/place/Space+Needle/@47.6205099,-122.3514661,17z/data=!4m5!3m4!1s0x5490151f4ed5b7f9:0xdb2ba8689ed0920d!8m2!3d47.6205063!4d-122.3492774

3. Hit “Save URL” to proceed. 
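A side note on the URL in step 2: it contains two coordinate pairs, one after the “@” sign and one after `!3d` and `!4d` in the `data=` part. The sketch below pulls out both so you can compare them. The function name `extract_pairs` and both patterns are my own assumptions, inferred from this single URL. As far as I know Google does not document this URL layout, so treat it as something that may change. A common reading is that the “@” pair is the center of the visible map and the `!3d`/`!4d` pair is the marker itself, but I have not verified that here.

```python
import re

# Both patterns are assumptions inferred from the single example URL.
AT_PAIR = re.compile(r"@(-?\d+\.\d+),(-?\d+\.\d+)")
DATA_PAIR = re.compile(r"!3d(-?\d+\.\d+)!4d(-?\d+\.\d+)")

SPACE_NEEDLE_URL = (
    "https://www.google.com/maps/place/Space+Needle/"
    "@47.6205099,-122.3514661,17z/"
    "data=!4m5!3m4!1s0x5490151f4ed5b7f9:0xdb2ba8689ed0920d"
    "!8m2!3d47.6205063!4d-122.3492774"
)


def extract_pairs(url):
    """Return both coordinate pairs found in the URL (None where absent)."""
    result = {}
    for key, pattern in (("at_sign", AT_PAIR), ("data_block", DATA_PAIR)):
        match = pattern.search(url)
        if match is None:
            result[key] = None
        else:
            result[key] = (float(match.group(1)), float(match.group(2)))
    return result


pairs = extract_pairs(SPACE_NEEDLE_URL)
print(pairs["at_sign"])     # (47.6205099, -122.3514661)
print(pairs["data_block"])  # (47.6205063, -122.3492774)
```

The two pairs differ slightly in this example, so it is worth checking which one matches the location you care about before storing it.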

 

Now we have created a new task successfully. However, Google Maps doesn’t load properly in Octoparse’s built-in browser, because Google Maps does not support the browser’s default user agent. To solve this problem, click the icon, find the User-agent Switcher, choose Firefox 45.0, and click Save. Octoparse will reload the web page automatically.

 

After the web page finishes loading, we can start the extraction by pointing and clicking in the built-in browser. Click the place name, and the "Action Tips" panel will show the available options. Select “Extract text of selected element”.

 


 

You should now see that the extraction step has been created and added to the workflow below. To rename the field, type the desired name in the settings area at the upper right.

 


 

Go to the extraction fields and find “Add predefined field” at the bottom. Click it to bring up the dropdown menu, select “Add current page information”, and then select “Web page URL”.

 


 

Now the web page URL has been added as a data field. Next, we need to edit the URL field to trim off the excess and pull out just the coordinates.

 


Hit the "Customize" icon (the little pencil) at the bottom and select "Refine extracted data". Then click the add step button. This brings up a list of data-cleaning functions. In this case, select "Match with regular expression". You should now see the regular expression editor.

 


 

This lets you reshape the data the way you want by writing a regular expression, which is a special text string that describes a search pattern. Since many people find regular expressions difficult to write, we can use the built-in RegEx tool to help. Click the “Try RegEx Tool” button.

 

Notice that we want the part after the “@” sign but before the second comma. Check the “Start With” box and enter “@”. This tells the RegEx tool that you want the part after that sign. Likewise, check the “End With” box and enter “,1”. Since there are two commas after the “@”, we need to specify which one we mean, so we add the character that follows it. In this URL the second comma is followed by the zoom level “17z”, so adding “1” tells the tool to stop right before that comma. Click the “Generate” button, and the regular expression will appear in the box.
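If you want to see what a “Start With” and “End With” rule amounts to, here is a rough Python equivalent. This is my own sketch, not the expression Octoparse generates (which may look different); the helper name `between` is hypothetical.

```python
import re

SPACE_NEEDLE_URL = (
    "https://www.google.com/maps/place/Space+Needle/"
    "@47.6205099,-122.3514661,17z/"
    "data=!4m5!3m4!1s0x5490151f4ed5b7f9:0xdb2ba8689ed0920d"
    "!8m2!3d47.6205063!4d-122.3492774"
)


def between(text, start, end):
    """Return the shortest text found after `start` and before `end`, or None."""
    # re.escape keeps characters such as "@" or "," literal in the pattern.
    pattern = re.escape(start) + r"(.+?)" + re.escape(end)
    match = re.search(pattern, text)
    return match.group(1) if match else None


coords = between(SPACE_NEEDLE_URL, "@", ",1")
print(coords)  # 47.6205099,-122.3514661

# Split the matched text into numeric latitude and longitude.
latitude, longitude = (float(part) for part in coords.split(","))
print(latitude, longitude)  # 47.6205099 -122.3514661
```

Keep in mind that the “,1” marker only works when the first comma is not also followed by a 1. For a made-up string such as "@35.0,139.0,9z", the same call returns just "35.0", so a stricter number-based pattern is safer when you process many different places.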

 

Now confirm that the setup is correct by clicking the “Match” button. The matched result appears on the right. Boom! This is exactly what we want. Go ahead and click “Apply”, then click “Ok” to confirm.

 


 

That’s it! You are done. Let’s run the crawler and see if it works. Click “Start Extraction” and pick “Local Extraction”.

 

 

Now, what if you have 1,000 addresses to look up? Don’t worry: Octoparse allows you to input over 10,000 URLs when you set up the task. It is as simple as it sounds.
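If you end up with a long list of Google Maps URLs and want to turn them into a table outside of Octoparse, a short script can do the cleanup. The sketch below is only an illustration: the function name `urls_to_csv`, the column names, and both URL patterns are my own assumptions based on the Space Needle example, and URLs without an “@” segment are simply skipped.

```python
import csv
import io
import re
from urllib.parse import unquote_plus

# Assumed patterns, inferred from the example URL in this article.
COORDS = re.compile(r"@(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?)")
PLACE = re.compile(r"/maps/place/([^/@]+)")


def urls_to_csv(urls):
    """Build CSV text with one row (name, latitude, longitude) per usable URL."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "latitude", "longitude"])
    for url in urls:
        coords = COORDS.search(url)
        if coords is None:
            continue  # no "@lat,lng" segment, so nothing to record
        place = PLACE.search(url)
        # "Space+Needle" in the URL path becomes "Space Needle".
        name = unquote_plus(place.group(1)) if place else ""
        writer.writerow([name, coords.group(1), coords.group(2)])
    return buffer.getvalue()


urls = [
    "https://www.google.com/maps/place/Space+Needle/"
    "@47.6205099,-122.3514661,17z/"
    "data=!4m5!3m4!1s0x5490151f4ed5b7f9:0xdb2ba8689ed0920d"
    "!8m2!3d47.6205063!4d-122.3492774",
    "https://www.google.com/maps",  # has no coordinates and is skipped
]
print(urls_to_csv(urls))
```

Writing to an in-memory buffer keeps the example self-contained; in practice you would probably write the rows to a file or insert them into your database.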

 

If you have any questions about setting up a crawler, please reach out to support@octoparse.com. Octoparse is designed to walk you through the journey from beginner to web scraping expert. We are here to help you become a master craftsman in the art of web scraping.

 

Author: Ashley

Ashley is a data enthusiast and passionate blogger with hands-on experience in web scraping. She focuses on capturing web data and analyzing it in a way that empowers companies and businesses with actionable insights. Read her blog to discover practical tips and applications of web data extraction.

 


 

 
