If you want to extract phone numbers using Regular Expressions but don’t know how to write them, this article may help.
There could be multiple phone numbers in a single large string, and these phone numbers could come in a variety of formats. Here is an example of file format:
- (123) 456 7899
- 123 456 7899
What is the easiest way to extract phone numbers like these? Regular Expressions are hard to learn if you don’t have any programming knowledge. In this article, I’ll introduce a Regular Expression tool that generates Regular Expressions for you and matches all the phone numbers quickly.
Regular Expression to Match Email Addresses from strings
First, try your best to find the common characters that each phone number starts with and ends with. For example, for the target text above, I looked at its source code, shown below.
<p >Here is an example of file format </p>
<li>(123) 456 7899 </li>
<li>123 456 7899 </li>
We can see that each phone number starts with <li> and ends with </li>, so we can use the RegEx Tool in Octoparse to quickly extract all of them.
1. Run Octoparse and open the RegEx Tool.
2. Copy and paste the source code into the “Source Text” box.
Then select the “Start With” option and enter “<li>”.
3. Next, select the “End With” option and enter “</li>”.
Don’t forget to select the “Match All” option.
4. Select the “Generate” and “Match” options one after the other.
It’s done. All the matched phone numbers are listed in the green box.
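For readers who prefer code, the same "Start With" / "End With" idea can be sketched in Python. This is a minimal sketch, not the expression that the Octoparse RegEx Tool actually generates: the pattern `<li>(.*?)</li>` and the variable names are assumptions chosen for illustration.

```python
import re

# Source text copied from the example above.
source_text = """<p >Here is an example of file format </p>
<li>(123) 456 7899 </li>
<li>123 456 7899 </li>"""

# "Start With" <li> and "End With" </li>; the non-greedy group
# captures whatever sits between the two markers.
pattern = re.compile(r"<li>(.*?)</li>")

# findall returns every captured group (the "Match All" behaviour);
# strip() removes the trailing space inside each list item.
phone_numbers = [match.strip() for match in pattern.findall(source_text)]
print(phone_numbers)  # ['(123) 456 7899', '123 456 7899']
```

The non-greedy `.*?` matters here: a greedy `.*` could run from the first `<li>` to the last `</li>` if both list items were on one line.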
Note that if you can’t find common characters that every phone number starts and ends with, you cannot extract all the phone numbers at once. In that case, you need a separate Regular Expression for each phone number format.
Here, I wrote down two additional Regular Expressions for two formats of phone numbers.
Match: 0511-4405222 | 021-87888822
Match: (021)1234567 | (0411)123456 | (000)000000 | (123)1234567
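As an illustration of format-specific expressions, here is a hedged Python sketch with one pattern per format. These patterns are assumptions inferred only from the sample matches listed above (3 or 4 area-code digits, then 6 to 8 further digits); they are not necessarily the author's original expressions, and real-world numbers may need looser rules.

```python
import re

# Assumed pattern for the hyphenated format, e.g. 0511-4405222:
# 3-4 digit area code, a hyphen, then 7-8 digits.
hyphen_pattern = re.compile(r"\b\d{3,4}-\d{7,8}\b")

# Assumed pattern for the parenthesised format, e.g. (021)1234567:
# 3-4 digit area code in parentheses, then 6-7 digits,
# not followed by a further digit.
paren_pattern = re.compile(r"\(\d{3,4}\)\d{6,7}(?!\d)")

hyphen_text = "0511-4405222 | 021-87888822"
paren_text = "(021)1234567 | (0411)123456 | (000)000000 | (123)1234567"

hyphen_matches = hyphen_pattern.findall(hyphen_text)
paren_matches = paren_pattern.findall(paren_text)

print(hyphen_matches)  # ['0511-4405222', '021-87888822']
print(paren_matches)   # ['(021)1234567', '(0411)123456', '(000)000000', '(123)1234567']
```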
Author: The Octoparse Team