Blog > Post

RegEx: How to Extract All Email Addresses from TXT Files or Strings

Tuesday, July 02, 2019

Now you have a text file mixed with email addresses and text strings, and you want to extract email addresses. In this article, I will show you how to extract all email addresses from TXT Files or Strings using Regular Expression. 

 

Option 1, Excel formula : 

=TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND (" ",A1&" ",FIND("@",A1))-1)," ", REPT(" ",LEN(A1))),LEN(A1))).

 

This formula is frustrating if you have a hard time to use Excel. Excel has strict rules on the order. It is often the case that you copy and paste a complexed formula, but Excel won't accept it unless you type the expression into the cell.  Another problem associated with the Excel formula is that you have to spend a certain amount of time to debug the express, especially the long one.

 

Option 2, Excel VBA code.

 

 

1. Press ALT+F11 keys, which would bring you to the Microsoft Visual Basic for Application window

2. Click Insert > Module, copy and past the following in the Module window:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
Sub ExtractEmail()
'Update 20130829
Dim WorkRng As Range
Dim arr As Variant
Dim CharList As String
On Error Resume Next
xTitleId = "KutoolsforExcel"
Set WorkRng = Application.Selection
Set WorkRng = Application.InputBox("Range", xTitleId, WorkRng.Address, Type:=8)
arr = WorkRng.Value
CheckStr = "[A-Za-z0-9._-]"
For i = 1 To UBound(arr, 1)
    For j = 1 To UBound(arr, 2)
        extractStr = arr(i, j)
        outStr = ""
        Index = 1
        Do While True
            Index1 = VBA.InStr(Index, extractStr, "@")
            getStr = ""
            If Index1 > 0 Then
                For p = Index1 - 1 To 1 Step -1
                    If Mid(extractStr, p, 1) Like CheckStr Then
                        getStr = Mid(extractStr, p, 1) & getStr
                    Else
                        Exit For
                    End If
                Next
                getStr = getStr & "@"
                For p = Index1 + 1 To Len(extractStr)
                    If Mid(extractStr, p, 1) Like CheckStr Then
                        getStr = getStr & Mid(extractStr, p, 1)
                    Else
                        Exit For
                    End If
                Next
                Index = Index1 + 1
                If outStr = "" Then
                    outStr = getStr
                Else
                    outStr = outStr & Chr(10) & getStr
                End If
            Else
                Exit Do
            End If
        Loop
        arr(i, j) = outStr
    Next
Next
WorkRng.Value = arr
End Sub

5. Press Ok to proceed with the process. 4. Select the range you would like to apply to the above code. In this case range A1:A4

 

 

 

Option 3: Regex Expression

 

The regular expression is very hard to learn if you don’t have any programming knowledge. However, you can apply this one simple expression to filter the email address. 

[a-zA-Z0-9-_]{1,}@[a-zA-Z0-9-_]{1,}.[a-zA-Z]{1,}

 

Octoparse has built-in RegEx Tool, which is very convenient for people to clean the extracted data. Especially for non-IT professionals, it is an extra bonus that you don't have to spend time to learn python to clean the data. with the Octoparse web scraping tool, it is now possible for data extraction, cleaning, and export all-in-one.

 

 

1. Open Octoparse application

2. Copy the text string at Source Text. For example, in this case, the text string is:

This email address is valid: web@email.net and this email address is not valid web@email. Same as what_ever@public.com is a valid email address and address test@test. is not valid! OCTOPARSE@test.com is also valid

 

3. Copy and paste the expression in the “Regular Expression” box.

4. Choose the “Match All” option at the bottom, and click “Match”.

 

Author: Ashley Weldon

 

Download Octoparse Today  

Author's Picks

 

About Octoparse

A Comparison among Three Editions of Octoparse

What A Price Monitor Can Help you?

Collect Data from Amazon

Collect Data from eBay

Collect Data from Gumtree.com

 

 

 

Download Octoparse to start web scraping or contact us for any
question about web scraping!

Contact Us Download