undefined
Blog > Data Collection > Post

RegEx: How to Extract All Email Addresses from TXT Files or Strings

Tuesday, July 5, 2022

Now you have a text file mixed with email addresses and text strings, and you want to extract email addresses. In this article, I will show you how to use RegEx to extract email address from string or TXT files easily, and other methods to get email leads for your business.

 

 

Excel Formula to Get Email Address

 

=TRIM(RIGHT(SUBSTITUTE(LEFT(A1,FIND (" ",A1&" ",FIND("@",A1))-1)," ", REPT(" ",LEN(A1))),LEN(A1))).

 

Using this formula to extract email addresses is frustrating if you have a hard time using Excel. Excel has strict rules on the order. It is often the case that you copy and paste a complex formula, but Excel won't accept it unless you type the expression into the cell. Another problem associated with the Excel formula is that you have to spend a certain amount of time debugging the expression, especially a long one.

 

Excel VBA Code to Get Email Information

Step 1: Press the "ALT+F11" keys, and it would bring you to the Microsoft Visual Basic for the Application window

Step 2: Click Insert > Module, copy and paste the following into the Module window:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
Sub ExtractEmail()
'Update 20130829
Dim WorkRng As Range
Dim arr As Variant
Dim CharList As String
On Error Resume Next
xTitleId = "KutoolsforExcel"
Set WorkRng = Application.Selection
Set WorkRng = Application.InputBox("Range", xTitleId, WorkRng.Address, Type:=8)
arr = WorkRng.Value
CheckStr = "[A-Za-z0-9._-]"
For i = 1 To UBound(arr, 1)
    For j = 1 To UBound(arr, 2)
        extractStr = arr(i, j)
        outStr = ""
        Index = 1
        Do While True
            Index1 = VBA.InStr(Index, extractStr, "@")
            getStr = ""
            If Index1 > 0 Then
                For p = Index1 - 1 To 1 Step -1
                    If Mid(extractStr, p, 1) Like CheckStr Then
                        getStr = Mid(extractStr, p, 1) & getStr
                    Else
                        Exit For
                    End If
                Next
                getStr = getStr & "@"
                For p = Index1 + 1 To Len(extractStr)
                    If Mid(extractStr, p, 1) Like CheckStr Then
                        getStr = getStr & Mid(extractStr, p, 1)
                    Else
                        Exit For
                    End If
                Next
                Index = Index1 + 1
                If outStr = "" Then
                    outStr = getStr
                Else
                    outStr = outStr & Chr(10) & getStr
                End If
            Else
                Exit Do
            End If
        Loop
        arr(i, j) = outStr
    Next
Next
WorkRng.Value = arr
End Sub

Step 3: Press "Ok" to proceed with the process.

Step 4: Select the range you would like to apply to the above code. In this case range A1: A4

 

Regular Expression to Extract Email Address

The regular expression is very hard to learn if you don’t have any programming knowledge. However, you can apply this simple expression to filter the email address and then extract the data.

[a-zA-Z0-9-_]{1,}@[a-zA-Z0-9-_]{1,}.[a-zA-Z]{1,}

 

Octoparse has a built-in RegEx Tool, which is very convenient for people to clean the extracted data. Especially for non-IT professionals, it is an extra bonus that you don't have to spend time learning python. With the Octoparse web scraping tool, it is now possible to have data extraction, cleaning, and export all in one.

Steps to Extract Email with Octoparse Built-in RegEx Tool

Step 1:Open Octoparse application

Step 2:Copy the text string at Source Text. In this case, the text string is:

This email address is valid: web@email.net and this email address is not valid web@email. Same as what_ever@public.com is a valid email address and address test@test. is not valid! OCTOPARSE@test.com is also valid

Step 3:Copy and paste the expression in the "Regular Expression" box.

Step 4:Choose the "Match All" option at the bottom, and click "Match".

 

Related Resources

How to Generate Sales Leads Using Web Scraping

Web Scraping for Lead Generation

What is B2B Lead Generation

We use cookies to enhance your browsing experience. Read about how we use cookies and how you can control them by clicking cookie settings. If you continue to use this site, you consent to our use of cookies.
Accept decline