Python program to extract emails from a file

Python extract all emails from a file:

We can use a regular expression (regex) to extract all emails from a string or from a file. In this post, we will learn how to read the content of a text file and extract every email address it contains.

Python provides built-in functions for file operations. We will open the file, read its content, and extract all emails from it.

How to open a file in Python:

Python provides a built-in function called open() that opens a file in a given mode. Its basic form is:

open(file, mode)

Where,

  • file is the file path.
  • mode is the mode in which to open the file. It is a string built from the characters ‘r’, ‘a’, ‘w’, ‘x’, ‘b’, ‘t’, and ‘+’.
    • ‘r’ is the default mode. It opens the file for reading.
    • ‘w’ opens the file for writing. It truncates the file if it exists and creates it if it does not.
    • ‘x’ is used for exclusive creation. It fails if the file already exists.
    • ‘a’ is used for appending. It opens the file so that text is written at the end of the file.
    • ‘b’ opens the file in binary mode and ‘t’ opens it in text mode. Text mode is the default.
    • ‘+’ opens the file for updating (both reading and writing).
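The modes above can be tried out with a short sketch. It is only an illustration: the file name demo.txt is hypothetical, and the file is created inside a temporary directory so that nothing on disk is affected.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name

    # 'w' creates the file (or truncates it) and opens it for writing.
    with open(demo_path, "w") as f:
        f.write("first line\n")

    # 'a' keeps the existing content and writes at the end of the file.
    with open(demo_path, "a") as f:
        f.write("second line\n")

    # 'r' is the default mode, so open(demo_path) would behave the same way.
    with open(demo_path, "r") as f:
        content = f.read()

    # 'x' refuses to open a file that already exists.
    try:
        with open(demo_path, "x") as f:
            pass
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

print(content)
print(exclusive_failed)
```

Here content holds both lines, because ‘a’ appended to what ‘w’ wrote, and exclusive_failed is True, because the file already existed when it was opened with ‘x’.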

For this example, the program will open the file in read mode, read its content, and use a regular expression to extract all emails from it.

Python program:

Below is the complete program:

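import re

# Open the file in the default read mode; the with block closes it automatically.
with open('input.txt') as input_file:
    # Find every substring that looks like name@domain.
    emails = re.findall(r"[\w\.-]+@[\w\.-]+", input_file.read())
    print(emails)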
  • It uses the re module to work with regular expressions.
  • The findall function takes a pattern as its first parameter and a string as its second parameter. It returns all non-overlapping matches of the pattern in the given string as a list of strings, or as a list of tuples if the pattern contains more than one capturing group.
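To make the return value of findall more concrete, here is a small sketch on a plain string. The sample text and its addresses are made up for illustration, and the second pattern, with two capturing groups, is an assumed variant that the program above does not use.

```python
import re

# Made-up sample text for illustration.
text = "Contact alice@example.com or bob@example.org for details."

# No capturing groups: each match is returned as a plain string.
whole_matches = re.findall(r"[\w.-]+@[\w.-]+", text)

# Two capturing groups: each match is returned as a (user, domain) tuple.
split_matches = re.findall(r"([\w.-]+)@([\w.-]+)", text)

print(whole_matches)
print(split_matches)
```

The first call gives a list of two strings, and the second gives a list of two tuples, one (user, domain) pair per address.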

For example, if the input.txt file contains the following content:

hello world
hello123,xj abc#.com
hello@gmail.com hello123@blah.com
hellouniverse !!@ @.com hello@xm.com

The program will print the following output:

['hello@gmail.com', 'hello123@blah.com', 'hello@xm.com']
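This run can be reproduced without creating input.txt by hand. The sketch below is one possible arrangement, not the only one: the helper name extract_emails and the constant EMAIL_PATTERN are hypothetical, and the sample content is written to a temporary file before it is read back.

```python
import os
import re
import tempfile

# Same pattern as the program above, compiled once for reuse.
EMAIL_PATTERN = re.compile(r"[\w.-]+@[\w.-]+")


def extract_emails(path):
    """Return every match of EMAIL_PATTERN found in the file at path."""
    with open(path) as input_file:
        return EMAIL_PATTERN.findall(input_file.read())


# The sample content shown above.
sample = (
    "hello world\n"
    "hello123,xj abc#.com\n"
    "hello@gmail.com hello123@blah.com\n"
    "hellouniverse !!@ @.com hello@xm.com\n"
)

# Write the sample to a temporary input.txt and extract the emails from it.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "input.txt")
    with open(sample_path, "w") as f:
        f.write(sample)
    emails = extract_emails(sample_path)

print(emails)
```

The fragments !!@ and @.com are skipped because the pattern needs at least one word character, dot, or hyphen on both sides of the @ sign.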

