Python | Calculate Number of Words in a File

Python is a simple yet powerful programming language that can be applied to many areas, such as scientific computing and natural language processing. One area of application that I find quite fascinating is text processing.

In this article, I’ll discuss how to calculate the number of words in a text file using Python, and also how to calculate the number of unique words in a text file.

Let’s see what steps need to be followed to calculate the number of words in a text file:

  1. Open the text file for reading in Python code using the open(filename, "r") function
  2. Read the text from the file object returned by open(filename, "r") in Step 1, using the read() function
  3. Split up the string returned by read() in Step 2, using the split() function
  4. split(), called without arguments, breaks the text at whitespace (spaces, tabs and line breaks) and stores the words in a Python list
  5. Calculate the number of items in the list returned by split() using the len() function
  6. The value returned by len() is the number of words in the text file
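
As a small illustration of steps 3 to 5, the sketch below runs split() on a made-up string. The sample text and variable names are assumptions chosen for demonstration; they do not come from a real file.

```python
# Hypothetical sample text with a double space, a tab and a line break.
sample_text = "Python  is\ta simple\nlanguage"

# split() with no arguments splits on any run of whitespace,
# so irregular spacing does not produce empty items.
words = sample_text.split()
print(words)       # ['Python', 'is', 'a', 'simple', 'language']
print(len(words))  # 5

# By contrast, split(" ") splits on single spaces only,
# so the double space leaves an empty string in the result.
words_by_space = sample_text.split(" ")
print(words_by_space)
```

This is why the steps use split() without arguments: a file with line breaks or repeated spaces would otherwise give a misleading count.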

Let’s put all 6 steps together as Python code for calculating the number of words in a text file.

with open("filename.txt", "r") as f:   # Step 1 (the file is closed automatically)
    data = f.read()                    # Step 2
words = data.split()                   # Step 3 and 4
number_of_words = len(words)           # Step 5
print(number_of_words)

Do note that this way of counting simply adds up how many words there are in the text file, including repeats. For example, if the word “Computer” occurs 40 times in the text file, it is counted 40 times, not once.
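
A minimal sketch of that behaviour, using a made-up sentence (an assumption for illustration) in which “Computer” appears three times:

```python
# Hypothetical text in which "Computer" appears three times.
text = "Computer science needs a Computer and a Computer lab"
words = text.split()

total = len(words)                      # every occurrence is counted
computer_count = words.count("Computer")

print(total)           # 9
print(computer_count)  # 3
```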

Let’s now discuss how to calculate the number of unique words in a text file using Python.

These are the steps to follow for calculating the number of unique words in a text file:

  1. Open the text file for reading in Python code using the open(filename, "r") function
  2. Read the text from the file object returned by open(filename, "r") in Step 1, using the read() function
  3. Split up the string returned by read() in Step 2, using the split() function
  4. split(), called without arguments, breaks the text at whitespace (spaces, tabs and line breaks) and stores the words in a Python list
  5. The list from the previous step may contain the same word multiple times, but passing it to the set() function returns a Python set containing only the unique words
  6. Calculate the number of items in the set from the previous step using the len() function
  7. The value returned by len() is the number of unique words in the text file

Let’s put all 7 steps together as Python code for calculating the number of unique words in a text file.

with open("name.txt", "r") as f:          # Step 1 (the file is closed automatically)
    data = f.read()                       # Step 2
words = data.split()                      # Step 3 and 4
set_of_words = set(words)                 # Step 5
number_of_unique_words = len(set_of_words)  # Step 6: count the set, not the list
print(number_of_unique_words)

The Python code for calculating the number of words or unique words in a text file is quite simple. Let’s now try this code on a text file named testing.txt, which contains the following content =>

Python is a Simple Language
Python have simple syntax as compared to other Languages
Python is easy to learn

For this file, the code for the number of words returns 19, and the code for the number of unique words returns 15. The comparison is case-sensitive, so “Simple” and “simple” are counted as two different words.
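
To check these figures, here is a rough sketch that writes the sample content to a temporary testing.txt and counts the words. The temporary directory and the variable names are assumptions made so the example runs anywhere, and the case-insensitive count at the end is an extra illustration that is not part of the code above.

```python
import os
import tempfile

# Contents of testing.txt as given in the article.
content = (
    "Python is a Simple Language\n"
    "Python have simple syntax as compared to other Languages\n"
    "Python is easy to learn\n"
)

# Write the sample into a temporary directory, then read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "testing.txt")
    with open(path, "w") as f:
        f.write(content)
    with open(path, "r") as f:
        words = f.read().split()

total_words = len(words)
unique_words = len(set(words))
# Lowercasing first treats "Simple" and "simple" as the same word.
unique_words_ignoring_case = len({w.lower() for w in words})

print(total_words)                 # 19
print(unique_words)                # 15
print(unique_words_ignoring_case)  # 14
```

Whether to ignore case depends on what you want to treat as the same word; the code in this article keeps the case-sensitive behaviour.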

I hope that by going through this article, you understood how to calculate the number of words or unique words in a text file using Python. The Python code in this article uses lists and sets. If you’re not aware of these Python data types, then see => What are Sets in Python?, Lists in Python Programming Language.

Gagan

Hi there, I'm the founder of ComputerScienceHub (started to bring useful Computer Science information together in one place). Personally, I've been doing JavaScript and Python development since 2015. I've worked on a couple of web development projects and done some data science work using Python. Nowadays I primarily work as a freelance JavaScript developer (web developer), while also managing a team of Computer Science specialists at ComputerScienceHub.io
