Python | Count Frequency of a Word in Text File

Python is a simple yet powerful programming language that can be applied to many areas, such as scientific computing and natural language processing. One area of application that I find particularly fascinating is text processing.

In this article, I’ll discuss how to calculate the number of times a word occurs in a text file using Python.

Let’s see what steps need to be followed to calculate the frequency of a word in a text file:

  1. Open the text file for reading using the open(filename, "r") function.
  2. Read the text from the file object returned in Step 1 using its read() method.
  3. Split the string returned by read() in Step 2 into words using the split() method.
  4. Called with no arguments, split() breaks the text at whitespace (spaces, tabs, and newlines) and returns the words as a Python list.
  5. Initialize a counter such as occurrence_of_word = 0, iterate over the list from Step 4, and increment the counter whenever a word matches the specified word whose frequency is to be calculated.
  6. After iterating over the whole list of words, occurrence_of_word holds the frequency of the specified word in the text file.
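
To make Steps 3 and 4 concrete, here is a small sketch of what split() returns. The two-line string below is only an illustrative stand-in for the contents of a file.

```python
# Illustrative text standing in for the contents of a file (an assumption).
text = "Python is a Simple Language\nPython is easy to learn"

# With no arguments, split() breaks the string at any run of whitespace,
# so spaces and newlines are both treated as word separators.
words = text.split()

print(words)
print(len(words))
```

Because newlines count as whitespace, words from different lines of the file end up in the same flat list.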

Let’s put all 6 of these steps together as Python code.

with open("filename.txt", "r") as f:   # Step 1: the with block closes the file automatically
    data = f.read()                    # Step 2
words = data.split()                   # Steps 3 and 4

count_frequency_word = "Python"        # Word whose frequency is to be calculated
occurrence_of_word = 0

for i in words:                        # Step 5
    if i.lower() == count_frequency_word.lower():
        occurrence_of_word += 1

print(occurrence_of_word)              # Step 6

Let’s use this Python code to count how many times the word “Python” occurs in a file named filename.txt, which contains the following text:

Python is a Simple Language
Python have simple syntax as compared to other Languages
Python is easy to learn

Running the code above with filename.txt as the input file and “Python” as count_frequency_word prints 3, since the word Python occurs three times in the text file.
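
If you want to reproduce this result without creating the file by hand, the following sketch writes the sample text to a temporary filename.txt and counts the word. The helper name count_word and the use of a temporary directory are assumptions made for this example, not part of the code above.

```python
import os
import tempfile


def count_word(path, word):
    """Hypothetical helper: case-insensitive count of word in the file at path."""
    with open(path, "r") as f:
        words = f.read().split()
    return sum(1 for w in words if w.lower() == word.lower())


# The sample text from this article.
sample_text = (
    "Python is a Simple Language\n"
    "Python have simple syntax as compared to other Languages\n"
    "Python is easy to learn\n"
)

# Write the sample into a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "filename.txt")
    with open(path, "w") as f:
        f.write(sample_text)
    result = count_word(path, "Python")

print(result)  # 3
```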

Note
This Python code treats python/Python/PYTHON/PYthON as the same word, so all of them are counted when calculating the frequency of the word Python.
If you instead want to treat python/Python/PYTHON/PYthON as different words, counting only exact occurrences of Python and not the other variants (python/PYTHON/PYthON), make the comparison case-sensitive.
To do that, on line 9 of the code above, remove the lower() calls and write the condition as i == count_frequency_word.
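
The difference between the two comparisons can be seen in a short sketch. The function name count_matches, its case_sensitive flag, and the sample word list are all assumptions chosen for illustration.

```python
def count_matches(words, target, case_sensitive=False):
    """Hypothetical helper: count how many entries in words equal target."""
    if case_sensitive:
        # Exact match only: "python" and "PYTHON" do not count as "Python".
        return sum(1 for w in words if w == target)
    # Case-insensitive match: compare lowercased forms.
    return sum(1 for w in words if w.lower() == target.lower())


words = "python Python PYTHON PYthON Python".split()

insensitive = count_matches(words, "Python")
sensitive = count_matches(words, "Python", case_sensitive=True)

print(insensitive)  # 5
print(sensitive)    # 2
```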

Gagan

Hi there, I'm the founder of ComputerScienceHub (started to bring useful Computer Science information together in one place). I've been doing JavaScript and Python development since 2015, working on a couple of web development projects and some data science work in Python. Nowadays I primarily work as a freelance JavaScript developer (web developer) while managing a team of Computer Science specialists at ComputerScienceHub.io.
