In this project, you will create a command line application in Node.js that reads any text file and outputs the frequency of words in that file. This project gives you practice with basic Node.js concepts (handling file input and output, using the fs module, and reading command line arguments with process.argv), along with basic programming logic and JavaScript fundamentals like objects and arrays.
To read a file in Node, you will use the fs module. To calculate the frequency of words found in that file, you'll need to use basic JavaScript.
Several test files are provided for you to use as input files. You can also use any text file you like, as it should work the same for any file. A good idea would be to make a really short one for testing as you work.
To calculate the frequency of words, you must:
- remove or skip punctuation
- normalize all words to lowercase
- remove or skip the "stop words" -- words used so frequently they are ignored
- go through the file word by word and keep a count of how often each word is used
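The steps above could be sketched roughly as follows. The stop-word list here is a tiny made-up sample (the project's actual list may differ), countWords is a hypothetical name, and a Map is used for the counts, although a plain object would also work.

```javascript
// Assumption: a short illustrative stop-word list, not the project's real one.
const STOP_WORDS = new Set(["a", "an", "and", "the", "of", "to", "in", "is", "it"]);

// Hypothetical helper: returns a Map of word -> number of occurrences.
function countWords(text) {
  const counts = new Map();
  const words = text
    .toLowerCase()                 // normalize to lowercase
    .replace(/[^a-z0-9\s]/g, "")   // drop punctuation (so "don't" becomes "dont")
    .split(/\s+/)                  // split on any run of whitespace
    .filter(Boolean);              // discard empty strings

  for (const word of words) {
    if (STOP_WORDS.has(word)) continue;        // skip stop words
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

const demo = countWords("The day, the Day! Praise song.");
console.log(demo.get("day")); // 2
```

Note that this regular expression only keeps ASCII letters and digits, so accented characters would be stripped; that simplification is part of the sketch, not a requirement of the project.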
Run the program from the command line like this:
node word_frequency.js praise_song_for_the_day.txt
When the program is complete, running the command above will print out a report on the command line showing the number of times each word appears in that file, formatted like this:
we | 7 *******
each | 5 *****
or | 5 *****
need | 5 *****
love | 5 *****
about | 4 ****
praise | 4 ****
song | 4 ****
day | 3 ***
our | 3 ***
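One possible way to produce lines in that shape is sketched below. It assumes the counts are already available as [word, count] pairs (for example from Object.entries on a counts object); formatReport is a hypothetical name, and any column alignment beyond a single space is left out.

```javascript
// Hypothetical helper: turn [word, count] pairs into report lines,
// highest count first, with one asterisk per occurrence.
function formatReport(entries) {
  return [...entries]                       // copy so the input is not mutated
    .sort((a, b) => b[1] - a[1])            // sort by count, descending
    .map(([word, count]) => `${word} | ${count} ${"*".repeat(count)}`);
}

const sampleEntries = [["day", 3], ["we", 7], ["each", 5]];
console.log(formatReport(sampleEntries).join("\n"));
// we | 7 *******
// each | 5 *****
// day | 3 ***
```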
A starting program is located in word_frequency.js. There are also text files that you can use as your input files.
- Extend the app to display the top n most frequently occurring words in the file, where n is a number provided by the user as a command line argument. For example, if the user runs
node word_frequency.js praise_song_for_the_day.txt 3
the output would be:
we | 7 *******
each | 5 *****
or | 5 *****
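A minimal sketch of this option, assuming the count arrives as the second user argument (process.argv[3]) and that the word counts are held as [word, count] pairs. The names parseLimit and topN are hypothetical.

```javascript
// Hypothetical helper: parse the optional count argument.
// Returns undefined (meaning "show everything") when it is missing or invalid.
function parseLimit(arg) {
  const n = Number.parseInt(arg, 10);
  return Number.isInteger(n) && n > 0 ? n : undefined;
}

// Hypothetical helper: sort by count, descending, and keep the first n pairs.
function topN(entries, n) {
  const sorted = [...entries].sort((a, b) => b[1] - a[1]);
  return n === undefined ? sorted : sorted.slice(0, n);
}

const limit = parseLimit(process.argv[3]);
const pairs = [["or", 5], ["we", 7], ["each", 5], ["day", 3]];
console.log(topN(pairs, limit));
```

Because this simply cuts the sorted list at n, words tied with the last one shown are dropped.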
- A better implementation of the first spicy option would treat n as the number of top frequencies rather than the number of words, so that every word tied at one of those frequencies is shown, even if that means printing more than n words. For example, if the user runs
node word_frequency.js praise_song_for_the_day.txt 3
the output would be:
we | 7 *******
each | 5 *****
or | 5 *****
need | 5 *****
love | 5 *****
about | 4 ****
praise | 4 ****
song | 4 ****
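One reading of the example above is that n selects the top n distinct counts (here 7, 5, and 4) and every word with one of those counts is printed. Under that assumption, a sketch might look like this; topNWithTies is a hypothetical name, and n is assumed to be a positive integer.

```javascript
// Hypothetical helper: keep every [word, count] pair whose count is among
// the n highest distinct counts.
function topNWithTies(entries, n) {
  const sorted = [...entries].sort((a, b) => b[1] - a[1]);
  // Distinct counts in descending order (a Set keeps insertion order).
  const distinctCounts = [...new Set(sorted.map(([, count]) => count))];
  if (distinctCounts.length <= n) return sorted; // fewer than n levels: show all
  const cutoff = distinctCounts[n - 1];          // the nth highest count
  return sorted.filter(([, count]) => count >= cutoff);
}

const counted = [
  ["we", 7], ["each", 5], ["or", 5], ["need", 5], ["love", 5],
  ["about", 4], ["praise", 4], ["song", 4], ["day", 3], ["our", 3],
];
console.log(topNWithTies(counted, 3).length); // 8
```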