Update (25/04/2011): wc.py is now available on GitHub, with a shiny new README and some new features. Woot!
In an effort to start tracking my daily word count, I wrote this script. It checks a folder for changes in the word count of text files since the count was last recorded. I have the script update the word count nightly, using a cron job.
In the first few lines of the script, two variables are set:
default_path: Set this to the path to check for updates. The script recurses into sub-directories.
default_threshold: This defines what counts as ‘recent’, and by default is within the last day (timedelta(days=1)). It is a Python timedelta object, which is documented in Python's datetime module documentation. You can replace days=1 with other arguments, like weeks=1. The script currently works with dates, not times, so things like hours=1 will not work properly.
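As a rough illustration (not the code from wc.py itself), the two settings and a date-based recency check might look something like this. The path value and the is_recent helper are assumptions made up for the example.

```python
import os
from datetime import date, timedelta

# Hypothetical values for illustration; the real script sets its own defaults.
default_path = os.path.expanduser("~/writing")
default_threshold = timedelta(days=1)


def is_recent(modified, today=None, threshold=default_threshold):
    """Return True if the date `modified` falls within `threshold` of `today`.

    Only whole dates are compared, so in this sketch a sub-day threshold
    such as timedelta(hours=1) would only ever match files modified today.
    """
    if today is None:
        today = date.today()
    return today - modified <= threshold
```

With the default threshold, a file modified yesterday counts as recent and one modified two days ago does not. Passing threshold=timedelta(weeks=1) widens the window to seven days.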
Once configured, run wc.py with no arguments to see the current word count of recent files (those within default_threshold), compared to any previous record (created using the update command). A simple number-only output can be achieved by running
The word count can be “updated” by running wc.py update. This makes a record of the word count of any recently modified files, in a file called .wordcount, in the default directory. It makes sense to set up a cron job or task to do this in line with the
It is also important to note that the script only counts words in text files (.txt). In theory it could handle any other plain-text file, but it currently filters by file extension.
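To give an idea of what filtering by extension while recursing through a folder can look like, here is a minimal sketch. The count_words function and its extensions parameter are hypothetical names for this example, and splitting on whitespace is an assumed definition of a "word".

```python
import os


def count_words(path, extensions=(".txt",)):
    """Sum the whitespace-separated words in matching files under `path`.

    `extensions` must be a tuple of suffixes, since str.endswith accepts
    a tuple but not a list.
    """
    total = 0
    for root, _dirs, files in os.walk(path):  # os.walk recurses into sub-directories
        for name in files:
            if not name.lower().endswith(extensions):
                continue  # skip anything without a recognised extension
            full_path = os.path.join(root, name)
            with open(full_path, encoding="utf-8", errors="replace") as handle:
                total += len(handle.read().split())
    return total
```

Supporting other plain-text formats would then just be a matter of passing a longer tuple, for example extensions=(".txt", ".md").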
I hope this first version of the script is at the least a useful example of Python’s simpler operations, as well as pickle, the module which allows objects to be written to binary files.
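For anyone who has not used pickle before, the sketch below shows the general idea of storing a word count record and comparing against it later. It is an illustration only: the function names and the {path: word_count} dictionary layout are assumptions, and the only detail taken from the script is the .wordcount file name.

```python
import os
import pickle


def save_record(counts, directory, filename=".wordcount"):
    """Write a {path: word_count} dictionary to a binary file using pickle."""
    with open(os.path.join(directory, filename), "wb") as handle:
        pickle.dump(counts, handle)


def load_record(directory, filename=".wordcount"):
    """Read the stored dictionary back, or return an empty one if none exists."""
    try:
        with open(os.path.join(directory, filename), "rb") as handle:
            return pickle.load(handle)
    except FileNotFoundError:
        return {}


def words_written(current, previous):
    """Total change in word count between two {path: word_count} records."""
    return sum(count - previous.get(path, 0) for path, count in current.items())
```

A file that is missing from the previous record is treated as having had zero words, so a brand new file contributes its full word count. Pickle files should only be loaded from sources you trust, which is fine here because the script reads a file it wrote itself.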