Yoshikoder

What’s new with the Yoshikoder?

JFreq – a little command line tool

Just a quick pointer to a bit of code that you might find useful, if you’re a command line kind of person.

JFreq is a simple word counter. It takes your text files, filters them in various ways, and spits out a table of counts organized word by document. The handy bit is probably the filtering. JFreq can currently stem in 12 languages (courtesy of the Lucene project). It can also remove currency references, number references, and stop words from a list you provide. It requires Java 5 or higher, and output is in UTF-8.
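For a feel of what that sort of filtering and counting involves, here is a rough Python sketch. It is not JFreq's code and does not reproduce its behaviour: the function names, the tokenizing rule, and the patterns for numbers and currency are all made up for illustration, the table layout (one row per word, one column per document) is one reading of "word by document", and stemming is left out entirely.

```python
import re
from collections import Counter

# Hypothetical patterns: JFreq's own rules for what counts as a
# currency or number reference may well differ.
CURRENCY = re.compile(r"[$£€¥]\d[\d.,]*")
NUMBER = re.compile(r"\d[\d.,]*")


def tokenize(text):
    """Lowercase, split on whitespace, and trim surrounding punctuation."""
    tokens = []
    for raw in text.lower().split():
        tok = raw.strip(".,;:!?\"'()[]")
        if tok:
            tokens.append(tok)
    return tokens


def count_words(docs, stopwords=(), drop_numbers=True, drop_currency=True):
    """Return {document name: Counter of the words that survive filtering}."""
    stop = {w.lower() for w in stopwords}
    counts = {}
    for name, text in docs.items():
        kept = []
        for tok in tokenize(text):
            if tok in stop:
                continue
            if drop_currency and CURRENCY.fullmatch(tok):
                continue
            if drop_numbers and NUMBER.fullmatch(tok):
                continue
            kept.append(tok)
        counts[name] = Counter(kept)
    return counts


def to_table(counts):
    """Build rows with one word per row and one column per document."""
    docs = list(counts)
    vocab = sorted({w for c in counts.values() for w in c})
    rows = [["word"] + docs]
    for w in vocab:
        # A Counter returns 0 for words a document never used.
        rows.append([w] + [counts[d][w] for d in docs])
    return rows


if __name__ == "__main__":
    docs = {
        "a.txt": "The tax cost $5.00 in 1997. The tax rose.",
        "b.txt": "Tax cuts, the end.",
    }
    for row in to_table(count_words(docs, stopwords=["the", "in"])):
        print("\t".join(str(cell) for cell in row))
```

The stop word list is passed in by the caller, in the same spirit as the list you provide to JFreq; the two `drop_` switches are only there to show that each filter can be turned off independently.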

JFreq is already in use as part of the backend of the Stata implementation of Wordscores, and has been used by the Wordfish folk for research on the EU. It was meant to get a nice graphical interface one of these days, though given the speed of Yoshikoder development lately that looked unlikely to happen soon. It now comes with one, if you don’t fancy the command line version.

Written by Will

July 6, 2007 at 11:19 am

One Response

  1. Nice content analysis package; I’m going to use it on foreign policy statements for Venezuela and the former Soviet Republics.

    Adrian P. Hull, Ph.D.

    March 19, 2009 at 1:32 pm

