Batch re-encoding files

Sometimes it’s necessary to convert a whole batch of text files from one encoding to another, like when my Windows-using collaborators send me things in CP1252 or Mac folk send MacRoman. If for some reason you don’t feel like using the command line to convert a folder full of files, there is now the re-encoder to do it for you. Just identify your folder of text files, preview one or two of them to make sure your ‘from’ encoding is actually reasonable, pick a ‘to’ encoding, and press the button to get a folder full of new files. (The old ones aren’t deleted, in case you just made a mistake.)

Not a particularly tricky task, but it might be useful to someone.
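For anyone curious what the same job looks like in a script, here is a minimal Python sketch. It is not the re-encoder's own code; the function names `preview` and `reencode_folder`, the `*.txt` pattern, and the default encodings are all assumptions made for illustration. Like the tool, it writes new files to a separate folder and leaves the originals alone.

```python
from pathlib import Path


def preview(path, from_enc, n_chars=200):
    """Decode the start of one file so you can eyeball whether from_enc looks right."""
    return Path(path).read_bytes().decode(from_enc, errors="replace")[:n_chars]


def reencode_folder(src_dir, dst_dir, from_enc="cp1252", to_enc="utf-8", pattern="*.txt"):
    """Re-encode every matching file in src_dir into dst_dir (hypothetical helper).

    Originals are never modified; a file that cannot be decoded with
    from_enc raises UnicodeDecodeError instead of being silently mangled.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src.glob(pattern)):
        if not path.is_file():
            continue
        # Work on bytes so line endings pass through untouched.
        text = path.read_bytes().decode(from_enc)
        out = dst / path.name
        out.write_bytes(text.encode(to_enc))
        written.append(out)
    return written
```

Something like `reencode_folder("incoming", "incoming-utf8", from_enc="mac_roman")` would handle a folder of MacRoman files, and calling `preview` on one or two files first is the scripted version of checking that the ‘from’ encoding is reasonable.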

New Yoshikoder version out


Feel free to poke it a bit. Menus have moved, lots of little things have changed, Mac users get a more OSX-ish feel, and we’re now all set up for the next move forward. The help is rewritten, but a bit sparse, and there’s a little more info in the README on the SourceForge pages.

And if you don’t like it, all your files will still work with the old version.

Batch Reporting

If you want to run a Yoshikoder dictionary over a large number of documents more quickly than the Yoshikoder would do it, you can now use the new version of JFreq. Plus you get stemming, stopword removal and other preprocessing steps too, if you want them. Just drag your documents into the window, upload your saved dictionary (not project), select your preprocessing options, pick an output file, and press Run.
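To make the idea concrete, here is a rough Python sketch of what running a dictionary over a document amounts to: count how many tokens fall under each category. It is not JFreq's implementation and does not read Yoshikoder's dictionary file format. The function name `count_categories`, the plain dict-of-patterns layout, the `*` wildcard handling via `fnmatchcase`, and the tokenizer are all assumptions, and stemming is left out.

```python
import re
from fnmatch import fnmatchcase


def count_categories(text, dictionary, stopwords=frozenset()):
    """Count tokens matching each category's patterns (illustrative only).

    dictionary maps a category name to a list of lowercase patterns,
    where '*' is a wildcard, e.g. {"economy": ["tax*", "budget"]}.
    """
    # Crude preprocessing: lowercase, split on non-word characters,
    # then drop any stopwords.
    tokens = [t for t in re.findall(r"\w+", text.lower()) if t not in stopwords]
    counts = {category: 0 for category in dictionary}
    for token in tokens:
        for category, patterns in dictionary.items():
            if any(fnmatchcase(token, pattern) for pattern in patterns):
                counts[category] += 1
    return counts
```

Calling this once per document and writing the counts out as one row per file gives a document-by-category table of the kind a batch report would contain.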

Horrid html

First off, there’s an update to the YK Converter available from here.  It’s a ‘related file’ for the Yoshikoder, for some reason. These updates are primarily because I’ve just finished working on a project that required scraping a lot of truly horrid web pages, and the current machinery wasn’t quite up to dealing with them.  In fact, they bust everything except TagSoup.

A little bit of infrastructure

OK.  About the transparency thing.  There’s now a proper place to send bug reports and feature requests for the Yoshikoder.  I had considered using the many options available from Sourceforge, but plumped for something that was much simpler, arguably more elegant, and most importantly: blue.

And another one

Another preview (RC2) is available here. Most of the debugging happened on the Mac side, but it ought to go slightly better everywhere.

Sneaky preview

Over a remarkably productive Christmas break I finally got working on the much-neglected, at this point almost mythical ‘next version’ of the Yoshikoder. It’s not quite there yet, but I thought it might be nice to share a sneaky new year preview with you.


