Lots of fixes from Martin's comments.

Fix signed/unsigned character issues.
Add lots of comments to help understand the code.
Add tests for proper Unicode handling (we should abort if we get a Unicode string, and we should correctly handle UTF-8 strings).