From 9d77d7bd9cce62f6749bb8b2bb7696fcb68f8135 Mon Sep 17 00:00:00 2001
From: Robin Luckey
Date: Mon, 9 Jan 2012 08:44:25 -0800
Subject: [PATCH] OTWO-1213 Works around lost encoding in Ruby/C binding layer

When a Ruby 1.9.2 string is passed to the C code, the associated encoding
metadata is lost. When this same string is then returned from C back to Ruby,
an arbitrary, mismatched encoding is applied to replace the lost one. This
means that a string becomes garbled in the round trip. The bits don't change,
but the encoding is lost.

The correct fix would be to preserve the encoding metadata in the C layer.
The easier fix is to replace the lost encoding with a more likely match,
which is what I've done in this patch. When the C code returns a string, we
apply the Ruby runtime's current default encoding, which is highly likely to
be the encoding originally discarded.
---
 ruby/ohcount.rb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ruby/ohcount.rb b/ruby/ohcount.rb
index c9f0b43..2e5e665 100644
--- a/ruby/ohcount.rb
+++ b/ruby/ohcount.rb
@@ -12,7 +12,7 @@ module Ohcount
     def file_location=(value) set_diskpath(value) end
     def file_location() diskpath() end
     def filenames=(value) set_filenames(value) end
-    def contents() get_contents() end
+    def contents() get_contents().force_encoding(Encoding.default_external) end
     def polyglot() get_language() end
 
     def language_breakdowns
-- 
2.32.0.93.g670b81a890
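
Reviewer note: a minimal sketch of the round trip described in the commit message. It assumes the binding hands back the same bytes tagged as binary; the message only says the applied encoding is "arbitrary", so that choice is a stand-in. `simulate_c_round_trip` and `restore_encoding` are hypothetical helpers for illustration and are not part of ohcount.

```ruby
# Hypothetical stand-in for the C layer: same bytes come back, but the
# encoding tag is replaced (here with binary, as an assumption).
def simulate_c_round_trip(str)
  str.dup.force_encoding(Encoding::BINARY)
end

# Mirrors the patch: re-tag the bytes without transcoding them.
# Defaults to the runtime's external encoding, as the patch does.
def restore_encoding(str, encoding = Encoding.default_external)
  str.dup.force_encoding(encoding)
end

original = "caf\u00E9"                       # UTF-8 string with one non-ASCII character
returned = simulate_c_round_trip(original)   # bytes intact, encoding tag lost

# Pass UTF-8 explicitly so the demo does not depend on the local environment.
restored = restore_encoding(returned, Encoding::UTF_8)

puts returned.bytes == original.bytes   # the bits don't change
puts returned == original               # but the strings no longer compare equal
puts restored == original               # re-tagging repairs the string
```

`force_encoding` only changes the encoding tag and leaves the bytes alone, so the workaround holds only when `Encoding.default_external` matches the encoding that was discarded.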