2012-07-07 83 views

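Unicode text from an SQLite database appears to be corrupted

I am using the http://hackage.haskell.org/package/sqlite-0.5.2.2 binding to access an SQLite database. The *.db file contains UTF-8 encoded text; I have verified this both in a text editor and in the sqlite CLI tool.

When I connect to the database and retrieve the data, the text content comes out corrupted. A simple test follows: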

import qualified Database.SQLite as SQL 
import Control.Applicative ((<$>)) 
import System.IO 

buildSkypeMessages dbh = 
    (go <$> (SQL.execStatement dbh "select chatname,author,timestamp,body_xml from messages order by chatname, timestamp")) >>= 
    writeIt 
    where 
    writeIt content = withFile "test.txt" WriteMode (\handle -> mapM_ (\(c:a:t:[]) -> hPutStrLn handle c) content) 
    go (Left msg) = fail msg 
    go (Right rows) = map f $ concat rows 
     where 
     f' (("chatname",SQL.Text chatname): 
      ("author",SQL.Text author): 
      ("timestamp",SQL.Int timestamp): 
      r) = ([chatname, author], r) 
     f xs = let (partEntry, (item:_)) = f' xs 
       in case item of 
       ("body_xml",SQL.Text v) -> v:partEntry 
       ("body_xml",SQL.Null) -> "":partEntry 
     escape (_,SQL.Text v) = v 
     escape (_,SQL.Null) = "" 
     escape (_,SQL.Int v) = show v 

What could be wrong here? Am I missing something about SQLite, or about Haskell I/O and encodings?

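One place this could go wrong is when writing the file: GHC picks the default encoding for that operation from the current locale. You can test whether this is the problem by calling [hSetEncoding](http://hackage.haskell.org/packages/archive/base/latest/doc/html/System-IO.html#v:hSetEncoding). – 2012-07-08 06:11:32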

@DanielWagner My current locale is en_US.UTF-8, so that should not be it. The data in the text file looks as if it had been encoded as UTF-8 twice. – jdevelop 2012-07-08 06:34:33

@DanielWagner Setting binary mode helped. Thanks! – jdevelop 2012-07-08 07:04:34

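Answer

Actually, the problem has nothing to do with the SQLite binding itself; it comes from string handling in Haskell. What solved it was calling hSetBinaryMode on the handle before writing the data to it. That this works suggests the strings returned by the binding hold one raw UTF-8 byte per Char, so a UTF-8 text handle encodes them a second time: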

writeIt content = withFile "test.txt" WriteMode (\handle -> hSetBinaryMode handle True >> mapM_ (\(c:a:t:[]) -> hPutStrLn handle c) content)
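
To make the double-encoding effect concrete, here is a minimal sketch that uses only System.IO from base. It assumes, as the symptoms suggest but the thread does not confirm, that the binding hands back strings with one raw UTF-8 byte per Char. The names rawBytes, writeBinary, writeAsUtf8 and readAsUtf8 are hypothetical helpers made up for this illustration; they are not part of the sqlite package.

```haskell
import System.IO

-- "é" stored as its two UTF-8 bytes (0xC3, 0xA9), one byte per Char.
-- Assumption: this is the shape of the strings the binding returns.
rawBytes :: String
rawBytes = "\195\169"

-- Binary mode writes the low 8 bits of every Char unchanged,
-- so the bytes already in the string reach the file as they are.
writeBinary :: FilePath -> String -> IO ()
writeBinary path s = withFile path WriteMode $ \h -> do
  hSetBinaryMode h True
  hPutStr h s

-- A UTF-8 text handle encodes every Char again, which turns each
-- byte-valued Char above 127 into two bytes (double encoding).
writeAsUtf8 :: FilePath -> String -> IO ()
writeAsUtf8 path s = withFile path WriteMode $ \h -> do
  hSetEncoding h utf8
  hPutStr h s

-- Read a file back, decoding it as UTF-8. The contents are forced
-- inside withFile so the handle is not closed before the lazy read ends.
readAsUtf8 :: FilePath -> IO String
readAsUtf8 path = withFile path ReadMode $ \h -> do
  hSetEncoding h utf8
  s <- hGetContents h
  length s `seq` return s
```

Under that assumption, writing rawBytes with writeBinary and reading the file back with readAsUtf8 should give the single character 'é', while writeAsUtf8 followed by readAsUtf8 should give the two byte-valued characters back, which is the garbled output described in the question.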