2014-04-22 20 views
-2

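How to parse a Yahoo historical CSV with Attoparsec

I am a Haskell beginner. How do I use attoparsec to parse the file into an array of open prices, an array of high prices, and so on?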

module CsvParser (
     Quote (..) 
    , csvFile 
    , quote 
    ) where 
import System.IO 
import Data.Attoparsec.Text 
import Data.Attoparsec.Combinator 
import Data.Text (Text, unpack) 
import Data.Time 
import System.Locale 
import Data.Maybe 

data Quote = Quote { 
     qTime  :: LocalTime, 
     qAsk  :: Double, 
     qBid  :: Double, 
     qAskVolume :: Double, 
     qBidVolume :: Double 
    } deriving (Show, Eq) 

csvFile :: Parser [Quote] 
csvFile = do 
    q <- many1 quote 
    endOfInput 
    return q 

quote :: Parser Quote 
quote = do 
    time  <- qtime 
    qcomma 
    ask   <- double 
    qcomma 
    bid   <- double 
    qcomma 
    askVolume <- double 
    qcomma 
    bidVolume <- double 
    endOfLine 
    return $ Quote time ask bid askVolume bidVolume 

qcomma :: Parser() 
qcomma = do 
    char ',' 
    return() 

qtime :: Parser LocalTime 
qtime = do 
    tstring  <- takeTill (\x -> x == ',') 
    let time = parseTime defaultTimeLocale "%d.%m.%Y %H:%M:%S%Q" (unpack tstring) 
    return $ fromMaybe (LocalTime (fromGregorian 0001 01 01) (TimeOfDay 00 00 00)) time 

--testString :: Text 
--testString = "01.10.2012 00:00:00.741,1.28082,1.28077,1500000.00,1500000.00\n" 

quoteParser = parseOnly quote 

main = do 
    handle <- openFile "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode 
    contents <- hGetContents handle 
    let allLines = lines contents 
    map (\line -> quoteParser line) allLines 
    --putStr contents 
    hClose handle 

Error message:

testhaskell.hs:89:5: 
    Couldn't match type `[]' with `IO' 
    Expected type: IO (Either String Quote) 
     Actual type: [Either String Quote] 
    In the return type of a call of `map' 
    In a stmt of a 'do' block: 
     map (\ line -> quoteParser line) allLines 
    In the expression: 
     do { handle <- openFile 
         "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode; 

      contents <- hGetContents handle; 
      let allLines = lines contents; 
      map (\ line -> quoteParser line) allLines; 
      .... } 

testhaskell.hs:89:37: 
    Couldn't match type `[Char]' with `Text' 
    Expected type: [Text] 
     Actual type: [String] 
    In the second argument of `map', namely `allLines' 
    In a stmt of a 'do' block: 
     map (\ line -> quoteParser line) allLines 
    In the expression: 
     do { handle <- openFile 
         "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode; 

      contents <- hGetContents handle; 
      let allLines = lines contents; 
      map (\ line -> quoteParser line) allLines; 
      .... } 
+0

Possible duplicate of [How to parse Yahoo CSV with parsec](http://stackoverflow.com/questions/23211685/how-to-parse-yahoo-csv-with-parsec) – Sibi

+0

This question is about a different library, attoparsec. Even after reading the examples I find it hard to use. Is there a simple example? – Moyes

+1

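As Michael Snoyman suggested in that answer, you should use 'csv-conduit'. 'csv-conduit' uses 'attoparsec' internally to do the parsing. If you are new to Haskell, I suggest you start with the basics and move on to these libraries afterwards. – Sibi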

Answers

0

You can use the attoparsec-csv package, or look at its source code for ideas on how to write your own parser.

The code would look something like this:

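import qualified Data.Text as T 
import qualified Data.Text.IO as TIO 
import Text.ParseCSV -- from the attoparsec-csv package 

main :: IO () 
main = do 
    txt <- TIO.readFile "file.csv" 
    case parseCSV txt of 
        Left err  -> error err 
        Right csv -> mapM_ (print . mkQuote) csv 

-- Turn one CSV row into a Quote (the type from the question); left as a stub. 
mkQuote :: [T.Text] -> Quote 
mkQuote = error "Not implemented yet" 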
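
If you would rather write the parser yourself, a minimal sketch could look like the one below. It is not the attoparsec-csv implementation: the names field, row and csvRows are made up for illustration, the sample input is invented, and it assumes unquoted fields and a line break after every row (including the last one). Transposing the parsed rows gives one list per column, which is roughly the "open array, high array" shape the question asks for.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Prelude hiding (takeWhile)
import Data.Attoparsec.Text
    (Parser, char, endOfInput, endOfLine, many1, parseOnly, sepBy1, takeWhile)
import Data.List (transpose)
import qualified Data.Text as T

-- One unquoted field: everything up to the next comma or line break.
-- (Assumption: no quoted fields, no embedded commas.)
field :: Parser T.Text
field = takeWhile (\c -> c /= ',' && c /= '\n' && c /= '\r')

-- One row: comma-separated fields followed by a line break.
row :: Parser [T.Text]
row = (field `sepBy1` char ',') <* endOfLine

-- The whole input: one or more rows, then end of input.
csvRows :: Parser [[T.Text]]
csvRows = many1 row <* endOfInput

main :: IO ()
main =
    -- Invented sample data; a real file would be read with Data.Text.IO.readFile.
    case parseOnly csvRows "Open,High\n1.0,2.0\n3.0,4.0\n" of
        Left err   -> putStrLn ("parse error: " ++ err)
        Right rows -> mapM_ print (transpose rows)  -- one list per column
```

Each column still starts with its header and holds Text values; converting the rest to numbers (for example with attoparsec's double parser) is a separate step.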
2

The error has nothing to do with parsec or attoparsec. The error message says that this line is not an IO action, so trying to use it as one causes the error:

main = do 
    handle <- openFile "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode 
    contents <- hGetContents handle 
    let allLines = lines contents 
    map (\line -> quoteParser line) allLines -- <== This is not an IO action 
    --putStr contents 
    hClose handle 

You are ignoring the result of the map call. You should bind it to a variable with let, just as you do with the result of lines.
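
A minimal sketch of that fix, using only base: parseLine is a hypothetical stand-in for the question's quoteParser (it just reads a Double), and the sample lines are invented. The point is only that the mapped list is a pure value bound with let, while printing it is the IO action.

```haskell
module Main where

import Text.Read (readEither)

-- Hypothetical stand-in for quoteParser: parse one line as a Double.
parseLine :: String -> Either String Double
parseLine = readEither

main :: IO ()
main = do
    let allLines = ["1.28082", "oops", "1500000.00"]  -- invented sample lines
        results  = map parseLine allLines             -- pure value, bound with let
    mapM_ print results                               -- the IO action that uses it
```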

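The second error occurs because you are passing a String where Text is expected. These are different types, even though both represent ordered sequences of characters (they also have different internal representations). You can convert between the two types with pack and unpack: http://hackage.haskell.org/package/text/docs/Data-Text.html#g:5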
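
As a small illustration of the conversion (the contents string is invented sample data), the sketch below turns the String lines into Text with pack, and also shows the alternative of converting once and splitting with Data.Text's own lines function:

```haskell
module Main where

import qualified Data.Text as T

main :: IO ()
main = do
    let contents     = "a,1\nb,2\n" :: String         -- invented sample data
        asText       = map T.pack (lines contents)    -- [String] -> [Text]
        viaTextLines = T.lines (T.pack contents)      -- convert once, then split
    print (asText == viaTextLines)                    -- both routes give the same lines
    print (map T.unpack asText)                       -- and back to String
```

Reading the file with Data.Text.IO.readFile instead of hGetContents would give you Text from the start, so no conversion is needed.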

Also, you should always give main an explicit type signature, main :: IO (). If you don't, it can sometimes lead to subtle problems.

But as others have said, you should use a CSV parser package.