Parsing strings in Haskell

I have some strings that I want to parse into a list of "chunks". My strings look like this:
"some text [[anchor]] some more text, [[another anchor]]. An isolated ["
And I would like to get back something like this:
[
TextChunk "some text ",
Anchor "anchor",
TextChunk " some more text, ",
Anchor "another anchor",
TextChunk ". An isolated ["
]
I have managed to write a function and some types that do what I need, but they seem overly ugly. Is there a nicer way to do this?
data Token = TextChunk String | Anchor String deriving (Show)
data TokenizerMode = EatString | EatAnchor deriving (Show)

tokenize :: [String] -> [Token]
tokenize xs =
  let (_, _, tokens) = tokenize' (EatString, unlines xs, [TextChunk ""])
  in reverse tokens

tokenize' :: (TokenizerMode, String, [Token]) -> (TokenizerMode, String, [Token])
-- If we're starting an anchor, add a new anchor and switch modes
tokenize' (EatString, '[':'[':xs, tokens) = tokenize' (EatAnchor, xs, Anchor "" : tokens)
-- If we're ending an anchor, add a new text chunk and switch modes
tokenize' (EatAnchor, ']':']':xs, tokens) = tokenize' (EatString, xs, TextChunk "" : tokens)
-- Otherwise, if we've got stuff to consume, append it
tokenize' (EatString, x:xs, TextChunk t : tokens) = tokenize' (EatString, xs, TextChunk (t ++ [x]) : tokens)
tokenize' (EatAnchor, x:xs, Anchor t : tokens) = tokenize' (EatAnchor, xs, Anchor (t ++ [x]) : tokens)
-- If we've got nothing more to consume, we're done
tokenize' (EatString, [], tokens) = (EatString, [], tokens)
-- We'll only get here if we're given an invalid string
tokenize' xx = error ("Error parsing .. so far " ++ show xx)
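
For comparison, here is a minimal sketch of the same idea without the explicit mode flag, using only `base`. It is one possible shape, not the only one. The helper `breakOn` and the name `tokenizeText` are made up for this illustration, and unlike the code above it takes a single `String`, skips empty text chunks, and reports an unterminated anchor as a `Left` instead of calling `error`.

```haskell
import Data.List (isPrefixOf)

data Token = TextChunk String | Anchor String deriving (Show, Eq)

-- Hypothetical helper: split the input at the first occurrence of a
-- delimiter, dropping the delimiter. Returns Nothing if it never occurs.
breakOn :: String -> String -> Maybe (String, String)
breakOn delim = go []
  where
    go acc rest
      | delim `isPrefixOf` rest = Just (reverse acc, drop (length delim) rest)
    go _ [] = Nothing
    go acc (c:cs) = go (c : acc) cs

-- Alternate between "text up to [[" and "anchor up to ]]".
tokenizeText :: String -> Either String [Token]
tokenizeText "" = Right []
tokenizeText s = case breakOn "[[" s of
  -- No more anchors: the rest is plain text (an isolated '[' included).
  Nothing -> Right [TextChunk s]
  Just (before, afterOpen) -> case breakOn "]]" afterOpen of
    Nothing -> Left ("unterminated anchor: " ++ afterOpen)
    Just (name, rest) -> do
      more <- tokenizeText rest
      -- Skip the text chunk when the anchor starts immediately.
      Right ([TextChunk before | not (null before)] ++ Anchor name : more)
```

Because each chunk is cut out in one step, this also avoids the repeated `t ++ [x]` appends, which copy the accumulated chunk on every character.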
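This isn't really tokenizing, it's parsing. For all your parsing needs: Parsec. – 2012-04-16 04:38:32
@CatPlusPlus Agreed on parsing.. updated the text and title to match. – 2012-04-16 04:47:08
@CatPlusPlus Can you show me what this would look like using Parsec? I find the docs/tutorials I've found a bit vague. – 2012-04-16 05:02:08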
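
As a rough idea of what a Parsec version could look like, here is a small sketch, assuming the `parsec` package is available. The parser names `anchor`, `textChunk`, `chunks` and the wrapper `parseChunks` are invented for this example. It works on a single `String`, never emits empty text chunks, and returns a `ParseError` for an unterminated anchor, so it is not a drop-in replacement for the function in the question.

```haskell
import Text.Parsec
  ( ParseError, anyChar, eof, many, many1, manyTill
  , notFollowedBy, parse, string, try, (<|>) )
import Text.Parsec.String (Parser)

data Token = TextChunk String | Anchor String deriving (Show, Eq)

-- "[[", then every character up to (and consuming) the closing "]]".
anchor :: Parser Token
anchor = Anchor <$> (try (string "[[") *> manyTill anyChar (try (string "]]")))

-- One or more characters, stopping just before the next "[[".
-- A single '[' is ordinary text.
textChunk :: Parser Token
textChunk = TextChunk <$> many1 (notFollowedBy (string "[[") *> anyChar)

-- The whole input is a sequence of anchors and text chunks.
chunks :: Parser [Token]
chunks = many (anchor <|> textChunk) <* eof

-- Convenience wrapper; "(input)" is only a label used in error messages.
parseChunks :: String -> Either ParseError [Token]
parseChunks = parse chunks "(input)"
```

The `try` around `string "[["` lets the parser back out after seeing a lone `[`, which is what allows the isolated bracket at the end of the sample string to fall through to `textChunk`.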