2016-11-18 47 views
1

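Modeling a serial format in the type system, like Servant

I'm working on an API integration that ignores the existence of XML or JSON in favor of just appending character data. (The Metro2 format, if you're interested.)

I'm simplifying, but imagine that a person needs to be serialized like this:

  • At pos 0, 4 chars: number of bytes in the message
  • At pos 5, 6 chars: "PERSON", hard-coded
  • At pos 11, 20 chars: name, left-aligned and space-padded
  • At pos 21, 8 chars: birthday, YYYYMMDD
  • At pos 29, 3 chars: age, right-aligned and zero-padded

Numeric fields are always right-aligned and zero-padded. Text fields are always left-aligned and space-padded.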
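The two padding rules can be sketched at the value level before any type-level machinery is involved. This is only an illustration: `textField` and `numericField` are hypothetical names, not part of Metro2 or any library, and truncating over-long text is an assumption of the sketch.

```haskell
-- Hypothetical helpers; the names are illustrative only.

-- | Left-align text in a field of the given width, padding with spaces.
-- Assumption: input longer than the width is truncated.
textField :: Int -> String -> String
textField width s = take width (s ++ replicate width ' ')

-- | Right-align a non-negative number in a field of the given width,
-- padding with zeros. Returns Nothing if the number is negative or
-- its digits do not fit.
numericField :: Int -> Int -> Maybe String
numericField width n
  | n < 0 || length digits > width = Nothing
  | otherwise = Just (replicate (width - length digits) '0' ++ digits)
  where
    digits = show n
```

With these definitions, `numericField 3 35` evaluates to `Just "035"`, and `textField 20 "DAVID WILCOX"` yields a 20-character string that starts with the name.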

An example message:

"0032PERSONDAVID WILCOX  19820711035" 

Can I express this in the type system? Like what Servant does? Something like this?

newtype ByteLength = ByteLength Int 
newtype Age = Age Int 
-- etc 

type PersonMessage 
    = Field ByteLength '0 
    :| Field "PERSON" '5 
    :| Field Name '11 
    :| Field Date '21 
    :| Field Age '29 

-- :| is a theoretical type operator, like :> in servant 
-- the number is the expected offset 
-- the length of the field is implicit in the type 

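Can I statically check that my serialization implementation matches the type?

Can I statically check that the offset of the third field (Name) is 11? That the lengths of the preceding fields add up to 11? I assume not, since this seems to require full dependent type support.

Is this on the right track?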

instance ToMetro Age where 
    -- get the length into the type system using a type family? 
    field = Numeric '3 

    -- express how this is encoded. Would need to use the length from the type family. Or if that doesn't work, put it in the constructor. 
    toMetro age = Numeric age 

Update: an example of a function I'd like to statically verify:

personToMetro :: Person -> PersonMessage 
personToMetro p = error "Make sure that what I return is a PersonMessage" 
+1

Can you give an example of a function you'd like more static guarantees for? Like the left-hand side and at least the type signature – jberryman

+0

I just added an example. Does that help? –

Answers

3

Just to give you some inspiration: do what Servant does and have a different type for each combinator you support:

{-# LANGUAGE GADTs, DataKinds, KindSignatures, TypeOperators, ScopedTypeVariables #-} 

module Seriavant where 

import GHC.TypeLits 
import Data.Proxy 
import Data.List (stripPrefix) 

data Skip (n :: Nat) = Skip deriving Show 
data Token (n :: Nat) = Token String deriving Show 
data Lit (s :: Symbol) = Lit deriving Show 

data (:>>) a b = a :>> b deriving Show 
infixr :>> 

-- Parse a value from the front of the input and return it with the unconsumed rest.
class Deserialize a where
    deserialize :: String -> Maybe (a, String)

-- Skip n: consume and discard exactly n characters.
instance (KnownNat n) => Deserialize (Skip n) where
    deserialize s = do
        (_, s') <- trySplit (natVal (Proxy :: Proxy n)) s
        return (Skip, s')

-- Token n: consume exactly n characters and keep them.
instance (KnownNat n) => Deserialize (Token n) where
    deserialize s = do
        (t, s') <- trySplit (natVal (Proxy :: Proxy n)) s
        return (Token t, s')

-- Lit lit: succeed only if the input starts with the literal text.
instance (KnownSymbol lit) => Deserialize (Lit lit) where
    deserialize s = do
        s' <- stripPrefix (symbolVal (Proxy :: Proxy lit)) s
        return (Lit, s')

-- a :>> b: parse a, then parse b from what is left.
instance (Deserialize a, Deserialize b) => Deserialize (a :>> b) where
    deserialize s = do
        (x, s') <- deserialize s
        (y, s'') <- deserialize s'
        return (x :>> y, s'')

-- Split off exactly n elements; Nothing if the list is too short.
trySplit :: Integer -> [a] -> Maybe ([a], [a])
trySplit 0 xs = return ([], xs)
trySplit n (x:xs) = do
    (xs', ys) <- trySplit (n-1) xs
    return (x:xs', ys)
trySplit _ _ = Nothing

Yeah, so this is quite spartan, but it already lets you do

type MyFormat = Token 4 :>> Lit "PERSON" :>> Skip 1 :>> Token 4 

testDeserialize :: String -> Maybe MyFormat 
testDeserialize = fmap fst . deserialize 

which works like this:

*Seriavant> testDeserialize "1" 
Nothing 
*Seriavant> testDeserialize "1234PERSON Foo " 
Just (Token "1234" :>> (Lit :>> (Skip :>> Token "Foo "))) 

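Edit: it turns out I completely misread the question, and Sean is asking for serialization, not deserialization. But of course, we can do that as well: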

class Serialize a where 
    serialize :: a -> String 

instance (KnownNat n) => Serialize (Skip n) where 
    serialize Skip = replicate (fromIntegral $ natVal (Proxy :: Proxy n)) ' ' 

instance (KnownNat n) => Serialize (Token n) where 
    serialize (Token t) = pad (fromIntegral $ natVal (Proxy :: Proxy n)) ' ' t 

instance (KnownSymbol lit) => Serialize (Lit lit) where 
    serialize Lit = symbolVal (Proxy :: Proxy lit) 

instance (Serialize a, Serialize b) => Serialize (a :>> b) where 
    serialize (x :>> y) = serialize x ++ serialize y 

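-- Pad xs on the right with x0 up to length n.
-- Note: input longer than n is returned unchanged, not truncated.
pad :: Int -> a -> [a] -> [a]
pad 0 _x0 xs = xs
pad n x0 (x:xs) = x : pad (n-1) x0 xs
pad n x0 [] = replicate n x0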

(Of course this has horrible performance with all this String concatenation etc., but that's not the point here.)
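The question also calls for numeric fields, which are right-aligned and zero-padded, whereas `Token` above pads on the right with spaces. One more combinator in the same style might look like the sketch below. `Numeric` and `lastN` are hypothetical names that are not part of this answer, and keeping only the trailing digits of an over-wide value is an assumption made to keep the output width fixed.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables #-}

import GHC.TypeLits (Nat, KnownNat, natVal)
import Data.Proxy (Proxy (..))

-- Same shape as the Serialize class in the answer, repeated here so the
-- sketch compiles on its own.
class Serialize a where
    serialize :: a -> String

-- Hypothetical combinator: a number rendered right-aligned and
-- zero-padded in exactly n characters.
newtype Numeric (n :: Nat) = Numeric Integer deriving Show

instance (KnownNat n) => Serialize (Numeric n) where
    serialize (Numeric x) = replicate (width - length digits) '0' ++ digits
      where
        width  = fromIntegral (natVal (Proxy :: Proxy n))
        -- Assumption: the sign is dropped, and only the last `width`
        -- digits are kept if the value is too wide.
        digits = lastN width (show (abs x))

-- Keep the last k elements of a list (the whole list if it is shorter).
lastN :: Int -> [a] -> [a]
lastN k xs = drop (length xs - k) xs
```

For instance, `serialize (Numeric 35 :: Numeric 3)` evaluates to `"035"`. A real integration would more likely reject over-wide values than truncate them.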

*Seriavant> serialize ((Token "1234" :: Token 4) :>> (Lit :: Lit "FOO") :>> (Skip :: Skip 2) :>> (Token "Bar" :: Token 10)) 
"1234FOO  Bar       " 

Of course, if we know the format, we can avoid those pesky type annotations:

type MyFormat = Token 4 :>> Lit "PERSON" :>> Skip 1 :>> Token 4 

testSerialize :: MyFormat -> String 
testSerialize = serialize 

*Seriavant> testSerialize (Token "1234" :>> Lit :>> Skip :>> Token "Bar") 
"1234PERSON Bar " 
+0

This is sweet, and very helpful, thanks! To be clear, I don't need to deserialize this format, only serialize it. –

+0

@SeanClark Yeah, maybe next time I should try reading the question before answering ;) – Cactus

+0

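So, I see how to use the Nat values at runtime. Is there any way to use them to statically check that the offsets are consistent? (See the question above) –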
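A static offset check of this kind does not seem to need full dependent types. Closed type families over `GHC.TypeLits` naturals may be enough. The following is a minimal sketch under stated assumptions, not a tested design: `Field`, `Consistent`, `End` and `checked` are hypothetical names, each field carries both its offset and its length in the type, and the offsets are recomputed from the field lengths so that they add up (they are not the numbers listed in the question).

```haskell
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies, TypeOperators #-}
{-# LANGUAGE ConstraintKinds, UndecidableInstances #-}

import Data.Kind (Constraint, Type)
import Data.Proxy (Proxy (..))
import GHC.TypeLits

-- Hypothetical field type: the declared offset and length live in the type.
newtype Field (offset :: Nat) (len :: Nat) = Field String

-- Sequencing operator, in the spirit of the question's theoretical ':|'.
data a :| b = a :| b
infixr 5 :|

-- Walk the format left to right. 'start' is where the next field must
-- begin; each field's declared offset has to equal it.
type family Consistent (start :: Nat) (fmt :: Type) :: Constraint where
    Consistent start (Field off len :| rest) = (start ~ off, Consistent (off + len) rest)
    Consistent start (Field off len)         = start ~ off

-- Offset just past the last field, i.e. the total message length.
type family End (fmt :: Type) :: Nat where
    End (Field off len :| rest) = End rest
    End (Field off len)         = off + len

-- Only type-checks when the layout is consistent starting from offset 0.
checked :: Consistent 0 fmt => Proxy fmt -> Proxy fmt
checked = id

-- Assumed layout: offsets derived from the lengths 4, 6, 20, 8 and 3.
type PersonMessage
    = Field 0 4    -- byte length
   :| Field 4 6    -- "PERSON"
   :| Field 10 20  -- name
   :| Field 30 8   -- birthday
   :| Field 38 3   -- age

personFormat :: Proxy PersonMessage
personFormat = checked Proxy

-- The total length, read back from the type at runtime.
messageLength :: Integer
messageLength = natVal (Proxy :: Proxy (End PersonMessage))
```

Changing `Field 10 20` to `Field 11 20` makes `personFormat` fail to compile, because the type checker cannot unify 10 with 11. Note that this only checks the layout description itself. Tying it to a `Serialize` instance, so that the produced string is guaranteed to follow the layout, would be a further step.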
