2014-09-20 57 views
2

我读this mailing list post,如果有人曾经做过一个关于在Haskell螺纹RB树的问题,以及响应的结束说:线程化二叉树结构在Haskell中提供了哪些优势?

我建议你(莱克斯)要么去势在必行(与STRef或IOREF)或做 没有线程,除非你确定你会做更多的 查找和遍历比插入和删除。

暗示,虽然在Haskell中创建线索树通常不是一个好主意,但它仍然使查找和遍历更有效,而不诉诸强制性算法。

但是,我想不到线程可以使haskell树更有效,而不使用命令式结构。它甚至有可能吗?

+3

我想你可能会感到困惑,我指的是一个线程二叉树的数据结构,而不是在具有多线程的二叉树上执行操作。 http://en.wikipedia.org/wiki/Threaded_binary_tree – 2014-09-20 01:15:08

回答

1

但是,我想不出一种方式线程可以使haskell树更有效,而不使用命令式结构。它甚至有可能吗?

从技术上讲,适用于命令式线程树的好处同样适用于持久线程树。但是,由于这种数据结构的额外成本,并不总是一个实际的选择。

考虑一下你有一棵树不会被修改的情况,但是你经常需要进行线性遍历(例如找到一个节点和随后的节点或者完整的线性遍历等)。在命令式语言中,线程树在这种情况下可以比非线程树更高效,因为可以直接执行线性遍历,而无需保留堆栈。应该清楚,这与持久结构的情况完全相同,因为我们假设树不会被修改,所以线程树的线性遍历在持久结构中也会更加高效。

那么,持久线程树的缺点是什么?首先,插入/删除将比普通树要昂贵得多,因为修改节点之前的每个节点也需要重新创建。所以这种结构只有在突变罕见或不存在时才有用。但是在那种情况下,你可能更适合从树中创建一个数组并遍历它(除非你想查找一个起始位置)。所以它最终成为一个相当复杂的数据结构,只能在非常有限的情况下使用。但是对于那个非常具体的用例,它比纯二叉树更有效率。

编辑:这是一个如何纯粹实现线程化二叉树的例子。删除的实施只是一个练习,并没有试图保持树木的平衡,我也没有对正确性做出任何承诺。但在使用Prelude.foldl Threaded.insert Threaded.empty建立Tree后,Data.Foldable.toListfoldThread (:[])都返回相同的列表,因此它可能非常接近正确。

{-# LANGUAGE DeriveFoldable #-} 
module Threaded where 

import Control.Applicative 
import Control.Monad 
import Data.Foldable (Foldable (..)) 
import Data.Monoid 

newtype Tree a = Tree {unTree :: Maybe (NonNullTree a) } 
    deriving (Eq, Foldable) 

-- its a little easier to work with non-null trees. 
data NonNullTree a = Bin (Link a) a (Link a) 

data Link a = 
    Normal (NonNullTree a) -- a child branch 
    | Thread (NonNullTree a) -- a thread to another branch 
    | Null     -- left child of min value, or right child of max value 
    -- N.B. don't try deriving type class instances, such as Eq or Show. If you derive 
    -- them, many of the derived functions will be infinite loops. If you want instances 
    -- for Show or Eq, you'll have to write them by hand and break the loops by 
    -- not following Thread references.  

empty :: Tree a 
empty = Tree Nothing 

singleton :: a -> Tree a 
singleton a = Tree . Just $ Bin Null a Null 

instance Foldable NonNullTree where 
    foldMap f (Bin l a r) = mconcat [foldMap f l, f a, foldMap f r] 

-- when folding, we only want to follow actual children, not threads. 
-- Using this instance, we can compare with folding via threads. 
instance Foldable Link where 
    foldMap f (Normal t) = foldMap f t 
    foldMap f _ = mempty 

-- |find the first value in the tree >= the search term 
-- O(n) complexity, we can do better! 
tlookup :: Ord a => Tree a -> a -> Maybe a 
tlookup tree needle = getFirst $ foldMap search tree 
    where 
    search a = if a >= needle then First (Just a) else mempty 

-- | fold over the tree by following the threads. The signature matches `foldMap` for easy 
-- comparison, but `foldl'` or `traverse` would likely be more common operations. 
foldThread :: Monoid m => (a -> m) -> Tree a -> m 
foldThread f (Tree (Just root)) = deep mempty root 
    where 
    -- descend to the leftmost child, then follow threads to the right. 
    deep acc (Bin l a r) = case l of 
     Normal tree -> deep acc tree 
     _ -> follow (acc `mappend` f a) r 
    follow acc (Normal tree) = deep acc tree 
    -- in this case we know the left child is a thread pointing to the 
    -- current node, so we can ignore it. 
    follow acc (Thread (Bin _ a r)) = follow (acc `mappend` f a) r 
    follow acc Null = acc 

-- used internally. sets the left child of the min node to the 'prev0' link, 
-- and the right child of the max node to the 'next0' link. 
relinkEnds :: Link a -> Link a -> NonNullTree a -> NonNullTree a 
relinkEnds prev0 next0 root = case go prev0 next0 root of 
    Normal root' -> root' 
    _ -> error "relinkEnds: invariant violation" 
    where 
    go prev next (Bin l a r) = 
     -- a simple example of knot-tying. 
     -- * l' depends on 'this' 
     -- * r' depends on 'this' 
     -- * 'this' depends on both l' and r' 
     -- the whole thing works because Haskell is lazy, and the recursive 'go' 
     -- function never actually inspects the 'prev' and 'next' arguments. 
     let l' = case l of 
       Normal lTree -> go prev (Thread this) lTree 
       _ -> prev 
      r' = case r of 
       Normal rTree -> go (Thread this) next rTree 
       _ -> next 
      this = Bin l' a r' 
     in Normal this 

-- | insert a value into the tree, overwriting it if already present. 
insert :: Ord a => Tree a -> a -> Tree a 
insert (Tree Nothing) a = singleton a 
insert (Tree (Just root)) a = case go Null Null root of 
    Normal root' -> Tree $ Just root' 
    _ -> error "insert: invariant violation" 
    where 
    go prev next (Bin l val r) = case compare a val of 
     LT -> 
      -- ties a knot similarly to the 'relinkEnds' function. 
      let l' = case l of 
         Normal lTree -> go prev thisLink lTree 
         _ -> Normal $ Bin prev a thisLink 
       r' = case r of 
         Normal rTree -> Normal $ relinkEnds thisLink next rTree 
         _ -> next 
       this = Bin l' val r' 
       thisLink = Thread this 
      in Normal this 
     EQ -> 
      let l' = case l of 
         Normal lTree -> Normal $ relinkEnds prev thisLink lTree 
         _ -> prev 
       r' = case r of 
         Normal rTree -> Normal $ relinkEnds thisLink next rTree 
         _ -> next 
       this = Bin l' a r' 
       thisLink = Thread this 
      in Normal this 
     GT -> 
      let l' = case l of 
         Normal lTree -> Normal $ relinkEnds prev thisLink lTree 
         _ -> prev 
       r' = case r of 
         Normal rTree -> go thisLink next rTree 
         _ -> Normal $ Bin thisLink a next 
       this = Bin l' val r' 
       thisLink = Thread this 
      in Normal this 
+0

我理解线性遍历如何与命令式语言中的线索树一起工作,我只是努力想知道如何利用线程来编写算法哈斯克尔,而不诉诸哈斯克尔的命令式结构。 – 2014-09-20 13:25:02

+0

@Sintrastes:啊,对不起,我没有真正理解你的问题。有可能纯粹使用一种称为“捆绑结”的技术来实现它,http://www.haskell.org/haskellwiki/Tying_the_Knot。基本思想是每个线程对整个数据结构都有一个懒惰的引用。如果没有人打败我,我会尽量在后面更新我的答案。 – 2014-09-21 19:13:20