折返缓存调用

假设我们有一个IO动作如折返缓存调用

lookupStuff :: InputType -> IO OutputType

这可能是简单的东西，如DNS查找，或对不随时间变化数据的一些Web服务调用。

让我们假设：

操作不会抛出任何异常和/或从未发散
如果不是为IO单子，该功能将是纯的，即结果对于相同的输入参数总是相同的
该操作是可重入的，即它可以安全地从多个线程同时调用。
lookupStuff操作相当（时间）昂贵。

我面临的问题是如何正确（和W/O使用任何unsafe*IO*作弊）实现重入缓存中，可以从多个线程调用，组合多个查询，对于相同的输入参数整合成一个请求。

我想我是在类似于GHC的黑洞概念纯粹的计算，但在IO“计算”的上下文。

什么是指定问题的惯用Haskell/GHC解决方案？

来源

2011-03-30 hvr

的假设1 ，2和3似乎意味着该功能非常纯粹，杂质仅仅是一个实现细节。在这种情况下，我认为使用unsafePerformIO没有任何问题。实际上，我认为unsafePerformIO完全适用于这种情况。 – 2011-03-30 10:30:43

同意。 1,2,3是非常强大的假设，在IO中几乎从不保存代码，但是如果事实上可以保证这个不安全的PerformaIO是相当合理的。 – 2011-03-30 10:45:01

好吧，但我如何保证IO效果从来不会为相同的输入参数执行多次？ – hvr 2011-03-30 10:48:07

是的，基本上是重新实现逻辑。虽然它与GHC已经做的很相似，但这是GHC的选择。 Haskell可以在工作方式非常不同的虚拟机上实现，因此从这个意义上说，它尚未为您完成。

但是，只需使用MVar (Map InputType OutputType)甚至IORef (Map InputType OutputType)（请确保使用atomicModifyIORef修改），并将缓存存储在那里。如果这个必要的解决方案看起来不对，那就是“如果不是为了IO，这个函数就是纯粹的”约束。如果它只是一个任意的动作，那么你必须保持状态才能知道执行或不执行的想法是非常自然的。问题是Haskell没有“纯IO”的类型（如果它依赖于数据库，它在某些条件下表现得纯粹，这与纯粹的遗传是不一样的）。

import qualified Data.Map as Map 
import Control.Concurrent.MVar 

-- takes an IO function and returns a cached version 
cache :: (Ord a) => (a -> IO b) -> IO (a -> IO b) 
cache f = do 
    r <- newMVar Map.empty 
    return $ \x -> do 
     cacheMap <- takeMVar r 
     case Map.lookup x cacheMap of 
      Just y -> do 
       putMVar r cacheMap 
       return y 
      Nothing -> do 
       y <- f x 
       putMVar (Map.insert x y cacheMap) 
       return y

是的它在里面很丑。但在外面，看看！它就像纯粹的记忆功能的类型，除了它有IO染上了它。

来源

2011-03-30 15:44:45 luqui

注意，如实施，这是一个同步缓存。即每个对它的调用都被一个MVar封锁。替代方案可能允许同一个动作运行多次。然后我会使用'IORef'而不是'atomicModifyIORef'。 – luqui 2011-03-30 15:46:39

确实，这是语义上合理的方法。如果包含这些内容，您总是可以在结果周围放置一个'unsafePerformIO'。 – 2011-03-30 20:57:44

好吧，看起来你的实现序列化了IO函数调用，即使'cache'包装函数是从不同线程同时调用的。你如何使它不是序列化的，但不允许多次运行“相同”的动作？ – hvr 2011-03-30 22:54:28

下面是一些代码实现或多或少什么，我在我原来的问题后：

import   Control.Concurrent 
import   Control.Exception 
import   Data.Either 
import   Data.Map   (Map) 
import qualified Data.Map   as Map 
import   Prelude   hiding (catch) 

-- |Memoizing wrapper for 'IO' actions 
memoizeIO :: Ord a => (a -> IO b) -> IO (a -> IO b) 
memoizeIO action = do 
    cache <- newMVar Map.empty 
    return $ memolup cache action 

    where 
    -- Lookup helper 
    memolup :: Ord a => MVar (Map a (Async b)) -> (a -> IO b) -> a -> IO b 
    memolup cache action' args = wait' =<< modifyMVar cache lup 
     where 
     lup tab = case Map.lookup args tab of 
      Just ares' -> 
      return (tab, ares') 
      Nothing -> do 
      ares' <- async $ action' args 
      return (Map.insert args ares' tab, ares')

上面的代码建立在西蒙·马洛的Async抽象为Tutorial: Parallel and Concurrent Programming in Haskell描述：

-- |Opaque type representing asynchronous results. 
data Async a = Async ThreadId (MVar (Either SomeException a)) 

-- |Construct 'Async' result. Can be waited on with 'wait'. 
async :: IO a -> IO (Async a) 
async io = do 
    var <- newEmptyMVar 
    tid <- forkIO ((do r <- io; putMVar var (Right r)) 
       `catch` \e -> putMVar var (Left e)) 
    return $ Async tid var 

-- |Extract value from asynchronous result. May block if result is not 
-- available yet. Exceptions are returned as 'Left' values. 
wait :: Async a -> IO (Either SomeException a) 
wait (Async _ m) = readMVar m 

-- |Version of 'wait' that raises exception. 
wait' :: Async a -> IO a 
wait' a = either throw return =<< wait a 

-- |Cancels asynchronous computation if not yet completed (non-blocking). 
cancel :: Async a -> IO() 
cancel (Async t _) = throwTo t ThreadKilled

来源

2011-06-21 11:01:58 hvr

非常好！你预见到用HashMap替换Data.Map会有什么问题吗？ – 2011-07-30 11:46:44

@ circular-ruin，想不到 – hvr 2011-07-30 11:57:17

优秀，谢谢:) – 2011-07-30 12:12:52

折返缓存调用

回答

相关问题