4

给定一个二维空间中的点列表,您想要在 Haskell中执行函数来找到两个最近点之间的距离。 例子: 输入:项目[(1,5),(3,4),(2,8),(-1,2),(-8.6),(7.0),(1.5),(5.5), (4.8),(7.4)] 输出:2.0Haskell找到两个最近点之间的距离

假设在列表中的最远的两个点之间的距离为至多10000

Here's我的代码:

import Data.List 
import System.Random 

sort_ :: Ord a => [a] -> [a] 
sort_ [] = [] 
sort_ [x] = [x] 
sort_ xs = merge (sort_ left) (sort_ right) 
    where 
    (left, right) = splitAt (length xs `div` 2) xs 
    merge [] xs = xs 
    merge xs [] = xs 
    merge (x:xs) (y:ys)= 
    if x <= y then 
     x : merge xs (y:ys) 
    else y : merge (x:xs) ys  

project :: [(Float,Float)] -> Float 
project [] = 0 
project (x:xs)= 
    if null (xs) then 
     error "The list have only 1 point" 
    else head(sort_(dstList(x:xs))) 

distance :: (Float,Float)->(Float,Float) -> Float 
distance (x1,y1) (x2,y2) = sqrt((x1 - x2)^2 + (y1 - y2)^2) 


dstList :: [(Float,Float)] -> [Float] 
dstList (x:xs)= 
    if length xs == 1 then 
     (dstBetween x xs):[] 
    else (dstBetween x xs):(dstList xs) 


dstBetween :: (Float,Float) -> [(Float,Float)] -> Float 
dstBetween pnt (x:xs)= 
    if null (xs) then 
     distance pnt x 
    else minimum ((distance pnt):((dstBetween pnt xs)):[]) 

{- 
Calling generator to create a file created at random points 
-} 
generator = do 
    putStrLn "Enter File Name" 
    file <- getLine 
    g <- newStdGen 
    let pts = take 1000 . unfoldr (Just . (\([a,b],c)->((a,b),c)) . splitAt 2) 
       $ randomRs(-1,1) g :: [(Float,Float)] 
    writeFile file . show $ pts 

{- 
Call the main to read a file and pass it to the function of project 
The function of the project should keep the name 'project' as described 
in the statement 
-} 
main= do 
    putStrLn "Enter filename to read" 
    name <- getLine 
    file <- readFile name 
    putStrLn . show . project $ readA file 

readA::String->[(Float,Float)] 
readA = read 

我可以执行程序的运行,如下例所示或使用生成器:

in haskell interpreter must键入“生成器”,程序会在这里要求一个包含千分的文件名。并且在Haskell解释器中生成文件后,必须写入main,并请求一个文件名,这是使用“generator”创建的文件的名称。

问题是,对于1000点随机生成我的程序需要很长时间,在双核处理器的计算机上约3分钟。我究竟做错了什么?我如何优化代码以更快地工作?

+0

你是否介绍了你的程序? – Jonke

+0

你为什么要删除这么多你的帖子?看看你的尝试是有帮助的。 – AndrewC

+2

我已恢复第二个版本,以恢复上下文。 –

回答

11

您使用的是二次算法:

project [] = error "Empty list of points" 
project [_] = error "Single point is given" 
project ps = go 10000 ps 
    where 
    go a [_] = a 
    go a (p:ps) = let a2 = min a $ minimum [distance p q | q<-ps] 
        in a2 `seq` go a2 ps 

你应该用更好的算法。 Search computational-geometry tag on SO为更好的算法。请参阅http://en.wikipedia.org/wiki/Closest_pair_of_points_problem


@maxtaldykin proposes一个不错的,简单而有效的改变的算法,应该为随机数据的真正的区别 - 预排序X点坐标,从来没有尝试点超过d单位离目前来看,在X坐标(其中d是目前已知的最小距离):

import Data.Ord (comparing) 
import Data.List (sortBy) 

project2 [email protected](_:_:_) = go 10000 p1 t 
    where 
    (p1:t) = sortBy (comparing fst) ps 
    go d _   [] = d 
    go d [email protected](x1,_) t = g2 d t 
     where 
     g2 d []   = go d (head t) (tail t) 
     g2 d ([email protected](x2,_):r) 
      | x2-x1 >= d = go d (head t) (tail t) 
      | d2 >= d  = g2 d r 
      | otherwise = g2 d2 r -- change it "mid-flight" 
       where 
       d2 = distance p1 p2 

随机数据,g2将在O(1)时间工作,使go将采取O(n)整个事情将由SOR为界t,~ n log n

Empirical orders of growth~ n^2.1显示用于第一码(上1K/2K范围)和~n^1.1用于第二,上10K/20K范围,测试它quick'n'dirty编译加载到GHCI(与第二代码运行50倍的速度比2000年的第一名还要快,而在3000分的情况下,则快80倍)。

+0

优秀点。 O(n^2)给出了1 000 000个计算的非常粗略的估计,O(n log n)给出了非常粗略的估计值3000.您还应该将生成函数和主函数合并到一个主函数中,并使用ghc编译文件 - O2,与解释者相比,它会加快速度。 – AndrewC

+0

@AndrewC,不要认为'ghc -O2'会有帮助,这个问题被标记为* hugs * –

+0

@maxtaldykin糟糕。然后修改建议也是下载[Haskell平台](http://www.haskell.org/platform/)。 – AndrewC

6

可以稍微修改你的bruteforce搜索以获得更好的随机数据性能。

主要思想是用x坐标对点进行排序,并且在比较循环中的距离时,只考虑水平距离不大于当前最小距离的点。

这可能是数量级更快,但在最坏的情况下,它仍然是O(n^2)
实际上,在2000点上,我的机器速度快了50倍。

project points = loop1 10000 byX 
    where 
    -- sort points by x coordinate 
    -- (you need import Data.Ord to use `comparing`) 
    byX = sortBy (comparing fst) points 

    -- loop through all points from left to right 
    -- threading `d` through iterations as a minimum distance so far 
    loop1 d = foldl' loop2 d . tails 

    -- `tail` drops leftmost points one by one so `x` is moving from left to right 
    -- and `xs` contains all points to the right of `x` 
    loop2 d [] = d 
    loop2 d (x:xs) = let 
     -- we take only those points of `xs` whose horizontal distance 
     -- is not greater than current minimum distance 
     xs' = takeWhile ((<=d) . distanceX x) xs 
     distanceX (a,_) (b,_) = b - a 

     -- then just get minimum distance from `x` to those `xs'` 
     in minimum $ d : map (distance x) xs' 

顺便说一句,请不要使用太多括号。 Haskell不需要包含函数参数。

+0

非常好!简单而有效。 :) –