2015-11-02 79 views
7

以下是某人给我的一个练习面试问题,我不确定最佳解决方案是什么到这是:给定一组范围S和一个重叠范围R,找到S中包含R的最小子集。

给定一组范围:
(例如S = {(1, 4), (30, 40), (20, 91) ,(8, 10), (6, 7), (3, 9), (9, 12), (11, 14)}并给予的目标范围R(例如R = (3, 13) - 这意味着要为3〜13)的范围内写的算法。找到覆盖目标范围的最小范围集合,集合中的所有范围必须重叠才能被视为跨越范围整个目标范围。 (在这个例子中,答案会是{(3, 9), (9, 12), (11, 14)}

什么是解决呢?我想,这将使用贪心算法来做到最好。在上面的例子中,我们会寻找所有与3相交的数字,然后从那些最高的那个中挑选出来,然后我们会用刚才选择的那个做同样的事情,因此,我们选择了(3,9),现在我们想要找到所有在这些迭代中,我们选择了(9,12),我们对该问题做了同样的事情,我们发现下一个相交的范围是12 ,最高的是(11,14)。

在迭代之后,我们看到14大于13(我们范围的最大值),所以我们可以停止。

我对这个算法的问题是,如何有效地查询相交范围?如果我们尝试线性搜索,我们最终得到的算法是O(n^2)。我的下一个想法是在我们每次运行循环时将我们列表中的任何交叉范围截断。因此,在第一次迭代中,我们穿过(1,4)(3,9)。在我们的下一次迭代中,我们穿过(9,12),(3,9)(8,10)。因此,通过最后一次迭代,我们必须查看的是{(30,40),(20,91),(6,7)}。我们也可以通过将所有有> 13和最大值的所有东西都划掉,从而提高效率。3.问题是这仍然可能不够。在我们的范围内有很多重复的序列仍然存在潜在的问题。如果我们的范围列表包含诸如{(6,7),(6,7),(6,7),(6,7),(6,7)},我们将不得不每次查看这些尽管它们对我们没有用处。即使我们只是存储唯一值(通过将它们全部放入一组中),我们可能会有一个非常大的范围,并且有一系列范围在我们的目标范围内,但是我们也有一个范围在几乎范围内整个目标范围。

什么将是一种有效的方式来查询我们的范围?或者,可能会有什么更有效的算法来解决这个问题?

+0

能有效溶液包括'(8,10)'? –

+0

该组的成员必须重叠。如果他们没有,他们不会跨越整个范围。所以如果我们包含'(8,10)',它仍然必须包含'(9,12)'。如果我们这样做了,它将是一个大小为4的集合,而不是大小为3的集合。因此它不再是覆盖范围'(3,13)'的最小可能范围集合。 – Ephraim

回答

2

如何使用间隔树进行查询? (https://en.m.wikipedia.org/wiki/Interval_tree)我不确定贪婪是否可以在这里工作。如果我们看一下最后一组的选择,在R高点重叠,有较早的选择之间的重叠,对于那些每一个可能性,例如:

R = (2,10) and we have (8,10) and (7,10) both overlapping with (6,8) 

在这种情况下,我们只需要存储(6,8)的一个值作为路径的第二段;并且再次访问(6,8),因为我们在R中向较低点做更长的路径将是多余的,因为我们已经知道(6,8)以较低的腿数访问过。所以你的想法在我们去的时候消除间隔是有道理的。像这样的工作可以吗?

leg = 1 
start with the possible end (or beginning) intervals 
label these intervals with leg 
until end of path is reached: 
    remove the intervals labeled leg from the tree 
    for each of those intervals labeled leg: 
    list overlapping intervals in the chosen direction 
    leg = leg + 1 
    label the listed overlapping intervals with leg 
1

我可以在不使用区间树的情况下推荐以下复杂度为O(n log n)的算法。

让我们来介绍一些符号。我们应该覆盖范围(X,Y)间隔(x_i,y_i)

首先按起始点给定间隔(x_i,y_i)。这将需要O(n log n)

让我们从区间(x_i,y_i)中选择x_i <= X区间(x_k,y_k)与最大y_i。由于区间已经按起点排序,所以我们可以只增加索引,而区间满足条件。如果y_k小于X,给定的集合和范围都没有解决方案。在其他情况下,区间(x_k,y_k)包含'X',并且在包含X的区间中具有最大终点。

现在我们需要覆盖一个区间(y_k, Y),以满足重叠条件。因为包含X的所有区间的终点都小于y_k+1,所以我们可以从上一步的最后一个区间开始。

本阶段每个区间只使用一次,所以这个部分的时间复杂度为O(n),总共为O(n log n)

用于解决下面的代码片段:

intervals // given intervals from set S 
(X, Y) // range to cover 
sort intervals 
i = 0 // start index 
start = X // start point 
result_set // set to store result 
while start <= Y && i < len(intervals): 
    next_start = intervals[i].y 
    to_add = intervals[i] 
    while intervals[i].x <= start && i < len(intervals): 
     if next_start > intervals[i].y: 
      next_start = intervals[i].y 
      to_add = intervals[i] 
     i++ 
    if(next_start < start): 
     print 'No solution' 
     exit 
    start = next_start 
    result_set add to_add 
+1

你的算法用这个输入产生什么:S = {(3,10),(3,9),(9,12),(11,14)};目标范围R =(3,13)'?考虑到OP在他的评论中提到间隔必须重叠,换句话说'{(3,10),(11,14)}'不会是一个有效的解决方案。 –

+0

Opps,我根据重叠条件修改了答案。谢谢你的提示。 –

0

你的任务吸引了我,所以我写了一个C++程序,通过重叠的目标范围的左侧的范围迭代解决了这个问题,并递归搜索对于涵盖目标范围的剩余(右侧)范围的最小数量的范围。

该算法的一个重要优化(在该程序中未显示)将针对每个递归级别使用与目标范围的左侧重叠最大量的范围,并且不再考虑所有范围左侧重叠的数量较少。通过使用这个规则,我相信递归调用树中最多只有一个下降点。这种优化会产生一个复杂度为O(n log(n))的算法。 (n代表递归的深度,log(n)代表二进制搜索以找到重叠最多的范围。)

这个程序产生以下为输出:

{ (3, 9) (9, 12) (11, 14) } 


下面是程序:

#include <utility> // for std::pair 
#include <vector> // for std::vector 
#include <iostream> // for std::cout & std::endl 

typedef std::pair<int, int> range; 
typedef std::vector<range> rangelist; 

// function declarations 
rangelist findRanges (range targetRange, rangelist candidateRanges); 
void print (rangelist list); 


int main() 
{ 
    range target_range = { 3, 13 }; 

    rangelist candidate_ranges = 
     { { 1, 4 }, { 30, 40 }, { 20, 91 }, { 8, 10 }, { 6, 7 }, { 3, 9 }, { 9, 12 }, { 11, 14 } }; 

    rangelist result = findRanges (target_range, candidate_ranges); 

    print (result); 
    return 0; 
} 


// Recursive function that returns the smallest subset of candidateRanges that 
// covers the given targetRange. 
// If there is no subset that covers the targetRange, then this function 
// returns an empty rangelist. 
// 
rangelist findRanges (range targetRange, rangelist candidateRanges) 
{ 
    rangelist::iterator it; 
    rangelist smallest_list_so_far; 

    for (it = candidateRanges.begin(); it != candidateRanges.end(); ++it) { 

     // if this candidate range overlaps the beginning of the target range 
     if (it->first <= targetRange.first && it->second >= targetRange.first) { 

      // if this candidate range also overlaps the end of the target range 
      if (it->second >= targetRange.second) { 

       // done with this level - return a list of ranges consisting only of 
       // this single candidate range 
       return { *it }; 
      } 
      else { 
       // prepare new version of targetRange that excludes the subrange 
       // overlapped by the present range 
       range newTargetRange = { it->second + 1, targetRange.second }; 

       // prepare new version of candidateRanges that excludes the present range 
       // from the list of ranges 
       rangelist newCandidateRanges; 
       rangelist::iterator it2; 
       // copy all ranges up to but not including the present range 
       for (it2 = candidateRanges.begin(); it2 != it; ++it2) { 
        newCandidateRanges.push_back (*it2); 
       } 
       // skip the present range 
       it2++; 
       // copy the remainder of ranges in the list 
       for (; it2 != candidateRanges.end(); ++it2) { 
         newCandidateRanges.push_back (*it2); 
       } 

       // recursive call to find the smallest list of ranges that cover the remainder 
       // of the target range not covered by the present range 
       rangelist subList = findRanges (newTargetRange, newCandidateRanges); 

       if (subList.size() == 0) { 
        // no solution includes the present range 
        continue; 
       } 
       else if (smallest_list_so_far.size() == 0 ||    // - first subList that covers the remainder of the target range 
         subList.size() < smallest_list_so_far.size()) // - this subList is smaller than all previous ones checked 
       { 
        // add the present range to the subList, which represents a solution 
        // (though possibly not optimal yet) at the present level of recursion 
        subList.push_back (*it); 
        smallest_list_so_far = subList; 
       } 
      } 
     } 
    } 
    return smallest_list_so_far; 
} 

// print list of ranges 
void print (rangelist list) 
{ 
    rangelist::reverse_iterator rit; 
    std::cout << "{ "; 
    for (rit = list.rbegin(); rit != list.rend(); ++rit) { 
     std::cout << "(" << rit->first << ", " << rit->second << ") "; 
    } 
    std::cout << "}" << std::endl; 
} 
1

好吧,尝试了一堆不同的东西,在这里的是我的解决方案。它运行在O(nlogn)时间,并且不需要使用间隔树(尽管如果我能够记住如何实现一个采访的话,我可能会使用它,但是我认为如果不提供任何实际的效益)。

该算法的瓶颈在于排序。每个项目只触摸一次,但它只适用于一个排序的数组,所以这是我们做的第一件事。因此时间复杂度较高。因为它修改了原始数组,所以它具有空间复杂性,但如果我们不允许修改原始数组,我们可以复制它,并保持算法的其余部分相同,从而使空间复杂性O(n)。代替`(9,12)`中的示例

import java.util.*; 

class SmallestRangingSet { 
    static class Interval implements Comparable<Interval>{ 
     Integer min; 
     Integer max; 
     public Interval(int min, int max) { 
      this.min = min; 
      this.max = max; 
     } 

     boolean intersects(int num) { 
      return (min <= num && max >= num); 
     } 

     //Overrides the compareTo method so it will be sorted 
     //in order relative to the min value 
     @Override 
     public int compareTo(Interval obj) { 
      if (min > obj.min) return 1; 
      else if (min < obj.min) return -1; 
      else return 0; 
     } 
    } 

    public static Set<Interval> smallestIntervalSet(Interval[] set, Interval target) { 
     //Bottleneck is here. The array is sorted, giving this algorithm O(nlogn) time 
     Arrays.sort(set); 

     //Create a set to store our ranges in 
     Set<Interval> smallSet = new HashSet<Interval>(); 
     //Create a variable to keep track of the most optimal range, relative 
     //to the range before it, at all times. 
     Interval bestOfCurr = null; 
     //Keep track of the specific number that any given range will need to 
     //intersect with. Initialize it to the target-min-value. 
     int currBestNum = target.min; 
     //Go through each element in our sorted array. 
     for (int i = 0; i < set.length; i++) { 
      Interval currInterval = set[i]; 
      //If we have already passed our target max, break. 
      if (currBestNum >= target.max) 
       break; 
      //Otherwise, if the current interval intersects with 
      //our currBestNum 
      if (currInterval.intersects(currBestNum)) { 
       //If the current interval, which intersects currBestNum 
       //has a greater max, then our current bestOfCurr 
       //Update bestOfCurr to be equal to currInterval. 
       if (bestOfCurr == null || currInterval.max >= bestOfCurr.max) { 
        bestOfCurr = currInterval; 
       } 
      } 
      //If our range does not intersect, we can assume that the most recently 
      //updated bestOfCurr is probably the most optimal new range to add to 
      //our set. However, if bestOfCurr is null, it means it was never updated, 
      //because there is a gap somewhere when trying to fill our target range. 
      //So we must check for null first. 
      else if (bestOfCurr != null) { 
       //If it's not null, add bestOfCurr to our set 
       smallSet.add(bestOfCurr); 
       //Update currBestNum to look for intervals that 
       //intersect with bestOfCurr.max 
       currBestNum = bestOfCurr.max; 
       //This line is here because without it, it actually skips over 
       //the next Interval, which is problematic if your sorted array 
       //has two optimal Intervals next to eachother. 
       i--; 
       //set bestOfCurr to null, so that it won't run 
       //this section of code twice on the same Interval. 
       bestOfCurr = null; 
      } 

     } 

     //Now we should just make sure that we have in fact covered the entire 
     //target range. If we haven't, then we are going to return an empty list. 
     if (currBestNum < target.max) 
      smallSet.clear(); 
     return smallSet; 
    } 

    public static void main(String[] args) { 
     //{(1, 4), (30, 40), (20, 91) ,(8, 10), (6, 7), (3, 9), (9, 12), (11, 14)} 
     Interval[] interv = { 
       new Interval(1, 4), 
       new Interval(30, 40), 
       new Interval(20, 91), 
       new Interval(8, 10), 
       new Interval(6, 7), 
       new Interval(3, 9), 
       new Interval(9, 12), 
       new Interval(11, 14) 
     }; 
     Set<Interval> newSet = smallestIntervalSet(interv, new Interval(3,14)); 
     for (Interval intrv : newSet) { 
      System.out.print("(" + intrv.min + ", " + intrv.max + ") "); 
     } 

    } 
} 

输出

(3, 9) (9, 12) (11, 14)