2010-03-17 109 views
29

我最近遇到一个关于字符串的有趣问题。假设你给出以下几点:

如何找到包含给定字符串中所有字符的最小子字符串?

​​

因此,上面给出的,我怎么能对发现包含字符串2中的所有字符字符串1的最小的子进场?

+0

string2应该是rist还是tisr?在这种情况下,输出是不是“st str”? – 2010-03-17 03:07:25

+2

@kennygrimm,string2作为“tist”给出,它应该是。如果你说“rist”或“tisr”比你的回答“st str”不包含“i”。 – 2010-03-17 03:13:23

+0

哦,我明白了,我认为'r'是错的,因为它不在string2中,但是你说它必须包含所有的string2,但也可以包含其他字母...... – 2010-03-17 03:18:45

回答

30

您可以O(N+M)时间和O(1)空间,N是字符的第一个字符串中的数字和M在第二字符数做一个直方图横扫。

它的工作原理是这样的:

  • 使第二个字符串的字符的直方图(键操作hist2[ s2[i] ]++)。
  • 制作第一个字符串字符的累积直方图,直到该直方图包含第二个字符串的直方图包含的每个字符(我将称之为“直方图条件”)。
  • 然后向前移动第一个字符串,从直方图中减去,直到它不符合直方图条件。将第一个字符串的位(在最后一步之前)标记为您的暂定子字符串。
  • 将子串的前端再次向前移动,直到再次遇到直方图条件。向前移动结尾直到再次失败。如果这是比第一个更短的子字符串,则将其标记为您的暂定子字符串。
  • 重复,直到你已经通过整个第一个字符串。
  • 标记的子串是你的答案。

指出,改变你直方图条件使用支票,你可以选择有相同的字符集作为第二个字符串,或至少多达每种类型的字符。 (它只是a[i]>0 && b[i]>0a[i]>=b[i]之间的差异。)

如果你保持跟踪哪个当你试图去满足它,并检查只是你递减东西条件不满足您可以加快直方图检查当你试图打破它。 (在最初的积累中,你计算了你已经满足的项目数,并且每当你添加一个新的角色时,增加一个新的角色,从false到true)

+0

+1:这比可读性要好得多蟒蛇。如果你包含证明/解释为什么它能起作用的话,那将会很好。 – 2010-03-17 14:43:23

+0

@Rex Kerr:我不明白这是怎么样的O(1)空间。如果所有角色都是唯一的(最坏的情况),你的直方图不会占用O(N + M)空间吗? – 2010-03-17 15:59:04

+1

O(M)空间可以完成(而不是O(N + M)),因为您并不需要关心不存在的字符是s2。我同意,空间使用是O(1)似乎不正确,似乎不符合描述。 – 2010-03-17 17:16:36

5

这是O(n)解决方案。基本思想很简单:对于每个起始索引,找到最小结束索引,使得子串包含所有必需的字母。诀窍是在函数的过程中最小结束索引增加,所以只需要一点数据结构支持,我们就会将每个字符看作最多两次。

在Python:

from collections import defaultdict 

def smallest(s1, s2): 
    assert s2 != '' 
    d = defaultdict(int) 
    nneg = [0] # number of negative entries in d 
    def incr(c): 
     d[c] += 1 
     if d[c] == 0: 
      nneg[0] -= 1 
    def decr(c): 
     if d[c] == 0: 
      nneg[0] += 1 
     d[c] -= 1 
    for c in s2: 
     decr(c) 
    minlen = len(s1) + 1 
    j = 0 
    for i in xrange(len(s1)): 
     while nneg[0] > 0: 
      if j >= len(s1): 
       return minlen 
      incr(s1[j]) 
      j += 1 
     minlen = min(minlen, j - i) 
     decr(s1[i]) 
    return minlen 
+0

@algorithmist,我没有在Python中工作,但是我可以得到那个... while循环doesn; t似乎O( N)。你可以告诉你的方法在问题中给出的例子,将不胜感激。 – 2010-03-17 03:29:41

+1

j只能增加len(s1)次,所以while循环确实会O(n)总共工作。 – user287792 2010-03-17 03:31:22

+0

@Rajendra:这个算法和我在文章中描述的完全一样,如果有帮助的话--i代表子字符串的尾部,'j'代表头部。 @algorithmist:很好的工作,代码总是比我想出的描述稍快一些! – 2010-03-17 03:37:13

0

编辑:显然有O n)算法(参考算法师的答案)。显然,这将会击败下面描述的[天真]基线!

太糟糕了我得走了...我有点怀疑我们可以得到O(n)。我明天登记看看胜利者;-)玩得开心!

暂定算法
的总体思路是:依次尝试,从str1中作为搜索开始发现STR2使用字符STR2的所有其他字母(在/双向)。通过保持“迄今为止最佳匹配的长度”值,我们可以在搜索超过此值时放弃搜索。其他启发式方法可能可以用于进一步放弃次优(迄今为止)的解决方案。 str1中起始字母顺序的选择很重要;建议先从str1的字母开始计数,然后再尝试其他字母,在随后的尝试中增加计数。

[loose pseudo-code] 
    - get count for each letter/character in str1 (number of As, Bs etc.) 
    - get count for each letter in str2 
    - minLen = length(str1) + 1 (the +1 indicates you're not sure all chars of 
           str2 are in str1) 
    - Starting with the letter from string2 which is found the least in string1, 
    look for other letters of Str2, in either direction of str1, until you've 
    found them all (or not, at which case response = impossible => done!). 
    set x = length(corresponding substring of str1). 
- if (x < minLen), 
     set minlen = x, 
     also memorize the start/len of the str1 substring. 
- continue trying with other letters of str1 (going the up the frequency 
    list in str1), but abort search as soon as length(substring of strl) 
    reaches or exceed minLen. 
    We can find a few other heuristics that would allow aborting a 
    particular search, based on [pre-calculated ?] distance between a given 
    letter in str1 and some (all?) of the letters in str2. 
- the overall search terminates when minLen = length(str2) or when 
    we've used all letters of str1 (which match one letter of str2) 
    as a starting point for the search 
35

看到更多的细节,包括工作代码,检查我的博客文章在:

http://www.leetcode.com/2010/11/finding-minimum-window-in-s-which.html

为了帮助说明这种方法,我用一个例子:字符串1 = "acbbaca"和字符串2 = "aba"。在这里,我们还使用术语“窗口”,这意味着来自string1的连续字符块(可以与术语子字符串交换)。

alt text

ⅰ)字符串1 = “acbbaca” 和字符串2 = “ABA”。

alt text

ii)中找到的第一个最小窗口。 请注意,我们不能提前开始 指针,因为hasFound ['a'] == needToFind ['a'] == 2.推进 意味着打破约束。

alt text

iii)中发现的第二窗口。开始 指针仍然指向第一个 元素'a'。 hasFound ['a'](3)是 大于needToFind ['a'](2)。我们 减少hasFound ['a']一个和 提前开始指针的权利。

alt text

四)我们跳过 'C',因为它没有发现包含字符串 。开始指针现在指向'b'。已发现['b'](2)大于 needToFind ['b'](1)。我们将 hasFound ['b']减1,并提前开始 指向右侧。

alt text

V)开始指针现在指向 下一个 'B'。 hasFound ['b'](1)等于 to needToFind ['b'](1)。我们立即停止 ,这是我们最新发现的 找到的最小窗口。

这个想法主要基于两个指针(窗口的开始和结束位置)和两个表(needToFind和hasFound)在遍历string1时的帮助。 needToFind在string2中存储一个字符的总数,hasFound存储到目前为止所有字符的总数。我们还使用一个count变量来存储到目前为止符合的字符串2中的总字符(不包括hasFound [x]超过needToFind [x]的字符)。当count等于string2的长度时,我们知道找到了一个有效的窗口。每次我们提前结束指针(指向一个元素x)时,我们将hasFound [x]加1。如果hasFound [x]小于或等于needToFind [x],我们也增加1。为什么?当满足约束条件(即count等于string2的大小)时,我们立即将begin指针提前尽可能靠右,同时保持约束。

我们如何检查它是否维持约束?假设开始点指向一个元素x,我们检查hasFound [x]是否大于needToFind [x]。如果是这样,我们可以将hasFound [x]递减1,并且在不破坏约束的情况下推进begin指针。另一方面,如果不是,我们会立即停止,因为前进的开始指针会中断窗口约束。

最后,我们检查最小窗口长度是否小于当前最小值。如果找到新的最小值,更新当前最小值。

本质上,算法找到满足约束的第一个窗口,然后继续保持整个约束。

1

请看看这个问题,以及:

//----------------------------------------------------------------------- 

bool IsInSet(char ch, char* cSet) 
{ 
    char* cSetptr = cSet; 
    int index = 0; 
    while (*(cSet+ index) != '\0') 
    { 
     if(ch == *(cSet+ index)) 
     { 
      return true;    
     } 
     ++index; 
    } 
    return false; 
} 

void removeChar(char ch, char* cSet) 
{ 
    bool bShift = false; 
    int index = 0; 
    while (*(cSet + index) != '\0') 
    { 
     if((ch == *(cSet + index)) || bShift) 
     { 
      *(cSet + index) = *(cSet + index + 1); 
      bShift = true; 
     } 
     ++index; 
    } 
} 
typedef struct subStr 
{ 
    short iStart; 
    short iEnd; 
    short szStr; 
}ss; 

char* subStringSmallest(char* testStr, char* cSet) 
{ 
    char* subString = NULL; 
    int iSzSet = strlen(cSet) + 1; 
    int iSzString = strlen(testStr)+ 1; 
    char* cSetBackUp = new char[iSzSet]; 
    memcpy((void*)cSetBackUp, (void*)cSet, iSzSet); 

    int iStartIndx = -1;  
    int iEndIndx = -1; 
    int iIndexStartNext = -1; 

    std::vector<ss> subStrVec; 
    int index = 0; 

    while(*(testStr+index) != '\0') 
    { 
     if (IsInSet(*(testStr+index), cSetBackUp)) 
     { 
      removeChar(*(testStr+index), cSetBackUp); 

      if(iStartIndx < 0) 
      { 
       iStartIndx = index; 
      } 
      else if(iIndexStartNext < 0) 
       iIndexStartNext = index; 
      else 
       ; 

      if (strlen(cSetBackUp) == 0) 
      { 
       iEndIndx = index; 
       if(iIndexStartNext == -1) 
        break; 
       else 
       { 
        index = iIndexStartNext; 
        ss stemp = {iStartIndx, iEndIndx, (iEndIndx-iStartIndx + 1)}; 
        subStrVec.push_back(stemp); 
        iStartIndx = iEndIndx = iIndexStartNext = -1; 
        memcpy((void*)cSetBackUp, (void*)cSet, iSzSet); 
        continue; 
       } 
      } 
     } 
     else 
     { 
      if (IsInSet(*(testStr+index), cSet)) 
      { 
       if(iIndexStartNext < 0) 
        iIndexStartNext = index; 
      } 
     } 

     ++index; 
    } 


    int indexSmallest = 0; 
    for(int indexVec = 0; indexVec < subStrVec.size(); ++indexVec) 
    { 
     if(subStrVec[indexSmallest].szStr > subStrVec[indexVec].szStr) 
      indexSmallest = indexVec;  
    } 

    subString = new char[(subStrVec[indexSmallest].szStr) + 1]; 
    memcpy((void*)subString, (void*)(testStr+ subStrVec[indexSmallest].iStart), subStrVec[indexSmallest].szStr); 
    memset((void*)(subString + subStrVec[indexSmallest].szStr), 0, 1); 

    delete[] cSetBackUp; 
    return subString; 
} 
//-------------------------------------------------------------------- 
0
上面讨论的方法

Java代码:

private static Map<Character, Integer> frequency; 
private static Set<Character> charsCovered; 
private static Map<Character, Integer> encountered; 
/** 
* To set the first match index as an intial start point 
*/ 
private static boolean hasStarted = false; 
private static int currentStartIndex = 0; 
private static int finalStartIndex = 0; 
private static int finalEndIndex = 0; 
private static int minLen = Integer.MAX_VALUE; 
private static int currentLen = 0; 
/** 
* Whether we have already found the match and now looking for other 
* alternatives. 
*/ 
private static boolean isFound = false; 
private static char currentChar; 

public static String findSmallestSubStringWithAllChars(String big, String small) { 

    if (null == big || null == small || big.isEmpty() || small.isEmpty()) { 
     return null; 
    } 

    frequency = new HashMap<Character, Integer>(); 
    instantiateFrequencyMap(small); 
    charsCovered = new HashSet<Character>(); 
    int charsToBeCovered = frequency.size(); 
    encountered = new HashMap<Character, Integer>(); 

    for (int i = 0; i < big.length(); i++) { 
     currentChar = big.charAt(i); 
     if (frequency.containsKey(currentChar) && !isFound) { 
      if (!hasStarted && !isFound) { 
       hasStarted = true; 
       currentStartIndex = i; 
      } 
      updateEncounteredMapAndCharsCoveredSet(currentChar); 
      if (charsCovered.size() == charsToBeCovered) { 
       currentLen = i - currentStartIndex; 
       isFound = true; 
       updateMinLength(i); 
      } 
     } else if (frequency.containsKey(currentChar) && isFound) { 
      updateEncounteredMapAndCharsCoveredSet(currentChar); 
      if (currentChar == big.charAt(currentStartIndex)) { 
       encountered.put(currentChar, encountered.get(currentChar) - 1); 
       currentStartIndex++; 
       while (currentStartIndex < i) { 
        if (encountered.containsKey(big.charAt(currentStartIndex)) 
          && encountered.get(big.charAt(currentStartIndex)) > frequency.get(big 
            .charAt(currentStartIndex))) { 
         encountered.put(big.charAt(currentStartIndex), 
           encountered.get(big.charAt(currentStartIndex)) - 1); 
        } else if (encountered.containsKey(big.charAt(currentStartIndex))) { 
         break; 
        } 
        currentStartIndex++; 
       } 
      } 
      currentLen = i - currentStartIndex; 
      updateMinLength(i); 
     } 
    } 
    System.out.println("start: " + finalStartIndex + " finalEnd : " + finalEndIndex); 
    return big.substring(finalStartIndex, finalEndIndex + 1); 
} 

private static void updateMinLength(int index) { 
    if (minLen > currentLen) { 
     minLen = currentLen; 
     finalStartIndex = currentStartIndex; 
     finalEndIndex = index; 
    } 

} 

private static void updateEncounteredMapAndCharsCoveredSet(Character currentChar) { 
    if (encountered.containsKey(currentChar)) { 
     encountered.put(currentChar, encountered.get(currentChar) + 1); 
    } else { 
     encountered.put(currentChar, 1); 
    } 

    if (encountered.get(currentChar) >= frequency.get(currentChar)) { 
     charsCovered.add(currentChar); 
    } 
} 

private static void instantiateFrequencyMap(String str) { 

    for (char c : str.toCharArray()) { 
     if (frequency.containsKey(c)) { 
      frequency.put(c, frequency.get(c) + 1); 
     } else { 
      frequency.put(c, 1); 
     } 
    } 

} 

public static void main(String[] args) { 

    String big = "this is a test string"; 
    String small = "tist"; 
    System.out.println("len: " + big.length()); 
    System.out.println(findSmallestSubStringWithAllChars(big, small)); 
} 
0

这里是Java实现

public static String shortestSubstrContainingAllChars(String input, String target) { 
    int needToFind[] = new int[256]; 
    int hasFound[] = new int[256]; 
    int totalCharCount = 0; 
    String result = null; 

    char[] targetCharArray = target.toCharArray(); 
    for (int i = 0; i < targetCharArray.length; i++) { 
     needToFind[targetCharArray[i]]++;   
    } 

    char[] inputCharArray = input.toCharArray(); 
    for (int begin = 0, end = 0; end < inputCharArray.length; end++) { 

     if (needToFind[inputCharArray[end]] == 0) { 
      continue; 
     } 

     hasFound[inputCharArray[end]]++; 
     if (hasFound[inputCharArray[end]] <= needToFind[inputCharArray[end]]) { 
      totalCharCount ++; 
     } 
     if (totalCharCount == target.length()) { 
      while (needToFind[inputCharArray[begin]] == 0 
        || hasFound[inputCharArray[begin]] > needToFind[inputCharArray[begin]]) { 

       if (hasFound[inputCharArray[begin]] > needToFind[inputCharArray[begin]]) { 
        hasFound[inputCharArray[begin]]--; 
       } 
       begin++; 
      } 

      String substring = input.substring(begin, end + 1); 
      if (result == null || result.length() > substring.length()) { 
       result = substring; 
      } 
     } 
    } 
    return result; 
} 

这里是Junit的测试

@Test 
public void shortestSubstringContainingAllCharsTest() { 
    String result = StringUtil.shortestSubstrContainingAllChars("acbbaca", "aba"); 
    assertThat(result, equalTo("baca")); 

    result = StringUtil.shortestSubstrContainingAllChars("acbbADOBECODEBANCaca", "ABC"); 
    assertThat(result, equalTo("BANC")); 

    result = StringUtil.shortestSubstrContainingAllChars("this is a test string", "tist"); 
    assertThat(result, equalTo("t stri")); 
} 
2

我收到了同样的面试问题。我是一名C++候选人,但我有能力在JAVA中相对较快地进行编码。

爪哇 [礼貌:Sumod Mathilakath]

import java.io.*; 
import java.util.*; 

class UserMainCode 
{ 


    public String GetSubString(String input1,String input2){ 
     // Write code here... 
     return find(input1, input2); 
    } 
    private static boolean containsPatternChar(int[] sCount, int[] pCount) { 
     for(int i=0;i<256;i++) { 
      if(pCount[i]>sCount[i]) 
       return false; 
     } 
     return true; 
    } 
    public static String find(String s, String p) { 
     if (p.length() > s.length()) 
      return null; 
     int[] pCount = new int[256]; 
     int[] sCount = new int[256]; 
     // Time: O(p.lenght) 
     for(int i=0;i<p.length();i++) { 
      pCount[(int)(p.charAt(i))]++; 
      sCount[(int)(s.charAt(i))]++; 
     } 
     int i = 0, j = p.length(), min = Integer.MAX_VALUE; 
     String res = null; 
     // Time: O(s.lenght) 
     while (j < s.length()) { 
      if (containsPatternChar(sCount, pCount)) { 
       if ((j - i) < min) { 
        min = j - i; 
        res = s.substring(i, j); 
        // This is the smallest possible substring. 
        if(min==p.length()) 
         break; 
        // Reduce the window size. 
        sCount[(int)(s.charAt(i))]--; 
        i++; 
       } 
      } else { 
       sCount[(int)(s.charAt(j))]++; 
       // Increase the window size. 
       j++; 
      } 
     } 
     System.out.println(res); 
     return res; 
    } 
} 

C++ [礼貌:sundeepblue]

#include <iostream> 
#include <vector> 
#include <string> 
#include <climits> 
using namespace std; 
string find_minimum_window(string s, string t) { 
    if(s.empty() || t.empty()) return; 

    int ns = s.size(), nt = t.size(); 
    vector<int> total(256, 0); 
    vector<int> sofar(256, 0); 
    for(int i=0; i<nt; i++) 
     total[t[i]]++; 

    int L = 0, R; 
    int minL = 0;       //gist2 
    int count = 0; 
    int min_win_len = INT_MAX; 

    for(R=0; R<ns; R++) {     // gist0, a big for loop 
     if(total[s[R]] == 0) continue; 
     else sofar[s[R]]++; 

     if(sofar[s[R]] <= total[s[R]])  // gist1, <= not < 
      count++; 

     if(count == nt) {     // POS1 
      while(true) { 
       char c = s[L]; 
       if(total[c] == 0) { L++; } 
       else if(sofar[c] > total[c]) { 
        sofar[c]--; 
        L++; 
       } 
       else break; 
      } 
      if(R - L + 1 < min_win_len) { // this judge should be inside POS1 
       min_win_len = R - L + 1; 
       minL = L; 
      } 
     } 
    } 
    string res; 
    if(count == nt)       // gist3, cannot forget this. 
     res = s.substr(minL, min_win_len); // gist4, start from "minL" not "L" 
    return res; 
} 
int main() { 
    string s = "abdccdedca"; 
    cout << find_minimum_window(s, "acd"); 
} 

二郎 [礼貌:wardbekker]

-module(leetcode). 

-export([min_window/0]). 

%% Given a string S and a string T, find the minimum window in S which will contain all the characters in T in complexity O(n). 

%% For example, 
%% S = "ADOBECODEBANC" 
%% T = "ABC" 
%% Minimum window is "BANC". 

%% Note: 
%% If there is no such window in S that covers all characters in T, return the emtpy string "". 
%% If there are multiple such windows, you are guaranteed that there will always be only one unique minimum window in S. 



min_window() -> 
    "eca" = min_window("cabeca", "cae"), 
    "eca" = min_window("cfabeca", "cae"), 
    "aec" = min_window("cabefgecdaecf", "cae"), 
    "cwae" = min_window("cabwefgewcwaefcf", "cae"), 
    "BANC" = min_window("ADOBECODEBANC", "ABC"), 
    ok. 

min_window(T, S) -> 
    min_window(T, S, []). 

min_window([], _T, MinWindow) -> 
    MinWindow; 
min_window([H | Rest], T, MinWindow) -> 
    NewMinWindow = case lists:member(H, T) of 
         true -> 
          MinWindowFound = fullfill_window(Rest, lists:delete(H, T), [H]), 
          case length(MinWindow) == 0 orelse (length(MinWindow) > length(MinWindowFound) 
           andalso length(MinWindowFound) > 0) of 
           true -> 
            MinWindowFound; 
           false -> 
            MinWindow 
          end; 
         false -> 
          MinWindow 
        end, 
    min_window(Rest, T, NewMinWindow). 

fullfill_window(_, [], Acc) -> 
    %% window completed 
    Acc; 
fullfill_window([], _T, _Acc) -> 
    %% no window found 
    ""; 
fullfill_window([H | Rest], T, Acc) -> 
    %% completing window 
    case lists:member(H, T) of 
     true -> 
      fullfill_window(Rest, lists:delete(H, T), AcC++ [H]); 
     false -> 
      fullfill_window(Rest, T, AcC++ [H]) 
    end. 

REF:

0
//[ShortestSubstring.java][1] 

public class ShortestSubstring { 

    public static void main(String[] args) { 
     String input1 = "My name is Fran"; 
     String input2 = "rim"; 
     System.out.println(getShortestSubstring(input1, input2)); 
    } 

    private static String getShortestSubstring(String mainString, String toBeSearched) { 

     int mainStringLength = mainString.length(); 
     int toBeSearchedLength = toBeSearched.length(); 

     if (toBeSearchedLength > mainStringLength) { 
      throw new IllegalArgumentException("search string cannot be larger than main string"); 
     } 

     for (int j = 0; j < mainStringLength; j++) { 
      for (int i = 0; i <= mainStringLength - toBeSearchedLength; i++) { 
       String substring = mainString.substring(i, i + toBeSearchedLength); 
       if (checkIfMatchFound(substring, toBeSearched)) { 
        return substring; 
       } 
      } 
      toBeSearchedLength++; 
     } 

     return null; 
    } 

    private static boolean checkIfMatchFound(String substring, String toBeSearched) { 
     char[] charArraySubstring = substring.toCharArray(); 
     char[] charArrayToBeSearched = toBeSearched.toCharArray(); 
     int count = 0; 

     for (int i = 0; i < charArraySubstring.length; i++) { 
      for (int j = 0; j < charArrayToBeSearched.length; j++) { 
       if (String.valueOf(charArraySubstring[i]).equalsIgnoreCase(String.valueOf(charArrayToBeSearched[j]))) { 
        count++; 
       } 
      } 
     } 
     return count == charArrayToBeSearched.length; 
    } 
} 
+0

虽然此代码可能有助于解决问题,但提供 有关_why_和/或_how_的其他上下文,它将回答问题将显着提高其长期值 。请[编辑]你的答案,添加一些解释。 – 2016-08-02 12:38:59

0

这是使用素数,以避免一个循环,并用乘法代替它的方法。可以做其他几个小的优化。

  1. 分配一个唯一的素数,以任何你想要查找的字符,并1到无趣的人物。

  2. 通过将质数乘以它应该具有的出现次数来查找匹配字符串的乘积。现在只有使用相同的素数因子才能找到该产品。

  3. 从开头搜索字符串,在移入正在运行的产品时乘以相应的素数。

  4. 如果数字大于正确的总和,请删除第一个字符,并将其质数除以运行的产品。

  5. 如果该数字小于正确的总和,请包括下一个字符并将其乘以您的正在运行的产品。

  6. 如果数字与您找到的匹配的正确总数相同,请将开头和结尾滑动到下一个字符,并继续搜索其他匹配项。

  7. 决定哪个比赛是最短的。

Gist

charcount = { 'a': 3, 'b' : 1 }; 
str = "kjhdfsbabasdadaaaaasdkaaajbajerhhayeom" 

def find (c, s): 
    Ns = len (s) 

    C = list (c.keys()) 
    D = list (c.values()) 

    # prime numbers assigned to the first 25 chars 
    prmsi = [ 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89 , 97] 

    # primes used in the key, all other set to 1 
    prms = [] 
    Cord = [ord(c) - ord('a') for c in C] 

    for e,p in enumerate(prmsi): 
    if e in Cord: 
     prms.append (p) 
    else: 
     prms.append (1) 

    # Product of match 
    T = 1 
    for c,d in zip(C,D): 
    p = prms[ord (c) - ord('a')] 
    T *= p**d 

    print ("T=", T) 

    t = 1 # product of current string 
    f = 0 
    i = 0 

    matches = [] 
    mi = 0 
    mn = Ns 
    mm = 0 

    while i < Ns: 
    k = prms[ord(s[i]) - ord ('a')] 
    t *= k 

    print ("testing:", s[f:i+1]) 

    if (t > T): 
     # included too many chars: move start 
     t /= prms[ord(s[f]) - ord('a')] # remove first char, usually division by 1 
     f += 1 # increment start position 
     t /= k # will be retested, could be replaced with bool 

    elif t == T: 
     # found match 
     print ("FOUND match:", s[f:i+1]) 
     matches.append (s[f:i+1]) 

     if (i - f) < mn: 
     mm = mi 
     mn = i - f 

     mi += 1 

     t /= prms[ord(s[f]) - ord('a')] # remove first matching char 

     # look for next match 
     i += 1 
     f += 1 

    else: 
     # no match yet, keep searching 
     i += 1 

    return (mm, matches) 


print (find (charcount, str)) 

(注:这个答案最初发布到一个重复的问题,原来答案是现已删除)

0
def minimum_window(s, t, min_length = 100000): 
    d = {} 
    for x in t: 
     if x in d: 
      d[x]+= 1 
     else: 
      d[x] = 1 

    tot = sum([y for x,y in d.iteritems()]) 
    l = [] 
    ind = 0 
    for i,x in enumerate(s): 
     if ind == 1: 
      l = l + [x] 
     if x in d: 
      tot-=1 
      if not l: 
       ind = 1 
       l = [x] 

     if tot == 0: 
      if len(l)<min_length: 
       min_length = len(l) 
       min_length = minimum_window(s[i+1:], t, min_length) 

return min_length 

l_s = "ADOBECODEBANC" 
t_s = "ABC" 

min_length = minimum_window(l_s, t_s) 

if min_length == 100000: 
     print "Not found" 
else: 
     print min_length 
0

C#实现:

public static Tuple<int, int> FindMinSubstringWindow(string input, string pattern) 
{ 
    Tuple<int, int> windowCoords = new Tuple<int, int>(0, input.Length - 1); 
    int[] patternHist = new int[256]; 
    for (int i = 0; i < pattern.Length; i++) 
    { 
     patternHist[pattern[i]]++; 
    } 
    int[] inputHist = new int[256]; 
    int minWindowLength = int.MaxValue; 
    int count = 0; 
    for (int begin = 0, end = 0; end < input.Length; end++) 
    { 
     // Skip what's not in pattern. 
     if (patternHist[input[end]] == 0) 
     { 
      continue; 
     } 
     inputHist[input[end]]++; 
     // Count letters that are in pattern. 
     if (inputHist[input[end]] <= patternHist[input[end]]) 
     { 
      count++; 
     } 
     // Window found. 
     if (count == pattern.Length) 
     { 
      // Remove extra instances of letters from pattern 
      // or just letters that aren't part of the pattern 
      // from the beginning. 
      while (patternHist[input[begin]] == 0 || 
        inputHist[input[begin]] > patternHist[input[begin]]) 
      { 
       if (inputHist[input[begin]] > patternHist[input[begin]]) 
       { 
        inputHist[input[begin]]--; 
       } 
       begin++; 
      } 
      // Current window found. 
      int windowLength = end - begin + 1; 
      if (windowLength < minWindowLength) 
      { 
       windowCoords = new Tuple<int, int>(begin, end); 
       minWindowLength = windowLength; 
      } 
     } 
    } 
    if (count == pattern.Length) 
    { 
     return windowCoords; 
    } 
    return null; 
} 
相关问题