2012-04-08 50 views
1

我需要一种算法,通过删除字符(不重新排列)来查找字符串中最大的唯一字符串(无重复字符)。查找词典中最大的唯一字符串

如果满足这两个条件,则字符串A大于字符串B.

1. Has more characters than String B 
    Or 
2. Is lexicographically greater than String B if equal length 

例如,如果输入字符串为dedede,则可能的唯一组合是d,和ë
在这些组合中,最大的一个是因此ED,因为它具有比dË多个字符和按字典顺序大于

该算法必须比生成所有可能的唯一字符串并将它们排序以找到最大字符串更高效。

注意:这不是一项家庭作业。

+3

正在做作业吗?你有什么尝试? – twain249 2012-04-08 20:34:18

+1

我相信这应该是可能的在O(n)时间,在一次通过。空间需求只是一个指标数组,每个可能的字符值都有一个元素。 – 2012-04-08 20:36:39

+0

这两个规则是AND还是OR?对于(A> B),这两者必须是真实的或者可以是真实的? – RBarryYoung 2012-04-08 20:39:56

回答

1

让我以一种我认为更清晰的方式说明排序规则。

字符串A是不是字符串B时,如果

- A is longer than B 
    OR 
- A and B are the same length and A is lexicographically greater than B 

如果我的规则重述是正确的话,我相信我有一个运行在O(N^2)时间和O(n)的空间解决方案。我的解决方案是基于以下观察结果的贪婪算法:在最长的有效子序列中有多少个字符与输入字符串中有唯一字符一样多。我在Go中写了这个,希望这些评论足以描述算法。

func findIt(str string) string { 
    // exc keeps track of characters that we cannot use because they have 
    // already been used in an earlier part of the subsequence 
    exc := make(map[byte]bool) 

    // ret is where we will store the characters of the final solution as we 
    // find them 
    var ret []byte 

    for len(str) > 0 { 
    // inc keeps track of unique characters as we scan from right to left so 
    // that we don't take a character until we know that we can still make the 
    // longest possible subsequence. 
    inc := make(map[byte]bool, len(str)) 

    fmt.Printf("-%s\n", str) 
    // best is the largest character we have found that can also get us the 
    // longest possible subsequence. 
    var best byte 

    // best_pos is the lowest index that we were able to find best at, we 
    // always want the lowest index so that we keep as many options open to us 
    // later if we take this character. 
    best_pos := -1 

    // Scan through the input string from right to left 
    for i := len(str) - 1; i >= 0; i-- { 
     // Ignore characters we've already used 
     if _, ok := exc[str[i]]; ok { continue } 

     if _, ok := inc[str[i]]; !ok { 
     // If we haven't seen this character already then it means that we can 
     // make a longer subsequence by including it, so it must be our best 
     // option so far 
     inc[str[i]] = true 
     best = str[i] 
     best_pos = i 
     } else { 
     // If we've already seen this character it might still be our best 
     // option if it is a lexicographically larger or equal to our current 
     // best. If it is equal we want it because it is at a lower index, 
     // which keeps more options open in the future. 
     if str[i] >= best { 
      best = str[i] 
      best_pos = i 
     } 
     } 
    } 


    if best_pos == -1 { 
     // If we didn't find any valid characters on this pass then we are done 
     break 
    } else { 
     // include our best character in our solution, and exclude it for 
     // consideration in any future passes. 
     ret = append(ret, best) 
     exc[best] = true 

     // run the same algorithm again on the substring that is to the right of 
     // best_pos 
     str = str[best_pos+1:] 
    } 
    } 
    return string(ret) 
} 

我相当肯定,你可以在O做到这一点(n)的时间,但我不知道我的解决方案,所以我公布这一个代替。

+0

对不起,混淆的条件应该是AND。这个问题已被编辑。 – Draco 2012-04-09 00:59:03

+0

那么你不能有更多的人物和相等的长度,所以我们都说同样的事情。 – 2012-04-09 01:11:51

2

这个怎么样

string getLargest(string s) 
{ 
     int largerest_char_pos=0; 
     string result=""; 
     if(s.length() == 1) return s; 
     for(int i=0;i<s.length();) 
     { 
      p=i; 
      for(int j=i+1;j<s.length();j++) 
      { 
       if(s[largerest_char_pos]< s[j]) largerest_char_pos =j; 
      } 
      res+=s[largerest_char_pos]; 
      i=largerest_char_pos+1; 
     } 
     return result;  
    } 

这一小段代码片段只是给你的lexicigraphically较大的字符串。如果你不想重复,你可以跟踪已经添加的字符。

相关问题