给定一个字符串，找到其中的第一个非重复字符并返回其索引。如果它不存在，则返回-1。字符串中的第一个唯一字符

first_unique('leetcode') # 0 
first_unique('loveleetcode') # 2

我想出了以下解决方案。我怎样才能使它更有效的非常长的输入字符串？

def first_unique(self, s): 
    if s == '': 
     return -1 

    for item in s: 
     if s.count(item) == 1: 
      return s.index(item) 
      break 

    return -1

来源

2016-09-23 alex.l

你的版本是不坏的几例 “好” 的字符串...但使用count对于很长的“坏”字符串来说相当昂贵，我建议你缓存项目，例如：

def f1(s): 
    if s == '': 
     return -1 
    for item in s: 
     if s.count(item) == 1: 
      return s.index(item) 
      break 
    return -1 


def f2(s): 
    cache = set() 

    if s == '': 
     return -1 
    for item in s: 
     if item not in cache: 
      if s.count(item) == 1: 
       return s.index(item) 
      else: 
       cache.add(item) 

    return -1 

import timeit 
import random 
import string 

random.seed(1) 

K, N = 500, 100000 
data = ''.join(random.choice(string.ascii_uppercase + string.digits) 
       for _ in range(K)) 

print(
    timeit.timeit('f1(data)', setup='from __main__ import f1, data', number=N)) 
print(
    timeit.timeit('f2(data)', setup='from __main__ import f2, data', number=N))

我的笔记本电脑的结果是：

32.05926330029437 
4.267771588590406

使用缓存版本为您提供了8倍速加快因素VS至极你使用COUNT函数的所有时间。所以，我一般建议是...缓存尽可能是否有可能

编辑：

我已经添加了帕特里克·霍先生版本的标杆，它给了10.92784585620725

EDIT2：

我已经添加穆罕默德FURKAN德米雷尔版本的标杆，它给了10.325440507549331

EDIT3：

我已经添加WIM版本的标杆，它给了12.47985351744839

结论：

我会用我已经提出了最初使用一个简单的缓存版本，而不依赖于Python的计数器模块，它没有必要（在性能方面）

来源

2016-09-23 02:35:22 BPL

我的解决方案使用Counter形式collections模块。

from collections import Counter 
def first_unique(s): 
    c = Counter(s) 
    for i in range(len(s)): 
     if c[s[i]] == 1: 
      return i 
    return -1

来源

2016-09-23 02:23:59

thx，但如果我们不能使用模块呢？ –

你可以很容易地推出自己的计数器。 –

d = {} for char in s： if char in d： d [char] + = 1 else： d [char] = 1 –

倜傥版本：

from collections import Counter, OrderedDict 

class OrderedCounter(Counter, OrderedDict): 
    pass 

def first_unique(s): 
    counter = OrderedCounter(s) 
    try: 
     return counter.values().index(1) 
    except ValueError: 
     return -1

古怪的版本：

from collections import OrderedDict 

def first_unique(s): 
    nah = {0}-{0} # funny pair of eyes 
    yeah = OrderedDict() 
    for i,c in enumerate(s): 
     if c not in nah: 
      try: 
       del yeah[c] 
      except KeyError: 
       yeah[c] = i 
      else: 
       nah.add(c) 
    return next(yeah.itervalues(), -1)

来源

2016-09-23 02:27:47 wim

我会使用for循环来从开始和每个索引迭代String，我会检查String的其余部分是否使用Substring在当前索引处具有字符。

试试这个：

def firstUniqChar(word): 
for i in range(0,len(word)): ## iterate through the string 
    if not word[i] in word[i+1:len(word)]: ## check if the rest contains that char 
     index=i 
     break 
return index

希望它能帮助！

来源

2016-09-23 02:37:06 codemirel

这个解决方案的想法是使用一对defaultdicts。第一个包含每个字符的整数计数，第二个包含最新字符读取的索引位置。

读完所有字符后，列表理解用于查找所有仅出现一次的列表（）。这些字符的最小索引位置（在我们的其他defaultdict order中找到）将为我们提供非重复字符的第一个索引位置。

from collections import defaultdict 
# To Create random string: 
from string import ascii_lowercase 
from random import choice, randint, seed 

# Create a random sentence of 1000 words (1-8 characters each). 
seed(0) 
gibberish = ' '.join(''.join(choice(ascii_lowercase) 
          for _ in range(randint(1, 8))) 
        for _ in range(1000)) 
print(len(gibberish)) 
# Output: 5614 

# Solution. 
def first_unique(s): 
    dd = defaultdict(int) 
    order = defaultdict(int) 
    for n, c in enumerate(s): 
     dd[c] += 1 
     order[c] = n 
    result = [order[c] for c in dd if dd[c] == 1] 
    return min(result) if result else -1

时序

%timeit first_unique(gibberish) 
100 loops, best of 3: 2.13 ms per loop 

@wim solution: 
%timeit first_unique(gibberish) 
100 loops, best of 3: 5.06 ms per loop 

@PatrickHaugh solution (which is much easier to understand than mine): 
%timeit first_unique(gibberish) 
100 loops, best of 3: 4.2 ms per loop 

@BPL solution: 
%timeit f1(gibberish) 
10 loops, best of 3: 39.2 ms per loop 
%timeit f2(gibberish) 
1000 loops, best of 3: 890 µs per loop

使用的二十字（133个字）更短的一句话：

%timeit first_unique(gibberish) 
10000 loops, best of 3: 62.8 µs per loop 

@wim solution: 
%timeit first_unique(gibberish) 
10000 loops, best of 3: 169 µs per loop 

@PatrickHaugh solution: 
%timeit first_unique(gibberish) 
10000 loops, best of 3: 101 µs per loop 

@BPL solution: 
%timeit f1(gibberish) 
10000 loops, best of 3: 55.1 µs per loop 
%timeit f2(gibberish) 
10000 loops, best of 3: 31 µs per loop

测试用例。

s1 = 'leetcode' 
s2 = 'loveleetcode' 

>>> first_unique(s1) 
0 

>>> first_unique(s2) 
2

来源

2016-09-23 03:56:06 Alexander

字符串中的第一个唯一字符

回答

时序

测试用例。

相关问题