2013-02-15 131 views
2

我想在python中使用多线程技术在项目euler中解决Problem 8python循环多线程

查找1000位数字中连续五位数字的最大积。该号码可以找到here

我的方法是从原始列表中生成5个块,然后重复此过程5次,每个过程的起始索引右移一位。

这里是我的线程类

class pThread(threading.Thread): 
    def __init__(self, l): 
     threading.Thread.__init__(self) 
     self.l = l 
     self.p = 0 

    def run(self): 

     def greatest_product(l): 
     """ 
     Divide the list into chunks of 5 and find the greatest product 
     """ 
      def product(seq): 
       return reduce(lambda x,y : x*y, seq) 

      def chunk_product(l, n=5): 
       for i in range(0, len(l), n): 
        yield product(l[i:i+n]) 

      result = 0 
      for p in chunk_product(num): 
       result = result > p and result or p 

      return result 

     self.p = greatest_product(self.l) 

当我尝试创建5个线程,以覆盖在我原来的列表中的所有5位块,手动方法下面给出了正确的答案,与num作为个位数的号码清单,我从文字解析:

thread1 = pThread(num) 
del num[0] 
thread2 = pThread(num) 
del num[0] 
thread3 = pThread(num) 
del num[0] 
thread4 = pThread(num) 
del num[0] 
thread5 = pThread(num) 

thread1.start() 
thread2.start() 
thread3.start() 
thread4.start() 
thread5.start() 

thread1.join() 
thread2.join() 
thread3.join() 
thread4.join() 
thread5.join() 

def max(*args): 
    result = 0 
    for i in args: 
     result = i > result and i or result 
    return result 

print max(thread1.p, thread2.p, thread3.p, thread4.p, thread5.p) 

但是这并不能给出正确的结果:

threads = [] 
for i in range(0, 4): 
    tmp = num[:] 
    del tmp[0:i+1] 
    thread = pThread(tmp) 
    thread.start() 
    threads.append(thread) 

for i in range(0, 4): 
    threads[i].join() 

我做了什么错在这里?我对多线程很陌生,所以请温和。

+1

提示:重写代码,而无需使用德尔,这将是很容易理解为什么它没有工作,为什么你原来的代码也是不正确的。另外,考虑到GIL,这个代码不可能以比单线程版本更快的速度运行。 – 2013-02-15 01:37:10

+0

我已经重写了没有del的代码,是的,它打破了两个版本。我今天晚些时候再学习一遍。谢谢你的建议。 – 2013-02-15 01:47:37

+1

线程这是严重矫枉过正。你可以在1行中用'max(reduce(op.mul,n_list [i:i + 5])为1行在xrange(1000)中)' – wim 2013-02-15 02:08:30

回答

4

有3个问题:

  1. 第一个是“手动”方法没有没有给出正确的答案。只是发生问题的正确答案是在距列表开头的偏移量4处。您可以通过使用看到:

    你的“手册”的方法
    import operator as op 
    print max(reduce(op.mul, num[i:i+5]) for i in range(1000)) 
    for k in range(5): 
        print max(reduce(op.mul, num[i:i+5]) for i in range(k, 1000, 5)) 
    

    的一个问题是,线程共享num变量,每个人都有相同的列表。所以当你做del num[0]时,所有的threadX.l都会受到影响。你始终得到相同答案的原因是由于第二个问题。

  2. 线

    for p in chunk_product(num): 
    

    应该是:

    for p in chunk_product(l): 
    

    ,因为您要使用的功能greatest_product(l)的参数,而不是全局变量num

  3. 在第二种方法中,由于循环范围超过[0, 1, 2, 3],所以只产生4个线程。此外,您要删除值tmp[0:i]而不是tmp[0:i+1]。下面是代码:

    threads = [] 
    for i in range(5): 
        tmp = num[:] 
        del tmp[0:i] 
        thread = pThread(tmp) 
        thread.start() 
        threads.append(thread) 
    
    for i in range(5): 
        threads[i].join() 
    
    print len(threads), map(lambda th: th.p, threads) 
    print max(map(lambda th: th.p, threads)) 
    
+0

@JasonWhite这是预期的功能,因为每个线程以不同的偏移量在列表上循环。在你的例子中(2线程),线程1将计算[1,2] [3,4] [5,6] [7,8] [9,10]的产品,线程2将计算[2, 3] [4,5] [6,7] [8,9]。这是因为线程2具有self.l = [2,3,4,5,6,7,8,9,10]。我认为这是OP想要实现的。 – wasserfeder 2013-02-15 08:16:28

+0

啊我明白你在说什么了。删除了我以前的评论,因为我读错了; P – 2013-02-15 08:26:43

+0

谢谢你SOOOOOO很多。你碰巧知道如何奖励赏金?我真的想给你100代表你的答案。 – 2013-02-15 14:31:50

1

我对此进行了一次尝试,主要是为了获得一些练习多处理,并学习如何使用argparse。

这花了大约4-5演出的RAM,以防万一你的机器没有太多。

python euler.py -l 50000000 -n 100 -p 8 

Took 5.836833333969116 minutes 
The largest product of 100 consecutive numbers is: a very large number 

如果在命令行中输入python euler.py -h你:

usage: euler.py [-h] -l L [L ...] -n N [-p P] 

Calculates the product of consecutive numbers and return the largest product. 

optional arguments: 
    -h, --help show this help message and exit 
    -l L [L ...] A single number or list of numbers, where each # is seperated 
       by a space 
    -n N   A number that specifies how many consecutive numbers should be 
       multiplied together. 
    -p P   Number of processes to create. Optional, defaults to the # of 
       cores on the pc.   

,代码:

"""A multiprocess iplementation for calculation the maximum product of N consecutive 
numbers in a given range (list of numbers).""" 

import multiprocessing 
import math 
import time 
import operator 
from functools import reduce 
import argparse 

def euler8(alist,lenNums): 
    """Returns the largest product of N consecutive numbers in a given range""" 
    return max(reduce(operator.mul, alist[i:i+lenNums]) for i in range(len(alist))) 

def split_list_multi(listOfNumbers,numLength,threads): 
    """Split a list into N parts where N is the # of processes.""" 
    fullLength = len(listOfNumbers) 
    single = math.floor(fullLength/threads) 
    results = {} 
    counter = 0 
    while counter < threads: 
     if counter == (threads-1): 
      temp = listOfNumbers[single*counter::] 
      if counter == 0: 
       results[str(counter)] = listOfNumbers[single*counter::] 
      else: 
       prevListIndex = results[str(counter-1)][-int('{}'.format(numLength-1))::] 
       newlist = prevListIndex + temp 
       results[str(counter)] = newlist 
     else: 
      temp = listOfNumbers[single*counter:single*(counter+1)] 
      if counter == 0: 
       newlist = temp 
      else: 
       prevListIndex = results[str(counter-1)][-int('{}'.format(numLength-1))::] 
       newlist = prevListIndex + temp 
      results[str(counter)] = newlist 
     counter += 1 
    return results,threads 

def worker(listNumbers,number,output): 
    """A worker. Used to run seperate processes and put the results in the queue""" 
    result = euler8(listNumbers,number) 
    output.put(result) 

def main(listOfNums,lengthNumbers,numCores=multiprocessing.cpu_count()): 
    """Runs the module. 
    listOfNums must be a list of ints, or single int 
    lengthNumbers is N (an int) where N is the # of consecutive numbers to multiply together 
    numCores (an int) defaults to however many the cpu has, can specify a number if you choose.""" 

    if isinstance(listOfNums,list): 
     if len(listOfNums) == 1: 
      valuesToSplit = [i for i in range(int(listOfNums[0]))] 
     else: 
      valuesToSplit = [int(i) for i in listOfNums] 
    elif isinstance(listOfNums,int): 
     valuesToSplit = [i for i in range(listOfNums)] 
    else: 
     print('First arg must be a number or a list of numbers') 

    split = split_list_multi(valuesToSplit,lengthNumbers,numCores) 
    done_queue = multiprocessing.Queue() 
    jobs = [] 
    startTime = time.time() 

    for num in range(split[1]): 
     numChunks = split[0][str(num)] 
     thread = multiprocessing.Process(target=worker, args=(numChunks,lengthNumbers,done_queue)) 
     jobs.append(thread) 
     thread.start() 

    resultlist = [] 
    for i in range(split[1]): 
     resultlist.append(done_queue.get()) 

    for j in jobs: 
     j.join() 

    resultlist = max(resultlist) 
    endTime = time.time() 
    totalTime = (endTime-startTime)/60 
    print("Took {} minutes".format(totalTime)) 

    return print("The largest product of {} consecutive numbers is: {}".format(lengthNumbers, resultlist))    

if __name__ == '__main__': 
    #To call the module from the commandline with arguments 
    parser = argparse.ArgumentParser(description="""Calculates the product of consecutive numbers \ 
    and return the largest product.""") 
    parser.add_argument('-l', nargs='+', required=True, 
         help='A single number or list of numbers, where each # is seperated by a space') 
    parser.add_argument('-n', required=True, type=int, 
         help = 'A number that specifies how many consecutive numbers should be \ 
         multiplied together.') 
    parser.add_argument('-p', default=multiprocessing.cpu_count(), type=int, 
         help='Number of processes to create. Optional, defaults to the # of cores on the pc.') 
    args = parser.parse_args() 
    main(args.l, args.n, args.p) 
+0

谢谢,请不要删除这个答案。一旦我能理解你写的所有内容,我会尽快回复你。 – 2013-02-15 15:29:20