Multiprocessing apply_async() does not work on Ubuntu

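I am running this code as a CherryPy Web Service on both Mac OS X and Ubuntu 14.04. Using multiprocessing on python3, I want to start the static method worker() asynchronously inside a process Pool.

The same code runs perfectly on Mac OS X, but on Ubuntu 14.04 worker() does not run. That is, when debugging the code inside the POST method, I can see every line being executed, from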

reqid = str(uuid.uuid4())

to

return handle_error(202, "Request ID: " + reqid)

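When the same code is started on Ubuntu 14.04, the worker() method does not run, not even a print() at the top of the method (which would be logged).

Below is the relevant code (I only omitted the handle_error() method):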

import cherrypy 
import json 
from lib import get_parameters, handle_error 
from multiprocessing import Pool 
import os 
from pymatbridge import Matlab 
import requests 
import shutil 
import uuid 
from xml.etree import ElementTree 

class Schedule(object):
    exposed = True

    def __init__(self, mlab_path, pool):
        self.mlab_path = mlab_path
        self.pool = pool

    def POST(self, *paths, **params):

        if validate(cherrypy.request.headers):

            try:
                reqid = str(uuid.uuid4())
                path = os.path.join("results", reqid)
                os.makedirs(path)
                wargs = [(self.mlab_path, reqid)]
                self.pool.apply_async(Schedule.worker, wargs)

                return handle_error(202, "Request ID: " + reqid)
            except:
                return handle_error(500, "Internal Server Error")
        else:
            return handle_error(401, "Unauthorized")

    #### this is not executed ####
    @staticmethod
    def worker(args):

        mlab_path, reqid = args
        mlab = Matlab(executable=mlab_path)
        mlab.start()

        mlab.run_code("cd mlab")
        mlab.run_code("sched")
        a = mlab.get_variable("a")

        mlab.stop()

        return reqid

    ####

# to start the Web Service
if __name__ == "__main__":

    # start Web Service with some configuration
    global_conf = {
        "global": {
            "server.environment": "production",
            "engine.autoreload.on": True,
            "engine.autoreload.frequency": 5,
            "server.socket_host": "0.0.0.0",
            "log.screen": False,
            "log.access_file": "site.log",
            "log.error_file": "site.log",
            "server.socket_port": 8084
        }
    }
    cherrypy.config.update(global_conf)
    conf = {
        "/": {
            "request.dispatch": cherrypy.dispatch.MethodDispatcher(),
            "tools.encode.debug": True,
            "request.show_tracebacks": False
        }
    }

    pool = Pool(3)

    cherrypy.tree.mount(Schedule('matlab', pool), "/sched", conf)

    # activate signal handler
    if hasattr(cherrypy.engine, "signal_handler"):
        cherrypy.engine.signal_handler.subscribe()

    # start serving pages
    cherrypy.engine.start()
    cherrypy.engine.block()

Source

2016-04-24 gc5

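You could try to provide a minimal reproducible example; that would certainly help. Also, "does not run" is a bit ambiguous... do you get an error? Could you post it? – Peque

Hi @Peque, I get no error. I tried to debug the code, but it seems it is not executed at all. I started with some basic print() statements at the top of the method, and none of them show up in the output. I have provided a minimal reproducible example. Thanks – gc5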

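Your logic is hiding the problem from you. The apply_async method returns an AsyncResult object, which acts as a handle to the asynchronous task you just scheduled. Because you ignore the result of the scheduled task, the whole thing looks like it is "failing silently".

If you try to get the result from that task, you will see the real problem.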

handler = self.pool.apply_async(Schedule.worker, wargs) 
handler.get() 

... traceback here ... 
cPickle.PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed
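
A minimal sketch of two ways to surface such a failure, assuming Python 3 (the error_callback argument is not available in Python 2). The helper names submit and log_failure are made up for illustration, and the built-ins abs and int stand in for the real worker so the snippet runs on its own.

```python
from multiprocessing import Pool


def log_failure(exc):
    # Runs in the parent process when the task (or its pickling) fails.
    print("background task failed: %r" % (exc,))


def submit(pool, func, args, on_error=log_failure):
    """Schedule func(*args) and keep the AsyncResult instead of dropping it."""
    return pool.apply_async(func, args, error_callback=on_error)


if __name__ == "__main__":
    with Pool(2) as pool:
        ok = submit(pool, abs, (-3,))
        print(ok.get(timeout=30))  # blocks until the result is ready

        bad = submit(pool, int, ("not a number",))
        try:
            bad.get(timeout=30)  # re-raises the worker's exception here
        except ValueError as exc:
            print("get() re-raised: %r" % (exc,))
```

In a handler that has to return 202 right away, blocking on get() is usually not an option, so the error_callback route is probably the better fit there.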

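In short, you must make sure that the arguments you pass to the Pool are picklable.

Instance and class methods are picklable if the object/class they belong to is picklable as well. Static methods are not picklable because they lose the association with the object itself, so the pickle library cannot serialize them correctly.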
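
One way to check this before handing anything to the pool is to try pickle.dumps() on it yourself. The helper is_picklable below is a hypothetical name, not part of multiprocessing; treat it as a quick sketch rather than a complete test.

```python
import pickle


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        # pickle raises PicklingError, AttributeError or TypeError
        # depending on what went wrong.
        return False


if __name__ == "__main__":
    candidates = {
        "argument tuple": ("matlab", "some-request-id"),
        "built-in function": len,
        "lambda": lambda x: x,
    }
    for label, obj in candidates.items():
        print(label, is_picklable(obj))
```

Results for methods can vary between Python versions, so running the check on the target interpreter is safer than assuming.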

As a general rule, it is better to avoid dispatching to multiprocessing.Pool anything other than top-level defined functions.
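
A sketch of that layout for the code in the question, with the worker moved to module level. The submit() method is a hypothetical stand-in for the scheduling lines in POST(), and the Matlab calls are replaced by a placeholder so the snippet is self-contained.

```python
from multiprocessing import Pool


def worker(args):
    """Module-level task, so pickle can refer to it by its module and name."""
    mlab_path, reqid = args
    # Placeholder: the Matlab calls from the question would go here.
    return reqid


class Schedule(object):
    def __init__(self, mlab_path, pool):
        self.mlab_path = mlab_path
        self.pool = pool

    def submit(self, reqid):
        # Hypothetical helper standing in for the body of POST().
        wargs = [(self.mlab_path, reqid)]
        return self.pool.apply_async(worker, wargs)


if __name__ == "__main__":
    with Pool(3) as pool:
        result = Schedule("matlab", pool).submit("demo-id")
        print(result.get(timeout=30))
```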

Source

2016-04-26 19:30:07 noxdafox

I solved it by changing the method from @staticmethod to @classmethod. Now the job runs inside the process pool. I found that classmethods are more useful in this case, as explained here.

Thanks.
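
For reference, a stripped-down sketch of that change, assuming Python 3. CherryPy and Matlab are left out, and submit() is a hypothetical stand-in for the scheduling lines in POST().

```python
from multiprocessing import Pool


class Schedule(object):
    def __init__(self, mlab_path, pool):
        self.mlab_path = mlab_path
        self.pool = pool

    def submit(self, reqid):
        # Hypothetical helper standing in for the body of POST().
        wargs = [(self.mlab_path, reqid)]
        return self.pool.apply_async(Schedule.worker, wargs)

    @classmethod
    def worker(cls, args):
        # Bound to the class, so it is pickled as "attribute worker of Schedule".
        mlab_path, reqid = args
        # Placeholder: the Matlab calls from the question would go here.
        return reqid


if __name__ == "__main__":
    with Pool(3) as pool:
        result = Schedule("matlab", pool).submit("demo-id")
        print(result.get(timeout=30))
```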

Source

2016-04-26 13:57:18 gc5

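In this case bundling it in the class matters, because each class in CherryPy represents a Web Service. In my case each web service has a different worker to execute, so it is better to bundle it in the corresponding class. However, if there is a better design practice, please let me know :) – gc5

I think that in this case the overhead of declaring worker as a classmethod is negligible, and worker is related closely enough to the class to be a class member. It is named worker because the class already has a descriptive name for what the web service will do, and worker simply does that. What would be the advantage of using a static method? – gc5

To run background tasks with CherryPy, it is better to use an asynchronous task queue manager such as Celery or RQ. These services are very easy to install and run, your tasks run in completely separate processes, and scaling up when your load increases is straightforward.

You have a simple example with CherryPy here.

Source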

2016-04-27 06:56:27 jordeu
