2014-10-29 39 views
1

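Gunicorn worker periodically crashes: 'socket is not registered'

Every so often (roughly once every few hours) a gunicorn worker fails with the following error: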

[2014-10-29 10:21:54 +0000] [4902] [INFO] Booting worker with pid: 4902 
[2014-10-29 13:15:24 +0000] [4902] [ERROR] Exception in worker process: 
Traceback (most recent call last): 
    File "/opt/test/env/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 507, in spawn_worker 
    worker.init_process() 
    File "/opt/test/env/local/lib/python2.7/site-packages/gunicorn/workers/gthread.py", line 109, in init_process 
    super(ThreadWorker, self).init_process() 
    File "/opt/test/env/local/lib/python2.7/site-packages/gunicorn/workers/base.py", line 120, in init_process 
    self.run() 
    File "/opt/test/env/local/lib/python2.7/site-packages/gunicorn/workers/gthread.py", line 177, in run 
    self.murder_keepalived() 
    File "/opt/test/env/local/lib/python2.7/site-packages/gunicorn/workers/gthread.py", line 149, in murder_keepalived 
    self.poller.unregister(conn.sock) 
    File "/opt/test/env/local/lib/python2.7/site-packages/trollius/selectors.py", line 408, in unregister 
    key = super(EpollSelector, self).unregister(fileobj) 
    File "/opt/test/env/local/lib/python2.7/site-packages/trollius/selectors.py", line 243, in unregister 
    raise KeyError("{0!r} is not registered".format(fileobj)) 
KeyError: '<socket._socketobject object at 0x7f823f454d70> is not registered' 
... 
... 
[2014-10-29 13:15:24 +0000] [4902] [INFO] Worker exiting (pid: 4902) 
[2014-10-29 13:15:24 +0000] [5809] [INFO] Booting worker with pid: 5809 
... 

Configuration:

bind = '0.0.0.0:80' 
workers = 1 
threads = 4 
debug = True 
reload = True 
daemon = True 

I am using:

Python 2.7.6 
gunicorn==19.1.1 
trollius==1.0.2 
futures==2.2.0 

Any ideas what might be causing this, and how to fix it?

Thanks!

+0

Any luck? I am facing exactly the same situation! – Richeek 2015-05-30 15:30:08

+0

Nope, still waiting for help from the community. – 2015-05-31 09:31:32

+1

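I am not sure, since I still have to investigate further, but I think it may be related to the socket being closed before it can be unregistered. I am going to increase the graceful timeout and see what happens. Will update here :) – Richeek 2015-06-01 15:33:02

Answers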

0

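I faced a similar issue, where I got timeout errors from the gunicorn worker. I am using sync workers and had the default settings for timeout and keepalive.

In my use case, my HTTP requests take a long time to finish, so the sync workers were timing out. I use curl as the HTTP client, which sends HTTP/1.1 requests. I increased the timeout to an insanely high value, 3600 (i.e. 1 hour), and that worked. However, in the server error logs I saw the same error as yours. Here is my hypothesis about this error.

Since all HTTP/1.1 requests are persistent by default, the server tries to reuse the connection by putting it back in the queue, but not beyond the keepalive timeout. When the keepalive timeout expires, the server unregisters the socket so that it cannot be reused, and closes it. Because my timeout value was very high while keepalive was still at its default of only a few seconds, the server tried to unregister an already unregistered socket multiple times, hence the error. So I increased the keepalive value to 3600 as well. So far it works.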

# http://gunicorn-docs.readthedocs.org/en/latest/settings.html 
timeout = 3600 # one hour timeout for long running jobs 
keepalive = 3600
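
To make the hypothesis above concrete, here is a minimal sketch of how unregistering the same socket twice produces this kind of KeyError. It assumes Python 3's standard `selectors` module, which exposes the same selector API that trollius provides on Python 2.7. The `safe_unregister` helper and the `demo` function are hypothetical names used only for illustration; they are not part of gunicorn, and this is not a patch for the gthread worker.

```python
import selectors
import socket


def safe_unregister(selector, sock):
    """Hypothetical helper: unregister sock, tolerating 'not registered'.

    Returns True if the socket was registered and has now been removed,
    False if the selector did not know about it.
    """
    try:
        selector.unregister(sock)
        return True
    except KeyError:
        return False


def demo():
    selector = selectors.DefaultSelector()
    a, b = socket.socketpair()
    try:
        selector.register(a, selectors.EVENT_READ)
        selector.unregister(a)  # first unregister succeeds

        # A second unregister of the same socket raises KeyError,
        # with a message ending in "is not registered".
        try:
            selector.unregister(a)
            second_raised = False
        except KeyError:
            second_raised = True

        # The tolerant helper reports the situation instead of raising.
        helper_result = safe_unregister(selector, a)
        return second_raised, helper_result
    finally:
        a.close()
        b.close()
        selector.close()


if __name__ == "__main__":
    print(demo())
```

Whether the worker in the question hits exactly this double-unregister path is still only a guess; the sketch just shows that the selector API behaves this way when a socket is removed more than once.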