
Writing a Twisted client that sends looping GET requests to multiple APIs and logs the responses

I haven't done any Twisted programming in a while, so I'm trying to get back into it for a new project. I'm trying to build a Twisted client that takes a list of servers as an argument, sends an API GET call to each server, and writes the returned message to a file. This API GET call should repeat every 60 seconds.

I've managed to do this successfully for a single server using Twisted's Agent class:

from StringIO import StringIO

from twisted.internet import reactor
from twisted.internet.protocol import Protocol
from twisted.web.client import Agent
from twisted.web.http_headers import Headers
from twisted.internet.defer import Deferred

import datetime
from datetime import timedelta
import time
import optparse

count = 1
filename = "test.csv"

class server_response(Protocol):
    def __init__(self, finished):
        print "init server response"
        self.finished = finished
        self.remaining = 1024 * 10

    def dataReceived(self, bytes):
        # Append the response body to the output file as it arrives
        if self.remaining:
            display = bytes[:self.remaining]
            print 'Some data received:'
            print display
            with open(filename, "a") as myfile:
                myfile.write(display)

            self.remaining -= len(display)

    def connectionLost(self, reason):
        print 'Finished receiving body:', reason.getErrorMessage()
        self.finished.callback(None)

def capture_response(response):
    print "Capturing response"
    finished = Deferred()
    response.deliverBody(server_response(finished))
    print "Done capturing:", finished

    return finished

def responseFail(err):
    print "error:", err
    reactor.stop()


def cl(ignored):
    # Send one GET request, then schedule the next call in 60 seconds
    print "sending req"
    agent = Agent(reactor)
    headers = {
        'authorization': [<snipped>],
        'cache-control': [<snipped>],
        'postman-token': [<snipped>]
    }

    URL = <snipped>
    print URL

    a = agent.request(
        'GET',
        URL,
        Headers(headers),
        None)

    a.addCallback(capture_response)
    reactor.callLater(60, cl, None)
    #a.addBoth(cbShutdown, count)


def cbShutdown(ignored, count):
    print "reactor stop"
    reactor.stop()

def parse_args():
    usage = """usage: %prog [options] [hostname]:port ...
    Run it like this:
        python test.py hostname1:instanceName1 hostname2:instancename2 ...
    """

    parser = optparse.OptionParser(usage)

    _, addresses = parser.parse_args()

    if not addresses:
        print parser.format_help()
        parser.exit()

    def parse_address(addr):
        if ':' not in addr:
            hostName = '127.0.0.1'
            instanceName = addr
        else:
            hostName, instanceName = addr.split(':', 1)

        return hostName, instanceName

    return map(parse_address, addresses)

if __name__ == '__main__':
    d = Deferred()
    d.addCallbacks(cl, responseFail)
    reactor.callWhenRunning(d.callback, None)

    reactor.run()

But I'm having a hard time figuring out how to have multiple agents sending calls. As it stands, I rely on the reactor.callLater(60, cl, None) at the end of cl() to create the call loop. So how do I create multiple calling agent protocols (server_response(Protocol)) and keep looping through each GET once my reactor starts?

Answer


Look what the cat dragged in!

So how do I create multiple calling agents

Use treq, you dingus :D. You rarely want to get mixed up with the Agent class directly.

This API GET call should repeat every 60 seconds

Use LoopingCall instead of repeated callLater calls; it's easier in this case, and you'll run into fewer problems down the road.

import treq
from twisted.internet import task, reactor

filename = 'test.csv'

def writeToFile(content):
    with open(filename, 'ab') as f:
        f.write(content)

def everyMinute(*urls):
    for url in urls:
        d = treq.get(url)
        d.addCallback(treq.content)
        d.addCallback(writeToFile)

#----- Main -----#
sites = [
    'https://www.google.com',
    'https://www.amazon.com',
    'https://www.facebook.com']

repeating = task.LoopingCall(everyMinute, *sites)
repeating.start(60)

reactor.run()

It starts in the everyMinute() function, which runs every 60 seconds. In that function, each endpoint is queried, and once the body of the response becomes available, treq.content takes the response and returns its content. Finally, the content is appended to a file.
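Since the question's script takes hostname:instanceName pairs on the command line, here is a minimal sketch (not part of the answer) of how the same LoopingCall pattern could be fed from the question's parse_args() and given an errback so a failed request is logged rather than dropped silently. The URL template and the logFailure helper are hypothetical; substitute your real API path.

import treq
from twisted.internet import task, reactor

filename = 'test.csv'

def writeToFile(content):
    with open(filename, 'ab') as f:
        f.write(content)

def logFailure(failure, url):
    # One unreachable endpoint should not affect the others; just record it.
    print 'Request to %s failed: %s' % (url, failure.getErrorMessage())

def everyMinute(*urls):
    for url in urls:
        d = treq.get(url)
        d.addCallback(treq.content)
        d.addCallback(writeToFile)
        d.addErrback(logFailure, url)

# Hypothetical URL construction from the (hostName, instanceName) pairs
# returned by the question's parse_args().
addresses = parse_args()
sites = ['https://%s/api/%s' % (host, instance) for host, instance in addresses]

repeating = task.LoopingCall(everyMinute, *sites)
repeating.start(60)
reactor.run()

The errback is attached last so it catches a failure from either the request itself or the content/file-writing callbacks for that URL.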

PS

Are you scraping or trying to extract things from those sites? If so, scrapy might be a good option for you.


Thanks @notorious.no, you rebel scum you – user1211653
