2012-07-22

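Python returns data when none is expected

I'm using the reddit library to pull data from reddit, and I came across some code where I don't understand why it returns any data at all (in the BaseReddit class (full source)):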

def get_content(self, page_url, limit=0, url_data=None, place_holder=None,
                root_field='data', thing_field='children',
                after_field='after'):
    """A generator method to return reddit content from a URL. Starts at
    the initial page_url, and fetches content using the `after` JSON data
    until `limit` entries have been fetched, or the `place_holder` has been
    reached.

    :param page_url: the url to start fetching content from
    :param limit: the maximum number of content entries to fetch. If
        limit <= 0, fetch the default_content_limit for the site. If None,
        then fetch unlimited entries--this would be used in conjunction
        with the place_holder param.
    :param url_data: dictionary containing extra GET data to put in the url
    :param place_holder: if not None, the method will fetch `limit`
        content, stopping if it finds content with `id` equal to
        `place_holder`.
    :param data_field: indicates the field in the json response that holds
        the data. Most objects use 'data', however some (flairlist) don't
        have the 'data' object. Use None for the root object.
    :param thing_field: indicates the field under the data_field which
        contains the list of things. Most objects use 'children'.
    :param after_field: indicates the field which holds the after item
        element
    :type place_holder: a string corresponding to a reddit content id, e.g.
        't3_asdfasdf'
    :returns: a list of reddit content, of type Subreddit, Comment,
        Submission or user flair.
    """
    content_found = 0

    if url_data is None:
        url_data = {}
    if limit is None:
        fetch_all = True
    elif limit <= 0:
        fetch_all = False
        limit = int(self.config.default_content_limit)
    else:
        fetch_all = False

    # While we still need to fetch more content to reach our limit, do so.
    while fetch_all or content_found < limit:
        page_data = self.request_json(page_url, url_data=url_data)
        if root_field:
            root = page_data[root_field]
        else:
            root = page_data
        for thing in root[thing_field]:
            yield thing
            content_found += 1
            # Terminate when we reached the limit, or place holder
            if (content_found == limit or
                    place_holder and thing.id == place_holder):
                return
        # Set/update the 'after' parameter for the next iteration
        if after_field in root and root[after_field]:
            url_data['after'] = root[after_field]
        else:
            return

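It looks to me like all the return statements have no arguments and would therefore default to returning None. Can someone explain this to me?

Note: the code is Python 2.x.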

Answer

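This is a generator function, as you can tell from the yield statement. Each yielded value is effectively "returned" to the caller without the function actually returning. When another value is requested, the generator resumes from the point where it yielded (in the code below, it continues the for thing loop). A bare return inside a generator does not hand None to the caller; it simply ends the iteration.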

for thing in root[thing_field]: 
    yield thing 
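
To make the role of the bare return concrete, here is a minimal sketch. It is written for Python 3 (the question's code is Python 2.x), and count_up_to is a hypothetical name used only for illustration. Calling the function produces a generator object, and return merely stops the iteration:

```python
def count_up_to(stop):
    """Hypothetical generator: yields 0, 1, ... and stops via a bare return."""
    n = 0
    while True:
        if n == stop:
            return  # ends the iteration; no None is handed to the caller
        yield n
        n += 1


gen = count_up_to(3)
values = list(gen)  # list() keeps requesting values until the generator finishes
print(values)  # [0, 1, 2]
```

Note that None never shows up in the collected values: the consumer only ever sees what was yielded.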

A simple example:

def blah():
    for i in xrange(5):
        yield i + 3

numbers = blah()     # nothing runs yet; this only creates the generator
print next(numbers)  # prints 3
# lots of other code here...
# now we need the next value
print next(numbers)  # prints 4
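
The same mechanics explain get_content. The sketch below is a simplified stand-in, not the library's code: FAKE_PAGES, fetch_page and get_items are hypothetical names that replace request_json and the real paging logic, and it is written for Python 3 rather than the 2.x of the question.

```python
# Hypothetical stand-in for the JSON pages a server would return,
# keyed by the 'after' token that requests each page.
FAKE_PAGES = {
    None: {"children": ["a", "b"], "after": "p2"},
    "p2": {"children": ["c", "d"], "after": "p3"},
    "p3": {"children": ["e"], "after": None},
}


def fetch_page(after):
    """Hypothetical replacement for a network request."""
    return FAKE_PAGES[after]


def get_items(limit):
    """Yield items page by page until `limit` items or the last page."""
    found = 0
    after = None
    while found < limit:
        page = fetch_page(after)
        for item in page["children"]:
            yield item
            found += 1
            if found == limit:
                return  # limit reached: end the iteration
        if page["after"]:
            after = page["after"]  # token for the next page
        else:
            return  # no more pages: end the iteration


first_three = list(get_items(3))
everything = list(get_items(100))
print(first_three)
print(everything)
```

Here the caller receives the yielded items one at a time, and both return statements only signal that there is nothing more to fetch.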