2016-07-24

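Python: separating the links in a Yahoo! RSS feed

I am using the feedparser module to create a news feed in my program.

The link element in the Yahoo! Finance API actually contains two links: the Yahoo link and the link to the actual article (the external site/source). The two are separated by an asterisk. Here is an example: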

http://us.rd.yahoo.com/finance/external/investors/rss/SIG=12shc077a/ * http://www.investors.com/news/technology/click/pokemon-go-hurting-facebook-snapchat-usage/

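Note the asterisk between the two items.

I was wondering whether there is a Pythonic way to separate the two and write only the second link to a file.

Thank you for your time.

Here is my relevant code: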

import feedparser


def parse_feed(news_feed_message, rss_url):
    '''Parse the Yahoo! RSS API for data of the latest ten articles and write it to the company news text file.'''

    # Define the RSS feed to parse from, as the url passed in of the company the user chose
    feed = feedparser.parse(rss_url)

    # Open the company news text file to write the news data to
    outFile = open('C:\\Users\\nicks_000\\PycharmProjects\\untitled\\SAT\\GUI\\Text Files\\companyNews.txt', mode='w')

    # Create a list to store the news data parsed from the Yahoo! RSS
    news_data_write = []
    # For each article, append its title, published date, and link to the news_data_write list
    for count in range(10):
        news_data_write.append(feed['entries'][count].title)
        news_data_write.append(feed['entries'][count].published)
        news_data_write.append(feed['entries'][count].link)
        # Convert each item in the list to a string and write it to the company news text file
        for item in news_data_write:
            outFile.write(str(item) + '\n')
        # Write a blank line after each article, so that the articles are separated in the file
        outFile.write('\n')
        # Clear the list so that data is not written to the file more than once
        del news_data_write[:]
    outFile.close()

    read_news_file(news_feed_message)

Answer

You can split the link like this:

link = 'http://us.rd.yahoo.com/finance/external/investors/rss/SIG=12shc077a/*http://www.investors.com/news/technology/click/pokemon-go-hurting-facebook-snapchat-usage/' 

rss_link, article_link = link.split('*') 

Keep in mind that this requires the link to always contain the asterisk; otherwise you will get the following exception:

ValueError: not enough values to unpack (expected 2, got 1) 

If you only need the second link, you can also write:

_, article_link = link.split('*') 

The underscore signals that you want to discard the first return value. Another option is:

article_link = link.split('*')[1] 
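
The example link in the question has spaces around the asterisk, and some links may not contain an asterisk at all, so a slightly more defensive variant might look like the sketch below. The helper name `extract_article_link` is made up for illustration, and the code assumes that a link without an asterisk should simply be returned unchanged.

```python
def extract_article_link(link):
    """Return the part of `link` after the first asterisk, or `link` itself if there is none."""
    # str.partition always returns a 3-tuple, so unpacking cannot fail.
    _, sep, article_link = link.partition('*')
    if not sep:
        # No asterisk found: assume the link already points at the article.
        return link.strip()
    # Strip the whitespace that may surround the asterisk.
    return article_link.strip()


yahoo_link = ('http://us.rd.yahoo.com/finance/external/investors/rss/SIG=12shc077a/ * '
              'http://www.investors.com/news/technology/click/pokemon-go-hurting-facebook-snapchat-usage/')
print(extract_article_link(yahoo_link))
```

Using `partition` instead of `split` avoids the `ValueError` when there is no asterisk, at the cost of silently accepting such links.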

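Regarding your code: if an exception is raised anywhere after you open the output file, the file will not be closed properly. You can use open as a context manager or a try ... finally block to make sure the file is closed no matter what happens.

Context manager: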

with open('youroutputfile', 'w') as f: 
    # your code 
    f.write(…) 

try ... finally block:

f = open('youroutputfile', 'w')
try:
    f.write(…)
finally:
    f.close()
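
Putting the two suggestions together, the writing loop from the question could be sketched roughly as follows. The function name `write_articles`, the `limit` parameter, the output file name and the stand-in entries are all hypothetical. Real entries would come from `feedparser.parse(rss_url)['entries']`, and the code assumes each entry has `title`, `published` and `link` attributes, as in the question.

```python
from types import SimpleNamespace


def write_articles(entries, path, limit=10):
    """Write title, published date and article link of up to `limit` entries to `path`."""
    # The with block closes the file even if an exception is raised inside it.
    with open(path, mode='w') as out_file:
        for entry in entries[:limit]:
            # Keep only the part after the last asterisk; [-1] also works when there is none.
            article_link = entry.link.split('*')[-1].strip()
            out_file.write(f'{entry.title}\n{entry.published}\n{article_link}\n\n')


# Stand-in data for illustration only; this is not real feed output.
sample_entries = [
    SimpleNamespace(title='Example title', published='Sun, 24 Jul 2016',
                    link='http://us.rd.yahoo.com/x/ * http://www.example.com/article/'),
]
write_articles(sample_entries, 'companyNews_example.txt')
```

Slicing with `entries[:limit]` also avoids an `IndexError` when the feed returns fewer entries than expected.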