2012-02-12 59 views
0

Here is my code: a script that saves a series of pages and then tries to combine them, but only one page ends up in the combined file?

require "open-uri" 

base_url = "http://en.wikipedia.org/wiki" 

(1..5).each do |x| 
    # sets up the url 
    full_url = base_url + "/" + x.to_s 
    # reads the url 
    read_page = open(full_url).read 
    # saves the contents to a file and closes it 
    local_file = "my_copy_of-" + x.to_s + ".html" 
    file = open(local_file,"w") 
    file.write(read_page) 
    file.close 

    # open a file to store all entrys in 

    combined_numbers = open("numbers.html", "w") 

    entrys = open(local_file, "r") 

    combined_numbers.write(entrys.read) 

    entrys.close 
    combined_numbers.close 

end 

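As you can see, it basically scrapes the contents of Wikipedia articles 1 through 5 and then tries to combine them into a single file called numbers.html.

It does the first part correctly. But when it gets to the second part, it only seems to write the contents of the fifth article in the loop.

I can't see where I'm going wrong, though. Any help?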

Answers

2

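You chose the wrong mode when opening the summary file. "w" overwrites an existing file, whereas "a" appends to it. With "w", every pass of the loop replaces the contents of numbers.html with the current article, so only the last one survives. Open the file in append mode instead: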

combined_numbers = open("numbers.html", "a") 

So use this to get your code working.
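To see the difference between the two modes without touching the network, here is a minimal sketch. The helper name `write_chunks` and the sample strings are made up for illustration; it writes each chunk in a separate `File.open` call, the same way the loop in the question does, and returns what is left in the file afterwards.

```ruby
require "tmpdir"

# Hypothetical helper: writes each chunk to the same file in a separate
# File.open call using the given mode, then returns the final contents.
def write_chunks(mode, chunks)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "numbers.html")
    chunks.each do |chunk|
      # "w" truncates the file on every open; "a" keeps what is already there
      File.open(path, mode) { |f| f.write(chunk) }
    end
    File.read(path)
  end
end

puts write_chunks("w", %w[one two three])
puts write_chunks("a", %w[one two three])
```

With "w" only the last chunk should remain, which matches the symptom in the question; with "a" all chunks end up in the file in order.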


Also, I think you should write the contents of read_page straight to numbers.html instead of reading them back from the file you just wrote:

require "open-uri" 

(1..5).each do |x| 
    # set up and read the url 
    # (URI.open needs Ruby 2.5+; use open(url) on older versions) 
    url = "http://en.wikipedia.org/wiki/#{x}" 
    article = URI.open(url).read 

    # save the current article to its own file 
    # (IO.write only exists in 1.9.x; use open here too if on 1.8.x) 
    IO.write("my_copy_of-#{x}.html", article) 

    # append the current article to the summary file 
    open("numbers.html", "a") do |f| 
        f.write(article) 
    end 
end 
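
One side effect of append mode is that running the script a second time adds the same articles to numbers.html again. A possible variation, sketched here under assumptions, opens the combined file once in "w" mode before the loop and keeps it open while iterating. The method name `combine_articles`, its parameters, and the `fetch` lambda are hypothetical and not part of the original answer; passing the fetcher in makes it possible to try the logic without network access.

```ruby
require "open-uri"

# Hypothetical helper: fetches every id through `fetch`, saves one copy per
# article, and writes all of them to a combined file that is opened only once.
def combine_articles(ids, out_dir, fetch)
  combined_path = File.join(out_dir, "numbers.html")
  # "w" is safe here because the file is opened once, outside the loop
  File.open(combined_path, "w") do |combined|
    ids.each do |id|
      article = fetch.call(id)
      File.write(File.join(out_dir, "my_copy_of-#{id}.html"), article)
      combined.write(article)
    end
  end
  combined_path
end

# Example fetcher for the URLs from the question
# (needs network access; URI.open assumes Ruby 2.5+)
wikipedia = ->(id) { URI.open("http://en.wikipedia.org/wiki/#{id}").read }
# combine_articles(1..5, ".", wikipedia)
```

Because the combined file is truncated once per run rather than once per article, repeated runs should produce the same numbers.html instead of a growing one.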