2017-05-03
import requests 
from bs4 import BeautifulSoup 


url = input("URL:") 
grab_page = requests.get(url) 
parse_page = BeautifulSoup(grab_page.text, "html.parser") 
file_name = parse_page.title.string.replace("\\,()", "") 


newfile = open(file_name + ".html", "w+") 
newfile.write(grab_page.text) 

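When I run the code above with this particular URL, where the page title is "How to Install JDK 8 (on Windows, Mac OS, Ubuntu) and Get Started with Java Programming", I get the following error from open(): OSError: [Errno 22] Invalid argument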

Traceback (most recent call last): 
    File "C:/Users/LKT/PycharmProjects/webpagegrabber/main.py", line 12, in <module> 
    newfile = open(file_name + ".html", "w+") 
OSError: [Errno 22] Invalid argument: 'How to Install JDK 8 (on Windows,\r\nMac OS, Ubuntu) 
    and Get Started with Java Programming.html' 

Where did I go wrong?

You are passing 'How to Install JDK 8 (on Windows,\r\nMac OS, Ubuntu) and Get Started with Java Programming.html' to open(), which is not a valid path on your operating system.

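Answers

Your file name contains invalid characters ('\n', '\r'), so you cannot create a file with that name on Windows. The Windows Developer Center documentation lists, among the characters that must not be used in file names: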

Characters whose integer representations are in the range from 1 through 31, except for alternate data streams where these characters are allowed. For more information about file streams, see File Streams.
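
A minimal sketch of one way to clean the title before passing it to open(), assuming it is acceptable to collapse the line break into a single space and to drop the other characters Windows reserves. The helper name sanitize_filename is hypothetical and not part of the original code:

```python
import re

# Characters Windows rejects in file names: control characters (code points 0-31)
# and the reserved punctuation < > : " / \ | ? *
_INVALID = re.compile(r'[\x00-\x1f<>:"/\\|?*]')


def sanitize_filename(title):
    """Hypothetical helper: return a title that is safe to use as a Windows file name."""
    # Collapse every run of whitespace (including "\r\n") into a single space
    collapsed = " ".join(title.split())
    # Drop any remaining invalid characters, then trailing spaces and dots,
    # which Windows also does not accept at the end of a name
    return _INVALID.sub("", collapsed).rstrip(" .")


title = ("How to Install JDK 8 (on Windows,\r\nMac OS, Ubuntu) "
         "and Get Started with Java Programming")
print(sanitize_filename(title) + ".html")
```

In the question's code this would replace the .replace() call, for example file_name = sanitize_filename(parse_page.title.string).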

I tried to replace those characters with the .replace() function. Any idea why it isn't working?

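That replacement doesn't appear in your code, but I would also check for other sneaky invalid characters (perhaps by printing the file name as a hex string) – MByD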
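
A quick sketch of that check, assuming the file name is an ordinary str. The sample value below is shortened from the traceback and is only illustrative:

```python
file_name = "How to Install JDK 8 (on Windows,\r\nMac OS, Ubuntu)"

# repr() shows escape sequences such as \r and \n instead of rendering them
print(repr(file_name))

# Collect every character outside printable ASCII with its position and hex code point
suspicious = [
    (index, hex(ord(ch)))
    for index, ch in enumerate(file_name)
    if not 32 <= ord(ch) < 127
]
print(suspicious)
```

Any entry in the list is a character worth removing or replacing before the string is used as a file name.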

Specifically this line of code: file_name = parse_page.title.string.replace("\\,()", "") and then I pass the result as an argument to open()
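
For what it's worth, that call cannot remove the line break: str.replace searches for its first argument as one whole substring, and "\\,()" is the four-character sequence backslash, comma, open parenthesis, close parenthesis, which never occurs in the title. A small sketch of the difference, using a shortened sample title and str.translate as one possible alternative:

```python
title = "JDK 8 (on Windows,\r\nMac OS)"

# replace() matches the entire substring \,() and finds nothing, so the title is unchanged
unchanged = title.replace("\\,()", "")
print(unchanged == title)

# To delete individual characters, build a translation table that maps each one to None
table = str.maketrans("", "", "\r\n")
cleaned = title.translate(table)
print(repr(cleaned))
```

Chaining one .replace() call per character would work as well; the translation table just handles all of them in one pass.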
