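Removing HTML tags from output
I am new to Python and cannot remove the HTML tags from my output. I want to remove the tags along with the content inside them. I also want to remove the p tags. Any suggestions?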
import urllib2
from bs4 import BeautifulSoup
# Ask user to enter URL
url = raw_input("Please enter a valid URL: ")
# Make sure file is clear for new content
open('ctp_output.txt', 'w').close()
# Open txt document for output
txt = open('ctp_output.txt', 'w')
# Parse HTML of article, aka making soup
soup = BeautifulSoup(urllib2.urlopen(url).read())
# retrieve all of the paragraph tags
tags = soup('p')
# Write each paragraph to the file, followed by a blank line
for tag in tags:
    txt.write(str(tag) + '\n' + '\n')
# Close txt file with new content added
txt.close()
This might be useful: http://stackoverflow.com/questions/753052/strip-html-from-strings-in-python – Manjunath
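
For anyone landing here, this is a minimal sketch of the kind of tag stripping the linked question is about, using only the standard library's `html.parser` on Python 3. The names `TagStripper`, `strip_tags` and `drop_contents_of` are made up for this illustration and are not part of any library. It assumes well-formed input where every dropped tag has a closing tag.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collect text content, skipping markup and the contents of chosen tags.

    Hypothetical helper for illustration only.
    """

    def __init__(self, drop_contents_of=()):
        super().__init__(convert_charrefs=True)
        self.drop = set(drop_contents_of)  # tags whose inner text is discarded
        self.depth = 0                     # > 0 while inside a dropped tag
        self.parts = []                    # text fragments that are kept

    def handle_starttag(self, tag, attrs):
        if tag in self.drop:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.drop and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a dropped tag
        if self.depth == 0:
            self.parts.append(data)

    def get_text(self):
        return "".join(self.parts)


def strip_tags(html, drop_contents_of=()):
    """Return the text of `html` without markup (hypothetical helper)."""
    stripper = TagStripper(drop_contents_of)
    stripper.feed(html)
    stripper.close()  # flush any text still buffered by the parser
    return stripper.get_text()


sample = '<p>Read <a href="/more">this link</a> and <b>this</b>.</p>'
print(strip_tags(sample))                          # markup removed, all text kept
print(strip_tags(sample, drop_contents_of=("a",)))  # link text removed as well
```

If you would rather stay with BeautifulSoup, `tag.get_text()` returns a tag's text without the markup, and `tag.decompose()` removes a tag together with everything inside it.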