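Deleting rows after a specific string in pandas

I want to delete all rows that come after the row containing the string "End of the 4th Quarter". Currently this is row 474, but it will change depending on the game.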
from bs4 import BeautifulSoup
import requests
import pandas as pd
import re

url = "http://www.espn.com/nba/playbyplay?gameId=400900395"
r = requests.get(url)
data = r.text
soup = BeautifulSoup(data, "html.parser")

# Skip the first four <tr> elements, which precede the play-by-play rows
data_rows = soup.find_all("tr")[4:]

# Build one list of cell texts per table row
play_data = []
for row in data_rows:
    play_row = []
    for td in row.find_all("td"):
        play_row.append(td.get_text())
    play_data.append(play_row)

df = pd.DataFrame(play_data)
df.to_html("pbp_data")
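
To make the goal concrete, here is a minimal sketch of the kind of truncation I am after, run on a small made-up frame instead of the scraped one. The helper name truncate_after_marker and the sample rows are hypothetical, and the exact marker text on the page is an assumption. It finds the first row in which any cell contains the marker and keeps everything up to and including that row, so no row number has to be hard-coded.

```python
import pandas as pd

def truncate_after_marker(df, marker):
    """Return df up to and including the first row that contains marker."""
    # True for each row in which any cell contains the marker as plain text
    mask = df.apply(
        lambda col: col.astype(str).str.contains(marker, regex=False)
    ).any(axis=1)
    if not mask.any():
        return df  # marker not found: leave the frame unchanged
    # Positional index of the first matching row
    last_pos = mask.to_numpy().argmax()
    return df.iloc[: last_pos + 1]

# Hypothetical sample standing in for the scraped play-by-play table
sample = pd.DataFrame([
    ["0:02", "Player A makes free throw", "101 - 99"],
    ["0:00", "End of the 4th Quarter", "101 - 99"],
    ["0:00", "End of Game", "101 - 99"],
    [None, None, None],
])

trimmed = truncate_after_marker(sample, "End of the 4th Quarter")
print(trimmed)
```

Applied to the scraped data, the call would be df = truncate_after_marker(df, "End of the 4th Quarter") just before df.to_html("pbp_data"), assuming the marker text matches what the page actually shows.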