Getting column headers from multiple HTML 'tbody' elements
I need to get the column headers from the second tbody at this URL:
http://bepi.mpob.gov.my/index.php/statistics/price/daily.html
Specifically, I would like to see "Sept, Oct", and so on.
I am getting the following error:
runfile('C:/Python27/Lib/site-packages/xy/workspace/webscrape/mpob1.py', wdir='C:/Python27/Lib/site-packages/xy/workspace/webscrape')
Traceback (most recent call last):
  File "<ipython-input-8-ab4005f51fa3>", line 1, in <module>
    runfile('C:/Python27/Lib/site-packages/xy/workspace/webscrape/mpob1.py', wdir='C:/Python27/Lib/site-packages/xy/workspace/webscrape')
  File "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile
    execfile(filename, namespace)
  File "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)
  File "C:/Python27/Lib/site-packages/xy/workspace/webscrape/mpob1.py", line 26, in <module>
    soup.findAll('tbody', limit=2)[1].findAll('tr').findAll('th')]
IndexError: list index out of range
Could someone please help me out here? I would be eternally grateful!
My code is posted below:
import requests
from bs4 import BeautifulSoup
import pandas as pd
url = "http://bepi.mpob.gov.my/index.php/statistics/price/daily.html"
r = requests.get(url)
soup = BeautifulSoup(r.text, 'lxml')
column_headers = [th.getText() for th in
soup.findAll('tbody', limit=2)[1].findAll('tr').findAll('th')]
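For what it is worth, the IndexError means that `soup.findAll('tbody', limit=2)` returned fewer than two elements for the HTML that `requests` received, so `[1]` does not exist. Separately, `findAll('tr')` returns a list-like ResultSet, so chaining `.findAll('th')` onto it would fail even if the index were valid. Below is a minimal sketch of a guarded version. The sample markup, the header names in it, and the helper name `headers_from_tbody` are hypothetical and only stand in for the real page, and `html.parser` is used so that the sketch does not depend on lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page (an assumption,
# not the actual HTML served by the site).
SAMPLE_HTML = """
<table>
  <tbody><tr><th>Region</th></tr></tbody>
</table>
<table>
  <tbody>
    <tr><th>Product</th><th>Sept</th><th>Oct</th></tr>
    <tr><td>RBD P. Oil</td><td>-</td><td>-</td></tr>
  </tbody>
</table>
"""


def headers_from_tbody(html, index=1):
    """Return the <th> texts of the tbody at position `index`, or [] if absent."""
    soup = BeautifulSoup(html, "html.parser")
    bodies = soup.find_all("tbody")
    # Guard the index instead of assuming the tbody exists.
    if len(bodies) <= index:
        return []
    # Search the single tbody Tag directly; a ResultSet has no find_all().
    return [th.get_text(strip=True) for th in bodies[index].find_all("th")]


column_headers = headers_from_tbody(SAMPLE_HTML)
print(column_headers)
```

If the function returns an empty list for `r.text`, it is worth printing `len(soup.find_all('tbody'))` to check what the server actually sent. Browsers can add `tbody` elements that are not in the raw source, so the element inspector is not always a reliable guide.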
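Do you mean you only need the contents of the month select element, or do you actually need to click "View Price" and parse the "Summary of MPOB Daily FFB Reference Price by Region" table? Thanks. – alecxe
I need to click 'View Price'. The table that needs to be parsed is 'Peninsular Malaysia: Summary of Local Prices of RBD P. Oil, RBD P. Olein & RBD P. Stearin'. –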
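If the table only appears after 'View Price' is clicked, a plain GET of the page would not contain it, which would be consistent with the missing tbody. One possible approach, assuming the button submits an ordinary HTML form (not confirmed for this site), is to read the form's fields and reproduce the submission. The sketch below only builds the payload from a hypothetical form. The action path, the field names `year` and `token`, and the helper `form_payload` are all made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical form markup; the real page's action and field names are
# not known here and must be read from the actual HTML.
SAMPLE_FORM = """
<form action="/hypothetical/path" method="post">
  <select name="year">
    <option value="2015">2015</option>
    <option value="2016" selected>2016</option>
  </select>
  <input type="hidden" name="token" value="abc">
  <input type="submit" value="View Price">
</form>
"""


def form_payload(html):
    """Return (action, payload) for the first form in `html`."""
    soup = BeautifulSoup(html, "html.parser")
    form = soup.find("form")
    if form is None:
        return None, {}
    payload = {}
    # Named <input> elements contribute their value attribute.
    for field in form.find_all("input"):
        name = field.get("name")
        if name:
            payload[name] = field.get("value", "")
    # Each named <select> contributes its selected option, else the first one.
    for select in form.find_all("select"):
        name = select.get("name")
        options = select.find_all("option")
        if not name or not options:
            continue
        chosen = next((o for o in options if o.has_attr("selected")), options[0])
        payload[name] = chosen.get("value", chosen.get_text(strip=True))
    return form.get("action"), payload


action, payload = form_payload(SAMPLE_FORM)
print(action, payload)
```

The resulting dictionary could then be sent with `requests.post(..., data=payload)` to the form's action URL and the response parsed for the table. If the page builds the table with JavaScript instead, this will not be enough, and the request the browser makes would have to be found in its network tab.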