2017-10-07

I always get None when reading tags with BeautifulSoup

I want to read some names and IDs, such as:

<a class="inst" href="loader.aspx?ParTree=151311&amp;i=3823243780502959" 
target="3823243780502959">رتكو</a> 

i = 3823243780502959 

and so on, from tsetmc.com. Here is my code:

import requests 
from bs4 import BeautifulSoup 
url = 'http://www.tsetmc.com/Loader.aspx?ParTree=15131F' 
page = requests.get(url) 
soup = BeautifulSoup(page.content , 'html.parser') 
first_names_Id = soup.find_all('a',class_='isnt') 
print (first_names_Id) 

But it returns None.

How can I read these tags? I have the same problem with other tags.

Answer

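I used Selenium instead of requests to load the site you want to parse, and it gave me the result you are after.

I believe the reason the requests library does not return the same HTML as Selenium is that the site you want to parse is rendered with JavaScript. requests only downloads the raw response and never runs scripts, so tags that are created in the browser are missing from it.

Also note that you have a typo in the class attribute value: it should be 'inst', not 'isnt'.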
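To see the effect of the typo on its own, here is a minimal sketch that parses the single tag quoted in the question from a hard-coded string (so no network or browser is involved). The variable names `html`, `wrong` and `right` are only for illustration. With the misspelled class, `find_all` finds nothing and gives back an empty result; with the correct class it finds the tag.

```python
from bs4 import BeautifulSoup

# The tag from the question, as a static string (illustrative input only)
html = ('<a class="inst" href="loader.aspx?ParTree=151311&amp;i=3823243780502959" '
        'target="3823243780502959">رتكو</a>')
soup = BeautifulSoup(html, 'html.parser')

wrong = soup.find_all('a', class_='isnt')  # misspelled class: no match
right = soup.find_all('a', class_='inst')  # correct class: one match

print(wrong)
# Name is the link text; the ID is also available in the target attribute
print([(tag.get_text(), tag['target']) for tag in right])
```

Even with the typo fixed, the requests version may still come back empty on the live page if the tags are only created by JavaScript, which is why the code below uses Selenium.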

Code:

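from selenium import webdriver
from bs4 import BeautifulSoup

# Selenium drives a real browser, so the page's JavaScript runs before the HTML is read
driver = webdriver.Chrome()
url = 'http://www.tsetmc.com/Loader.aspx?ParTree=15131F'
driver.get(url)
soup = BeautifulSoup(driver.page_source, 'html.parser')
driver.quit()  # close the browser once the rendered HTML has been captured
# find_all is the current name for the older findAll alias
first_names_Id = soup.find_all('a', {'class': 'inst'})
print(first_names_Id)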

Output:

[<a class="inst" href="loader.aspx?ParTree=151311&amp;i=33541897671561960" target="33541897671561960">واتي</a>, <a class="inst" href="loader.aspx?ParTree=151311&amp;i=33541897671561960" target="33541897671561960">سرمايه‌ گذاري‌ آتيه‌ دماوند</a>, <a class="inst" href="loader.aspx?ParTree=151311&amp;i=9093654036027968" target="9093654036027968">طپنا7002</a>, <a class="inst" href="loader.aspx?ParTree=151311&amp;i=9093654036027968" target="9093654036027968">اختيارف رمپنا-7840-19/07/1396</a>, <a class="inst" href="loader.aspx?ParTree=151311&amp;i=19004627894176375" target="19004627894176375">طپنا7003</a>, <a class="inst" href="loader.aspx?ParTree=151311&amp;i=19004627894176375" target="19004627894176375">اختيارف رمپنا-8340-19/07/1396</a>, **etc**] 
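Since the question asks for the IDs, one possible way to get them is to read the `i` query parameter out of each tag's `href`. The sketch below uses only the standard library; `instrument_id` is a hypothetical helper name, and the two `hrefs` are copied from the output above in the form BeautifulSoup returns them (with `&amp;` already decoded to `&`).

```python
from urllib.parse import urlparse, parse_qs

def instrument_id(href):
    """Return the value of the `i` query parameter in href, or None if it is absent."""
    query = urlparse(href).query
    values = parse_qs(query).get('i')
    return values[0] if values else None

# Two hrefs taken from the output above (illustrative input only)
hrefs = [
    'loader.aspx?ParTree=151311&i=33541897671561960',
    'loader.aspx?ParTree=151311&i=9093654036027968',
]
ids = [instrument_id(href) for href in hrefs]
print(ids)
```

With the real result list, the same helper could be applied as `instrument_id(tag['href'])` for each tag in `first_names_Id`, and `tag.get_text()` gives the matching name. In this output the `target` attribute carries the same number, so reading `tag['target']` is a simpler alternative.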