Extracting a value using Beautiful Soup

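I want to get a stock price from the website http://www.bseindia.com/. For example, the price is displayed as "S&P BSE: 25,489.57", and I want to extract just the numeric part, "25489.57", using Beautiful Soup.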


This is the code I have written so far. It extracts the whole div in which the number appears, but not the number itself.

Here is the code:

from bs4 import BeautifulSoup
from urllib.request import urlopen

page = "http://www.bseindia.com"

html_page = urlopen(page)
html_text = html_page.read()
soup = BeautifulSoup(html_text, "html.parser")

# Walk down the nested divs to the element that should hold the value
divtag = soup.find_all("div", {"class": "sensexquotearea"})
for oye in divtag:
    tdidTags = oye.find_all("div", {"class": "sensexvalue2"})
    for tag in tdidTags:
        tdTags = tag.find_all("div", {"class": "newsensexvaluearea"})
        for newtag in tdTags:
            tdnewtags = newtag.find_all("div", {"class": "sensextext"})
            for rakesh in tdnewtags:
                tdtdsp1 = rakesh.find_all("div", {"id": "tdsp"})
                for texts in tdtdsp1:
                    print(texts)  # prints the whole div, not just the number


2016-05-15 RakeshKirola

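It looks to me like the number you are looking for is not served statically (it is not a string in the source HTML). – roadrunner66

Thanks for the reply. Is there any way to accomplish this? – RakeshKirola

Yes, it can absolutely be done. I'll post an answer in a second showing you how to do what the JavaScript on that page is doing. – Keatinge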

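I took a look at what the JavaScript does when the page loads its information, and I was able to simulate it in Python.

I found that it references a page called IndexMovers.aspx?ln=en (http://www.bseindia.com/Msource/IndexMovers.aspx?ln=en).

That page appears to be a comma-separated list of things. First comes the name, next the price, and then some other things you don't care about.

To simulate this in Python, we request the page, split it on commas, then read every 6th value in the list and add that value and the one after it to a new list called stockInformation.

Now we can just loop through stockInformation and get the name with item[0] and the price with item[1]: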

import requests

newUrl = "http://www.bseindia.com/Msource/IndexMovers.aspx?ln=en"
response = requests.get(newUrl).text
commaItems = response.split(",")

# Create a list of stocks, each one containing information:
# index 0 is the name, index 1 is the price.
# The last item is not included because for some reason it has no price info on the IndexMovers page.
stockInformation = []
for i, item in enumerate(commaItems[:-1]):
    if i % 6 == 0:
        newList = [item, commaItems[i + 1]]
        stockInformation.append(newList)

# Print each item and its price from the list
for item in stockInformation:
    print(item[0], "has a price of", item[1])

This prints out:

S&P BSE SENSEX has a price of 25489.57 
SENSEX#S&P BSE 100 has a price of 7944.50 
BSE-100#S&P BSE 200 has a price of 3315.87 
BSE-200#S&P BSE MidCap has a price of 11156.07 
MIDCAP#S&P BSE SmallCap has a price of 11113.30 
SMLCAP#S&P BSE 500 has a price of 10399.54 
BSE-500#S&P BSE GREENEX has a price of 2234.30 
GREENX#S&P BSE CARBONEX has a price of 1283.85 
CARBON#S&P BSE India Infrastructure Index has a price of 152.35 
INFRA#S&P BSE CPSE has a price of 1190.25 
CPSE#S&P BSE IPO has a price of 3038.32 
#and many more... (total of 40 items)

The first value is clearly equivalent to the one displayed on the site.
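Note that most names in the output above carry a prefix such as "SENSEX#", which suggests the feed separates records with "#" and that each record ends with a short code. The following is only a sketch under that assumption: it splits on "#" first, then on commas, and converts each price to a float. The helper names `fetch_index_text` and `parse_index_prices` are hypothetical, the "x" fields in the sample stand in for values the answer does not show, and the fetch uses the standard library's `urlopen` rather than `requests`.

```python
from urllib.request import urlopen

# URL taken from the answer above; the endpoint may have changed since then.
INDEX_URL = "http://www.bseindia.com/Msource/IndexMovers.aspx?ln=en"


def fetch_index_text(url=INDEX_URL, timeout=10):
    """Download the raw feed text (hypothetical helper, not called here)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def parse_index_prices(text):
    """Map each index name to its price as a float.

    Assumes records are separated by '#' and that, within a record,
    field 0 is the name and field 1 is the price.
    """
    prices = {}
    for record in text.split("#"):
        fields = record.split(",")
        if len(fields) < 2:
            continue  # e.g. a trailing fragment with no price
        name = fields[0].strip()
        try:
            prices[name] = float(fields[1])
        except ValueError:
            continue  # skip records whose price field is not numeric
    return prices


# Two records built from the output above; "x" marks fields whose
# real contents are not shown in the answer.
sample = (
    "S&P BSE SENSEX,25489.57,x,x,x,x,SENSEX#"
    "S&P BSE 100,7944.50,x,x,x,x,BSE-100"
)
prices = parse_index_prices(sample)
print(prices["S&P BSE SENSEX"])  # 25489.57
```

Against the live page, something like `parse_index_prices(fetch_index_text())["S&P BSE SENSEX"]` would give the number the question asks for, provided the feed still has this layout.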