2017-10-05 67 views
2

Load web scraping results into a pandas DataFrame

I have the following code:

import urllib.request
import bs4 as bs

# Download the listing page and parse the HTML
sauce = urllib.request.urlopen('https://www.iproperty.com.my/sale/selangor/all-commercial/?q=UOA%20Business%20Park').read()
soup = bs.BeautifulSoup(sauce, 'html.parser')

# Collect the price elements and the built-up area (BUA) elements
price = soup.find_all('ul', class_='listing-primary-price jMWEse')
BUA = soup.find_all('li', class_='attributes-price-per-unit-size-item builtUp-attr fsbnan')

# Print the text of each element without overwriting the result lists
for data in price:
    print(data.text)

for data in BUA:
    print(data.text)

Printing price and BUA gives me the following results:

Price: 
RM 1,067,490 
RM 2,246,160 
RM 929,160 
RM 1,321,000 
RM 103,840,000 

BUA: 
Built-up : 1,227 sq. ft.Built-up : 1,227 sq. ft. 
Built-up : 2,292 sq. ft.Built-up : 2,292 sq. ft. 
Built-up : 1,044 sq. ft.Built-up : 1,044 sq. ft. 
Built-up : 1,335 sq. ft.Built-up : 1,335 sq. ft. 
Built-up : 118,000 sq. ft.Built-up : 118,000 sq. ft. 

My question is: how can I load price and BUA into a pandas DataFrame? I want to combine the two so that the final printed result looks like this:

   Price            BUA
0  RM 1,067,490     Built-up : 1,227 sq. ft.Built-up : 1,227 sq. ft.
1  RM 2,246,160     Built-up : 2,292 sq. ft.Built-up : 2,292 sq. ft.
2  RM 929,160       Built-up : 1,044 sq. ft.Built-up : 1,044 sq. ft.
3  RM 1,321,000     Built-up : 1,335 sq. ft.Built-up : 1,335 sq. ft.
4  RM 103,840,000   Built-up : 118,000 sq. ft.Built-up : 118,000 sq. ft.

Another reason I want to put them into a pandas DataFrame is that I need to do some calculations in Excel later.

Answers

1

I believe you need:

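import pandas as pd

# Extract the text of every scraped element into plain lists
a = [data.text for data in price]
b = [data.text for data in BUA]

# Build the DataFrame with an explicit column order
df = pd.DataFrame({'price': a, 'BUA': b}, columns=['price', 'BUA'])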
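Since the goal is to do calculations later, it may help to convert the scraped strings to numbers before building the DataFrame. The following is only a minimal sketch, assuming the text always looks like the sample output in the question. `parse_price` and `parse_bua` are hypothetical helper names and not part of pandas or BeautifulSoup.

```python
import re

def parse_price(text):
    """Turn a string such as 'RM 1,067,490' into the integer 1067490."""
    digits = re.sub(r'\D', '', text)  # drop everything that is not a digit
    if not digits:
        raise ValueError('no digits found in %r' % text)
    return int(digits)

def parse_bua(text):
    """Return the first number in 'Built-up : 1,227 sq. ft.Built-up : ...'."""
    match = re.search(r'\d[\d,]*', text)  # first run of digits and commas
    if match is None:
        raise ValueError('no number found in %r' % text)
    return int(match.group().replace(',', ''))

# Sample strings copied from the output shown in the question
print(parse_price('RM 1,067,490 '))
print(parse_bua('Built-up : 1,227 sq. ft.Built-up : 1,227 sq. ft. '))
```

The cleaned values could then replace `a` and `b`, for example `[parse_price(t) for t in a]`, so that the columns hold integers rather than text.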

Works great! Thanks!

0
df = pd.DataFrame()
df['price'] = [data.text for data in price]
df['bua'] = [data.text for data in BUA]
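Both approaches assume the two result lists have the same length, which may not hold if a listing lacks a price or a built-up value. Below is a rough sketch of a guarded version, using sample strings copied from the question instead of a live request. `build_frame` and the file name `listings.xlsx` are made up for illustration.

```python
import pandas as pd

# Sample strings copied from the question's output; in practice these would
# come from [data.text for data in price] and [data.text for data in BUA].
price_text = ['RM 1,067,490 ', 'RM 2,246,160 ']
bua_text = ['Built-up : 1,227 sq. ft.Built-up : 1,227 sq. ft. ',
            'Built-up : 2,292 sq. ft.Built-up : 2,292 sq. ft. ']

def build_frame(prices, areas):
    """Build a two-column DataFrame, failing early if the lengths differ."""
    if len(prices) != len(areas):
        raise ValueError('price and BUA lists have different lengths')
    return pd.DataFrame({'price': [p.strip() for p in prices],
                         'BUA': [a.strip() for a in areas]})

df = build_frame(price_text, bua_text)
print(df)

# Writing an .xlsx file needs an Excel engine such as openpyxl installed:
# df.to_excel('listings.xlsx', index=False)
```

Checking the lengths first gives a clearer error than the one pandas raises, and stripping the trailing whitespace keeps the cells tidy when the file is opened in Excel.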