2010-10-03 74 views

回答

0

我会建议使用这两种机械化和etree,但我不是程序员,所以不要把我的话。注意:所有代码都在python中,并且是版本2.7.1,但应该可以达到2.7.3。

希望我能帮助 - 只是另一个傻瓜

import mechanize 
import lxml.etree as etree 

url = 'something' 

br = mechanize.Browser() 
resp = br.open(url) 
parser = etree.parser() 
tree = etree.parse(resp,parser) 
forms = list(br.forms()) 
id_info = {} 
for form in forms: 
    elements = form.controls 
    for element in elements: 
     id_info[element.attrs['id']]='' 

inputs = tree.findall('.//input') 
for i in inputs: 
    index = list(i.getparent()).index(i) 
    id_info[i.attrib['id']] = list(i.getparent)[index+1] 

for j in id_info: 
    print j,id_info(j)