2017-04-18

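Python: How do I select adjacent elements?

I want to scrape https://en.wikiquote.org/wiki/Remember_the_Titans#Coach_Boone and collect the quotes from every section except Dialogue, Taglines, and External links. I can select ul > li, but that returns everything on the page. How can I fetch only the ul > li elements that follow the HTML below?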

<h2><span class="mw-headline" id="Coach_Boone">Coach Boone</span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a href="/w/index.php?title=Remember_the_Titans&amp;action=edit&amp;section=1" title="Edit section: Coach Boone">edit</a><span class="mw-editsection-bracket">]</span></span></h2> 

Answer

Once you have located the h2 element, use the .find_next_siblings() method to get the ul sibling elements that follow it:

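# `soup` is assumed to be a BeautifulSoup object built from the page.
# Locate the section heading span, then climb to its enclosing <h2>.
h2 = soup.find("span", id="Coach_Boone").find_parent("h2")
# Walk every <ul> that follows the heading at the same nesting level.
for ul in h2.find_next_siblings("ul"):
    for li in ul.find_all("li"):
        print(li)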

The only problem I found is that it also picks up the 'ul' elements of the following sections. – Volatil3
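
One possible way to keep the results inside a single section is to walk the siblings after the h2 and stop at the next h2. The following is a minimal sketch, not part of the original answer: the helper name `section_quotes` and the `SAMPLE_HTML` fragment are hypothetical, and it assumes the section headings are sibling h2 elements laid out like the snippet in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment that mimics the layout quoted in the question.
SAMPLE_HTML = """
<div>
<h2><span class="mw-headline" id="Coach_Boone">Coach Boone</span></h2>
<ul><li>Quote A</li></ul>
<ul><li>Quote B</li></ul>
<h2><span class="mw-headline" id="Dialogue">Dialogue</span></h2>
<ul><li>Quote C</li></ul>
</div>
"""


def section_quotes(soup, section_id):
    """Collect <li> text between the section's <h2> and the next <h2>.

    Hypothetical helper: assumes every section heading is an <h2>
    that is a sibling of the <ul> elements holding the quotes.
    """
    h2 = soup.find("span", id=section_id).find_parent("h2")
    quotes = []
    # With no arguments, find_next_siblings() returns the following tags.
    for sibling in h2.find_next_siblings():
        if sibling.name == "h2":
            break  # reached the next section heading, so stop here
        if sibling.name == "ul":
            for li in sibling.find_all("li"):
                quotes.append(li.get_text(" ", strip=True))
    return quotes


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
quotes = section_quotes(soup, "Coach_Boone")
print(quotes)
```

Calling the helper once per wanted section id, and skipping Dialogue, Taglines, and External links, would then cover the case described in the question, provided the live page really uses this sibling layout.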