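I'm trying to scrape quotes from Goodreads. I need only the quote itself, not the author's name. In other words, I want the text of a node, excluding the text of its last child.
Here is the HTML source: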
<div class="quoteText">
“Don't cry because it's over, smile because it happened.”
<br> ―
<a class="authorOrTitle" href="/author/show/61105.Dr_Seuss">Dr. Seuss</a>
</div>
I tried the following, but the result includes the author information:
quotes = [quote.text.strip() for quote in soup.findAll('div', {'class':'quoteText'})]
I also tried contents[0], but that returns only the first line, so it fails for multi-line quotes such as this one:
<div class="quoteText">
“You've gotta dance like there's nobody watching,
<br>
Love like you'll never be hurt,
<br>
Sing like there's nobody listening,
<br>
And live like it's heaven on earth.”
<br> ―
<a class="authorOrTitle" href="/author/show/1744830.William_W_Purkey">William W. Purkey</a>
</div>
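One possible approach, sketched here under assumptions rather than taken from the original thread: collect only the direct text nodes of each `div.quoteText`, which skips text nested inside child tags such as the author link. The helper name `quote_only` is hypothetical, and the sketch assumes the separator before the author is always the lone `―` character shown in the markup above.

```python
from bs4 import BeautifulSoup

html = """
<div class="quoteText">
“You've gotta dance like there's nobody watching,
<br>
Love like you'll never be hurt,
<br>
Sing like there's nobody listening,
<br>
And live like it's heaven on earth.”
<br> ―
<a class="authorOrTitle" href="/author/show/1744830.William_W_Purkey">William W. Purkey</a>
</div>
"""

def quote_only(div):
    """Hypothetical helper: join the div's own text nodes, skipping child tags."""
    # recursive=False keeps only direct children, so the text inside the
    # author <a> tag is never collected.
    parts = [s.strip() for s in div.find_all(string=True, recursive=False)]
    # Drop empty pieces and the lone dash that precedes the author link
    # (assumption: the separator is always the single character "―").
    parts = [p for p in parts if p and p != "―"]
    return " ".join(parts)

soup = BeautifulSoup(html, "html.parser")
quotes = [quote_only(div) for div in soup.find_all("div", {"class": "quoteText"})]
print(quotes)
```

Joining with a space flattens the line breaks; joining with `"\n"` instead would keep one line per `<br>`.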
Oh, yes, replace would do it. Strange that it didn't cross my mind.
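The suggestion this comment replies to is not shown here, so the following is only a guess at what a replace-based version might look like: read the author's name from the `authorOrTitle` link, remove it from the full text with `str.replace`, and trim the leftover dash. It assumes the author's name does not also appear inside the quote itself, since `replace` would remove that occurrence too.

```python
from bs4 import BeautifulSoup

html = """<div class="quoteText">
“Don't cry because it's over, smile because it happened.”
<br> ―
<a class="authorOrTitle" href="/author/show/61105.Dr_Seuss">Dr. Seuss</a>
</div>"""

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", {"class": "quoteText"})

# Text of the author link, e.g. "Dr. Seuss".
author = div.find("a", {"class": "authorOrTitle"}).text

# Remove the author's name, then trim whitespace and the trailing "―".
quote = div.text.replace(author, "").strip().rstrip("―").strip()
print(quote)
```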