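Python: Index out of range error

The following code keeps giving me IndexError: list index out of range on line 012:
print (aTweet + '~' + timeSource[x] + '~' + keyWord[i])
Is this related to the keyWord[i] term? I understand that "index out of range" usually means an index was supplied for which no list element exists. Does that mean the error might actually lie in this section: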
if (len(splitSource) > 20):
    max_range = 19
else:
    max_range = len(splitSource)
Code for reference:
import re
from re import sub
import time
import cookielib
from cookielib import CookieJar
import urllib2
from urllib2 import urlopen
import difflib
import sys

cj = CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.addheaders = [('User-agent', 'Mozilla/5.0')]

keyWord = ["Scotch"]

def main():
    i=0
    while i<len(keyWord):
        startingLink = 'https://twitter.com/search/realtime?q='+keyWord[i]
        tUrl = startingLink+'&src=hash'
        oldTwit = []
        newTwit = []
        howSimAr = [.5,.5,.5,.5,.5]
        sourceCode = opener.open(tUrl).read()
        splitSource = re.findall(r'<p class="js-tweet-text tweet-text">(.*?)</p>',sourceCode)
        timeSource = re.findall(r'js-nav" title="(.*?)"',sourceCode)
        if (len(splitSource) > 20):
            max_range = 19
        else:
            max_range = len(splitSource)
        print ''
        print ''
        print ''
        ##print 'Keyword: ' + keyWord[i]
        print ''
        for x in range (0, max_range):
            aTweet = re.sub(r'<.*?>','',splitSource[x])
            print (aTweet + '~' + timeSource[x] + '~' + keyWord[i])
            #print ';'
            newTwit.append(aTweet)
        ## comparison = difflib.SequenceMatcher(None, newTwit, oldTwit)
        ## howSim = comparison.ratio()
        ## print ';'
        ## print 'This selection is',howSim,'similar to the past'
        ## howSimAr.append(howSim)
        ## howSimAr.remove(howSimAr[0])
        ##
        ## waitMultiplier = reduce(lambda x, y: x+y, howSimAr)/len(howSimAr)
        ##
        ## print ''
        ## print 'The current similarity array:',howSimAr
        ## print 'Our current Multiplier:', waitMultiplier
        oldTwit = [None]
        for eachItem in newTwit:
            oldTwit.append(eachItem)
        newTwit = [None]
        time.sleep(2)
        x = 0
        i = i + 1
    ## except Exception, e:
    ## print str(e)
    ## print 'errored in the main try'

main()
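You are indexing 'timeSource' with 'x', but the range of 'x' is determined by the length of 'splitSource' (via 'max_range'). If 'splitSource' is longer than 'timeSource' (contains more elements), this will not work: timeSource[x] raises the IndexError once 'x' reaches the end of 'timeSource'. – 2014-09-12 15:05:15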
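A minimal sketch of the point in the comment above, assuming the goal is simply to print each tweet next to a timestamp at the same position. The helper name pair_tweets_with_times, the limit parameter, and the sample data are made up for illustration and are not from the question. It relies on zip(), which stops at the shorter of its inputs, so neither list is indexed past its end. Whether the n-th timestamp really belongs to the n-th tweet is a separate question this sketch does not settle.

```python
import re

def pair_tweets_with_times(split_source, time_source, limit=20):
    """Pair each tweet with the timestamp at the same position.

    Hypothetical helper: 'limit' caps the number of tweets (an assumption;
    the question caps the loop with max_range instead).
    """
    pairs = []
    # zip() stops at the shorter input, so no index can run out of range.
    for raw_tweet, timestamp in zip(split_source[:limit], time_source[:limit]):
        tweet = re.sub(r'<.*?>', '', raw_tweet)  # strip HTML tags, as in the question
        pairs.append((tweet, timestamp))
    return pairs

# Made-up sample data: three tweets but only two timestamps.
sample_tweets = ['<b>Scotch</b> night', 'More <i>Scotch</i>', 'No timestamp here']
sample_times = ['3:05 PM - 12 Sep 2014', '3:06 PM - 12 Sep 2014']

pairs = pair_tweets_with_times(sample_tweets, sample_times)
for tweet, timestamp in pairs:
    print(tweet + '~' + timestamp + '~' + 'Scotch')
```

With the sample data only two lines are printed; the third tweet is skipped rather than raising IndexError. The sketch runs on Python 2.7 and Python 3 alike, since print is called with a single parenthesized argument.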
@Tom That makes sense. Would it be better to create another variable? – 2014-09-12 15:09:36
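It's not clear to me what the relationship between 'splitSource' and 'timeSource' is, or what your code is trying to do. They both seem to relate to tweets, but I don't know what data you expect. For example, when you search for the keyword "Scotch", how many items do you expect in 'splitSource', and how many in 'timeSource'? – 2014-09-12 15:19:25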
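One way to answer the question about expected counts is to look at both lengths before the loop runs. The following is only a sketch: safe_range and its cap parameter are hypothetical names, and the cap of 20 is an assumption (the question uses 19 when more than 20 tweets are found).

```python
def safe_range(split_source, time_source, cap=20):
    """Return both list lengths and a loop bound that is valid for both lists."""
    n_tweets = len(split_source)
    n_times = len(time_source)
    # The bound must not exceed either list, nor the (assumed) cap.
    bound = min(n_tweets, n_times, cap)
    return n_tweets, n_times, bound

# Made-up example: three tweets matched, but only two timestamps.
n_tweets, n_times, bound = safe_range(['a', 'b', 'c'], ['t1', 't2'])
print('tweets: %d, timestamps: %d, safe bound: %d' % (n_tweets, n_times, bound))
```

If the two counts differ on the real page, the two regular expressions are not matching one-to-one, and the loop bound should come from the smaller count (or the patterns should be revisited).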