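I have a list of paragraphs, and I want to plot a Zipf distribution over their combined text. I am constructing the Zipf plot with matplotlib and trying to draw a fitted line.
My code is as follows: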
from itertools import *
from pylab import *
from collections import Counter
import matplotlib.pyplot as plt

# Combine all paragraphs into a single string
paragraphs = " ".join(targeted_paragraphs)
for paragraph in paragraphs:
    frequency = Counter(paragraph.split())
counts = array(frequency.values())
tokens = frequency.keys()

# Rank tokens from most to least frequent
ranks = arange(1, len(counts)+1)
indices = argsort(-counts)
frequencies = counts[indices]

# Rank vs. frequency on log-log axes
loglog(ranks, frequencies, marker=".")
title("Zipf plot for Combined Article Paragraphs")
xlabel("Frequency Rank of Token")
ylabel("Absolute Frequency of Token")
grid(True)

# Label a few log-spaced points with their tokens
for n in list(logspace(-0.5, log10(len(counts)-1), 20).astype(int)):
    dummy = text(ranks[n], frequencies[n], " " + tokens[indices[n]],
                 verticalalignment="bottom",
                 horizontalalignment="left")
At first I also got the following error, and I don't know why:
IndexError: index 1 is out of bounds for axis 0 with size 1
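One plausible reading of this error, assuming `targeted_paragraphs` is a list of strings: after the join, `paragraphs` is a single string, so the loop walks over individual characters and `frequency` ends up describing only the last one. The sketch below counts tokens once over the whole text instead. The sample data is hypothetical, and NumPy is imported directly rather than through `pylab`.

```python
from collections import Counter
import numpy as np

# Hypothetical sample data standing in for targeted_paragraphs.
targeted_paragraphs = ["the cat sat on the mat", "the dog sat on the log"]

text_all = " ".join(targeted_paragraphs)

# Iterating over a string yields characters, so a Counter built from the
# last loop item describes a single character only.
last_char_counter = Counter(text_all[-1].split())

# Counting once over the whole string gives per-token frequencies.
frequency = Counter(text_all.split())
tokens = list(frequency.keys())              # list() makes it indexable on Python 3
counts = np.array(list(frequency.values()))  # list() is also needed here on Python 3

indices = np.argsort(-counts)                # most frequent token first
frequencies = counts[indices]
ranks = np.arange(1, len(counts) + 1)
```

With more than one distinct token in `counts`, indexing `ranks[n]` and `frequencies[n]` for `n >= 1` no longer runs past the end of a length-1 array.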
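Goal: I am trying to draw a fitted line and assign its values to a variable, but I don't know how to add that. Any help with these two problems would be greatly appreciated.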
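One common way to get such a line, sketched here under the assumption that a least-squares straight line in log-log space is an acceptable fit, is `numpy.polyfit` on the logarithms of rank and frequency. The data below is hypothetical (an exact power law) so that the result is easy to check; it stands in for the `ranks` and `frequencies` arrays from the code above.

```python
import numpy as np

# Hypothetical rank/frequency data following an exact power law f = 100 / r.
ranks = np.arange(1, 11)
frequencies = 100.0 / ranks

# Fit a degree-1 polynomial to log10(frequency) as a function of log10(rank).
slope, intercept = np.polyfit(np.log10(ranks), np.log10(frequencies), 1)

# Values of the fitted line at each rank, converted back to the original scale.
fitted = 10 ** (intercept + slope * np.log10(ranks))
```

The array `fitted` holds the line's values, and passing `ranks` and `fitted` to `loglog` draws it over the existing plot. For an ideal Zipf distribution the slope would be close to -1.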
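It is no longer clear how the answer below relates to this question; please restore the code in the post to its original state so that future readers can see the original problem and its solution. – dshort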