2016-02-23 27 views
2

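I am building a frequency table in Python with pandas, using lineSer.value_counts(), but the output doesn't show all of my items. I have more than 100 data points and I need to see all of them.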

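import pandas as pd 

def freqTable(): 
    fileIn = open('data.txt', 'r') 
    fileOut = open('dataOut.txt', 'w') 
    # keep non-empty lines that do not start with 'com' 
    lines = [line.strip() for line in fileIn if line.strip() and not line.startswith('com')] 
    lineSer = pd.Series(lines) 
    # str() follows the pandas display settings, so long output is shortened 
    freq = str(lineSer.value_counts()) 
    fileOut.write(freq) 
    fileIn.close() 
    fileOut.close() 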

This is the code I'm using. I need the result to get rid of the "..." and show all of the data points. What can I do differently?

Madding.  57 
Crowning. 47 
My.   8 
And.   8 
Thy.   7 
Thou.   7 
The.   5 
To.   5 
For.   5 
I.   4 
That.   4 
In.   4 
Love.   4 
Is.   3 
Not.   3 
... 
Did.   1 
Shadows.  1 
Of.   1 
Mind,.  1 
O'erlook.  1 
Sometime.  1 
Fairer.  1 
Monsters,. 1 
23.   1 
Defect,.  1 
Show,.  1 
What's.  1 
Wood.   1 
So.   1 
Lov'st,.  1 
Length: 133, dtype: int64 

Answers

0

Try this:

pd.options.display.max_rows = 999 
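
To see what this setting changes, here is a minimal sketch. The 133-row Series is made-up stand-in data (chosen only to match the "Length: 133" in the question), and the variable names are illustrative. It assumes pandas' default of 60 for display.max_rows.

```python
import pandas as pd

# Hypothetical stand-in data: 133 distinct labels, not the asker's file
counts = pd.Series(range(133), index=["word%d" % i for i in range(133)])

pd.options.display.max_rows = 60      # pandas' default limit
truncated = str(counts)               # middle rows are elided

pd.options.display.max_rows = 999     # larger than the Series
full = str(counts)                    # every row is rendered

pd.reset_option("display.max_rows")   # put the default back
```

Because the option is global, it stays in effect for everything printed afterwards until it is reset.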
3

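If you want to write the table to a file, don't turn it into a string and then write that string out. Pandas has built-in functions for writing to files. Just do lineSer.value_counts().to_csv('dataOut.txt'). If you want to adjust the format of the output, read the to_csv documentation to see how to customize it. (You could probably also read the data more efficiently with something like pandas.read_csv, but that's another topic.)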
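
As a rough sketch of that suggestion: the sample lines below are invented, and an in-memory buffer stands in for 'dataOut.txt' so the snippet runs without touching the disk. The header=False argument is my addition so that only the label and count pairs are written.

```python
import io
import pandas as pd

# Hypothetical sample lines standing in for the contents of data.txt
lines = ["Madding.", "Madding.", "Crowning.", "Madding.", "Crowning.", "My."]

freq = pd.Series(lines).value_counts()

# to_csv accepts a path or a file-like object; it writes every row,
# regardless of the display.max_rows setting
buffer = io.StringIO()
freq.to_csv(buffer, header=False)
csv_text = buffer.getvalue()
```

Passing a file name such as 'dataOut.txt' instead of the buffer writes the same text to disk.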

1

If you only need to display the data temporarily, try option_context with display.max_rows:

#temporarily print up to 999 rows 
with pd.option_context('display.max_rows', 999): 
    print(freq) 

More information is in the docs.
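
The "temporary" part can be checked with a small sketch like the one below. The variable names are illustrative only.

```python
import pandas as pd

before = pd.get_option("display.max_rows")

with pd.option_context("display.max_rows", 999):
    inside = pd.get_option("display.max_rows")   # 999 while in the block

after = pd.get_option("display.max_rows")        # back to the previous value
```

Unlike assigning to pd.options.display.max_rows, this leaves the global setting unchanged once the with block ends.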

I tried modifying your solution to use the str.strip and str.startswith functions, which work on string data, and to_csv to write the output file:

import pandas as pd 
import io 

temp=u"""Madding. 
Madding. 
    Madding. 
Madding. 
Crowning. 
    Crowning. 
com Crowning. 
com My. 
    com And. 
    Thy. 
Thou. 
The.""" 
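#after testing, replace io.StringIO(temp) with 'data.txt' 
#the first line is read as the header, so it becomes the Series name 
#squeeze=True was removed in pandas 2.0; .squeeze("columns") does the same 
s = pd.read_csv(io.StringIO(temp), sep="|").squeeze("columns") 
print(s) 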
0   Madding. 
1   Madding. 
2   Madding. 
3   Crowning. 
4   Crowning. 
5  com Crowning. 
6   com My. 
7   com And. 
8    Thy. 
9    Thou. 
10    The. 
Name: Madding., dtype: object 

#strip data 
s = s.str.strip() 
#flag the rows that start with 'com' 
print(s.str.startswith('com')) 
0  False 
1  False 
2  False 
3  False 
4  False 
5  True 
6  True 
7  True 
8  False 
9  False 
10 False 
Name: Madding., dtype: bool 

#keep only the rows that do not start with 'com' 
s = s[~s.str.startswith('com')] 
print(s) 
0  Madding. 
1  Madding. 
2  Madding. 
3  Crowning. 
4  Crowning. 
8   Thy. 
9   Thou. 
10   The. 
Name: Madding., dtype: object 

#count freq 
freq = s.value_counts() 
#temporarily print up to 999 rows 
with pd.option_context('display.max_rows', 999): 
    print(freq) 
Madding.  3 
Crowning. 2 
Thou.  1 
Thy.   1 
The.   1 
Name: Madding., dtype: int64 

#write series to file by to_csv 
freq.to_csv('dataOut.txt', sep=';')
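
Putting the steps above together, a minimal sketch of the whole task might look like this. The function name freq_table, the temporary file paths and the sample text are all assumptions made for illustration. It reads the file with plain Python rather than read_csv, so the first line is not consumed as a header.

```python
import os
import tempfile
import pandas as pd

def freq_table(path_in, path_out):
    """Count each non-empty line, skipping lines that start with 'com'."""
    with open(path_in) as file_in:
        lines = [line.strip() for line in file_in]
    s = pd.Series(lines, dtype="object")
    # drop blank lines and lines starting with 'com' (checked after stripping)
    s = s[(s != "") & ~s.str.startswith("com")]
    freq = s.value_counts()
    # to_csv writes all rows; header=False keeps only "label;count" lines
    freq.to_csv(path_out, sep=";", header=False)
    return freq

# Usage with throwaway files in a temporary directory
workdir = tempfile.mkdtemp()
path_in = os.path.join(workdir, "data.txt")
path_out = os.path.join(workdir, "dataOut.txt")
with open(path_in, "w") as f:
    f.write("Madding.\n  Madding.\ncom My.\n\nCrowning.\n")

freq = freq_table(path_in, path_out)
with open(path_out) as f:
    written = f.read().splitlines()
```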