2017-04-03 205 views
0

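Format specific rows in Jupyter Notebook DataFrame output

I have a DataFrame displayed in a Jupyter Notebook. The DataFrame's initial dtype is float. I want to render rows 1 & 3 of the printed table as integers and rows 2 & 4 as percentages. How do I do that? (I spent a lot of time searching but did not find a working solution.)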

Here is the code I am using:

#Creating the table 
clms = sales.columns 
indx = ['# of Poeple','% of Poeple','# Purchased per Activity','% Purchased per Activity'] 
basic_stats = pd.DataFrame(data=np.nan,index=indx,columns=clms) 
basic_stats.head() 

#Calculating the # of people who took part in each activity 
for clm in sales.columns: 
    basic_stats.iloc[0][clm] = int(round(sales[sales[clm]>0][clm].count(),0)) 

#Calculating the % of people who took part in each activity from the total email list 
for clm in sales.columns: 
    basic_stats.iloc[1][clm] = round((basic_stats.iloc[0][clm]/sales['Sales'].count())*100,2) 

#Calculating the # of people who took part in each activity AND that bought the product 
for clm in sales.columns: 
    basic_stats.iloc[2][clm] = int(round(sales[(sales[clm] >0) & (sales['Sales']>0)][clm].count())) 

#Calculating the % of people who took part in each activity AND that bought the product 
for clm in sales.columns: 
    basic_stats.iloc[3][clm] = round((basic_stats.iloc[2][clm]/basic_stats.iloc[0][clm])*100,2) 

#Present the table 
basic_stats 

Here is the printed table: [image: Output table of 'basic_stats' data frame in Jupyter Notebook]

+1

Possible duplicate of [How to display pandas DataFrame of floats using a format string for columns?](http://stackoverflow.com/questions/20937538/how-to-display-pandas-dataframe-of-floats-using-a-format-string-for-columns) – IanS

+0

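I would suggest transposing your table, applying the solution from the suggested duplicate, and then transposing it back to display it. – IanS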

Answers

0

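Integer representation

You are already assigning integers to the cells of rows 1 and 3. They are printed as floats because every column has the dtype float64, which is a consequence of how you initially created the DataFrame. You can check the dtypes by printing the .dtypes attribute: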

basic_stats = pd.DataFrame(data=np.nan,index=indx,columns=clms) 
print(basic_stats.dtypes) 

# Prints: 
# column1 float64 
# column2 float64 
# ... 
# dtype: object 

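If you do not pass the data keyword argument to the DataFrame constructor, the dtype of every column will be object, which can hold any kind of Python object: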

basic_stats = pd.DataFrame(index=indx,columns=clms) 
print(basic_stats.dtypes) 

# Prints: 
# column1 object 
# column2 object 
# ... 
# dtype: object 

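When the dtype is object, each cell's content is formatted using its own built-in string conversion, so the integers are displayed as integers.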
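The difference between the two constructors can be checked on a tiny frame. This is only a sketch: the row and column labels below are made up for illustration and are not the asker's data.

```python
import numpy as np
import pandas as pd

rows = ["count", "share"]   # hypothetical row labels
cols = ["a", "b"]           # hypothetical column labels

# Filled with NaN: every column is float64, so an int is stored as a float.
float_frame = pd.DataFrame(data=np.nan, index=rows, columns=cols)
float_frame.loc["count", "a"] = 3
print(float_frame.dtypes["a"])
print(type(float_frame.loc["count", "a"]))

# No data given: every column is object, so the Python int is kept as it is.
object_frame = pd.DataFrame(index=rows, columns=cols)
object_frame.loc["count", "a"] = 3
print(object_frame.dtypes["a"])
print(type(object_frame.loc["count", "a"]))
```

In the first frame the cell comes back as a float and is shown as 3.0; in the second it stays a plain int and is shown as 3.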

Percentage representation

To display percentages, you can use a custom class that prints a float the way you want:

class PercentRepr(object): 
    """Represents a floating point number as percent""" 
    def __init__(self, float_value): 
        self.value = float_value 
    def __str__(self): 
        return "{:.2f}%".format(self.value*100) 

Then simply use this class for the values in the two percentage rows (zero-based positions 1 and 3):

#Creating the table (no data argument, so every column has dtype object) 
clms = sales.columns 
indx = ['# of Poeple','% of Poeple','# Purchased per Activity','% Purchased per Activity'] 
basic_stats = pd.DataFrame(index=indx,columns=clms) 
basic_stats.head() 

#Calculating the # of people who took part in each activity 
#.loc[row, column] writes into the frame itself; chained iloc[0][clm] may write to a copy 
for clm in sales.columns: 
    basic_stats.loc[indx[0], clm] = int(round(sales[sales[clm]>0][clm].count(),0)) 

#Calculating the % of people who took part in each activity from the total email list 
for clm in sales.columns: 
    basic_stats.loc[indx[1], clm] = PercentRepr(basic_stats.loc[indx[0], clm]/sales['Sales'].count()) 

#Calculating the # of people who took part in each activity AND that bought the product 
for clm in sales.columns: 
    basic_stats.loc[indx[2], clm] = int(round(sales[(sales[clm] >0) & (sales['Sales']>0)][clm].count())) 

#Calculating the % of people who took part in each activity AND that bought the product 
for clm in sales.columns: 
    basic_stats.loc[indx[3], clm] = PercentRepr(basic_stats.loc[indx[2], clm]/basic_stats.loc[indx[0], clm]) 

#Present the table 
basic_stats 

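Note: this actually changes the data in the DataFrame! If you want to do further processing on the two percentage rows (zero-based positions 1 and 3), be aware that they no longer contain float objects.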
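The caveat can be seen in a small sketch. The frame and its labels here are hypothetical; only the PercentRepr class is taken from this answer. The cells print as percent strings, and the underlying floats have to be unwrapped through the .value attribute before any arithmetic.

```python
import pandas as pd

class PercentRepr(object):
    """Represents a floating point number as percent"""
    def __init__(self, float_value):
        self.value = float_value
    def __str__(self):
        return "{:.2f}%".format(self.value * 100)

# Hypothetical one-row table with object dtype.
stats = pd.DataFrame(index=["% of People"], columns=["a", "b"])
stats.loc["% of People", "a"] = PercentRepr(0.25)
stats.loc["% of People", "b"] = PercentRepr(0.5)

# The printed table shows the formatted strings.
print(stats.to_string())

# The cells are PercentRepr objects, so unwrap the floats before computing.
shares = stats.loc["% of People"].map(lambda cell: cell.value).astype(float)
print(shares.mean())
```

Keeping a separate numeric copy of the table for calculations would avoid the unwrapping step altogether.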

+0

Thank you very much for this very useful solution and for the important note below it. I learned something new :) – Shahar

+0

You're very welcome :-) – Felix

0

Here is one way. It is a bit of a hack, but if it is only for pretty printing, it will work.

import numpy as np 
import pandas as pd 

# object dtype lets each column hold both strings and floats 
df = pd.DataFrame(np.random.random(20).reshape(4,5)).astype(object) 

# first and third rows display as integers 
df.loc[0,:] = (df.loc[0,:].astype(float)*100).astype(int).astype(str) 
df.loc[2,:] = (df.loc[2,:].astype(float)*100).astype(int).astype(str) 

# second and fourth rows display as percents (with 2 decimals) 
# scale first, then round, so no floating point noise is left after the multiplication 
df.loc[1,:] = np.round(df.loc[1,:].values.astype(float)*100,2) 
df.loc[3,:] = np.round(df.loc[3,:].values.astype(float)*100,2) 
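If the goal is only pretty printing, a variant of this idea is to leave the numeric frame alone and build a separate string copy for display. The following is a minimal sketch rather than the answerer's code; the table values, the labels and the row_formats mapping are made up for illustration.

```python
import pandas as pd

# Hypothetical numeric table: counts in rows 0 and 2, fractions in rows 1 and 3.
stats = pd.DataFrame(
    [[120.0, 80.0], [0.6, 0.4], [30.0, 8.0], [0.25, 0.1]],
    index=["# of People", "% of People", "# Purchased", "% Purchased"],
    columns=["Email", "Webinar"],
)

# One format function per row label.
row_formats = {
    "# of People": "{:.0f}".format,
    "% of People": "{:.2%}".format,
    "# Purchased": "{:.0f}".format,
    "% Purchased": "{:.2%}".format,
}

# Format each row separately, then transpose back to the original layout.
# `stats` itself keeps its float values.
shown = pd.DataFrame(
    {label: stats.loc[label].map(fmt) for label, fmt in row_formats.items()}
).T
print(shown)
```

Because shown contains only strings, it is meant for display; any further calculation should keep using stats.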