2014-11-06 78 views

Plotting a joint distribution

I have a pandas DataFrame that contains one integer column and one float column:

>>> df2[['AGE_REF', 'RETSURV']].dtypes 
AGE_REF  int64 
RETSURV float64 
dtype: object 

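I want to plot the joint distribution of these two columns. I did not see a simple way to visualize a joint distribution with pandas itself, but I stumbled upon seaborn. So I tried to adapt some code I had found to my purpose: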

>>> import seaborn as sns 
>>> sns.jointplot('AGE_REF', "RETSURV", df2, 
       kind="hex") 
Traceback (most recent call last): 
    File "<input>", line 2, in <module> 
    File "/usr/local/lib/python2.7/site-packages/seaborn/distributions.py", line 969, in jointplot 
    gridsize = int(np.mean([x_bins, y_bins])) 
OverflowError: cannot convert float infinity to integer 

I found a related bug report, so I tried the workaround suggested there, without success:

>>> sns.jointplot('AGE_REF', "RETSURV", df2, 
       kind="hex", marginal_kws={"bins": 10}) 
Traceback (most recent call last): 
    File "<input>", line 2, in <module> 
    File "/usr/local/lib/python2.7/site-packages/seaborn/distributions.py", line 969, in jointplot 
    gridsize = int(np.mean([x_bins, y_bins])) 
OverflowError: cannot convert float infinity to integer 

Answer


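The default hexbin gridsize is computed with the same reference rule as the histogram bins, so if your data somehow violate the assumptions of that rule, you need to set the gridsize directly as well: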

sns.jointplot(x, y, kind="hex", 
       joint_kws={"gridsize": 10}, 
       marginal_kws={"bins": 10})
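
To illustrate why a reference rule can produce an infinite bin count, here is a minimal sketch. It assumes the rule in question is the Freedman-Diaconis rule (bin width proportional to the interquartile range); the helper `fd_bin_count` and the sample arrays are hypothetical and are not seaborn's actual code. When more than half of the values are identical, the interquartile range is 0, the bin width becomes 0, and the bin count becomes infinite, which is what `int(...)` then fails on.

```python
import numpy as np


def fd_bin_count(values):
    """Freedman-Diaconis bin count with no guard against a zero IQR.

    Hypothetical helper for illustration only; not seaborn's implementation.
    """
    a = np.asarray(values, dtype=float)
    q75, q25 = np.percentile(a, [75, 25])
    # Bin width: 2 * IQR / n^(1/3)
    h = 2 * (q75 - q25) / len(a) ** (1 / 3)
    # Silence the divide-by-zero warning so the infinite result is visible
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.ceil((a.max() - a.min()) / h)


# Made-up data: 90% of the values are 0, so the IQR is 0
skewed = np.array([0.0] * 90 + [1.0, 5.0, 20.0, 100.0, 250.0,
                                400.0, 600.0, 800.0, 900.0, 1000.0])
# Made-up data: evenly spread values, so the IQR is positive
spread = np.arange(100.0)

print(fd_bin_count(skewed))  # infinite, because the bin width is 0
print(fd_bin_count(spread))  # a finite bin count

try:
    # Same kind of conversion as the failing line in the traceback
    gridsize = int(np.mean([fd_bin_count(skewed), fd_bin_count(spread)]))
except OverflowError as exc:
    print("OverflowError:", exc)
```

If a column such as RETSURV is dominated by a single value (for example, mostly zeros), this is one plausible way to end up with the error above. Passing an explicit `gridsize` sidesteps the automatic calculation entirely.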