Pandas DataFrame - generating incremental values

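In my workflow there are multiple CSVs with four columns: OID, value, count, unique_id. I am trying to figure out how to generate incremental values under the unique_id column. Using apply(), I can do df.apply(lambda x : x + 1) #where x = 0, which results in all the values under unique_id being 1. However, I am confused about how to use apply() to generate incremental values for each row in a specific column.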

# Current Dataframe 
   OID  Value  Count  unique_id
0   -1      1      5          0
1   -1      2     46          0
2   -1      3     32          0
3   -1      4      3          0
4   -1      5     17          0

# Trying to accomplish 
   OID  Value  Count  unique_id
0   -1      1      5          0
1   -1      2     46          1
2   -1      3     32          2
3   -1      4      3          3
4   -1      5     17          4

Sample code (I understand the syntax is incorrect, but it is roughly what I am trying to accomplish):

def numbers():
    for index, row in RG_Res_df.iterrows():
        return index

RG_Res_df = RG_Res_df['unique_id'].apply(numbers)

Source

2017-03-02 cptpython

You can do df['unique_id'] = np.arange(df.shape[0]) – EdChum

Don't loop; you can directly assign a NumPy array to generate the ids. Here we use np.arange and pass it the number of rows, which is df.shape[0]:

In [113]: 
df['unique_id'] = np.arange(df.shape[0]) 
df 

Out[113]: 
   OID  Value  Count  unique_id
0   -1      1      5          0
1   -1      2     46          1
2   -1      3     32          2
3   -1      4      3          3
4   -1      5     17          4
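
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch. The frame is rebuilt by hand from the sample data in the question, so the construction step is an assumption for illustration; only the np.arange assignment comes from the answer.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question (unique_id starts as all zeros).
df = pd.DataFrame({
    "OID": [-1, -1, -1, -1, -1],
    "Value": [1, 2, 3, 4, 5],
    "Count": [5, 46, 32, 3, 17],
    "unique_id": [0, 0, 0, 0, 0],
})

# np.arange(n) yields 0, 1, ..., n-1; here n is the row count, df.shape[0].
df["unique_id"] = np.arange(df.shape[0])

print(df)
```

Because the array length is derived from the frame itself, the assignment keeps working if the CSV has a different number of rows.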

Or a pure pandas method using RangeIndex. The default start is 0, so we only need to pass stop=df.shape[0]:

In [114]: 
df['unique_id'] = pd.RangeIndex(stop=df.shape[0]) 
df 

Out[114]: 
   OID  Value  Count  unique_id
0   -1      1      5          0
1   -1      2     46          1
2   -1      3     32          2
3   -1      4      3          3
4   -1      5     17          4
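
A self-contained sketch of the RangeIndex variant, again with a hand-built copy of the sample frame. The second assignment, using Python's built-in range, is not part of the answer; it is added here as an assumed fallback for environments where pd.RangeIndex is not available, and the column name unique_id_alt is hypothetical.

```python
import pandas as pd

# Hand-built copy of the sample data from the question.
df = pd.DataFrame({
    "OID": [-1, -1, -1, -1, -1],
    "Value": [1, 2, 3, 4, 5],
    "Count": [5, 46, 32, 3, 17],
})

# RangeIndex with only stop given; start defaults to 0.
df["unique_id"] = pd.RangeIndex(stop=df.shape[0])

# Assumed fallback: the built-in range produces the same sequence.
df["unique_id_alt"] = range(len(df))

print(df)
```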

Source

2017-03-02 16:43:42 EdChum

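This worked beautifully. Are NumPy functions superior to pandas, or are they fairly comparable? Also, df['unique_id'] = pd.RangeIndex(stop=df.shape[0]) gives me AttributeError: 'module' object has no attribute 'RangeIndex'. Any ideas? I was able to iterate using its index earlier. – cptpython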

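You may need to add import pandas as pd. In general there is not much difference, but the numpy method will be faster, so it should be preferred where it does what you want – EdChum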

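I found the problem: I am using an old version of pandas at work. Also, could you point out why the following np.arange syntax, df["unique_id"] = np.arange(57), throws this error: ValueError: Length of values does not match length of index? – cptpython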
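
The error in the last comment is consistent with a length mismatch: assigning an array to a column needs exactly one value per row, so np.arange(57) fits only a frame with exactly 57 rows. A small sketch of that behavior, assuming a 5-row frame like the sample data; the hard-coded 57 is taken from the comment, and the name mismatch_error is hypothetical.

```python
import numpy as np
import pandas as pd

# Assumed 5-row frame, mirroring the sample data in the question.
df = pd.DataFrame({"Value": [1, 2, 3, 4, 5]})

mismatch_error = None
try:
    # 57 values for 5 rows: pandas rejects the assignment.
    df["unique_id"] = np.arange(57)
except ValueError as exc:
    mismatch_error = exc
    print("ValueError:", exc)

# Deriving the length from the frame keeps the two in sync.
df["unique_id"] = np.arange(len(df))
print(df)
```

If the hard-coded number is meant to be the row count, checking len(df) or df.shape[0] first shows whether the frame really has that many rows.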
