2017-05-04

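Correct way to assign a value to a cell inside a for loop in pandas when a condition is met

This code works by looping over every row and calling the 'is_color' function. The function checks the value in the i-th row and assigns a color, 'blue' for example, if the condition is met.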

import numpy as np 
import pandas as pd 

def is_color(df):

    df['color'] = np.nan
    def blue(i):
        is_blue = True  # some more complex condition
        if is_blue:
            #df['color'].iloc[i] = 'blue'
            df.set_value(i, 'color', 'blue')

    for i in range(len(df)):

        blue(i)

        # not included in this example
        #green(i)
        #orange(i)
        #purple(i)
        #yellow(i)

    return df

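Originally I was doing df['color'].iloc[i] = 'blue', which worked but raised a SettingWithCopyWarning. I need to make this production-ready, so I tried df.set_value(i, 'color', 'blue') instead, but that raises ValueError: could not convert string to float: blue. I think what I need to do is something like this: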

import numpy as np 
import pandas as pd 

def is_color(df):

    df['color'] = np.nan
    def blue(i, df):
        is_blue = True  # some more complex condition
        if is_blue:
            #df['color'].iloc[i] = 'blue'
            return df.set_value(i, 'color', 'blue')
        return df

    for i in range(len(df)):

        df = blue(i, df)

        # not included in this example
        #df = green(i, df)
        #df = orange(i, df)

    return df

My original code feels cleaner to me, though. Is there a nicer way to do this?

Can you show us a sample of your DataFrame? – Hackaholic

You need to change `df['color'] = np.nan` to `df['color'] = ''`, or remove it. – jezrael

But do you really need the loop? – jezrael
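
To make the suggestion in the comments concrete, here is a minimal sketch of the loop version, not a definitive fix. It assumes the error comes from `np.nan` giving the column a float dtype, so the column is created with object dtype instead. It also swaps in `df.at` for `set_value`, since `set_value` was deprecated and later removed from pandas. The column `A` and the `< 15` threshold are hypothetical stand-ins for the "more complex condition".

```python
import pandas as pd

def is_color(df):
    # Work on a copy so the caller's frame is left untouched
    df = df.copy()
    # Object dtype from the start, so strings can be stored
    # (np.nan would make this a float64 column)
    df['color'] = pd.Series([None] * len(df), index=df.index, dtype=object)

    def blue(label):
        # Hypothetical condition; replace with the real check
        is_blue = df.at[label, 'A'] < 15
        if is_blue:
            # Single-step, label-based scalar assignment (no chained indexing)
            df.at[label, 'color'] = 'blue'

    # Iterate over index labels; assumes the index labels are unique
    for label in df.index:
        blue(label)

    return df

df = pd.DataFrame({'A': [10, 20, 31], 'B': [4, 5, 6]})
result = is_color(df)
print(result)
```

Because `blue` writes through `df.at` on the frame captured by the closure, it does not need to take or return `df`, which keeps the shape of the original, cleaner version.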

Answers

If there are many conditions, one option is apply with a custom function that uses if/elif/else:

Sample:

df = pd.DataFrame({'A': [10, 20, 31],
                   'B': [4, 5, 6]})

print(df)

def is_color(x):
    if x < 15:
        x = 'blue'
    elif (x > 15) and (x < 25):
        x = 'green'
    else:
        x = 'nothing'
    return x


df['color'] = df['A'].apply(is_color)
print(df)
    A  B    color
0  10  4     blue
1  20  5    green
2  31  6  nothing

A similar solution:

def is_color(x):
    a = 'nothing'
    if x < 15:
        a = 'blue'
    if (x > 15) and (x < 25):
        a = 'green'
    return a


df['color'] = df['A'].apply(is_color)
print(df)
    A  B    color
0  10  4     blue
1  20  5    green
2  31  6  nothing
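
If the conditions can be written as whole-column comparisons, the same thresholds could probably be expressed without apply at all. The following is a sketch using `numpy.select`, a technique not mentioned in the answer above. It reuses the sample frame and the same cut-offs, and assumes each color depends only on column `A`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 31],
                   'B': [4, 5, 6]})

# One boolean mask per color; the first mask that matches wins
conditions = [
    df['A'] < 15,
    (df['A'] > 15) & (df['A'] < 25),
]
choices = ['blue', 'green']

# Rows matching no condition get the default value
df['color'] = np.select(conditions, choices, default='nothing')
print(df)
```

As in the apply version, a value of exactly 15 matches neither condition and falls through to 'nothing'.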