2017-03-07 86 views
-1
def age_range(age): 
    if age <= 18: 
     return 'Minors' 
    elif age >= 19 & age < 63: 
     return 'Adults' 
    elif age >= 63 & age < 101: 
     return 'Senior Citizen' 
    else: 
     return 'Age Unknown' 

titanic_data_df["PassengerType"] = titanic_data_df[['Age']].apply(age_range, axis = 1) 

titanic_data_df.head() 

I get the following error when I try to add a new column to an existing DataFrame (titanic_data_df): a ValueError is raised while using the apply() method

--------------------------------------------------------------------------- 
ValueError        Traceback (most recent call last) 
<ipython-input-466-741f5646101e> in <module>() 
     1 #create a new df with just age and distinguish each passenger as minor, adult or senior citizen 
----> 2 titanic_data_df["PassengerType"] =  titanic_data_df[['Age']].apply(age_range, axis = 1) 
     3 
     4 titanic_data_df.head() 

C:\Users\test\Anaconda2\envs\py27DAND\lib\site-packages\pandas\core\frame.pyc in apply(self, func, axis, broadcast, raw, reduce, args, **kwds) 
    4161      if reduce is None: 
    4162       reduce = True 
-> 4163      return self._apply_standard(f, axis, reduce=reduce) 
    4164    else: 
    4165     return self._apply_broadcast(f, axis) 

C:\Users\test\Anaconda2\envs\py27DAND\lib\site-packages\pandas\core\frame.pyc in _apply_standard(self, func, axis, ignore_failures, reduce) 
    4257    try: 
    4258     for i, v in enumerate(series_gen): 
    -> 4259      results[i] = func(v) 
    4260      keys.append(v.name) 
    4261    except Exception as e: 

<ipython-input-465-e62ccbeee80e> in age_range(age) 
     1 def age_range(age): 
----> 2  if age <= 18: 
     3   return 'Minors' 
     4  elif age >= 19 & age < 63: 
     5   return 'Adults' 

C:\Users\test\Anaconda2\envs\py27DAND\lib\site-packages\pandas\core\generic.pyc in __nonzero__(self) 
    915   raise ValueError("The truth value of a {0} is ambiguous. " 
    916       "Use a.empty, a.bool(), a.item(), a.any() or a.all()." 
--> 917       .format(self.__class__.__name__)) 
    918 
    919  __bool__ = __nonzero__ 

ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', u'occurred at index 0') 

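From what I have read so far, this has something to do with the if ... else statements in the function above, but I can't figure out what the problem is. Any help is appreciated. Thanks.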

+1

Could you add an MCVE (including the traceback) to your question? If we can't reproduce the error, it is hard to figure out what is going on. – MSeifert

+0

Is this a pandas question? The question tags seem incomplete. –

+1

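I don't know much about pandas, but I do know that the bitwise operator '&' is different from the logical operator 'and', so that is quite likely what is causing the problem. Actually, never mind: that would produce incorrect results, not an error. – TigerhawkT3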
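As a small illustration of the '&' versus 'and' point, here is a sketch with plain Python numbers (the helper names with_bitwise and with_logical are made up for this example). Because '&' binds more tightly than the comparison operators, age >= 19 & age < 63 is parsed as the chained comparison age >= (19 & age) < 63, which is True for every non-negative integer age. With a float age, such as the fractional ages the Titanic data may contain, 19 & age raises a TypeError instead.

```python
def with_bitwise(age):
    # Parsed as: age >= (19 & age) < 63, since & binds tighter than >= and <.
    return age >= 19 & age < 63

def with_logical(age):
    # The intended test: both comparisons must hold.
    return age >= 19 and age < 63

for age in (10, 30, 70):
    print(age, with_bitwise(age), with_logical(age))

# A float operand is not accepted by the bitwise operator at all.
try:
    with_bitwise(22.5)
except TypeError as exc:
    print("float age:", exc)
```

So in the question's function, once the ValueError is out of the way, every integer age above 18 would come back as 'Adults' unless '&' is replaced with 'and'.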

Answers

1

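When you select a column as titanic_data_df[['Age']] (note the double square brackets), you are actually getting a DataFrame that contains a single column. In that case, apply() with axis = 1 passes a single-element Series to age_range, one per row. The comparison age <= 18 then produces a boolean Series rather than a single True or False, and the if statement cannot decide its truth value, which is what the ValueError says.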

Try this:

titanic_data_df["PassengerType"] = titanic_data_df['Age'].apply(age_range) 
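
To see the difference between the two selections, here is a minimal sketch on a made-up three-row frame (df and its values are assumptions for illustration, not the real Titanic data). The function also uses 'and' instead of '&', as discussed in the comments on the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the Titanic dataset.
df = pd.DataFrame({"Age": [10, 30, 70]})

def age_range(age):
    # `and` instead of `&`, so each branch tests what it appears to test.
    if age <= 18:
        return "Minors"
    elif age >= 19 and age < 63:
        return "Adults"
    elif age >= 63 and age < 101:
        return "Senior Citizen"
    else:
        return "Age Unknown"

# Double brackets give a DataFrame, single brackets give a Series.
print(type(df[["Age"]]).__name__)
print(type(df["Age"]).__name__)

# With axis=1, each row reaches the function as a one-element Series.
row_types = df[["Age"]].apply(lambda row: type(row).__name__, axis=1)
print(row_types.tolist())

# Series.apply passes each scalar value, so the comparisons are unambiguous.
df["PassengerType"] = df["Age"].apply(age_range)
print(df["PassengerType"].tolist())
```

A missing age (NaN) fails every comparison, so it ends up in the else branch as 'Age Unknown'.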
+0

Thanks for the explanation. That makes sense. Also, it seems I could use applymap() if I want to keep working with a DataFrame rather than a Series. – pyuser181
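
A rough sketch of the applymap() route, again on made-up data (df and its values are assumptions). applymap() calls the function on every element of the DataFrame and returns a DataFrame, so one column has to be picked out of the result. Newer pandas releases (2.1 and later) rename applymap() to DataFrame.map(), so the sketch uses whichever one the installed version provides.

```python
import pandas as pd

# Hypothetical sample data standing in for the Titanic dataset.
df = pd.DataFrame({"Age": [10, 30, 70]})

def age_range(age):
    if age <= 18:
        return "Minors"
    elif age >= 19 and age < 63:
        return "Adults"
    elif age >= 63 and age < 101:
        return "Senior Citizen"
    else:
        return "Age Unknown"

ages = df[["Age"]]  # still a DataFrame, not a Series

# Element-wise application: DataFrame.map in pandas >= 2.1, applymap before that.
elementwise = ages.map if hasattr(ages, "map") else ages.applymap
result = elementwise(age_range)

# The result is a DataFrame, so select the column before assigning it.
df["PassengerType"] = result["Age"]
print(df)
```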

0

The pandas cut function will make this much easier for you. First, I'll build a DataFrame to demonstrate the cut function.

import pandas as pd

titanic_data_df = pd.DataFrame(data=[[13, 'Male'], [14, 'Female'], [38, 'Female'], [72, 'Male'], [33, 'Female'], [80, 'Male'], [34, 'Male'], [15, 'Female'], [27, 'Female'],[23, 'Male'], [64, 'Female'], [38, 'Female'], [12, 'Male'], [32, 'Female'], [21, 'Male'], [66, 'Male'], [73, 'Female'], [22, 'Female']], columns=['Age', 'Sex']) 
print(titanic_data_df) 
    Age     Sex
0    13    Male
1    14  Female
2    38  Female
3    72    Male
4    33  Female
5    80    Male
6    34    Male
7    15  Female
8    27  Female
9    23    Male
10   64  Female
11   38  Female
12   12    Male
13   32  Female
14   21    Male
15   66    Male
16   73  Female
17   22  Female

Then I simply apply the cut function:

# One label per interval: (0, 18], (18, 63], (63, 101]
labels = ['Minors', 'Adults', 'Senior Citizen'] 
titanic_data_df["PassengerType"] = pd.cut(titanic_data_df.Age, [0, 18, 63, 101], labels=labels) 
print(titanic_data_df) 
    Age     Sex   PassengerType
0    13    Male          Minors
1    14  Female          Minors
2    38  Female          Adults
3    72    Male  Senior Citizen
4    33  Female          Adults
5    80    Male  Senior Citizen
6    34    Male          Adults
7    15  Female          Minors
8    27  Female          Adults
9    23    Male          Adults
10   64  Female  Senior Citizen
11   38  Female          Adults
12   12    Male          Minors
13   32  Female          Adults
14   21    Male          Adults
15   66    Male  Senior Citizen
16   73  Female  Senior Citizen
17   22  Female          Adults
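
One detail worth checking is how cut treats the bin edges. By default the intervals are closed on the right, so with edges [0, 18, 63, 101] an age of exactly 63 lands in 'Adults', whereas the function in the question puts 63 in 'Senior Citizen'. The sketch below, on made-up ages, shows one possible way to mirror the question's boundaries for whole-number ages with right=False, and to label missing ages. The 'Age Unknown' category is borrowed from the question's else branch; the edge values [0, 19, 63, 101] are my assumption, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical ages chosen to sit on the bin boundaries, plus a missing value.
ages = pd.Series([18, 19, 62, 63, 100, np.nan], name="Age")
labels = ["Minors", "Adults", "Senior Citizen"]

# Default (right=True): intervals are (0, 18], (18, 63], (63, 101].
default_cut = pd.cut(ages, [0, 18, 63, 101], labels=labels)

# right=False: intervals are [0, 19), [19, 63), [63, 101).
left_closed = pd.cut(ages, [0, 19, 63, 101], labels=labels, right=False)

# Missing ages come back as NaN; add an explicit category to label them.
labelled = left_closed.cat.add_categories("Age Unknown").fillna("Age Unknown")

print(pd.DataFrame({"Age": ages, "default": default_cut, "left_closed": labelled}))
```

Ages outside the outermost edges also come back as NaN, so with the fillna step they would be labelled 'Age Unknown' as well.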
+0

Thanks for the explanation. This is great! – pyuser181