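Trying to remove commas and dollar signs with pandas in Python

I am trying to remove the commas and dollar signs from a column. But when I do, the table prints out with them still in there. Is there a way to remove the commas and dollar signs using a pandas function? I was unable to find anything in the API docs, or maybe I was looking in the wrong place.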

import pandas as pd 
import pandas_datareader.data as web

players = pd.read_html('http://www.usatoday.com/sports/mlb/salaries/2013/player/p/') 


df1 = pd.DataFrame(players[0]) 


df1.drop(df1.columns[[0,3,4, 5, 6]], axis=1, inplace=True) 
df1.columns = ['Player', 'Team', 'Avg_Annual'] 
df1['Avg_Annual'] = df1['Avg_Annual'].replace(',', '') 

print (df1.head(10))


2016-07-22 Mark

Just add `regex=True` to your replace and it will work. – shivsn

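You have to access the `str` attribute, per http://pandas.pydata.org/pandas-docs/stable/text.html

# regex=False treats ',' and '$' as literal characters ('$' is a regex anchor otherwise)
df1['Avg_Annual'] = df1['Avg_Annual'].str.replace(',', '', regex=False)
df1['Avg_Annual'] = df1['Avg_Annual'].str.replace('$', '', regex=False)
df1['Avg_Annual'] = df1['Avg_Annual'].astype(int)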
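The steps in this answer can be tried end to end without fetching the web page. The sketch below assumes a small hand-made `df1` with the same column names as in the question; the player names and salary values are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for df1 after the columns were renamed
df1 = pd.DataFrame({
    'Player': ['A', 'B'],
    'Avg_Annual': ['$1,000,000', '$490,000'],
})

# Strip the literal characters, then convert the strings to integers
cleaned = (
    df1['Avg_Annual']
    .str.replace(',', '', regex=False)
    .str.replace('$', '', regex=False)
)
df1['Avg_Annual'] = cleaned.astype(int)

print(df1)
```

Passing `regex=False` explicitly avoids depending on the default, which has differed between pandas versions.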


2016-07-22 00:56:01 bernie

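Shamelessly stolen from this answer... but that answer is only about changing one character and doesn't complete the coolness: since `replace` takes a dictionary, you can replace any number of characters at once, in any number of columns.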

# if you want to operate on multiple columns, put them in a list like so: 
cols = ['col1', 'col2', ..., 'colN'] 

# pass them to df.replace(), specifying each char and its replacement:
df[cols] = df[cols].replace({r'\$': '', ',': ''}, regex=True)

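@shivsn caught that you need to use `regex=True`; you already knew about `replace` (but also didn't show trying to use it on multiple columns, or on both the dollar sign and the comma at once).

This answer simply spells out, in one place, the details I found elsewhere, for others like me (e.g. noobs to Python and pandas). Hope it helps.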
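As a concrete version of the dictionary approach, here is a small sketch. The frame, its column names (`salary`, `bonus`) and its values are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical frame with two currency-formatted columns
df = pd.DataFrame({
    'name': ['x', 'y'],
    'salary': ['$1,200', '$3,450'],
    'bonus': ['$100', '$2,000'],
})
cols = ['salary', 'bonus']

# One dict, one call. '\$' is escaped because a bare '$' is a regex anchor.
df[cols] = df[cols].replace({r'\$': '', ',': ''}, regex=True)

# Convert the cleaned strings to integers
df[cols] = df[cols].astype(int)

print(df)
```

Columns not listed in `cols` (here `name`) are left untouched.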


2017-09-26 15:49:10 Hendy

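@bernie's answer is spot on for your problem. Here is my take on the general problem of loading numeric data in pandas.

Often the source of the data is a report generated for direct consumption, so it carries extra formatting such as %, thousands separators, currency symbols and so on. All of these help a human reader but cause problems for the default parser. My solution is to cast the column to string, replace these symbols one by one, then cast it back to the appropriate numeric type. A boilerplate function that keeps only [0-9.] is tempting, but it breaks when the thousands separator and the decimal mark are swapped, and also with scientific notation. Here is my code, which I wrap in a function and apply as needed.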

df[col] = df[col].astype(str)  # cast to string

# all the string surgery goes in here; .str.replace works on substrings,
# and regex=False treats each symbol as a literal character
df[col] = df[col].str.replace('$', '', regex=False)
df[col] = df[col].str.replace(',', '', regex=False)  # assuming ',' is the thousands separator in your locale
df[col] = df[col].str.replace('%', '', regex=False)

df[col] = df[col].astype(float)  # cast back to appropriate type
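One possible way to wrap the steps above into a function is sketched below. The helper name `to_number` and its parameters are hypothetical, chosen for this example; they are not part of pandas. Making the thousands separator and decimal mark parameters is one way to handle the swapped-separator case mentioned above.

```python
import pandas as pd

def to_number(series, symbols=('$', '%'), thousands=',', decimal='.'):
    """Strip formatting symbols from a column and cast it to float.

    Hypothetical helper: the name and parameters are assumptions.
    """
    s = series.astype(str)
    for sym in symbols:
        s = s.str.replace(sym, '', regex=False)
    # Drop the thousands separator before normalising the decimal mark
    s = s.str.replace(thousands, '', regex=False)
    if decimal != '.':
        s = s.str.replace(decimal, '.', regex=False)
    return s.astype(float)

# Default locale assumption: ',' for thousands and '.' for decimals
us = to_number(pd.Series(['$1,234.50', '12%', '1e3']))

# Swapped separators, as used in many European locales
eu = to_number(pd.Series(['1.234,50', '7,5']), thousands='.', decimal=',')

print(us.tolist())
print(eu.tolist())
```

Because only known symbols are removed rather than everything outside [0-9.], a value in scientific notation such as `'1e3'` still parses as a float.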


2018-01-12 16:27:19 BiGYaN
