UnicodeEncodeError when using DecisionTree. My code is as follows:
# -*- coding: utf-8 -*-
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import tree
Model_Dev_Val = pd.read_excel("data2.xlsx")
target = Model_Dev_Val[['source_2']]
model_train, model_test, y_train, y_test = train_test_split(Model_Dev_Val, target,test_size = 0.5, random_state = 40,stratify = target)
clf = tree.DecisionTreeClassifier()
clf = clf.fit(model_train,y_train)
But it throws an error:
UnicodeEncodeError: 'decimal' codec can't encode characters in position 0-2: invalid decimal Unicode string
data2.xlsx includes some Chinese text, and the data has been cleaned.
The Chinese characters in the file may be causing the problem. – PinkFluffyUnicorn
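One way to check this is to list the columns that are not numeric, since a DecisionTreeClassifier expects numeric features and will try to convert strings to floats. The following is only a sketch: the small DataFrame stands in for data2.xlsx, and the column names `city` and `amount` are made up for illustration (only `source_2` comes from the question). One-hot encoding is one possible way to handle the text columns, not necessarily the right one for this data.

```python
import pandas as pd

# Hypothetical stand-in for data2.xlsx; "city" and "amount" are invented columns.
df = pd.DataFrame({
    "city": ["北京", "上海", "北京", "广州"],
    "amount": [1.0, 2.5, 3.0, 4.5],
    "source_2": [0, 1, 0, 1],
})

# Columns that are not numeric (for example Chinese text) cannot be fed to the tree as-is.
non_numeric_cols = df.select_dtypes(exclude="number").columns.tolist()
print("Non-numeric columns:", non_numeric_cols)

# One option: drop the target from the features, then one-hot encode the text columns.
features = pd.get_dummies(df.drop(columns=["source_2"]), columns=non_numeric_cols)
target = df["source_2"]
print(features.columns.tolist())
```

Note that the sketch also removes `source_2` from the features so that the label is not used to predict itself.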
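I considered that. I got the correct data.xlsx from my boss, and now it raises a different error: ValueError: Input contains NaN, infinity or a value too large for dtype('float32'). –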
Then there is probably a 'NaN', an 'infinity', or a number that is too large somewhere in the data. – PinkFluffyUnicorn
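To locate such values, something along these lines might help. It is a minimal sketch on a made-up DataFrame (columns `a` and `b` are hypothetical; only `source_2` appears in the question), and dropping the affected rows is just one possible cleanup. Imputing the missing values may suit the real data better.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the spreadsheet; "a" and "b" are invented columns.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 4.0],
    "b": [1.0, 2.0, np.inf, 4.0],
    "source_2": [0, 1, 0, 1],
})

# Count NaN and infinite values per numeric column.
numeric = df.select_dtypes(include="number")
nan_counts = numeric.isna().sum()
inf_counts = np.isinf(numeric).sum()
print("NaN per column:\n", nan_counts)
print("inf per column:\n", inf_counts)

# One possible cleanup: treat infinities as missing, then drop the incomplete rows.
clean = df.replace([np.inf, -np.inf], np.nan).dropna()
print(clean)
```

After this step, `clean` should contain only finite values, which is what `fit` requires.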