Merge returns NaN except for the first row

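Hello, I want to merge two data frames that I load from Excel. I converted the columns to merge on to "str". Surprisingly, the code merges the first row but then returns NaN values. The code I use is: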

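import pandas as pd

# inpath is the path to the Excel workbook (defined elsewhere)
# sheet_name replaces the old sheetname keyword; np.str was only an alias for str
ListA = pd.read_excel(inpath, sheet_name="Tabelle2")
ListA["Stücklistenkomponente"] = ListA["Material"].astype(str)
ListB = pd.read_excel(inpath, sheet_name="Tabelle1")
ListB["Stücklistenkomponente"] = ListB["Material"].astype(str)
print(ListA.dtypes)
print(ListB.dtypes)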

Material    object

Material    object

The two data frames look like this:

ListA

Material 
R 22B 2.0 7.72 11.0 Lo 
X 127 1.5x4.64x4[G16.05.01] CL 
L 431 2x6,96x5.5 Y 
9999 
L 431 2x5,96x5.5 p 
F 631 2x6,96x5.5 a 
N 431 2x6,96x5.5 v 
J 431 2x6,96x5.5 
O 431 2x6,96x5.5 
VM 431 2x6,96x5.5 L

ListB

Material       InnerDiameter OuterDiameter Length 
    R 22B 2.0 7.72 11.0 Lo   2    6    8 
    X 127 1.5x4.64x4[G16.05.01] CL 2    7    12 
    L 431 2x6,96x5.5 Y    5    8    13 
    9999        0    0    0 
    L 431 2x5,96x5.5 p    6    9    15 
    F 631 2x6,96x5.5 a    8    5    26 
    N 431 2x6,96x5.5 v    9    1    3  
    J 431 2x6,96x5.5     12    6    89 
    O 431 2x6,96x5.5     5    4    12 
    VM 431 2x6,96x5.5 L    4    12    7

It returns:

  Material  InnerDiameter OuterDiameter Lenth 
      R 22B 2.0 7.72 11.0 Lo 2     6  8 
        NaN    NaN    NaN NaN 
        NaN    NaN    NaN NaN 
        NaN    NaN    NaN NaN 
        NaN    NaN    NaN NaN 
        NaN    NaN    NaN NaN 
        NaN    NaN    NaN NaN 
        NaN    NaN    NaN NaN 
        NaN    NaN    NaN NaN 
        NaN    NaN    NaN NaN 
        NaN    NaN    NaN NaN

So what am I doing wrong? I thought the solution was to convert both columns to a string dtype, but that does not work.

Thanks for any help!

Source

2017-10-17 2Obe

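I think there must be some difference in the data, perhaps trailing whitespace, because .astype(str) correctly converts the data to strings.

If the data are strings, dicts, sets or lists, then the dtype is object.

But the type of each value is str, dict, and so on.

You can check this with: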

print(ListA["Stücklistenkomponente"].apply(type))
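To illustrate the difference between dtype and type, here is a minimal sketch. The values are made up for the example and are not taken from the Excel file in the question.

```python
import pandas as pd

# Hypothetical values for illustration; not the data from the question.
s = pd.Series(["R 22B", 9999, ["a", "b"]])

# A column holding mixed Python objects has dtype object.
print(s.dtype)

# The type of each individual element can still differ.
element_types = s.apply(type)
print(element_types.tolist())

# After astype(str) every element is a Python str.
converted = s.astype(str)
print(converted.apply(type).unique())
```

If every element already has type str, the conversion is not the problem and the values themselves need a closer look.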

For inspecting the data it sometimes helps more to convert the columns to lists:

print(ListA["Stücklistenkomponente"].tolist()) 
print(ListB["Stücklistenkomponente"].tolist())
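If the lists do reveal stray spaces, something like the following sketch may help. The frames and the trailing space are assumptions for illustration; whether whitespace is really the cause in the question's data is not confirmed.

```python
import pandas as pd

# Hypothetical frames; the trailing space in `left` is an assumption.
left = pd.DataFrame({"Material": ["R 22B 2.0", "L 431 2x6 ", "9999"]})
right = pd.DataFrame({
    "Material": ["R 22B 2.0", "L 431 2x6", "9999"],
    "Length": [8, 13, 0],
})

# tolist() prints the repr of each string, so stray spaces become visible.
print(left["Material"].tolist())

# indicator=True adds a _merge column showing which keys found a partner.
before = left.merge(right, on="Material", how="left", indicator=True)
print(before)

# Strip leading/trailing whitespace on both key columns, then merge again.
left["Material"] = left["Material"].astype(str).str.strip()
right["Material"] = right["Material"].astype(str).str.strip()
after = left.merge(right, on="Material", how="left", indicator=True)
print(after)
```

Rows marked left_only in the _merge column are the ones whose key has no exact match in the other frame.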

EDIT:

I tested the data and the result is really interesting:

df1 = pd.read_excel('Mappe3.xlsx', sheet_name="Tabelle2")
df2 = pd.read_excel('Mappe3.xlsx', sheet_name="Tabelle1")

# default inner join: rows are duplicated because the key column contains duplicate values
# the on parameter can be omitted when the frames share only one column name
df = pd.merge(df1, df2) 
print (df.head(10)) 
        Stücklistenkomponente Ritzel_Materialnummer \ 
0 RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS   401.4425.13 
1 RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS   401.4425.13 
2 RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS   401.4425.13 
3 RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS   401.4425.13 
4 RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS   401.4425.13 
5 RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS   401.4425.13 
6 RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS   401.4425.13 
7 RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS   401.4425.13 
8 RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS   401.4425.13 
9 RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS   401.4425.13 
... 
...

#remove duplicates in both df 
df1 = df1.drop_duplicates('Stücklistenkomponente') 
df2 = df2.drop_duplicates('Stücklistenkomponente') 

#default inner join - only 5 same categories 
df = pd.merge(df1, df2) 
print (df) 
        Stücklistenkomponente Ritzel_Materialnummer \ 
0 RITZEL 22F 2.0 7.72 11.0 Z17 SCHWEISS   401.4425.13 
1 RITZEL 22F 3.0 7.72 11.0 Z17 SCHWEISS   401.4425.15 
2  RITZEL 22F 3.0 7.9 6.0 Z17 PRESS   401.4425.11 
3  RITZEL 22F 3.0 6.0 15.0 PRESS Z8   401.4487.01 
4  RITZEL 22F 4.0 7.9 6.0 Z17 PRESS   401.4425.14 

    Innendurchmesser Außendurchmesser Länge   Material1 Material2 \ 
0    2    7.72 11.0   X46Cr13   - 
1    3    7.72 11.0   X46Cr13   - 
2    4    7.90 6.0 42CrMo4 vergütet   - 
3    3    6.00 15.0 42CrMo4 vergütet   - 
4    2    7.90 6.0 42CrMo4 vergütet   - 

    Material3 
0   - 
1   - 
2   - 
3   - 
4   -
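The row duplication can be reproduced on a small scale. This is only a sketch with hypothetical frames; the column names key and value are invented for the example. The validate parameter of pd.merge is one way to detect duplicate keys before they multiply rows.

```python
import pandas as pd

# Hypothetical frames with a repeated key, to mimic the row duplication.
df1 = pd.DataFrame({"key": ["A", "A", "B"]})
df2 = pd.DataFrame({"key": ["A", "A", "B"], "value": [1, 1, 2]})

# Each "A" on the left pairs with each "A" on the right: 2 * 2 + 1 = 5 rows.
merged = pd.merge(df1, df2)
print(len(merged))

# validate raises a MergeError if the keys are not unique on both sides.
try:
    pd.merge(df1, df2, validate="one_to_one")
    keys_unique = True
except pd.errors.MergeError:
    keys_unique = False
print(keys_unique)

# After dropping duplicates the merge has one row per key.
deduped = pd.merge(df1.drop_duplicates("key"), df2.drop_duplicates("key"))
print(deduped)
```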

Source

2017-10-17 13:26:15 jezrael

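Unfortunately the data are the same, and checking the dtypes did not reveal any difference either... no idea – 2Obe

Also, why does it work for the first row but then stop? – 2Obe

Are the data the same column-wise? What does print(ListA["Stücklistenkomponente"] == ListB["Stücklistenkomponente"]) return? – jezrael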
