Hamming distance: Matlab to Python

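OK, I'm computing the Hamming distance between two documents for a k-NN method. I'm trying to translate Matlab code into Python, but I've been staring at it for hours and can't tell what is causing the error.

The code in Matlab: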

function [ Dist ] = hamming_distance(X,Xtrain) 
% Function calculates Hamming distances of elements in set X from elements in set Xtrain. Distances of objects are returned as matrix Dist 
% X - set of objects we are comparing N1xD 
% Xtrain - set of objects to which X objects are compared N2xD 
% Dist - matrix of distances between X and Xtrain objects N1xN2 
% N1 - number of elements in X 
% N2 - number of elements in Xtrain 
% D - number of features (key words) 

N1 = size(X,1); 
N2 = size(Xtrain,1); 
Dist = zeros(N1,N2); 
D1 = size(X,2); 
for i=1:N1 
    for j=1:N2 
     temp_matrix = xor(X(i,1:D1),Xtrain(j,1:D1)); 
     Dist(i,j) = sum(temp_matrix); 
    end 
end 
end

This is what I've written in Python so far:

def hamming_distance(X, X_train): 
    """ 
    :param X: set of objects that are going to be compared N1xD 
    :param X_train: set of objects compared against param X N2xD 
    Calculates Hamming distances between all objects from set X and all objects from set X_train.
    The resulting distances are returned as a matrix.
    :return: Distance matrix between objects of X and X_train, N1xN2
    """ 
    N1 = X.shape[0] 
    N2 = X_train.shape[0] 
    hdist = np.zeros(shape =(N1, N2)) 
    D1 = X.shape[1] 
    for i in range (1,N1): 
     for j in range (1, N2): 
      temp_matrix = np.logical_xor(X[i,1:D1], X_train[j, 1:D1]) 
      hdist[i, j] = np.sum(temp_matrix) 
    return hdist

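The error seems to be in the XOR part of the Python code. I don't understand what could be wrong there; I tried writing it as (X[i,1:D1])^(X_train[j, 1:D1]), but that didn't change anything. I checked the logical_xor function, and it looks like I'm passing the right inputs. I don't understand where the error comes from. Could it be because the matrices have different shapes? I got confused trying to resize them. Should I convert X and X_train to arrays? I tried that once, but it didn't help.

The error: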

Traceback (most recent call last): 
    File "C:\...\test.py", line 90, in test_hamming_distance 
    out = hamming_distance(data['X'], data['X_train']) 
    File "C:\...\content.py", line 28, in hamming_distance 
    temp_matrix = np.logical_xor(X[i,1:D1], X_train[j, 1:D1]) 
    File "C:\...\Anaconda3\lib\site-packages\scipy\sparse\base.py", line 559, in __getattr__ 
    raise AttributeError(attr + " not found") 
AttributeError: logical_xor not found

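I can't change test.py, only content.py. test.py should work fine, so I'm sure the bug is in my function. Any help would be greatly appreciated!

Edit: I have this at the top of my file: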

import numpy as np

Writing numpy instead of np didn't change anything; I got the error 'numpy wasn't defined'.


2017-04-09 Swaglina

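That function doesn't exist in NumPy. That's all your error is saying. –

But there is a function numpy.logical_xor. I don't understand. Should I approach this differently? I have 'import numpy as np' in my file. Shouldn't it work? – Swaglina

Show the code where you define 'np'. Is it the standard 'import numpy as np'? Have you inadvertently reused the name 'np'? –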

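The reason this doesn't work is that X or X_train is a scipy sparse matrix. Scipy sparse matrices don't support logical operations, although work on this is in progress.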

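The reason this error shows up in scipy rather than numpy, even though you are calling a numpy function, is that logical_xor is a numpy ufunc, or "universal function". Python classes that are meant to interact with numpy ufuncs can override the behavior of those ufuncs, and scipy sparse matrices do this to avoid running unsupported operations that would convert the matrix to a dense array and possibly use up all your memory. You will need to convert it to a dense array yourself using, for example, X.toarray(). If it is too big to fit in memory, you should use a package like dask or bcolz to handle the memory management for you.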
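As a rough sketch of the X.toarray() suggestion, assuming the inputs are binary and small enough to densify: the helper name to_dense is made up for illustration. The loop bounds and slices are also switched to 0-based indexing, which is my own reading of the Matlab-to-Python translation and not something the thread states.

```python
import numpy as np
from scipy import sparse


def to_dense(a):
    """Hypothetical helper: dense ndarray from a scipy sparse matrix or array-like."""
    return a.toarray() if sparse.issparse(a) else np.asarray(a)


def hamming_distance(X, X_train):
    # Densify first, because np.logical_xor does not accept scipy sparse matrices.
    X = to_dense(X).astype(bool)
    X_train = to_dense(X_train).astype(bool)
    N1, N2 = X.shape[0], X_train.shape[0]
    hdist = np.zeros((N1, N2), dtype=int)
    for i in range(N1):        # Matlab's 1:N1 corresponds to range(N1) in Python
        for j in range(N2):
            # Whole rows are compared; Matlab's 1:D1 covers every column.
            hdist[i, j] = np.logical_xor(X[i], X_train[j]).sum()
    return hdist


# Tiny made-up example data, stored as sparse matrices like the test input.
X = sparse.csr_matrix(np.array([[1, 0, 1], [0, 0, 1]]))
X_train = sparse.csr_matrix(np.array([[1, 1, 1], [0, 0, 0], [1, 0, 1]]))
D = hamming_distance(X, X_train)
print(D)
```

The check with sparse.issparse lets the same function accept plain arrays as well, so it should not matter which type the caller passes in.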

Edit: scipy sparse matrices are not subclasses of ndarray.
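If densifying the inputs is the memory concern, one alternative that the thread does not mention is to avoid XOR altogether. For rows x and y containing only 0 and 1, the Hamming distance equals sum(x) + sum(y) - 2 * (x · y), and the dot products can be taken as a sparse matrix product. The sketch below assumes every stored entry is exactly 0 or 1; the name hamming_distance_sparse is hypothetical.

```python
import numpy as np
from scipy import sparse


def hamming_distance_sparse(X, X_train):
    # Assumption: both inputs hold only 0/1 values (binary keyword features).
    X = sparse.csr_matrix(X, dtype=np.int64)
    X_train = sparse.csr_matrix(X_train, dtype=np.int64)
    # Number of ones in each row, flattened to 1-D arrays.
    ones_x = np.asarray(X.sum(axis=1)).ravel()
    ones_t = np.asarray(X_train.sum(axis=1)).ravel()
    # Count of positions where both rows are 1, for every pair of rows.
    common = (X @ X_train.T).toarray()
    # Differing positions = ones in x + ones in y - 2 * shared ones.
    return ones_x[:, None] + ones_t[None, :] - 2 * common


# Same made-up data as a quick sanity check.
X = sparse.csr_matrix(np.array([[1, 0, 1], [0, 0, 1]]))
X_train = sparse.csr_matrix(np.array([[1, 1, 1], [0, 0, 0], [1, 0, 1]]))
D_sparse = hamming_distance_sparse(X, X_train)
print(D_sparse)
```

The N1xN2 result is still a dense array, so this only helps when the inputs (N1xD and N2xD) are the part that does not fit in memory.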


2017-04-09 18:29:45 TheBlackCat

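Ah, that makes sense, +1. So the problem is that numpy tries to dispatch the call to 'logical_xor' to a method on its argument, but scipy's sparse matrix has no such method. It would be nice if numpy generated a more helpful error message in this case. –

@WarrenWeckesser: Fixed – TheBlackCat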
