2016-11-23 139 views

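Removing diagonal elements from a sparse matrix in scipy

I want to remove the diagonal elements from a sparse matrix. Since the matrix is sparse, these elements should no longer be stored once they are removed.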

SciPy provides a method for setting the values of the diagonal elements: setdiag

If I try it with lil_matrix, it works:

>>> import numpy as np
>>> from scipy.sparse import lil_matrix, csr_matrix
>>> a = np.ones((2,2))
>>> c = lil_matrix(a) 
>>> c.setdiag(0) 
>>> c 
<2x2 sparse matrix of type '<type 'numpy.float64'>' 
    with 2 stored elements in LInked List format> 

With csr_matrix, however, the diagonal elements do not seem to be removed from storage:

>>> b = csr_matrix(a) 
>>> b 
<2x2 sparse matrix of type '<type 'numpy.float64'>' 
    with 4 stored elements in Compressed Sparse Row format> 

>>> b.setdiag(0) 
>>> b 
<2x2 sparse matrix of type '<type 'numpy.float64'>' 
    with 4 stored elements in Compressed Sparse Row format> 

>>> b.toarray() 
array([[ 0., 1.], 
     [ 1., 0.]]) 
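
One way to confirm that the zeros are still stored explicitly is to compare the number of stored entries with the number of entries whose value is actually nonzero. This is only a small sketch; the variable names `stored` and `nonzero` are chosen here for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

b = csr_matrix(np.ones((2, 2)))
b.setdiag(0)

stored = b.nnz               # number of stored entries, explicit zeros included
nonzero = b.count_nonzero()  # number of stored entries whose value is not zero
print(stored, nonzero)
```

If the two numbers differ, the matrix is carrying explicit zeros in its storage.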

Going through a dense array, we of course get:

>>> csr_matrix(b.toarray()) 
<2x2 sparse matrix of type '<type 'numpy.float64'>' 
    with 2 stored elements in Compressed Sparse Row format> 

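Is this intended? If so, is it due to the compressed format of csr matrices? Is there a workaround other than going from sparse to dense and back to sparse?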

Answers


Simply setting elements to 0 does not change the sparsity of the matrix. You have to call eliminate_zeros:

In [807]: a=sparse.csr_matrix(np.ones((2,2))) 
In [808]: a 
Out[808]: 
<2x2 sparse matrix of type '<class 'numpy.float64'>' 
    with 4 stored elements in Compressed Sparse Row format> 
In [809]: a.setdiag(0) 
In [810]: a 
Out[810]: 
<2x2 sparse matrix of type '<class 'numpy.float64'>' 
    with 4 stored elements in Compressed Sparse Row format> 
In [811]: a.eliminate_zeros() 
In [812]: a 
Out[812]: 
<2x2 sparse matrix of type '<class 'numpy.float64'>' 
    with 2 stored elements in Compressed Sparse Row format> 
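
The two calls above can be wrapped in a small helper. This is a minimal sketch, not part of SciPy: the name `remove_diagonal` is hypothetical, and the choice to work on a copy is an assumption made here so that the input is left untouched. Note that `eliminate_zeros` drops every explicitly stored zero, not only the ones on the diagonal.

```python
import numpy as np
from scipy import sparse


def remove_diagonal(m):
    """Return a CSR copy of m with no stored diagonal entries.

    remove_diagonal is a hypothetical helper name, not a SciPy function.
    """
    out = sparse.csr_matrix(m, copy=True)  # work on a copy of the input
    out.setdiag(0)         # overwrite the diagonal with explicit zeros
    out.eliminate_zeros()  # drop all explicit zeros from storage, in place
    return out


a = sparse.csr_matrix(np.ones((3, 3)))
b = remove_diagonal(a)
print(a.nnz, b.nnz)
```

For the 3x3 matrix of ones, `a` keeps its 9 stored entries while `b` stores only the 6 off-diagonal ones.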

Since changing the sparsity of a csr matrix is relatively expensive, it lets you set values to 0 without changing the sparsity structure.

In [829]: %%timeit a=sparse.csr_matrix(np.ones((1000,1000))) 
    ...: a.setdiag(0) 
100 loops, best of 3: 3.86 ms per loop 

In [830]: %%timeit a=sparse.csr_matrix(np.ones((1000,1000))) 
    ...: a.setdiag(0) 
    ...: a.eliminate_zeros() 
SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient. 
10 loops, best of 3: 133 ms per loop 

In [831]: %%timeit a=sparse.lil_matrix(np.ones((1000,1000))) 
    ...: a.setdiag(0) 
100 loops, best of 3: 14.1 ms per loop 
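
As an alternative to setdiag followed by eliminate_zeros, the matrix can be rebuilt from its COO triplets with the diagonal entries filtered out, which never writes into the CSR structure. This is a sketch under assumptions: `drop_diagonal_coo` is a hypothetical name, and it has not been timed against the approaches above.

```python
import numpy as np
from scipy import sparse


def drop_diagonal_coo(m):
    """Rebuild m without diagonal entries by filtering COO triplets.

    drop_diagonal_coo is a hypothetical helper name, not a SciPy function.
    """
    coo = sparse.coo_matrix(m)
    keep = coo.row != coo.col  # True for every stored off-diagonal entry
    filtered = sparse.coo_matrix(
        (coo.data[keep], (coo.row[keep], coo.col[keep])), shape=coo.shape
    )
    return filtered.tocsr()


a = sparse.csr_matrix(np.ones((3, 3)))
b = drop_diagonal_coo(a)
print(b.nnz)
```

Unlike the setdiag route, this leaves any explicit zeros stored off the diagonal in place, and it does not need the diagonal entries to be stored in the first place.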

Exactly what I was missing. Thanks! – kevad