How to decode a numpy array of dtype=numpy.string_?

I need to decode, using Python 3, a string that was encoded in the following way:

>>> s = numpy.asarray(numpy.string_("hello\nworld")) 
>>> s 
array(b'hello\nworld', 
     dtype='|S11')

I tried:

>>> str(s) 
"b'hello\\nworld'" 

>>> s.decode() 
AttributeError       Traceback (most recent call last) 
<ipython-input-31-7f8dd6e0676b> in <module>() 
----> 1 s.decode() 

AttributeError: 'numpy.ndarray' object has no attribute 'decode' 

>>> s[0].decode() 
--------------------------------------------------------------------------- 
IndexError        Traceback (most recent call last) 
<ipython-input-34-fae1dad6938f> in <module>() 
----> 1 s[0].decode() 

IndexError: 0-d arrays can't be indexed


2016-10-03 PiRK

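If I understand correctly, you can do this with astype, which (with copy=False) returns an array holding the same content converted to the corresponding type: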

>>> s = numpy.asarray(numpy.string_("hello\nworld")) 
>>> r = s.astype(str, copy=False) 
>>> r 
array('hello\nworld', 
     dtype='<U11')
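To get from the converted array to a plain Python str, something like the following sketch should work. It assumes the bytes are pure ASCII (as far as I know, astype(str) fails on non-ASCII bytes), and it uses np.bytes_, which is the type that numpy.string_ was an alias for before the alias was removed in NumPy 2.0. The variable names are only for illustration.

```python
import numpy as np

# np.bytes_ is the type formerly also available as numpy.string_.
s = np.asarray(np.bytes_(b"hello\nworld"))

# Convert the bytes dtype ('S') to a Unicode dtype ('U').
r = s.astype(str)

# r is still a 0-d array; .item() extracts the Python str it holds.
text = r.item()

print(r.dtype.kind)
print(repr(text))
```

Using .item() here avoids the "b'...'" repr that str(s) produces on the original bytes array.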


2016-10-03 12:13:13

Thanks! That helps a lot. Now I can recover my string this way: 's = str(s.astype(str))' – PiRK

There is no need to convert the type when you can get a regular string directly with 'unicode_'. – Kasramvd

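I don't control the encoding stage. In my real-world problem, I don't create 's' myself. I just happen to know that it was written to a file after the encoding stage. – PiRK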

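In Python 3 there are two types that represent sequences of characters: bytes and str (which contains Unicode characters). When you use string_ as your type, numpy returns bytes. If you want a regular str, you should use numpy's unicode_ type: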

>>> s = numpy.asarray(numpy.unicode_("hello\nworld")) 
>>> s 
array('hello\nworld', 
     dtype='<U11') 

>>> str(s) 
'hello\nworld'

But note that if you don't specify a type for your string (string_ or unicode_), numpy uses the default string type, which in Python 3.x is str (containing Unicode characters).

>>> s = numpy.asarray("hello\nworld") 
>>> str(s) 
'hello\nworld'
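A quick way to check which of the two cases you are in is to look at the dtype kind. This is only a small sketch; the variable names are made up for the example.

```python
import numpy as np

as_text = np.asarray("hello\nworld")     # built from a Python str
as_bytes = np.asarray(b"hello\nworld")   # built from Python bytes

print(as_text.dtype.kind)    # 'U' (Unicode string dtype)
print(as_bytes.dtype.kind)   # 'S' (byte string dtype)
```

Only the 'S' case needs decoding; the 'U' case already holds str data.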


2016-10-03 12:22:08 Kasramvd

The reason I encode my data with numpy.string_ is compatibility. My data is converted to a data format called HDF5 and may be read by other software, not only Python. – PiRK

http://docs.h5py.org/en/latest/strings.html (compatibility section) – PiRK

@PiRK If you want an approach that is compatible across Python versions, you should just use 'numpy.asarray()'; otherwise it has nothing to do with Python. – Kasramvd

Another option is np.char, the collection of string operations.

In [255]: np.char.decode(s) 
Out[255]: 
array('hello\nworld', 
     dtype='<U11')

It accepts an encoding keyword, if needed. But if you don't need that, .astype is probably better.
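As a sketch of where the encoding keyword matters: if the bytes are UTF-8 rather than plain ASCII, passing encoding="utf-8" to np.char.decode handles them element by element. The sample data below is invented for the example.

```python
import numpy as np

# An array of UTF-8 encoded byte strings (dtype kind 'S').
raw = np.array(["héllo".encode("utf-8"), "wörld".encode("utf-8")])

# Decode every element with an explicit encoding.
decoded = np.char.decode(raw, encoding="utf-8")

print(decoded.dtype.kind)
print(decoded.tolist())
```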

s is 0-d (its shape is ()), so it has to be indexed with s[()].

In [268]: s[()] 
Out[268]: b'hello\nworld' 
In [269]: s[()].decode() 
Out[269]: 'hello\nworld'

s.item() also works.
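If you don't know in advance whether the array read back from the file is 0-d or has more dimensions, the two approaches can be combined. The helper below is hypothetical (to_str is not a numpy function) and assumes UTF-8 unless told otherwise.

```python
import numpy as np

def to_str(arr, encoding="utf-8"):
    """Hypothetical helper: decode a bytes-dtype array of any shape.

    Returns a Python str for a 0-d array and a Unicode array otherwise.
    """
    arr = np.asarray(arr)
    if arr.dtype.kind != "S":
        raise TypeError("expected a byte string array, got dtype %s" % arr.dtype)
    if arr.ndim == 0:
        # 0-d arrays can't be indexed with [0]; .item() gives the bytes object.
        return arr.item().decode(encoding)
    # For 1-d and higher, decode element-wise.
    return np.char.decode(arr, encoding=encoding)

scalar = to_str(np.asarray(np.bytes_(b"hello\nworld")))
many = to_str(np.array([b"a", b"bc"]))
print(repr(scalar))
print(many.tolist())
```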


2016-10-03 16:18:01 hpaulj
