NumPy array with read() (Python)

For the text file (testfile.txt):

# blah blah blah 

Unpleasant astonished an diminution up. Noisy an their of meant. Death means up civil do an offer wound of. 
//Called square an in afraid direct. 


{Resolution} diminution conviction so (mr at) unpleasing simplicity no. 
/*No it as breakfast up conveying earnestly

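When storing the contents of a text file in a numpy array, I can't understand the difference between:

A. when the text file is opened directly (without read()) and stored in a numpy array, and

B. when the text file is first opened with read() and then stored in a numpy array.

Here is the code: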

import numpy  

# open directly with no read 
a = numpy.array([str(i) for i in open(r'C:\testfile.txt', 'r')]) 

# open with read then store in numpy *how I want to do it* 
f = open(r'C:\testfile.txt', 'r').read() 
b = numpy.array([str(i) for i in f]) 

print("A") 
print(a) 
print() 
print("B") 
print(b)

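My question is how to change the numpy.array([str(i) for i in f]) command so that the resulting numpy array stores the contents of the text file the way output A does (see below).

Output: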

A 
['# blah blah blah\n' '\n' 
'Unpleasant astonished an diminution up. Noisy an their of meant. Death means up civil do an offer wound of. \n' 
'//Called square an in afraid direct. \n' '\n' '\n' 
'{Resolution} diminution conviction so (mr at) unpleasing simplicity no. \n' 
'/*No it as breakfast up conveying earnestly '] 

B 
['#' ' ' 'b' 'l' 'a' 'h' ' ' 'b' 'l' 'a' 'h' ' ' 'b' 'l' 'a' 'h' '\n' '\n' 
'U' 'n' 'p' 'l' 'e' 'a' 's' 'a' 'n' 't' ' ' 'a' 's' 't' 'o' 'n' 'i' 's' 
'h' 'e' 'd' ' ' 'a' 'n' ' ' 'd' 'i' 'm' 'i' 'n' 'u' 't' 'i' 'o' 'n' ' ' 
'u' 'p' '.' ' ' 'N' 'o' 'i' 's' 'y' ' ' 'a' 'n' ' ' 't' 'h' 'e' 'i' 'r' 
' ' 'o' 'f' ' ' 'm' 'e' 'a' 'n' 't' '.' ' ' 'D' 'e' 'a' 't' 'h' ' ' 'm' 
'e' 'a' 'n' 's' ' ' 'u' 'p' ' ' 'c' 'i' 'v' 'i' 'l' ' ' 'd' 'o' ' ' 'a' 
'n' ' ' 'o' 'f' 'f' 'e' 'r' ' ' 'w' 'o' 'u' 'n' 'd' ' ' 'o' 'f' '.' ' ' 
'\n' '/' '/' 'C' 'a' 'l' 'l' 'e' 'd' ' ' 's' 'q' 'u' 'a' 'r' 'e' ' ' 'a' 
'n' ' ' 'i' 'n' ' ' 'a' 'f' 'r' 'a' 'i' 'd' ' ' 'd' 'i' 'r' 'e' 'c' 't' 
'.' ' ' '\n' '\n' '\n' '{' 'R' 'e' 's' 'o' 'l' 'u' 't' 'i' 'o' 'n' '}' ' ' 
'd' 'i' 'm' 'i' 'n' 'u' 't' 'i' 'o' 'n' ' ' 'c' 'o' 'n' 'v' 'i' 'c' 't' 
'i' 'o' 'n' ' ' 's' 'o' ' ' '(' 'm' 'r' ' ' 'a' 't' ')' ' ' 'u' 'n' 'p' 
'l' 'e' 'a' 's' 'i' 'n' 'g' ' ' 's' 'i' 'm' 'p' 'l' 'i' 'c' 'i' 't' 'y' 
' ' 'n' 'o' '.' ' ' '\n' '/' '*' 'N' 'o' ' ' 'i' 't' ' ' 'a' 's' ' ' 'b' 
'r' 'e' 'a' 'k' 'f' 'a' 's' 't' ' ' 'u' 'p' ' ' 'c' 'o' 'n' 'v' 'e' 'y' 
'i' 'n' 'g' ' ' 'e' 'a' 'r' 'n' 'e' 's' 't' 'l' 'y' ' ']


2016-11-28 Karim

To get output A, just split the result of read() into lines:

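import numpy

def load_entire_file_into_memory_and_then_convert(filename):
    with open(filename, 'r') as input_file:
        # read() returns the whole file as one string
        full_file_contents = input_file.read()
    # split('\n') cuts the string into lines (the '\n' itself is dropped)
    lines_of_file = full_file_contents.split('\n')
    return numpy.array(lines_of_file)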
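One detail worth noting: split('\n') drops the newline characters, while output A in the question keeps the trailing '\n' on each line. If the newlines matter, a minimal sketch using str.splitlines(keepends=True) could look like the following. The function name is hypothetical, chosen only for this illustration, and splitlines() also breaks on a few other line-boundary characters (such as '\r' or form feed), so the result is only assumed to match file iteration for ordinary '\n'-terminated text.

```python
import numpy


def load_with_read_keeping_newlines(filename):
    # Hypothetical helper name, used only for this illustration.
    with open(filename, 'r') as input_file:
        full_file_contents = input_file.read()
    # keepends=True keeps the line terminator on each element,
    # which mirrors what iterating over the file object returns.
    lines_of_file = full_file_contents.splitlines(keepends=True)
    return numpy.array(lines_of_file)
```

Calling load_with_read_keeping_newlines(r'C:\testfile.txt') should then give an array whose elements are whole lines, as in output A.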

And your other version:

def load_file_line_by_line(filename):
    with open(filename, 'r') as input_file:
        # iterating over the file object yields one line at a time
        lines_of_file = [line for line in input_file]
    return numpy.array(lines_of_file)

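Note the semantic difference between these two versions and why you get different results: when you do "for ... in" on a file, you get the individual lines. If you call read(), you get the whole file as a single string (with the lines separated by newline characters), and "for ... in" on a string gives you the individual characters of that string (not lines). While there may be cases where read() is more convenient (for example, when you really do want all the lines loaded at once), it is generally more scalable and more idiomatic to process a file line by line by iterating over the file object, because that lets you reduce the memory footprint (for example, in other applications that do not need every line in memory at once and can operate on one line of the file at a time).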
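The difference between the two kinds of iteration can be seen without touching the disk. The sketch below uses io.StringIO as a stand-in for an open text file; the sample text and variable names are made up for the illustration.

```python
import io

sample_text = "first line\nsecond line\n"

# Iterating over a file-like object yields whole lines (newline included).
lines_from_file = [item for item in io.StringIO(sample_text)]

# Iterating over the string returned by read() yields single characters.
chars_from_read = [item for item in io.StringIO(sample_text).read()]

print(lines_from_file)       # ['first line\n', 'second line\n']
print(chars_from_read[:5])   # ['f', 'i', 'r', 's', 't']
```

This is the same effect as outputs A and B in the question: the first list has one element per line, the second has one element per character.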


2016-11-28 07:19:56

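For text files, `readlines()` is usually more useful than `read()`. The result is similar to iterating over the open file. – hpaulj

Agreed. readlines() is essentially the same as doing "for ... in" on the file object. But the question asks directly how to do it with "read()". –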
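For completeness, the readlines() variant mentioned in the comments might be sketched as follows. The function name is hypothetical; readlines() returns a list of lines with their newline characters kept, so the resulting array should have the same shape as output A.

```python
import numpy


def load_with_readlines(filename):
    # Hypothetical helper name, used only for this illustration.
    with open(filename, 'r') as input_file:
        # readlines() returns a list of lines, each keeping its trailing '\n'.
        return numpy.array(input_file.readlines())
```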
