Reading data from a spreadsheet and building a matrix in Python


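Is there a way to have Python "read" a document, exclude the unnecessary elements, and build an adjacency matrix of 1s and 0s? I have a spreadsheet of 500 visited pages, containing inlinks, outlinks, and dangling pages (which need to be excluded from the search).

I thought the rough pseudocode would look something like this: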

for each visited page vp
    for each outlink of vp
        if link is relative
            resolve link
        if link is dangling
            ignore it
        else if link points to a visited page
            write 1
        else
            write 0
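
The pseudocode above could be sketched in Python 3 roughly as follows. This is only an illustration under stated assumptions: the function name `build_adjacency`, the input shape (a dict mapping each visited page to a list of its outlinks), and the example URLs are all hypothetical and not taken from the crawl files. Relative links are resolved with the standard library's `urljoin`, and the matrix starts as all zeros, so only the "write 1" case needs explicit handling.

```python
from urllib.parse import urljoin, urldefrag


def build_adjacency(outlinks_by_page, dangling=()):
    """Return (pages, matrix); matrix[i][j] == 1 if pages[i] links to pages[j].

    outlinks_by_page: dict mapping each visited page URL to a list of its
    outlinks (absolute or relative). This input shape is an assumption.
    dangling: URLs to leave out of the matrix entirely.
    """
    dangling = set(dangling)
    # Only visited, non-dangling pages get a row and a column.
    pages = [p for p in outlinks_by_page if p not in dangling]
    index = {page: i for i, page in enumerate(pages)}

    # Start from all zeros, so "write 0" is the default.
    matrix = [[0] * len(pages) for _ in pages]

    for page in pages:
        for link in outlinks_by_page[page]:
            # Resolve relative links against the page and drop any #fragment.
            target, _fragment = urldefrag(urljoin(page, link))
            if target in index:  # dangling and unvisited targets are skipped
                matrix[index[page]][index[target]] = 1
    return pages, matrix


# Made-up example data, not taken from the crawl files.
crawl = {
    "http://example.org/a.html": ["b.html", "http://example.org/c.html"],
    "http://example.org/b.html": ["a.html#top", "http://elsewhere.example/x"],
    "http://example.org/c.html": [],
}
pages, matrix = build_adjacency(crawl, dangling=["http://example.org/c.html"])
for page, row in zip(pages, matrix):
    print(row, page)
```

A list of lists keeps the sketch dependency-free; for 500 pages the result is only a 500 x 500 grid, which can be passed to a numerical library afterwards if matrix operations are needed.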

Is it possible to implement this idea in Python somehow? Or would Matlab or R be more useful?

Links to the crawl results: http://www.dcs.bbk.ac.uk/~martin/sewn/ls3/sewn_2016_labsheet_3_full_crawl.txt http://www.dcs.bbk.ac.uk/~martin/sewn/ls3/sewn_2016_labsheet_3_full_crawl.xlsx

Source

2016-11-27 v0id

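Is there a way to have Python "read" a document, exclude the unnecessary elements, and build an adjacency matrix of 1s and 0s?

Yes.

See https://docs.python.org/2/tutorial/inputoutput.html

The simplest way to get started is to open the file and read it: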

# 'with' closes the file automatically, even if an error occurs
with open('workfile', 'r') as f:
    file_lines = f.readlines()

# do something with your lines:
# adapt your pseudocode to
# the extracted data
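
As a rough illustration of the "do something with your lines" step, the sketch below groups outlinks under the page they were found on. The line format is an assumption made for the example: it supposes that a line starting with `Visited:` names a page and that each following `Link:` line is one of its outlinks. The real crawl file may be laid out differently, so the prefixes (and the hypothetical function name `parse_crawl`) would need adjusting to match it.

```python
def parse_crawl(lines):
    """Group outlinks under the page they were found on.

    Assumes (hypothetically) that a line starting with "Visited:" names a
    page and each following "Link:" line is one of its outlinks.
    """
    outlinks_by_page = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("Visited:"):
            current = line[len("Visited:"):].strip()
            outlinks_by_page.setdefault(current, [])
        elif line.startswith("Link:") and current is not None:
            outlinks_by_page[current].append(line[len("Link:"):].strip())
        # Anything else (blank lines, other fields) is ignored.
    return outlinks_by_page


# Made-up sample lines standing in for the result of f.readlines().
sample = [
    "Visited: http://example.org/a.html\n",
    "    Link: b.html\n",
    "    Link: http://example.org/c.html\n",
    "\n",
    "Visited: http://example.org/b.html\n",
    "    Link: a.html\n",
]
parsed = parse_crawl(sample)
print(parsed)
```

The resulting dict has one key per visited page, which is a convenient starting point for deciding which rows and columns the adjacency matrix should have.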

The rest of your question is out of scope.

Source

2016-11-27 06:53:57 glls
