How do I read a file into nested lists of coordinates for multiple polygons?

I have a file made up of many sections, like this:

[40.742742,-73.993847] 
[40.739389,-73.985667] 
[40.74715499999999,-73.97992] 
[40.750573,-73.988415] 
[40.742742,-73.993847] 

[40.734706,-73.991915] 
[40.736917,-73.990263] 
[40.736104,-73.98846] 
[40.740315,-73.985263] 
[40.74364800000001,-73.993353] 
[40.73729099999999,-73.997988] 
[40.734706,-73.991915] 

[40.729226,-74.003463] 
[40.7214529,-74.006038] 
[40.717745,-74.000389] 
[40.722299,-73.996634] 
[40.725291,-73.994413] 
[40.729226,-74.003463] 
[40.754604,-74.007836] 
[40.751289,-74.000649] 
[40.7547179,-73.9983309] 
[40.75779,-74.0054339] 
[40.754604,-74.007836]

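I need to read each section in as a list of coordinate pairs (sections are separated by one extra \n).

I have a similar file (identical, except without the extra line breaks) from which I draw a single polygon using the whole file. I can read the coordinates and plot them with matplotlib using the code below: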

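import matplotlib.pyplot as plt

mVerts = []
with open('Manhattan_Coords.txt') as f:
    for line in f:
        # Drop the surrounding brackets, then split "x,y" into two floats.
        # Splitting on "," also covers ", " because float() ignores whitespace.
        pair = [float(s) for s in line.strip()[1:-1].split(",")]
        mVerts.append(pair)

plt.plot(*zip(*mVerts))
plt.show()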

How can I accomplish the same task when the file contains more than one polygon, with each polygon separated by an extra newline?

Source

2014-09-20 jqwerty

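Why is the newline relevant? The form is '[#,#]'. Do you mean there might be '[#,#] [#,#] [#,#] \n', with those 3 coordinates separated from the others? – sln 2014-09-20 23:17:00

[#,#]\n[#,#]\n...\n[#,#]\n\n is one polygon, with an arbitrary number of coordinates before the blank line – jqwerty 2014-09-20 23:18:34

Then you can read each line as a string and build a regex to find each pair. Is it the regex you are having trouble with, such as stripping the '[,]' decoration? – sln 2014-09-20 23:22:13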
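A minimal sketch of the regex idea from the comment above, assuming each line holds exactly one pair of plain decimal numbers in the `[x,y]` form shown in the question (no exponents). The names `PAIR_RE` and `parse_pair` are illustrative and do not come from the thread.

```python
import re

# Matches "[<number>,<number>]", allowing optional whitespace around the comma.
NUMBER = r"[-+]?\d+(?:\.\d+)?"
PAIR_RE = re.compile(r"\[\s*(" + NUMBER + r")\s*,\s*(" + NUMBER + r")\s*\]")

def parse_pair(line):
    """Return [x, y] as floats, or None if the line holds no coordinate pair."""
    match = PAIR_RE.search(line)
    if match is None:
        return None
    return [float(match.group(1)), float(match.group(2))]
```

A blank line returns `None`, so the caller can use that as the signal to start a new polygon.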

This is my personal favorite way to "chunk" a file into groups of things separated by blank lines:

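from itertools import groupby

def chunk_groups(it):
    # Strip each line so that blank lines become '' (which is falsy).
    stripped_lines = (x.strip() for x in it)
    # groupby(..., bool) yields alternating runs of non-empty and empty lines.
    for k, group in groupby(stripped_lines, bool):
        if k:
            yield list(group)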

And I'd recommend ast.literal_eval to turn those string representations of lists into actual Python lists:

from ast import literal_eval 

with open(filename) as f: 
    result = [[literal_eval(li) for li in chunk] for chunk in chunk_groups(f)]

which gives:

result 
Out[66]: 
[[[40.742742, -73.993847], 
    [40.739389, -73.985667], 
    [40.74715499999999, -73.97992], 
    [40.750573, -73.988415], 
    [40.742742, -73.993847]], 
[[40.734706, -73.991915], 
    [40.736917, -73.990263], 
    [40.736104, -73.98846], 
    [40.740315, -73.985263], 
    [40.74364800000001, -73.993353], 
    [40.73729099999999, -73.997988], 
    [40.734706, -73.991915]], 
[[40.729226, -74.003463], 
    [40.7214529, -74.006038], 
    [40.717745, -74.000389], 
    [40.722299, -73.996634], 
    [40.725291, -73.994413], 
    [40.729226, -74.003463], 
    [40.754604, -74.007836], 
    [40.751289, -74.000649], 
    [40.7547179, -73.9983309], 
    [40.75779, -74.0054339], 
    [40.754604, -74.007836]]]
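To see why grouping on `bool` splits at blank lines, here is a small sketch that runs the same steps on an in-memory sample instead of a file. The `sample` data and the `runs` variable are made up for illustration.

```python
import io
from ast import literal_eval
from itertools import groupby

# Two polygons separated by two blank lines (illustrative data).
sample = io.StringIO("[1.0,2.0]\n[3.0,4.0]\n\n\n[5.0,6.0]\n")

stripped = (line.strip() for line in sample)
# bool('') is False and bool('[1.0,2.0]') is True, so groupby alternates
# between runs of non-empty lines (key True) and runs of blank lines (key False).
runs = [(key, list(group)) for key, group in groupby(stripped, bool)]

# Keep only the non-empty runs and parse each line into a Python list.
polygons = [[literal_eval(s) for s in group] for key, group in runs if key]
```

Because consecutive blank lines fall into a single `False` run, any number of blank lines between sections acts as one separator.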

Source

2014-09-20 23:25:46 roippi

A slight variation on roippi's idea, using json instead of ast:

import json 
from itertools import groupby 

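with open(FILE, "r") as coordinates_file:
    # Group consecutive lines by whether they are blank.
    grouped = groupby(coordinates_file, lambda line: line.isspace())
    # Keep only the runs of non-blank lines.
    groups = (group for empty, group in grouped if not empty)

    # Each line is valid JSON ("[x,y]"), so json.loads parses it directly.
    polygons = [[json.loads(line) for line in group] for group in groups]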

from pprint import pprint 
pprint(polygons) 
#>>> [[[40.742742, -73.993847], 
#>>> [40.739389, -73.985667], 
#>>> [40.74715499999999, -73.97992], 
#>>> [40.750573, -73.988415], 
#>>> [40.742742, -73.993847]], 
#>>> [[40.734706, -73.991915], 
#>>> [40.736917, -73.990263], 
#>>> [40.736104, -73.98846], 
#>>> [40.740315, -73.985263], 
#>>> [40.74364800000001, -73.993353], 
#>>> [40.73729099999999, -73.997988], 
#>>> [40.734706, -73.991915]], 
#>>> [[40.729226, -74.003463], 
#>>> [40.7214529, -74.006038], 
#>>> [40.717745, -74.000389], 
#>>> [40.722299, -73.996634], 
#>>> [40.725291, -73.994413], 
#>>> [40.729226, -74.003463], 
#>>> [40.754604, -74.007836], 
#>>> [40.751289, -74.000649], 
#>>> [40.7547179, -73.9983309], 
#>>> [40.75779, -74.0054339], 
#>>> [40.754604, -74.007836]]]

Source

2014-09-20 23:30:36 Veedrac

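There are plenty of nice approaches in the answers already posted. There is nothing wrong with any of them.

However, there is also nothing wrong with taking the obvious but readable approach.

As a side note, you seem to be working with geographic data. You will run into this kind of format all the time, and the segment delimiter is often less obvious than an extra newline. (There are plenty of ad-hoc "ASCII export" formats out there, particularly in obscure proprietary software. For example, one common format puts an F at the end of the last line of a segment as the delimiter, i.e. 1.0 2.0F. Many others use no delimiter at all, and you need to start a new segment/polygon whenever a point is more than "x" distance from the previous point.) In addition, these things often grow into multi-GB ASCII files, so reading the whole thing into memory may be impractical.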
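The two delimiter styles mentioned above can be handled with small generators that never hold more than one segment in memory. This is only a sketch: the exact layout of the "F" format (whitespace-separated x and y, with the F attached to the last value) is an assumption based on the `1.0 2.0F` example, and the function names and the `max_gap` parameter are hypothetical.

```python
import math

def split_on_f_suffix(lines):
    """Yield segments from lines like '1.0 2.0'; a trailing 'F' ends a segment.

    Assumed layout: two whitespace-separated numbers per line.
    """
    segment = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        ends_segment = line.endswith("F")
        if ends_segment:
            line = line[:-1]
        x, y = (float(value) for value in line.split())
        segment.append((x, y))
        if ends_segment:
            yield segment
            segment = []
    if segment:  # input ended without a closing 'F'
        yield segment

def split_on_distance(points, max_gap):
    """Yield segments, starting a new one whenever a point lies farther
    than max_gap from the previous point."""
    segment = []
    for point in points:
        if segment:
            prev = segment[-1]
            gap = math.hypot(point[0] - prev[0], point[1] - prev[1])
            if gap > max_gap:
                yield segment
                segment = []
        segment.append(point)
    if segment:
        yield segment
```

Both functions accept any iterable, so an open file object can be passed to `split_on_f_suffix` directly and read line by line.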

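My point is: whichever approach you choose, make sure you understand it. You will be doing this again, and it will be just different enough to be hard to generalize. You should definitely learn libraries like itertools well, but make sure you fully understand the functions you are calling.

Here is one version of the "obvious but readable" approach. It is more verbose, but nobody will be left scratching their head over what it does. (You can write the same logic in several different ways. Use whatever makes the most sense to you.)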

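import matplotlib.pyplot as plt

def polygons(infile):
    group = []
    for line in infile:
        line = line.strip()
        if line:
            # Drop the brackets and convert "x,y" to a list of floats.
            # A real list is used because map() is lazy on Python 3.
            coords = line[1:-1].split(',')
            group.append([float(c) for c in coords])
        elif group:
            # A blank line ends the current polygon.
            yield group
            group = []
    if group:
        # Last polygon, when the file does not end with a blank line.
        yield group

fig, ax = plt.subplots()
ax.ticklabel_format(useOffset=False)

with open('data.txt', 'r') as infile:
    for poly in polygons(infile):
        ax.plot(*zip(*poly))

plt.show()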

Source

2014-09-21 00:10:48

This looks good, but when I try to run it I get an error: Traceback (most recent call last): File "Tri_State_Maps.py", line 42, in ax.plot(*zip(*poly)) ValueError: could not convert string to float: '-' – jqwerty 2014-09-21 02:24:47
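The thread does not say what caused this error. The traceback pointing at the `zip(*poly)` line is consistent with running under Python 3, where `map` is lazy and the `float` conversion only happens once the points are consumed. One way to track the problem down, sketched here on the assumption that some line in the file deviates from the `[x,y]` form, is to convert eagerly and report the offending line number. `parse_line` and `polygons_checked` are illustrative names.

```python
def parse_line(line, lineno):
    """Parse '[x,y]' into [x, y], naming the line number on failure."""
    coords = line.strip()[1:-1].split(',')
    try:
        return [float(c) for c in coords]
    except ValueError as err:
        raise ValueError(f"line {lineno}: cannot parse {line!r}") from err

def polygons_checked(infile):
    """Yield one list of [x, y] points per blank-line-separated section."""
    group = []
    for lineno, line in enumerate(infile, start=1):
        if line.strip():
            group.append(parse_line(line, lineno))
        elif group:
            yield group
            group = []
    if group:
        yield group
```

Swapping this in for `polygons` makes the exception message show the exact line that cannot be converted.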
