用pygrib一次访问多个grib消息

我需要访问一些grib文件。我已经想出了如何使用pygrib来做到这一点。但是，我想出如何做到这一点的唯一方式是艰辛缓慢。我有34年3小时的数据，他们每年约36个文件（每10天更多或更少一个）组织成一组。总共约1000个文件。用pygrib一次访问多个grib消息

每个文件都有〜80个“消息”（每天8个值，持续10天）。（它们是空间数据，因此它们具有（x，y）维度）。

阅读我的所有的数据写到：

grbfile = pygrib.index(filename, 'shortName', 'typeOfLevel', 'level') 
var1 = grbfile.select(typeOfLevel='pressureFromGroundLayer', level=180, shortName='unknown') 
for it in np.arange(len(var1)): 
    var_values, lat1, lon1 = var1[it].data() 
    if (it==0): 
     tot_var = np.expand_dims(var_values,axis=0) 
    else: 
     tot_var = np.append(tot_var, np.expand_dims(var_values,axis=0),axis=0)

，并重复此为每个1000个文件。

有没有更快的方法？就像一次加载每个Grib文件的所有〜80层？例如：

var_values, lat1, lon1 = var1[:].data()

来源

2016-09-29 claire

如果我正确理解你，你希望每个文件中所有80条消息的数据堆积在一个数组中。

我不得不提醒你，该数组将获得非常大的，并且可能导致NumPy的扔MemoryError（前发生在我身上），这取决于您的网格大小等

话虽这么说，你可以做这样的事情：

# substitute with a list of your file names 
# glob is a builtin library that can help accomplish this 
files = list_of_files 

grib = pygrib.open(files[0]) # start with the first one 

# grib message numbering starts at 1 
data, lats, lons = grib.message(1).data() 

# while np.expand_dims works, the following is shorter 
# syntax wise and will accomplish the same thing 
data = data[None,...] # add an empty dimension as axis 0 

for m in xrange(2, grib.messages + 1): 
    data = np.vstack((data, grib.message(m).values[None,...])) 

grib.close() # good practice 

# now data has all the values from each message in the first file stacked up 
# time to stack the rest on there 
for file_ in files[1:]: # all except the first file which we've done 
    grib = pygrib.open(file_) 
    for msg in grib: 
     data = np.vstack((data, msg.values[None,...])) 

    grib.close() 
print data.shape # should be (80 * len(files), nlats, nlons)

这可能会增加你一些速度。 pygrib.open对象就像生成器一样工作，所以它们会通过每个pygrib.gribmessage对象，而不是像pygrib.index的select()方法那样构建它们的列表。如果您需要某个文件中的所有消息，那么这就是我访问它们的方式。

希望它有帮助！

来源

2016-11-14 17:10:13 AvlWx

用pygrib一次访问多个grib消息

回答

相关问题