2013-11-27 57 views
1

我有一个很大的4D数据集,需要从它创建一个更小的4D数组。我对python相当陌生,用于IDL或matlab。我读了我的值,然后使用where函数,找到我需要的更小的一维数组的每个维度的索引号。我想创建这些索引号的新数组,但我不断收到形状不匹配错误(不能broacast到一个单一的形状。如何从一个较大的数组创建一个较小的数组python

import numpy as n 
import matplotlib.pyplot as plt 
import Scientific.IO.NetCDF as S 

file=S.NetCDFFile('wspd.mon.mean.nc',mode='r') #Opening File 
Lat=file.variables['lat'].getValue()  # Reading in the latitude variables, 73 
Lon=file.variables['lon'].getValue()  # Reading in the longitude variables, 144 
Level=file.variables['level'].getValue() # Reading in the levels, 17 of them 
Defaulttime=file.variables['time'].getValue() # Reading in the time, hrs since 1-1-1 
Defaultwindspeed=file.variables['wspd'].getValue()  # Reading in the windspeed(time, level, lat, lon) 

Time=n.arange(len(Defaulttime))/12.+1948 #Creates time array into readable years with 12 months 
goodtime=n.where((Time>=1948)&(Time<2013)) #Creates a time array for the years that I want, 1948-2012, since 2013 only has until October, I will not be using that data. 
goodlat=n.where((Lat>=35)&(Lat<=50)) #Latitudes where the rockies and plains are in the US 
plainslon=n.where((Lon>=275)&(Lon<=285)) 

Windspeedsplains=Defaultwindspeed[goodtime,:,goodlat,plainslon] 

以下信息由上述的(最后一行行所产生的错误码)。

>>>ValueError: shape mismatch: objects cannot be broadcast to a single shape 
+1

我怀疑任何人都可以帮你解决这个问题。你至少可以做的是找出哪条线实际上是失败的 – Hammer

+0

'Defaultwindspeed'实际上是4D吗?使用'Defaultwindspeed.shape'或'Defaultwindspeed.ndim'来检查... – atomh33ls

+0

@ atomh33ls它必须是,否则错误会是'索引太多'。我认为问题的形式是“goodtime”,“goodlat”和“plainslon”。 @Cwilliams,如果你印刷'goodtime.shape,goodlat.shape,plainslon.shape',你会得到什么? – askewchan

回答

0

正在发生的事情是,每个索引的的长度(where)阵列将是不同的,因此贵输出阵列的形状是不明确的,因此错误。要强制广播到适当的形状,你必须重塑你的阵列播出一个适当的形状,如下所示:

Windspeedsplains = Defaultwindspeed[goodtime[0][:, None, None, None],:,goodlat[0][:,None],plainslon[0]] 

[0]的是有监守np.where(a)返回长度a.ndim的元组与该元组是最适合自己的条件的索引的阵列中的每个元素。我假设你所有的布尔数组都是1d,所以所有的where输出都将是长度为1的元组,所以我们只需要它的一个数组,因此[0]

当我们得到数组后,我们想要重塑它以匹配你想要的输出数组所具有的形状。可推测,你的输出应该有的形状是(goodtime.size, Defaultwindspeed.shape[1], goodlat.size, plainslon.size),所以你必须让每个索引数组的形状与输出数组应该随该变量而变化的轴相匹配。例如,对于goodtime,您希望Windspeedplains根据轴的0轴的时间变化。因此,goodtime本身也必须沿着四轴的轴0变化,因此您强制索引数组的形状为(N, 1, 1, 1),这就是[:, None, None, None]所做的。

所以,你可以把上面的行更具可读性有:

goodtime = n.where((Time>=1948)&(Time<2013))[0][:, None, None, None] 
goodlat = n.where((Lat>=35)&(Lat<=50))[0][:, None] 
plainslon = n.where((Lon>=275)&(Lon<=285))[0] 

Windspeedsplains=Defaultwindspeed[goodtime, :, goodlat, plainslon] 

或实际,因为你可以直接与布尔数组索引:

goodtime = ((Time>=1948)&(Time<2013))[:, None, None, None] 
goodlat = ((Lat>=35)&(Lat<=50))[:, None] 
plainslon = ((Lon>=275)&(Lon<=285)) 

Windspeedsplains=Defaultwindspeed[goodtime, :, goodlat, plainslon] 

这里是一个略微简单的例子:

In [52]: a = np.arange(3*3*3).reshape(3,3,3) 

In [53]: a 
Out[53]: 
array([[[ 0, 1, 2], 
     [ 3, 4, 5], 
     [ 6, 7, 8]], 

     [[ 9, 10, 11], 
     [12, 13, 14], 
     [15, 16, 17]], 

     [[18, 19, 20], 
     [21, 22, 23], 
     [24, 25, 26]]]) 

In [54]: mask0 = np.where(a[:,0,0] >= 9) 

In [55]: mask0 
Out[55]: (array([1, 2]),) # <-- this is the length 1 tuple I was talking about. we want the array inside. 

In [56]: mask1 = np.where(a[0,:,0]%2 == 0) 

In [57]: mask1 
Out[57]: (array([0, 2]),) 

In [62]: mask2 = np.where(a[0,0,:] < 1) 

In [63]: mask2 
Out[63]: (array([0]),) 

In [67]: b = a[mask0[0][:, None, None], mask1[0][:, None], mask2[0]] 

In [68]: b 
Out[68]: 
array([[[ 9], 
     [15]], 

     [[18], 
     [24]]]) 

In [69]: b.shape 
Out[69]: (2, 2, 1) 
相关问题