Sectioned data in a text file using Python and pandas
I have the following data from a CFD simulation:
Average value for X = 0.5080000265E-0003 to 0.2489200234E-0001
Z = -.3141592741E+0001
Time = 0.7000032425E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.4535714164E-0002 0.2565349844E+0006
0.7559523918E-0002 0.2565098906E+0006
0.1058333274E-0001 0.2564848125E+0006
0.1360714249E-0001 0.2564597656E+0006
0.1663095318E-0001 0.2564346563E+0006
0.1965476200E-0001 0.2564095625E+0006
... ...
... ...
0.1259419441E+0001 0.2549983125E+0006
0.1262443304E+0001 0.2549983125E+0006
0.1265467167E+0001 0.2549983125E+0006
0.1268491030E+0001 0.2549982656E+0006
Time = 0.7010014057E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.4535714164E-0002 0.2565349844E+0006
0.7559523918E-0002 0.2565098906E+0006
0.1058333274E-0001 0.2564848125E+0006
... ...
... ...
0.1259419441E+0001 0.2549983125E+0006
0.1262443304E+0001 0.2549983125E+0006
0.1265467167E+0001 0.2549983125E+0006
0.1268491030E+0001 0.2549982656E+0006
Time = 0.7020006657E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.1058333274E-0001 0.2564848125E+0006
... ...
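As you can see from the example above, the data is split into several vertical sections, each marked by a time-step header labeled Time. Within each section, Y does not change but P_g does. To plot the data, I need the P_g column of each section placed side by side, one column per time step. For example, this is how I need to rearrange the data: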
Y 0.7000032425E+1 0.7020006657E+1 ...
0.1511904760E-0002 0.2565604063E+0006 0.2549982656E+0006 ...
0.4535714164E-0002 0.2565349844E+0006 0.2549982656E+0006 ...
0.7559523918E-0002 0.2565098906E+0006 0.2549982656E+0006 ...
0.1058333274E-0001 0.2564848125E+0006 0.2549982656E+0006 ...
0.1360714249E-0001 0.2564597656E+0006 0.2549982656E+0006 ...
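The target layout above can be sketched in plain Python before bringing pandas in. This is only an illustration under assumptions: the helper name `split_sections` and the short `sample` text are made up for the example, every section is assumed to list the same Y values in the same order, and the P_g values of the second section were altered slightly so the two columns are distinguishable.

```python
def split_sections(lines):
    """Return (times, y, columns) with one list of P_g values per Time section."""
    times, y, columns = [], [], []
    for line in lines:
        parts = line.split()
        if len(parts) == 3 and parts[0] == "Time":
            times.append(float(parts[2]))   # "Time = <value>" starts a new section
            columns.append([])
        elif len(parts) == 2 and columns:
            try:
                y_val, p_val = float(parts[0]), float(parts[1])
            except ValueError:
                continue                    # skips the "Y P_g" header row
            if len(columns) == 1:
                y.append(y_val)             # Y is taken from the first section only
            columns[-1].append(p_val)
    return times, y, columns


# Hypothetical sample in the same layout as the CFD output
sample = """Average value for X = 0.5080000265E-0003 to 0.2489200234E-0001
Z = -.3141592741E+0001
Time = 0.7000032425E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.4535714164E-0002 0.2565349844E+0006
Time = 0.7010014057E+0001
Y P_g
0.1511904760E-0002 0.2565604000E+0006
0.4535714164E-0002 0.2565349000E+0006
"""

times, y, columns = split_sections(sample.splitlines())

# One row per Y value: [Y, P_g at times[0], P_g at times[1], ...]
rows = [[y_i] + [col[i] for col in columns] for i, y_i in enumerate(y)]
```

If pandas is available, `pd.DataFrame(dict(zip(times, columns)), index=y)` should turn the same three lists into a frame with Y as the index and Time as the columns.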
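Using pandas, I can read the data from the text file and create a new data frame with the Y values as the index (rows) and the Time values as the columns: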
import pandas as pd

# Read in data from text file
# -------------------------------------------------------------------------
# data frame from text file contents, skip first 4 rows, separate by variable
# white space, no header
df = pd.read_table('ROP_s_SD.dat', skiprows=4, sep=r'\s+', header=None)

# Time data
# -------------------------------------------------------------------------
# data frame of the rows that contain the Time string
dftime = df.loc[df.iloc[:, 0].str.contains('Time')]
t = dftime[2].tolist()          # time list
idx = dftime.index              # index of rows containing Time string

# Y data
# -------------------------------------------------------------------------
# grab values for y to create index for new data frame
ido = idx[0] + 2                # index of first y value
idf = idx[1]                    # index of the next Time row (end of first section)
y = []                          # empty list to store y values
for i in range(ido, idf):       # iterate through first section of y values
    v = df.loc[i, 0]            # get y value from data frame
    y.append(float(v))          # add y value to y list

# New data frame
# ------------------------------------------------------------------------
# empty data frame with y as index and t as columns
dfnew = pd.DataFrame(None, index=y, columns=t)
print('dfnew is \n', dfnew.head())
The head of the empty data frame, dfnew.head(), looks like this:
7.000032 7.010014 7.020007 7.030043 7.040020 7.050035 7.060043
0.001512 NaN NaN NaN NaN NaN NaN NaN
0.004536 NaN NaN NaN NaN NaN NaN NaN
0.007560 NaN NaN NaN NaN NaN NaN NaN
0.010583 NaN NaN NaN NaN NaN NaN NaN
0.013607 NaN NaN NaN NaN NaN NaN NaN
7.070004 7.080036 7.090022 ... 7.650011 7.660032 7.670026
0.001512 NaN NaN NaN ... NaN NaN NaN
0.004536 NaN NaN NaN ... NaN NaN NaN
0.007560 NaN NaN NaN ... NaN NaN NaN
0.010583 NaN NaN NaN ... NaN NaN NaN
0.013607 NaN NaN NaN ... NaN NaN NaN
7.680044 7.690029 7.700008 7.710012 7.720014 7.730019 7.740026
0.001512 NaN NaN NaN NaN NaN NaN NaN
0.004536 NaN NaN NaN NaN NaN NaN NaN
0.007560 NaN NaN NaN NaN NaN NaN NaN
0.010583 NaN NaN NaN NaN NaN NaN NaN
0.013607 NaN NaN NaN NaN NaN NaN NaN
[5 rows x 75 columns]
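The NaN in each column should contain the P_g values from that particular Time section. How can I add the P_g values of each section to their respective columns?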
The text file I am reading can be downloaded here.
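One possible way to get the P_g values into their Time columns is to label every row with the section it belongs to and then pivot, rather than filling an empty frame column by column. The sketch below is not a definitive answer: it assumes the preamble lines before the first Time header have already been skipped (with the real file that would mean passing a suitable `skiprows`), it reads from a made-up in-memory `sample` instead of `ROP_s_SD.dat`, and the names `raw`, `section`, `data` and `wide` are only illustrative.

```python
import io

import pandas as pd

# Hypothetical sample: two Time sections, preamble lines already removed
sample = """Time = 0.7000032425E+0001
Y P_g
0.1511904760E-0002 0.2565604063E+0006
0.4535714164E-0002 0.2565349844E+0006
Time = 0.7010014057E+0001
Y P_g
0.1511904760E-0002 0.2565604000E+0006
0.4535714164E-0002 0.2565349000E+0006
"""

# Read everything as text into three columns; two-field rows get NaN in column 2
raw = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None,
                  names=[0, 1, 2], dtype=str)

is_time = raw[0] == "Time"                      # rows that start a new section
section = is_time.cumsum()                      # 1, 1, 1, 1, 2, 2, 2, 2
times = raw.loc[is_time, 2].astype(float).to_numpy()

# Keep only the numeric rows (drop the Time rows and the "Y P_g" header rows)
data = raw.loc[~is_time & (raw[0] != "Y"), [0, 1]].astype(float)
data.columns = ["Y", "P_g"]

# Attach the time of the enclosing section to every data row
data["Time"] = times[section.loc[data.index].to_numpy() - 1]

# Long to wide: Y as the index, one column of P_g per Time value
wide = data.pivot(index="Y", columns="Time", values="P_g")
print(wide)
```

The pivot aligns rows on the Y value itself, so it does not rely on every section having the same number of rows; a Y value missing from a section would simply show up as NaN in that column.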
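This works great! Thanks. If you have time, an example of plotting each row as a line would be very helpful. The x-axis should be the time t and the y-axis should be the pressure P_g. – wigging 2015-02-12 17:48:39
Do you really want 420 individual lines? That might not be the best way to look at it... – Ajean 2015-02-12 19:29:16
@Gavin I added some plotting code. 420 individual lines would get nasty, so I did it in 2D. – Ajean 2015-02-12 19:57:50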