1
我有一个时间序列(第1列)中,用值(第2栏),这是时间序列中的每个子系列的特征的列数据帧。 如何删除符合条件的子系列?删除子系列(在数据帧中的行),其满足条件
我试图使循环创建一个额外的列与功能,指出要删除的行,但这种解决方案是非常计算成本昂贵(我有一列10毫米记录)。代码(慢溶液):
import numpy as np
import pandas as pd
# sample data (smaller than actual df)
# length of df = 100; should be 10000000 in the actual data frame
time_ser = 100*[25]
max_num = 20
distance = np.random.uniform(0,max_num,100)
to_remove= 100*[np.nan]
data_dict = {'time_ser':time_ser,
'distance':distance,
'to_remove': to_remove
}
df = pd.DataFrame(data_dict)
subser_size = 3
maxdist = 18
# loop which creates an additional column which indicates which indexes should be removed.
# Takes first value in a subseries and checks if it meets the condition.
# If it does, all values in subseries (i.e. rows) should be removed ('wrong').
for i,d in zip(range(len(df)), df.distance):
if d >= maxdist:
df.to_remove.iloc[i:i+subser_size] = 'wrong'
else:
df.to_remove.iloc[i] ='good'
感谢您接受。您也可以注册 - 点击接受标记上方'0'上方的小三角。谢谢。 – jezrael