Savitzky-Golay filtering for large datasets

I want to apply a Savitzky-Golay filter (from the prospectr package) to a set of samples collected for different areas of interest. Here is a sample of the data.

> head(file,10) 
    subject eye sample_num area sample_value 
     1 L   1 1 -7.813280 
     1 L   2 1 -7.816787 
     1 L   3 1 -7.826342 
     1 L   4 1 -7.799060 
     1 L   5 1 -7.817019 
     1 L   6 1 -7.845589 
     1 L   7 1 -7.881824 
     1 L   8 1 -7.969951 
     1 L   9 1 -8.022991 
     1 L   10 1 -8.118056 


> dput(head(file)) 
structure(list(subject = c(1L, 1L, 1L, 1L, 1L, 1L), eye = structure(c(1L, 
1L, 1L, 1L, 1L, 1L), .Label = c("L", "R"), class = "factor"), 
    sample_num = 1:6, area = c(1L, 1L, 1L, 1L, 1L, 1L), sample_value = c(-7.81328047761194, 
-7.81678696801706, -7.82634248187633, -7.79906019616205, 
-7.81701949680171, -7.84558887846482)), .Names = c("subject", 
"eye", "sample_num", "area", "sample_value"), row.names = c(NA, 
6L), class = "data.frame")

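The values in sample_value are eye positions recorded for the left and right eye, sampled once per millisecond.

What I want to do is apply the filter to the sample data of each area. I tried using ddply from the plyr package to split the file into subsets by subject, eye and area and apply the filter to each subset (I want to keep the original sample values and store the filtered values in a new column). The code is below.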

newfile <- ddply(file, .(file$subject, file$eye, file$area), 
      function(x){ 
       x$sg_filtered <- savitzkyGolay(x$sample_value, 1,1,3) 
       return(x)})

However, I get the following error:

Error in `$<-.data.frame`(`*tmp*`, "sg", value = c(-0.00653100213219515, : 
    replacement has 1838 rows, data has 1840

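Presumably this is because the column of filtered data has no values corresponding to the first and last sample_value in each area. Is there a way to adjust the code so that I get NA for those positions and both columns keep the same length? I would really appreciate any help. Thanks!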
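The length mismatch can be reproduced in base R without the package. This is only a sketch under a stated assumption: central_diff is a hypothetical stand-in, not a prospectr function. For a window of 3, the first derivative of a first-order fit reduces to the central difference, and its first value on the posted data agrees with the first value shown in the error message.

```r
# Hypothetical stand-in for savitzkyGolay(x, 1, 1, 3): the slope of a straight
# line fitted to 3 equally spaced points is (x[i + 1] - x[i - 1]) / 2.
central_diff <- function(x) {
  n <- length(x)
  (x[3:n] - x[1:(n - 2)]) / 2
}

# The six sample values from the dput() output above
x <- c(-7.81328047761194, -7.81678696801706, -7.82634248187633,
       -7.79906019616205, -7.81701949680171, -7.84558887846482)

d <- central_diff(x)

# A window of w = 3 drops (w - 1) / 2 = 1 value at each end of the series
length(x) - length(d)
```

The same arithmetic accounts for the 1838 versus 1840 rows in the error: two values are lost per filtered subset.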

2016-09-21 user2711113

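You could use 'dput(head(file))' to provide a reproducible sample. This is from the 'prospectr' package, correct? –

Hi, thanks for your comment. I have edited the question and used 'dput(head(file))' as you suggested. – user2711113

If you want to pad the returned vector with NAs, you can use c():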

set.seed(123) 
x <- rnorm(100) 
w <- 3 # must be odd number 
out <- c(rep(NA, (w-1)/2), savitzkyGolay(x, 1, 1, w = w), rep(NA, (w-1)/2)) 
length(out) 
# [1] 100 
head(out) 
# [1]   NA 1.0595920 0.1503429 -0.7147103 0.8222783 0.1658142 
tail(out) 
# [1] 0.01382324 0.41334027 1.06643511 -1.21151668 -1.27951576   NA
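To carry the padding over to the grouped data, one option is to wrap it in a small helper and apply it within each subject/eye/area group. The following is a minimal sketch, not a definitive implementation: pad_na and sg_like are hypothetical names, the toy values are made up, and sg_like is a base-R stand-in (a central difference) used so that the example runs without prospectr.

```r
# Pad a shortened filter output with NA so it lines up with its input.
# `filter_fun` is assumed to return length(x) - (w - 1) values.
pad_na <- function(x, w, filter_fun) {
  half <- (w - 1) / 2
  # A group shorter than the window cannot be filtered at all
  if (length(x) < w) return(rep(NA_real_, length(x)))
  c(rep(NA_real_, half), filter_fun(x), rep(NA_real_, half))
}

# Stand-in for function(v) savitzkyGolay(v, 1, 1, 3) from prospectr
sg_like <- function(v) diff(v, lag = 2) / 2

# Toy data using the question's column names (values are invented)
dat <- data.frame(
  subject = 1L,
  eye = rep(c("L", "R"), each = 5),
  area = 1L,
  sample_value = c(1, 2, 4, 7, 11, 0, 1, 0, 1, 0)
)

# ave() applies FUN within each subject/eye/area group and returns the
# results in the original row order, so the new column matches the old one
dat$sg_filtered <- ave(
  dat$sample_value, dat$subject, dat$eye, dat$area,
  FUN = function(v) pad_na(v, w = 3, filter_fun = sg_like)
)

print(dat)
```

With prospectr loaded, the stand-in would be replaced by function(v) savitzkyGolay(v, 1, 1, 3). The helper should also work inside the ddply() call from the question, since each subset then receives a column of the same length as its rows.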

2016-09-22 11:18:02
