2017-08-31

Create a categorical variable in a data frame based on the values of a column

I have a data frame in R organized as follows:

> all_data[3945:3952,]
           Date btc_close eth_close vix_close gold_close DEXCHUS
3945 2016-11-01    729.27     10.77     18.56     122.73     828
3946 2016-11-02    742.46        NA     19.32     123.64     826
3947 2016-11-03    687.51        NA     22.08     124.30     827
3948 2016-11-04    702.54        NA     22.51     124.39     824
3949 2016-11-05    704.16        NA        NA         NA      NA
3950 2016-11-06    712.24        NA        NA         NA      NA
3951 2016-11-07    704.02        NA     18.71     122.15     835
3952 2016-11-08    709.15     10.87     18.74     121.64     843

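How can I add a new column with 3 levels? The levels would be -1 for a decrease, 0 for no change, and 1 for an increase. This direction column should be based on the previous day's value of btc_close. (Note: there will be a lot of NAs, so I would then want to subset the data to only the rows that have a btc_close value.)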

Answers


After dropping the rows where btc_close is NA, for this example you can do

dat$change <- c(0, sign(diff(dat$btc_close))) 

and you will get

dat
           Date btc_close eth_close vix_close gold_close DEXCHUS change
3945 2016-11-01    729.27     10.77     18.56     122.73     828      0
3946 2016-11-02    742.46        NA     19.32     123.64     826      1
3947 2016-11-03    687.51        NA     22.08     124.30     827     -1
3948 2016-11-04    702.54        NA     22.51     124.39     824      1
3949 2016-11-05    704.16        NA        NA         NA      NA      1
3950 2016-11-06    712.24        NA        NA         NA      NA      1
3951 2016-11-07    704.02        NA     18.71     122.15     835     -1
3952 2016-11-08    709.15     10.87     18.74     121.64     843      1
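
The one-liner assumes that rows with a missing btc_close have already been removed, and it returns plain numbers rather than a categorical variable. A minimal sketch of both steps might look like the following. The small `all_data` frame here is a hypothetical stand-in for the asker's data, and the column name `direction` is an assumption taken from the question's wording.

```r
# Hypothetical stand-in for the asker's all_data, with one missing btc_close
all_data <- data.frame(
  Date = as.Date("2016-11-01") + 0:4,
  btc_close = c(729.27, 742.46, NA, 742.46, 704.16)
)

# Keep only the rows that have a btc_close value
dat <- all_data[!is.na(all_data$btc_close), ]

# -1 = decrease, 0 = no change, 1 = increase relative to the previous kept row;
# the first row has nothing to compare against and is set to 0 here
dat$change <- c(0, sign(diff(dat$btc_close)))

# Store the result as a categorical variable with three levels
dat$direction <- factor(dat$change, levels = c(-1, 0, 1))

# dat$change is now 0, 1, 0, -1
```

Note that once NA rows are dropped, each row is compared with the previous available row, which may not be the previous calendar day.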

Data

dat <- 
structure(list(Date = structure(1:8, .Label = c("2016-11-01", 
"2016-11-02", "2016-11-03", "2016-11-04", "2016-11-05", "2016-11-06", 
"2016-11-07", "2016-11-08"), class = "factor"), btc_close = c(729.27, 
742.46, 687.51, 702.54, 704.16, 712.24, 704.02, 709.15), eth_close = c(10.77, 
NA, NA, NA, NA, NA, NA, 10.87), vix_close = c(18.56, 19.32, 22.08, 
22.51, NA, NA, 18.71, 18.74), gold_close = c(122.73, 123.64, 
124.3, 124.39, NA, NA, 122.15, 121.64), DEXCHUS = c(828L, 826L, 
827L, 824L, NA, NA, 835L, 843L)), .Names = c("Date", "btc_close", 
"eth_close", "vix_close", "gold_close", "DEXCHUS"), class = "data.frame", row.names = c("3945", 
"3946", "3947", "3948", "3949", "3950", "3951", "3952")) 

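I would suggest the following strategy. You can use ifelse to compare consecutive values in the btc_close column. Since the comparison starts at row 2, remember to prepend an NA value for the first row (or 0, if you prefer).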

df <- data.frame(btc_close = c(223, 222, 224, 224, 223, 223, 224), stuff = NA)
# Compare each row with the previous one; row 1 has no predecessor, so it gets NA
df$direction <- c(NA, sapply(2:nrow(df), function(i) {
  ifelse(df$btc_close[i] > df$btc_close[i - 1], 1,
         ifelse(df$btc_close[i] < df$btc_close[i - 1], -1, 0))
}))
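
For comparison, the same column can be computed without an explicit loop, because `sign(diff(x))` already yields -1, 0, or 1 for each pair of consecutive values. The sketch below uses the same example values; `direction_loop` and `direction_vec` are names introduced here purely for illustration.

```r
df <- data.frame(btc_close = c(223, 222, 224, 224, 223, 223, 224))

# Loop-based version: nested ifelse over row indices 2..n
direction_loop <- c(NA, sapply(2:nrow(df), function(i) {
  ifelse(df$btc_close[i] > df$btc_close[i - 1], 1,
         ifelse(df$btc_close[i] < df$btc_close[i - 1], -1, 0))
}))

# Vectorized version: sign of the difference between consecutive rows
direction_vec <- c(NA, sign(diff(df$btc_close)))

# Both approaches give NA, -1, 1, 0, -1, 0, 1
all.equal(direction_loop, direction_vec)
```

The vectorized form also behaves sensibly for a one-row data frame, where `2:nrow(df)` would count backwards as `c(2, 1)`.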