How do I subset a data frame based on column names?

I have this data frame:

dput(df) 
structure(list(Server = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "servera", class = "factor"), 
    Date = structure(1:6, .Label = c("7/13/2017 15:01", "7/13/2017 15:02", 
    "7/13/2017 15:03", "7/13/2017 15:04", "7/13/2017 15:05", 
    "7/13/2017 15:06"), class = "factor"), Host_CPU = c(1.812950134, 
    2.288070679, 1.563278198, 1.925239563, 5.350669861, 2.612503052 
    ), UsedMemPercent = c(38.19, 38.19, 38.19, 38.19, 38.19, 
    38.22), jvm1 = c(10.91, 11.13, 11.34, 11.56, 11.77, 11.99 
    ), jvm2 = c(11.47, 11.7, 11.91, 12.13, 12.35, 12.57), jvm3 = c(75.65, 
    76.88, 56.93, 58.99, 65.29, 67.97), jvm4 = c(39.43, 40.86, 
    42.27, 43.71, 45.09, 45.33), jvm5 = c(27.42, 29.63, 31.02, 
    32.37, 33.72, 37.71)), .Names = c("Server", "Date", "Host_CPU", 
"UsedMemPercent", "jvm1", "jvm2", "jvm3", "jvm4", "jvm5"), class = "data.frame", row.names = c(NA, 
-6L))

I would like to subset this data frame based on this vector of variable names:

select<-c("jvm3", "jvm4", "jvm5")

So my final df should look like this:

structure(list(Server = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "servera", class = "factor"), 
    Date = structure(1:6, .Label = c("7/13/2017 15:01", "7/13/2017 15:02", 
    "7/13/2017 15:03", "7/13/2017 15:04", "7/13/2017 15:05", 
    "7/13/2017 15:06"), class = "factor"), Host_CPU = c(1.812950134, 
    2.288070679, 1.563278198, 1.925239563, 5.350669861, 2.612503052 
    ), UsedMemPercent = c(38.19, 38.19, 38.19, 38.19, 38.19, 
    38.22), jvm3 = c(75.65, 76.88, 56.93, 58.99, 65.29, 67.97 
    ), jvm4 = c(39.43, 40.86, 42.27, 43.71, 45.09, 45.33), jvm5 = c(27.42, 
    29.63, 31.02, 32.37, 33.72, 37.71)), .Names = c("Server", 
"Date", "Host_CPU", "UsedMemPercent", "jvm3", "jvm4", "jvm5"), class = "data.frame", row.names = c(NA, 
-6L))

Any ideas?

Source

2017-07-14 user1471980

The solution is: 'df[select]' –

'df[c("Server", "Date", "Host_CPU", "UsedMemPercent", select)]'. Alternatively, you can use 'df[, c("Server", "Date", "Host_CPU", "UsedMemPercent", select)]'. Or 'subset(df, select = c("Server", "Date", "Host_CPU", "UsedMemPercent", select))'. For details, see '?subset' or '?"["'. – Gregor

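Just a note: it is much appreciated when you take the extra step of modifying the dput output into something that can be pasted directly into R. That is, put it in the form 'your_data <- {insert dput output here}'. – Dason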

Save your data frame to a variable df:

df <- 
    structure(
    list(
     Server = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "servera", class = "factor"), 
     Date = structure(
     1:6, 
     .Label = c(
      "7/13/2017 15:01", 
      "7/13/2017 15:02", 
      "7/13/2017 15:03", 
      "7/13/2017 15:04", 
      "7/13/2017 15:05", 
      "7/13/2017 15:06" 
     ), 
     class = "factor" 
    ), 
     Host_CPU = c(
     1.812950134, 
     2.288070679, 
     1.563278198, 
     1.925239563, 
     5.350669861, 
     2.612503052 
    ), 
     UsedMemPercent = c(38.19, 38.19, 38.19, 38.19, 38.19, 
         38.22), 
     jvm1 = c(10.91, 11.13, 11.34, 11.56, 11.77, 11.99), 
     jvm2 = c(11.47, 11.7, 11.91, 12.13, 12.35, 12.57), 
     jvm3 = c(75.65, 
       76.88, 56.93, 58.99, 65.29, 67.97), 
     jvm4 = c(39.43, 40.86, 
       42.27, 43.71, 45.09, 45.33), 
     jvm5 = c(27.42, 29.63, 31.02, 
       32.37, 33.72, 37.71) 
    ), 
    .Names = c(
     "Server", 
     "Date", 
     "Host_CPU", 
     "UsedMemPercent", 
     "jvm1", 
     "jvm2", 
     "jvm3", 
     "jvm4", 
     "jvm5" 
    ), 
    class = "data.frame", 
    row.names = c(NA,-6L) 
)

df[, select] should be what you're looking for.

Source

2017-07-14 17:59:58

This answer does not work – user1471980

@user1471980 This answer works fine given the 'select' you explicitly created, but you did not say that you also wanted to keep several other columns. –

@user1471980 Yes, I misunderstood your question. It looks like you need: 'cbind(df[, 1:4], df[, select])' –

Here is one way, selecting the columns by position:

df[,c(1:4,7:9)]

You can also select columns with dplyr:

library(dplyr)
select(df, Server, Date, Host_CPU, UsedMemPercent, jvm3, jvm4, jvm5)
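If the column names are already stored in a character vector, as in the question, dplyr can use that vector directly instead of having every name typed out. The following is a minimal sketch, assuming dplyr 1.0.0 or later (which exports `all_of()`); the small `df` is a shortened stand-in with the same column names as the question's data, and `jvm_cols` is a hypothetical name chosen so the vector does not share a name with the `select()` function.

```r
library(dplyr)

# Stand-in data: same column names as the question's df, values shortened
df <- data.frame(
  Server = "servera",
  Date = c("7/13/2017 15:01", "7/13/2017 15:02"),
  Host_CPU = c(1.81, 2.29),
  UsedMemPercent = c(38.19, 38.19),
  jvm1 = c(10.91, 11.13),
  jvm2 = c(11.47, 11.70),
  jvm3 = c(75.65, 76.88),
  jvm4 = c(39.43, 40.86),
  jvm5 = c(27.42, 29.63)
)

# Hypothetical name for the vector of wanted columns
jvm_cols <- c("jvm3", "jvm4", "jvm5")

# Keep the contiguous identifier columns by range, then the columns named
# in the vector; all_of() raises an error if any name is missing
result <- dplyr::select(df, Server:UsedMemPercent, all_of(jvm_cols))
names(result)
```

Calling `dplyr::select()` with the package prefix avoids any confusion with a variable that happens to be called `select`.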

Source

2017-07-14 18:13:39 Mako212

Please revisit indexing. When you use the indexing mechanism [ in R, there are three main types of index you can use:

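Logical vector: same length as the number of columns; TRUE means the column is selected
Numeric vector: selects columns by position
Character vector: selects columns by name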
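As a quick illustration of the three index types, here is a small sketch using the built-in iris data frame. The variable names (`is_num`, `by_logical`, and so on) are made up for this example.

```r
# Logical vector: one TRUE/FALSE per column (here: TRUE for numeric columns)
is_num <- vapply(iris, is.numeric, logical(1))
by_logical <- iris[is_num]

# Numeric vector: columns chosen by position
by_position <- iris[c(1, 5)]

# Character vector: columns chosen by name
by_name <- iris[c("Sepal.Length", "Species")]

names(by_logical)
names(by_position)
names(by_name)
```

Selecting by name is usually the most robust of the three, because it keeps working if columns are later reordered or new ones are added.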

When you use the indexing mechanism on data frames, you can treat these objects in two ways:

As a list, because internally they are lists
As a matrix, because in many cases they mimic matrix behavior

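Take the iris data frame as an example to compare the different ways of selecting columns from a data frame. If you treat it as a list, you have the following two options: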

Use [[ if you want a single column as a vector:

iris[["Species"]] 
# [1] setosa  setosa  setosa ... : is a vector

Use [ if you want one or more columns and need a data frame back:

iris["Species"] 
iris[c("Sepal.Width", "Species")]

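If you treat it as a matrix, you do exactly what you would do with a matrix. If you don't specify a row index, these commands are equivalent to the ones used above: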

iris[ , "Species"] # is the same as iris[["Species"]] 
iris[ , "Species", drop = FALSE] # is the same as iris["Species"] 
iris[ , c("Sepal.Width", "Species")] # is the same as iris[c("Sepal.Width", "Species")]

So in your case, you just need:

select <- c("Server","Date","Host_CPU","UsedMemPercent", 
      "jvm3","jvm4","jvm5") 
df[select]

Note: subset works, but it is meant for interactive use only. There is a warning on its help page stating:

This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like [, and in particular the non-standard evaluation of argument subset can have unanticipated consequences.
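For programmatic use, the selection can be wrapped in a small helper built on [ instead of subset. The sketch below is only an illustration: `keep_columns` and its arguments are hypothetical names, not part of base R, and iris stands in for the question's data.

```r
# Hypothetical helper: return `data` restricted to the identifier columns
# plus any extra columns, failing early if a requested name does not exist
keep_columns <- function(data, id_cols, extra_cols) {
  wanted <- unique(c(id_cols, extra_cols))
  missing_cols <- setdiff(wanted, names(data))
  if (length(missing_cols) > 0) {
    stop("Columns not found: ", paste(missing_cols, collapse = ", "))
  }
  data[wanted]  # list-style indexing always returns a data frame
}

picked <- keep_columns(iris, "Species", c("Sepal.Length", "Sepal.Width"))
names(picked)
```

With the question's data, the call would look like `keep_columns(df, c("Server", "Date", "Host_CPU", "UsedMemPercent"), c("jvm3", "jvm4", "jvm5"))`.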

Source

2017-07-14 18:51:10
