Here is an approach that uses the reshape2 package.
machine1.workingTime <- 1:10
machine2.workingTime <- 21:30
machine1.producedItems <- 101:110
machine2.producedItems <- 201:210
date <- c("2017-01-01","2017-01-02","2017-01-03","2017-01-04","2017-01-05","2017-01-06",
"2017-01-07","2017-01-08","2017-01-09","2017-01-10")
theData <- data.frame(date,
machine1.producedItems,
machine1.workingTime,
machine2.producedItems,
machine2.workingTime
)
library(reshape2)
meltedData <- melt(theData,measure.vars=2:5)
meltedData$variable <- as.character(meltedData$variable)
# now, extract machine numbers and variable names
variableNames <- strsplit(meltedData$variable,"[.]")
# token after the . is variable name
meltedData$columnName <- sapply(variableNames,function(x) x[2])
# since all variables start with the word 'machine', characters 8 onward are the ID
meltedData$machineId <- as.numeric(sapply(variableNames,function(x) substr(x[1],8,nchar(x[1]))))
theResult <- dcast(meltedData,machineId + date ~ columnName,value.var="value")
head(theResult)
The result is:
> head(theResult)
  machineId       date producedItems workingTime
1         1 2017-01-01           101           1
2         1 2017-01-02           102           2
3         1 2017-01-03           103           3
4         1 2017-01-04           104           4
5         1 2017-01-05           105           5
6         1 2017-01-06           106           6
>
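The name-splitting step above can also be sketched with regular expressions in base R. This is only an illustration, assuming every column follows the same `machine<N>.<measure>` pattern as the example data; the vector `variable` below is made up for the demonstration.

```r
# Hypothetical column names in the same "machine<N>.<measure>" pattern as above
variable <- c("machine1.producedItems", "machine1.workingTime",
              "machine2.producedItems", "machine2.workingTime")

# Capture the digits that follow the word "machine" and use them as the ID
machineId <- as.numeric(sub("^machine([0-9]+)[.].*$", "\\1", variable))

# Drop everything up to and including the first "." to get the column name
columnName <- sub("^[^.]*[.]", "", variable)

print(data.frame(variable, machineId, columnName))
```

Unlike the `substr()` version, this does not rely on the ID starting at a fixed character position, although it still assumes the `machine` prefix.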
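UPDATE (02Dec2017): In response to the comments, if no other identifier uniquely distinguishes the multiple rows for a machine, one can use an aggregation function so that each machine ends up with a single observation.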
theResult <- dcast(meltedData,machineId ~ columnName,
fun.aggregate=mean,value.var="value")
head(theResult)
The result is as follows.
> head(theResult)
  machineId producedItems workingTime
1         1         105.5         5.5
2         2         205.5        25.5
>
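To see why an explicit aggregation function matters here, the following sketch compares `dcast()` with and without `fun.aggregate` on a small made-up data set (`longData` is hypothetical, not the data from the question). When several rows map to the same cell and no function is supplied, reshape2 falls back to `length()`, so the cells hold counts rather than values.

```r
library(reshape2)

# Small made-up long-format data: two readings per machine and measure
longData <- data.frame(
  machineId  = c(1, 1, 1, 1, 2, 2, 2, 2),
  columnName = rep(c("producedItems", "workingTime"), times = 4),
  value      = c(101, 1, 102, 2, 201, 21, 202, 22)
)

# No fun.aggregate: reshape2 defaults to length(), so each cell is a row count
counts <- dcast(longData, machineId ~ columnName, value.var = "value")

# Explicit aggregate: each cell is the mean of the duplicated readings
means <- dcast(longData, machineId ~ columnName,
               fun.aggregate = mean, value.var = "value")

print(counts)
print(means)
```

Any other summary function that returns a single value, such as `sum` or `max`, could be substituted for `mean` depending on what one observation per machine should represent.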
UPDATE (02Dec2017): In response to the comments, a solution that uses a unique sequence number to distinguish the rows of data looks like this.
machine1.workingTime <- 1:10
machine2.workingTime <- 21:30
machine1.producedItems <- 101:110
machine2.producedItems <- 201:210
id <- seq_along(machine1.workingTime)
theData <- data.frame(id,
machine1.producedItems,
machine1.workingTime,
machine2.producedItems,
machine2.workingTime
)
meltedData <- melt(theData,measure.vars=2:5)
head(meltedData)
meltedData$variable <- as.character(meltedData$variable)
# now, extract machine numbers and variable names
variableNames <- strsplit(meltedData$variable,"[.]")
meltedData$columnName <- sapply(variableNames,function(x) x[2])
meltedData$machineId <- as.numeric(sapply(variableNames,function(x) substr(x[1],8,nchar(x[1]))))
theResult <- dcast(meltedData,machineId + id ~ columnName,value.var="value")
head(theResult)
...and the output.
> head(theResult)
  machineId id producedItems workingTime
1         1  1           101           1
2         1  2           102           2
3         1  3           103           3
4         1  4           104           4
5         1  5           105           5
6         1  6           106           6
>
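Please provide a code example, including your data frame (or made-up data similar to your data frame), and show how far you have gotten and where you are stuck. –
It is not clear whether these are column names or values in a column. What is 'MachineNum'? – akrun
I think the keywords you are looking for are long format versus wide format data and how to convert from one to the other. If you provide sample data, you may get a better answer. – snoram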