2014-09-23 83 views

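Simultaneous call count in R

I want to measure the number of simultaneous calls at any given moment on a phone system. This is the same problem as the SQL example, but using R and a CSV file.

For a CSV file with three columns, Id (numeric), TimeOfCall (POSIXct), and TimeOfClose (POSIXct), I do the following, but I'm wondering whether there is a more efficient way to do this in R (on a Windows machine)?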

#Small dataset example. Much larger in reality 
id <- c(1,2,3,4) 

TimeOfStart <- c(as.POSIXct("2013-01-01 10:10:00", format="%Y-%d-%m %H:%M:%S"), as.POSIXct("2013-01-01 10:14:00", format="%Y-%d-%m %H:%M:%S"), as.POSIXct("2013-03-01 10:10:00", format="%Y-%d-%m %H:%M:%S"), as.POSIXct("2013-03-01 10:20:00", format="%Y-%d-%m %H:%M:%S")) 

TimeOfEnd <- c(as.POSIXct("2013-01-01 10:20:00", format="%Y-%d-%m %H:%M:%S"), as.POSIXct("2013-01-01 10:44:00", format="%Y-%d-%m %H:%M:%S"), as.POSIXct("2013-03-01 10:21:00", format="%Y-%d-%m %H:%M:%S"), as.POSIXct("2013-03-01 10:25:00", format="%Y-%d-%m %H:%M:%S")) 

call_data <- data.frame(id, TimeOfStart, TimeOfEnd) 

# Holder for all POSIXct values converted to numeric seconds
stringoftimes <- numeric(0)

# Loop through all rows and append every second between start and end time,
# after converting POSIXct to numeric.
for (i in 1:nrow(call_data))
{
    stringoftimes <- c(stringoftimes, as.numeric(call_data$TimeOfStart[i]):as.numeric(call_data$TimeOfEnd[i]))

}

#Convert to table so that count of entries takes place 
stringoftimes <- as.data.frame(table(stringoftimes)) 

#Sort table to see highest results first 
stringoftimes <- stringoftimes[order(stringoftimes$Freq, decreasing=TRUE),] 

The example doesn't work, because call_data$TimeOfCall/End should be TimeOfStart/Stop? – Spacedman 2014-09-23 14:40:47


Thanks Spacedman, I've fixed the typo. – 2014-09-23 15:02:52

Answer


How about this, then:

Compute the number of calls at this sequence of time points, one per minute:

> t = seq(min(call_data$TimeOfStart),max(call_data$TimeOfEnd),by=60) 

The function is this:

> ncalls = Vectorize(function(t){sum(t>=call_data$TimeOfStart & t<=call_data$TimeOfEnd)}) 

Plot it as a step function:

> plot(t,ncalls(t),type="s") 
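To make the steps above concrete, here is a minimal self-contained sketch that runs them on the question's four sample calls and reads off the peak. It assumes ISO-formatted timestamps in UTC (the question's `%Y-%d-%m` format parses the later two calls as 3 January). The names `to_time`, `grid`, `counts`, `peak`, and `peak_time` are illustrative and do not come from the thread.

```r
# Sample data mirroring the question's layout (assumed ISO dates, UTC).
to_time <- function(x) as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
call_data <- data.frame(
  id = 1:4,
  TimeOfStart = to_time(c("2013-01-01 10:10:00", "2013-01-01 10:14:00",
                          "2013-01-03 10:10:00", "2013-01-03 10:20:00")),
  TimeOfEnd   = to_time(c("2013-01-01 10:20:00", "2013-01-01 10:44:00",
                          "2013-01-03 10:21:00", "2013-01-03 10:25:00"))
)

# Count the calls in progress at each requested time point.
ncalls <- Vectorize(function(when) {
  sum(when >= call_data$TimeOfStart & when <= call_data$TimeOfEnd)
})

# One grid point per minute across the whole observation window.
grid <- seq(min(call_data$TimeOfStart), max(call_data$TimeOfEnd), by = 60)
counts <- ncalls(grid)

# Peak concurrency and the first minute at which it is reached.
peak <- max(counts)
peak_time <- grid[which.max(counts)]
```

Note that the grid has one entry per minute of the full window, including the idle days between calls, and every entry is compared against every call, so the cost grows with both the window length and the number of calls.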

Thanks Spacedman. I get an error saying a vector of 600+ Mb cannot be allocated. I'll try your method on a small subset of the data. – 2014-09-23 15:16:10


It is probably generating a huge vector of times. You can't expect to compute this for every minute over ten years.... – Spacedman 2014-09-23 15:49:31
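One way to sidestep the memory problem discussed in these comments is to avoid a time grid altogether and count only at the moments when something changes. The sketch below is a different technique from the one in the answer (an event sweep with `cumsum`), offered as an assumption-laden illustration: `concurrent_calls` is a hypothetical helper, and the sample data again assumes ISO timestamps in UTC.

```r
# Hypothetical helper (not from the thread): concurrency from start/end events.
concurrent_calls <- function(start, end) {
  # Each call contributes +1 at its start and -1 at its end.
  times <- c(as.numeric(start), as.numeric(end))
  delta <- c(rep(1L, length(start)), rep(-1L, length(end)))
  # Sort by time; on ties, starts (+1) come before ends (-1), so a call that
  # begins at the instant another ends still counts as overlapping, matching
  # the inclusive `<=` comparison used in the answer.
  ord <- order(times, -delta)
  data.frame(
    time   = as.POSIXct(times[ord], origin = "1970-01-01", tz = "UTC"),
    active = cumsum(delta[ord])  # calls in progress right after each event
  )
}

# Same four sample calls as in the question (assumed ISO dates, UTC).
to_time <- function(x) as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
starts <- to_time(c("2013-01-01 10:10:00", "2013-01-01 10:14:00",
                    "2013-01-03 10:10:00", "2013-01-03 10:20:00"))
ends   <- to_time(c("2013-01-01 10:20:00", "2013-01-01 10:44:00",
                    "2013-01-03 10:21:00", "2013-01-03 10:25:00"))

events <- concurrent_calls(starts, ends)
peak <- max(events$active)
```

The result has two rows per call regardless of how long the observation window is, so memory scales with the number of calls rather than with the number of minutes or seconds covered. It can be drawn the same way as before with `plot(events$time, events$active, type = "s")`.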