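Monte Carlo simulation (triangular distribution) across CSV cost data

I have a CSV list of cost estimates, filled in by non-R users, in which each row holds the low (l), central (c) and upper (u) range estimates for one line item. An example of the CSV data after I have read it into R is as follows: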

  Item            l     c     u
  <chr>       <int> <int> <int>
1 "CostItem1"  1500  1900  2600
2 "CostItem2"  2400  3200  4400
3 "CostItem3"   500  1000  1500

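Each row is then passed to the triangular distribution function rtriangle (from library(triangle)) over a number of iterations (runs = 10000 in this case), as follows: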

CostItem1 <- rtriangle(runs, l, u, c)

I am currently typing the range estimates for each cost item (CostItem1, CostItem2, etc.) into the rtriangle function by hand.
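As a rough illustration of what each of these calls produces, the sketch below draws triangular samples in base R by inverse-transform sampling. The function name rtri_sketch is hypothetical and this is not the triangle package's implementation; it assumes l < u and l <= c <= u, and only mirrors the argument order of rtriangle(n, a, b, c).

```r
# Hypothetical helper (not from the triangle package): inverse-transform
# sampling from a triangular distribution with lower a, upper b and mode c.
# Assumes a < b and a <= c <= b.
rtri_sketch <- function(n, a, b, c) {
  u <- runif(n)                 # uniform draws on (0, 1)
  f_c <- (c - a) / (b - a)      # CDF value at the mode
  ifelse(u < f_c,
         a + sqrt(u * (b - a) * (c - a)),        # left of the mode
         b - sqrt((1 - u) * (b - a) * (b - c)))  # right of the mode
}

set.seed(1)
runs <- 10000
CostItem1 <- rtri_sketch(runs, 1500, 2600, 1900)  # lower, upper, mode for CostItem1
range(CostItem1)  # stays within [1500, 2600]
mean(CostItem1)   # close to (1500 + 1900 + 2600) / 3 = 2000
```

Note the argument order: the upper bound comes before the mode, which is easy to get wrong when the CSV columns are ordered l, c, u.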

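My question is:

How can I create a loop, or use some other approach, to do this directly from the CSV file once it has been read into R? As a novice I do not know how to approach this, and all my Google searching has turned up nothing.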

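The cost item data is then combined in a new data frame (TotalCostEstimate), which contains the 10,000 simulations, with each row summed to give the modelled total cost (TotalCost):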

TotalCostEstimate <- data.frame(CostItem1, CostItem2, TotalCost = rowSums(cbind(CostItem1, CostItem2)))

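From here the data can be plotted and presented for analysis and decision making. For a small number of cost items, manual entry is not too bad, but I sometimes have more than 50 rows, and I do not want to do this 50+ times!

Thank you very much for taking the time to look at this.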

Source

2017-06-14 Nick

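Rather than working directly from the CSV file, it is better to read the CSV into R, create a matrix to hold the simulated costs, and then run a for loop to simulate the values.

For example, like this: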

library(triangle) # Provides rtriangle(n, a, b, c)

runs <- 1000 # Set number of runs
Info_costs <- read.csv("Your_file_name.csv") # Read in the information
# Create an empty matrix to contain your simulations (one column per cost item)
Total_cost_items <- matrix(NA_real_, nrow = runs, ncol = nrow(Info_costs))
# Fill the matrix, one cost item at a time
for (i in seq_len(nrow(Info_costs))) {
    Total_cost_items[, i] <- rtriangle(n = runs, a = Info_costs$l[i], b = Info_costs$u[i], c = Info_costs$c[i])
}
# Append the row sums to the matrix
Total_cost_items <- data.frame(Total_cost_items, rowSums(Total_cost_items))

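You will of course need to use the correct file name, and you may need to adjust the read.csv call so that it reads your file correctly. You can also rename the columns of the data frame to something more useful later.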
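The same simulation can also be written without an explicit loop. The following is only a sketch: it uses an inline data frame in place of the CSV file, and the column naming and the TotalCost column are illustrative choices rather than part of the code above.

```r
library(triangle)  # provides rtriangle(n, a, b, c)

runs <- 1000
# Stand-in for read.csv("Your_file_name.csv"); values copied from the question
Info_costs <- data.frame(
  Item = c("CostItem1", "CostItem2", "CostItem3"),
  l = c(1500, 2400, 500),
  c = c(1900, 3200, 1000),
  u = c(2600, 4400, 1500),
  stringsAsFactors = FALSE
)

# mapply() calls rtriangle() once per row; each call returns `runs` values,
# so the result is a runs x nrow(Info_costs) matrix
sims <- mapply(function(lo, hi, mode) rtriangle(n = runs, a = lo, b = hi, c = mode),
               Info_costs$l, Info_costs$u, Info_costs$c)
colnames(sims) <- Info_costs$Item

Total_cost_items <- data.frame(sims, TotalCost = rowSums(sims))
head(Total_cost_items)
```

Whether this is preferable to the for loop is mostly a matter of taste; for a few dozen items the speed difference should be negligible.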

Source

2017-06-14 10:44:02

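Maarten, thank you very much, your code works. As you say, I need to adjust the column names. I thought I would base this on the number of rows in the CSV data, so that more items can be added, and then create an 'n'-row arrangement to make naming easier. – Nick

@nick_dawe You're welcome. If you want to use the code on additional CSV files later, then column names sound like a good idea. I am not sure what you mean by the 'n'-row arrangement –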

You can use read.csv to read the data and store it as a data.frame. Here is some dummy data:

df <- data.frame(Item=letters[1:3], l=1:3, c=2:4, u=3:5) 
df 

  Item l c u
1    a 1 2 3
2    b 2 3 4
3    c 3 4 5

You can use foreach and dplyr to do what you want:

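library(foreach)
library(dplyr)
library(triangle) # Provides rtriangle(n, a, b, c)

# rtriangle() expects lower (a), upper (b), then mode (c), so the arguments are named
df <- foreach(I=1:nrow(df), .combine=rbind) %do% rtriangle(10, a=df$l[I], b=df$u[I], c=df$c[I]) %>%
as.data.frame() %>%
mutate(sum = rowSums(.))

This iterates over each row of df, runs rtriangle, binds the results into a matrix, and converts the matrix to a data.frame, on which rowSums can then be computed.

My output

The output below came from calling rtriangle(10, df$l[I], df$c[I], df$u[I]), i.e. with c and u swapped. Because rtriangle takes (n, a, b, c) with c as the mode, the mode then lies outside [a, b] and every draw is NaN:

   V1  V2  V3  V4  V5  V6  V7  V8  V9 V10 sum
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN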

Source

2017-06-14 11:10:48 CPak

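CPak, thanks for your reply. When I ran your code it created 10 columns based on the rows. Maarten's code gave me the answer I needed. Thanks again. – Nick

I think your answer can be found in the 'sum' column, but I am glad you got your answer. – CPak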

Solved - thanks to @Maarten Punt!

I thought I would post the final working solution:

# Assumes library(triangle) is loaded, runs is set (e.g. 10000) and basedata holds the CSV read into R
# Create an empty matrix to contain the simulations
TotalCostEstimate <- matrix(NA_real_, nrow = runs, ncol = length(basedata$Item))
# Fill the matrix, choosing the distribution by DistType (1 [triangle] or 2 [discrete])
for (i in seq_along(basedata$Item)) {
    if (basedata$DistType[i] == 1) {
        TotalCostEstimate[, i] <- rtriangle(n = runs, basedata$l[i], basedata$u[i], basedata$c[i])
    } else {
        TotalCostEstimate[, i] <- sample(c(0, basedata$u[i]), runs, replace = TRUE)
    }
}
# Append the matrix with the row sums
TotalCostEstimate <- data.frame(TotalCostEstimate, rowSums(TotalCostEstimate))
# Rename the columns to the cost items from the base data
for (i in seq_along(basedata$Item)) {
    colnames(TotalCostEstimate)[i] <- as.character(basedata$Item[i])
}
# Rename the last column based on the number of cost items
i <- length(basedata$Item)
colnames(TotalCostEstimate)[i + 1] <- "TotalCost"

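Note that I modified the CSV file to include a new field, "DistType", which lets the user choose the type of distribution used in the simulation: discrete (on or off) or triangular: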

  Item                l     c     u DistType
  <chr>           <int> <int> <int>    <int>
1 "CostItem1"      1500  1900  2600        1
2 "CostItem2"      2400  3200  4400        1
3 "CostItem3"       500  1000  1500        1
4 "DiscCostItem4"     0     0  1500        2

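I also modified the loop to take the cost item names from the CSV file and assign them to the output columns, with the final sum column [i + 1] named 'TotalCost'. This allows outputs and plots to be named automatically from the column names (again using a loop).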
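Such a plotting loop might look like the sketch below. It is not the code actually used here: the stand-in data frame is filled with uniform draws purely so that the example runs on its own, and the output file name is hypothetical.

```r
# Stand-in for the simulated TotalCostEstimate (uniform draws, illustration only)
set.seed(42)
runs <- 1000
TotalCostEstimate <- data.frame(
  CostItem1 = runif(runs, 1500, 2600),
  CostItem2 = runif(runs, 2400, 4400)
)
TotalCostEstimate$TotalCost <- rowSums(TotalCostEstimate)

# One histogram per column, titled with the column name
plot_file <- file.path(tempdir(), "cost_histograms.pdf")  # hypothetical file name
pdf(plot_file)
for (nm in colnames(TotalCostEstimate)) {
  hist(TotalCostEstimate[[nm]], main = nm, xlab = "Simulated cost")
}
dev.off()
```

Because the titles come from colnames(), adding a row to the CSV adds a plot without any change to the code.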

Source

2017-06-14 14:41:28 Nick

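Sorry, I should not have used my phone for this. What I meant was: nice, but you do not actually need the loop for the column names. colnames(TotalCostEstimate)[1:length(basedata$Item)] <- basedata$Item should do the trick and speed up the computation –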
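A small self-contained sketch of that one-line renaming, using a made-up two-item table in place of the real data. The as.character() call is an addition here, as a precaution in case Item was read in as a factor.

```r
# Made-up stand-ins for basedata and the simulation matrix (illustration only)
basedata <- data.frame(Item = c("CostItem1", "CostItem2"),
                       stringsAsFactors = TRUE)  # factor, as older read.csv() would give
sims <- matrix(c(1600, 1700, 2500, 2600), nrow = 2)
TotalCostEstimate <- data.frame(sims, rowSums(sims))

n_items <- length(basedata$Item)
# Rename all item columns in one assignment, then the sum column
colnames(TotalCostEstimate)[1:n_items] <- as.character(basedata$Item)
colnames(TotalCostEstimate)[n_items + 1] <- "TotalCost"
colnames(TotalCostEstimate)
```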

Thanks again Maarten, I will give it a try. – Nick
