Julia pmap performance
I am trying to port some of my R code to Julia; basically, I have rewritten the following R code in Julia:
library(parallel)
eps_1<-rnorm(1000000)
eps_2<-rnorm(1000000)
large_matrix<-ifelse(cbind(eps_1,eps_2)>0,1,0)
matrix_to_compare = expand.grid(c(0,1),c(0,1))
indices<-seq(1,1000000,4)
large_matrix<-lapply(indices,function(i)(large_matrix[i:(i+3),]))
function_compare<-function(x){
  which((rowSums(x==matrix_to_compare)==2) %in% TRUE)
}
> system.time(lapply(large_matrix,function_compare))
user system elapsed
38.812 0.024 38.828
> system.time(mclapply(large_matrix,function_compare,mc.cores=11))
user system elapsed
63.128 1.648 6.108
As one can see, I get a significant speed-up when going from one core to 11. Now I am trying to do the same in Julia:
#Define cluster:
addprocs(11);
using Distributions;
@everywhere using Iterators;
d = Normal();
eps_1 = rand(d,1000000);
eps_2 = rand(d,1000000);
#Create a large matrix:
large_matrix = hcat(eps_1,eps_2).>=0;
indices = collect(1:4:1000000)
#Split large matrix:
large_matrix = [large_matrix[i:(i+3),:] for i in indices];
#Define the function to apply:
@everywhere function function_split(x)
    matrix_to_compare = transpose(reinterpret(Int,collect(product([0,1],[0,1])),(2,4)));
    matrix_to_compare = matrix_to_compare.>0;
    find(sum(x.==matrix_to_compare,2).==2)
end
@time map(function_split,large_matrix)
@time pmap(function_split,large_matrix)
5.167820 seconds (22.00 M allocations: 2.899 GB, 12.83% gc time)
18.569198 seconds (40.34 M allocations: 2.082 GB, 5.71% gc time)
As one can see, I am not getting any speed-up with pmap. Maybe someone can suggest an alternative.
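'large_matrix' is a '250000-element Array{Any,1}:'. Maybe this is the problem? – daycaster
I really don't know; I am very new to Julia. – Vitalijs
In Julia 0.4.6 I get the following results with 'addprocs(3)': '4.173674 seconds (22.97 M allocations: 2.943 GB, 14.57% gc time)' and '0.795733 seconds (292.07 k allocations: 12.377 MB, 0.83% gc time)'. Also, the type of 'large_matrix' is 'Array{BitArray{2},1}'. – tim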
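Since the comments report two different element types for 'large_matrix' ('Array{Any,1}' and 'Array{BitArray{2},1}'), it may be worth checking what you get on your own setup. Below is a minimal sketch, not a fix for the pmap timing: it uses a small stand-in matrix 'm' and the hypothetical names 'chunks' and 'typed_chunks', which are not in the question. The typed comprehension in the last step is one possible way to pin the element type if the plain comprehension turns out to give 'Any'.

```julia
# Small stand-in for the question's data: 8 rows instead of 1000000.
m = hcat(randn(8), randn(8)) .>= 0

# Same splitting pattern as in the question: blocks of 4 rows.
chunks = [m[i:(i+3), :] for i in 1:4:8]

# Inspect the container type and its element type.
println(typeof(chunks))
println(eltype(chunks))

# Assumption: if the element type above is Any, a typed comprehension
# forces a concrete element type for the container.
typed_chunks = BitArray{2}[m[i:(i+3), :] for i in 1:4:8]
println(eltype(typed_chunks))
```

Whether the element type actually explains the timing difference is not established in this thread; the sketch only shows how to check it.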