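Thanks in advance for any and all help. I have a relatively large dataset, and I would like to test whether each string in it exists in a series of subset data frames created from a larger dataset. I can do this in three steps, but I would like to write one piece of code that does it in a single step: create a new column and enter 1 or 0 depending on the condition.
Because of the size of my files, I would like to create the subset t2.a, use it to add a column of 1s and 0s to my file t1, delete it, and then repeat the process for t2.b, t2.c, ...
Thanks again.
My actual dataset is similar to the data frames below.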
t1<- data.frame (A1 = c("red", "blue", "green", "yellow", "brown"),
A2 = c("orange", "purple", "yellow", "black", NA),
A3 = c(1,2,4,5,7))
t2<- data.frame(B2 = c("black", "pink", "lime", "green", "grey", "mist", "blond", "grass", "violet", "red"),
B3 = c("a", "b", "a", "c", "d", "d", "a" , "c", "a", "b"))
> t1
A1 A2 A3
1 red orange 1
2 blue purple 2
3 green yellow 4
4 yellow black 5
5 brown <NA> 7
> t2
B2 B3
1 black a
2 pink b
3 lime a
4 green c
5 grey d
6 mist d
7 blond a
8 grass c
9 violet a
10 red b
# My existing code has three steps
# step 1. create one subset of t2 per value of B3
for(i in unique(t2$B3)) {
colName <- paste("t2", i, sep = ".")
assign(colName, t2[t2$B3==i,])
}
# step 2. check whether each string exists in a given subset
t1$t2.a<- ifelse(t1$A1 %in% t2.a$B2|t1$A2 %in% t2.a$B2,1,0)
#
t1$t2.b<- ifelse(t1$A1 %in% t2.b$B2|t1$A2 %in% t2.b$B2,1,0)
#
t1$t2.c<- ifelse(t1$A1 %in% t2.c$B2|t1$A2 %in% t2.c$B2,1,0)
#
t1$t2.d<- ifelse(t1$A1 %in% t2.d$B2|t1$A2 %in% t2.d$B2,1,0)
# step 3. remove each newly created data set
rm(t2.a)
rm(t2.b)
rm(t2.c)
rm(t2.d)
#The result should look like the dataframe below
A1 A2 A3 t2.a t2.b t2.c t2.d
1 red orange 1 0 1 0 0
2 blue purple 2 0 0 0 0
3 green yellow 4 0 0 1 0
4 yellow black 5 1 0 0 0
5 brown <NA> 7 0 0 0 0
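One possible way to fold the three steps into a single loop is sketched below. It assumes the sample t1 and t2 from above and only uses base R. The loop variable `g` and the temporary vector `vals` are names introduced here for illustration. Instead of creating each t2.a, t2.b, ... data frame with assign() and removing it afterwards, the loop pulls out just the B2 values for one group at a time, so no subset data frame is left in the workspace. Note that as.integer() yields integer 1/0 rather than the numeric 1/0 that ifelse(..., 1, 0) returns.

```r
# Sample data copied from the question
t1 <- data.frame(A1 = c("red", "blue", "green", "yellow", "brown"),
                 A2 = c("orange", "purple", "yellow", "black", NA),
                 A3 = c(1, 2, 4, 5, 7))
t2 <- data.frame(B2 = c("black", "pink", "lime", "green", "grey",
                        "mist", "blond", "grass", "violet", "red"),
                 B3 = c("a", "b", "a", "c", "d", "d", "a", "c", "a", "b"))

# For each group in B3, flag the rows of t1 whose A1 or A2 occurs in that group's B2
for (g in unique(t2$B3)) {
  vals <- t2$B2[t2$B3 == g]   # B2 values belonging to this group only
  # TRUE/FALSE becomes 1/0; an NA in A1 or A2 counts as "not found" with %in%
  t1[[paste("t2", g, sep = ".")]] <- as.integer(t1$A1 %in% vals | t1$A2 %in% vals)
}
rm(g, vals)   # drop the temporaries once the loop is done

print(t1)
```

Because `vals` is overwritten on every pass, only one group's values are held in memory at a time, which seems to be what the create/use/delete cycle was meant to achieve.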
Please show the expected output – akrun
Welcome to SO. Have you made an effort to actually run it? –