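Miscalculation after subsetting in R
I have run into a problem when computing counts on subsetted data. First, I pull some information from one file into another. Then I try to count the number of patients for each organ. A command that used to run correctly now gives me a wrong result. It does not raise any error; it simply computes the values incorrectly.
The input files are at this link: https://www.dropbox.com/sh/8bo4b4dpmydj19w/AADZ7WuoecrjPwm_qyF8NRMza?dl=0
Here is my code.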
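library(gdata) # provides read.xls
library(dplyr) # provides count_
library(ggplot2) # provides ggplot, used for the plot further down
Clinical_Samples_map = read.xls("b.xlsx") # calling my file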
Clinical_Samples_Original = read.xls("a.xlsx", sheet=1) # the file where I get additional information
Clinical_Samples_map$AnatomicLocation = Clinical_Samples_Original[match(Clinical_Samples_map$SampleID, Clinical_Samples_Original$TubeName),"AnatomicLocation"]
map<-Clinical_Samples_map # Just changing the name
# Anatomic Location
sub_map_AnatomicLocation <- map[!duplicated(map$patient_number), ] # Excluding the duplicate of patient by checking patient_number column
sub_map_AnatomicLocation <- data.frame(sub_map_AnatomicLocation)
sub_map_AnatomicLocation_patient <- subset(sub_map_AnatomicLocation, Disease != "Unknown" & AnatomicLocation != "Unknown") # Getting rid of "Unknown" value if there is any
AnatomicLocation_patient <- count_(sub_map_AnatomicLocation , c("Disease","AnatomicLocation"))
write.table(AnatomicLocation_patient, "AnatomicLocation_patient.txt",col.names = TRUE)
write.table(Clinical_Samples_map, "Clinical_Samples_map2.txt",col.names = TRUE)
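One thing that may be worth checking in the lookup step is how match() behaves when an ID is missing or repeated. The sketch below uses made-up stand-ins for the two spreadsheets (the IDs and locations are hypothetical, not taken from the linked files) and only illustrates the mechanism; it is not a diagnosis of the actual data.

```r
# Hypothetical stand-ins for b.xlsx and a.xlsx.
samples <- data.frame(SampleID = c("T1", "T2", "T9"), stringsAsFactors = FALSE)
original <- data.frame(
  TubeName         = c("T1", "T2", "T2"),
  AnatomicLocation = c("Ileum", "Colon", "Rectum"),
  stringsAsFactors = FALSE
)

# match() returns the position of the FIRST hit only, and NA when there is none.
idx <- match(samples$SampleID, original$TubeName)
samples$AnatomicLocation <- original$AnatomicLocation[idx]

# IDs that were not found in the lookup table end up with NA.
unmatched <- samples$SampleID[is.na(idx)]

# A comparison such as != "Unknown" evaluates to NA for those rows,
# and subset() drops rows whose condition is NA.
kept <- subset(samples, AnatomicLocation != "Unknown")

print(samples)
print(unmatched)
print(nrow(kept))
```

If the real tables contain unmatched or repeated tube names, rows can disappear or pick up the first matching location without any warning.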
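However, when I compare the two txt files that were written, the numbers differ. Does anyone know why this happens? For example, for CD and Ileum the count table shows 3 cases, but when I look at Clinical_Samples_map2.txt I count 4.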
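A possible explanation, offered as an assumption rather than a confirmed cause: the count table is built after removing repeated patient_number values, so it counts patients, while Clinical_Samples_map2.txt lists every sample. A patient with more than one sample would then produce exactly this kind of gap. The toy data below is invented for illustration.

```r
# Toy data (hypothetical values): patient P3 contributes two samples.
map <- data.frame(
  SampleID         = c("S1", "S2", "S3", "S4"),
  patient_number   = c("P1", "P2", "P3", "P3"),
  Disease          = c("CD", "CD", "CD", "CD"),
  AnatomicLocation = c("Ileum", "Ileum", "Ileum", "Ileum"),
  stringsAsFactors = FALSE
)

# Counting rows of the full table counts samples.
n_samples <- nrow(map)

# Keeping only the first row per patient_number means the same count
# on the de-duplicated table counts patients instead.
per_patient <- map[!duplicated(map$patient_number), ]
n_patients <- nrow(per_patient)

# Patients with more than one sample account for the difference.
samples_per_patient <- table(map$patient_number)
repeated <- names(samples_per_patient)[samples_per_patient > 1]

print(c(samples = n_samples, patients = n_patients))
print(repeated)
```

Running the same table() check on the real patient_number column would show whether any CD/Ileum patient appears more than once.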
One more thing: if I try to generate a plot with ggplot:
ggplot(data=Clinical_Samples_map, aes(x=Disease, y=AgeAtSampling, fill=Disease)) +
geom_boxplot(notch = TRUE) +
ggtitle("Clinical_Samples_map_Disease") +
scale_y_continuous(name = "Age at Sampling", breaks = seq(0, 80, 20), limits=c(0, 80)) +
scale_x_discrete(name = "Disease") +
geom_jitter(colour = "black", size = 2, width = 0.15, height = 0.3) +
theme(legend.position = "bottom") +
labs(fill = "Disease") +
theme(axis.title=element_text(face="plain", size="30", color="black",family = "Gill Sans MT"),
axis.text.x = element_text(colour="grey20",size=20,angle=45,hjust=.5,vjust=.5,face="plain"),
axis.text.y = element_text(colour="grey20",size=20,angle=0,hjust=1,vjust=0,face="plain"),
legend.text=element_text(face="plain", size="30", color="black"),
legend.title=element_text(face="plain", size="30", color="black"))
I get an error:
Error: Discrete value supplied to continuous scale
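I think this is where the problem lies. I can work around this to produce the plot, but I don't understand why the count comes out wrong.
Can anyone help with this? I have been struggling with it for a long time and still haven't figured it out.
Thanks a lot.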
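On the plotting error: ggplot2 reports "Discrete value supplied to continuous scale" when a factor or character column is mapped to a continuous scale such as scale_y_continuous. A minimal sketch of how one might check and convert the column, assuming AgeAtSampling was read in as a factor (the values, including the "Unknown" entry, are hypothetical):

```r
# Hypothetical column standing in for AgeAtSampling; the "Unknown" entry
# is an assumption about why the column might not be numeric.
age_raw <- factor(c("34", "27", "Unknown", "51"))

# A factor (or character) column is not numeric, so it is treated as discrete.
is_numeric_before <- is.numeric(age_raw)

# Convert through character first; as.numeric() on a factor alone would
# return the internal level codes rather than the ages.
# Non-numeric entries become NA (the coercion warning is suppressed here).
age_num <- suppressWarnings(as.numeric(as.character(age_raw)))
is_numeric_after <- is.numeric(age_num)

print(class(age_raw))
print(age_num)
```

Checking class(Clinical_Samples_map$AgeAtSampling) on the real data would confirm whether this applies.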
Bahti
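Sorry, my mistake... here are the libraries used: library(gdata) and library(dplyr)... – Lothlorian
Please update your post with the 'library' lines rather than putting them in a comment, so everyone can see them. – Parfait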