2016-09-13 52 views
1

Type conversion: coercion only inside a loop?

Building on the following call, which works:

sum(sapply(DNAStringSet(seq_set[, 1]), function(s) 
    countPWM(motifs[[1]], reverseComplement(s), min.score = "75%"))) 

I wrote this loop:

percentages <- as.character(seq(0, 100, 5)) 

for (i in 1:length(percentages)) { 
    sum(sapply(DNAStringSet(seq_set[, 1]), function(s) 
    countPWM(
     motifs[[1]], 
     reverseComplement(s), 
     min.score = as.character(cat('"', percentages[i], "%" , '"', sep = "") 
    )))) 
} 

which returns the following error:

Error in .normargMinScore(min.score, pwm) : 
    'min.score' must be a single number or string 

There seems to be a problem with the data type of min.score, but when I check:

test <- as.character(cat('"', percentages[1], "%" , '"', sep = "")) 
typeof(test) 


> typeof(test) 
[1] "character" 

it seems to be in order.
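A note on why this check can mislead, as a minimal base-R sketch that is not part of the original post: cat() prints its arguments and returns NULL, so as.character(cat(...)) is a zero-length character vector. typeof() still reports "character", but a vector of length 0 is not "a single number or string". The variable names res and ok are introduced here for illustration only.

```r
percentages <- as.character(seq(0, 100, 5))

# cat() writes "0%" to the console and returns NULL
res <- cat(percentages[1], "%", sep = "")
is.null(res)      # TRUE

# as.character(NULL) is character(0): type "character", but length 0
test <- as.character(res)
typeof(test)      # "character"
length(test)      # 0

# paste() returns the string instead of printing it
ok <- paste(percentages[1], "%", sep = "")
length(ok)        # 1
```

Checking length(test) alongside typeof(test) would have exposed the empty vector.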

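I thought this might be related to type coercion, as described on R-bloggers, because of the use of the sapply function. But that does not seem to be the case.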

Any help would be much appreciated, as I am still new to R and to programming.

My sessionInfo():

R version 3.2.5 (2016-04-14) 
Platform: x86_64-pc-linux-gnu (64-bit) 
Running under: Ubuntu 14.04.4 LTS 

locale: 
[1] LC_CTYPE=en_US.UTF-8  LC_NUMERIC=C    
[3] LC_TIME=en_US.UTF-8  LC_COLLATE=en_US.UTF-8  
[5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8 
[7] LC_PAPER=de_DE.UTF-8  LC_NAME=C     
[9] LC_ADDRESS=C    LC_TELEPHONE=C    
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C  

attached base packages: 
[1] stats4 parallel stats  graphics grDevices utils  
[7] datasets methods base  

other attached packages: 
[1] Biostrings_2.38.4 XVector_0.10.0  IRanges_2.4.8  
[4] S4Vectors_0.8.11 BiocGenerics_0.16.1 

loaded via a namespace (and not attached): 
[1] zlibbioc_1.16.0 tools_3.2.5 

and this is how I build my data:

seq_set <- matrix(1:2000, 1000, 2) 
seq_set[, 1] <- 
    sapply(seq_set[, 1], function(s) 
    paste(sample(
     c('A', 'C', 'G', 'T'), 
     size = ncol(motifs[[1]]), 
     replace = T 
    ), collapse = '')) 
seq_set[, 2] <- 
    sapply(seq_set[, 2], function(s) 
    paste(sample(
     c('A', 'C', 'G', 'T'), 
     size = ncol(motifs[[2]]), 
     replace = T 
    ), collapse = '')) 

These are the packages in my library:

AnnotationDbi     Annotation Database Interface 
Biobase      Biobase: Base functions for Bioconductor 
BiocGenerics     S4 generic functions for Bioconductor 
BiocInstaller     Install/Update Bioconductor, CRAN, and github Packages 
BiocParallel     Bioconductor facilities for parallel evaluation 
Biostrings     String objects representing biological sequences, and 
           matching algorithms 
bitops      Bitwise Operations 
BSgenome      Infrastructure for Biostrings-based genome data packages and 
           support for efficient SNP representation 
caTools      Tools: moving window statistics, GIF, Base64, ROC AUC, etc. 
CNEr       CNE Detection and Visualization 
DBI       R Database Interface 
DirichletMultinomial   Dirichlet-Multinomial Mixture Model Machine Learning for 
           Microbiome Data 
futile.logger     A Logging Utility for R 
futile.options    Futile options management 
GenomeInfoDb     Utilities for manipulating chromosome and other 'seqname' 
           identifiers 
GenomicAlignments    Representation and manipulation of short genomic alignments 
GenomicRanges     Representation and manipulation of genomic intervals and 
           variables defined along a genome 
gtools      Various R Programming Tools 
IRanges      Infrastructure for manipulating intervals on sequences 
lambda.r      Modeling Data with Functional Programming 
Rcpp       Seamless R and C++ Integration 
RCurl       General Network (HTTP/FTP/...) Client Interface for R 
Rsamtools      Binary alignment (BAM), FASTA, variant call (BCF), and tabix 
           file import 
RSQLite      SQLite Interface for R 
rtracklayer     R interface to genome browsers and their annotation tracks 
S4Vectors      S4 implementation of vectors and lists 
seqLogo      Sequence logos for DNA sequence alignments 
snow       Simple Network of Workstations 
SummarizedExperiment   SummarizedExperiment container 
TFBSTools      Software Package for Transcription Factor Binding Site 
           (TFBS) Analysis 
TFMPvalue      Efficient and Accurate P-Value Computation for Position 
           Weight Matrices 
XML       Tools for Parsing and Generating XML Within R and S-Plus 
XVector      Representation and manpulation of external sequences 
zlibbioc      An R packaged zlib-1.2.5 

Packages in library ‘/usr/lib/R/library’: 

base       The R Base Package 
boot       Bootstrap Functions (Originally by Angelo Canty for S) 
class       Functions for Classification 
cluster      "Finding Groups in Data": Cluster Analysis Extended 
           Rousseeuw et al. 
codetools      Code Analysis Tools for R 
compiler      The R Compiler Package 
datasets      The R Datasets Package 
foreign      Read Data Stored by Minitab, S, SAS, SPSS, Stata, Systat, 
           Weka, dBase, ... 
graphics      The R Graphics Package 
grDevices      The R Graphics Devices and Support for Colours and Fonts 
grid       The Grid Graphics Package 
KernSmooth     Functions for Kernel Smoothing Supporting Wand & Jones 
           (1995) 
lattice      Trellis Graphics for R 
MASS       Support Functions and Datasets for Venables and Ripley's 
           MASS 
Matrix      Sparse and Dense Matrix Classes and Methods 
methods      Formal Methods and Classes 
mgcv       Mixed GAM Computation Vehicle with GCV/AIC/REML Smoothness 
           Estimation 
nlme       Linear and Nonlinear Mixed Effects Models 
nnet       Feed-Forward Neural Networks and Multinomial Log-Linear 
           Models 
parallel      Support for Parallel computation in R 
rpart       Recursive Partitioning and Regression Trees 
spatial      Functions for Kriging and Point Pattern Analysis 
splines      Regression Spline Functions and Classes 
stats       The R Stats Package 
stats4      Statistical Functions using S4 Classes 
survival      Survival Analysis 
tcltk       Tcl/Tk Interface 
tools       Tools for Package Development 
utils       The R Utils Package 
+1

It's good to have the sessionInfo, but it would also be best to make the example fully reproducible (with 'library()' calls, a small dataset, etc.). See http://stackoverflow.com/a/28481250/ – Frank

+1

Most likely you want 'paste' instead of 'cat'. I guess it should be 'min.score <- paste('"', percentages[i], "%", '"', sep = "")'. – nicola

+0

Thanks Frank, you are right. I have made the changes. – piderotrema

Answers

0

nicola's comment did the trick.

This:

seq_set_matches <- matrix(1:42, 21, 2) 
percentages <- as.character(seq(0, 100, 5)) 
for (i in 1:length(percentages)) { 
    seq_set_matches[i,1]<- sum(sapply(DNAStringSet(seq_set[, 1]), function(s) 
    countPWM(
     motifs[[1]], 
     reverseComplement(s), 
     min.score = paste(percentages[i], "%" , sep = "") 
    ))) 
} 

works. Dear nicola, if you like, I would be happy to accept your contribution as the official answer. Thanks again.
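For reference, the same idea can be written without an explicit index loop. This is only a sketch under stated assumptions: count_hits() and toy_scores are hypothetical stand-ins for the sum(sapply(..., countPWM(...))) call, so that the example runs without Biostrings; they are not part of the original code.

```r
# Threshold strings "0%", "5%", ..., "100%"; paste0() is paste(..., sep = "")
thresholds <- paste0(seq(0, 100, 5), "%")

# HYPOTHETICAL data and scoring function, standing in for the countPWM() call
toy_scores <- c(10, 40, 75, 90)
count_hits <- function(min.score) {
  # Same requirement that the error message reported: a single string
  stopifnot(is.character(min.score), length(min.score) == 1)
  cutoff <- as.numeric(sub("%", "", min.score, fixed = TRUE))
  sum(toy_scores >= cutoff)
}

# One count per threshold; the result is named by the threshold strings
hits <- vapply(thresholds, count_hits, integer(1))
hits[["75%"]]
```

In the real code, the body of count_hits() would be replaced by the countPWM() call from the answer, with min.score passed through unchanged.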