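I have been trying to use conditions to print only part of a file, but for some reason, when I run the code in IPython it just keeps running and never stops. How can I use only a section of a file in Python?
The file I am running it on is: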
Use the -noinfo option to turn off this help.
Use the -help option to get a list of command line options.
pilercr v1.06
By Robert C. Edgar
Temp1.None.fasta: 523 putative CRISPR arrays found.
DETAIL REPORT
Array 1
>contig-856000000 902 nucleotides
Pos Repeat %id Spacer Left flank Repeat Spacer
========== ====== ====== ====== ========== ======================================== ======
28 40 95.0 26 TGCTTCCCCG -.....................................T. CTTGGTCTTGCTGGTTCTCACCGACT
94 40 95.0 25 CTCACCGACT .T....................................C. GTCAGCGTGTAGCGACTGTATCTGG
159 40 100.0 CTGTATCTGG ........................................ TTGCTCGAA
========== ====== ====== ====== ========== ========================================
3 40 25 TAGTTGTGAATAGCTGACAAAATCATATCATATACAACAG
Array 2
>contig-2277000000 590 nucleotides
Pos Repeat %id Spacer Left flank Repeat Spacer
========== ====== ====== ====== ========== ===================================== ======
19 37 100.0 37 GAGGGTGAGG ..................................... ACTTTAGGTTCAAATCCGTAGAGCTGATCTGTAATAG
93 37 100.0 37 TCTGTAATAG ..................................... ATTCCGTTGTTGAAATAAAGTATGAATAATATTTGGT
167 37 100.0 35 AATATTTGGT ..................................... TTCTCGAACGTTCCATGCTTCATAATATACCTCCT
239 37 100.0 39 TATACCTCCT ..................................... CTGATGAATCTTACCTCGTACAGTGATGTAGCCAGGTAA
315 37 100.0 AGCCAGGTAA ..................................... CGTCAGTCATG
========== ====== ====== ====== ========== =====================================
5 37 37 GTAGAAATGAGACGTCCGCTGTAAAGGACATTGATAC
Array 3
>contig-2766000000 540 nucleotides
Pos Repeat %id Spacer Left flank Repeat Spacer
========== ====== ====== ====== ========== ===================================== ======
172 37 100.0 29 GTTTTAGATG ..................................... TATCGTAGCATCCCACTCCCCTGGTGTAA
238 37 100.0 29 CCTGGTGTAA ..................................... GTTGGACGCGCTGCTGGACGATAGGCTGC
304 37 97.3 29 GATAGGCTGC T.................................... ACGCCTTACAAGCTGACCCGCGCCCAATT
370 37 100.0 GCGCCCAATT ..................................... GTACCTTGTTC
========== ====== ====== ====== ========== =====================================
4 37 29 GGCTGTAAAAAGCCACCAAAATGATGGTAATTACAAG
SUMMARY BY SIMILARITY
Array Sequence Position Length # Copies Repeat Spacer + Consensus
===== ================ ========== ========== ======== ====== ====== = =========
5 contig-504300000 18 364 6 33 33 + --------------------------GTCGCT-C---CCCGCATGGGGAGCG--T-GGATTGAAAT-----
8 contig-974700000 15 229 4 32 33 - --------------------------GTCGCC-C---CCCATGCG-GGGGCG--T-GGATTGAAAC-----
12 contig-759000001 464 503 8 33 34 + --------------------------GTCGCT-C---CCTTTACGGGGAGCG--T-GGATTGAAAT-----
16 contig-293000000 77 406 6 37 36 - -----------------------GTAGAAATGAG---TTCCCCGATGAGAAG--G-GGATTGACAC-----
17 contig-457600000 28 416 6 37 38 - -----------------------GTAGAAATGGG---TGTCCCGATAGATAG--G-GGATTGACAC-----
18 contig-527300000 1 351 6 33 32 + -----------------------ATCGCG----C---CCCCACGGGGGCGTG--T-GAATTGAAAC-----
27 contig-132220000 21 234 4 33 34 + --------------------------GTCGCT-C---CCTTCACGGGGAGCG--T-GGATTGAAAT-----
36 contig-602400000 35 304 5 33 34 - --------------------------GTCGCC-C---CCCACGTGGGGGGCG--T-GGATTGAAAC-----
38 contig-124860000 131 232 4 32 34 + --------------------------GTCGCA-C---CCCTCGC-GGGTGCG--T-GGATTGAAAC-----
54 contig-979400000 138 231 4 32 34 - --------------------------GTCGCC-C---CTCTTGCA-GGGGCG--T-GGATTGAAAC-----
61 contig-992000005 149 693 11 30 36 - --------------------GTTAAAATCA--GA---CC---ATTTTG--------GGATTGAAAT-----
68 contig-103110000 37 238 4 34 34 + -----------------------GTCGTC----C---CCCACACGGGGGACG--T-GGATTGAAATA----
73 contig-372900000 1627 1013 16 30 35 + ----------------------------ATTAGAATCGTACTT--ATGTAGAATTGAAAT-----------
And my code so far is:
    fname = 'crispr_pilrcr_1.out'
    start=False
    end=False
    counter = 0
    for line in open(fname, 'r'): # Open up the file
        s = line.split() # Split each line into words
        if not s: continue # Remove empty lines which would otherwise cause errors
        if '==' in s[0]: continue # Removes separation lines which consist of long '=======' strings
        try:
            if s[0] == 'DETAIL': # Only start in the section which starts with 'DETAIL'
                start=True
                print 'Starting'
            if s[0] == 'SUMMARY': # Only end once this section has ended
                end=True
                print 'Ending'
            while start==True or end==False: # Whilst in the section of the PILER-CR output which provides spacer sequences
                try:
                    int(s[0])
                    print s[7]
                except ValueError:
                    continue
        except ValueError:
            continue
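I suspect the problem is with the 'while' loop; however, the code still runs forever when I use 'and' instead of 'or'.
As I said, I want to select the part of the file between 'DETAIL REPORT' and 'SUMMARY BY SIMILARITY', which is why I set up the conditions to try to detect those two markers.
Any help you could provide would be great.
Thanks, Tom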
I'll take a shot in the dark here. Try replacing 'while' with 'if'. (And possibly 'or' with 'and'.) – Kevin 2014-09-26 14:19:01
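To expand on that suggestion: nothing inside the 'while' body changes `start`, `end`, or `s`, so once the condition is true it stays true, and the `continue` in the inner `except` only restarts the same 'while' loop. A plain flag checked once per line avoids this. Below is a minimal sketch of that idea, not a definitive parser. The function name `spacer_sequences` is made up for illustration, and it assumes that the rows of interest are those with exactly seven whitespace-separated fields, with the spacer sequence in the last one. (In the sample as shown, no row has an eighth field, so index 7 would be out of range.)

```python
def spacer_sequences(lines):
    """Yield spacer sequences found between 'DETAIL' and 'SUMMARY' lines.

    Assumption (not confirmed by the PILER-CR documentation): a data row
    starts with an integer position and has exactly seven fields, the last
    of which is the spacer sequence.
    """
    in_detail = False                  # True once the DETAIL section has started
    for line in lines:
        fields = line.split()
        if not fields:                 # skip empty lines
            continue
        if fields[0] == 'DETAIL':      # start marker: begin collecting
            in_detail = True
            continue
        if fields[0] == 'SUMMARY':     # end marker: stop reading altogether
            break
        if not in_detail:              # still in the header part of the file
            continue
        if not fields[0].isdigit():    # skips 'Array', '>contig', 'Pos', '====' lines
            continue
        if len(fields) == 7:           # rows that actually carry a spacer
            yield fields[-1]


# Small in-memory demo using a few lines shaped like the sample above.
demo = [
    "pilercr v1.06",
    "DETAIL REPORT",
    "Array 1",
    "28 40 95.0 26 TGCTTCCCCG -....T. CTTGGTCTTG",
    "159 40 100.0 CTGTATCTGG ........ TTGCTCGAA",
    "SUMMARY BY SIMILARITY",
    "5 contig-504300000 18 364 6 33 33 + ----GTCGCT",
]
for spacer in spacer_sequences(demo):
    print(spacer)
```

For the real file, the same function can be fed an open file handle, for example `with open(fname) as handle:` followed by a loop over `spacer_sequences(handle)`. The sketch uses `print(...)` as a function so it runs on Python 3; the code in the question uses the Python 2 print statement.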