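mRNA/DNA to protein
The code I am trying to write translates a FASTA file containing nucleotides (U, T, C, G or A) from DNA or mRNA into amino acids. Most of it works, but when I run the program it reports an error that I don't understand, and my attempts to fix it have not worked. Could you give me some hints? I have searched for possible solutions, but they were too hard for me to follow.
My code: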
#!/usr/bin/env python3
import sys
codontable = {
'ATA':'I', 'ATC':'I', 'ATT':'I', 'ATG':'M',
'ACA':'T', 'ACC':'T', 'ACG':'T', 'ACT':'T',
'AAC':'N', 'AAT':'N', 'AAA':'K', 'AAG':'K',
'AGC':'S', 'AGT':'S', 'AGA':'R', 'AGG':'R',
'CTA':'L', 'CTC':'L', 'CTG':'L', 'CTT':'L',
'CCA':'P', 'CCC':'P', 'CCG':'P', 'CCT':'P',
'CAC':'H', 'CAT':'H', 'CAA':'Q', 'CAG':'Q',
'CGA':'R', 'CGC':'R', 'CGG':'R', 'CGT':'R',
'GTA':'V', 'GTC':'V', 'GTG':'V', 'GTT':'V',
'GCA':'A', 'GCC':'A', 'GCG':'A', 'GCT':'A',
'GAC':'D', 'GAT':'D', 'GAA':'E', 'GAG':'E',
'GGA':'G', 'GGC':'G', 'GGG':'G', 'GGT':'G',
'TCA':'S', 'TCC':'S', 'TCG':'S', 'TCT':'S',
'TTC':'F', 'TTT':'F', 'TTA':'L', 'TTG':'L',
'TAC':'Y', 'TAT':'Y', 'TAA':'Stop', 'TAG':'Stop',
'TGC':'C', 'TGT':'C', 'TGA':'Stop', 'TGG':'W',
'AUA':'I', 'AUC':'I', 'AUU':'I', 'AUG':'M',
'ACU':'T', 'AAU':'N', 'AGU':'S', 'CUA':'L',
'CUC':'L', 'CUG':'L', 'CUU':'L', 'CCU':'P',
'CAU':'H', 'CGU':'R', 'GUA':'V', 'GUC':'V',
'GUG':'V', 'GUU':'V', 'GCU':'A', 'GAU':'D',
'GGU':'G', 'UCA':'S', 'UCC':'S', 'UCG':'S',
'UCU':'S', 'UUC':'F', 'UUU':'F', 'UUA':'L',
'UUG':'L', 'UAC':'Y', 'UAU':'Y', 'UAA':'Stop',
'UAG':'Stop', 'UGC':'C', 'UGU':'C', 'UGA':'Stop',
'UGG':'W'}
def divideintriplets(startwhere, moveover = 3):
    startwhere = int(startwhere)
    moveover = int(moveover)
    readwhere = startwhere + moveover
    triplet = sequence[startwhere:readwhere]
    amino = codontable[triplet]
    lentriplet = len(triplet)
    if lentriplet == 3:
        return amino
    else:
        return None
list_trans_nucl = []
filenames = sys.argv[1:]
startwhere = input('\nWhere to start with counting? Choose one; \n1: Start counting at the first nucleotide. \n2: Start counting at the second nucleotide. \n3: Start counting at the third nucleotide. \n')
for file in filenames:
    count = 0
    contains_aminoacid = False
    inputfile = open(file)
    for line in inputfile:
        if not line.startswith('>'):
            sequence = line.strip("")
            count += len(sequence)
            trans_nucl = divideintriplets(startwhere)
            list_trans_nucl.append(trans_nucl)
            for element in sequence:
                if element not in ['A', 'T', 'C', 'G', 'U']:
                    contains_aminoacid = True
    if contains_aminoacid is False:
        print(list_trans_nucl)
        print("This file contains ", count, "nucleotides")
What my command prompt reports as the error:
Traceback (most recent call last):
  File "C:\Users\Desktop\CCR5\DNA_mRNA_translator.py", line 61, in <module>
    trans_nucl = divideintriplets(startwhere)
  File "C:\Users\Desktop\CCR5\DNA_mRNA_translator.py", line 41, in divideintriplets
    amino = codontable[triplet]
KeyError: ''
>gi|255652911:5001-11065 | Homo sapiens chemokine (C-C motif) receptor 5 (gene/pseudogene) (CCR5), RefSeqGene on chromosome 3
CTTCAGATAGATTATATCTGGAGTGAAGAATCCTGCCACCTATGTATCTGGCATAGTGTGAGTCCTCATA
AATGCTTACTGGTTTGAAGGGCAACAAAATAGTGAACAGAGTGAAAATCCCCACTAAGATCCTGGGTCCA
GAAAAAGATGGGAAACCTGTTTAGCTCACCCGTGAGCCCATAGTTAAAACTCTTTAGACAACAGGTTGTT
TCCGTTTACAGAGAACAATAATATTGGGTGGTGAGCATCTGTGTGGGGGTTGGGGTGGGATAGGGGATAC
GGGGAGAGTGGAGAAAAAGGGGACACAGGGTTAATGTGAAGTCCAGGATCCCCCTCTACATTTAAAGTTG
GTTTAAGTTGGCTTTAATTAATAGCAACTCTTAAGATAATCAGAATTTTCTTAACCTTTTAGCCTTACTG
TTGAAAAGCCCTGTGATCTTGTACAAATCATTTGCTTCTTGGATAGTAATTTCTTTTACTAAAATGTGGG
CTTTTGACTAGATGAATGTAAATGTTCTTCTAGCTCTGATATCCTTTATTCTTTATATTTTCTAACAGAT
TCTGTGTAGTGGGATGAGCAGAGAACAAAAACAAAATAATCCAGTGAGAAAAGCCCGTAAATAAACCTTC
AGACCAGAGATCTATTCTCTAGCTTATTTTAAGCTCAACTTAAAAAGAAGAACTGTTCTCTGATTCTTTT
CGCCTTCAATACACTTAATGATTTAACTCCACCCTCCTTCAAAAGAAACAGCATTTCCTACTTTTATACT
GTCTATATGATTGATTTGCACAGCTCATCTGGCCAGAAGAGCTGAGACATCCGTTCCCCTACAAGAAACT
CTCCCCGGTAAGTAACCTCTCAGCTGCTTGGCCTGTTAGTTAGCTTCTGAGATGAGTAAAAGACTTTACA
GGAAACCCATAGAAGACATTTGGCAAACACCAAGTGCTCATACAATTATCTTAAAATATAATCTTTAAGA
TAAGGAAAGGGTCACAGTTTGGAATGAGTTTCAGACGGTTATAACATCAAAGATACAAAACATGATTGTG
AGTGAAAGACTTTAAAGGGAGCAATAGTATTTTAATAACTAACAATCCTTACCTCTCAAAAGAAAGATTT
GCAGAGAGATGAGTCTTAGCTGAAATCTTGAAATCTTATCTTCTGCTAAGGAGAACTAAACCCTCTCCAG
TGAGATGCCTTCTGAATATGTGCCCACAAGAAGTTGTGTCTAAGTCTGGTTCTCTTTTTTCTTTTTCCTC
CAGACAAGAGGGAAGCCTAAAAATGGTCAAAATTAATATTAAATTACAAACGCCAAATAAAATTTTCCTC
TAATATATCAGTTTCATGGCACAGTTAGTATATAATTCTTTATGGTTCAAAATTAAAAATGAGCTTTTCT
AGGGGCTTCTCTCAGCTGCCTAGTCTAAGGTGCAGGGAGTTTGAGACTCACAGGGTTTAATAAGAGAAAA
TTCTCAGCTAGAGCAGCTGAACTTAAATAGACTAGGCAAGACAGCTGGTTATAAGACTAAACTACCCAGA
ATGCATGACATTCATCTGTGGTGGCAGACGAAACATTTTTTATTATATTATTTCTTGGGTATGTATGACA
ACTCTTAATTGTGGCAACTCAGAAACTACAAACACAAACTTCACAGAAAATGTGAGGATTTTACAATTGG
CTGTTGTCATCTATGACCTTCCCTGGGACTTGGGCACCCGGCCATTTCACTCTGACTACATCATGTCACC
AAACATCTGATGGTCTTGCCTTTTAATTCTCTTTTCGAGGACTGAGAGGGAGGGTAGCATGGTAGTTAAG
AGTGCAGGCTTCCCGCATTCAAAATCGGTTGCTTACTAGCTGTGTGGCTTTGAGCAAGTTACTCACCCTC
TCTGTGCTTCAAGGTCCTTGTCTGCAAAATGTGAAAAATATTTCCTGCCTCATAAGGTTGCCCTAAGGAT
TAAATGAATGAATGGGTATGATGCTTAGAACAGTGATTGGCATCCAGTATGTGCCCTCGAGGCCTCTTAA
TTATTACTGGCTTGCTCATAGTGCATGTTCTTTGTGGGCTAACTCTAGCGTCAATAAAAATGTTAAGACT
GAGTTGCAGCCGGGCATGGTGGCTCATGCCTGTAATCCCAGCATTCTAGGAGGCTGAGGCAGGAGGATCG
CTTGAGCCCAGGAGTTCGAGACCAGCCTGGGCAACATAGTGTGATCTTGTATCTATAAAAATAAACAAAA
TTAGCTTGGTGTGGTGGCGCCTGTAGTCCCCAGCCACTTGGAGGGGTGAGGTGAGAGGATTGCTTGAGCC
CGGGATGGTCCAGGCTGCAGTGAGCCATGATCGTGCCACTGCACTCCAGCCTGGGCGACAGAGTGAGACC
CTGTCTCACAACAACAACAACAACAACAAAAAGGCTGAGCTGCACCATGCTTGACCCAGTTTCTTAAAAT
TGTTGTCAAAGCTTCATTCACTCCATGGTGCTATAGAGCACAAGATTTTATTTGGTGAGATGGTGCTTTC
ATGAATTCCCCCAACAGAGCCAAGCTCTCCATCTAGTGGACAGGGAAGCTAGCAGCAAACCTTCCCTTCA
CTACAAAACTTCATTGCTTGGCCAAAAAGAGAGTTAATTCAATGTAGACATCTATGTAGGCAATTAAAAA
CCTATTGATGTATAAAACAGTTTGCATTCATGGAGGGCAACTAAATACATTCTAGGACTTTATAAAAGAT
CACTTTTTATTTATGCACAGGGTGGAACAAGATGGATTATCAAGTGTCAAGTCCAATCTATGACATCAAT
TATTATACATCGGAGCCCTGCCAAAAAATCAATGTGAAGCAAATCGCAGCCCGCCTCCTGCCTCCGCTCT
ACTCACTGGTGTTCATCTTTGGTTTTGTGGGCAACATGCTGGTCATCCTCATCCTGATAAACTGCAAAAG
GCTGAAGAGCATGACTGACATCTACCTGCTCAACCTGGCCATCTCTGACCTGTTTTTCCTTCTTACTGTC
CCCTTCTGGGCTCACTATGCTGCCGCCCAGTGGGACTTTGGAAATACAATGTGTCAACTCTTGACAGGGC
TCTATTTTATAGGCTTCTTCTCTGGAATCTTCTTCATCATCCTCCTGACAATCGATAGGTACCTGGCTGT
CGTCCATGCTGTGTTTGCTTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGG
GTGGTGGCTGTGTTTGCGTCTCTCCCAGGAATCATCTTTACCAGATCTCAAAAAGAAGGTCTTCATTACA
CCTGCAGCTCTCATTTTCCATACAGTCAGTATCAATTCTGGAAGAATTTCCAGACATTAAAGATAGTCAT
CTTGGGGCTGGTCCTGCCGCTGCTTGTCATGGTCATCTGCTACTCGGGAATCCTAAAAACTCTGCTTCGG
TGTCGAAATGAGAAGAAGAGGCACAGGGCTGTGAGGCTTATCTTCACCATCATGATTGTTTATTTTCTCT
TCTGGGCTCCCTACAACATTGTCCTTCTCCTGAACACCTTCCAGGAATTCTTTGGCCTGAATAATTGCAG
TAGCTCTAACAGGTTGGACCAAGCTATGCAGGTGACAGAGACTCTTGGGATGACGCACTGCTGCATCAAC
CCCATCATCTATGCCTTTGTCGGGGAGAAGTTCAGAAACTACCTCTTAGTCTTCTTCCAAAAGCACATTG
CCAAACGCTTCTGCAAATGCTGTTCTATTTTCCAGCAAGAGGCTCCCGAGCGAGCAAGCTCAGTTTACAC
CCGATCCACTGGGGAGCAGGAAATATCTGTGGGCTTGTGACACGGACTCAAGTGGGCTGGTGACCCAGTC
AGAGTTGTGCACATGGCTTAGTTTTCATACACAGCCTGGGCTGGGGGTGGGGTGGGAGAGGTCTTTTTTA
AAAGGAAGTTACTGTTATAGAGGGTCTAAGATTCATCCATTTATTTGGCATCTGTTTAAAGTAGATTAGA
TCTTTTAAGCCCATCAATTATAGAAAGCCAAATCAAAATATGTTGATGAAAAATAGCAACCTTTTTATCT
CCCCTTCACATGCATCAAGTTATTGACAAACTCTCCCTTCACTCCGAAAGTTCCTTATGTATATTTAAAA
GAAAGCCTCAGAGAATTGCTGATTCTTGAGTTTAGTGATCTGAACAGAAATACCAAAATTATTTCAGAAA
TGTACAACTTTTTACCTAGTACAAGGCAACATATAGGTTGTAAATGTGTTTAAAACAGGTCTTTGTCTTG
CTATGGGGAGAAAAGACATGAATATGATTAGTAAAGAAATGACACTTTTCATGTGTGATTTCCCCTCCAA
GGTATGGTTAATAAGTTTCACTGACTTAGAACCAGGCGAGAGACTTGTGGCCTGGGAGAGCTGGGGAAGC
TTCTTAAATGAGAAGGAATTTGAGTTGGATCATCTATTGCTGGCAAAGACAGAAGCCTCACTGCAAGCAC
TGCATGGGCAAGCTTGGCTGTAGAAGGAGACAGAGCTGGTTGGGAAGACATGGGGAGGAAGGACAAGGCT
AGATCATGAAGAACCTTGACGGCATTGCTCCGTCTAAGTCATGAGCTGAGCAGGGAGATCCTGGTTGGTG
TTGCAGAAGGTTTACTCTGTGGCCAAAGGAGGGTCAGGAAGGATGAGCATTTAGGGCAAGGAGACCACCA
ACAGCCCTCAGGTCAGGGTGAGGATGGCCTCTGCTAAGCTCAAGGCGTGAGGATGGGAAGGAGGGAGGTA
TTCGTAAGGATGGGAAGGAGGGAGGTATTCGTGCAGCATATGAGGATGCAGAGTCAGCAGAACTGGGGTG
GATTTGGGTTGGAAGTGAGGGTCAGAGAGGAGTCAGAGAGAATCCCTAGTCTTCAAGCAGATTGGAGAAA
CCCTTGAAAAGACATCAAGCACAGAAGGAGGAGGAGGAGGTTTAGGTCAAGAAGAAGATGGATTGGTGTA
AAAGGATGGGTCTGGTTTGCAGAGCTTGAACACAGTCTCACCCAGACTCCAGGCTGTCTTTCACTGAATG
CTTCTGACTTCATAGATTTCCTTCCCATCCCAGCTGAAATACTGAGGGGTCTCCAGGAGGAGACTAGATT
TATGAATACACGAGGTATGAGGTCTAGGAACATACTTCAGCTCACACATGAGATCTAGGTGAGGATTGAT
TACCTAGTAGTCATTTCATGGGTTGTTGGGAGGATTCTATGAGGCAACCACAGGCAGCATTTAGCACATA
CTACACATTCAATAAGCATCAAACTCTTAGTTACTCATTCAGGGATAGCACTGAGCAAAGCATTGAGCAA
AGGGGTCCCATAGAGGTGAGGGAAGCCTGAAAAACTAAGATGCTGCCTGCCCAGTGCACACAAGTGTAGG
TATCATTTTCTGCATTTAACCGTCAATAGGCAAAGGGGGGAAGGGACATATTCATTTGGAAATAAGCTGC
CTTGAGCCTTAAAACCCACAAAAGTACAATTTACCAGCCTCCGTATTTCAGACTGAATGGGGGTGGGGGG
GGCGCCTTAGGTACTTATTCCAGATGCCTTCTCCAGACAAACCAGAAGCAACAGAAAAAATCGTCTCTCC
CTCCCTTTGAAATGAATATACCCCTTAGTGTTTGGGTATATTCATTTCAAAGGGAGAGAGAGAGGTTTTT
TTCTGTTCTGTCTCATATGATTGTGCACATACTTGAGACTGTTTTGAATTTGGGGGATGGCTAAAACCAT
CATAGTACAGGTAAGGTGAGGGAATAGTAAGTGGTGAGAACTACTCAGGGAATGAAGGTGTCAGAATAAT
AAGAGGTGCTACTGACTTTCTCAGCCTCTGAATATGAACGGTGAGCATTGTGGCTGTCAGCAGGAAGCAA
CGAAGGGAAATGTCTTTCCTTTTGCTCTTAAGTTGTGGAGAGTGCAACAGTAGCATAGGACCCTACCCTC
TGGGCCAAGTCAAAGACATTCTGACATCTTAGTATTTGCATATTCTTATGTATGTGAAAGTTACAAATTG
CTTGAAAGAAAATATGCATCTAATAAAAAACACCTTCTAAAATAA
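"It runs very well, but when I run my program it reports some errors..." If it reports errors, in what sense is it running well? Also, your indentation is off; please fix it. The error itself seems clear enough: you are using an empty string as a dictionary key. Look at the logic you use to split the sequence into triplets. You probably have an off-by-one error and are slicing past the end of the string. –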
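The point made in this comment can be reproduced in a few lines. Slicing past the end of a Python string does not raise; it quietly returns a shorter or empty string, and the dictionary lookup is what then fails. The sketch below assumes the input file contains a blank or very short line, which the question does not confirm. `triplet_at` and `lookup_or_none` are hypothetical helpers, and the two-entry table is only a stand-in for the full `codontable`.

```python
# Tiny stand-in for the question's codontable (hypothetical subset, illustration only).
codontable = {"ATG": "M", "TAA": "Stop"}


def triplet_at(sequence, start):
    """Return the slice that the question's function would look up."""
    return sequence[start:start + 3]


def lookup_or_none(sequence, start):
    """Return the amino acid, or None when the slice is not a complete, known codon."""
    triplet = triplet_at(sequence, start)
    if len(triplet) != 3:          # check the length BEFORE touching the table
        return None
    return codontable.get(triplet)  # .get returns None instead of raising KeyError


# A normal line yields a full codon.
print(repr(triplet_at("ATGTAA\n", 0)))

# A blank line (just a newline) sliced from index 1 yields an empty string.
print(repr(triplet_at("\n", 1)))

# Indexing the table with that empty string raises the error from the traceback.
try:
    codontable[triplet_at("\n", 1)]
except KeyError as err:
    print("KeyError:", repr(err.args[0]))

# With the length check first, the same input is handled without an exception.
print(lookup_or_none("\n", 1))
```

In the question's `divideintriplets`, the lookup `codontable[triplet]` runs before the `len(triplet) == 3` check, so the check can never protect it.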
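Does the script fail immediately, or does it get through part of the loop before exiting with the error? – enpaul
What works is counting the nucleotides and telling amino acids and nucleotides apart. When I translate a file containing DNA or mRNA, my program does not work properly. – Visintank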
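For the translation step itself, one possible restructuring is to join the sequence lines of a record first and then walk the whole string three bases at a time. This is a sketch under assumptions, not the asker's method. It assumes one record per file, treats the menu choice 1, 2 or 3 as a 1-based reading frame, converts `U` to `T` so a single DNA table covers mRNA as well, writes stop codons as `*` (the question uses `'Stop'`), and writes `X` for any triplet it does not recognise. The names `read_fasta_sequence` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def read_fasta_sequence(lines):
    """Join all non-header, non-blank lines into one upper-case sequence string."""
    parts = []
    for line in lines:
        line = line.strip()                 # removes the trailing newline
        if line and not line.startswith(">"):
            parts.append(line.upper())
    return "".join(parts)


def translate(sequence, frame=1):
    """Translate a DNA or mRNA string, starting at reading frame 1, 2 or 3."""
    seq = sequence.replace("U", "T")        # treat mRNA like DNA
    start = frame - 1                       # 1-based menu choice -> 0-based index
    protein = []
    # Stopping at len(seq) - 2 guarantees every slice is a complete triplet.
    for i in range(start, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


if __name__ == "__main__":
    example = [">example header\n", "ATGGCC\n", "\n", "TAA\n"]
    sequence = read_fasta_sequence(example)
    print(len(sequence), "nucleotides")
    print(translate(sequence, frame=1))
```

Because the lines are joined before translating, a codon that straddles a line break is read correctly, and a trailing blank line never reaches the table lookup.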