2012-04-02 114 views
1

Counting words/strings from a chat.txt file in Python

ID674 25/01/1986 Thank you for choosing Optimus prime. Please wait for an Optimus prime Representative to respond. You are currently number 0 in the queue. You should be connected to an agent in approximately 0 minutes.. You are now chatting with 'Tom' 0  <br/> 
ID674 2gb Hi there! Welcome to Optus Web Chat 0/0/0 . How can I help you today? 1 
ID674 25-01-1986 I would like to change my bill plan from $0 with 0 expiry to something else $136. I find it very unuseful. Sam my phone no is 9838383821 2 

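The text above is just a few sample lines from the file. My requirement is that all dates, e.g. 25/01/1986 or 0/0/0, should be replaced with "DATE123".
Then :) should be replaced with "smileys123". Currency, i.e. $0 or $136, should be replaced with "Currency123".
'TOM' (the agent name is usually in single quotes) should be replaced with AGENT123,
and so on. The output should show the number of occurrences of each string: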

DATE123=2 smileys123=2 Currency123=6 AGENT123=5 

This is the approach I have so far; please let me know how to go about this:

class Replace: 
    dateformat=DATE123 
    smileys=smileys123 
    currency=currency123 

    count_dict={} 

    function count_data(type,count): 
    global count_dict 
    if type in count_dict: 
     count_dict[type]+=count 
    else: 
     count_dict = {type:count} 


    f=open("chat.txt") 
    while True: 
    for line in f.readlines(): 
     print line, 
     if ":)" in line: 
      smileys = line.count(":)") 
      count_data("smileys",smileys) 
     elif "$number" in line : #how to see whether it is currency or nor?? 
      currency=line.count("$number") //how can i do this 
      count_data("currecny",currency) 
     elif "1/2/3" in line : #how to validate date format 
      dateformat=line.count("dateformat") #how can i do this 
      count_data("currency",currency) 
     elif validate-agent-name in line: 
      agent_name=line.count("agentname") #How to do this get agentname in single quotes 
      count_data("agent_name",agent_name) 
    else: 
     break 
    f.close() 

    for keys in count_dict: 
    print keys,count_dict[keys] 


    The following would be the output 

    DATE123=2 smileys123=2 Currency123=6 AGENT123=5 
+3

You should read PEP8 (http://www.python.org/dev/peps/pep-0008/), no offense intended. – Benjamin 2012-04-02 16:54:05

+0

You know about regular expressions, right? – Marcin 2012-04-02 17:00:38

+0

If you only need to count the occurrences of each pattern, you don't need to replace the text in the file. Just use 're.findall()'. – 2012-04-02 17:09:08

Answers

1

Currency, i.e. $0 or $136, should be replaced with "Currency123", and 'TOM' (the agent name is usually in single quotes) should be replaced with AGENT123, and so on

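I think your class Replace should be replaced by a dictionary. With a dictionary you can do more (it comes with its own methods) while writing less code. It can keep track of the things you need to replace and gives you more options for changing the replacement rules dynamically. That would probably make your code cleaner and easier to understand, and it will certainly be shorter as you add more replacement words.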

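Edit: You may want to keep the list of replacement words in a text file and load them into a dictionary, rather than hard-coding the replacement words into a class, which I don't think is a good idea. Since you said there will be more of them, this makes more sense: less code to write (and cleaner!).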
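
The dictionary idea above could look roughly like the sketch below. It is not the answerer's code: the tab-separated rules format, the names `RULES_TEXT`, `load_rules` and `replace_and_count`, and the patterns themselves are all assumptions made for illustration. In a real script the rules text would be read from a file instead of an inline string. The sketch uses `re.subn()`, which performs the replacement and reports how many substitutions it made.

```python
import re

# Hypothetical rules format: one "TOKEN<TAB>regex" pair per line.
# In practice this text would come from a file, e.g. open("rules.txt").read().
RULES_TEXT = (
    "DATE123\t\\d{1,4}[/-]\\d{1,4}[/-]\\d{1,4}\n"
    "Currency123\t\\$\\d+(?:\\.\\d+)?\n"
    "AGENT123\t'\\w+'\n"
    "smileys123\t:-?\\)\n"
)


def load_rules(text):
    """Parse the rules text into {token: compiled pattern}, keeping file order."""
    rules = {}
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        token, pattern = raw.split("\t", 1)
        rules[token] = re.compile(pattern)
    return rules


def replace_and_count(line, rules):
    """Replace every match with its token; return (new_line, counts)."""
    counts = {}
    for token, pattern in rules.items():
        # subn() returns the new string and the number of substitutions
        line, n = pattern.subn(token, line)
        counts[token] = n
    return line, counts


rules = load_rules(RULES_TEXT)
new_line, counts = replace_and_count(
    "ID674 25/01/1986 You are now chatting with 'Tom' :) pay $136", rules
)
print(new_line)
print(counts)
```

Adding a new replacement then only means adding a line to the rules text, with no change to the code.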

To write a comment... use

# Here is a comment 

The code style isn't the best; read http://www.python.org/dev/peps/pep-0008/#pet-peeves, or even the whole document, if you want to learn a better coding style.

Here are regular expressions that check whether a string is a currency amount, a name such as 'Tom', or a date.

import re 

while True: 
    myString = input('Enter your string: ') 

    # Raw strings keep the backslashes intact for the regex engine. 
    isMoney = re.match(r'^\$[0-9]+(,[0-9]{3})*(\.[0-9]{2})?$', myString) 
    # Exactly one single quote on each side of the name 
    isName = re.match(r"^'\w+'$", myString) 
    # MM/DD/YYYY with zero-padded month and day, any four-digit year 
    isDate = re.match(r'^[0-1][0-9]/[0-3][0-9]/[0-9]{4}$', myString) 
    # or try r'^[0-9]{1,2}/[0-9]{1,2}/[0-9]{1,4}$' if you want 0/0/0 too... 

    if isMoney: 
        print('It is Money:', myString) 
    elif isName: 
        print('It is a Name:', myString) 
    elif isDate: 
        print('It is a Date:', myString) 
    else: 
        print('Not good.') 

Sample output:

Enter your string: $100 
It is Money: $100 
Enter your string: 100 
Not good. 
Enter your string: 'Tom' 
It is a Name: 'Tom' 
Enter your string: Tom 
Not good. 
Enter your string: 01/15/1989 
It is a Date: 01/15/1989 
Enter your string: 01151989 
Not good. 

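You can use these isSomething variables in a condition, depending on exactly what needs to be done. I hope this helps. If you want to learn more about regular expressions, have a look at the "Regular Expression Primer" or Python's RE page.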
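
Because these patterns are anchored with `^` and `$`, they classify a whole string rather than find matches inside a longer line. One way to apply them to a chat line, sketched here under the assumption that splitting on whitespace is good enough, is to test each token separately. The names `PATTERNS`, `classify` and `count_tokens` are made up for this example, and the date pattern is the looser variant so that 0/0/0 counts as well.

```python
import re
from collections import Counter

# Anchored patterns: each one must match an entire token.
PATTERNS = {
    "money": re.compile(r'^\$[0-9]+(,[0-9]{3})*(\.[0-9]{2})?$'),
    "name": re.compile(r"^'\w+'$"),
    "date": re.compile(r'^[0-9]{1,2}/[0-9]{1,2}/[0-9]{1,4}$'),
}


def classify(token):
    """Return the label of the first pattern matching the token, else None."""
    for label, pattern in PATTERNS.items():
        if pattern.match(token):
            return label
    return None


def count_tokens(line):
    """Count how many whitespace-separated tokens fall into each category."""
    counts = Counter()
    for token in line.split():
        label = classify(token)
        if label:
            counts[label] += 1
    return counts


line = "ID674 25/01/1986 plan from $0 to $136 chatting with 'Tom' 0/0/0"
counts = count_tokens(line)
print(dict(counts))
```

A token with trailing punctuation, such as `$136.` at the end of a sentence, would not match here; unanchored patterns with `re.findall()` handle that case better.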

+0

I was in a hurry when I wrote the comments :) I have changed it now.. – Rajeev 2012-04-02 17:18:42

+0

Edited to use 'boolean' variables; maybe this fits your code better. – George 2012-04-02 18:18:06

+0

Thanks, I am trying to figure out the best way to use it. – Rajeev 2012-04-02 18:27:00

1

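This doesn't do all of the replacements you said you need. However, here is a way to count things in your data using regular expressions and a defaultdict. If you really want to replace the strings, I'm sure you can figure it out: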

lines = [ 
    "ID674 25/01/1986 Thank you for :) choosing Optimus prime. Please wait for an Optimus prime Representative to respond. You are currently number 0 in the queue. You should be connected to an agent in approximately 0 minutes.. You are now chatting with 'Tom' 0", 
    "ID674 2gb Hi there! Welcome to Optus Web Chat 0/0/0 . $5.45 How can I help you today? 1", 
    "ID674 25-01-1986 I would like to change my bill plan from $0 with 0 expiry to something else $136. I find it very unuseful. Sam my phone no is 9838383821 2'" 
] 

import re 
from collections import defaultdict 

p_smiley = re.compile(r':\)|:-\)') 
p_currency = re.compile(r'\$[\d.]+') 
p_date = re.compile(r'(\d{1,4}[/-]\d{1,4}[/-]\d{1,4})') 

count_dict = defaultdict(int) 

def count_data(kind, count): 
    # count_dict is only mutated, never rebound, so no `global` is needed 
    count_dict[kind] += count 

for line in lines: 
    count_data('smiley', len(re.findall(p_smiley, line))) 
    count_data('date', len(re.findall(p_date, line))) 
    count_data('currency', len(re.findall(p_currency, line))) 
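
The code above only fills the dictionary. A possible way to print the totals in the `DATE123=2 ...` format from the question is sketched below. The `PATTERNS` list, the `summarize` helper, the shortened sample lines and the agent-name pattern (a single word in single quotes) are assumptions for illustration, not part of the answer.

```python
import re
from collections import defaultdict

# Labels follow the output format requested in the question.
# The AGENT123 pattern is an assumption: one word in single quotes.
PATTERNS = [
    ("DATE123", re.compile(r"\d{1,4}[/-]\d{1,4}[/-]\d{1,4}")),
    ("smileys123", re.compile(r":-?\)")),
    ("Currency123", re.compile(r"\$\d+(?:\.\d+)?")),
    ("AGENT123", re.compile(r"'\w+'")),
]


def summarize(lines):
    """Count matches of every pattern over all lines and format the totals."""
    counts = defaultdict(int)
    for line in lines:
        for label, pattern in PATTERNS:
            counts[label] += len(pattern.findall(line))
    return " ".join(f"{label}={counts[label]}" for label, _ in PATTERNS)


# Shortened versions of the sample lines, for illustration only.
sample = [
    "ID674 25/01/1986 Thank you :) You are now chatting with 'Tom' 0",
    "ID674 2gb Welcome to Optus Web Chat 0/0/0 . $5.45 How can I help? 1",
    "ID674 25-01-1986 change my plan from $0 to something else $136. 2",
]
summary = summarize(sample)
print(summary)  # DATE123=3 smileys123=1 Currency123=3 AGENT123=1
```

The currency pattern here requires digits after a decimal point, so the sentence-ending period in `$136.` is not swallowed into the match.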
+0

Very nice, thank you.. – Rajeev 2012-04-02 18:27:12