2015-08-09
1

Using a class to calculate averages in Python

I am using Python to read data from a range of air quality monitors.

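At the moment I calculate the averages as shown below. I think there must be a better way to do this using classes, one that is more generally applicable and efficient, because most of the code repeats itself.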

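Another issue is that I have several different types of monitor, which mostly operate on the same principle but have slightly different variables. The code given below is a real nightmare to move to a new monitor, because I have to edit every line.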

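The trouble is that when I search for "class" and "average", all the results I seem to get are about averaging student grades in a school class, rather than about using a software class to calculate the average of several variables.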

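Basically the monitor takes one reading per second, but I only need 1-minute averages, so I loop through the following until the minute rolls over.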

Any advice gratefully received.

At the moment I am doing this:

while minute unchanged: 
    ## read serial port 

    readData = SER.read() 

    ## Split comma delimited data into dust, flow, temperature, relative humidity, pressure 
    ## for example data comes in as 000.007,2.0,+21.7,046,1010.0 
    measData = readData[0].split(',')   
    dustReading = measData[0].strip()      

    flowReading = measData[1] 
    flowReading = flowReading.strip() 

    tempReading = measData[2] 
    tempReading = tempReading.strip() 

    rhReading = measData[3] 
    rhReading = rhReading.strip() 

    pressReading = measData[4] 
    pressReading = pressReading.strip() 


    ######### Dust ####### 

    try : 
     nReading = nReading+1 
     dustReading = float(dustReading) 
     sumDustReading = sumDustReading + dustReading 
     meanDustReading = sumReading/float(nReading)    
    except : 
     pass  

    ####### Flow ########## 
    try : 
     flowReading = float(flowReading) 
     sumFlowReading = sumFlowReading+flowReading 
     meanFlowReading = float(sumFlowReading)/float(nReading) 
    except : 
     pass      

    ######## Temperature ######### 
    try:  
     tempReading = float(tempReading) 
     sumTempReading = sumTempReading+tempReading 
     meanTempReading = float(sumTempReading)/float(nReading) 
    except : 
     pass 

    ######### RH ######## 
    try : 
     rhReading = float(rhReading) 
     sumRhReading = sumRhReading+rhReading 
     meanRhReading = float(sumRhReading)/float(nReading) 
    except : 
     pass  

    ###### Pressure Reading ###### 
    try : 
     pressReading = float(pressReading) 
     sumPressReading = sumPressReading+pressReading 
     meanPressReading = float(sumPressReading)/float(nReading)    
    except : 
     pass 

Ideally I would like to be able to get something like

flow.mean 
flow.numberOfReads 
flow.sum 

Many thanks.

+1

You might want to consider this module: https://docs.python.org/3/library/statistics.html

+1

What are you doing with 'sumDustReading', 'meanDustReading', etc.? Where are they first defined?

+0

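@PadraicCunningham: My guess is that the various 'sumxxxReading' fields are initialised before the while loop starts. I'm also assuming that 'while minute unchanged:' is semi-pseudocode, and that the unposted code contains some other time-measuring stuff. But of course I could be totally wrong. :)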

Answers

2

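Your existing code is a little odd and dangerous with all those try:...except blocks. It's dangerous because it ignores all sorts of errors that really should not be ignored. For example: attempting to use the value of an undefined variable, divide-by-zero errors, and outright syntax errors.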
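To make the danger concrete, here is a minimal sketch (written for Python 3, unlike the Python 2 listings in this thread; the function names and the misspelled `totl` are made up for illustration). It shows a bare except hiding a misspelled variable name, which is what the `sumReading` in the dust block of the question appears to be, while catching only ValueError skips bad readings without masking the bug.

```python
def mean_with_bare_except(readings):
    """Running mean where a bare except swallows every error."""
    total = 0.0
    count = 0
    mean = None
    for text in readings:
        try:
            count += 1
            total += float(text)
            mean = totl / count  # deliberate typo: raises NameError every time
        except:
            pass  # the NameError is silently discarded here
    return mean


def mean_with_named_except(readings):
    """Running mean that skips only readings that fail to parse."""
    total = 0.0
    count = 0
    for text in readings:
        try:
            value = float(text)
        except ValueError:
            continue  # corrupt reading: skip it, but nothing else is hidden
        total += value
        count += 1
    return total / count if count else None


readings = ['2.0', 'garbage', '4.0']
print(mean_with_bare_except(readings))   # None: the typo went unnoticed
print(mean_with_named_except(readings))  # 3.0
```

The first function never reports that anything is wrong; it just returns None forever, which is the kind of failure that is hard to track down on unattended hardware.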


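You can use a class for this if you want, but for something like this I'd be more inclined to use a simple dict. I'll post code for both approaches, to help you decide (or to add to your confusion :) ).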

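Obviously, I don't have your monitoring hardware connected to my serial port, so to test this code I've written a simple generator function that creates random fake data. Hopefully it will be easy for you to modify my code.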

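First, the code that uses a class, which holds the current total and the number of readings; the class also has a property that calculates the mean on demand. So you can do things like print dust.total, flow.mean, if you want.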

from random import seed, random 

#A generator to make simulated data 
def fake_data(n): 
    for i in range(n): 
     d = 0.005 + 0.005 * random() 
     f = 1.5 + random() 
     t = 20.0 + 5.0 * random() 
     h = int(40 + 10.0 * random()) 
     p = 1005.0 + 10.0 * random() 

     s = '%07.3f,%3.1f,%-4.1f,%03d,%6.1f' % (d, f, t, h, p) 
     yield [s] 


class Data(object): 
    def __init__(self, name): 
     self.name = name 
     self.total = 0 
     self.numberOfReads = 0 

    def __repr__(self): 
     fmt = "Data(name='%s', total=%f, number=%d, mean=%f)" 
     return fmt % (self.name, self.total, self.numberOfReads, self.mean) 

    def __str__(self): 
     fmt = '%s\nmean: %f\nnumber: %d\ntotal: %f\n' 
     return fmt % (self.name, self.mean, self.numberOfReads, self.total) 

    def add(self, value): 
     self.total += value 
     self.numberOfReads += 1 

    @property 
    def mean(self): 
     try: 
      return float(self.total)/float(self.numberOfReads) 
     except ZeroDivisionError: 
      return None 


#Seed the randomiser 
seed(1) 

#Initialise the Data instances 
dust = Data('Dust') 
flow = Data('Flow') 
temp = Data('Temperature') 
rh = Data('Relative Humidity') 
press = Data('Pressure') 

for readData in fake_data(10): 
    ## Split comma delimited data into dust, flow, temperature, relative humidity, pressure 
    ## for example data comes in as 000.007,2.0,+21.7,046,1010.0 
    print readData 
    measData = readData[0].split(',') 

    #Convert data strings to floats 
    dustR, flowR, tempR, rhR, pressR = [float(s) for s in measData] 

    #Add new data to current totals 
    dust.add(dustR) 
    flow.add(flowR) 
    temp.add(tempR) 
    rh.add(rhR) 
    press.add(pressR) 

print 

for a in (dust, flow, temp, rh, press): 
    print a 

Output

['000.006,2.3,23.8,042,1010.0'] 
['000.007,2.2,23.9,040,1005.3'] 
['000.009,1.9,23.8,040,1009.5'] 
['000.009,1.7,24.7,049,1005.3'] 
['000.005,2.0,24.7,043,1007.2'] 
['000.007,1.5,21.1,044,1010.0'] 
['000.006,1.7,21.1,044,1007.9'] 
['000.005,2.3,22.8,046,1006.9'] 
['000.010,2.4,20.6,043,1012.2'] 
['000.009,2.4,22.1,048,1011.7'] 

Dust 
mean: 0.007300 
number: 10 
total: 0.073000 

Flow 
mean: 2.040000 
number: 10 
total: 20.400000 

Temperature 
mean: 22.860000 
number: 10 
total: 228.600000 

Relative Humidity 
mean: 43.900000 
number: 10 
total: 439.000000 

Pressure 
mean: 1008.600000 
number: 10 
total: 10086.000000 
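Since the question also mentions several monitor types with slightly different variables, here is one way the class approach might be extended. This is only a sketch in Python 3: the `Accumulator` class mirrors the Data class above, while `MONITOR_FIELDS`, the monitor type names, `make_accumulators` and `add_line` are hypothetical names invented for the example. It relies on dicts preserving insertion order (Python 3.7+).

```python
class Accumulator:
    """Running total and count for one measured quantity."""

    def __init__(self, name):
        self.name = name
        self.total = 0.0
        self.numberOfReads = 0

    def add(self, value):
        self.total += value
        self.numberOfReads += 1

    @property
    def mean(self):
        if self.numberOfReads == 0:
            return None
        return self.total / self.numberOfReads


# Hypothetical field layouts, one entry per monitor type.
MONITOR_FIELDS = {
    'dust_monitor': ('dust', 'flow', 'temp', 'rh', 'press'),
    'small_monitor': ('dust', 'temp'),
}


def make_accumulators(monitor_type):
    """Build one Accumulator per field, in the order the monitor sends them."""
    return {name: Accumulator(name) for name in MONITOR_FIELDS[monitor_type]}


def add_line(accumulators, line):
    """Parse one comma-delimited line and add each value to its accumulator."""
    values = [float(s) for s in line.split(',')]
    if len(values) != len(accumulators):
        raise ValueError('expected %d fields, got %d'
                         % (len(accumulators), len(values)))
    for acc, value in zip(accumulators.values(), values):
        acc.add(value)


acc = make_accumulators('dust_monitor')
add_line(acc, '000.006,2.0,+21.0,040,1010.0')
add_line(acc, '000.008,3.0,+23.0,044,1012.0')
flow = acc['flow']
print(flow.mean, flow.numberOfReads, flow.total)  # 2.5 2 5.0
```

Moving to a new monitor would then mean adding one tuple of field names rather than editing every line.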

Here is a version that uses a dict. I've set the random seed to the same value so that the fake data is identical to the previous version.

from random import seed, random 

#A generator to make simulated data 
def fake_data(n): 
    for i in range(n): 
     d = 0.005 + 0.005 * random() 
     f = 1.5 + random() 
     t = 20.0 + 5.0 * random() 
     h = int(40 + 10.0 * random()) 
     p = 1005.0 + 10.0 * random() 

     s = '%07.3f,%3.1f,%-4.1f,%03d,%6.1f' % (d, f, t, h, p) 
     yield [s] 


#Seed the randomiser 
seed(1) 

#data field names 
fields = ('Dust', 'Flow', 'Temp', 'rh', 'press') 

#initialise the data dictionary 
data = dict.fromkeys(fields, 0.0) 

nReading = 0 
for readData in fake_data(10): 
    nReading += 1 

    ## Split comma delimited data into dust, flow, temperature, relative humidity, pressure 
    ## for example data comes in as 000.007,2.0,+21.7,046,1010.0 
    print nReading, readData 
    measData = readData[0].split(',') 

    #Convert data strings to floats 
    floatData = [float(s) for s in measData] 

    #Add new data to current totals 
    for key, value in zip(fields, floatData): 
     data[key] += value 

print '\nNumber of readings = %d\n' % nReading 
nReading = float(nReading) 
for key in fields: 
    total = data[key] 
    mean = total/nReading 
    print '%s\nmean: %f\ntotal: %f\n' % (key, mean, total) 

Output

1 ['000.006,2.3,23.8,042,1010.0'] 
2 ['000.007,2.2,23.9,040,1005.3'] 
3 ['000.009,1.9,23.8,040,1009.5'] 
4 ['000.009,1.7,24.7,049,1005.3'] 
5 ['000.005,2.0,24.7,043,1007.2'] 
6 ['000.007,1.5,21.1,044,1010.0'] 
7 ['000.006,1.7,21.1,044,1007.9'] 
8 ['000.005,2.3,22.8,046,1006.9'] 
9 ['000.010,2.4,20.6,043,1012.2'] 
10 ['000.009,2.4,22.1,048,1011.7'] 

Number of readings = 10 

Dust 
mean: 0.007300 
total: 0.073000 

Flow 
mean: 2.040000 
total: 20.400000 

Temp 
mean: 22.860000 
total: 228.600000 

rh 
mean: 43.900000 
total: 439.000000 

press 
mean: 1008.600000 
total: 10086.000000 
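The question asks for one-minute averages, which the listings above do not cover. The following is a rough Python 3 sketch of how the dict totals could be reset each time the minute rolls over. It assumes each reading arrives paired with a datetime timestamp; `minute_means` and the sample data are invented for illustration, and real code would take the timestamp when the serial line is read.

```python
from datetime import datetime

FIELDS = ('Dust', 'Flow', 'Temp', 'rh', 'press')


def minute_means(timed_lines, fields=FIELDS):
    """Yield (minute, {field: mean}) each time the minute rolls over."""
    totals = dict.fromkeys(fields, 0.0)
    count = 0
    current = None
    for stamp, line in timed_lines:
        minute = stamp.replace(second=0, microsecond=0)
        if current is not None and minute != current and count:
            # The minute has changed: report the finished minute and reset.
            yield current, {k: totals[k] / count for k in fields}
            totals = dict.fromkeys(fields, 0.0)
            count = 0
        current = minute
        values = [float(s) for s in line.split(',')]
        for key, value in zip(fields, values):
            totals[key] += value
        count += 1
    if count:
        # Report whatever was collected for the last (possibly partial) minute.
        yield current, {k: totals[k] / count for k in fields}


sample = [
    (datetime(2015, 8, 9, 12, 0, 10), '000.006,2.0,+21.0,040,1010.0'),
    (datetime(2015, 8, 9, 12, 0, 11), '000.008,3.0,+23.0,044,1012.0'),
    (datetime(2015, 8, 9, 12, 1, 0), '000.010,1.0,+20.0,050,1000.0'),
]
results = list(minute_means(sample))
for minute, means in results:
    print(minute, means['Flow'])
```

Note that the final yield reports a partial minute; whether that is wanted depends on how the averages are used.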

Here is a simple example of using try:... except to validate input data.

data = [ 
    '1.1 2.2 3.3', 
    '4 5', 
    '6 garbage bytes', 
    '7 8 9 10', 
    '11 12 13', 
] 

for i, line in enumerate(data): 
    print '\nLine %d: %r' % (i, line) 
    row = line.split() 
    if len(row) != 3: 
     print 'Bad row length. Should be 3 not', len(row) 
     continue 

    try: 
     a, b, c = [float(s) for s in row] 
    except ValueError as err: 
     print 'Conversion error:', err 
     continue 

    print 'Data %d: a=%f, b=%f, c=%f' % (i, a, b, c) 

Output

Line 0: '1.1 2.2 3.3' 
Data 0: a=1.100000, b=2.200000, c=3.300000 

Line 1: '4 5' 
Bad row length. Should be 3 not 2 

Line 2: '6 garbage bytes' 
Conversion error: invalid literal for float(): garbage 

Line 3: '7 8 9 10' 
Bad row length. Should be 3 not 4 

Line 4: '11 12 13' 
Data 4: a=11.000000, b=12.000000, c=13.000000 
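The same two checks could be applied to the monitor's comma-delimited lines before they are added to the totals. A small Python 3 sketch, assuming five fields per line; `parse_line` and the sample lines are made up for the example:

```python
def parse_line(line, n_fields=5):
    """Return a list of n_fields floats, or None if the line is malformed."""
    row = line.split(',')
    if len(row) != n_fields:
        return None
    try:
        return [float(s) for s in row]
    except ValueError:
        return None


lines = [
    '000.007,2.0,+21.7,046,1010.0',
    '000.0,2.0',                       # truncated line
    '000.009,2.2,garbage,044,1011.0',  # right length, one bad field
    '000.009,2.2,+22.7,044,1011.0',
]

totals = [0.0] * 5
good = bad = 0
for line in lines:
    values = parse_line(line)
    if values is None:
        bad += 1  # count rejected lines instead of silently dropping them
        continue
    totals = [t + v for t, v in zip(totals, values)]
    good += 1

means = [t / good for t in totals] if good else None
print(good, bad)  # 2 2
```

Keeping a count of rejected lines makes it easy to notice if a monitor starts producing far more garbage than usual.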
+0

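Thanks for all of this. Much appreciated. I'll give it a try and see how we go. I put in those 'try...' 'except...' blocks because the monitors are not very reliable in their output. About 1 in 100 readings is corrupt or partly corrupt random garbage, which causes all sorts of headaches and datatype errors when I convert to float. – bartman10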

+0

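@bartman10: No worries. 'try:.. except' can be very useful for validating input, but you should 1) minimise the code in the 'try:' block, and 2) use named exceptions, because a bare except catches too much. I'll add a small example to my answer.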

1

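If you keep the data in lists, it is easy to calculate the mean and other statistical properties. Things like the length and sum of a list are built in.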

So first make some lists.

dust, flow, temp, rh, pressure = [], [], [], [], [] 

I will use the following data as an example.

readData = '000.007,2.0,+21.7,046,1010.0' 

Let's split it up:

newdust, newflow, newtemp, newrh, newpressure = [float(n) for n in readData.split(',')] 
dust.append(newdust) 
... 
pressure.append(newpressure) 

Calculate the average:

sum(pressure)/len(pressure) 
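With the readings kept in lists, the statistics module suggested in the comments on the question (standard library, Python 3.4 and later) can be used directly. A short sketch with made-up sample lines:

```python
from statistics import mean, median

lines = [
    '000.007,2.0,+21.7,046,1010.0',
    '000.009,2.5,+22.1,044,1011.0',
    '000.008,3.0,+22.4,042,1012.0',
]

# Parse every line, then transpose rows into one list per quantity.
rows = [[float(n) for n in line.split(',')] for line in lines]
dust, flow, temp, rh, pressure = [list(col) for col in zip(*rows)]

print(mean(rh), median(rh))  # 44.0 44.0
print(mean(flow))            # 2.5
```

At one reading per second, a minute of data is only 60 values per quantity, so keeping the lists in memory should not be a concern.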

Update

It is hard to give advice on handling different kinds of instruments without seeing the data they produce.

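You could, for example, write a data conversion function for each different type of sensor. It takes a line read from the serial port and returns a tuple or namedtuple of data, with None for the measurements that the particular device does not support. Assuming you know which sensor you have connected, you can choose the right function to call at the start of the program: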

from collections import namedtuple 

Instrumentrecord = namedtuple('Instrumentrecord', 
           ['dust', 'flow', 'temp', 'humidity', 
           'pressure', 'windspeed', 'winddir']) 

def foo_sensor(dataline): 
    dust, flow, temp, rh, pressure = [float(n) for n in dataline.split(',')] 
    return Instrumentrecord(dust, flow, temp, rh, pressure, None, None) 


def main(argv): 
    .... 
    if sensortype == 'foo': 
     current_sensor = foo_sensor 
    ... 
    data = [] 
    while keep_going: 
     line = SER.read() 
     data.append(current_sensor(line)) 
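Once the records are collected, the averages can be computed per field while skipping the None placeholders. A Python 3 sketch: `Instrumentrecord` is the same namedtuple as above, whereas `bar_sensor` (a second instrument reporting dust, temperature and wind speed) and `field_means` are hypothetical names added for this example.

```python
from collections import namedtuple

Instrumentrecord = namedtuple('Instrumentrecord',
                              ['dust', 'flow', 'temp', 'humidity',
                               'pressure', 'windspeed', 'winddir'])


def bar_sensor(dataline):
    """Hypothetical instrument that reports only dust, temp and windspeed."""
    dust, temp, windspeed = [float(n) for n in dataline.split(',')]
    return Instrumentrecord(dust, None, temp, None, None, windspeed, None)


def field_means(records):
    """Average each field over the records, ignoring None values."""
    means = {}
    for field in Instrumentrecord._fields:
        values = [getattr(r, field) for r in records
                  if getattr(r, field) is not None]
        means[field] = sum(values) / len(values) if values else None
    return Instrumentrecord(**means)


records = [bar_sensor('0.006,21.0,3.0'), bar_sensor('0.008,23.0,5.0')]
m = field_means(records)
print(m.temp, m.windspeed, m.flow)  # 22.0 4.0 None
```

Because every instrument returns the same record type, the averaging code does not need to know which sensor produced the data.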
+0

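Thanks for that, Ronald. A good suggestion, but I'm not sure how much it helps me with portability between instruments. (They have different outputs but are basically the same.) – bartman10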

+0

@bartman10 See the updated answer. A different "translation" function for each instrument is a possible solution.