2013-03-04 80 views

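pyDatalog: handling unbound variables in a custom predicate

I am writing a pyDatalog program to analyse weather data from Weather Underground (as a demo for myself and others at my company). I have written a custom predicate resolver that returns the readings between a start time and an end time: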

# class for the reading table.
class Reading(Base):
    __table__ = Table('reading', Base.metadata, autoload=True, autoload_with=engine)

    def __repr__(self):
        return str(self.Time)

    # predicate to resolve 'timeBetween(X, Y, Z)' statements
    # matches items as X where the time of day is between Y and Z (inclusive).
    # if Y is later than Z, it returns the items not between Z and Y (exclusive).
    # TODO - make it work where t1 and t2 are not bound.
    # somehow needs to tell the engine to try somewhere else first.
    @classmethod
    def _pyD_timeBetween3(cls, dt, t1, t2):
        if dt.is_const():
            # dt is already known
            if t1.is_const() and t2.is_const():
                if (dt.id.Time.time() >= makeTime(t1.id)) and (dt.id.Time.time() <= makeTime(t2.id)):
                    yield (dt.id, t1.id, t2.id)
        else:
            # dt is an unbound variable
            if t1.is_const() and t2.is_const():
                if makeTime(t2.id) > makeTime(t1.id):
                    op = 'and'
                else:
                    op = 'or'
                sqlWhere = "time(Time) >= '%s' %s time(Time) <= '%s'" % (t1.id, op, t2.id)
                for instance in cls.session.query(cls).filter(sqlWhere):
                    yield (instance, t1.id, t2.id)
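
The makeTime helper is not shown above. A minimal sketch of what it might look like, assuming the bounds are 'HH:MM:SS' strings as in the session below, together with a hypothetical in_window function that mirrors the and/or choice made in the SQL branch:

```python
from datetime import datetime


def makeTime(text):
    """Hypothetical stand-in for the helper used above: parse 'HH:MM:SS'."""
    return datetime.strptime(text, "%H:%M:%S").time()


def in_window(t, start, end):
    """Mirror the SQL branch: 'and' when end is later than start, else 'or'.

    in_window is an illustrative name, not part of the original code.
    """
    if end > start:
        # ordinary window, both ends inclusive
        return start <= t <= end
    # start is not earlier than end: the window wraps around midnight
    return t >= start or t <= end
```

Note that the branch for a known dt in the predicate above only applies the non-wrapping comparison, so the two branches may disagree when Y is later than Z.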

The predicate works fine when t1 and t2 are bound to specific values:

:> easterly(X) <= (Reading.WindDirection[X] == 'East') 
:> + rideAfter('11:00:00') 
:> + rideBefore('15:00:00') 
:> goodTime(X) <= rideAfter(Y) & rideBefore(Z) & Reading.timeBetween(X, Y, Z) 
:> goodTime(X) 
[(2013-02-19 11:25:00,), (2013-02-19 12:45:00,), (2013-02-19 12:50:00,), (2013-02-19 13:25:00,), (2013-02-19 14:30:00,), (2013-02-19 15:00:00,), (2013-02-19 13:35:00,), (2013-02-19 13:50:00,), (2013-02-19 12:20:00,), (2013-02-19 12:35:00,), (2013-02-19 14:05:00,), (2013-02-19 11:20:00,), (2013-02-19 11:50:00,), (2013-02-19 13:15:00,), (2013-02-19 14:55:00,), (2013-02-19 12:00:00,), (2013-02-19 13:00:00,), (2013-02-19 14:20:00,), (2013-02-19 14:15:00,), (2013-02-19 13:10:00,), (2013-02-19 12:10:00,), (2013-02-19 14:45:00,), (2013-02-19 14:35:00,), (2013-02-19 13:20:00,), (2013-02-19 11:10:00,), (2013-02-19 13:05:00,), (2013-02-19 12:55:00,), (2013-02-19 14:10:00,), (2013-02-19 13:45:00,), (2013-02-19 13:55:00,), (2013-02-19 11:05:00,), (2013-02-19 12:25:00,), (2013-02-19 14:00:00,), (2013-02-19 12:05:00,), (2013-02-19 12:40:00,), (2013-02-19 14:40:00,), (2013-02-19 11:00:00,), (2013-02-19 11:15:00,), (2013-02-19 11:30:00,), (2013-02-19 11:45:00,), (2013-02-19 13:40:00,), (2013-02-19 11:55:00,), (2013-02-19 14:25:00,), (2013-02-19 13:30:00,), (2013-02-19 12:30:00,), (2013-02-19 12:15:00,), (2013-02-19 11:40:00,), (2013-02-19 14:50:00,), (2013-02-19 11:35:00,)] 

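But if I declare the goodTime rule with its conditions in the other order (i.e. so that Y and Z are not yet bound at the point where it tries to resolve timeBetween), it returns an empty set: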

:> atoms('niceTime') 
:> niceTime(X) <= Reading.timeBetween(X, Y, Z) & rideAfter(Y) & rideBefore(Z) 
<pyDatalog.pyEngine.Clause object at 0x0adfa510> 
:> niceTime(X) 
[] 

This seems wrong: the two queries should return the same set of results.

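My question is whether there is a way of handling this situation in pyDatalog. I think what needs to happen is that the timeBetween predicate should be able to tell the engine to back off and resolve the other literals first before trying this one, but I can't see any reference to this in the documentation.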

Answer

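The pyDatalog reference says: "although the order of pyDatalog statements is indifferent, the order of literals within a body is significant". pyDatalog does indeed resolve predicates in the order in which they are stated.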
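
To see why the order matters, here is a toy left-to-right evaluator in plain Python. It is not pyDatalog's engine, only a sketch under simplifying assumptions: every name in it (READINGS, solve, the resolver functions) is made up for illustration, and an unbound argument is represented by None.

```python
from datetime import time

# Hypothetical sample data standing in for the 'reading' table.
READINGS = [time(10, 30), time(11, 25), time(14, 30), time(16, 0)]


def ride_after(args):
    (y,) = args
    if y is None or y == time(11, 0):
        yield (time(11, 0),)


def ride_before(args):
    (z,) = args
    if z is None or z == time(15, 0):
        yield (time(15, 0),)


def time_between(args):
    x, y, z = args
    if y is None or z is None:
        return  # like the predicate in the question: nothing if bounds are unbound
    candidates = READINGS if x is None else [x]
    for reading in candidates:
        if y <= reading <= z:
            yield (reading, y, z)


def solve(body, bindings=None):
    """Resolve the literals strictly left to right, extending the bindings."""
    bindings = bindings or {}
    if not body:
        yield bindings
        return
    (resolver, names), rest = body[0], body[1:]
    args = tuple(bindings.get(name) for name in names)
    for values in resolver(args):
        extended = dict(bindings)
        extended.update(zip(names, values))
        yield from solve(rest, extended)


good = [(ride_after, ("Y",)), (ride_before, ("Z",)), (time_between, ("X", "Y", "Z"))]
nice = [(time_between, ("X", "Y", "Z")), (ride_after, ("Y",)), (ride_before, ("Z",))]

good_times = [b["X"] for b in solve(good)]
nice_times = [b["X"] for b in solve(nice)]
```

In this model good_times holds the 11:25 and 14:30 readings, while nice_times is empty: time_between is asked first, sees unbound Y and Z, yields nothing, and the remaining literals are never tried.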

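Having said that, it might be possible to improve pyDatalog so that it resolves predicates with bound variables first, but I am not sure why this would be important.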
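
For what it's worth, such a reordering could be sketched roughly as follows. This is purely hypothetical and not an existing pyDatalog feature: the reorder function and the needs_bound table are assumptions for illustration, and it presumes each predicate can declare which argument positions must be bound before it is called.

```python
def reorder(body, needs_bound):
    """Greedily schedule literals so that required variables are bound first.

    body: list of (predicate_name, variable_names) tuples.
    needs_bound: dict mapping a predicate name to the argument positions
        that must already be bound when the predicate is resolved.
    """
    bound = set()
    pending = list(body)
    ordered = []
    while pending:
        for literal in pending:
            name, variables = literal
            required = {variables[i] for i in needs_bound.get(name, ())}
            if required <= bound:
                ordered.append(literal)
                bound.update(variables)  # resolving a literal binds its variables
                pending.remove(literal)
                break
        else:
            raise ValueError("no remaining literal has its required variables bound")
    return ordered


nice_body = [
    ("timeBetween", ("X", "Y", "Z")),
    ("rideAfter", ("Y",)),
    ("rideBefore", ("Z",)),
]
# Assumption: timeBetween needs its 2nd and 3rd arguments (Y and Z) bound.
scheduled = reorder(nice_body, {"timeBetween": (1, 2)})
```

With these inputs the niceTime body is scheduled as rideAfter, rideBefore, timeBetween, which is the order already used by goodTime.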

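My only reason is to make the syntax more transparent and independent of the underlying engine. Normally '&' is a commutative operator, so that is what people coming from other languages would expect. – highfellow 2013-03-05 10:21:50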

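Thanks for your feedback. At one point I considered a different syntax for clauses, using a list of body literals instead of '&': p(X) <= (q(X), r(X)). This has the advantage of not implying commutativity, but I found it less readable than '&'. Following your feedback, I may add this notation at some point. Note that 'and' is not truly commutative in Python either: if a is false, b is not evaluated in 'a and b', so 'a and b' may give a different result from 'b and a'. – user474491 2013-03-11 15:48:15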
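
A small illustration of that last point, using two made-up functions a and b that record when they are called:

```python
calls = []


def a():
    calls.append("a")
    return 0  # falsy


def b():
    calls.append("b")
    return []  # also falsy, but a different value


first = a() and b()   # a() is falsy, so b() is never called
order_first = list(calls)

calls.clear()
second = b() and a()  # b() is falsy, so a() is never called
order_second = list(calls)
```

Here first is 0 and only a was called, whereas second is [] and only b was called, so swapping the operands changes both the result and the side effects.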