2017-02-23

Integrating Kafka with Apache Calcite

I am trying to integrate Calcite with Kafka, using CsvStreamableTable as a reference.

Each ConsumerRecord is converted to an Object[] using the following code:

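import java.util.List;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.util.Utf8;
import org.apache.kafka.clients.consumer.ConsumerRecord;

// Nested class; the imports above belong at the top of the enclosing file.
static class ArrayRowConverter extends RowConverter<Object[]> {
    private final List<Schema.Field> fields;

    public ArrayRowConverter(List<Schema.Field> fields) {
        this.fields = fields;
    }

    @Override
    Object[] convertRow(ConsumerRecord<String, GenericRecord> consumerRecord) {
        // Slot 0 holds the Kafka record timestamp (the rowtime column);
        // the Avro fields follow in schema order.
        Object[] objects = new Object[fields.size() + 1];
        int i = 0;
        objects[i++] = consumerRecord.timestamp();
        for (Schema.Field field : this.fields) {
            Object obj = consumerRecord.value().get(field.name());
            if (obj instanceof Utf8) {
                // Convert Avro's Utf8 wrapper to a plain Java String.
                objects[i++] = obj.toString();
            } else {
                objects[i++] = obj;
            }
        }
        return objects;
    }
}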

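The enumerator is implemented as follows. One thread continuously polls records from Kafka and puts them into a queue, and the getRecord() method polls from that queue: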

public E current() {
    return current;
}

public boolean moveNext() {
    for (;;) {
        if (cancelFlag.get()) {
            return false;
        }
        ConsumerRecord<String, GenericRecord> record = getRecord();
        if (record == null) {
            // Nothing queued yet: back off briefly, then retry.
            try {
                Thread.sleep(200L);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            continue;
        }
        current = rowConvert.convertRow(record);
        return true;
    }
}
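
As an aside, the sleep-and-retry loop could be replaced by a timed wait on a blocking queue. The following is only a minimal sketch under assumptions: `QueueCursor` is a hypothetical name, the queue is assumed to be a `java.util.concurrent.BlockingQueue` filled by the Kafka polling thread, and the type parameter `T` stands in for the converted row type. It is not the code from the question.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical queue-backed cursor; T stands in for the converted row type. */
class QueueCursor<T> {
    private final BlockingQueue<T> queue;
    private final AtomicBoolean cancelFlag;
    private T current;

    QueueCursor(BlockingQueue<T> queue, AtomicBoolean cancelFlag) {
        this.queue = queue;
        this.cancelFlag = cancelFlag;
    }

    public T current() {
        return current;
    }

    public boolean moveNext() {
        while (!cancelFlag.get()) {
            try {
                // Wait up to 200 ms for a row; null means the wait timed out.
                T next = queue.poll(200L, TimeUnit.MILLISECONDS);
                if (next != null) {
                    current = next;
                    return true;
                }
            } catch (InterruptedException e) {
                // Restore the interrupt status and end the stream.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        BlockingQueue<Object[]> queue = new LinkedBlockingQueue<>();
        AtomicBoolean cancel = new AtomicBoolean(false);
        QueueCursor<Object[]> cursor = new QueueCursor<>(queue, cancel);

        // In the real setup a Kafka polling thread would fill the queue.
        queue.offer(new Object[] {1487808000000L, "10.0.0.1"});
        if (cursor.moveNext()) {
            System.out.println(cursor.current()[1]);
        }
        cancel.set(true);
        System.out.println(cursor.moveNext());
    }
}
```

With a timed poll the consumer wakes as soon as a row arrives instead of sleeping a fixed 200 ms, and it still rechecks the cancel flag at least every 200 ms.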

I tested SELECT STREAM * FROM Kafka.clicks and it works fine. rowtime is explicitly added as the first column, and its value is the Kafka record timestamp.

But when I run

SELECT STREAM FLOOR(rowtime TO HOUR) 
AS rowtime,ip,COUNT(*) AS c FROM KAFKA.clicks GROUP BY FLOOR(rowtime TO HOUR), ip 

it throws an exception:

java.sql.SQLException: Error while executing SQL "SELECT STREAM FLOOR(rowtime TO HOUR) AS rowtime,ip,COUNT(*) AS c FROM KAFKA.clicks GROUP BY FLOOR(rowtime TO HOUR), ip": From line 1, column 85 to line 1, column 119: Streaming aggregation requires at least one monotonic expression in GROUP BY clause 
    at org.apache.calcite.avatica.Helper.createException(Helper.java:56) 
    at org.apache.calcite.avatica.Helper.createException(Helper.java:41) 

Answer

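You need to declare that the "ROWTIME" column is monotonic. In MockCatalogReader, note how "ROWTIME" is declared monotonic in the "ORDERS" and "SHIPMENTS" streams. That is why some of the queries in SqlValidatorTest.testStreamGroupBy() are valid and others are not. The key method the validator relies on is SqlValidatorTable.getMonotonicity(String columnName).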
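
To illustrate the rule behind the error message, here is a simplified, self-contained model of the check. It is a sketch under stated assumptions, not Calcite code: `Mono`, `StreamTableModel`, `declareMonotonic` and `canStreamGroupBy` are hypothetical names, and the model assumes a grouping expression such as FLOOR(rowtime TO HOUR) inherits the monotonicity of the column it wraps, so group keys are represented by their underlying column names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified stand-in for a monotonicity flag; not Calcite's own type. */
enum Mono { INCREASING, NOT_MONOTONIC }

/** Hypothetical table that records which columns are declared monotonic. */
class StreamTableModel {
    private final Map<String, Mono> declared = new HashMap<>();

    void declareMonotonic(String columnName) {
        declared.put(columnName, Mono.INCREASING);
    }

    /** Plays the role of getMonotonicity(String columnName) in this model. */
    Mono getMonotonicity(String columnName) {
        return declared.getOrDefault(columnName, Mono.NOT_MONOTONIC);
    }

    /** A streaming GROUP BY is accepted only if some key is monotonic. */
    boolean canStreamGroupBy(List<String> groupColumns) {
        for (String column : groupColumns) {
            if (getMonotonicity(column) != Mono.NOT_MONOTONIC) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        StreamTableModel clicks = new StreamTableModel();
        List<String> keys = Arrays.asList("ROWTIME", "IP");

        // Nothing declared yet: the grouping is rejected, as in the question.
        System.out.println(clicks.canStreamGroupBy(keys));

        // After ROWTIME is declared monotonic, the same grouping is accepted.
        clicks.declareMonotonic("ROWTIME");
        System.out.println(clicks.canStreamGroupBy(keys));
    }
}
```

In this model the query from the question fails for the same reason as in the exception: no group key reports anything other than "not monotonic" until ROWTIME is declared.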

Thanks, Julian. Is there a simple way to declare a column monotonic, or should I implement it the way MockTable does? – user2283216