2017-06-30 12 views

Using Postgres 9.6. How can I efficiently compute statistics over a JSONB array nested inside a JSONB column?

I have this working, but I suspect there is a more efficient way. What is the best way to compute AVG, SUM, etc. over the MyEventLength array?

DROP TABLE IF EXISTS activity; 
DROP SEQUENCE IF EXISTS activity_id_seq; 
CREATE SEQUENCE activity_id_seq; 

CREATE TABLE activity (
    id INT CHECK (id > 0) NOT NULL DEFAULT NEXTVAL ('activity_id_seq'), 
    user_id INT, 
    events JSONB 
); 

INSERT INTO activity (user_id,events) VALUES 
(1, '{"MyEvent":{"MyEventLength":[450,790,1300,5400],"MyEventValue":[334,120,120,940]}}'), 
(1, '{"MyEvent":{"MyEventLength":[12],"MyEventValue":[4]}}'), 
(2, '{"MyEvent":{"MyEventLength":[450,790,1300,5400],"MyEventValue":[334,120,120,940]}}'), 
(1, '{"MyEvent":{"MyEventLength":[1000,2000],"MyEventValue":[450,550]}}'); 

So far, this is the best way I have found to compute the average of the MyEventLength array for user_id 1:

SELECT avg(recs::text::numeric) FROM (
    SELECT jsonb_array_elements(a.event_length) as recs FROM (
     SELECT events->'MyEvent'->'MyEventLength' as event_length from activity 
     WHERE user_id = 1 
    )a 
) b; 

Or this variation:

SELECT avg(recs) FROM (
    SELECT jsonb_array_elements_text(a.event_length)::numeric as recs FROM (
     SELECT events->'MyEvent'->'MyEventLength' as event_length from activity 
     WHERE user_id = 1 
    )a 
) b; 

Is there a better way to do this that doesn't need so many subselects?

Answer


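You need to pass rows of scalar values to avg(). Otherwise (if you try to pass it the output of a set-returning function such as jsonb_array_elements_text(..) directly), you will get an error like this: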

ERROR: set-valued function called in context that cannot accept a set 

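So the array has to be expanded into rows before aggregating, which (short of the lateral join in option 3) takes at least one subquery or CTE.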
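
For illustration, a query of roughly this shape is what triggers that error on 9.6. This is a sketch against the activity table from the question; newer Postgres versions may reject it with a differently worded message.

```sql
-- Does NOT work: the set-returning function is called directly
-- inside the aggregate, so avg() never receives plain scalar rows.
SELECT avg(jsonb_array_elements_text(events->'MyEvent'->'MyEventLength')::numeric)
FROM activity
WHERE user_id = 1;
```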

Option 1, without a CTE:

select avg(v::numeric) 
from (
    select 
    jsonb_array_elements_text(events->'MyEvent'->'MyEventLength') 
    from activity 
    where user_id = 1 
) as a(v); 

Option 2, with a CTE (more readable):

with vals as (
    select 
    jsonb_array_elements_text(events->'MyEvent'->'MyEventLength')::numeric as val 
    from activity 
    where user_id = 1 
) 
select avg(val) 
from vals 
; 

UPDATE, option 3: it turns out you can do this without any nested query, using an implicit JOIN LATERAL:

select avg(val::text::numeric) 
from activity a, jsonb_array_elements(a.events->'MyEvent'->'MyEventLength') vals(val) 
where user_id = 1; 
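
Since the question asks about AVG, SUM, etc., the same lateral join can feed several aggregates at once. The following is a minimal sketch, not part of the original answer: it spells out the join as CROSS JOIN LATERAL, uses jsonb_array_elements_text so a single cast to numeric is enough, and groups by user_id instead of filtering on one user. The aliases v, n, total, mean, shortest and longest are made up for the example.

```sql
-- One output row per user, aggregating every element of every
-- MyEventLength array that belongs to that user.
SELECT a.user_id,
       count(*)            AS n,
       sum(v.val::numeric) AS total,
       avg(v.val::numeric) AS mean,
       min(v.val::numeric) AS shortest,
       max(v.val::numeric) AS longest
FROM activity AS a
CROSS JOIN LATERAL
     jsonb_array_elements_text(a.events->'MyEvent'->'MyEventLength') AS v(val)
GROUP BY a.user_id
ORDER BY a.user_id;
```

With the sample rows from the question, user_id 1 contributes 7 elements summing to 10952, and user_id 2 contributes 4 elements summing to 7940. Rows whose events value has no MyEventLength array produce no elements and simply drop out of the join.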

Awesome, thanks! – Clay