2015-02-23

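Phabricator review statistics

Are there any open source libraries that can print differential statistics from Phabricator? If not, could someone point me in the right direction on how to write some code to do this?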

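Here are some example statistics I would like:
- Average number of patches needed before a review is accepted.
- Average length of time taken to respond to a review.
- Average number of reviews that list me as a reviewer and that I end up reviewing.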


I couldn't find one, but here is a wrapper that might reduce the amount of code you'd need to write: http://bloomberg.github.io/phabricator-tools/ – 2016-03-20 13:55:40

Answer


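"Are there any open source libraries that can print differential statistics from Phabricator? If not, could someone point me in the right direction on how to write some code to do this?"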

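I have not found a library that does this, but Phabricator provides Conduit, which is an "informal mechanism for transferring ad-hoc JSON blobs". It offers a variety of methods that can be used to query Phabricator for information.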

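If you use Arcanist to interact with Phabricator (and have an ~/.arcrc file), then you can write scripts using the Python bindings, which abstract away Conduit's authentication and communication.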
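To make the ~/.arcrc dependency a little more concrete, here is a minimal sketch that lists the Conduit hosts in an .arcrc-style file. The JSON layout (a top-level "hosts" object keyed by API URL, each entry holding a "token") is an assumption about what Arcanist writes, the URL and token are placeholders, and `list_arc_hosts` is a hypothetical helper rather than part of any library.

```python
import json

# Hypothetical ~/.arcrc contents; the URL and token are placeholders,
# and the "hosts"/"token" layout is an assumption.
SAMPLE_ARCRC = """
{
  "hosts": {
    "https://phabricator.example.com/api/": {
      "token": "cli-xxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    }
  }
}
"""


def list_arc_hosts(arcrc_text):
    """Return the sorted Conduit API URLs found in .arcrc-style JSON text."""
    config = json.loads(arcrc_text)
    return sorted(config.get("hosts", {}))


hosts = list_arc_hosts(SAMPLE_ARCRC)
print(hosts)
```

If the list comes back empty for your own file, the bindings probably have no stored credentials to pick up.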

Having installed the Python phabricator library, I was able to look up my phid from my username using user.find (be warned, this method is deprecated):

# Import favourite statistics library
import numpy as np

# Import the datetime library to print time deltas
import datetime as dt

# Set up the Phabricator connection
from phabricator import Phabricator
ph = Phabricator()

# Find the phid for a username (user.find is deprecated)
username = 'chris.fournier'
phid = ph.user.find(aliases=[username])[username]

# Print the phid
print(phid)

This should print a phid (like mine):

PHID-USER-2h22hnj6t5sxyfq4lzox 

Average number of patches needed before my review is accepted.

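With my phid, I can query for all of the differentials that I have authored using differential.query and count how many diffs (i.e. "patches") each differential had. I can then compute any descriptive statistics I like (using a Python library such as numpy).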

# Find all authored diffs
diffs_authored = ph.differential.query(authors=[phid])

# A diff can be either closed or accepted if it has passed review
passed_status_names = {'Closed', 'Accepted'}

# Count how many patches each diff had before it passed review
diff_revisions = []
for diff in diffs_authored:
    if diff['statusName'] in passed_status_names:
        diff_revisions.append(len(diff['diffs']))

# Generate some results
print('Mean\t\t\t', np.mean(diff_revisions))
print('Standard deviation\t', np.std(diff_revisions))
print('Max\t\t\t', np.max(diff_revisions))
print('Reviews\t\t\t', len(diff_revisions))

This prints the following statistics about how many patches I needed to make before my diffs were accepted by my colleagues at work:

Mean                1.80404040404
Standard deviation  1.36920822006
Max                 13
Reviews             495
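The same calculation can be tried without a Phabricator server. The sketch below uses only the standard library and made-up revision records; `patch_count_stats` and the sample data are hypothetical, and the records are assumed to carry the same 'statusName' and 'diffs' fields used in the answer. `statistics.pstdev` is the population standard deviation, which is what `np.std` computes by default.

```python
import statistics


def patch_count_stats(revisions, passed=("Closed", "Accepted")):
    """Summarise the number of patches per revision that passed review."""
    counts = [len(r["diffs"]) for r in revisions if r["statusName"] in passed]
    if not counts:
        return None  # nothing has passed review yet
    return {
        "mean": statistics.mean(counts),
        "std": statistics.pstdev(counts),
        "max": max(counts),
        "reviews": len(counts),
    }


# Made-up records for illustration only.
sample = [
    {"statusName": "Closed", "diffs": ["101", "102"]},
    {"statusName": "Accepted", "diffs": ["103"]},
    {"statusName": "Needs Review", "diffs": ["104", "105", "106"]},
    {"statusName": "Closed", "diffs": ["107", "108", "109"]},
]

stats = patch_count_stats(sample)
print(stats)
```

The revision still in review is skipped, so only three of the four records contribute.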

Average length of time it takes me to respond to a review.

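With my phid, I can query for all of the diffs on which I am listed as a reviewer, then iterate over the comments on each diff (using differential.getrevisioncomments; be warned, this method is deprecated). Among those comments, I can find the first one I made (a comment, an accept, or a reject), and then subtract the diff's creation time from that comment's creation time.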

I can wrap this whole search through the comments into a function that produces a response time (returning None if I never commented on the diff).

def reviewer_response_time(reviewer_phid, diff):
    response_time = None
    creation_time = int(diff['dateCreated'])

    # Get comments
    diff_id = diff['id']
    comments = ph.differential.getrevisioncomments(
        ids=[int(diff_id)])[diff_id]

    # Get date of first comment typed into Phabricator by reviewer
    first_comment_time = None
    for comment in comments:
        if comment['authorPHID'] == reviewer_phid:
            first_comment_time = int(comment['dateCreated'])
            break

    # If the reviewer provided a review, calculate the response time
    if first_comment_time is not None:
        response_time = first_comment_time - creation_time

    return response_time
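The function above takes the first matching comment it meets, so it relies on the comments arriving in chronological order, which the answer does not state either way. As a hedged variant, the sketch below takes the minimum timestamp instead and works on plain dictionaries, so it can be tried offline. `first_response_seconds`, the PHIDs, and the timestamps are all made up for illustration.

```python
def first_response_seconds(reviewer_phid, created, comments):
    """Seconds from creation to the reviewer's earliest comment, or None."""
    times = [
        int(c["dateCreated"])
        for c in comments
        if c["authorPHID"] == reviewer_phid
    ]
    if not times:
        return None  # the reviewer never commented
    return min(times) - int(created)


# Made-up data: epoch seconds as strings, deliberately out of order.
created = "1424700000"
comments = [
    {"authorPHID": "PHID-USER-author", "dateCreated": "1424700600"},
    {"authorPHID": "PHID-USER-reviewer", "dateCreated": "1424707200"},
    {"authorPHID": "PHID-USER-reviewer", "dateCreated": "1424703600"},
]

elapsed = first_response_seconds("PHID-USER-reviewer", created, comments)
print(elapsed)
```

Using `min` costs a full pass over the comments, but the result no longer depends on how the server orders them.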

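I can then use that function to get the response times for all of the diffs that I am a reviewer on (fetched with differential.query). I iterate in batches of 100 because, in my case, I have done enough reviews that the query times out if I ask for them all at once.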

# Find all diffs that the user is listed as a reviewer for
offset = 0
limit = 100
diffs = ph.differential.query(reviewers=[phid],
                              limit=limit,
                              offset=offset)
response_times = []

# Keep requesting batches until an empty one comes back
while len(diffs.response) > 0:
    for diff in diffs:
        response_times.append(reviewer_response_time(phid, diff))
    offset += limit
    diffs = ph.differential.query(reviewers=[phid],
                                  limit=limit,
                                  offset=offset)
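The batching loop repeats the query call in two places. One way to factor that out is a small generator that keeps requesting pages until an empty one comes back. This is a sketch only: `iter_pages` and `fake_fetch` are hypothetical names, and the fake fetch function stands in for the real Conduit call.

```python
def iter_pages(fetch_page, limit=100):
    """Yield items from fetch_page(offset=..., limit=...) until a page is empty."""
    offset = 0
    while True:
        page = fetch_page(offset=offset, limit=limit)
        if not page:
            break
        for item in page:
            yield item
        offset += limit


# Stand-in for a remote query: 250 fake records served in slices.
FAKE_RECORDS = list(range(250))
calls = []


def fake_fetch(offset, limit):
    calls.append(offset)  # record each request for inspection
    return FAKE_RECORDS[offset:offset + limit]


results = list(iter_pages(fake_fetch, limit=100))
print(len(results), calls)
```

With the real client, `fetch_page` could wrap `ph.differential.query(reviewers=[phid], limit=limit, offset=offset)` and return its `.response` list, mirroring the loop above.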

Selecting only the reviews that I actually responded to, I can then calculate and print the statistics as follows.

# Choose only those reviews that the user was asked to review and did
reviewed_response_times = [
    response_time for response_time in response_times
    if response_time is not None
]

# Generate some results (all values are in seconds)
mean_reviewed_response_time = float(np.mean(reviewed_response_times))
std_reviewed_response_time = float(np.std(reviewed_response_times))
max_reviewed_response_time = int(np.max(reviewed_response_times))

# Print results
print('Mean\t\t\t', dt.timedelta(seconds=mean_reviewed_response_time))
print('Standard deviation\t', dt.timedelta(seconds=std_reviewed_response_time))
print('Max\t\t\t', dt.timedelta(seconds=max_reviewed_response_time))
print('Reviews\t\t\t', len(reviewed_response_times))

This prints the following statistics about how long it took me to get to each review:

Mean                10:04:15.351555
Standard deviation  2 days, 11:18:40.778145
Max                 66 days, 23:04:10
Reviews             1061
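In the figures above the standard deviation is several times the mean and the maximum is far larger still, which suggests a long tail of slow responses. As an optional addition that is not part of the original answer, the sketch below reports a median next to the mean, since the median is less sensitive to a few very late reviews. The sample values are made up.

```python
import datetime as dt
import statistics

# Made-up response times in seconds, with one extreme outlier.
sample_seconds = [120, 900, 3600, 7200, 5000000]

mean_td = dt.timedelta(seconds=statistics.mean(sample_seconds))
median_td = dt.timedelta(seconds=statistics.median(sample_seconds))

print('Mean\t', mean_td)
print('Median\t', median_td)
```

Here the single outlier drags the mean to more than eleven days while the median stays at one hour.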

Average number of reviews that list me as a reviewer and that I end up reviewing.

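I can reuse the data generated earlier to determine how many reviews I performed, how many I was listed as a reviewer on, and the proportion of my assigned reviews that I took part in.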

# Generate some results
print('Diffs reviewed\t\t', len(reviewed_response_times))
print('Diffs asked to review\t', len(response_times))
# The value printed is a fraction between 0 and 1
print('Percentage\t\t',
      len(reviewed_response_times) / len(response_times))

This prints the following statistics about how many reviews I performed compared with how many were assigned to me:

Diffs reviewed         1061
Diffs asked to review  1302
Percentage             0.81490015361
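The participation figure can also be computed straight from the list of response times, where None marks a diff that was never reviewed. The sketch below is a minimal version with a guard for the case where no reviews were assigned at all. `review_participation` and the sample list are hypothetical.

```python
def review_participation(response_times):
    """Return (reviewed, asked, ratio); ratio is None if nothing was asked."""
    asked = len(response_times)
    reviewed = sum(1 for t in response_times if t is not None)
    ratio = reviewed / asked if asked else None
    return reviewed, asked, ratio


# Made-up response times in seconds; None means no response was found.
sample = [3600, None, 120, 86400, None]

reviewed, asked, ratio = review_participation(sample)
print(reviewed, asked, ratio)
```

Multiply the ratio by 100 if a true percentage is wanted, since the value is a fraction between 0 and 1.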