2017-03-04

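Recursive merge on hierarchical data in R

I have a data frame containing the child and parent fields for accounts from a gnucash MySQL database. I would like to store the account hierarchy in the data frame. In the past I used recursive joins in MySQL, but that becomes very cumbersome as the hierarchy gets deeper, and you also have to know in advance how many levels your tree has. I am hoping there is a simpler way to build the hierarchy (with or without knowing the maximum depth).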

Sample data:

account_id <- c(1:11) 
account_name <- c('root_account','dining', 'food', 'discretionary_expense', 
        'expenses', 'base_salary_wife', 'base_salary_husband', 
        'base_salary', 'salary', 'taxable_income', 
        'income') 
account_parentid <- c(NA,3,4,5,1,8,8,9,10,11,1) 
test.data <- data.frame(account_id, account_name, account_parentid) 

Desired output:

   account_id          account_name account_parentid lvl2_parentid lvl3_parentid lvl4_parentid lvls
1           1          root_account               NA            NA            NA            NA   NA
2           2                dining                3             4             6            NA    4
3           3                  food                4             5            NA            NA    3
4           4 discretionary_expense                5            NA            NA            NA    2
5           5              expenses                1            NA            NA            NA    1
6           6      base_salary_wife                8             9            10            11    5
7           7   base_salary_husband                8             9            10            11    5
8           8           base_salary                9            10            11            NA    4
9           9                salary               10            11            NA            NA    3
10         10        taxable_income               11            NA            NA            NA    2
11         11                income                1            NA            NA            NA    1

Answer

You can use the data.tree package to work with hierarchical data:

Get the test data (note that the child and parent ids are now the first two columns):

account_id <- c(1:11) 
account_name <- c('root_account','dining', 'food', 'discretionary_expense', 
        'expenses', 'base_salary_wife', 'base_salary_husband', 
        'base_salary', 'salary', 'taxable_income', 
        'income') 
account_parentid <- c(NA,3,4,5,1,8,8,9,10,11,1) 
test.data <- data.frame(account_id, account_parentid, account_name, stringsAsFactors = FALSE) 

Convert it to a data.tree structure:

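library(data.tree) 
# Drop the root row (its parent is NA); the first two columns describe the child-parent edges
tree1 <- FromDataFrameNetwork(test.data[-1,]) 
# The root node is created from the edge list alone, so set its account_name by hand
tree1$account_name <- 'root_account' 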

Display it:

ToDataFrameTree(tree1, account = 'name', 'account_name', 'pathString') 

This prints the following:

               levelName account          account_name    pathString
1  1                           1          root_account             1
2   ¦--5                       5              expenses           1/5
3   ¦   °--4                   4 discretionary_expense         1/5/4
4   ¦       °--3               3                  food       1/5/4/3
5   ¦           °--2           2                dining     1/5/4/3/2
6   °--11                     11                income          1/11
7       °--10                 10        taxable_income       1/11/10
8           °--9               9                salary     1/11/10/9
9               °--8           8           base_salary   1/11/10/9/8
10                  ¦--6       6      base_salary_wife 1/11/10/9/8/6
11                  °--7       7   base_salary_husband 1/11/10/9/8/7
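
The ancestor chain that pathString spells out can also be walked directly on the data frame, which may be useful if you want the wide layout from the question without knowing the maximum depth. The following is only a base-R sketch and not part of the original answer: the names `ancestors`, `current`, `wide` and `result` are made up for illustration, the first parent column is called lvl1_parentid here, and unlike the table in the question it keeps the root id (1) at the end of every chain.

```r
account_id <- 1:11
account_name <- c('root_account', 'dining', 'food', 'discretionary_expense',
                  'expenses', 'base_salary_wife', 'base_salary_husband',
                  'base_salary', 'salary', 'taxable_income', 'income')
account_parentid <- c(NA, 3, 4, 5, 1, 8, 8, 9, 10, 11, 1)
test.data <- data.frame(account_id, account_parentid, account_name,
                        stringsAsFactors = FALSE)

# Climb one level per iteration: `current` holds, for every account,
# the ancestor reached so far (NA once the chain has passed the root).
ancestors <- list()
current <- test.data$account_parentid
# A chain cannot be longer than the number of accounts, so this bound
# also stops the loop if the data accidentally contains a cycle.
for (i in seq_len(nrow(test.data))) {
  if (all(is.na(current))) break
  ancestors[[i]] <- current
  # Parent of each current ancestor; match() returns NA for NA, so NA stays NA
  current <- test.data$account_parentid[match(current, test.data$account_id)]
}

# One column per level that actually occurs in the data
names(ancestors) <- paste0("lvl", seq_along(ancestors), "_parentid")
wide <- as.data.frame(ancestors)

# Number of ancestors per account (0 for the root)
lvls <- rowSums(!is.na(wide))

result <- cbind(test.data[, c("account_id", "account_name")], wide, lvls = lvls)
print(result)
```

The number of lvlN_parentid columns grows with the data, so nothing has to be hard-coded when a deeper account is added.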

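It is not part of the question, but where this really gets interesting is when you want to aggregate values over the hierarchy and the like. See the data.tree vignettes.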
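
As a rough illustration of that kind of roll-up, here is a minimal sketch that sums balances from the leaf accounts up to the root. The `amount` attribute and its values are invented for the example (the question contains no balances); `Set`, `Do`, `Aggregate` and `isLeaf` are data.tree functions, used here in the way I understand the package vignette to describe them.

```r
library(data.tree)

account_id <- 1:11
account_name <- c('root_account', 'dining', 'food', 'discretionary_expense',
                  'expenses', 'base_salary_wife', 'base_salary_husband',
                  'base_salary', 'salary', 'taxable_income', 'income')
account_parentid <- c(NA, 3, 4, 5, 1, 8, 8, 9, 10, 11, 1)
test.data <- data.frame(account_id, account_parentid, account_name,
                        stringsAsFactors = FALSE)
tree1 <- FromDataFrameNetwork(test.data[-1, ])

# Hypothetical balances, attached to the leaf accounts only.
# In the default pre-order traversal the leaves are visited as 2, 6, 7.
tree1$Set(amount = c(120, 3000, 3500), filterFun = isLeaf)

# Post-order visits children before their parent, so every parent
# sums subtotals that have already been computed.
tree1$Do(function(node) {
  node$amount <- Aggregate(node, attribute = "amount", aggFun = sum)
}, traversal = "post-order")

print(tree1, "account_name", "amount")
```

Because the traversal handles the recursion, the same three calls work regardless of how deep the account tree is.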