2015-10-20 41 views
0

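I have a document-term matrix with frequencies for more than 600 words, and a corresponding date (mm/dd/yyyy) for each frequency value: how do I plot word frequency against time in R, grouping the time variable by month/year and by year?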

 > head(mydata3,3) 
    Claim.Number Note.Date LOSSDATE DATEREPORTED 
1  106810 7/10/1998 12/9/1997 12/29/1997 
2  106810 7/21/1998 12/9/1997 12/29/1997 
3  106810 10/21/1999 12/9/1997 12/29/1997 
    DATEENTERED Row Topic absenc abus academ access 
1 1/5/1998 3  4  0 0  0  0 
2 1/5/1998 4  2  0 0  0  0 
3 1/5/1998 8 11  0 0  0  0 
    accid accommod account accus act action activ add 
1  0  0  0  0 0  0  0 0 
2  0  0  0  0 0  0  0 0 
3  0  0  0  0 0  0  0 0 
    addit addl adequ adjust administr admiss advanc 
1  0 0  0  0   0  0  0 
2  0 0  0  0   0  0  0 
3  0 0  0  0   0  0  0 
    advers advic african age agenc agreement aid ambul 
1  0  0  0 0  0   0 0  0 
2  0  0  0 0  0   0 0  0 
3  0  0  0 0  0   0 0  0 
    amount analysi ankl answer anticip appeal appel 
1  0  0 0  0  0  0  0 
2  0  0 0  0  0  2  0 
3  0  0 0  0  0  1  0 
    appli applic appoint appropri approv approxim arbitr 
1  0  0  0  1  0  0  0 
2  0  0  0  0  0  0  0 
3  0  0  0  0  0  0  0 
    argu argument aris arm arrang arriv asap assault 
1 0  0 0 0  0  0 0  0 
2 0  0 0 0  0  0 0  0 
3 0  0 1 0  0  0 0  0 
    assert assess assist athlet attach attent audit auto 
1  0  0  0  0  0  2  0 0 
2  0  0  0  0  0  0  0 0 
3  0  0  0  0  0  0  0 0 
    avoid await award background balanc ball bar basi 
1  0  0  0   0  0 0 0 0 
2  0  0  0   0  0 0 0 0 
3  0  0  0   0  0 0 0 0 
    benefit big bill black board breach break. brief 
1  0 0 0  0  0  0  0  0 
2  0 0 0  0  0  0  0  0 
3  0 0 0  0  0  0  0  0 
    broken broker budget build bus busi call campus cap 
1  0  0  0  0 0 0 0  0 0 
2  0  0  0  0 0 0 2  0 0 
3  0  0  0  0 0 0 0  0 0 
    car care carrier center cgl chair chang charg child 
1 0 0  0  0 0  0  0  0  0 
2 0 0  0  0 0  0  0  0  0 
3 0 0  0  0 0  0  0  0  0 
    children circuit cite citi civil clean client clinic 
1  0  0 0 0  0  0  0  0 
2  0  0 0 0  0  0  0  0 
3  0  0 0 0  0  0  0  0 
    close closur cmc coach code collect commit committe 
1  0  0 0  0 0  0  0  0 
2  0  0 0  0 0  0  0  0 
3  0  0 0  0 0  0  0  0 
    communic compani compar compel compens complain 
1  0  0  0  0  0  0 
2  0  0  0  0  0  0 
3  0  0  0  0  0  0 
    complet conclud condit conduct conf confer confid 
1  0  0  0  0 0  0  0 
2  0  0  0  0 0  0  0 
3  0  0  0  0 0  0  0 
    conflict connect construct consult contact contend 
1  0  0   0  0  0  0 
2  0  0   0  0  0  0 
3  0  0   0  0  0  0 
    contract contractor contribut control convers 
1  0   0   0  0  0 
2  0   0   0  0  0 
3  0   0   0  0  0 
    convinc cooper coordin copi correct cost counter 
1  0  0  0 0  0 0  0 
2  0  0  0 0  0 0  0 
3  0  0  0 1  0 0  0 
    counti cours court cover coverag creat credibl 
1  0  0  0  0  0  0  0 
2  0  0  0  0  0  0  0 
3  0  0  0  0  0  0  0 
    credit crimin cross cut damag danger deadlin deal 
1  0  0  0 0  0  0  0 0 
2  0  0  0 0  0  0  0 0 
3  0  0  0 0  0  0  0 0 
    dean death decis declin deduct defam defect defend 
1 0  0  0  0  0  0  0  0 
2 0  0  0  0  0  0  0  0 
3 0  0  0  0  0  0  0  0 
    degre delay demand deni denial depart depos deposit 
1  0  0  0 0  0  0  0  0 
2  1  0  0 1  0  0  0  0 
3  1  0  0 0  0  0  0  0 
    dept despit develop diari difficult director disabl 
1 0  1  0  1   0  0  0 
2 1  0  0  0   0  0  0 
3 0  0  0  0   0  0  0 
    discharg disciplin disciplinari discoveri discrimin 
1  0   0   0   0   1 
2  0   0   0   0   1 
3  0   0   0   0   0 
    discuss dismiss disput distress district doc docket 
1  0  0  0  0  0 0  0 
2  0  0  0  0  0 0  0 
3  0  0  0  0  0 0  0 
    doctor document done door dorm doubt draft drive 
1  0  0 0 0 0  0  0  0 
2  0  0 0 0 0  0  1  0 
3  0  0 0 0 0  0  0  0 
    driver drop due earlier earn educ eeoc effort ell 
1  0 0 0  0 0 0 0  0 0 
2  0 0 0  0 0 0 0  0 0 
3  0 0 0  0 0 0 0  0 0 
    els email emot employ employe encourag end endors 
1 0  0 0  0  0  0 1  0 
2 0  0 0  0  0  0 0  0 
3 0  0 0  1  2  0 1  0 
    enrol entitl environ estim evalu event evid exam 
1  0  0  0  0  0  0 0 2 
2  0  0  0  0  0  0 0 2 
3  0  0  0  0  0  0 0 2 
    examin exceed excess exchang exclus execut expens 
1  0  0  0  0  0  0  0 
2  0  0  0  0  0  0  0 
3  0  0  0  0  0  0  0 
    experi expert expir exposur extend extens extent 
1  0  0  0  0  0  0  0 
2  0  0  0  0  0  0  0 
3  0  0  0  0  0  0  0 
    extrem eye face facil faculti fail failur fall fals 
1  0 0 0  0  0 0  0 0 0 
2  0 0 0  0  1 2  1 0 0 
3  0 0 0  0  0 3  0 0 0 
    fault favor fax feder fee fell femal field fight 
1  0  0 0  0 0 0  0  0  0 
2  0  0 0  0 0 0  0  0  0 
3  0  0 0  0 0 0  0  0  0 
    final financi finish fire firm floor focus foot forc 
1  0  0  0 0 0  0  0 0 0 
2  0  0  0 0 0  0  0 0 0 
3  0  0  0 0 0  0  0 0 0 
    form formal former forward fractur free fund futur 
1 0  0  0  0  0 0 0  0 
2 0  0  0  0  0 0 0  0 
3 0  0  0  0  0 0 0  0 
    game gender gone grade graduat grant grievanc ground 
1 0  0 0  0  0  0  0  0 
2 0  0 0  0  0  1  0  0 
3 0  0 0  1  1  0  0  0 
    group hand happi harass head health hear held higher 
1  0 0  0  0 0  0 0 0  0 
2  0 0  0  0 0  0 0 0  0 
3  0 0  0  0 0  0 0 0  0 
    hire histori hit hold home hospit hostil hous human 
1 0  0 0 0 0  0  0 0  0 
2 0  0 0 0 0  0  0 0  0 
3 0  0 0 0 0  0  0 0  0 
    ice identifi immedi immun impact import impress 
1 0  0  0  0  0  0  0 
2 0  0  0  0  0  0  0 
3 0  0  0  0  0  0  0 
    improv inappropri inclin incur indemn individu injur 
1  0   0  0  0  0  0  0 
2  0   0  0  0  0  0  0 
3  0   0  0  0  0  0  0 
    injuri inquir inquiri inspect instruct intent 
1  0  0  0  0  0  0 
2  0  0  0  0  0  0 
3  0  0  0  0  0  0 
    interest intern invoic job joint judg judgment juri 
1  0  0  0 0  0 0  0 0 
2  0  0  0 0  0 0  0 0 
3  0  1  0 0  0 0  0 0 
    jurisdict key knee knowledg lacer lack larg latest 
1   0 0 0  0  0 0 0  0 
2   0 0 0  0  0 0 0  0 
3   0 0 0  0  0 0 0  0 
    law lawyer layer learn leav leg legal letter level 
1 0  0  0  0 0 0  0  1  0 
2 0  1  0  0 0 0  0  0  0 
3 0  0  0  0 0 0  0  0  0 
    liabil lien life limit litig live lmtcb local lose 
1  0 0 0  0  0 0  0  0 0 
2  0 0 0  0  0 0  0  0 0 
3  0 0 0  0  0 0  0  0 0 
    loss lost low mail mainten major male manag materi 
1 0 0 0 0  0  0 0  0  0 
2 0 0 0 0  0  0 0  0  0 
3 0 0 0 0  0  0 0  0  0 
    mcad med mediat medic medicar meet memo merit messag 
1 0 0  0  0  0 0 0  0  0 
2 0 0  0  0  0 0 0  0  0 
3 0 0  0  0  0 2 0  0  0 
    million minor mom money monitor motion msj mtd 
1  0  0 0  0  0  0 0 0 
2  0  0 0  0  0  0 0 0 
3  0  0 0  0  0  0 0 0 
    nation near neck neglig negoti news noth notic 
1  1 0 0  0  0 0 0  0 
2  0 0 0  0  0 0 0  0 
3  0 0 0  0  0 0 0  0 
    notifi numer nurs object oblig ocr offer offici ongo 
1  0  0 0  0  0 0  0  0 0 
2  0  0 0  0  0 0  0  0 0 
3  0  0 0  0  0 2  0  0 0 
    open oper opinion opportun oppos opposit oral order 
1 0 0  0  0  0  0 0  0 
2 1 0  0  0  0  0 0  0 
3 0 0  0  0  0  0 0  0 
    origin outlin outstand owe paid pain park parti 
1  0  0  0 0 0 0 0  0 
2  0  0  0 0 0 0 0  0 
3  0  0  0 0 0 0 0  0 
    partner pass pay payment pend perman personnel petit 
1  0 1 0  0 0  0   0  0 
2  0 1 0  0 0  0   0  0 
3  0 2 0  0 1  0   0  0 
    phone photo physic physician pictur plan player 
1  0  0  0   0  0 0  0 
2  0  0  0   0  0 0  0 
3  0  0  0   0  0 0  0 
    plead poa polic polici poor postpon potenti practic 
1  0 0  0  0 0  0  0  0 
2  0 0  0  0 0  0  0  0 
3  0 0  0  0 0  0  0  0 
    preliminari premis prepar pres presid press pressur 
1   0  0  0 0  0  0  0 
2   0  0  0 0  0  0  0 
3   0  0  0 0  0  0  0 
    prevail prevent primari privat proceed product 
1  0  0  0  0  0  0 
2  0  0  0  0  0  0 
3  0  0  0  0  0  0 
    profession professor progress project promis promot 
1   0   0  0  0  0  0 
2   0   1  0  0  0  0 
3   0   2  0  0  0  0 
    proper properti propos protect provis provost pull 
1  0  0  0  0  0  0 0 
2  0  0  0  0  0  1 0 
3  0  0  0  0  0  0 0 
    punit pursu push qualifi quick quiet quit race rais 
1  0  0 0  0  0  0 0 0 0 
2  0  0 0  0  0  0 0 0 0 
3  0  0 0  0  0  0 0 0 0 
    rang rate reach recal receipt recov recoveri rediari 
1 0 0  0  0  0  0  0  0 
2 0 0  0  0  0  0  0  0 
3 0 0  0  0  0  0  0  0 
    reduc reimburs reinsur reject relationship releas 
1  0  0  0  0   0  0 
2  0  0  0  0   0  0 
3  0  0  0  0   0  0 
    relief remain remedi remov renew reopen rep repair 
1  0  0  0  0  0  0 0  0 
2  0  0  0  0  0  0 0  0 
3  0  0  1  0  0  0 0  0 
    repeat. replac repli repres represent research 
1  0  0  0  0   0  0 
2  0  0  0  0   0  0 
3  0  0  0  0   0  0 
    reserv resid resign resolut resolv respect respond 
1  0  0  0  0  0  0  0 
2  0  0  0  0  0  0  0 
3  0  0  0  0  0  0  0 
    rest retain retali retent retir return reveal review 
1 0  0  0  0  0  0  0  2 
2 0  0  0  0  0  0  0  0 
3 0  0  0  0  0  0  0  1 
    revis risk role ror rts rule run safeti salari 
1  0 0 0 0 0 0 0  0  0 
2  0 0 0 0 0 0 0  0  0 
3  0 0 0 0 0 0 0  0  0 
    schedul search section secur select semest separ 
1  0  0  0  0  0  0  0 
2  0  0  0  0  0  0  0 
3  0  0  0  0  0  0  0 
    serious serv servic settl settlement sex sexual 
1  0 0  0  0   0 0  0 
2  0 0  0  0   0 0  0 
3  0 0  0  0   0 0  0 
    shoulder side sidewalk sign signific sir sit site 
1  0 0  0 0  0 0 0 0 
2  0 0  0 0  0 0 0 0 
3  0 0  0 0  0 0 0 0 
    situat slip small snow speak spent split staff stage 
1  0 0  0 0  0  0  0  0  0 
2  0 0  0 0  0  0  0  0  0 
3  0 0  0 0  0  0  0  0  0 
    stair standard statement status statut step stop 
1  0  0   0  0  0 0 0 
2  0  0   0  2  0 0 0 
3  0  0   0  0  0 0 0 
    stori strategi street strike struck studi subject 
1  0  0  0  0  0  0  0 
2  0  0  0  0  0  0  0 
3  0  0  0  0  0  0  0 
    substanti success sue suffer suffici suggest summari 
1   0  0 0  0  0  0  0 
2   0  0 0  0  0  0  0 
3   0  0 0  0  0  0  0 
    supervis supervisor supplement supv surgeri suspect 
1  0   0   0 0  0  0 
2  0   0   0 0  0  0 
3  0   0   0 0  0  0 
    suspend sustain system tabl tcw teach teacher team 
1  0  0  0 0 0  0  0 0 
2  0  0  0 0 0  0  0 0 
3  0  0  0 0 0  0  0 0 
    telephon tender tenur term termin test testifi 
1  0  0  0 0  0 0  0 
2  0  0  0 0  0 1  0 
3  0  0  0 0  0 0  0 
    testimoni theori threaten titl top total tpa track 
1   0  0  0 0 0  0 0  0 
2   0  0  0 0 0  0 0  0 
3   0  0  0 0 0  0 0  0 
    train transcript transfer transport travel treat 
1  0   0  0   0  0  0 
2  0   0  0   0  0  0 
3  0   0  0   0  0  0 
    treatment trial trip troubl tuition unabl unclear 
1   0  0 0  0  0  0  0 
2   0  0 0  0  0  0  0 
3   0  0 0  0  0  0  0 
    unfortun upcom updat vacat valu vehicl verdict video 
1  0  0  1  0 0  0  0  0 
2  0  0  0  0 0  0  0  0 
3  0  0  0  0 0  0  0  0 
    violat visitor voicemail wage wait walk warn watch 
1  0  0   0 0 0 0 0  0 
2  0  0   0 0 0 0 0  0 
3  0  0   0 0 0 0 0  0 
    water weak white win withdraw worker write written 
1  0 0  0 0  0  0  0  0 
2  0 0  0 0  0  0  0  0 
3  0 0  0 0  0  0  1  0 
    wrote xbocx xdolx ximex xmsjx xnpcx xoopx xprosex 
1  0  0  0  0  0  0  0  0 
2  0  0  0  0  0  0  0  0 
3  1  0  0  0  0  0  0  0 
    xsolx 
1  0 
2  0 
3  0 

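I am trying to group the frequency values by month/year and by year. For example, for the word "appeal", rather than 2 occurrences on 1/5/1998 and 1 more occurrence recorded separately, I would like 3 occurrences for January 1998, and then 3 occurrences for 1998 as a whole (assuming there are no further hits during the rest of the year). I would then like to plot the monthly frequencies against month/year, and the yearly frequencies against year.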

I tried to group by month/year using the code below:

df %>% 
     mutate(month_year = format(date, "%Y/%m")) %>% 
     group_by(month_year) %>% 
     summarise(total = sum(vocabfreq)) 

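where `vocabfreq` stands for the word-frequency columns of the original dataset. Another problem is that my dataset is very large, and I am struggling to plot multiple series on a single readable chart.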

Answers

1

An xts approach:

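library(xts) 

# Toy data: one row per note, one count column per word 
dat <- data.frame(date = c('7/10/2014', '7/10/2014', '7/11/2014', '8/05/2015', '9/21/2015'), 
                  word1 = c(1, 2, 1, 4, 3), word2 = c(3, 10, 1, 2, 4)) 
# Parse as Date (not POSIXct) so that time zones cannot shift day boundaries 
dates <- as.Date(dat$date, format = '%m/%d/%Y') 
dat.xts <- xts(subset(dat, select = -date), order.by = dates) 
apply.daily(dat.xts, colSums)    # totals per day 
apply.monthly(dat.xts, colSums)  # totals per month 
apply.yearly(dat.xts, colSums)   # totals per year 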
+0

When I try this, the date column is not included in the data frame? –

+0

That is not a data.frame, it is an xts object. The dates can be retrieved with `index(dat.xts)`. This is a more efficient way to handle date-indexed data. – DunderChief

+0

Also, if you provide a reproducible example in your question, we can give an example that uses your data. See `?dput` – DunderChief
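To make the `index()` point from the comments concrete, here is a minimal sketch that reuses the toy data from the answer above. The names `monthly` and `monthly_df` are introduced only for illustration; the sketch assumes the xts package is installed.

```r
library(xts)

# Same toy data as in the answer (word1/word2 are placeholder columns)
dat <- data.frame(date = c('7/10/2014', '7/10/2014', '7/11/2014', '8/05/2015', '9/21/2015'),
                  word1 = c(1, 2, 1, 4, 3), word2 = c(3, 10, 1, 2, 4))
dat.xts <- xts(as.matrix(dat[, c("word1", "word2")]),
               order.by = as.Date(dat$date, format = "%m/%d/%Y"))

# Monthly totals; each row is stamped with the last date observed in that month
monthly <- apply.monthly(dat.xts, colSums)

# The dates live in the index of the xts object, not in a column
index(monthly)

# Convert back to a plain data.frame with an explicit month/year label
monthly_df <- data.frame(month_year = format(index(monthly), "%Y/%m"),
                         coredata(monthly))
```

Turning the result back into a data.frame is optional; it is mainly convenient when the totals are to be passed to tools that expect the date as an ordinary column.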

0

You should use summarise_each rather than summarise. By the way, I used @DunderChief's code to generate the data. Thanks.

dat <- data.frame(date=c('7/10/2014', '7/10/2014', '7/11/2014', '8/05/2015', '9/21/2015'), 
       word1= c(1,2,1, 4, 3), word2=c(3, 10, 1, 2, 4)) 
library(dplyr) 

dat %>% 
    mutate(date = as.Date(date, format='%m/%d/%Y')) %>% 
    group_by(date) %>% 
    summarise_each(funs(sum(.))) 
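
In current dplyr releases `summarise_each()` and `funs()` are deprecated in favour of `across()`. The following is a sketch of the same idea, assuming dplyr 1.0.0 or later, extended to the month/year and year groupings the question asks for. The names `by_month` and `by_year` are illustrative only.

```r
library(dplyr)

dat <- data.frame(date = c('7/10/2014', '7/10/2014', '7/11/2014', '8/05/2015', '9/21/2015'),
                  word1 = c(1, 2, 1, 4, 3), word2 = c(3, 10, 1, 2, 4))

# Totals per month/year: build the grouping label first, then sum only the word columns
by_month <- dat %>%
  mutate(date = as.Date(date, format = "%m/%d/%Y"),
         month_year = format(date, "%Y/%m")) %>%
  group_by(month_year) %>%
  summarise(across(c(word1, word2), sum))

# Totals per year, built the same way
by_year <- dat %>%
  mutate(year = format(as.Date(date, format = "%m/%d/%Y"), "%Y")) %>%
  group_by(year) %>%
  summarise(across(c(word1, word2), sum))
```

Naming the word columns explicitly inside `across()` means any other column, such as the parsed `date`, is simply dropped by `summarise()` instead of being summed.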
+0

When I try this on my dataset, I get an error: sum not defined for "Date" objects –

+0

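@Learning_R Oh, the reason is that you have other date columns in your data, and `.` stands for every column except the grouping column. You cannot use `sum` on those Date objects – Hao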

+0

@Learning_R You should either put all the date columns into group_by, or think of a better way to organise the date columns. – Hao
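
One possible way to sidestep the extra date columns is to sum only a chosen set of word columns, which also keeps the chart readable for a large vocabulary. The sketch below is a guess at a workable layout, not the asker's actual pipeline: `toy` is a small stand-in for `mydata3` built from the first three rows shown in the question, `Note.Date` is assumed to be the date that matters, and `words`, `monthly_long` and `p` are illustrative names. It assumes dplyr (1.0.0 or later), tidyr and ggplot2 are installed.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Stand-in for mydata3: two of its date columns and two of its word columns
toy <- data.frame(
  Note.Date = c("7/10/1998", "7/21/1998", "10/21/1999"),
  LOSSDATE  = c("12/9/1997", "12/9/1997", "12/9/1997"),
  appeal    = c(0, 2, 1),
  exam      = c(2, 2, 2)
)

# Restrict to a handful of words so the plot stays legible
words <- c("appeal", "exam")

monthly_long <- toy %>%
  # Map every note date to the first day of its month (assumes Note.Date is the relevant date)
  mutate(month = as.Date(format(as.Date(Note.Date, format = "%m/%d/%Y"), "%Y-%m-01"))) %>%
  group_by(month) %>%
  # Only the chosen word columns are summed, so LOSSDATE never reaches sum()
  summarise(across(all_of(words), sum), .groups = "drop") %>%
  # One row per month and word, the shape ggplot2 expects for multiple series
  pivot_longer(all_of(words), names_to = "word", values_to = "freq")

# One line per word, frequency against month
p <- ggplot(monthly_long, aes(month, freq, colour = word)) +
  geom_line() +
  geom_point() +
  labs(x = "Month", y = "Frequency")
```

For yearly totals the same pipeline applies with `"%Y-01-01"` in place of `"%Y-%m-01"`. With several hundred words, choosing `words` from the columns with the largest overall sums is one way to decide which series to show.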
