Why does grep() not work after readLines()?

I developed a program in R to read a report that is available online. Its first two lines are:

page1 <- readLines("http://reportviewer.tce.mg.gov.br/default.aspx?server=noruega&relatorio=SICOM_Consulta/2013_2014/Modulo_AM/UC03-LeisOrc-RL&municipioSelecionado=3100203&exercicioSelecionado=2014") 
line1 <- grep("Leis Autorizativas",page1)

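The rest of the program works fine and I get the data I need. I then tried to adapt it to read a different report, but this time the second line does not work: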

page2 <- readLines("http://reportviewer.tce.mg.gov.br/default.aspx?server=noruega&relatorio=SICOM_Consulta/2013_2014/Modulo_AM/UC08-ConsultarDecretos-RL&municipioSelecionado=3101607&exercicioSelecionado=2013") 
line2 <- grep("Decretos de Alterações",page2)

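In the first case, page1 is listed as a "character vector", and in the second case page2 is listed as a "Large character vector". Could this difference be causing the problem? If so, does anyone have a hint on how to solve it?

(Using htmltab() or readHTMLTable() did not produce good results.)

Thanks.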

Asked 2017-10-08 by ViniLima

I cannot reproduce what you describe on my end. – akrun

This is because "Decretos de Alterações" is not made up entirely of ASCII characters.

If you try with
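A quick way to check which characters in a search string fall outside ASCII is to look at their code points. This is only a minimal base R sketch; the variable names search and codes are illustrative and do not come from the original post.

```r
# Search string written with \u escapes so that the script does not
# depend on the encoding of the source file (\u00e7 = ç, \u00f5 = õ).
search <- "Decretos de Altera\u00e7\u00f5es"

# Code point of every character in the string
codes <- utf8ToInt(search)

# ASCII covers code points 0 to 127; anything above is non-ASCII
has_non_ascii <- any(codes > 127)
non_ascii_codes <- codes[codes > 127]

has_non_ascii      # TRUE
non_ascii_codes    # 231 245
```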

page2 <- readLines("http://reportviewer.tce.mg.gov.br/default.aspx?server=noruega&relatorio=SICOM_Consulta/2013_2014/Modulo_AM/UC08-ConsultarDecretos-RL&municipioSelecionado=3101607&exercicioSelecionado=2013") 

grep("Decretos de Altera&#231;&#245;es ", page2) 

[1] 366

it works.

To find out which number to use in the replacement:

utf8ToInt("ç") 
[1] 231

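Then put the resulting number between &# and ; and substitute that for the non-ASCII character (for example, ç becomes &#231;).

Best,

Colin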
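The manual substitution can also be automated. The following is a sketch only: to_html_entities is a hypothetical helper, not part of base R or of the original answer, and the three-element page vector is a toy stand-in for the downloaded report. It assumes the page encodes every non-ASCII character as a decimal numeric entity, as in the example above.

```r
# Hypothetical helper: replace each non-ASCII character of a single
# string with its decimal HTML entity (e.g. ç becomes &#231;).
to_html_entities <- function(x) {
  codes <- utf8ToInt(enc2utf8(x))              # code point of each character
  chars <- intToUtf8(codes, multiple = TRUE)   # one string per character
  non_ascii <- codes > 127
  chars[non_ascii] <- paste0("&#", codes[non_ascii], ";")
  paste(chars, collapse = "")
}

# \u00e7 = ç, \u00f5 = õ (escapes avoid source-file encoding issues)
pattern <- to_html_entities("Decretos de Altera\u00e7\u00f5es")
pattern   # "Decretos de Altera&#231;&#245;es"

# Toy stand-in for the result of readLines() on the report URL
page <- c("<html>", "<td>Decretos de Altera&#231;&#245;es</td>", "</html>")

# fixed = TRUE treats the pattern as a literal string, not a regex
hit <- grep(pattern, page, fixed = TRUE)
hit       # 2
```

With the real page, the same call would be grep(pattern, page2, fixed = TRUE).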

Answered 2017-10-08 20:36:05

Great, Colin! Thank you very much. – ViniLima
