2014-09-02 94 views
0

Regex/substring to extract all matching patterns/groups

I get the following as the response to an API call:

1735 Queries 

Taking 1.001303 to 31.856310 seconds to complete 

SET timestamp=XXX; 
SELECT * FROM ABC_EM WHERE last_modified >= 'XXX' AND last_modified < 'XXX'; 

38 Queries 

Taking 1.007646 to 5.284330 seconds to complete 

SET timestamp=XXX; 
show slave status; 

6 Queries 

Taking 1.021271 to 1.959838 seconds to complete 

SET timestamp=XXX; 
SHOW SLAVE STATUS; 

2 Queries 

Taking 4.825584, 18.947725 seconds to complete 

use marketing; 
SET timestamp=XXX; 
SELECT * FROM ABC WHERE last_modified >= 'XXX' AND last_modified < 'XXX'; 

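I have extracted this from the response HTML and now hold it as a string. I need to pull the values out as concisely as possible, so that I end up with a Map of the form (query -> t1 to t2 seconds). Basically, this is the status of all slow queries running on a MySQL slave server, and I am building an alerting system. So, from the whole text as a single string, I need to separate the queries and save the time range that goes with each one. 1.001303 to 31.856310 is a time range, and the query that corresponds to it is: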

SET timestamp=XXX; SELECT * FROM ABC_EM WHERE last_modified >= 'XXX' AND last_modified < 'XXX'; 

I want to store this information in a flat Map of the form (query: String -> timeRange: String).

Another example:

("use marketing; SET timestamp=XXX; SELECT * FROM ABC WHERE last_modified >= 'XXX' AND last_modified xyz ;"->"4.825584 to 18.947725 seconds") 

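"""(.)###()###\n\n(.*)###""".r.findAllIn(reqSlowQueryData).matchData foreach { m => println("group0" + m.group(1) + "next group" + m.group(2) + m.group(3)) }

I am using the statement above to extract the repeating units so that I can work on them later, but it does not seem to work.

Thanks in advance! I know there are several ways to do this, but every approach that comes to mind is inefficient and tedious. I need to do this in Scala! Maybe I could extract the pieces recursively with the substring method?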

+0

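As with the near-duplicate you posted yesterday, it is unclear what you want. Please edit your question and format it more clearly. Then tell us what you have tried and where you are stuck, because this is not a "give me the code" site. – 2014-09-02 07:24:14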

+3

Possible duplicate of [Intelligent and quick way to parse string to get required data](http://stackoverflow.com/questions/25608460/intelligent-and-quick-way-to-parse-string-to-get-required-data) – 2014-09-02 07:24:44

Answers

1

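If you want to use Scala, try:

// Matches the time range on a "Taking ..." line, e.g. "1.001303 to 31.856310 seconds".
// The dots are escaped so that they only match a literal decimal point.
val regex = """(\d+)\.(\d+).*?(\d+)\.(\d+) seconds""".r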

    val txt = """ 
       |1735 Queries 
       | 
       |Taking 1.001303 to 31.856310 seconds to complete 
       | 
       |SET timestamp=XXX; SELECT * FROM ABC_EM WHERE last_modified >= 'XXX' AND last_modified < 'XXX'; 
       | 
       |38 Queries 
       | 
       |Taking 1.007646 to 5.284330 seconds to complete 
       | 
       |SET timestamp=XXX; show slave status; 
       | 
       |6 Queries 
       | 
       |Taking 1.021271 to 1.959838 seconds to complete 
       | 
       |SET timestamp=XXX; SHOW SLAVE STATUS; 
       | 
       |2 Queries 
       | 
       |Taking 4.825584, 18.947725 seconds to complete 
       | 
       |use marketing; SET timestamp=XXX; SELECT * FROM ABC WHERE last_modified >= 'XXX' AND last_modified < 'XXX'; 
    """.stripMargin 


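// Walks the log line by line. `taking` holds the most recently seen time range
// until the next non-empty line (the query) is paired with it.
// linesIterator is used because txt.lines resolves to java.lang.String#lines on JDK 11+.
def logToMap(txt: String): Map[String, String] = {
  val (_, map) = txt.linesIterator.foldLeft[(Option[String], Map[String, String])]((None, Map.empty)) {
    (acc, el) =>
      val (taking, soFar) = acc
      taking match {
        case Some(range) if el.trim.nonEmpty =>
          (None, soFar + (el.trim -> range)) // query line: store query -> range
        case None =>
          regex.findFirstIn(el) match { // look for a "Taking ..." line
            case Some(range) => (Some(range), soFar)
            case _           => (None, soFar)
          }
        case _ => (taking, soFar) // blank line between the range and its query
      }
  }
  map
}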
0

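A modification of ajozwik's answer that works for SQL commands spanning multiple lines:

val regex = """(\d+)\.(\d+).*?(\d+)\.(\d+) seconds""".r // extracts the time range

// Every line after a "Taking ..." line is appended to that range's entry
// until the next "N Queries" header. The resulting map is keyed by range (range -> query).
def logToMap(txt: String): Map[String, String] = {
  val (_, map) = txt.linesIterator.foldLeft[(Option[String], Map[String, String])]((None, Map.empty)) {
    (accumulator, element) =>
      val (taking, soFar) = accumulator
      taking match {
        case Some(range) if element.trim.nonEmpty =>
          if (element.contains("Queries")) (None, soFar) // header of the next block
          else (Some(range), soFar + (range -> (soFar.getOrElse(range, "") + element)))
        case None =>
          regex.findFirstIn(element) match {
            case Some(range) => (Some(range), soFar)
            case _           => (None, soFar)
          }
        case _ => (taking, soFar) // blank line
      }
  }
  println(map)
  map
}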
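
The question asks for a map of the form (query -> time range). A single regular expression over the whole report may get there more directly. The following is only a sketch under stated assumptions: the object name `SlowQueryParser`, the `parse` method and the `Block` pattern are made up for illustration, and it assumes every block looks exactly like the sample (an "N Queries" header, a "Taking t1 to t2" or "Taking t1, t2" line, then the query lines).

```scala
object SlowQueryParser {
  // Hypothetical pattern for one block of the report:
  //   "<n> Queries", then "Taking <t1> to <t2> seconds to complete"
  //   (or "<t1>, <t2>"), then the query text up to the next header or end of input.
  // (?s) lets '.' match line breaks so a query may span several lines.
  private val Block =
    """(?s)\d+ Queries\s+Taking (\d+\.\d+)(?: to|,) (\d+\.\d+) seconds to complete\s+(.*?)(?=\s*\d+ Queries|\s*\z)""".r

  // Returns query -> "t1 to t2 seconds"; multi-line queries are joined with single spaces.
  def parse(report: String): Map[String, String] =
    Block.findAllMatchIn(report).map { m =>
      val query = m.group(3).linesIterator.map(_.trim).filter(_.nonEmpty).mkString(" ")
      query -> s"${m.group(1)} to ${m.group(2)} seconds"
    }.toMap
}
```

Calling `SlowQueryParser.parse(reqSlowQueryData)` would then yield one entry per block. Two blocks with the same query text collapse into one entry, so a list of pairs may be a better fit if duplicates matter for alerting.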