2016-09-26 84 views
3

Extract docIDs and their text from a file and put them in a HashMap

I have text like this:

.I 1
.T
experimental investigation of the aerodynamics of a
wing in a slipstream .
.A
brenckman,m.
.B
j. ae. scs. 25, 1958, 324.
.W
experimental investigation of the aerodynamics of a
wing in a slipstream .
    an empirical evaluation of the destalling effects was made for
the specific configuration of the experiment .
.I 2
.T
simple shear flow past a flat plate in an incompressible fluid of small
viscosity .
.A
ting-yili
.B
department of aeronautical engineering, rensselaer polytechnic
institute
troy, n.y.
.W
simple shear flow past a flat plate in an incompressible fluid of small
viscosity .the discussion here is restricted to two-dimensional incompressible steady flow .
.I 3
.T
the boundary layer in simple shear flow past a flat plate .
.A
m. b. glauert
.B
department of mathematics, university of manchester, manchester,
england
.W
the boundary layer in simple shear flow past a flat plate .
the boundary-layer equations are presented for steady
flow with no pressure gradient .

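I need a regular expression in Java that does the following: whenever it encounters ".I 1", it should return the text that follows ".W", up to just before ".I 2".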

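OK, so the question now is: what patterns have you tried?

With Java, you need to turn on MULTILINE mode https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#MULTILINE Then something like \.I\s1.*?\.W(.*?)\.I\s2 should work (it needs some escaping). If the number after the I matters to you, you may need to add more groups. Alternatively, since the last thing you match seems to be part of the next thing you want to match, you may want to exclude it. I tend to write unit tests for this kind of thing and then tweak the regex until it works. Maybe you could post some code that shows exactly what you need?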
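A minimal sketch of the pattern suggested in this comment, with the escaping filled in. One assumption differs from the comment: the flag used here is DOTALL, because that is the flag that lets `.` match line terminators, while MULTILINE only changes how `^` and `$` behave. The class name, method name, and the shortened sample text are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CommentPatternSketch {
    // Returns the text between the ".W" marker of record 1 and the ".I 2" marker,
    // or null when the pattern does not match.
    static String firstBody(String text) {
        // DOTALL lets '.' match line terminators; the lazy quantifiers stop
        // at the first ".W" and the first ".I 2" that follow.
        Pattern p = Pattern.compile("\\.I\\s1.*?\\.W(.*?)\\.I\\s2", Pattern.DOTALL);
        Matcher m = p.matcher(text);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        // Shortened, made-up sample in the same layout as the question's text.
        String text = ".I 1\n.T\ntitle one\n.A\nauthor\n.W\nbody of one\n"
                    + ".I 2\n.T\ntitle two\n.W\nbody of two\n";
        System.out.println(firstBody(text));
    }
}
```

Because `.I\s2` is consumed by this match, a second search for record 2 would have to restart from the beginning of the text, which is the overlap the comment warns about.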

Answer

1

I think the simplest way is to find the first match with the following pattern. The lookbehind and lookahead keep the .I markers out of the match, and [\W\w] matches any character, including line breaks:

(?<=\.I\s1\s)[\W\w]+(?=\.I\s2)

You will get this result:

.T
experimental investigation of the aerodynamics of a
wing in a slipstream .
.A
brenckman,m.
.B
j. ae. scs. 25, 1958, 324.
.W
experimental investigation of the aerodynamics of a
wing in a slipstream .
    an empirical evaluation of the destalling effects was made for
the specific configuration of the experiment .

Then find the second match inside the first match with:

(?<=\.W\s)[\W\w]+

You will get this result:

experimental investigation of the aerodynamics of a
wing in a slipstream .
    an empirical evaluation of the destalling effects was made for
the specific configuration of the experiment .

In your case it could look like this:

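import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DocExtractor {
    public static void main(String[] args) {
        Map<String, String> hashMap = new HashMap<>();

        String text = " ... "; // your text here
        int docCount = 3;      // number of ".I n" records in the text

        // Second step: everything after the ".W" marker inside one record.
        Pattern bodyPattern = Pattern.compile("(?<=\\.W\\s)[\\W\\w]+");

        for (int i = 1; i <= docCount; i++) {
            // First step: the record between ".I i" and the next ".I" marker,
            // or the end of the input for the last record.
            String end = (i == docCount) ? "$" : "\\.I\\s" + (i + 1);
            Pattern recordPattern =
                    Pattern.compile("(?<=\\.I\\s" + i + "\\s)[\\W\\w]+(?=" + end + ")");

            Matcher record = recordPattern.matcher(text);
            if (record.find()) {
                Matcher body = bodyPattern.matcher(record.group(0));
                if (body.find()) {
                    hashMap.put(".I " + i, body.group(0));
                }
            }
        }

        // HashMap does not keep insertion order, so the records may print out of order.
        for (Map.Entry<String, String> item : hashMap.entrySet()) {
            System.out.println(item.getKey());
            System.out.println(item.getValue());
            System.out.println();
        }
    }
}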

Result:

.I 2 
simple shear flow past a flat plate in an incompressible fluid of small 
viscosity .the discussion here is restricted to two-dimensional incompressible steady flow . 


.I 1 
experimental investigation of the aerodynamics of a 
wing in a slipstream . 
    an empirical evaluation of the destalling effects was made for 
the specific configuration of the experiment . 


.I 3 
the boundary layer in simple shear flow past a flat plate . 
the boundary-layer equations are presented for steady 
flow with no pressure gradient . 
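
The code above assumes exactly three records, and the HashMap prints them in no particular order. As a possible alternative, not part of the original answer, the sketch below reads the record number from the text with a single pattern and stores the bodies in a LinkedHashMap so they come back in file order. It assumes every marker (.I n, .W) sits on its own line; the class name, method name, and sample text are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SinglePassExtractor {
    // One record: a ".I <id>" line, anything up to the first ".W" line, then the
    // body up to (but not including) the next ".I <id>" line or the end of input.
    // MULTILINE makes ^ and $ work per line; DOTALL lets '.' cross line breaks.
    private static final Pattern RECORD = Pattern.compile(
            "^\\.I\\s+(\\d+)\\s*$.*?^\\.W\\s*$(.*?)(?=^\\.I\\s+\\d+\\s*$|\\z)",
            Pattern.MULTILINE | Pattern.DOTALL);

    static Map<String, String> extract(String text) {
        Map<String, String> docs = new LinkedHashMap<>(); // keeps file order
        Matcher m = RECORD.matcher(text);
        while (m.find()) {
            // group(1) is the record number, group(2) the text after ".W".
            docs.put(".I " + m.group(1), m.group(2).trim());
        }
        return docs;
    }

    public static void main(String[] args) {
        // Shortened, made-up sample in the same layout as the question's text.
        String text = ".I 1\n.T\ntitle one\n.W\nbody one\nmore\n"
                    + ".I 2\n.T\ntitle two\n.W\nbody two\n";
        for (Map.Entry<String, String> e : extract(text).entrySet()) {
            System.out.println(e.getKey());
            System.out.println(e.getValue());
            System.out.println();
        }
    }
}
```

The lookahead leaves the next .I line unconsumed, so each call to find() starts exactly at the following record and no record count has to be known in advance.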

Thank you, Alexey. It works. – user3701435

You're welcome!