How to split file input into sections in Java

I need to separate each rule in the file below. How can I do that in Java?

Here is the content of the file:

rule apt_regin_2011_32bit_stage1 { 
meta: 
copyright = "Kaspersky Lab" 
description = "Rule to detect Regin 32 bit stage 1 loaders" 
version = "1.0" 
last_modified = "2014-11-18" 
strings: 
$key1={331015EA261D38A7} 
$key2={9145A98BA37617DE} 
$key3={EF745F23AA67243D} 
$mz="MZ" 
condition: 
($mz at 0) and any of ($key*) and filesize < 300000 
} 


rule apt_regin_rc5key { 
meta: 
copyright = "Kaspersky Lab" 
description = "Rule to detect Regin RC5 decryption keys" 
version = "1.0" 
last_modified = "2014-11-18" 
strings: 
$key1={73 23 1F 43 93 E1 9F 2F 99 0C 17 81 5C FF B4 01} 
$key2={10 19 53 2A 11 ED A3 74 3F C3 72 3F 9D 94 3D 78} 
condition: 
any of ($key*) 
} 



rule apt_regin_vfs { 
meta: 
copyright = "Kaspersky Lab" 
description = "Rule to detect Regin VFSes" 
version = "1.0" 
last_modified = "2014-11-18" 
strings: 
$a1={00 02 00 08 00 08 03 F6 D7 F3 52} 
$a2={00 10 F0 FF F0 FF 11 C7 7F E8 52} 
$a3={00 04 00 10 00 10 03 C2 D3 1C 93} 
$a4={00 04 00 10 C8 00 04 C8 93 06 D8} 
condition: 
($a1 at 0) or ($a2 at 0) or ($a3 at 0) or ($a4 at 0) 
} 


rule apt_regin_dispatcher_disp_dll { 
meta: 
copyright = "Kaspersky Lab" 
description = "Rule to detect Regin disp.dll dispatcher" 
version = "1.0" 
last_modified = "2014-11-18" 
strings: 
$mz="MZ" 
$string1="shit" 
$string2="disp.dll" 
$string3="255.255.255.255" 
$string4="StackWalk64" 
$string5="imagehlp.dll" 
condition: 
($mz at 0) and (all of ($string*)) 
}

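As you can see in the file, I need to separate each of the 4 rules found in the file input. Any idea how I can do this? Please bear with me, I am a beginner. Thanks in advance!

After separating all 4 rules, I need to put each rule into an ArrayList.

For example: ArrayList[0]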

rule apt_regin_2011_32bit_stage1 { 
meta: 
copyright = "Kaspersky Lab" 
description = "Rule to detect Regin 32 bit stage 1 loaders" 
version = "1.0" 
last_modified = "2014-11-18" 
strings: 
$key1={331015EA261D38A7} 
$key2={9145A98BA37617DE} 
$key3={EF745F23AA67243D} 
$mz="MZ" 
condition: 
($mz at 0) and any of ($key*) and filesize < 300000 
}

ArrayList[1]

rule apt_regin_rc5key { 
meta: 
copyright = "Kaspersky Lab" 
description = "Rule to detect Regin RC5 decryption keys" 
version = "1.0" 
last_modified = "2014-11-18" 
strings: 
$key1={73 23 1F 43 93 E1 9F 2F 99 0C 17 81 5C FF B4 01} 
$key2={10 19 53 2A 11 ED A3 74 3F C3 72 3F 9D 94 3D 78} 
condition: 
any of ($key*) 
}

ArrayList[2]

rule apt_regin_vfs { 
meta: 
copyright = "Kaspersky Lab" 
description = "Rule to detect Regin VFSes" 
version = "1.0" 
last_modified = "2014-11-18" 
strings: 
$a1={00 02 00 08 00 08 03 F6 D7 F3 52} 
$a2={00 10 F0 FF F0 FF 11 C7 7F E8 52} 
$a3={00 04 00 10 00 10 03 C2 D3 1C 93} 
$a4={00 04 00 10 C8 00 04 C8 93 06 D8} 
condition: 
($a1 at 0) or ($a2 at 0) or ($a3 at 0) or ($a4 at 0) 
}

And so on.

How do I do this?

2016-09-20 Shawn

Check out `String.split("regex")` (http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#split(java.lang.String)) and look up a basic tutorial on regular expressions. They are very powerful and useful. – qxz
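As a rough sketch of the String.split approach suggested in this comment: the whole file content is held in one String and split in front of every line that starts with "rule ". The class name RuleSplitExample and the method splitRules are made up for illustration, and the sketch assumes that no other line in the file begins with the word "rule".

```java
import java.util.ArrayList;
import java.util.List;

class RuleSplitExample {

    // Splits the text in front of every line that starts with "rule ".
    // (?m) lets ^ match at the start of each line; (?=rule\s) is a lookahead,
    // so the word "rule" is not consumed and stays in each piece.
    static List<String> splitRules(String content) {
        List<String> rules = new ArrayList<>();
        for (String part : content.split("(?m)^(?=rule\\s)")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                // skip empty pieces, e.g. blank lines before the first rule
                rules.add(trimmed);
            }
        }
        return rules;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the real file content.
        String content =
              "rule first {\n"
            + "condition:\n"
            + "true\n"
            + "}\n"
            + "\n"
            + "rule second {\n"
            + "condition:\n"
            + "false\n"
            + "}\n";
        List<String> rules = splitRules(content);
        for (int i = 0; i < rules.size(); i++) {
            System.out.println("ArrayList[" + i + "]");
            System.out.println(rules.get(i));
        }
    }
}
```

For a real file, the content could be loaded first with something like new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8), where path is whatever location your rules file has.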

Just for the record: if your problem is only to split your input into sections, one per "rule", then it goes like this:

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;

// 'file' is the rules file (a File or a path String);
// the enclosing method has to catch or declare IOException
List<List<String>> sections = new ArrayList<>();
List<String> currentSection = null;

try (BufferedReader br = new BufferedReader(new FileReader(file))) {
    String line;
    while ((line = br.readLine()) != null) {
        if (line.startsWith("rule ")) {
            if (currentSection != null) {
                // we are finished with the previous section!
                sections.add(currentSection);
            }
            currentSection = new ArrayList<>();
            currentSection.add(line);
        } else if (currentSection != null && !line.trim().isEmpty()) {
            // any non-empty line goes into the current section;
            // lines before the first "rule" are skipped
            currentSection.add(line);
        }
    }
} // end of try
if (currentSection != null) {
    // make sure to add the final section, too!
    sections.add(currentSection);
}
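Since the question asks for each rule as a single ArrayList element, here is a small variation on that loop, as a sketch only: it takes the lines of the file and builds one String per rule instead of a list of lines. The names RuleGrouper and groupRules are hypothetical, and it again assumes that every rule starts on a line beginning with "rule ".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RuleGrouper {

    // Groups lines into one String per rule.
    // Assumption: a new rule begins on every line that starts with "rule ".
    static List<String> groupRules(List<String> lines) {
        List<String> rules = new ArrayList<>();
        StringBuilder current = null;
        for (String line : lines) {
            if (line.startsWith("rule ")) {
                if (current != null) {
                    // the previous rule is complete
                    rules.add(current.toString());
                }
                current = new StringBuilder(line.trim());
            } else if (current != null && !line.trim().isEmpty()) {
                // non-empty lines belong to the rule being built;
                // trailing whitespace is trimmed off each line
                current.append('\n').append(line.trim());
            }
        }
        if (current != null) {
            // do not forget the last rule
            rules.add(current.toString());
        }
        return rules;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the lines of the real file.
        List<String> lines = Arrays.asList(
            "rule a { ", "condition: ", "true ", "} ",
            "", "",
            "rule b { ", "condition: ", "false ", "}");
        List<String> rules = groupRules(lines);
        System.out.println(rules.size());
        System.out.println(rules.get(0));
    }
}
```

The lines themselves could come from Files.readAllLines(Paths.get(path)), with path being the location of your rules file.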

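But then: you are not being very precise about your real requirements. I am pretty sure your real problem is not about "splitting" that input file.

Most likely, your actual task is to read that file, and for each section in it, you need to get at some or all of its content for further processing.

In other words: you are actually asking "how do I parse/process this input?". And we can't answer that, because you didn't tell us what you want to do with that data.

In essence, this is your space of options: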

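If there really is such a fixed layout, then "parsing" boils down to knowing "first comes rule, then meta, which looks like ...". Meaning: you "hard-code" the structure of the data into your code. For example: you simply "know" that the third line contains copyright = "some value". Then you start using regular expressions (or simple string methods such as indexOf(), substring()) to extract the information you are interested in.
If the file format is actually some kind of "standard" (such as XML, JSON, YAML, ...), then you might be able to just pick up a third-party library to parse such files. For your example ... I can't say; this is definitely not a format I am familiar with.
In the worst case, you need to write your own parser. Writing parsers is a complex but well-researched topic; see here, for example.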
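To illustrate the first option (a fixed layout plus regular expressions), here is a minimal sketch that pulls the rule name and the key = "value" lines out of one rule held in a String. The class RuleFieldExample and its methods are invented for this example, and the two patterns are assumptions based only on the sample rules in the question, not on the full syntax of the rule format.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RuleFieldExample {

    // Rule header such as "rule apt_regin_vfs {"; group 1 is the name.
    private static final Pattern NAME =
        Pattern.compile("(?m)^rule\\s+(\\w+)");

    // Lines of the form: key = "value". Lines such as $mz="MZ" do not match,
    // because '$' is not a word character.
    private static final Pattern FIELD =
        Pattern.compile("(?m)^\\s*(\\w+)\\s*=\\s*\"([^\"]*)\"");

    // Returns the rule name, or null if no header line is found.
    static String ruleName(String rule) {
        Matcher m = NAME.matcher(rule);
        return m.find() ? m.group(1) : null;
    }

    // Returns all key = "value" pairs in the order they appear.
    static Map<String, String> metaFields(String rule) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(rule);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        String rule =
              "rule apt_regin_rc5key { \n"
            + "meta: \n"
            + "copyright = \"Kaspersky Lab\" \n"
            + "version = \"1.0\" \n"
            + "strings: \n"
            + "$mz=\"MZ\" \n"
            + "condition: \n"
            + "any of ($key*) \n"
            + "}";
        System.out.println(ruleName(rule));
        System.out.println(metaFields(rule).get("copyright"));
    }
}
```

This only works as long as the files keep the layout seen in the question; anything more varied is where a real parser comes in.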

2016-09-20 03:01:13 GhostCat

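Hello. Thanks for your response. I edited the question with the case I ultimately need. Can you show me how to add each separated rule to an ArrayList? – Shawn

Please see my updated answer. I provided some code to give you an idea of how to do this. Please note: this code was not compiled or tested; don't blindly copy/paste it. Read it line by line until you understand what it is supposed to do; then adapt your own code accordingly! – GhostCat

Awesome! You are really good at Java. Upvoted your solution. – Shawn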
