Regex to extract text between double quotes and newlines

I want to parse a Python file, take the text between triple double quotes, and generate an HTML table from that text. For example, given a docstring like this:

""" 
Replaces greater than operator ('>') with 'NOT BETWEEN 0 AND #' 
Replaces equals operator ('=') with 'BETWEEN # AND #' 

Tested against: 
    * Microsoft SQL Server 2005 
    * MySQL 4, 5.0 and 5.5 
    * Oracle 10g 
    * PostgreSQL 8.3, 8.4, 9.0 

Requirement: 
    * Microsoft Access 

Notes: 
    * Useful to bypass weak and bespoke web application firewalls that 
     filter the greater than character 
    * The BETWEEN clause is SQL standard. Hence, this tamper script 
     should work against all (?) databases 

>>> tamper('1 AND A > B--') 
'1 AND A NOT BETWEEN 0 AND B--' 
>>> tamper('1 AND A = B--') 
'1 AND A BETWEEN B AND B--' 
"""

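The HTML table must be a simple table with 5 columns:

Column 1: everything between """ and \n if new line is empty
Column 2: everything between Tested against: and \n if new line is empty, or between Requirement: and \n if new line is empty
Column 3: everything between Notes: and \n if new line is empty
Column 4: everything between >>> and \n
Column 5: everything between the end of column 4 and \n

The result must be: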

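Replaces greater than operator ('>') with 'NOT BETWEEN 0 AND #' Replaces equals operator ('=') with 'BETWEEN # AND #'
- Microsoft SQL Server 2005
  - MySQL 4, 5.0 and 5.5
  - Oracle 10g
  - PostgreSQL 8.3, 8.4, 9.0
  or
  - Microsoft Access
- Useful to bypass weak and bespoke web application firewalls that filter the greater than character
- The BETWEEN clause is SQL standard. Hence, this tamper script should work against all (?) databases
tamper('1 AND A > B--') tamper('1 AND A = B--')
'1 AND A NOT BETWEEN 0 AND B--' '1 AND A BETWEEN B AND B--'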

What regex syntax can I use for the extraction? I will be using VBScript.RegExp.

Set fso = CreateObject("Scripting.FileSystemObject") 
txt = fso.OpenTextFile("C:\path\to\your.py").ReadAll 

Set re = New RegExp 
re.Pattern = """([^""]*)""" 
re.Global = True 

For Each m In re.Execute(txt) 
    WScript.Echo m.SubMatches(0) 
Next

2017-04-16 Senior Pomidor

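Your question is rather broad, so I'm only going to outline one approach for dealing with it. Otherwise I'd have to write the entire script for you, which is not going to happen.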

Extract everything between the docquotes. Use a regular expression like this to extract the text between them:
```
Set re1 = New RegExp 
re1.Pattern = """""""([\s\S]*?)""""""" 

For Each m In re1.Execute(txt) 
    docstr = m.SubMatches(0) 
Next 
```
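Note that you need to set re1.Global to True if you have more than one docstring in your file and want all of them processed. Otherwise you'll only get the first match.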
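
As a minimal sketch of that note, the loop below processes every docstring in the file. It assumes `txt` already holds the file contents (read as in the question's code) and that all docstrings use triple double quotes; docstrings written with triple single quotes would not match this pattern.

```vbscript
' Sketch: handle every docstring in the file, not only the first one.
' Assumes txt was filled via fso.OpenTextFile(...).ReadAll as in the question.
Set re1 = New RegExp
re1.Pattern = """""""([\s\S]*?)"""""""   ' three quotes, lazy body, three quotes
re1.Global = True                         ' without this, Execute stops at the first match

For Each m In re1.Execute(txt)
    docstr = m.SubMatches(0)
    ' process each docstring here (trim it, split it into sections, ...)
    WScript.Echo docstr
Next
```
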
Remove leading and trailing whitespace with a second regular expression:
```
Set re2 = New RegExp 
re2.Pattern = "^\s*|\s*$" 
re2.Global = True 'find all matches 

docstr = re2.Replace(docstr, "") 
```
You can't use Trim for this, because that function only handles spaces, not other whitespace such as line breaks.

Either split the string at 2+ consecutive line breaks to get the documentation sections, or use another regular expression to extract them:

Set re3 = New RegExp 
re3.Pattern = "([\s\S]*?)\r\n\r\n" & _
       "Tested against:\r\n([\s\S]*?)\r\n\r\n" & _
       ... 

For Each m In re3.Execute(docstr) 
    descr = m.SubMatches(0) 
    tested = m.SubMatches(1) 
    ... 
Next
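
The first option mentioned above, splitting at blank lines, could be sketched as follows. VBScript's `Split` only accepts a literal delimiter, so this sketch first collapses each run of blank lines into a single marker character and then splits on that marker. `SplitSections` is a hypothetical helper name, and `Chr(1)` is used as the marker on the assumption that this control character never occurs in the source file.

```vbscript
' Hypothetical helper: split a trimmed docstring into sections at blank lines.
Function SplitSections(ByVal docstr)
    Dim reBreak, marker
    marker = Chr(1)   ' assumption: this character never appears in the text

    Set reBreak = New RegExp
    ' one line break followed by one or more lines that hold only spaces or tabs
    reBreak.Pattern = "\r?\n(?:[ \t]*\r?\n)+"
    reBreak.Global = True

    SplitSections = Split(reBreak.Replace(docstr, marker), marker)
End Function

' Usage: docstr is the trimmed docstring from the previous step.
sections = SplitSections(docstr)
For i = 0 To UBound(sections)
    WScript.Echo "[" & i & "] " & sections(i)
Next
```

Splitting this way yields one array element per blank-line-separated block, so blocks such as "Tested against:" and "Requirement:" still have to be merged or broken down further to match the desired columns.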

Continue breaking down the sections until you have the elements you want to display. Then build the HTML from those elements.
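
The final step could look roughly like the sketch below. `HtmlEncode` and `BuildRow` are hypothetical helper names, and the five sample strings merely stand in for whatever the earlier steps extracted. Escaping matters here because the docstring text itself contains `>` characters.

```vbscript
' Hypothetical helper: escape text for HTML and keep its line breaks visible.
Function HtmlEncode(ByVal s)
    s = Replace(s, "&", "&amp;")   ' must run first so later entities stay intact
    s = Replace(s, "<", "&lt;")
    s = Replace(s, ">", "&gt;")
    s = Replace(s, vbCrLf, vbLf)   ' normalize line endings
    HtmlEncode = Replace(s, vbLf, "<br>")
End Function

' Hypothetical helper: wrap each element of an array in a table cell.
Function BuildRow(cells)
    Dim html, cell
    html = "<tr>"
    For Each cell In cells
        html = html & "<td>" & HtmlEncode(cell) & "</td>"
    Next
    BuildRow = html & "</tr>"
End Function

' Sample values standing in for the extracted elements.
descr   = "Replaces greater than operator ('>') with 'NOT BETWEEN 0 AND #'"
tested  = "Microsoft SQL Server 2005" & vbCrLf & "Oracle 10g"
notes   = "The BETWEEN clause is SQL standard."
calls   = "tamper('1 AND A > B--')"
results = "'1 AND A NOT BETWEEN 0 AND B--'"

WScript.Echo "<table>" & BuildRow(Array(descr, tested, notes, calls, results)) & "</table>"
```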

2017-04-16 17:41:47

Thank you very much. –

Excellent explanation. – Lankymart
