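I already posted a similar question about extracting text with regular expressions in Python, but I have another question about non-greedy quantifiers, so I am asking it with a different example. The problem is that I need to use regular expressions in Python to extract every relevant portion of a text string that lies between two specific matches. Specifically, here is an example text: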
example = """
The Bank does offer a hybrid loan. Hybrid loans are loans that start as a
fixed rate mortgage but after a set number of years automatically adjust
to an adjustable rate mortgage. The Bank offers a three year fixed rate mortgage
after which the interest rate will adjust annually. Item 1. Business 3-13 Item 1a.
Risk Factors 13-15 Item 1b. Unresolved Staff Comments 15 Item 2. Properties 15-16
The forward-looking statements are made as of the date of this report,
and the Company assumes no obligation to update the forward-looking statements
or to update the reasons why actual results could differ from those projected
in the forward-looking statements. PART 1. ITEM 1. BUSINESS
General Farmers & Merchants Bancorp, Inc. (Company) is a bank holding company
incorporated under the laws of Ohio in 1985 and elected to become a financial
holding company under the Federal Reserve in 2014. Our primary subsidiary,
The Farmers & Merchants State Bank (Bank) is a\n community bank operating
in Northwest Ohio since 1897.ITEM 2. PROPERTIES Our principal office is located in Archbold, Ohio.
The Bank operates from the facilities at 307 North Defiance Street.
In addition, the Bank owns the property from 200 to 208 Ditto Street,
Archbold, Ohio, which it uses for Bank parking and a community mini-park area.
"""
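I want to extract each portion of the text that starts at a match of "Item 1." and ends just before a match of "Item 2.", so the final result should look like this: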
final_result_1 = """
ITEM 1. BUSINESS
General Farmers & Merchants Bancorp, Inc. (Company) is a bank holding company
incorporated under the laws of Ohio in 1985 and elected to become a financial
holding company under the Federal Reserve in 2014. Our primary subsidiary,
The Farmers & Merchants State Bank (Bank) is a\n community bank operating
in Northwest Ohio since 1897.
"""
final_result_2 = """
Item 1. Business 3-13 Item 1a.
Risk Factors 13-15 Item 1b. Unresolved Staff Comments 15
"""
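The final results should be ordered by text length, so 'final_result_1' is the longer of the two portions and 'final_result_2' is the shorter. You can refer to the answer to my previous question here. Thanks in advance!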
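I'd love to help, but this question is very confusing. Could you create some short sample text and explain what output you want? –
@krcoder, you need to exclude "ITEM 2" from the text, right? –
@code_byter, that's right, and 'Item 2' is excluded from 'final_result_2' as well. – krcoder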
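One possible approach, given the clarification that "Item 2" itself should be excluded: combine a non-greedy `.*?` with a lookahead for the end marker, match case-insensitively, and sort the matches by length. This is only a minimal sketch. The `text` string is a shortened stand-in for the example above, and `extract_sections` is a hypothetical helper name, not something from the previous question's answer.

```python
import re

# Shortened stand-in for the example text in the question (an assumption,
# used only to keep this sketch self-contained).
text = (
    "Hybrid loans adjust annually. Item 1. Business 3-13 Item 1a. "
    "Risk Factors 13-15 Item 1b. Unresolved Staff Comments 15 Item 2. Properties 15-16 "
    "Forward-looking statements. PART 1. ITEM 1. BUSINESS General Farmers & Merchants "
    "Bancorp, Inc. (Company) is a bank holding company operating in Northwest Ohio "
    "since 1897.ITEM 2. PROPERTIES Our principal office is located in Archbold, Ohio."
)

# "item 1." starts a section; the non-greedy .*? stops at the first following
# "item 2.", and the lookahead (?=...) keeps that end marker out of the match.
# DOTALL lets "." cross line breaks; IGNORECASE covers both "Item" and "ITEM".
pattern = re.compile(r"item\s+1\..*?(?=item\s+2\.)", re.IGNORECASE | re.DOTALL)


def extract_sections(s):
    """Return every 'Item 1.' ... 'Item 2.' section, longest first."""
    matches = [m.group(0).strip() for m in pattern.finditer(s)]
    return sorted(matches, key=len, reverse=True)


results = extract_sections(text)
for section in results:
    print(repr(section))
```

Because the lookahead consumes nothing, each search resumes right at "Item 2.", so a later "ITEM 1." is still found. Note that "Item 1a." and "Item 1b." do not start a match of their own, since the pattern requires a period directly after the "1".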