2012-04-02 49 views
10

PHP preg_match Bible scripture format

I'm struggling to build a regular expression that parses strings like these (Bible verses):

    'John 14:16–17, 25–26'
    'John 14:16–17'
    'John 14:16'
    'John 14'
    'John'

So the basic pattern is:

Book [[Chapter][:Verse]]

where the chapter and verse are optional.

+0

So it should match even if it is just the book name? Do you have a list of books it should match? Otherwise it will simply match every word. – JJJ 2012-04-02 09:36:41

+0

Matching any word is fine; the real problem is that I have so many optional parts. – Dziamid 2012-04-02 09:41:01

Answers

4

Try this:

\b[a-zA-Z]+(?:\s+\d+)?(?::\d+(?:–\d+)?(?:,\s*\d+(?:–\d+)?)*)? 

See and test it here on Regexr.

Because of the (?:,\s*\d+(?:–\d+)?)* at the end, the reference can finish with a list of verses or verse ranges.
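A minimal PHP sketch of how this pattern could be tried against the five strings from the question. The `^...$` anchors, the `u` (UTF-8) modifier and the variable names are assumptions added for illustration; they are not part of the answer.

```php
<?php
// Pattern from the answer above, wrapped in ^...$ so the whole string must match.
// The anchors and the u modifier are additions made for this sketch.
$pattern = '/^\b[a-zA-Z]+(?:\s+\d+)?(?::\d+(?:–\d+)?(?:,\s*\d+(?:–\d+)?)*)?$/u';

$references = [
    'John 14:16–17, 25–26',
    'John 14:16–17',
    'John 14:16',
    'John 14',
    'John',
];

foreach ($references as $reference) {
    $matched = preg_match($pattern, $reference) === 1;
    echo $reference, ' => ', $matched ? 'match' : 'no match', PHP_EOL;
}
```

As written, the pattern only accepts the en dash (–) in ranges, so a range typed with a plain hyphen, such as 'John 14:16-17', would not pass the anchored check.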

+0

Yours is the most general one. I only used '[-–]' instead of the dash, as @Robby suggested, and added some capturing parentheses to make it perfect. – Dziamid 2012-04-02 09:59:52

3

Use this expression:

[A-Za-z]+( ([0-9]+)(:[0-9]+)?([\-–][0-9]+)?(, [0-9]+[\-–][0-9]+)?)? 

Or in its 'prettier' version:

\w+( (\d+)(:\d+)?([\-–]\d+)?(, \d+[\-–]\d+)?)? 

UPDATE: changed to match either a dash or a hyphen.

NOTE: I tested it, and it matches all 5 possible versions.

Example: http://regexr.com?30h4q


9

I think this does what you need:

\w+\s?(\d{1,2})?(:\d{1,2})?([-–]\d{1,2})?(,\s\d{1,2}[-–]\d{1,2})? 

Assumptions:

  • The numbers are always 1 or 2 digits long
  • The dash can be either of the following: - or –
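The first assumption matters for longer chapter or verse numbers. A small sketch of what happens when it does not hold; the sample string, the `u` modifier and the widened `$wider` pattern are assumptions chosen for illustration, not part of the answer.

```php
<?php
// Pattern from the answer, with the u modifier added for UTF-8 input (an assumption).
$pattern = '/\w+\s?(\d{1,2})?(:\d{1,2})?([-–]\d{1,2})?(,\s\d{1,2}[-–]\d{1,2})?/u';

// A three-digit chapter breaks the "1 or 2 digits" assumption:
// the match stops after the first two digits.
preg_match($pattern, 'Psalm 119:105', $m);
echo $m[0], PHP_EOL; // prints "Psalm 11"

// Hypothetical variant: allow up to three digits everywhere.
$wider = '/\w+\s?(\d{1,3})?(:\d{1,3})?([-–]\d{1,3})?(,\s\d{1,3}[-–]\d{1,3})?/u';
preg_match($wider, 'Psalm 119:105', $w);
echo $w[0], PHP_EOL; // prints "Psalm 119:105"
```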

Here is the regular expression with comments:

" 
\w   # Match a single character that is a “word character” (letters, digits, and underscores) 
    +   # Between one and unlimited times, as many times as possible, giving back as needed (greedy) 
\s   # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) 
    ?   # Between zero and one times, as many times as possible, giving back as needed (greedy) 
(   # Match the regular expression below and capture its match into backreference number 1 
    \d   # Match a single digit 0..9 
     {1,2}  # Between one and 2 times, as many times as possible, giving back as needed (greedy) 
)?   # Between zero and one times, as many times as possible, giving back as needed (greedy) 
(   # Match the regular expression below and capture its match into backreference number 2 
    :   # Match the character “:” literally 
    \d   # Match a single digit 0..9 
     {1,2}  # Between one and 2 times, as many times as possible, giving back as needed (greedy) 
)?   # Between zero and one times, as many times as possible, giving back as needed (greedy) 
(   # Match the regular expression below and capture its match into backreference number 3 
    [-–]  # Match a single character present in the list “-–” 
    \d   # Match a single digit 0..9 
     {1,2}  # Between one and 2 times, as many times as possible, giving back as needed (greedy) 
)?   # Between zero and one times, as many times as possible, giving back as needed (greedy) 
(   # Match the regular expression below and capture its match into backreference number 4 
    ,   # Match the character “,” literally 
    \s   # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) 
    \d   # Match a single digit 0..9 
     {1,2}  # Between one and 2 times, as many times as possible, giving back as needed (greedy) 
    [-–]  # Match a single character present in the list “-–” 
    \d   # Match a single digit 0..9 
     {1,2}  # Between one and 2 times, as many times as possible, giving back as needed (greedy) 
)?   # Between zero and one times, as many times as possible, giving back as needed (greedy) 
" 

Here are some usage examples in PHP:

# The u modifier treats pattern and subject as UTF-8, so the en dash in [-–] is one character
if (preg_match('/\w+\s?(\d{1,2})?(:\d{1,2})?([-–]\d{1,2})?(,\s\d{1,2}[-–]\d{1,2})?/u', $subject)) { 
    # Successful match 
} else { 
    # Match attempt failed 
} 

Get an array of all matches in a given string:

preg_match_all('/\w+\s?(\d{1,2})?(:\d{1,2})?([-–]\d{1,2})?(,\s\d{1,2}[-–]\d{1,2})?/u', $subject, $result, PREG_PATTERN_ORDER); 
$result = $result[0]; # keep only the full matches
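To read the individual parts rather than just the full match, the capture groups can be inspected. A sketch, assuming UTF-8 input (hence the `u` modifier); `$subject`, `$m` and the labels are illustrative names only.

```php
<?php
$pattern = '/\w+\s?(\d{1,2})?(:\d{1,2})?([-–]\d{1,2})?(,\s\d{1,2}[-–]\d{1,2})?/u';
$subject = 'John 14:16–17, 25–26';

if (preg_match($pattern, $subject, $m) === 1) {
    // $m[0] is the full match; $m[1]..$m[4] are the optional groups.
    // Trailing groups that did not take part in the match are absent from $m,
    // so fall back to an empty string with ??.
    $parts = [
        'chapter'      => $m[1] ?? '',
        'verse'        => $m[2] ?? '',
        'range end'    => $m[3] ?? '',
        'second range' => $m[4] ?? '',
    ];
    foreach ($parts as $label => $value) {
        echo $label, ': "', $value, '"', PHP_EOL;
    }
}
```

Note that groups 2 to 4 include their leading punctuation (':', the dash, ', '), so it would need stripping before the numbers are used.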
+0

So it will match either a dash or a hyphen? – Dziamid 2012-04-02 09:47:11

+0

Yes. Is that correct? – Robbie 2012-04-02 09:48:26

+0

+1 for this, thanks – Dziamid 2012-04-02 10:22:18

0
([1|2|3]?([i|I]+)?(\s?)\w+(\s+?))((\d+)?(,?)(\s?)(\d+))+(:?)((\d+)?([\-–]\d+)?(,(\s?)\d+[\-–]\d+)?)? 

Works for almost every book...

0
(\b[a-zA-Z]\w+\s\d+)(:\d+)+([-–]\d+)?([,;](\s)?(\d+:)?\d+([-–]\d+)?)? 

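This is a mix of all the code presented here. The only formats it will not highlight are "book name only" and "book & chapter only" (just add ":1-all" after the chapter number). I found that the other code provided here accepts too many variations that do not conform to Bible verse syntax.

These are the examples I tested in RegExr: (I can't post images yet)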

John Humboldt 14:16-17,25-26
John 14: 16-17
John 14:16
John 77:3; 2:9-11
John 5:1-all Brad 555-783-6867
John 6
Hello how are you
Ezra 32:5 John 14 :16-17,25-36
23:34
John 14:16-17,25-36
John 14:16-17; 32:25
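A short PHP sketch of how this last pattern could be used to pull references out of running text. The sample sentence, the `u` modifier and the variable names are assumptions for illustration, not taken from the answer.

```php
<?php
// Pattern from the answer above; the u modifier is added here for UTF-8 input.
$pattern = '/(\b[a-zA-Z]\w+\s\d+)(:\d+)+([-–]\d+)?([,;](\s)?(\d+:)?\d+([-–]\d+)?)?/u';

// Hypothetical input mixing plain words with references.
$text = 'Hello how are you John 14:16-17; 32:25 and John 6';

preg_match_all($pattern, $text, $matches);

// $matches[0] holds the full matches.
foreach ($matches[0] as $reference) {
    echo $reference, PHP_EOL;
}
```

Here only 'John 14:16-17; 32:25' is reported. 'John 6' is skipped because the pattern requires at least one ':verse' part, which is the limitation the answer describes.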