2014-12-13
1

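phpbb BBCode to HTML (regex or other)

I am migrating content from phpBB to WordPress, and I have gotten as far as converting the BBCode to HTML.

The BBCode is complicated by an alphanumeric string that is injected into each tag.

A typical post will contain text like this...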

[url=url] Click here [/url:583ow9wo] 

[b:583ow9wo] BOLD [/b:583ow9wo] 

[img:583ow9wo] jpg [/img:583ow9wo] 

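I am a novice with regular expressions, but I think they may be the way forward, since I found some help in the following post https://stackoverflow.com/a/5505874/4356865 (which uses the regex [/?b:\d{5}]). However, the regex in that instance would only remove the numeric characters in this example.

Any help is appreciated.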

+0

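Turn on the newline modifier for your regex. I'm rubbish at writing my own regexes from scratch, but http://txt2re.com/ is a great help for starting to work out the right expression to use. – woubuc 2014-12-13 14:05:20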

+2

Can't you use one of the gazillions of bbcode-to-html converters floating around the net? – Wintermute 2014-12-13 14:43:59

Answers

1

Something like this will work for tags without attributes:

\[(b|i|u)(:[a-z0-9]+)?\](.*?)\[\/\1(?:\2)?\] 

\[    -- matches literal "[" 
    (b|i|u)  -- matches b, i, or u, captures as backreference 1 
    (:[a-z0-9]+)? -- matches colon and then alphanumeric string, captures as backreference 2 
       -- the question mark allows the :string not to be present. 
\]    -- matches literal "]" 
(.*?)   -- matches anything*, as few times as required to finish the match, creates backreference 3. 
\[    -- matches literal "[" 
    \/    -- matches literal "/" 
    \1    -- invokes backreference 1 to make sure the opening/closing tags match 
    (?:\2)?  -- invokes backreference 2 to further make sure it's the same tag 
\]    -- matches literal "]" 

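Matching tags that have attributes, like URL, is easy enough. Tags with attributes also do different things with those attributes, so it is probably easier to handle a tag like URL separately from a tag like IMG.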

\[(url)(?:\s*=\s*(.*?))?(:[a-z0-9]+)\](.*?)\[\/\1(?:\3)?\] 

\[     -- matches literal "[" 
    (url)    -- matches literal "url", in parentheses so we can invoke backreference 1 later, easier for you to modify 
    (?:     -- ?: signifies a non-capturing group, so it creates a group without creating a backreference, or altering the backreference count. 
    \s*=\s*   -- matches literal "=", padded by any amount of whitespace on either side 
    (.*?)    -- matches any character, as few times as possible, to complete the match, creates backreference 2 
)?     -- closes the non-capturing group; the "?" makes the whole "=attribute" part optional 
    (:[a-z0-9]+)  -- matches the alphanumeric string as backreference 3. 
\]     -- matches literal "]" 
(.*?)     -- matches any character as few times as possible to complete the match, backreference 4 
\[     -- matches literal "[" 
    \/     -- matches literal "/" 
    \1     -- invokes backreference 1 
    (?:\3)?    -- invokes backreference 3 
\]     -- matches literal "]" 

For your replacement, the contents of the tag are themselves in a backreference, so you can do something like this for the b/i/u tags.

<\1>\3</\1> 
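To illustrate how the pattern and the replacement fit together, here is a minimal sketch in Python (the question does not name a language, so Python is an assumption; the function name convert_simple_tags is hypothetical). The IGNORECASE and DOTALL flags and the repeat-until-stable loop for nested tags are additions, not part of the regex above.

```python
import re

# The pattern for attribute-free tags from above. The two flags are an
# assumption added here so that [B]...[/B] and multi-line content also match.
SIMPLE_TAG = re.compile(
    r"\[(b|i|u)(:[a-z0-9]+)?\](.*?)\[/\1(?:\2)?\]",
    re.IGNORECASE | re.DOTALL,
)


def convert_simple_tags(text):
    """Replace [b]/[i]/[u] BBCode pairs with the matching HTML element."""
    previous = None
    # Repeat until nothing changes, so that nested tags such as
    # [b][i]x[/i][/b] are converted on later passes.
    while previous != text:
        previous = text
        text = SIMPLE_TAG.sub(r"<\1>\3</\1>", text)
    return text


print(convert_simple_tags("[b:583ow9wo] BOLD [/b:583ow9wo]"))  # <b> BOLD </b>
```

A single pass only converts the outermost pair of nested tags, which is why the sketch loops until the text stops changing.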

For the URL tag, it looks like this

<A href="\2">\4</A> 
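The URL case can be sketched the same way, again in Python as an assumed language. The names URL_TAG, _to_anchor and convert_url_tags are hypothetical, the sample address is made up, and falling back to the tag content when no "=attribute" is present is an assumption about how such tags should be treated. Note that the pattern as written requires the ":string" suffix on the opening tag.

```python
import html
import re

# The URL pattern from above, unchanged apart from Python string syntax.
# The two flags are an assumption added for this sketch.
URL_TAG = re.compile(
    r"\[(url)(?:\s*=\s*(.*?))?(:[a-z0-9]+)\](.*?)\[/\1(?:\3)?\]",
    re.IGNORECASE | re.DOTALL,
)


def _to_anchor(match):
    label = match.group(4)
    # Group 2 is None when the tag has no "=attribute"; in that case the
    # tag content is used as the link target (an assumption).
    target = match.group(2) or label.strip()
    # Escape the target so quotes or ampersands cannot break the attribute.
    return '<a href="{}">{}</a>'.format(html.escape(target, quote=True), label)


def convert_url_tags(text):
    """Replace [url=...]...[/url] BBCode with an HTML anchor."""
    return URL_TAG.sub(_to_anchor, text)


sample = "[url=http://example.com:583ow9wo]Click here[/url:583ow9wo]"
print(convert_url_tags(sample))  # <a href="http://example.com">Click here</a>
```

Using a function instead of a replacement string is a design choice here: it leaves room to escape the attribute and to handle the attribute-free form in one place.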

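I said in several places that the dot/period matches any character. In fact it matches any character except a newline. You can make it match newlines as well by using the "dotall" modifier s, like this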

/(.*)<foo>/s
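As a small illustration of the difference, the sketch below uses Python, where the equivalent of the s modifier is the re.DOTALL flag. The sample text and the simplified b-only pattern are made up for the demonstration.

```python
import re

text = "[b:583ow9wo]first line\nsecond line[/b:583ow9wo]"
pattern = r"\[b(:[a-z0-9]+)?\](.*?)\[/b(?:\1)?\]"

# Without DOTALL the dot stops at the newline, so the tag pair is not matched.
print(re.search(pattern, text))  # None

# With DOTALL the dot also matches "\n", so the content spans both lines.
match = re.search(pattern, text, re.DOTALL)
print(repr(match.group(2)))  # 'first line\nsecond line'
```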