2015-02-07 301 views

Answers

2

I'm not sure your original pattern with single quotes is correct. This one finds all commas outside double quotes:

preg_match_all('~"(?:[^\\\"]+(?:\\\.)*|\\\.)*+"(*SKIP)(*F)|,~s', $subject, $matches); 

Pattern details:

~ 
" 
(?:   # all possible content between quotes 
    [^\\\"]+ # all that is not a double quote or a backslash 
    (?:\\\.)* # optional escaped characters 
    |   # OR 
    \\\.  # an escaped character 
)*+   # repeat zero or more times (possessive) 
"    # closing double quote, can be replaced with (?:"|\z) or "? 
(*SKIP)(*F) # forces the pattern to fail and to not retry double quoted parts 
|    # OR 
,    # a comma 
~ 
s    # allow the dot to match newline characters 
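
As a minimal sketch of how this pattern might be used, the snippet below passes it to preg_split so that every comma outside double quotes acts as a field separator. The sample input is made up for illustration and is not from the question.

```php
<?php
// Pattern from the answer: double-quoted parts are consumed and then
// discarded with (*SKIP)(*F), so only commas outside quotes can match.
$pattern = '~"(?:[^\\\"]+(?:\\\.)*|\\\.)*+"(*SKIP)(*F)|,~s';

// Hypothetical sample input. In this single-quoted PHP literal, \\" produces
// the two characters \" (an escaped quote inside the last quoted field).
$subject = 'a,"b,c",d,"e \\"f\\", g"';

// Each match is a comma outside quotes, so splitting on the pattern
// returns the fields with their quotes left in place.
$fields = preg_split($pattern, $subject);

print_r($fields);
```

The quotes are kept in the resulting fields; removing them, if needed, would be a separate step.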

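Note: if you want the substring after an orphan double quote (up to the end of the string) to be treated as a quoted substring, replace the last double quote in the pattern with (?:"|\z) or, more simply, with "? (an optional closing quote).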
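
The difference between the two variants only shows up when a quote is never closed. The sketch below compares the match offsets of the original pattern and the "? variant on such an input; commaOffsets is a hypothetical helper name and the input is invented for illustration.

```php
<?php
// Hypothetical helper: returns the byte offsets of every match.
function commaOffsets(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);
    // Each entry of $matches[0] is [matched text, offset]; keep the offsets.
    return array_column($matches[0], 1);
}

// Original pattern: a quoted part must have a closing double quote.
$strict = '~"(?:[^\\\"]+(?:\\\.)*|\\\.)*+"(*SKIP)(*F)|,~s';
// Variant from the note: the closing double quote is optional.
$lenient = '~"(?:[^\\\"]+(?:\\\.)*|\\\.)*+"?(*SKIP)(*F)|,~s';

// Invented input in which the double quote is never closed.
$subject = 'a,"b,c';

$strictOffsets = commaOffsets($strict, $subject);   // comma after the orphan quote still matches
$lenientOffsets = commaOffsets($lenient, $subject); // everything after the orphan quote is skipped

var_dump($strictOffsets, $lenientOffsets);
```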

Note 2: to greatly reduce the number of steps needed to find a match, the pattern can be written like this:

~[^,"]*+\K(?:"[^"\\\]*+(?:(?:\\\.)+[^\\\"]*)*+"?|,(*ACCEPT)|\z(*COMMIT).)(*SKIP)(*F)~s 

or, if you want to use the first-character discrimination technique:

~(?=[",])(?:"[^"\\\]*+(?:(?:\\\.)+[^\\\"]*)*+"?(*SKIP)(*F)|,)~s 
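
As a rough check, assuming well-formed input where every quote is closed, the three patterns from this answer should report the same comma positions. The sketch below compares them; commaOffsets and the sample string are illustrative names and data, not part of the original answer.

```php
<?php
// Hypothetical helper: returns the byte offsets of every match.
function commaOffsets(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);
    return array_column($matches[0], 1);
}

$patterns = [
    // First pattern of the answer.
    'simple' => '~"(?:[^\\\"]+(?:\\\.)*|\\\.)*+"(*SKIP)(*F)|,~s',
    // Variant that consumes non-comma, non-quote characters up front.
    'fewer steps' => '~[^,"]*+\K(?:"[^"\\\]*+(?:(?:\\\.)+[^\\\"]*)*+"?|,(*ACCEPT)|\z(*COMMIT).)(*SKIP)(*F)~s',
    // Variant that first checks the next character with a lookahead.
    'first character' => '~(?=[",])(?:"[^"\\\]*+(?:(?:\\\.)+[^\\\"]*)*+"?(*SKIP)(*F)|,)~s',
];

// Invented, well-formed input with escaped quotes inside a quoted field.
$subject = 'a,"b,c",d,"e \\"f\\", g",h';

$results = [];
foreach ($patterns as $name => $pattern) {
    $results[$name] = commaOffsets($pattern, $subject);
}

print_r($results);
```

Note that the last two patterns make the closing quote optional ("?), so on input with an unclosed quote they would behave like the lenient variant rather than the first pattern.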
1

This matches all commas outside both single and double quotes.

(?s)(?:(?<!\\)'(?:\\'|[^'])*'|(?<!\\)"(?:\\"|[^"])*")(*SKIP)(*F)|, 
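
A minimal sketch of using this pattern from PHP follows. Because the pattern contains both kinds of quotes, the example stores it in a nowdoc so that no PHP-level escaping is needed; the ~ delimiters and the sample input are assumptions made for illustration.

```php
<?php
// Nowdoc: the text is taken verbatim, so the regex reads exactly as in the answer.
$pattern = <<<'REGEX'
~(?s)(?:(?<!\\)'(?:\\'|[^'])*'|(?<!\\)"(?:\\"|[^"])*")(*SKIP)(*F)|,~
REGEX;

// Invented input with one single-quoted and one double-quoted field.
$subject = <<<'TEXT'
a,'b,c',"d,e",f
TEXT;

// Commas inside either kind of quotes are skipped; the others split the string.
$fields = preg_split($pattern, $subject);

print_r($fields);
```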
