2011-03-30 26 views
0

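Regex error on a non-English string of 3 characters (Java::JavaLang::ArrayIndexOutOfBoundsException: 4)

The following steps reproduce the error. Any workaround/fix? Thanks. This happens with both JRuby 1.5.2/Rails 2.3.9 and JRuby 1.6/Rails 3.0.5.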

regex = /(aaa|bbb):/ 
str = "\343\202\242:" 
str =~ regex 

Steps to reproduce:

d:\myapp>jruby script/rails console 
Loading development environment (Rails 3.0.5) 
irb(main):001:0> regex = /(aaa|bbb):/ 
=> /(aaa|bbb):/ 
irb(main):002:0> str = "\343\202\242:" 
=> "péó:" 
irb(main):003:0> str =~ regex 
Java::JavaLang::ArrayIndexOutOfBoundsException: 4 
     from org.jcodings.MultiByteEncoding.safeLengthForUptoFour(MultiByteEncoding.java:5 
     from org.jcodings.specific.NonStrictUTF8Encoding.length(NonStrictUTF8Encoding.java 
     from org.joni.Matcher.forwardSearchRange(Matcher.java:124) 
     from org.joni.Matcher.search(Matcher.java:432) 
     from org.jruby.RubyRegexp.search(RubyRegexp.java:1474) 
     from org.jruby.RubyRegexp.op_match(RubyRegexp.java:1391) 
     from org.jruby.RubyString.op_match(RubyString.java:1557) 
     from org.jruby.RubyString$i$1$0$op_match.call(RubyString$i$1$0$op_match.gen:65535) 
     from org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java: 
     from org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:139) 
     from org.jruby.ast.CallOneArgNode.interpret(CallOneArgNode.java:57) 
     from org.jruby.ast.NewlineNode.interpret(NewlineNode.java:103) 
     from org.jruby.ast.RootNode.interpret(RootNode.java:129) 
     from org.jruby.evaluator.ASTInterpreter.INTERPRET_EVAL(ASTInterpreter.java:95) 
     from org.jruby.evaluator.ASTInterpreter.evalWithBinding(ASTInterpreter.java:160) 
     from org.jruby.RubyKernel.evalCommon(RubyKernel.java:1134) 
... 158 levels... 
     from org.jruby.RubyKernel$s$1$0$require.call(RubyKernel$s$1$0$require.gen:65535) 
     from org.jruby.internal.runtime.methods.JavaMethod$JavaMethodOneOrNBlock.call(Java 
     from org.jruby.internal.runtime.methods.AliasMethod.call(AliasMethod.java:61) 
     from org.jruby.internal.runtime.methods.AliasMethod.call(AliasMethod.java:61) 
     from org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java: 
     from org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:139) 
     from script.rails.__file__(script/rails:6) 
     from script.rails.load(script/rails) 
     from org.jruby.Ruby.runScript(Ruby.java:670) 
     from org.jruby.Ruby.runNormally(Ruby.java:574) 
     from org.jruby.Ruby.runFromMain(Ruby.java:423) 
     from org.jruby.Main.doRunFromMain(Main.java:278) 
     from org.jruby.Main.internalRun(Main.java:198) 
     from org.jruby.Main.run(Main.java:164) 
     from org.jruby.Main.run(Main.java:148) 
     from org.jruby.Main.main(Main.java:128)
irb(main):004:0> 

Answers

0

Maybe the regex isn't delimited.
How about
str =~ /(aaa|bbb):/

or

regex = '(aaa|bbb):'
str =~ /regex/

+0

With str =~ /regex/, the regular expression is /regex/, not /(aaa|bbb):/, because regex is not replaced by its value. – 2011-03-31 22:46:45
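To illustrate the point in this comment, here is a minimal sketch in plain Ruby. It assumes the pattern is held in a String named regex as in the answer; the sample input "aaa:" is made up for the demonstration. A bare /regex/ literal matches the five letters "regex", so the variable has to be interpolated or compiled explicitly:

```ruby
# The pattern is stored in a String, as in the answer above.
regex = '(aaa|bbb):'
str = "aaa:" # hypothetical input, chosen so that the real pattern matches

# /regex/ is a literal that looks for the text "regex"; the variable is ignored.
literal_match = (str =~ /regex/)

# Interpolating the variable builds the intended pattern.
interpolated_match = (str =~ /#{regex}/)

# Regexp.new compiles the String into a Regexp object.
compiled_match = (str =~ Regexp.new(regex))

puts literal_match.inspect      # nil
puts interpolated_match.inspect # 0
puts compiled_match.inspect     # 0
```

This only shows why the second suggestion does not test the intended pattern; it does not address the JRuby exception itself.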

0

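I think that because the string you're passing contains multibyte characters, you need to pass the /u regex flag so that it is parsed as a UTF-8 string.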
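As a quick check of what the string in the question actually contains, here is a small sketch, assuming Ruby 1.9 or later (where strings carry an encoding). The three octal escapes are the bytes E3 82 A2, which is the UTF-8 encoding of a single character, U+30A2 (katakana "ア"), followed by an ASCII colon:

```ruby
# The string from the question, explicitly tagged as UTF-8.
str = "\343\202\242:".dup.force_encoding("UTF-8")

bytes = str.bytes # four bytes: three for the katakana character, one for ":"
chars = str.chars # two characters once the bytes are read as UTF-8

puts bytes.inspect        # [227, 130, 162, 58]
puts chars.length         # 2
puts str.valid_encoding?  # true

# On an interpreter without the bug, the match simply fails and returns nil.
result = (str =~ /(aaa|bbb):/)
puts result.inspect       # nil
```

So the expected result of the match in the question is nil rather than an exception.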

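I just checked this on different versions of Ruby, and it only happens in JRuby, so I think you've found a bug. If you use something like "string".to_java_string first, it appears to work, but it actually converts the string to ISO-8859-1 first, which isn't what you want. To keep your encoding, just use .to_java and pass that to the regex.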

Here's a workaround that should work, I think:

regex = /(aaa|bbb):/u 
str = "\343\202\242:" 
str.to_java =~ regex