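Split a string if a substring exists

"Nikon Coolpix AW130 16MP Point and Shoot Digital Camera Black with 5x Optical Zoom"

"Nikon Coolpix AW130 16 MP Point & Shoot Camera Black"

I want to compare strings like these. As you can see, both describe the same product, but when I tokenize on spaces and compare word by word, the space between 16 and MP in the second string causes a difference that doesn't really exist.

Is there any way I can add a space in the first string, where 16MP is written together, so that I can tokenize on spaces?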

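import scala.collection.mutable.ListBuffer

val productList = List("Nikon Coolpix AW130 16MP Point and Shoot Digital Camera Black with 5x Optical Zoom", "Nikon Coolpix AW130 16 MP Point & Shoot Camera Black")
val tokens = ListBuffer[String]()
// A List has no split method, so split each product title individually.
productList.foreach { product =>
    product.split(" ").foreach(token => tokens += token)
}

val res = tokens.toList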

Source

2015-10-15 Naba

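`replaceAll("\\b16 MP\\b", "16MP")`? Or `replaceAll("\\b16MP\\b", "16 MP")`?

What exactly do you want? To compare two strings regardless of spaces? – dsharew

Can you describe the format of these strings? I assume you don't want an answer that only works for these examples. – Dici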

You can do this with a regex. Search for both formats and replace them with one specific format.
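One way to read this suggestion, as a minimal sketch: match the number and the unit with optional whitespace between them, and rewrite every occurrence in a single chosen form. The names `UnitSpacing` and `withSpace`, and the choice of the spaced form `16 MP`, are assumptions made for illustration and are not part of the answer.

```scala
object UnitSpacing {
  // Digits followed by "MP", with or without whitespace in between.
  // The trailing \b keeps the pattern from matching inside a longer word.
  private val NumberThenUnit = """(\d+)\s*(MP)\b"""

  // Rewrites both "16MP" and "16 MP" to the single form "16 MP".
  def withSpace(s: String): String = s.replaceAll(NumberThenUnit, "$1 $2")

  def main(args: Array[String]): Unit = {
    val a = withSpace("Nikon Coolpix AW130 16MP Point and Shoot Digital Camera Black with 5x Optical Zoom")
    val b = withSpace("Nikon Coolpix AW130 16 MP Point & Shoot Camera Black")
    // After the rewrite, splitting on spaces gives the same first five tokens.
    println(a.split(" ").take(5).mkString("|")) // Nikon|Coolpix|AW130|16|MP
    println(b.split(" ").take(5).mkString("|")) // Nikon|Coolpix|AW130|16|MP
  }
}
```

This adds the space, which is what the question literally asks for. Removing it instead works just as well, as long as both strings go through the same rewrite.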

Source

2015-10-15 12:18:30 Alexander

If you just want to remove the space between a number and the fixed string MP, you can use the following regex:

scala> "Nikon Coolpix AW130 16 MP Point & Shoot Camera Black".replaceAll("""(\d+) ?(MP)""", "$1$2") 
res13: String = Nikon Coolpix AW130 16MP Point & Shoot Camera Black

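The (\d+) part matches any number with at least one digit.
The " ?" part (note the space) matches zero or one space.
The (MP) part matches the literal string MP.
$1$2 outputs the text matched by the first group (\d+) immediately followed by the text matched by the second group (MP), so a space between them, if there was one, is dropped.

After that, the 16MP tokens should be equal. You will still have the problem of "and" versus "&", though.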
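Combining this replacement with the tokenization from the question might look like the sketch below. It also rewrites `&` to `and`, which is only one possible way of dealing with the remaining mismatch. `TitleTokens`, `joinUnits`, `unifyAnd` and `tokenize` are hypothetical names, and the only unit handled is `MP`, as in the regex above.

```scala
object TitleTokens {
  // Glue a number to a following "MP", whether or not a space separates them.
  private def joinUnits(s: String): String =
    s.replaceAll("""(\d+) ?(MP)""", "$1$2")

  // Assumption: "&" and "and" should count as the same word.
  private def unifyAnd(s: String): String =
    s.replace("&", "and")

  // Normalize first, then split on runs of whitespace.
  def tokenize(s: String): List[String] =
    unifyAnd(joinUnits(s)).trim.split("\\s+").toList

  def main(args: Array[String]): Unit = {
    val full = tokenize("Nikon Coolpix AW130 16MP Point and Shoot Digital Camera Black with 5x Optical Zoom")
    val brief = tokenize("Nikon Coolpix AW130 16 MP Point & Shoot Camera Black")
    // Tokens of the shorter title that have no counterpart in the longer one.
    println(brief.filterNot(t => full.contains(t))) // prints List()
  }
}
```

The longer title still contains extra words such as "Digital", so a plain equality check on the token lists would fail. How to treat those extra words is a separate decision.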

Source

2015-10-15 12:21:44

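Awesome! This is what I wanted. I also added a few more patterns, like: str.replaceAll("""(\d+) ?(MP|GB|mm|cm)""" – Naba

You don't give enough detail about the format of these strings, but from this example I can infer something like this: (\w+) (\d+)\s*MP Point.*

You can then parse the strings and read the regex groups to compare the products.

Here is an example: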

import scala.util.matching.Regex

object Main {
    def main(args: Array[String]): Unit = {
        val s0 = "Nikon Coolpix AW130 16MP Point and Shoot Digital Camera Black with 5x Optical Zoom"
        val s1 = "Nikon Coolpix AW130 16 MP Point & Shoot Camera Black"
        println(Product.parse(s0) == Product.parse(s1)) // prints true
    }
}

// Only the model word before the resolution and the resolution itself are compared.
case class Product(name: String, resolution: Int)
object Product {
    // Group 1 is named "productName", group 2 "resolution".
    private val regex = new Regex("(\\w+) (\\d+)\\s*MP Point.*", "productName", "resolution")
    def parse(s: String): Product = regex.findFirstMatchIn(s) match {
        case Some(m) => Product(m.group("productName"), m.group("resolution").toInt)
        case None => throw new RuntimeException("Invalid format")
    }
}

Source

2015-10-15 12:22:47 Dici

Instead of splitting, it is easy to apply a series of regex replacements, one after another.

public static boolean equivalent(String a, String b) {
    return normalize(a).equalsIgnoreCase(normalize(b));
}

private static String normalize(String s) { 
    return s.replaceAll("(\\d+)", "$0 ") // At least one space after digits. 
     .replaceAll("\\bLimited\\b", "Ltd") // Example. 
     .replace("'", "") // Example. 
     .replace("&", " and ") 
     .replaceAll("\\s+", " ") // Multiple spaces to one. 
     .trim(); 
}

Or split the normalized strings (to get keywords).
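For the splitting variant, a rough Scala sketch could normalize first and then split into keywords. This is a port of the idea rather than the code above: the `Limited` and apostrophe rules are left out, lower-casing stands in for `equalsIgnoreCase`, and the names `Keywords` and `keywords` are made up for the example. The subset check at the end is likewise just one assumed way to compare the results.

```scala
object Keywords {
  // Space after every run of digits, "&" spelled out, whitespace collapsed.
  // Lower-casing is an added assumption so that case differences are ignored.
  def normalize(s: String): String =
    s.replaceAll("(\\d+)", "$0 ")
      .replace("&", " and ")
      .replaceAll("\\s+", " ")
      .trim
      .toLowerCase

  // Split the normalized title into a set of keywords.
  def keywords(s: String): Set[String] = normalize(s).split(" ").toSet

  def main(args: Array[String]): Unit = {
    val a = keywords("Nikon Coolpix AW130 16MP Point and Shoot Digital Camera Black with 5x Optical Zoom")
    val b = keywords("Nikon Coolpix AW130 16 MP Point & Shoot Camera Black")
    // Every keyword of the shorter title also occurs in the longer one.
    println(b.subsetOf(a)) // prints true
  }
}
```

Note that the digit rule also splits model names and zoom factors, so `AW130` stays one token but `5x` becomes `5` and `x`.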

Source

2015-10-15 12:29:09
