This is my code for counting word frequencies:
word_arr= ["I", "received", "this", "in", "email", "and", "found", "it", "a", "good", "read", "to", "share......", "Yes,", "Dr", "M.", "Bakri", "Musa", "seems", "to", "know", "what", "is", "happening", "in", "Malaysia.", "Some", "of", "you", "may", "know.", "He", "is", "a", "Malay", "extra horny", "horny nor", "nor their", "their babes", "babes are", "are extra", "extra SEXY..", "SEXY.. .", ". .", ". .It's", ".It's because", "because their", "their CONDOMS", "CONDOMS are", "are Made", "Made In", "In China........;)", "China........;) &&"]
arr_stop_kwd=["a","and"]
frequencies = Hash.new(0)
word_arr.each { |word|
  # Skip stop words and any token that contains '&&'
  if !arr_stop_kwd.include?(word.downcase) && !word.match('&&')
    frequencies[word.downcase] += 1
  end
}
With 100K entries this takes 9.03 seconds, which is too long. Is there any other way I can compute this?
Thanks in advance.
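One possible variation on the loop above, offered only as a sketch: it downcases each word once and looks stop words up in a Set rather than scanning an Array. The method name `count_word_frequencies` and the `sample` data are hypothetical, introduced here for illustration. Whether this accounts for the 9.03 seconds reported above cannot be determined from the post, so it is worth profiling on the real data first.

```ruby
require 'set'

# Hypothetical helper (not from the original post): counts case-insensitive
# word frequencies, skipping stop words and tokens that contain '&&'.
def count_word_frequencies(words, stop_words)
  # A Set gives constant-time membership checks, unlike Array#include?.
  stop_set = Set.new(stop_words.map { |w| w.downcase })
  frequencies = Hash.new(0)

  words.each do |word|
    next if word.include?('&&')    # same filter as word.match('&&')
    key = word.downcase            # downcase once and reuse the result
    next if stop_set.include?(key)
    frequencies[key] += 1
  end

  frequencies
end

# Small made-up sample; the real input would be word_arr and arr_stop_kwd.
sample = ["I", "a", "good", "read", "and", "Good", "China........;) &&"]
counts = count_word_frequencies(sample, ["a", "and"])
```

With only two stop words the Set is unlikely to matter much; it would start to pay off as the stop-word list grows.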
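Sir, I am using Ruby 1.8.7. When I require 'facets' I get a "stack level too deep" error. How can I solve this? – 2013-03-20 11:06:49
You need to install the gem. Try running 'gem install facets', or add 'facets' to your 'Gemfile' if you are using Bundler. – 2013-03-20 11:20:15
I have already installed them. – 2013-03-20 11:28:19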