Getting word frequency from a vector in C++

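I Googled this problem and couldn't find an answer that works with my code, so I wrote the following to get the frequency of each word. The only problem is that I get the wrong count for every word apart from one, which I think is a fluke. I am also checking whether a word has already been entered into the vector, so that I don't count the same word twice.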

fileSize = textFile.size(); 
vector<wordFrequency> words (fileSize); 
int index = 0; 
for(int i = 0; i <= fileSize - 1; i++) 
{ 
    for(int j = 0; j < fileSize - 1; j++) 
    { 
     if(string::npos != textFile[i].find(textFile[j]) && words[i].Word != textFile[j]) 
     { 
      words[j].Word = textFile[i]; 
      words[j].Times = index++; 
     } 
    } 
    index = 0; 
}

Any help would be appreciated.

2012-03-11 bobthemac

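Are you getting more occurrences than expected? What does the find member function of textFile do in your program? – bhuwansahni 2012-03-11 11:37:31

@bhuwansahni Yes, I get one that is correct. find is a vector function that looks for a matching string. – bobthemac 2012-03-11 11:40:06

What does find return on failure and on success? – bhuwansahni 2012-03-11 11:45:53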
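For reference on the question above: if textFile is a std::vector<std::string> (an assumption, since its declaration is not shown), then textFile[i].find(...) is std::string::find, which searches for a substring. It returns the index of the first match, or std::string::npos when there is none. The sketch below contrasts that with a whole-word comparison; the helper names are hypothetical and only for illustration.

```cpp
#include <string>

// Hypothetical helper: true if `needle` occurs anywhere inside `haystack`.
// This is what a test against std::string::npos checks: a substring match.
bool containsSubstring(const std::string& haystack, const std::string& needle) {
    return haystack.find(needle) != std::string::npos;
}

// Hypothetical helper: true only if the two words are identical.
bool sameWord(const std::string& a, const std::string& b) {
    return a == b;
}
```

With these, containsSubstring("there", "the") is true while sameWord("there", "the") is false, so a substring test can count a word as an occurrence of a different, shorter word.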

Try this code instead, if you don't want to use a map container:

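// Pairs a word with the number of times it has been seen. 
struct wordFreq{ 
    string word; 
    int count; 
    wordFreq(const string& str, int c):word(str),count(c){} 
}; 
vector<wordFreq> words; 

// Returns 1 if s is already stored in the range [i, j), otherwise 0. 
int ffind(vector<wordFreq>::iterator i, vector<wordFreq>::iterator j, const string& s) 
{ 
    for(; i != j; ++i){ 
        if(i->word == s) 
            return 1; 
    } 
    return 0; 
}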

The code to find the number of occurrences in the text vector is then:

for(std::size_t i = 0; i < textfile.size(); i++){ 
    if(ffind(words.begin(), words.end(), textfile[i])) // Word already counted, so move on to the next one, i.e. avoid repetitions 
        continue; 
    words.push_back(wordFreq(textfile[i], 1));   // Add the word to the vector, as it was not seen before, and set its count to 1 
    for(std::size_t j = i + 1; j < textfile.size(); j++){   // Find possible duplicates of textfile[i] 
        if(textfile[j] == words.back().word) 
            words.back().count++; 
    } 
}

2012-03-11 12:59:21 bhuwansahni

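It needed a little tweaking, but I have it working now. Thanks for the help. – bobthemac 2012-03-11 14:02:07

Ouch... this is awkward! Using the 'map' or 'unordered_map' class is much simpler! – 2012-03-11 14:11:30

Yeah, using a map would be much better, but if you don't want to use it... – bhuwansahni 2012-03-11 17:10:46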

Consider using std::map<std::string,int> instead. The map class will take care of ensuring that you don't have any duplicates.
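A minimal sketch of that suggestion, assuming the words are already held in a std::vector<std::string> named textFile as in the question. The function name countWords is hypothetical.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: counts how often each word occurs in `textFile`.
std::map<std::string, int> countWords(const std::vector<std::string>& textFile) {
    std::map<std::string, int> counts;
    for (const std::string& word : textFile) {
        ++counts[word];  // operator[] inserts a missing word with a count of 0 first
    }
    return counts;
}
```

Because each key can appear only once in the map, no separate "have I seen this word already" check is needed, and iterating over the result visits the words in sorted order.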

2012-03-11 11:41:30

Use an associative container:

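#include <string> 
#include <unordered_map> 
#include <vector> 

typedef std::unordered_map<std::string, unsigned> WordFrequencies; 

WordFrequencies count(std::vector<std::string> const& words) { 
    WordFrequencies wf; 
    for (std::string const& word: words) { 
        wf[word] += 1;   // operator[] inserts a zero count for a new word 
    } 
    return wf; 
}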

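It hardly gets simpler than this...

Note: you can replace unordered_map with map if you want the words sorted alphabetically, and you can write a custom comparison operation to treat them case-insensitively.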
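One way the sorted, case-insensitive variant might look is sketched below. The comparator CaseInsensitiveLess, the typedef SortedWordFrequencies and the function countIgnoringCase are hypothetical names, and the comparison only folds case through std::tolower, so it should be read as an ASCII-oriented illustration rather than full Unicode handling.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Hypothetical comparator: orders strings alphabetically, ignoring case.
struct CaseInsensitiveLess {
    bool operator()(const std::string& a, const std::string& b) const {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            [](unsigned char x, unsigned char y) {
                return std::tolower(x) < std::tolower(y);
            });
    }
};

typedef std::map<std::string, unsigned, CaseInsensitiveLess> SortedWordFrequencies;

// Counts words, treating spellings that differ only in case as the same key.
SortedWordFrequencies countIgnoringCase(const std::vector<std::string>& words) {
    SortedWordFrequencies wf;
    for (const std::string& word : words) {
        wf[word] += 1;  // the first spelling encountered is the one kept as the key
    }
    return wf;
}
```

With this comparator, "The" and "the" share one entry, and iteration runs in alphabetical order regardless of capitalization.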

2012-03-11 14:14:02
