Strings with whitespace in a list?

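I have this function, sentanceParse, which takes a string as input and returns a list. The input might be something like "Hello my name is Anton. What's your name?", and the return value should then be a list containing "Hello my name is Anton" and "What's your name?". However, this is not what happens. It seems as if the whitespace in the sentences is treated as a separator, so the result is "Hello", "my", "name" and so on, instead of what I expected.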

How would you suggest I solve this?

Since I am not 100% sure that the problem does not lie in my code, I will add that to the post as well:

Main:

list<string> mylist = sentanceParse(textCipher);
list<string>::iterator it;
for(it = mylist.begin(); it != mylist.end(); it++){
    textCipher = *it;
    cout << textCipher << endl; //This prints out the words separately instead of the entire sentences.
}

sentanceParse:

list<string> sentanceParse(string strParse){ 
    list<string> strList; 
    int len = strParse.length(); 
    int pos = 0; 
    int count = 0; 
    for(int i = 0; i < len; i++){ 
     if(strParse.at(i) == '.' || strParse.at(i) == '!' || strParse.at(i) == '?'){ 
      if(i < strParse.length() - 1){ 
       while(i < strParse.length() - 1 && (strParse.at(i+1) == '.' || strParse.at(i+1) == '!' || strParse.at(i+1) == '?')){ 
        if(strParse.at(i+1) == '?'){ 
         strParse.replace(i, 1, "?"); 
        } 
        strParse.erase(i+1, 1); 
        len -= 1; 
       } 
      } 
      char strTemp[2000]; 
      int lenTemp = strParse.copy(strTemp, i - pos + 1, pos); 
      strTemp[lenTemp] = '\0'; 
      std::string strAdd(strTemp); 
      strList.push_back(strAdd); 
      pos = i + 1; 
      count ++; 
     } 
    } 

    if(count == 0){ 
     strList.push_back(strParse); 
    } 

    return strList; 
}

Source

2012-02-28 Anton

Any reason you're not using Boost? There's [`boost::tokenizer`](http://www.boost.org/doc/libs/1_49_0/libs/tokenizer/index.html), for example, which would do the job just fine (though the documentation is a little.. spartan). – Xeo 2012-02-28 01:39:32

Never heard of it, I'll check it out. – Anton 2012-02-28 01:40:01

Basically, it looks like `tokenizer<char_separator<char>> toks(strParse, char_separator<char>(".!?")); for(auto& tok : toks){ /* process each sentence... */ }` – Xeo 2012-02-28 01:44:34
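For readers who would rather not pull in Boost, the same idea (split at any of `.`, `!`, `?` and drop empty pieces) can be sketched with the standard library alone. The name `split_any` is hypothetical and not part of Boost or the standard library; this is only an illustration of the approach, not a drop-in replacement for `boost::tokenizer`.

```cpp
#include <list>
#include <string>

// Hypothetical helper: splits `text` at any character found in `delims`.
// Empty tokens are skipped, so a run such as "!!!" yields no empty entries.
std::list<std::string> split_any(const std::string& text, const std::string& delims) {
    std::list<std::string> tokens;
    std::string::size_type start = text.find_first_not_of(delims);
    while (start != std::string::npos) {
        // `end` is npos for the last token; substr then takes the rest of the string.
        std::string::size_type end = text.find_first_of(delims, start);
        tokens.push_back(text.substr(start, end - start));
        start = text.find_first_not_of(delims, end);
    }
    return tokens;
}
```

Note that the delimiters are removed and that the space following a delimiter stays at the front of the next token, so `split_any("A b. C d?", ".!?")` gives `"A b"` and `" C d"`.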

Your implementation of sentence parsing is wrong; here is a simple, correct solution.

std::list<std::string> sentence_parse(const std::string &str){
    std::string temp;
    std::list<std::string> t;

    for(std::string::size_type x=0; x<str.size(); ++x){
        if(str[x]=='.'||str[x]=='!'||str[x]=='?'){
            // Skip empty pieces so that repeated punctuation
            // (e.g. "Hi!!!!") does not add empty strings.
            if(temp!="")t.push_back(temp);
            temp="";
        }else temp+=str[x];
    }
    // Note: text after the last '.', '!' or '?' is not added to the list.
    return t;
}

Edit:

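Here is a complete example program that uses the function. Type some sentences into your console and press Enter; it will print the sentences separated by newlines instead of punctuation.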

#include <iostream>
#include <string>
#include <list>

std::list<std::string> sentence_parse(const std::string &str){
    std::string temp;
    std::list<std::string> t;

    for(std::string::size_type x=0; x<str.size(); ++x){
        if(str[x]=='.'||str[x]=='!'||str[x]=='?'){
            // Skip empty pieces so that repeated punctuation
            // (e.g. "Hi!!!!") does not add empty strings.
            if(temp!="")t.push_back(temp);
            temp="";
        }else temp+=str[x];
    }
    // Note: text after the last '.', '!' or '?' is not added to the list.
    return t;
}

int main (int argc, const char * argv[])
{
    std::string s;

    // std::getline reads a whole line, including the spaces inside it.
    while (std::getline(std::cin,s)) {
        std::list<std::string> t= sentence_parse(s);
        std::list<std::string>::iterator x=t.begin();
        while (x!=t.end()) {
            std::cout<<*x<<"\n";
            ++x;
        }
    }

    return 0;
}

Source

2012-02-28 01:48:10

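But say the input is "one two three!", won't it be split into "one", "two" and "three" in the list? My problem is that the whitespace acts like a separator. – Anton 2012-02-28 01:59:51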

Try this code; it will work, and whitespace will not act as a separator. – 2012-02-28 02:06:46

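It doesn't work (I tested it). It only returns the last word of each given sentence. When the string is added to the list, it only contains what comes after the last whitespace. – Anton 2012-02-28 02:17:04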

// This function should be easy to adapt to any basic library;
// this version uses Windows MFC.
// Pass in a string, a separator character and a string array.
// The array is filled with the pieces of the string, split at the separator.

void tokenizeString(CString theString, TCHAR theToken, CStringArray *theParameters) 
{ 
    CString temp = ""; 
    int i = 0; 

    for(i = 0; i < theString.GetLength(); i++) 
    {         
     if (theString.GetAt(i) != theToken) 
     { 
      temp += theString.GetAt(i); 
     } 
     else 
     { 
      theParameters->Add(temp); 
      temp = ""; 
     } 
     if(i == theString.GetLength()-1) 
      theParameters->Add(temp); 
    } 
}
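As a rough illustration of the adaptation mentioned in the comments above, here is a sketch of the same loop using only the standard library. The name `tokenize_string` and the choice of `std::vector<std::string>` as the return type are assumptions made for this example and do not come from the answer.

```cpp
#include <string>
#include <vector>

// Standard-library sketch of the MFC function above: splits `text` at every
// occurrence of `separator`. As in the original, empty pieces are kept and a
// non-empty input always contributes its final piece (possibly empty).
std::vector<std::string> tokenize_string(const std::string& text, char separator) {
    std::vector<std::string> parts;
    std::string current;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (text[i] != separator) {
            current += text[i];
        } else {
            parts.push_back(current);
            current.clear();
        }
    }
    if (!text.empty()) {
        parts.push_back(current);
    }
    return parts;
}
```

Like the MFC version, this splits on a single character only, so handling `.`, `!` and `?` together would need a different test inside the loop.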

Source

2012-02-28 01:40:01

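This would work if the input were not user generated. Suppose the input is "Hi!!! I'm Anton. What's your name!?". In that case I would want the result to be "Hi!", "I'm Anton", "What's your name?". I might consider working with arrays, as is done here, instead of lists though. – Anton 2012-02-28 01:48:10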
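The behaviour asked for in this comment (collapse a run of terminators into one, let `?` win over `!` or `.`, and keep the spaces inside each sentence) can be sketched as follows. The names `split_sentences` and `is_terminator` are hypothetical, and one detail is an assumption of this sketch rather than something the thread settles: every sentence keeps its collapsed terminator, so a sentence ending in `.` keeps the period.

```cpp
#include <list>
#include <string>

// True for the three characters treated as sentence terminators in the question.
static bool is_terminator(char c) {
    return c == '.' || c == '!' || c == '?';
}

// Splits `text` into sentences. A run of terminators is collapsed into a single
// character: '?' if the run contains one, otherwise the first character of the run.
std::list<std::string> split_sentences(const std::string& text) {
    std::list<std::string> sentences;
    std::string current;
    std::string::size_type i = 0;
    while (i < text.size()) {
        if (!is_terminator(text[i])) {
            // Drop spaces between sentences, keep spaces inside a sentence.
            if (!(current.empty() && text[i] == ' ')) {
                current += text[i];
            }
            ++i;
            continue;
        }
        char mark = text[i];
        while (i < text.size() && is_terminator(text[i])) {
            if (text[i] == '?') {
                mark = '?';
            }
            ++i;
        }
        if (!current.empty()) {
            sentences.push_back(current + mark);
            current.clear();
        }
    }
    // Trailing text without a terminator is kept as a final sentence.
    if (!current.empty()) {
        sentences.push_back(current);
    }
    return sentences;
}
```

Unlike the version in the question, this builds each sentence in a `std::string` instead of copying into a fixed 2000-character buffer, so long inputs cannot overflow it.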
