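How do I read badly formatted input data in C++?

I'm learning C++ and am stuck on an exercise. How do I read data that is badly formatted? For example, I'm given a file and need to read data that looks like this: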

1 z 2 
1 xy 2 
3 A 8000 E 1777 E 2001

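The first, second and third lines make up one "module". There are many modules, and the data will be entered via the keyboard. My program must accept all of the user's input (until the user decides to quit by typing "q"), then read that input and operate on the data. Ideally the input will be formatted correctly, like the example above, but sometimes the data will have extra spaces, tabs or carriage returns, or will begin with data from a previous module, like this: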

2 R 5001 E 4777 1 z 2  1 xy 2 
3 A 8000 E 1777 
E 2001

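What is the best way to read and process badly formatted input like this? In this case, I want to be able to extract 1 z 2, 1 xy 2 and 3 A 8000 E 1777 E 2001, store them in an array or some kind of STL container, and later do something with that information (for example add, subtract or multiply, depending on whether the number is preceded by "A", "S" or "M").

My program must be able to recognize that z and xy are variables, and that z = 2 and xy = 2.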


2016-03-07 MNRC

You should probably try using regular expressions: http://www.cplusplus.com/reference/regex/

You can use std::cin >> to skip whitespace:

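#include <iostream>
#include <string>
#include <vector>

int main() {
    std::string input;
    std::vector<std::vector<std::string>> data; // one inner vector per line
    std::vector<std::string> temp;              // tokens of the line being built
    unsigned line = 1;                          // line number expected next

    while (std::cin >> input && input != "q") { // >> skips spaces, tabs and newlines
        // A token equal to the expected line number starts a new line.
        // Heuristic only: a data value equal to that number is treated the same way.
        if (input == std::to_string(line)) {
            if (!temp.empty()) data.push_back(temp);
            temp.clear();
            line++;
        }
        temp.push_back(input);
    }
    if (!temp.empty()) data.push_back(temp); // store the final line
}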

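This fills the data vector with all input from standard input until "q" is entered.

Is this what you are looking for?

Edit: I added parsing of the line number, as you requested.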


2016-03-07 04:58:15

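How would my program recognize that '2 R 5001 E 4777' is not part of the module (in the second example)? Or that '1 z 2 1 xy 2' should be split into line 1 and line 2? – MNRC

'while (std::cin >> input && input != "q")' would be better: it handles end of file (^Z on a Windows keyboard, ^D on UNIX/Linux, or the end of piped/redirected input). MNRC: if newlines are relevant, you can use 'while (getline(std::cin, line)) { std::istringstream iss(line);' followed by Paul's solution, 'while (iss >> input && input != "q") ...', and then after each line do something with 'data': either process and remove the line, or store some sentinel string to mark where a line ended.

@TonyD You're right. I edited my answer. If line-by-line parsing is what you want, then using std::istringstream would help a lot. The question is a bit confusing, so it's hard to understand what you want ;) But I agree with Tony.

This kind of thing is hard to do. Here's mine: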
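
The line-by-line approach suggested in the comments above could look roughly like this. It is only a sketch: the function name read_lines and the choice to return one vector of tokens per line are assumptions made for illustration, and it only helps if line breaks in the input are actually meaningful.

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the answers above): reads the stream line by
// line and splits each line into whitespace-separated tokens.
// Stops at end of input or at a token equal to "q".
std::vector<std::vector<std::string>> read_lines(std::istream& in)
{
    std::vector<std::vector<std::string>> data;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream iss(line);
        std::vector<std::string> tokens;
        std::string token;
        bool quit = false;
        while (iss >> token) {
            if (token == "q") { quit = true; break; }
            tokens.push_back(token);
        }
        if (!tokens.empty())
            data.push_back(tokens);   // blank lines are skipped
        if (quit)
            break;
    }
    return data;
}
```

Calling read_lines(std::cin) reads keyboard input; passing a std::istringstream instead makes it easy to try on a fixed string.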

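#include <fstream>
#include <string>
#include <vector>
using namespace std;

int main() {
    ifstream in("file.txt");
    vector<string> v;
    string line;
    while (getline(in, line)) {
        if (line.empty()) continue;   // substr(1) would throw on an empty line
        v.push_back(line.substr(1));  // remove line number (assumes one leading character)
    }
}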


2016-03-07 05:10:11

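Hi G. Johal. The 'while (getline(...' approach is often useful, but having re-read the question a few times, the example user input is too broken to rely on any line endings. The "// remove line number" comment also assumes the first number is meant to increment, but it appears to be the number of variable name/numeric value pairs that follow in that module.

Also, this code invokes undefined behavior if the input contains any blank lines – dreamlax

You'd get better answers if you described the logical purpose of the parts of your input more clearly. I'm going to guess that each module starts with a count of how many variable name/value pairs follow, which allows a more structured approach to reading and storing the values. I populate a vector (array) of maps (binary trees) from variable name to value, which may be convenient for later lookup and processing.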

#include <cstdlib>
#include <iostream>
#include <map>
#include <string>
#include <vector>

int main()
{
    std::vector<std::map<std::string, int>> vars; // one map per module
    int vars_in_module;
    while (std::cin >> vars_in_module) // ends at "q" or end of input
    {
        vars.emplace_back(); // add an empty module to vector
        std::string identifier;
        int value;
        for (int i = 1; i <= vars_in_module; ++i)
            if (std::cin >> identifier >> value)
                vars.back()[identifier] = value; // add var to module
            else
            {
                std::cerr << "error parsing variable identifier & value\n";
                std::exit(1);
            }
    }
}

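Note that the map keeps its entries sorted lexicographically by identifier (comparing the leftmost characters in ASCII order, then, if those are equal, the next character to the right, and so on) rather than in the order they were entered. Depending on what you use the variables for, this may or may not matter. A map lets you quickly search for a specific identifier later, but if you care about input order you can use a vector instead.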
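
If input order matters, or if an identifier can repeat within a module (as E does in 3 A 8000 E 1777 E 2001, where a map would keep only the last value), the same loop can store pairs in a vector instead. A minimal sketch, assuming the same guess as above that each module starts with a pair count; Module and read_modules are names made up for this example.

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type and function names, for illustration only.
using Module = std::vector<std::pair<std::string, int>>;

// Reads modules of the form "<count> <name> <value> ..." until the count
// can no longer be read (for example at "q" or end of input).
// Returns false if a module is cut short or a value is not a number.
bool read_modules(std::istream& in, std::vector<Module>& modules)
{
    int count;
    while (in >> count) {
        Module m;
        for (int i = 0; i < count; ++i) {
            std::string name;
            int value;
            if (!(in >> name >> value))
                return false;              // malformed or truncated module
            m.emplace_back(name, value);   // keeps input order and duplicates
        }
        modules.push_back(std::move(m));
    }
    return true;
}
```

Because operator>> skips any run of spaces, tabs and newlines, this can be fed std::cin directly or a std::istringstream holding the messy example from the question.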


2016-03-07 05:38:10

I think this is the way to go; it makes sense and seems to answer the confusing question :). Well done!

You can use a regular expression:

#include <regex> 
#include <string> 
#include <iostream> 

int main() 
{ 
    // get data from file or user input etc. Here I have hardcoded it with 
    // some newlines just to show how it works. 
    std::string data = 
     R"(2 R 5001 E 4777 1 z 2  1 
     xy 2  3 A 8000 
     E 1777  E 2001)"; 

    // Unfortunately the amount of space involved makes this regex rather 
    // ugly, but basically "\s+" means to match at least one whitespace 
    // character (which includes newlines, tabs, and spaces) 
    std::regex moduleregex(R"(1\s+z\s+2\s+1\s+xy\s+2\s+3\s+([AMS])\s+(\d+)\s+E\s+(\d+)\s+E\s+(\d+))"); 

    std::smatch result; 
    if (std::regex_search(data, result, moduleregex)) 
    { 
     // Program will end up here if the match was successful 
     std::string op = result[1]; 
     int operand1 = std::stoi(result[2]); 
     int operand2 = std::stoi(result[3]); 
     int operand3 = std::stoi(result[4]); 

     // based on the input above: 
     // "op" now contains "A" (it could be "M" or "S" depending on input) 
     // "operand1" now contains 8000 
     // "operand2" now contains 1777 
     // "operand3" now contains 2001 
    } 
    else 
    { 
     std::cerr << "Could not find module information in input" << std::endl; 
    } 
}

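Note that there is no error checking here other than whether the input matches the regular expression. You'll want to wrap the code in a try/catch block and catch std::out_of_range, which is thrown if a number in the input is too large for an int (you can also use long with std::stol, or long long with std::stoll, if you need to support a larger range). It also matches only numbers without a sign. Matching negative numbers is left as an exercise for the reader!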
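
The error handling described above could be sketched as follows. parse_int and find_operation are hypothetical helpers, not part of the answer's code: the first wraps std::stoi and reports failure instead of letting the exception escape, and the second uses the pattern -?\d+ as one possible way to also accept a leading minus sign.

```cpp
#include <cstddef>
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts text to int, returning false instead of
// throwing when the text is not a number or does not fit in an int.
bool parse_int(const std::string& text, int& out)
{
    try {
        std::size_t used = 0;
        int value = std::stoi(text, &used);
        if (used != text.size())
            return false;          // trailing characters, e.g. "12abc"
        out = value;
        return true;
    } catch (const std::invalid_argument&) {
        return false;              // no digits at all
    } catch (const std::out_of_range&) {
        return false;              // value too large or too small for int
    }
}

// Hypothetical helper: finds the first "<op> <number>" pair, where op is
// A, S or M and the number may be negative. Returns false if nothing
// matches or the number does not fit in an int.
bool find_operation(const std::string& data, char& op, int& operand)
{
    static const std::regex pattern(R"(([AMS])\s+(-?\d+))");
    std::smatch result;
    if (!std::regex_search(data, result, pattern))
        return false;
    op = result[1].str()[0];
    return parse_int(result[2].str(), operand);
}
```

Returning a bool keeps the calling code free of try/catch blocks; whether that is preferable to letting the exception propagate depends on how the rest of the program reports errors.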


2016-03-07 06:05:14 dreamlax
