2017-04-12

Boost.Spirit: parsing an optional prefix

I am trying to parse a string of space-delimited, optionally tagged keywords. For example:

descr:expense type:receivable customer 27.3 

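where the expression before the colon is the tag, and the tag is optional (i.e., a default tag is assumed when it is missing).

I can't quite get the parser to do what I want. I made some minor modifications to the canonical example, whose purpose is to parse key/value pairs (much like an HTTP query string).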

typedef std::pair<boost::optional<std::string>, std::string> pair_type; 
typedef std::vector<pair_type> pairs_type; 

template <typename Iterator>
struct field_value_sequence_default_field
    : qi::grammar<Iterator, pairs_type()>
{
    field_value_sequence_default_field()
        : field_value_sequence_default_field::base_type(query)
    {
        query = pair >> *(qi::lit(' ') >> pair);
        pair  = -(field >> ':') >> value;
        field = +qi::char_("a-zA-Z0-9");
        value = +qi::char_("a-zA-Z0-9+-\\.");
    }

    qi::rule<Iterator, pairs_type()> query;
    qi::rule<Iterator, pair_type()> pair;
    qi::rule<Iterator, std::string()> field, value;
};

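However, when I parse the input and a tag is omitted, the optional<string> is not empty/false. Instead, it holds a copy of the value. The second part of the pair holds the value as well.

If the untagged keyword cannot be a tag (because of the grammar rules, e.g. it contains a decimal point), then things work as I expect.

What am I doing wrong? Is this a conceptual mistake with PEGs?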

Answer

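"Instead, it holds a copy of the value. The second part of the pair holds the value as well."

This is the common pitfall with container attributes and backtracking. When field matches but the following ':' does not, the parser backtracks in the input, but the characters already appended to the string attribute are not rolled back. Use qi::hold, which lets the subparser work on a copy of the attribute and commits it only on success (see e.g. Understanding Boost.spirit's string parser):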

pair = -qi::hold[field >> ':'] >> value; 
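
To make the mechanism concrete, here is a minimal sketch that uses only the standard library. It is not Spirit's implementation: the function names parse_alnum_plus, parse_tag_no_hold and parse_tag_hold are hypothetical stand-ins for `+qi::char_(...)`, `field >> ':'` and `qi::hold[field >> ':']`, and they only imitate the way a container attribute is filled and (not) rolled back.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for `+qi::char_("a-zA-Z0-9")`: every matched
// character is appended directly to the caller's attribute.
bool parse_alnum_plus(const char*& it, const char* end, std::string& attr) {
    const char* start = it;
    while (it != end && std::isalnum(static_cast<unsigned char>(*it)))
        attr.push_back(*it++);
    return it != start;
}

// Hypothetical stand-in for `field >> ':'` without hold: on failure the
// input position is restored, but what was appended to `attr` stays there.
bool parse_tag_no_hold(const char*& it, const char* end, std::string& attr) {
    const char* save = it;
    if (parse_alnum_plus(it, end, attr) && it != end && *it == ':') {
        ++it;  // consume the colon
        return true;
    }
    it = save;  // backtrack in the input only; the attribute is left dirty
    return false;
}

// Hypothetical stand-in for `qi::hold[field >> ':']`: parse into a copy of
// the attribute and swap it in only when the whole subparser succeeds.
bool parse_tag_hold(const char*& it, const char* end, std::string& attr) {
    std::string copy = attr;
    if (parse_tag_no_hold(it, end, copy)) {
        attr.swap(copy);
        return true;
    }
    return false;  // `attr` is untouched
}
```

For the untagged input `customer`, parse_tag_no_hold fails but leaves `customer` in the attribute, which mirrors the stray copy of the value reported in the question. parse_tag_hold fails and leaves the attribute empty, at the cost of one extra string copy per attempt.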

Full example (Live On Coliru):

#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <boost/optional/optional_io.hpp>
#include <iostream>
#include <string>
#include <vector>

namespace qi = boost::spirit::qi;

typedef std::pair<boost::optional<std::string>, std::string> pair_type;
typedef std::vector<pair_type> pairs_type;

template <typename Iterator>
struct Grammar : qi::grammar<Iterator, pairs_type()>
{
    Grammar() : Grammar::base_type(query) {
        query = pair % ' ';
        // hold[] keeps a failed tag attempt out of the attribute
        pair  = -qi::hold[field >> ':'] >> value;
        field = +qi::char_("a-zA-Z0-9");
        value = +qi::char_("a-zA-Z0-9+-\\.");
    }
  private:
    qi::rule<Iterator, pairs_type()> query;
    qi::rule<Iterator, pair_type()> pair;
    qi::rule<Iterator, std::string()> field, value;
};

int main()
{
    using It = std::string::const_iterator;

    for (std::string const input : {
            "descr:expense type:receivable customer 27.3",
            "expense type:receivable customer 27.3",
            "descr:expense receivable customer 27.3",
            "expense receivable customer 27.3",
        }) {
        It f = input.begin(), l = input.end();

        std::cout << "==== '" << input << "' =============\n";
        pairs_type data;
        if (qi::parse(f, l, Grammar<It>(), data)) {
            std::cout << "Parsed: \n";
            for (auto& p : data) {
                // optional_io prints "--" for an empty optional
                std::cout << p.first << "\t->'" << p.second << "'\n";
            }
        } else {
            std::cout << "Parse failed\n";
        }

        // report any input the grammar did not consume
        if (f != l)
            std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
    }
}

Prints:

==== 'descr:expense type:receivable customer 27.3' ============= 
Parsed: 
descr ->'expense' 
type ->'receivable' 
-- ->'customer' 
-- ->'27.3' 
==== 'expense type:receivable customer 27.3' ============= 
Parsed: 
-- ->'expense' 
type ->'receivable' 
-- ->'customer' 
-- ->'27.3' 
==== 'descr:expense receivable customer 27.3' ============= 
Parsed: 
descr ->'expense' 
-- ->'receivable' 
-- ->'customer' 
-- ->'27.3' 
==== 'expense receivable customer 27.3' ============= 
Parsed: 
-- ->'expense' 
-- ->'receivable' 
-- ->'customer' 
-- ->'27.3'