2014-09-03 104 views
2

Memory leak when using regex.h?

A minimal code example is as follows:

#include <cstdlib> 
#include <iostream> 
#include <string>
#include <vector>
#include <regex.h> 

using namespace std; 

class regex_result { 
public: 
    /** Contains indices of starting positions of matches.*/ 
    std::vector<int> positions; 
    /** Contains lengths of matches.*/ 
    std::vector<int> lengths; 
}; 

regex_result match_regex(string regex_string, const char* string) { 
    regex_result result; 
    regex_t* regex = new regex_t; 
    regcomp(regex, regex_string.c_str(), REG_EXTENDED); 
    /* "pointer" points into the string, at the end of the
     previous match. */
    const char* pointer = string; 
    /* "n_matches" is the maximum number of matches allowed. */ 
    const int n_matches = 10; 
    regmatch_t matches[n_matches]; 
    int nomatch = 0; 
    while (!nomatch) {
        nomatch = regexec(regex, pointer, n_matches, matches, 0);
        if (nomatch)
            break;
        for (int i = 0; i < n_matches; i++) {
            int start, finish;
            if (matches[i].rm_so == -1) {
                break;
            }
            start = matches[i].rm_so + (pointer - string);
            finish = matches[i].rm_eo + (pointer - string);
            result.positions.push_back(start);
            result.lengths.push_back(finish - start);
        }
        pointer += matches[0].rm_eo;
    }
    delete regex; 
    return result; 
} 

int main(int argc, char** argv) { 
    string str = "this is a test"; 
    string pat = "this"; 
    regex_result res = match_regex(pat, str.c_str()); 
    cout << res.positions.size() << endl; 
    return 0; 
} 

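So I wrote a function that searches a given string for matches of a regular expression. The result is stored in a class that is essentially two vectors: one for the starting positions of the matches and one for the corresponding match lengths.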

This works fine, but when I run valgrind it reports some substantial memory leaks.

Running valgrind --leak-check=full on the code above, I get:

==24843== Memcheck, a memory error detector 
==24843== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al. 
==24843== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info 
==24843== Command: ./test 
==24843== 
1 
==24843== 
==24843== HEAP SUMMARY: 
==24843==  in use at exit: 11,688 bytes in 37 blocks 
==24843== total heap usage: 54 allocs, 17 frees, 12,868 bytes allocated 
==24843== 
==24843== 256 bytes in 1 blocks are definitely lost in loss record 14 of 18 
==24843== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) 
==24843== by 0x543549A: regcomp (regcomp.c:487) 
==24843== by 0x400ED0: match_regex(std::string, char const*) (in <path>) 
==24843== by 0x4010CA: main (in <path>) 
==24843== 
==24843== 11,432 (224 direct, 11,208 indirect) bytes in 1 blocks are definitely lost in  loss record 18 of 18 
==24843== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) 
==24843== by 0x4C2CF1F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) 
==24843== by 0x5434BAF: re_compile_internal (regcomp.c:760) 
==24843== by 0x54354FF: regcomp (regcomp.c:506) 
==24843== by 0x400ED0: match_regex(std::string, char const*) (in <path>) 
==24843== by 0x4010CA: main (in <path>) 
==24843== 
==24843== LEAK SUMMARY: 
==24843== definitely lost: 480 bytes in 2 blocks 
==24843== indirectly lost: 11,208 bytes in 35 blocks 
==24843==  possibly lost: 0 bytes in 0 blocks 
==24843== still reachable: 0 bytes in 0 blocks 
==24843==   suppressed: 0 bytes in 0 blocks 
==24843== 
==24843== For counts of detected and suppressed errors, rerun with: -v 
==24843== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0) 

Is my code wrong, or is there really a bug in these files?

Answers

4

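Your regex_t doesn't need to be dynamically allocated. That isn't directly related to your problem, but it is a bit odd. The real problem is that you never call regfree() on the compiled expression once compilation has succeeded (and you should verify that it did). You should set up your regex like this: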

regex_t regex; 
int res = regcomp(&regex, regex_string.c_str(), REG_EXTENDED); 
if (res == 0) 
{ 
    // use your expression via &regex 
    .... 

    // and eventually free it when done. 
    regfree(&regex); 
} 
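
For illustration, here is a sketch of how the whole function from the question might look with that fix applied. It is not the asker's final code: the empty-match guard, the exception on a failed regcomp(), and the names max_groups and cursor are my own assumptions, and it sticks to C++03.

```cpp
#include <regex.h>   // POSIX regcomp / regexec / regerror / regfree
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Same layout as the result class in the question.
struct regex_result {
    std::vector<int> positions;  // start offset of each match or sub-match
    std::vector<int> lengths;    // length of each match or sub-match
};

regex_result match_regex(const std::string& pattern, const char* text) {
    regex_t regex;  // automatic storage: no new/delete required
    const int rc = regcomp(&regex, pattern.c_str(), REG_EXTENDED);
    if (rc != 0) {
        // Compilation failed: report why instead of calling regexec().
        char message[256];
        regerror(rc, &regex, message, sizeof message);
        throw std::runtime_error(message);
    }

    regex_result result;
    const std::size_t max_groups = 10;  // whole match plus up to 9 sub-matches
    regmatch_t matches[max_groups];
    const char* cursor = text;          // end of the previous match

    while (regexec(&regex, cursor, max_groups, matches, 0) == 0) {
        const int offset = static_cast<int>(cursor - text);
        for (std::size_t i = 0; i < max_groups && matches[i].rm_so != -1; ++i) {
            result.positions.push_back(offset + static_cast<int>(matches[i].rm_so));
            result.lengths.push_back(static_cast<int>(matches[i].rm_eo - matches[i].rm_so));
        }
        if (matches[0].rm_eo == 0) {
            break;  // assumption: stop on an empty match at the cursor to avoid looping forever
        }
        cursor += matches[0].rm_eo;
    }

    regfree(&regex);  // releases everything regcomp() allocated
    return result;
}
```

Calling match_regex("this", "this is a test") should yield a single match at position 0 with length 4, as in the question. Note that if push_back throws, regfree() is skipped in this version; a small destructor-based guard would close that gap.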

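If your implementation supports it, I strongly recommend using the <regex> library provided by C++11, since its RAII design takes care of most of this for you.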
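
As a rough sketch of what that could look like, assuming a standard library with a working <regex> (libstdc++ only gained one in GCC 4.9): the names match_span and find_matches are made up for this example, and unlike the question's code it records whole matches only, not sub-matches.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical result type: one entry per whole match.
struct match_span {
    std::size_t position;
    std::size_t length;
};

std::vector<match_span> find_matches(const std::string& pattern, const std::string& text) {
    // std::regex owns its compiled state and releases it in its destructor,
    // so there is nothing to free by hand. A bad pattern throws std::regex_error.
    const std::regex re(pattern, std::regex::extended);

    std::vector<match_span> spans;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it) {
        spans.push_back({static_cast<std::size_t>(it->position(0)),
                         static_cast<std::size_t>(it->length(0))});
    }
    return spans;
}
```

std::regex::extended selects POSIX extended syntax, matching the REG_EXTENDED flag in the question; dropping it gives the default ECMAScript grammar instead.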

Ah, thanks. I accepted your answer even though you were slightly later, because you provided the additional information. – kunterbunt 2014-09-03 15:35:05

At the moment C++11 isn't an option, which is why I'm doing it this way. – kunterbunt 2014-09-03 15:45:52

2

You have to call regfree() to release the memory allocated by regcomp().
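
If C++11 is not available, one way to make sure that call is never forgotten is a small wrapper whose destructor does it. This is only a sketch: compiled_regex and contains_match are hypothetical names, not part of regex.h, and the code is written for C++03.

```cpp
#include <regex.h>   // POSIX regcomp / regexec / regerror / regfree
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical RAII guard: pairs every successful regcomp() with a regfree().
class compiled_regex {
public:
    explicit compiled_regex(const std::string& pattern, int flags = REG_EXTENDED) {
        const int rc = regcomp(&regex_, pattern.c_str(), flags);
        if (rc != 0) {
            char message[256];
            regerror(rc, &regex_, message, sizeof message);
            // Throwing here means the destructor never runs, so regfree()
            // is only ever called on a successfully compiled expression.
            throw std::runtime_error(message);
        }
    }

    ~compiled_regex() { regfree(&regex_); }

    const regex_t* get() const { return &regex_; }

private:
    // Non-copyable (C++03 idiom): declared but never defined.
    compiled_regex(const compiled_regex&);
    compiled_regex& operator=(const compiled_regex&);

    regex_t regex_;
};

// Hypothetical helper: true if text contains a match for pattern.
bool contains_match(const std::string& pattern, const std::string& text) {
    compiled_regex re(pattern);  // freed automatically on every exit path
    return regexec(re.get(), text.c_str(), 0, NULL, 0) == 0;
}
```

With this, the regfree() call happens on every path out of the scope, including when an exception is thrown while collecting results.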