Double to string, without scientific notation or trailing zeros, efficiently

This routine is called billions of times to create large CSV files full of numbers. Is there a more efficient way to do it?

#include <iomanip>
#include <sstream>
#include <string>

static std::string dbl2str(double d)
{
    std::stringstream ss;
    ss << std::fixed << std::setprecision(10) << d;              // convert double to string with fixed notation, high precision
    std::string s = ss.str();                                    // output to std::string
    s.erase(s.find_last_not_of('0') + 1, std::string::npos);     // remove trailing zeros (123.1200 => 123.12, 123.000 => 123.)
    return (s[s.size()-1] == '.') ? s.substr(0, s.size()-1) : s; // remove dangling decimal point (123. => 123)
}

Source

2013-03-01 tpascale

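The title looks wrong; shouldn't it be double to string? – hyde 2013-03-01 20:27:20

Oops, the title was backwards... of course it is double to string – tpascale 2013-03-02 13:34:45

Possible duplicate of [Formatting n significant digits in C++ without scientific notation](http://stackoverflow.com/questions/17211122/formatting-n-significant-digits-in-c-without-scientific-notation) – mirams 2016-10-20 08:00:52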

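Before you start, check whether a significant amount of time is spent in this function. Do this by measuring, with a profiler or otherwise. Knowing that you call it billions of times is all very well, but if it turns out that your program spends only 1% of its time in this function, then nothing you do here can improve your program's performance by more than 1%. If that is the case, the answer to your question is "for your purposes no, this function cannot be made significantly more efficient, and you are wasting your time if you try".

First thing, avoid s.substr(0, s.size()-1). It copies most of the string and it makes your function ineligible for NRVO, so I think in general you will get a copy on return. So the first change I would make is to replace the last line with: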

if(s[s.size()-1] == '.') { 
    s.erase(s.end()-1); 
} 
return s;
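
As a rough illustration of the "measure first" advice above, the sketch below times repeated calls with std::chrono. The helper name time_dbl2str and the input values are made up for this example, and a profiler run on the real program will tell you far more than a micro-benchmark like this one.

```cpp
#include <chrono>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// The function from the question, with the last line replaced as suggested above.
static std::string dbl2str(double d)
{
    std::stringstream ss;
    ss << std::fixed << std::setprecision(10) << d;
    std::string s = ss.str();
    s.erase(s.find_last_not_of('0') + 1, std::string::npos);
    if (s[s.size()-1] == '.') {
        s.erase(s.end()-1);
    }
    return s;
}

// Hypothetical helper: seconds spent on `calls` conversions.
// `sink` accumulates the output lengths so the calls cannot be optimized away.
static double time_dbl2str(std::size_t calls, std::size_t &sink)
{
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < calls; ++i) {
        sink += dbl2str(123.12 + static_cast<double>(i)).size();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Comparing the number this returns for a realistic call count with the total run time of the program gives a first idea of whether the function is worth optimizing at all.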

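However, if performance is a serious concern, this is how I would do it. I don't promise that it is the fastest possible, but it avoids some unnecessary allocation and copying. Any approach involving stringstream has to copy the result out of the stringstream, so we want a lower-level operation: snprintf.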

#include <cstdio>
#include <string>

static std::string dbl2str(double d)
{
    // first call only measures: it returns the length without the nul terminator
    size_t len = std::snprintf(0, 0, "%.10f", d);
    std::string s(len+1, 0);
    // technically non-portable, see below
    std::snprintf(&s[0], len+1, "%.10f", d);
    // remove nul terminator
    s.pop_back();
    // remove trailing zeros
    s.erase(s.find_last_not_of('0') + 1, std::string::npos);
    // remove trailing point
    if(s.back() == '.') {
        s.pop_back();
    }
    return s;
}

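The second call to snprintf assumes that std::string uses contiguous storage. This is guaranteed in C++11. It is not guaranteed in C++03, but it is true of every actively maintained implementation of std::string known to the C++ committee. If performance really matters, I think it is reasonable to make that non-portable assumption, since writing directly into the string saves copying into a string later.

s.pop_back() is the C++11 way of saying s.erase(s.end()-1), and s.back() is s[s.size()-1].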

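As another possible improvement, you could get rid of the first call to snprintf and instead size s to some value like std::numeric_limits<double>::max_exponent10 + 14 (basically, the length that -DBL_MAX needs). The trouble is that this allocates and zeros far more memory than is usually needed (322 bytes for an IEEE double). My intuition is that this would be slower than the first call to snprintf, not to mention wasteful of memory if the caller keeps the returned string around for a while. But you can always test it.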
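
For testing that idea against the two-call version, a minimal sketch of the single-call variant could look like this. The name dbl2str_maxbuf is made up here, and the buffer size is simply the max_exponent10 + 14 figure from the paragraph above.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <limits>
#include <string>

static std::string dbl2str_maxbuf(double d)
{
    // Large enough for -DBL_MAX printed with "%.10f", nul terminator included.
    const std::size_t cap = std::numeric_limits<double>::max_exponent10 + 14;
    std::string s(cap, '\0');   // allocates and zeros the worst-case size on every call
    // relies on contiguous std::string storage (guaranteed since C++11)
    const int len = std::snprintf(&s[0], cap, "%.10f", d);
    if (len < 0) {
        return std::string();   // formatting failed
    }
    // keep only the characters actually written (this also drops the nul terminator)
    s.resize(std::min(static_cast<std::size_t>(len), cap - 1));
    // remove trailing zeros
    s.erase(s.find_last_not_of('0') + 1, std::string::npos);
    // remove trailing point
    if (!s.empty() && s.back() == '.') {
        s.pop_back();
    }
    return s;
}
```

Whether the saved snprintf call outweighs the larger allocation is exactly what would need to be measured.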

Alternatively, std::max((int)std::log10(d), 0) + 14 computes a reasonably tight upper bound on the size needed, and might be quicker than snprintf can compute it exactly.
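
A hedged sketch of that log10-based bound follows. The function names are hypothetical, and several details are assumptions added for this example rather than part of the suggestion itself: it takes the absolute value first (log10 of a negative number is NaN), skips log10 for values below 1, treats non-finite values separately, and reserves one spare byte because rounding can carry into an extra integer digit.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: bytes that "%.10f" needs for d, nul terminator included.
static std::size_t dbl2str_bound(double d)
{
    if (!std::isfinite(d)) {
        return 16;                  // "inf", "-inf", "nan" and similar are short
    }
    const double a = std::fabs(d);  // assumption: work on |d|
    // Below 1 there is a single integer digit; this also avoids log10(0).
    const int exp10 = (a >= 1.0) ? static_cast<int>(std::log10(a)) : 0;
    // 14 = sign + first digit + '.' + 10 decimals + nul, as in the text;
    // +1 spare byte (assumption) for rounding that carries into a new digit.
    return static_cast<std::size_t>(std::max(exp10, 0)) + 14 + 1;
}

static std::string dbl2str_bounded(double d)
{
    std::string s(dbl2str_bound(d), '\0');
    const int len = std::snprintf(&s[0], s.size(), "%.10f", d);
    if (len < 0) {
        return std::string();       // formatting failed
    }
    // keep only the characters actually written
    s.resize(std::min(static_cast<std::size_t>(len), s.size() - 1));
    // remove trailing zeros, then a trailing point
    s.erase(s.find_last_not_of('0') + 1, std::string::npos);
    if (!s.empty() && s.back() == '.') {
        s.pop_back();
    }
    return s;
}
```

The spare byte matters for inputs such as -9.99999999999, which prints as -10.0000000000 and so has one more integer digit than log10 suggests.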

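Finally, you may be able to improve performance by changing the function's interface. For example, instead of returning a new string, you could append to a string passed in by the caller: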

#include <cstdio>
#include <string>

void append_dbl2str(std::string &s, double d) {
    size_t len = std::snprintf(0, 0, "%.10f", d);
    size_t oldsize = s.size();
    s.resize(oldsize + len + 1);
    // technically non-portable
    std::snprintf(&s[oldsize], len+1, "%.10f", d);
    // remove nul terminator
    s.pop_back();
    // remove trailing zeros
    s.erase(s.find_last_not_of('0') + 1, std::string::npos);
    // remove trailing point
    if(s.back() == '.') {
        s.pop_back();
    }
}

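Then the caller can reserve() plenty of space, call your function several times (presumably with other string appends in between), and write the resulting block of data to the file in one go, with no memory allocation other than the reserve. "Plenty" doesn't have to mean the whole file; it could be one line or one "paragraph" at a time, but anything that avoids billions of memory allocations is a potential performance boost.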
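
The caller side described above might look roughly like this. append_csv_line and build_csv_block are hypothetical helpers invented for the sketch, and the amount to reserve is left to the caller.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// append_dbl2str as shown above.
void append_dbl2str(std::string &s, double d) {
    std::size_t len = std::snprintf(nullptr, 0, "%.10f", d);
    std::size_t oldsize = s.size();
    s.resize(oldsize + len + 1);
    std::snprintf(&s[oldsize], len + 1, "%.10f", d);
    s.pop_back();                                             // remove nul terminator
    s.erase(s.find_last_not_of('0') + 1, std::string::npos);  // remove trailing zeros
    if (s.back() == '.') {
        s.pop_back();                                         // remove trailing point
    }
}

// Hypothetical helper: append one comma-separated line for `row` to `out`.
void append_csv_line(std::string &out, const std::vector<double> &row) {
    for (std::size_t i = 0; i < row.size(); ++i) {
        if (i != 0) {
            out += ',';
        }
        append_dbl2str(out, row[i]);
    }
    out += '\n';
}

// Hypothetical helper: format all rows into one block after a single reserve().
std::string build_csv_block(const std::vector<std::vector<double>> &rows,
                            std::size_t reserve_bytes) {
    std::string block;
    block.reserve(reserve_bytes);  // ideally the only allocation, if the guess is big enough
    for (const auto &row : rows) {
        append_csv_line(block, row);
    }
    return block;
}
```

The returned block can then be written with a single fwrite or ostream::write call. If reserve_bytes is too small the string simply grows as usual, so a poor guess costs speed, not correctness.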

Source

2013-03-01 21:31:01

Thanks for this very detailed explanation – tpascale 2013-03-02 13:44:24

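Use snprintf and an array of char instead of stringstream and string.

Pass a pointer to a char buffer to dbl2str, into which it prints (to avoid the string copy constructor being called on return). Assemble the string to be printed in a char buffer (or convert the char buffer to a string at the call site, or append it to an existing string).

Declare the function inline in a header file.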

#include <cstdio>
inline void dbl2str(char *buffer, int bufsize, double d)
{
    /** the caller must make sure that there is enough memory allocated for buffer */
    /* "%lf" is the same as "%f": fixed notation with 6 decimals */
    int len = snprintf(buffer, bufsize, "%lf", d);

    /* error or truncated output: trimming would be unsafe, so return an empty string */
    if (len < 0 || len >= bufsize) {
        if (bufsize > 0)
            buffer[0] = 0;
        return;
    }

    /* len is the number of characters put into the buffer excluding the trailing \0
    so buffer[len] is the \0 and buffer[len-1] is the last 'visible' character */

    while (len >= 1 && buffer[len-1] == '0')
        --len;

    /* terminate the string where the last '0' character was or overwrite the existing
    0 if there was no '0' */
    buffer[len] = 0;

    /* check for a trailing decimal point */
    if (len >= 1 && buffer[len-1] == '.')
        buffer[len-1] = 0;
}
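
A possible way to use the buffer-based function when assembling a line, sketched under a few assumptions: append_field is a hypothetical helper, the 400-byte stack buffer is a guess meant to cover the longest "%lf" output for a double, and the copy of dbl2str below includes a check for truncated output.

```cpp
#include <cstdio>
#include <string>

// Copy of dbl2str from above, with a guard for truncated output.
inline void dbl2str(char *buffer, int bufsize, double d)
{
    int len = std::snprintf(buffer, bufsize, "%lf", d);
    if (len < 0 || len >= bufsize) {    // error or truncation: return an empty string
        if (bufsize > 0)
            buffer[0] = 0;
        return;
    }
    while (len >= 1 && buffer[len-1] == '0')
        --len;                          // skip trailing zeros
    buffer[len] = 0;
    if (len >= 1 && buffer[len-1] == '.')
        buffer[len-1] = 0;              // drop a trailing decimal point
}

// Hypothetical helper: append one comma-separated field to an existing line.
inline void append_field(std::string &line, double d)
{
    char buf[400];  // assumed big enough: up to 309 integer digits, sign, point, 6 decimals, nul
    dbl2str(buf, static_cast<int>(sizeof buf), d);
    if (!line.empty())
        line += ',';
    line += buf;    // a single copy into the destination string
}
```

Note that "%lf" keeps only 6 decimals, so a value such as 0.0000001 comes out as 0 with this version; a format like "%.10f" would match the precision used in the question.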

Source

2013-03-01 19:48:05

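The keyword *inline* does not directly affect optimization, because "inline" is an instruction to the linker that this function may appear multiple times in the link without that being an error. Besides, this function is already *static*. – hyde 2013-03-01 20:22:47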

Efficient in terms of speed, or of brevity?

char buf[64]; 
sprintf(buf, "%-.*G", 16, 1.0); 
cout << buf << endl;

Displays "1". Formats up to 16 significant digits, with no trailing zeros, before reverting to scientific notation.
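
To see where this format does and does not meet the "no scientific notation" requirement, the one-liner can be wrapped as below. dbl2str_g is a hypothetical name; the format string is the one from the snippet above.

```cpp
#include <cstdio>
#include <string>

// Hypothetical wrapper around the "%-.*G" format shown above.
static std::string dbl2str_g(double d)
{
    char buf[64];
    // 16 is the maximum number of significant digits; %G strips trailing zeros itself
    std::snprintf(buf, sizeof buf, "%-.*G", 16, d);
    return std::string(buf);
}
```

With this, 1.0 gives "1" and 123.12 gives "123.12", but %G switches to exponent form when the decimal exponent is below -4 or at least the precision, so values such as 1e20 or 0.00001 are printed with an exponent.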

Source

2013-12-22 12:04:24 BSalita

The - is not strictly necessary (it left-justifies) – 2014-12-02 03:24:45
