How do I create a custom random distribution function?

Normally I generate values using the built-in random functions, but now I need to create a random distribution of the form

f(x) = k*log(x) + m

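Is it possible to define a custom random distribution function for this? For my actual model, I have x = [1, 1.4e7), k = -0.905787102751, m = 14.913170454. Ideally, I would like it to work the way the current built-in distributions do: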

int main() 
{ 
    std::mt19937 generator; 

    std::uniform_real_distribution<> dist(0.0, 1.0); 
    my_distribution my_dist(0.0, 10.0); // Distribution using f(x) 

    double uni_val = dist(generator); 
    double log_val = my_dist(generator); 
}

Source

2017-04-26 pingul

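This is as much a math question as a C++ question. See, for example, https://en.wikipedia.org/wiki/Inverse_transform_sampling. – jwimberley

What is the domain? –

@YvesDaoust For the original question, it is between 1 and 1.4e7. I added an answer describing how I solved it. – pingul

I followed @jwimberley's idea almost to the letter and thought I would share my results here. I created a class that does the following: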

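Constructor arguments:
- The CDF (normalised or unnormalised), which is the integral of the PDF.
- The lower and upper bounds of the distribution.
- (Optional) A resolution indicating how many sample points of the CDF to take.
Compute a mapping from CDF value -> random number x. This is our inverse CDF function.
Generate a random point by:
- Generating a random probability p in [0, 1) with std::uniform_real_distribution.
- Looking up p in the mapping (the code uses std::upper_bound) and returning the corresponding x, optionally with linear interpolation between neighbouring samples.

Code: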

// sampled_distribution.hh
#ifndef SAMPLED_DISTRIBUTION
#define SAMPLED_DISTRIBUTION

#include <algorithm>
#include <vector>
#include <random>
#include <stdexcept>

template <typename T = double, bool Interpolate = true>
class Sampled_distribution
{
public:
    using CDFFunc = T (*)(T);

    Sampled_distribution(CDFFunc cdfFunc, T low, T high, unsigned resolution = 200)
        : mLow(low), mHigh(high), mRes(resolution), mDist(0.0, 1.0)
    {
        if (mLow >= mHigh) throw InvalidBounds();

        // Sample the CDF at mRes + 1 evenly spaced points in [mLow, mHigh].
        mSampledCDF.resize(mRes + 1);
        const T cdfLow = cdfFunc(low);
        const T cdfHigh = cdfFunc(high);
        T last_p = 0;
        for (unsigned i = 0; i < mSampledCDF.size(); ++i) {
            const T x = i/mRes*(mHigh - mLow) + mLow;
            const T p = (cdfFunc(x) - cdfLow)/(cdfHigh - cdfLow); // normalising
            if (! (p >= last_p)) throw CDFNotMonotonic();
            mSampledCDF[i] = Sample{p, x};
            last_p = p;
        }
    }

    template <typename Generator>
    T operator()(Generator& g)
    {
        T cdf = mDist(g);
        // Binary search for the first sample whose probability exceeds cdf.
        auto s = std::upper_bound(mSampledCDF.begin(), mSampledCDF.end(), cdf);
        // Rounding can leave the last sampled probability slightly below 1.
        if (s == mSampledCDF.end()) return mSampledCDF.back().value;
        auto bs = s - 1;
        if (Interpolate && bs >= mSampledCDF.begin()) {
            const T r = (cdf - bs->prob)/(s->prob - bs->prob);
            // r == 0 maps to bs->value and r == 1 to s->value.
            return (1 - r)*bs->value + r*s->value;
        }
        return s->value;
    }

private:
    struct InvalidBounds : public std::runtime_error
    {
        InvalidBounds() : std::runtime_error("lower bound must be below upper bound") {}
    };
    struct CDFNotMonotonic : public std::runtime_error
    {
        CDFNotMonotonic() : std::runtime_error("CDF is not monotonically increasing") {}
    };

    const T mLow, mHigh;
    const double mRes;

    struct Sample
    {
        T prob, value;
        // Lets std::upper_bound compare a probability against a sample.
        friend bool operator<(T p, const Sample& s) { return p < s.prob; }
    };

    std::vector<Sample> mSampledCDF;
    std::uniform_real_distribution<> mDist;
};

#endif

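Below are some plots of the results. For a given PDF, we first need to compute the CDF analytically by integration.

Log-linear

Sinusoidal

You can try this yourself with the following demo: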

// main.cc
#include "sampled_distribution.hh"
#include <cmath>
#include <iostream>
#include <fstream>

int main()
{
    auto logFunc = [](double x) {
        const double k = -1.0;
        const double m = 10;
        return x*(k*std::log(x) + m - k); // PDF(x) = k*log(x) + m
    };
    auto sinFunc = [](double x) { return x + std::cos(x); }; // PDF(x) = 1 - sin(x)

    std::mt19937 gen;
    //Sampled_distribution<> dist(logFunc, 1.0, 1e4);
    Sampled_distribution<> dist(sinFunc, 0.0, 6.28);
    std::ofstream file("d.txt");
    for (int i = 0; i < 100000; i++) file << dist(gen) << '\n';
}

The data is plotted with Python.

# dist_plot.py
import numpy as np
import matplotlib.pyplot as plt

d = np.loadtxt("d.txt")
fig, ax = plt.subplots()
bins = np.arange(d.min(), d.max(), (d.max() - d.min())/50)
ax.hist(d, edgecolor='white', bins=bins)
plt.show()

Run the demo with:

clang++ -std=c++11 -stdlib=libc++ main.cc -o main; ./main; python dist_plot.py

Source

2017-04-26 22:00:06 pingul

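There are a few things one could say about this code, but that really belongs on Code Review. – Walter

@Walter The post did not ask for a review. It is an answer on how I created a custom random distribution, answering my own question. I am really surprised by the downvote. – pingul

Your code is far from optimal. First, you should at least test the CDF for monotonicity. Second, you could implement a better method to invert it, for example using spline or polynomial interpolation. Third, if you ask the user for both the PDF and the CDF, you can invert the latter with Newton-Raphson, which can converge to machine precision. Finally, this is overkill for your original problem. – Walter

This is very possible, but it is as much a math problem as a C++ problem. The most general method for creating a pseudo-random number generator is inverse transform sampling. Essentially, the CDF of any PDF, evaluated at a value drawn from that PDF, is uniformly distributed between 0 and 1 (if this is not obvious, just remember that the value of the CDF is a probability and think about it). So you just need to sample a uniform random number between 0 and 1 and apply the inverse of the CDF. In your case, with $f(x) = k*log(x) + m$ (you did not specify the bounds, but I assume they are between 1 and some positive number > 1), the CDF and its inverse are pretty messy - a problem I leave to you! The implementation in C++ would look like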

double inverseCDF(double p, double k, double m, double lowerBound, double upperBound) { 
    // do math, which might include numerically finding roots of equations
}
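The "do math" step can be made concrete by first integrating the PDF. The following is a sketch of that derivation, writing $a$ and $b$ for lowerBound and upperBound (symbols introduced here for brevity) and assuming the PDF is non-negative on $[a, b]$:

```latex
\begin{align*}
F(x) &= \int_a^x \left(k \ln t + m\right) dt
      = \Big[\, t \left(k \ln t + m - k\right) \Big]_a^x \\
     &= x \left(k \ln x + m - k\right) - a \left(k \ln a + m - k\right), \\
P(x) &= \frac{F(x)}{F(b)}, \qquad a \le x \le b .
\end{align*}
```

Solving $P(x) = p$ for $x$ is a transcendental equation, because $x$ appears both on its own and inside the logarithm. That is why a numerical root finder is a natural fit for inverseCDF.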

The generating code would then look like

class my_distribution { 
    // ... constructor, private variables, etc. 
    template< class Generator > 
    double operator()(Generator& g) { 
      std::uniform_real_distribution<> dist(0.0, 1.0); 
      double cdf = dist(g); 
      return inverseCDF(cdf,this->k,this->m,this->lowerBound,this->upperBound); 
    } 
};
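One possible way to fill in inverseCDF for $f(x) = k*log(x) + m$ is plain bisection on the CDF. This is only a minimal sketch, not jwimberley's code: the helper logCdf, the class name LogDistribution, the iteration cap, and the tolerance are all assumptions made for illustration, and the PDF is assumed to be non-negative on the whole range so that the CDF is non-decreasing.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Unnormalised CDF of the PDF f(x) = k*log(x) + m, measured from lowerBound.
// Uses the antiderivative x*(k*log(x) + m - k). (Hypothetical helper.)
inline double logCdf(double x, double k, double m, double lowerBound) {
    auto F = [k, m](double t) { return t * (k * std::log(t) + m - k); };
    return F(x) - F(lowerBound);
}

// Inverse CDF by bisection: find x with logCdf(x) / logCdf(upperBound) == p.
// Assumes the PDF is non-negative on [lowerBound, upperBound].
inline double inverseCDF(double p, double k, double m,
                         double lowerBound, double upperBound) {
    assert(lowerBound > 0.0 && lowerBound < upperBound);
    const double target = p * logCdf(upperBound, k, m, lowerBound);
    double lo = lowerBound, hi = upperBound;
    // Halve the bracket until it is tiny relative to its position.
    for (int i = 0; i < 200 && hi - lo > 1e-12 * hi; ++i) {
        const double mid = 0.5 * (lo + hi);
        if (logCdf(mid, k, m, lowerBound) < target) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}

// Hypothetical distribution class in the shape sketched above.
class LogDistribution {
public:
    LogDistribution(double k, double m, double lowerBound, double upperBound)
        : k_(k), m_(m), lo_(lowerBound), hi_(upperBound) {}

    template <class Generator>
    double operator()(Generator& g) {
        // Draw p uniformly in [0, 1) and map it through the inverse CDF.
        return inverseCDF(uniform_(g), k_, m_, lo_, hi_);
    }

private:
    double k_, m_, lo_, hi_;
    std::uniform_real_distribution<double> uniform_{0.0, 1.0};
};
```

It could be used as `LogDistribution dist(k, m, 1.0, 1.4e7); double x = dist(generator);`. Bisection is slower than Newton-Raphson but needs only the CDF and cannot leave the bracket, which keeps the sketch short.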

Source

2017-04-26 12:41:54 jwimberley

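This is good advice and set me on the right path. Upvoted. I added an answer outlining how I implemented it; is this what you had in mind? If you have any suggestions for improvement, please share them. – pingul

As pointed out elsewhere, the standard method for sampling any PDF is to invert its CDF at a point picked uniformly at random from the interval [0,1].

In your particular problem, the CDF is a simple function, but its inverse is not. In that case, it can be inverted using traditional numerical tools, such as Newton-Raphson iteration. Unfortunately, you failed to specify the range of x or the allowed choices for the parameters m and k. I have implemented this for general m, k, and range (and posted it on Code Review) to satisfy the C++ RandomNumberDistribution concept.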
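To illustrate the Newton-Raphson idea, here is a minimal sketch that inverts a CDF given both the CDF and the PDF (its derivative). It is not the implementation posted on Code Review. The function name invertCdfNewton, the bisection fallback, the iteration cap, and the relative tolerance are assumptions chosen for illustration. The relative tolerance also assumes the root is not at x = 0.

```cpp
#include <cassert>
#include <cmath>

// Solve cdf(x) == target for x in [lo, hi] using Newton-Raphson steps
// x_next = x - (cdf(x) - target) / pdf(x). The root stays bracketed, and a
// bisection step is taken whenever the Newton step would leave the bracket
// or the PDF is not positive at x. Assumes cdf is non-decreasing on [lo, hi].
template <class Cdf, class Pdf>
double invertCdfNewton(Cdf cdf, Pdf pdf, double target, double lo, double hi) {
    assert(lo < hi);
    double x = 0.5 * (lo + hi);
    for (int i = 0; i < 200; ++i) {
        const double err = cdf(x) - target;
        if (err == 0.0) return x;
        if (err > 0.0) hi = x; else lo = x;   // shrink the bracket around the root
        double next = 0.5 * (lo + hi);        // bisection fallback
        const double slope = pdf(x);
        if (slope > 0.0) {
            const double newton = x - err / slope;
            if (newton >= lo && newton <= hi) next = newton;
        }
        // Stop once the update is negligible relative to x.
        if (std::fabs(next - x) <= 1e-14 * std::fabs(x)) return next;
        x = next;
    }
    return x;
}
```

For the distribution in the question, cdf would be the unnormalised integral x*(k*log(x) + m - k) minus its value at the lower bound, pdf would be k*log(x) + m, and target would be a uniform random number in [0, 1) multiplied by the CDF value at the upper bound.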

Source

2017-04-27 14:52:00 Walter
