<random>: Implement modified ziggurat algorithm for normal_distribution#6104
<random>: Implement modified ziggurat algorithm for normal_distribution#6104statementreply wants to merge 16 commits into
<random>: Implement modified ziggurat algorithm for normal_distribution#6104Conversation
|
|
||
| _STD_BEGIN | ||
|
|
||
| template <class _Ty, class _Uty, bool _Signed, int _Lx> |
There was a problem hiding this comment.
Can we remove _Signed is it's always true?
| template <class _Ty, class _Uty, bool _Signed, int _Lx> | |
| template <class _Ty, class _Uty, int _Lx> |
There was a problem hiding this comment.
It's reserved for exponential_distribution, where _Signed would be false.
| static_assert(_Lx <= 254, "invalid table size"); | ||
|
|
||
| using _Uint_type = _Uty; | ||
| using _Xtype = conditional_t<_Signed, make_signed_t<_Uty>, _Uty>; |
There was a problem hiding this comment.
| using _Xtype = conditional_t<_Signed, make_signed_t<_Uty>, _Uty>; | |
| using _Xtype = make_signed_t<_Uty>; |
Also, would it be better to rename it to _Sint_type?
| using _Xtype = conditional_t<_Signed, make_signed_t<_Uty>, _Uty>; | ||
|
|
||
| static constexpr int _Layer_num = _Lx; | ||
| static constexpr _Ty _Width_scale = (_Signed ? _Ty{1} : _Ty{2}) * _Ty{(numeric_limits<_Uty>::max)() / 2u + 1u}; |
There was a problem hiding this comment.
Let's avoid numeric_limits.
| static constexpr _Ty _Width_scale = (_Signed ? _Ty{1} : _Ty{2}) * _Ty{(numeric_limits<_Uty>::max)() / 2u + 1u}; | |
| static constexpr _Ty _Width_scale = _Ty{static_cast<_Uty>(-1) / 2u + 1u}; |
Co-authored-by: A. Jiang <de34@live.cn>
| const _Fty _Ydiff = _Yy1 - _Yy0; | ||
|
|
||
| for (;;) { | ||
| const _Fty _Dx = _Xdiff * _STD _Nrand_impl<_Fty>(_Eng); |
There was a problem hiding this comment.
The upper bits of _Xbits could be used on the first iteration. Also for the tail, below.
| const _Fty _Xval = -_STD log(_Fty{1.0} - _STD _Nrand_impl<_Fty>(_Eng)) * _Tail_reciprocal; | ||
| const _Fty _Yval = _STD _Nrand_impl<_Fty>(_Eng); | ||
| if (_Yval < _STD exp(_Xval * _Xval * _Fty{-0.5})) { | ||
| const _Fty _Xorig = _Tail + _Xval; | ||
| return static_cast<_Ty>(_Xbits < 0 ? -_Xorig : _Xorig); | ||
| } |
There was a problem hiding this comment.
Marsaglia's tail method might be faster if we don't have a fast exponential generator, as it only needs sqrt when returning, instead of an additional log or exp. It can be found in Luc Devroye's book, section IX.1.2, or this DTIC report.
After using 1 bit for the sign and 24 bits for the magnitude of the random `float` from the 32-bit random value, we have only 7 unused bits.
This PR implements modified ziggurat algorithm for
normal_distribution, based on McFarland, C. D. (2015). A modified ziggurat algorithm for generating exponentially and normally distributed pseudorandom numbers.The main differences between this PR and the paper:
Benchmark results for x64 on Intel Core i5-13600KF:
BM_Generator<std::mt19937_64>BM_Generator<b_r::mt19937_64>BM_Distribution<std::mt19937_64, std_old::normal_distribution<double>>BM_Distribution<std::mt19937_64, std_new::normal_distribution<double>>BM_Distribution<std::mt19937_64, boost_r::normal_distribution<double>>BM_Distribution<b_r::mt19937_64, boost_r::normal_distribution<double>>BM_Generator<std::mt19937>BM_Generator<b_r::mt19937>BM_Distribution<std::mt19937, std_old::normal_distribution<float>>BM_Distribution<std::mt19937, std_new::normal_distribution<float>>BM_Distribution<std::mt19937, boost_r::normal_distribution<float>>BM_Distribution<b_r::mt19937, boost_r::normal_distribution<float>>Removed
tests\GH_004618_normal_distribution_avoids_resetswhich became moot.Fixes #1003.