Generates samples from a Zipf distribution (http://en.wikipedia.org/wiki/Zipf%27s_law).
#include <DiscreteRandomGeneratorZipf.h>
Public Member Functions
DiscreteRandomGeneratorZipf (double s=0.8)
Constructor; sets the "s" of the algorithm.
double | pmf (uint64_t val)
Returns the probability of generating val + 1 from this distribution. Uses val + 1 because the distribution's pmf is undefined at 0.
Public Member Functions inherited from Hypertable::DiscreteRandomGenerator
DiscreteRandomGenerator ()
Default constructor; sets up a random number generator with a constant seed value of 1.
virtual | ~DiscreteRandomGenerator ()
Destructor; cleans up allocated resources.
void | set_seed (uint32_t s)
Sets the seed for the random number generator.
void | set_value_count (uint64_t value_count)
Sets the size of the generated range.
void | set_pool_min (uint64_t pool_min)
Sets the lowest value of the desired distribution.
void | set_pool_max (uint64_t pool_max)
Sets the highest value of the desired distribution.
virtual uint64_t | get_sample ()
Returns a random sample from the distribution.
Private Attributes
bool | m_initialized
True if m_norm has been initialized.
double | m_s
The 's' of the zipfian algorithm.
double | m_norm
Helper for calculating the probability in pmf().
Additional Inherited Members
Protected Member Functions inherited from Hypertable::DiscreteRandomGenerator
virtual void | generate_cmf ()
Generates the cumulative mass function for the distribution.
Protected Attributes inherited from Hypertable::DiscreteRandomGenerator
std::mt19937 | m_random_engine {1}
The random number generator.
uint64_t | m_value_count {}
Number of values in the range.
uint64_t | m_pool_min {}
Lower bound of the range.
uint64_t | m_pool_max {}
Upper bound of the range.
uint64_t * | m_numbers {}
Array with the random samples.
double * | m_cmf {}
The cumulative mass of the distribution.
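The inherited members above suggest a cumulative-mass-function sampler: generate_cmf() fills m_cmf, and get_sample() draws from it using m_random_engine. The following is only a minimal sketch of that general technique (inverse-transform sampling over a discrete cmf), not the Hypertable implementation. The helper names build_cmf and sample_index are hypothetical, and the rescaling step is an assumption made so that the sketch works with a pmf that is only approximately normalized.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <random>
#include <vector>

// Hypothetical helper, not part of Hypertable: builds a normalized
// cumulative mass function from any pmf over the values [0, value_count).
std::vector<double> build_cmf(uint64_t value_count,
                              const std::function<double(uint64_t)> &pmf) {
  std::vector<double> cmf(value_count);
  double total = 0.0;
  for (uint64_t i = 0; i < value_count; ++i) {
    total += pmf(i);
    cmf[i] = total;
  }
  // Rescale so the last entry is exactly 1, even if the pmf itself is
  // only approximately normalized.
  for (double &c : cmf)
    c /= total;
  return cmf;
}

// Hypothetical helper: draws one index by inverting the cmf. Picks u in
// [0, 1) and returns the first index whose cumulative mass exceeds u.
// The cmf must be non-empty.
uint64_t sample_index(const std::vector<double> &cmf, std::mt19937 &engine) {
  std::uniform_real_distribution<double> uniform(0.0, 1.0);
  double u = uniform(engine);
  auto it = std::upper_bound(cmf.begin(), cmf.end(), u);
  if (it == cmf.end())
    --it;  // guard against rounding at the top end
  return static_cast<uint64_t>(it - cmf.begin());
}
```

Binary search over the cmf keeps each draw at O(log n) after a one-time O(n) setup, which is one plausible reason to precompute a cumulative array at all.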
Generates samples from a Zipf distribution (http://en.wikipedia.org/wiki/Zipf%27s_law).
Designed for the case where the parameter satisfies 0 < s < 1, under which the probability of the number k (i.e. of rank k) occurring is (www.icis.ntu.edu.sg/scs-ijit/1204/1204_6.pdf):
P_k = C / k^s, where C is approximated by (1 - s) / N^(1 - s)
According to the paper cited above, with the default s = 0.8 the most popular 20% of values occur with a cumulative probability of about 72% for a large number of samples.
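The 72% figure can be checked against the formula above with a short back-of-the-envelope derivation. This is a sketch that assumes N is large enough for the sum over ranks to be replaced by an integral. The symbol p (the fraction of most popular values, here 0.2) is introduced only for this illustration.

```latex
\[
\sum_{k=1}^{pN} P_k
  = C \sum_{k=1}^{pN} k^{-s}
  \approx C \int_{0}^{pN} x^{-s}\,dx
  = \frac{1-s}{N^{1-s}} \cdot \frac{(pN)^{1-s}}{1-s}
  = p^{\,1-s}
\]
\[
p = 0.2,\quad s = 0.8
  \quad\Longrightarrow\quad
  p^{\,1-s} = 0.2^{0.2} \approx 0.725
\]
```

Under this approximation the result does not depend on N, which is consistent with the claim being stated for the large-sample limit.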
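In this class, m_s replaces s. The original comment names m_C for C and m_max_val for N; judging by the member list, these roles appear to be played by m_norm and the inherited m_value_count.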
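As a concrete reading of the formula, the sketch below evaluates C and P_k directly. The function names zipf_norm, zipf_prob, and zipf_cumulative are hypothetical and do not appear in Hypertable. The sketch also assumes that N is simply the number of distinct ranks.

```cpp
#include <cmath>
#include <cstdint>

// Approximate normalization constant C = (1 - s) / N^(1 - s), for 0 < s < 1.
double zipf_norm(double s, uint64_t n) {
  return (1.0 - s) / std::pow(static_cast<double>(n), 1.0 - s);
}

// Probability of rank k (1-based): P_k = C / k^s.
double zipf_prob(uint64_t k, double s, uint64_t n) {
  return zipf_norm(s, n) / std::pow(static_cast<double>(k), s);
}

// Cumulative probability of the m most popular ranks: sum of P_k, k = 1..m.
double zipf_cumulative(uint64_t m, double s, uint64_t n) {
  const double c = zipf_norm(s, n);
  double total = 0.0;
  for (uint64_t k = 1; k <= m; ++k)
    total += c / std::pow(static_cast<double>(k), s);
  return total;
}
```

Because C is an approximation, the probabilities over all N ranks sum to somewhat less than 1 for finite N. A sampler built on these values would therefore need to renormalize, or accept the small shortfall.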
Definition at line 58 of file DiscreteRandomGeneratorZipf.h.
DiscreteRandomGeneratorZipf::DiscreteRandomGeneratorZipf (double s = 0.8)
Constructor; sets the "s" of the algorithm.
Definition at line 31 of file DiscreteRandomGeneratorZipf.cc.
double DiscreteRandomGeneratorZipf::pmf (uint64_t val) [virtual]
Returns the probability of generating val + 1 from this distribution.
Uses val + 1 because the distribution's pmf is undefined at 0. Works for the range [0, max_val].
Reimplemented from Hypertable::DiscreteRandomGenerator.
Definition at line 37 of file DiscreteRandomGeneratorZipf.cc.
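To illustrate the val + 1 shift together with a lazily computed normalization helper, here is a small stand-in class. It is a sketch under stated assumptions, not the code in DiscreteRandomGeneratorZipf.cc: the class name ZipfPmfSketch is hypothetical, the member names merely echo the attributes documented on this page, and N is assumed to equal the value count passed to the constructor.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for illustration only; not the Hypertable class.
class ZipfPmfSketch {
public:
  ZipfPmfSketch(double s, uint64_t value_count)
      : m_s(s), m_value_count(value_count) {}

  // Probability assigned to the zero-based value val, i.e. to rank val + 1.
  double pmf(uint64_t val) {
    if (!m_initialized) {
      // Computed once, on first use. Assumption: N equals m_value_count.
      m_norm = (1.0 - m_s) /
               std::pow(static_cast<double>(m_value_count), 1.0 - m_s);
      m_initialized = true;
    }
    // Shift by one so that rank 0, where the pmf is undefined, never occurs.
    const uint64_t rank = val + 1;
    return m_norm / std::pow(static_cast<double>(rank), m_s);
  }

private:
  bool m_initialized = false;  // true once m_norm has been computed
  double m_s;                  // the exponent s, with 0 < s < 1
  double m_norm = 0.0;         // approximate normalization constant
  uint64_t m_value_count;      // number of values in the range
};
```

Deferring the computation of the normalization constant until the first pmf() call lets the value count be configured after construction, which fits the setter-based interface of the base class.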
bool DiscreteRandomGeneratorZipf::m_initialized [private]
True if m_norm has been initialized.
Definition at line 73 of file DiscreteRandomGeneratorZipf.h.
double DiscreteRandomGeneratorZipf::m_norm [private]
Helper for calculating the probability in pmf().
Definition at line 79 of file DiscreteRandomGeneratorZipf.h.
double DiscreteRandomGeneratorZipf::m_s [private]
The 's' of the zipfian algorithm.
Definition at line 76 of file DiscreteRandomGeneratorZipf.h.