How to return a random list item by weight
planted on in: JavaScript and Randomness.
~1,569 words, about a 8 min read.
Random number generation in JavaScript is as simple at typing Math.random()
, however unlike C++11 with its std::piecewise_constant_distribution there is no built in method for returning weighted random numbers.
Before I explain the optimal method for weighted random number picking, I shall first show a cheap and dirty method where you fill an array with a copy of an object n
times and pick at random an element from that array, for example:
var animals = [
{name: 'cat', weight: 5},
{name: 'dog', weight: 2},
{name: 'rabbit', weight: 2},
{name: 'mouse', weight: 1}
];
var weightedArray = [];
for (var i = 0; i < animals.length; i++) {
for (var n = 0; n < animals[i].weight; n++){
weightedArray.push(i);
}
}
console.log(animals[weightedArray[Math.floor(Math.random() * weightedArray.length)]]);
I have written a basic example below that shows the probability distribution of the above method; because cat is referenced inside weightedArray
five times more than mouse it is as many times more likely to be picked. As you can see from the percentages in the codepen they roughly translate to their weightings with the accuracy increasing the more rolls you do.
See the Pen pPwEVq by Simon Dann (@carbontwelve) on CodePen.
This method of generating weighted random number picking is quite "hacky" and far from optimal; this is because the bigger your input data set the bigger the weightedArray
becomes, consuming memory unnecessarily.
The ideal method
To efficiently pick an element from an array with the correct weighted random distribution you first need to normalise each items weight as a percentage of the sum total of all elements weights ā this is the distribution weight.
var animals = [
{name: 'cat', weight: 5},
{name: 'dog', weight: 2},
{name: 'rabbit', weight: 2},
{name: 'mouse', weight: 1}
];
// Normalise Weights
var weightTotal = 0;
for (var i = 0; i < animals.length; i++) {
weightTotal += animals[i].weight;
}
for (var i = 0; i < animals.length; i++) {
animals[i].distribution = animals[i].weight / weightTotal;
}
console.log(animals);
The above normalization will result in an animals array that looks like the following, with all the distribution values adding up to 1:
var animals = [
{name: 'cat', weight: 5, distribution: 0.5},
{name: 'dog', weight: 2, distribution: 0.2},
{name: 'rabbit', weight: 2, distribution: 0.2},
{name: 'mouse', weight: 1, distribution: 0.1}
];
This can be visualised as a stacked graph with each element filling an area that is proportionate to their weight:
Now you have a pool of elements, each with a distribution value that is proportionate to their weight; to randomly select a item weighted by its distribution value you pick a random number between 0 and 1 called the selector and loop through each element in the pool subtracting their distribution value from the selector until it becomes a negative number at which point you have the element to return.
For example if the selector is 0.7 you would get the cat element and subtract 0.5 resulting in 0.2, next you would subtract 0.2 for dog before hitting rabbit and subtracting 0.2 resulting in the selector hitting negative 0.2. You would then return the rabbit element.
I have extended the above example to show the weighted pick algorithm:
var animals = [
{name: 'cat', weight: 5},
{name: 'dog', weight: 2},
{name: 'rabbit', weight: 2},
{name: 'mouse', weight: 1}
];
// Normalise Weights
var weightTotal = 0;
for (var i = 0; i < animals.length; i++) {
weightTotal += animals[i].weight;
}
for (var i = 0; i < animals.length; i++) {
animals[i].distribution = animals[i].weight / weightTotal;
}
function pickOne(pool) {
var key = 0;
var selector = Math.random();
while (selector > 0) {
selector -= pool[key].distribution;
key++;
}
// Because the selector was decremented before key was
// incremented we need to decrement the key to get the
// element that actually exited the loop.
key--;
// Using splice to return a copy of the element.
return pool[key];
}
console.log(pickOne(animals));
As you can see from the codepen example below this method provides the same probability distribution with it becoming more accurate the more iterations that you do do, however unlike the array based method this one does not consume vast amounts of RAM and is a lot quicker to execute when iterating hundreds of thousands of times.
See the Pen Loop based weighted random pool selection by Simon Dann (@carbontwelve) on CodePen.