# Random Shuffles

Shuffling a collection of items is a surprisingly frequent task in programming: essentially, it comes up whenever a known input must be processed in random order. What is more, there is a delightful, three-line algorithm to accomplish this task correctly, in-place, and in optimal time. Unfortunately, this simple three-line solution seems to be insufficiently known, leading to various, less-than-optimal ad-hoc alternatives being used in practice — but that is entirely unnecessary!

### The Algorithm

The basic algorithm consists of a single loop over the elements in the input array. At each step, we pick an element at random from the as-yet unvisited “tail” of the array, and exchange it with the current element. Crucially, the current element is always considered part of the “tail”. In Python, it looks like this:

for i in range( data ):
j = random.randrange( i, len(data) )
data[i], data[j] = data[j], data[i]


These three lines (or the idea behind them) should be committed to memory!

Notice the use of the random.randrange( start, stop ) function, which returns a random index precisely from the required range: i <= j < len(data).

To see that this algorithm produces each possible permutation with equal probability, it is enough to realize that:

• the first element is selected randomly from all possible choices
• the algorithm is recursive (the implementation isn’t, but the algorithm is).

Because each element is touched once, the algorithm has the “best possible” time complexity. The input sequence is shuffled in place, without the need for extra storage.

The algorithm goes back to the early 20th century statistics pioneers Ronald Fisher and Frank Yates, with practical computer implementations being provided later by Richard Durstenfeld and Donald Knuth. See its Wikipedia entry.

data = []