# Random Shuffles

Shuffling a collection of items is a surprisingly frequent task
in programming: essentially, it comes up whenever a known input
must be processed in random order. What is more, there is a delightful,
*three-line* algorithm to accomplish this task correctly, in-place, and
in optimal time. Unfortunately, this simple three-line solution seems
to be insufficiently known, leading to various, less-than-optimal ad-hoc
alternatives being used in practice — but that is entirely
unnecessary!

### The Algorithm

The basic algorithm consists of a single loop over the elements
in the input array. At each step, we pick an element at random
from the as-yet unvisited “tail” of the array, and exchange it
with the current element. Crucially, the *current* element is
always considered part of the “tail”. In Python, it looks like
this:

```
for i in range( data ):
j = random.randrange( i, len(data) )
data[i], data[j] = data[j], data[i]
```

These three lines (or the idea behind them) should be committed to memory!

Notice the use of the `random.randrange( start, stop )`

function,
which returns a random index precisely from the required range:
`i <= j < len(data)`

.

To see that this algorithm produces each possible permutation with equal probability, it is enough to realize that:

- the first element is selected randomly from all possible choices
- the algorithm is recursive (the implementation isn’t, but the algorithm is).

Because each element is touched once, the algorithm has the “best possible” time complexity. The input sequence is shuffled in place, without the need for extra storage.

The algorithm goes back to the early 20th century statistics pioneers Ronald Fisher and Frank Yates, with practical computer implementations being provided later by Richard Durstenfeld and Donald Knuth. See its Wikipedia entry.

### Reading from a Stream

The above Wikipedia article also mentions a fascinating variant, wherein
a randomly shuffled array is built up from an *input stream*, even
when the total number of elements in the stream is not known. For
each value read from the stream, the array is extended by one
element, and a random location within the newly extended array is found
for the latest element.

```
data = []
while True:
elem = input() # raises EOFError when done
data.append( elem ) # add new elem at end, maybe shuffle later!
i = random.randrange( 0, len(data) )
if i != len(data)-1: # if i is not last element in data: exchange
data[i], data[-1] = data[-1], data[i]
```