Since there are four nucleotide bases, assuming a random distribution, each base will occur, on average, once every four nucleotide positions.

The probability that a specific nucleotide sequence will occur at a
given position can thus be calculated as 4^{-n}, that is, (1/4)^{n},
where 4 represents the number of bases, and n the total number
of nucleotides in the sequence.

For example, the probability that the base at any given position in a
sequence will be adenine is 1/4. The probability that the next base will be
cytosine is also 1/4, so the likelihood that the dinucleotide combination
AC will occur is 1/4 x 1/4 = 1/16, or 4^{-2}. The same is true
for all other dinucleotide combinations.
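
The dinucleotide arithmetic can be checked with a short Python sketch. The function name sequence_probability and the constant BASES are illustrative choices, not part of the original notes, and the code assumes, as the text does, that the four bases are equally likely and independent of one another.

```python
from fractions import Fraction

# Assumption: four equally likely, independent bases (the random model used in the text).
BASES = "ACGT"


def sequence_probability(sequence: str) -> Fraction:
    """Return the probability that `sequence` starts at a given position.

    Under the random model this is (1/4)^n, where n is the sequence length.
    """
    sequence = sequence.upper()
    for base in sequence:
        if base not in BASES:
            raise ValueError(f"unexpected base: {base!r}")
    return Fraction(1, len(BASES)) ** len(sequence)


if __name__ == "__main__":
    print(sequence_probability("A"))   # 1/4
    print(sequence_probability("AC"))  # 1/16
```

Exact fractions are used rather than floating point so that the result can be compared directly with values such as 1/16.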

The probability of a 16 nucleotide sequence, which is probably unique
in the human genome, would be 4^{-16}, or 1/4,294,967,296.
Such a sequence would be expected to occur at most about once in the
3 billion nucleotides of the human genome, as compared to a 15 bp
sequence, which has a probability of 4^{-15}, or 1/1,073,741,824,
and would likely occur about 3 times.
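
The comparison between 15 bp and 16 bp sequences can be reproduced with the rough sketch below. The names GENOME_SIZE and expected_occurrences are hypothetical, the genome size is the round figure of 3 billion nucleotides used in the text, and only one strand is counted. Under those assumptions the expected count is the number of possible start positions divided by 4 raised to the sequence length.

```python
# Assumption: round figure of 3 billion nucleotides, as used in the text.
GENOME_SIZE = 3_000_000_000


def expected_occurrences(n: int, genome_size: int = GENOME_SIZE) -> float:
    """Expected number of times one specific n-base sequence occurs on a
    single strand of a random genome with four equally likely bases."""
    start_positions = genome_size - n + 1  # places where an n-mer can begin
    return start_positions / 4 ** n


if __name__ == "__main__":
    for n in (15, 16):
        print(f"n={n}: 4^n = {4 ** n:,}, expected = {expected_occurrences(n):.2f}")
```

With these inputs the expected count comes to roughly 2.8 for a 15 bp sequence and roughly 0.7 for a 16 bp sequence, which is why 16 nucleotides is about the length at which a sequence becomes likely to be unique.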

While there are many repetitive sequences within the human genome, it
is reasonable to assume that each 20,000 bp fragment cloned into a Charon
4A vector is unique.