| Hi,
|
| I've been wondering about the proper application of
statistics with
| regard to comparing PRNGs and encrypted text to truly
random sources.
|
| As I understand it, when looking at output, one can take a
| hypothetical source model (e.g. "P(0) = 0.3, P(1) =
0.7, all bits
| independent") and come up with a probability that
the source may have
| generated that output. One cannot, however, say what
probability such
| a source had generated the output, because there is an
infinite number
| of sources (e.g. "P(0) = 0.29999.., P(1) =
7.000..."). Can one say
| that, if the source must be A or B, what probability it
actually was A
| (and if so, how)?
That's not the way it's done. Ignore for a moment that we
have a sequence
(which is probably irrelevant for this purpose, but might
not be). Instead,
just imagine we have a large collection of values generated
by the PRNG -
or, looked at another way, a large collection of values
alleged to have been
drawn from a population with P(0) = 0.3 and P(1) = 0.7. Now
take a truely
random sample from that collection and ask the question:
What is the
probability that I would have seen this result, given that
the collection
I'm drawing from is really taken from the alleged
distribution? You don't
need any information about *other* possible distributions.
(Not that there
aren't other questions you can ask. Thus, if the
collection could have
been drawn from either of two possible distributions, you
can ask which
is more probable to have resulted in the random sample you
saw.)
The randomness in the sampling is essential. When you have
it, you wipe out
any underlying bias in the way the collection was created.
| Also, it strikes me that it may not be possible to prove
something
| cannot be distinguished from random, but that proofs must
be of the
| opposite form, i.e. that some source is distinguishable
from random.
Actually, what one tends to prove are things like: If X is
uniformally
randomly distributed over (0,1), then 2X is uniformally
randomly distributed
over (0,2). (On the other hand, X + X, while still random,
is *not*
uniformally distributed.) That's about as close as you are
going to get
to a "proof of randomness".
| Am I correct? Are there any other subtleties in the
application of
| statistics to crypto that anyone wishes to describe? I
have yet to
| find a good book on statistics in these kinds of
situations, or for
| that matter in any.
Statistics in general require subtle reasoning.
| As an aside, it's amusing to see the abuse of statistics
and
| probability in the media. For example, when people ask
"what's the
| probability of <some non-repeating event or
condition>?"
That may or may not be a meaningful concept. If I toss a
coin, and
depending
on the result, blow up a building - there is no way to
repeat the blowing up
of the building, but still it's meaningful to say that the
probability that
the building gets blown up is 50%.
-- Jerry
| --
| "Curiousity killed the cat, but for a while I was a
suspect" -- Steven
Wright
| Security Guru for Hire http://www.li
ghtconsulting.com/~travis/ -><-
| GPG fingerprint: 9D3F 395A DAC5 5CCC 9066 151D 0A6B 4098
0C55 1484
|
|
------------------------------------------------------------
---------
| The Cryptography Mailing List
| Unsubscribe by sending "unsubscribe
cryptography" to
majordomo metzdowd.com
|
------------------------------------------------------------
---------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe
cryptography" to majordomo metzdowd.com
|