[Pkg-sysvinit-devel] fundamental properties of entropy

Thu Sep 16 22:41:12 UTC 2010

A correction:

Alas on 09/15/2010 08:22 PM, I wrote:

> There is a fundamental principle in the cryptography / security
> business says that you cannot make something secure by throwing
> together a whole bunch of insecure elements.  You can make it
> complicated, but you cannot make it secure.  

That is overstated.  There are many examples of cryptosystems
(including RNGs) where the whole is vastly more secure than 
the sum of the parts.  For example, in a system such as AES,
the full system with many rounds is resistant to attacks that 
would succeed against a "reduced rounds" version.

Sorry about that.

I should have said something more like:

Making something more complex does not automatically (or even
usually) make it more secure.  Careful engineering is required.

====

The goal of any security system is to impose a large burden 
on the bad guys but only a small burden on the good guys.

Adding more complexity to a system increases the workload on 
the good guys ... maybe not a lot, but some.  Thereupon the
first question is:  (i) Does it create a disproportionately 
large increase in the workload on the bad guys?  And then 
good engineering practice demands asking the reciprocal
question:  (ii) Given this amount of burden on the bad guys, 
is there an easier and/or better way of achieving the same 
result?

On 09/15/2010 01:29 PM, Henrique de Moraes Holschuh wrote in part:

>> Part 1: enough stored entropy to use as "seed material" (4Kib for Linux)
>> that is unknown to the attacker.
>> 
>> Part 2: something that is unique to this specific device among all others.
>> 
>> Part 3: something that is provably different each time this specific device
>> is rebooted, i.e. each time there has been an irreversible loss of state.

Let's work through the various use-cases, case by case.

A) Suppose the stored random.seed (as mentioned in Part 1) is 
 random, unique on a machine-by-machine basis, and unknown to 
 the attacker.

 Then Part 2 doesn't help.  You are in equally good shape with
 or without it.

B) Suppose the stored random.seed is non-existent and/or known to 
 the attacker.  This case is quite common, since it is what people 
 get by downloading a "ready-to-use" Live CD image.

 Then Part 2 doesn't help.  You are in equally bad shape with 
 or without it -- or very nearly so.  

 Such a system might resist casual attack by small children, 
 but it will succumb immediately to any halfway serious 
 attack.  Hint:  dictionary attack against the MAC address, 
 mobo serial number, and whatever else is being used to 
 distinguish the machine.
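
 To make the hint concrete, here is a minimal sketch in Python.
 It rests on several assumptions that are mine, not anything
 in the actual kernel or init.d/urandom code: the attacker
 knows the seed (it shipped in a public image), knows the
 3-byte vendor prefix of the network card, and can observe a
 value derived from seed plus MAC address.  The SHA-256 mixing
 step and the function names derive_state and recover_mac are
 hypothetical stand-ins.  Under those assumptions there are at
 most 2^24 candidates to try.

```python
import hashlib


def derive_state(seed: bytes, mac: bytes) -> bytes:
    """Hypothetical stand-in for 'mix the seed with a device identifier'."""
    return hashlib.sha256(seed + mac).digest()


def recover_mac(seed: bytes, oui: bytes, observed: bytes, limit: int = 1 << 24):
    """Try every MAC with the given 3-byte vendor prefix.

    Returns the matching 6-byte MAC, or None if no candidate below
    `limit` reproduces the observed value.
    """
    for n in range(limit):
        mac = oui + n.to_bytes(3, "big")
        if derive_state(seed, mac) == observed:
            return mac
    return None


if __name__ == "__main__":
    known_seed = b"\x00" * 512             # seed shipped in a public image
    oui = bytes.fromhex("001122")          # made-up vendor prefix
    victim_mac = oui + bytes.fromhex("00beef")
    observed = derive_state(known_seed, victim_mac)
    found = recover_mac(known_seed, oui, observed)
    print(found.hex(":"))
```

 The point is only that the search loop is short and bounded.
 Adding a mobo serial number enlarges the loop, but not by much
 if serial numbers follow a predictable pattern.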

C) The only remaining use-case I can think of is the following:
 Suppose there is a small group of machines, all booting from
 bit-for-bit identical media, and all booting at the same instant,
 so that they cannot be distinguished by looking at the RTC.
 The random.seed file is random and unique to this /group/ of
 machines, and is zealously protected so that it remains unknown
 to the attackers.

 Further suppose there is some _standard_ place that init.d/urandom
 can look to find a way to distinguish the members of this group.

 I can just barely imagine this happening in some sort of VPS
 situation, where the VPS operator unwisely decides to boot
 all of the virtual machines from bit-for-bit identical media.

 Discussion:  First of all, IMHO this seems like asking what 
 is the procedure for hiking barefoot across a huge pile of 
 broken glass. The usual answer is "Don't do it.  Wear shoes,
 or hike somewhere else.  Shoes are readily available."

 Setting aside the metaphor, my advice to the VPS operator
 would be "Don't do it.  Disk space on the host machine is
 very cheap.  For N virtual machines, you can afford to make
 N copies of the boot-disk image, and give each of them its
 own unique random.seed file.  You already went to the trouble
 of generating a random.seed for the group, so the incremental
 work of generating one for each virtual machine is trivial.  If you are 
 the least bit serious about security, you will do this.  
 Tools that make it easy to do this are available."
      http://www.av8n.com/computer/htm/fixup-live-cd.htm
 There are many other simple ways of accomplishing the same 
 goal.
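
 As one illustration of how little work is involved, here is a
 minimal sketch in Python that writes one independent seed file
 per virtual machine.  It is not the tool at the URL above.  The
 file naming, the function name write_unique_seeds, and the
 512-byte size (taken from the 4096-bit figure quoted earlier)
 are my assumptions.  Copying each file into its own boot image
 is left out, and the sketch presumes the machine running it
 already has a properly seeded RNG behind os.urandom.

```python
import os
import tempfile

SEED_BYTES = 512  # 4096 bits; assumed from the figure quoted earlier


def write_unique_seeds(directory: str, count: int) -> list:
    """Write one independent random seed file per virtual machine.

    File names are hypothetical.  Files are created with mode 0600 and
    O_EXCL, so an existing seed is never silently overwritten.
    """
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"vm{i:03d}.random-seed")
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        with os.fdopen(fd, "wb") as f:
            f.write(os.urandom(SEED_BYTES))
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for p in write_unique_seeds(d, 4):
            print(p, os.path.getsize(p))
```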

 We also observe the following weakness in Case C: Compromise 
 of any one of the N machines compromises all the others.
 This is another reason why "Don't do it" is good advice.

   Note: This can be considered an application of engineering
   principle (ii) mentioned above:  If we want to make the
   machines distinguishable, we should do it the right way.
   The standard, robust way of doing this is via the 
   random.seed file.  It is big enough to resist dictionary
   attacks, whereas serial numbers etc. are not.
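
   For a rough sense of scale, the following Python sketch
   compares exhaustive-search times.  The rate of 10^9 guesses
   per second is an assumed figure chosen for illustration, and
   the bit counts are upper bounds: a real MAC address or serial
   number typically carries less uncertainty than its full width.

```python
GUESSES_PER_SECOND = 10**9  # assumed attacker speed, for illustration only


def seconds_to_exhaust(bits: int, rate: int = GUESSES_PER_SECOND) -> int:
    """Whole seconds needed to try all 2**bits values at the given rate."""
    return (1 << bits) // rate


if __name__ == "__main__":
    cases = [
        ("MAC address, vendor prefix known", 24),
        ("MAC address, full 48 bits", 48),
        ("random.seed, 4096 bits", 4096),
    ]
    for label, bits in cases:
        secs = seconds_to_exhaust(bits)
        if secs == 0:
            print(f"{label}: under one second")
        else:
            # Report only the order of magnitude; the exact number is huge.
            print(f"{label}: on the order of 10^{len(str(secs)) - 1} seconds")
```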

The same advice applies to anyone who wants to boot from a
downloaded or otherwise cloned boot medium:  Don't do it.
If you are the least bit serious about security, you will
personalize the random-seed file.  Tools to make this easy
are available.

This exhausts the set of use-cases I can think of.  If anybody
knows of other cases that are worth considering, please let
us know, as clearly as possible, what those cases are.

In all cases that I can think of, Part 2 doesn't help.  The
good cases don't get any better, and the bad cases don't
get significantly better.  Any worthwhile goal that could
be achieved by Part 2 can be achieved more nicely by other 
means.

> Part 2/3 are NOT about security,
> they're about keeping some variance on the data retrieved from the
> random-numbers kernel subsystem across reboots of the same
> real-entropy-starved device, and also on synchronized boots of several
> nearly-identical real-entropy-starved devices.

Thanks for the clarification.  I needed the clarification.
Heretofore I thought security was the driving force behind
all of this.

I fear we are pursuing two unrelated lines of thought.  I 
have proposed specific code to achieve a specific security-
related goal.

There is evidently another proposal floating around.  I have
not seen specific code, and despite my best efforts I am
unable to figure out what the practical goal is supposed to 
be.  In particular, many people are concerned about security,
but relatively few are concerned about variance for the sake 
of variance, when it is specifically "NOT about security".

I suspect this other proposal should be given its own ticket
number, since it involves different code, different use-cases,
and explicitly different goals.