How to Choose a Good Password
FAQ: How do I choose a good password or phrase?
ANS: shocking nonsense makes the most sense
With the intrinsic strength of some of the modern
encryption, authentication, and message digest algorithms
such as RSA, MD5, SHS and IDEA, the user password or phrase
is becoming more and more the focus of vulnerability.
Considering even the early PGP 1.0 application for example,
a Deputy with the Los Angeles County Sheriff's Department
admitted in early 1993 that both they and the FBI despaired
of breaking the system except through a successful
dictionary attack (trying many possible passwords or
phrases from lists of probable choices and their
variations) rather than "breaking" the underlying
cryptographic algorithm mathematically.
The fundamental reason why attacking or trying to
guess the user's password or phrase will increasingly be
the focus of cryptanalysis is that the user's choice of
password may represent a much simpler cryptographic key
than optimal for the encryption algorithm. This weakness of
the user's password choice provides the cryptanalytic
wedge.
For example, suppose a user chooses the password
'david.' On the surface the entropy of this key (or the
number of different equiprobable key states) appears to be
five characters chosen from a set of twenty-six with
replacements: 26^5 or 1.188 x 10^7. But since the user is
apparently biased toward common given names, the majority
of which appear in lists numbering only 6,000-7,000
entries, the true entropy is undoubtedly much closer to 6.5
x 10^3, or about three orders of magnitude smaller than the
raw length might suggest. (In fact this password probably
possesses a much smaller entropy than even this, for the
very common name "david" would be one of the first names to
be checked by an optimized dictionary attack program.) In
other words, "entropy" is not a fixed physical quantity:
the cryptanalyst can exploit whole meanings and contexts,
not just byte frequencies, digraphs, or even whole-word
correlations to reduce the entropy of the key space he or
she is trying to explore.
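The arithmetic in the 'david' example can be checked with a
short sketch. The function name entropy_bits is hypothetical,
and the figure of 6,500 names is simply the midpoint of the
6,000-7,000 range given above; both key spaces are assumed
to be equiprobable, which (as noted) flatters the password.

```python
import math

def entropy_bits(equiprobable_states: int) -> float:
    """Entropy in bits of a choice among equally likely states."""
    return math.log2(equiprobable_states)

# Five letters drawn with replacement from a 26-letter alphabet.
raw_states = 26 ** 5
# Assumed size of a list of common given names (midpoint of 6,000-7,000).
name_list_states = 6500

print(raw_states)                                  # 11881376
print(round(entropy_bits(raw_states), 1))          # 23.5
print(round(entropy_bits(name_list_states), 1))    # 12.7
# How many times smaller the name list is than the raw key space.
print(round(raw_states / name_list_states))        # 1828
```

The drop from roughly 23.5 bits to under 13 bits is the
"cryptanalytic wedge" described above, expressed in bits.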
To thwart this avenue of attack we would like to
discover a method of selecting passwords or phrases that
have at least as many bits of entropy (or "hard-to-
guessness") as the entropy of the cryptographic key of the
underlying algorithm being used.
To compare, DES (Data Encryption Standard) is believed
to have about 54-55 bits (~4 x 10^16) of entropy while the
IDEA algorithm is believed to have about 128 bits (~3.4 x
10^38) of entropy. The closer the entropy of the user's
password or phrase is to the intrinsic entropy of the
cryptographic key of the underlying algorithm being used,
the larger the portion of the algorithm's key space an
attacker is likely to have to search in order to discover
it.
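As a rough illustration, the key-space sizes can be computed
directly and compared with a password's entropy. The name
effective_bits is hypothetical, and the sketch assumes (for
illustration only) that an attacker simply goes after
whichever is weaker, the password or the cipher key. The DES
figure uses 55 bits, the upper end of the range quoted above.

```python
def effective_bits(password_bits: float, key_bits: float) -> float:
    """Attack effort is bounded by the weaker of the two (assumption)."""
    return min(password_bits, key_bits)

des_states = 2 ** 55      # upper end of the 54-55 bit estimate
idea_states = 2 ** 128

print(f"{des_states:.1e}")    # 3.6e+16
print(f"{idea_states:.1e}")   # 3.4e+38

# A password worth about 12.7 bits protects only 12.7 bits,
# whichever algorithm sits underneath it.
print(effective_bits(12.7, 55))    # 12.7
print(effective_bits(12.7, 128))   # 12.7
```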
Unfortunately many documents suggest choosing
passwords or phrases that are distinctly inferior to the
latest methods. For example, one white paper widely
archived on the internet suggests selecting an original
password by constructing an acronym from a popular song
lyric or from a line of script from, for example, the SF
movie "Star Wars". Both of these ideas turn out to be weak
because both the entire script to Star Wars and entire
sets of song lyrics to thousands of popular songs are
available on-line to everyone and, in some cases, are
already embedded into "crack" dictionary attack programs.
However, the conflict between choosing an easy-to-remember
key and choosing a key with a high level of entropy is not
hopeless if we exploit mnemonic devices that have been
known for a long time outside the field of cryptography.
With the goal of making up a passphrase not included in any
existing corpus yet very easy to remember, an effective
technique is the one known as "shocking nonsense."
"Shocking nonsense" means to make up a short phrase or
sentence that is both nonsensical and shocking in the
culture of the user, that is, it contains grossly obscene,
racist, impossible, or otherwise extreme juxtapositions of
ideas. This technique is permissible because the
passphrase, by its nature, ought never to be revealed to
anyone with sensibilities to be offended.
Further, shocking nonsense is unlikely to be
duplicated anywhere because it does not describe a matter-
of-fact that could be accidentally rediscovered by anyone
else and the emotional evocation makes it difficult for the
creator to forget. A relatively mild example of such
shocking nonsense might be: "mollusks peck my galloping
genitals." The reader can undoubtedly make up many far
more shocking examples for himself or herself...
Even relatively short phrases offer acceptable entropy
because the "alphabet" pool of word symbols that may be
chosen from is far larger than the set of characters from
the Roman alphabet. Even choosing from a vocabulary of a
few thousand words, a five-word phrase might have on the
order of 58 to 60 bits of entropy -- more than what is
needed for the DES algorithm, for example. In the case
where an entire phrase cannot be used because the password
is restricted to, say, eight alphanumeric characters,
concatenating the first letters of a suitable shocking
nonsense passphrase should usually give a better than
reasonable starting point if followed by adding numeric
and non-alphabetic characters.
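A minimal sketch of both ideas follows. The function names
are hypothetical, and the entropy estimate assumes each word
is chosen independently and with equal probability from the
vocabulary, which a real user's choices will only
approximate. The vocabulary sizes 3000 and 4096 are merely
two readings of "a few thousand words."

```python
import math

def phrase_entropy_bits(vocabulary_size: int, words: int) -> float:
    """Bits of entropy for `words` independent, uniform word choices."""
    return words * math.log2(vocabulary_size)

def first_letters(phrase: str) -> str:
    """Concatenate the first letter of each word in the phrase."""
    return "".join(word[0] for word in phrase.split())

print(round(phrase_entropy_bits(3000, 5), 1))   # 57.8
print(round(phrase_entropy_bits(4096, 5), 1))   # 60.0

# Starting point for a short-password system; digits and
# non-alphabetic characters would still need to be added by hand.
print(first_letters("mollusks peck my galloping genitals"))  # mpmgg
```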
When you are permitted to use passphrases of arbitrary
length (in PGP for example) it is not necessary to further
perturb your 'shocking nonsense' passphrase to include
numbers or special symbols because the pool of word choices
is already very large. Not needing those special symbols or
numbers (that are not intrinsically meaningful) makes the
shocking nonsense passphrase that much easier to remember.
Appendix A. For software developers
For software developers designing "front-ends" or user
interfaces to conventional short-password applications,
very good results will come from permitting the user
arbitrary length passphrases that are then "crunched" or
processed using a strong digest algorithm such as the 160-
bit SHS (Secure Hash Standard) or the 128-bit MD5 (Message
Digest rev. 5). The interface program then chooses the
appropriate number of bits from the digest and supplies
them to the engine enforcing a short password. This 'key
crunching' technique will assure the developer that even
the short password key space will have a far greater
opportunity of being fully exploited by the user.
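The 'key crunching' step might be sketched as follows. This
is an illustration under stated assumptions, not a reference
implementation: the function name crunch_passphrase is
hypothetical, SHA-1 from Python's hashlib stands in for the
160-bit SHS named above, and taking the leading bits of the
digest is one arbitrary way of "choosing the appropriate
number of bits."

```python
import hashlib

def crunch_passphrase(passphrase: str, n_bits: int) -> int:
    """Hash an arbitrary-length passphrase and keep the leading n_bits."""
    digest = hashlib.sha1(passphrase.encode("utf-8")).digest()  # 160 bits
    total_bits = len(digest) * 8
    if not 0 < n_bits <= total_bits:
        raise ValueError("n_bits must be between 1 and the digest size")
    as_int = int.from_bytes(digest, "big")
    # Shift away the low-order bits, leaving the top n_bits.
    return as_int >> (total_bits - n_bits)

# Hypothetical use: derive a 56-bit value for a DES-style engine.
key = crunch_passphrase("mollusks peck my galloping genitals", 56)
print(f"{key:014x}")  # 14 hex digits = 56 bits
```

The same passphrase always yields the same short key, so the
user remembers only the long phrase while the engine sees a
value spread across its whole key space.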
Appendix B. A tool to experimentally investigate entropy
A practical Unix tool for investigating the entropy of
typical user keys can be found in Wu and Manber's 'agrep'
(approximate grep) similarity pattern matching tool
available in C source from cs.arizona.edu [192.12.69.5].
This tool can determine the "edit distance," that is, the
number of insertions, substitutions, or deletions that
would be required of an arbitrary pattern in order for it
to match any of a large corpus of words or phrases, say the
usr/dict word list, or over the set of Star Trek trivia
archives. The user can then adjust the pattern to give an
arbitrarily high threshold difference between it and common
words and phrases in the corpus to make crack programs that
systematically vary known strings less likely to succeed.
It is often surprising to discover that a substring pattern
like "hxirtes" is only of edit distance two from as many as
forty separate words ranging from "bushfires" to "whitest."
Certainly no password or phrase ought to be chosen as a
working password or phrase if it is within an edit distance
of two or fewer of a known string or substring in any
on-line collection.
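The kind of check described here can be approximated without
agrep itself. The sketch below is not Wu and Manber's
algorithm; it is a plain dynamic-programming version of
approximate substring matching, with hypothetical function
names, that reports the fewest insertions, substitutions, or
deletions needed for a pattern to match some substring of a
word.

```python
def substring_edit_distance(pattern: str, text: str) -> int:
    """Fewest edits for `pattern` to match any substring of `text`."""
    # prev[j]: cost of matching the pattern prefix seen so far against
    # a substring of text ending at position j. The empty pattern
    # matches anywhere at zero cost, hence the row of zeros.
    prev = [0] * (len(text) + 1)
    for i, pc in enumerate(pattern, start=1):
        curr = [i]  # i pattern characters against empty text
        for j, tc in enumerate(text, start=1):
            curr.append(min(prev[j - 1] + (pc != tc),  # match/substitute
                            prev[j] + 1,               # drop pattern char
                            curr[j - 1] + 1))          # extra text char
        prev = curr
    return min(prev)

def near_misses(pattern: str, words: list[str], max_distance: int = 2) -> list[str]:
    """Words from `words` within max_distance edits of the pattern."""
    return [w for w in words
            if substring_edit_distance(pattern, w) <= max_distance]

print(substring_edit_distance("hxirtes", "whitest"))    # 2
print(substring_edit_distance("hxirtes", "bushfires"))  # 2
print(near_misses("hxirtes", ["whitest", "bushfires", "zebra"]))
```

Run over a full word list, near_misses would return every
entry a systematic variation of known strings could reach
within the chosen threshold.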
select references
[selection and management of passwords in differing threat environments]
Department of Defense Password Management Guideline
CSC-STD-002-85
published by the Computer Security Center of the Department
of Defense Fort George G. Meade, MD 20755
[discovering weak passwords]
The COPS Security Checker System by D. Farmer, E. Spafford
Purdue University Technical Report CSD-TR-993
West Lafayette, IN 47907
[an example of automated key cracking]
With Microscope and Tweezers: An Analysis of the Internet Virus of 1988
by M. Eichin, J. Rochlis,
Massachusetts Institute of Technology Cambridge, MA 02139
[password vulnerabilities in distributed systems]
Computer Emergency Response - An International Problem
by R. Pethia, K. van Wyk CERT/Software Engineering Institute
Carnegie Mellon University, Pittsburgh, PA 15213
[key metrics and the MD5 message digest algorithm]
Answers to Frequently Asked Questions About Today's Cryptography
by Paul Fahn
RSA Laboratories, Redwood City, CA 94065
(available through anonymous FTP from rsa.com)
[implementation details of the MD5 message digest algorithm]
RFC-1321 ('request for comments') The MD5 algorithm
by R. Rivest MIT Center for Computer Science
(available on the internet from gatekeeper.dec.com)
[implementation details of the NIST Secure Hash Standard]
The Secure Hash Standard (SHS) Specification, Jan 1992
DRAFT
Federal Information Processing Standards Publication YY
Director, Computer Systems Laboratory
National Institute of Standards and Technology
Gaithersburg, MD 20899
(The SHS was approved as a Federal Standard in May, 1993)
[other possible approaches to password generation]
Automated Password Generator, NIST publication ????
Director, Computer Systems Laboratory
National Institute of Standards and Technology
Gaithersburg, MD 20899