Keyboard sounds reveal their words
• 15:28 14 September 2005
• NewScientist.com news service
• Celeste Biever
The unique sounds a keyboard makes as different keys
are hit can now be transformed into a reasonable transcription of what was
typed, using a new technique developed by computer scientists.
Nefarious hackers could use the technique to
eavesdrop on passwords, personal emails and confidential company reports, using
nothing but a recording taken with a hidden microphone.
“This is spying,” says Bruce Schneier, a
security expert and cryptographer based in Mountain View, California, US.
“It’s definitely a security risk.”
Although which key makes which sound is unknown and
will vary depending on the keyboard and the person, machine learning and the
structure of the English language are used to work it out.
The new eavesdropping method is in contrast to one
developed by researchers at IBM, who in 2004 used the unique vibrations produced
by each key, along with a copy of what the person was typing, to assign a letter
to each sound and then to transcribe new typed sounds.
Totally forbidden
“The main contribution of our paper is that we
do this blind and with no training info,” says Doug Tygar, the computer
scientist at the University of California, Berkeley, US, who devised the new
technique with colleagues Li Zhuang and Feng Zhou. He claims that the system needs
only 10 minutes’ worth of typing recorded using off-the-shelf audio
recorders and microphones.
Each key hits a different part of the
keyboard’s plastic under-plate, resulting in a slightly different sound.
The algorithm separates the keystrokes and divides them into 50 classes
depending on similarities in frequency and amplitude.
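
The grouping step described above can be sketched as a clustering problem. The Python sketch below uses k-means, a generic clustering method swapped in for illustration; the article does not say which algorithm the researchers used. The function name `cluster_keystrokes`, the two-number feature vectors and the sample values are all hypothetical, and the example uses 3 classes rather than 50 to stay small.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def cluster_keystrokes(points, k, iterations=10):
    """Group keystroke feature vectors into k classes with plain k-means."""
    # Deterministic start: take the first point, then repeatedly add the
    # point that lies farthest from every centroid chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each keystroke to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of the keystrokes assigned to it.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels

# Hypothetical (dominant frequency in Hz, relative amplitude) pairs.
# Real features would need scaling so that one dimension does not dominate.
keystrokes = [(440.0, 0.80), (442.0, 0.82), (610.0, 0.50),
              (608.0, 0.52), (300.0, 0.91), (302.0, 0.90)]
labels = cluster_keystrokes(keystrokes, k=3)
```

At this stage the class numbers carry no meaning: the sketch only says which keystrokes sound alike, not which letter each class represents.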
The next step is to map these classes to the
different keys, using the probability of two letters being next to each other in
a word. This is done using a knowledge of which pairs of characters, known as
“digrams”, are common, less common and totally forbidden.
“Th” and “er” are common, “qu” is less
common and “bf” is totally forbidden. This technique alone maps 60%
of the keystrokes correctly, the researchers claim.
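
The idea of using digram frequencies to pick a mapping can be illustrated with a toy example. The sketch below simply tries every assignment of letters to sound classes and keeps the one whose text has the most plausible digrams. This brute-force search is an assumption made for illustration and only works for a handful of classes; the article does not describe the researchers' actual search procedure. The probability table, the `FORBIDDEN` constant and the function names are hypothetical.

```python
import math
from itertools import permutations

# Hypothetical digram probabilities. A real table would be estimated
# from a large body of English text.
DIGRAM_PROB = {"th": 0.30, "he": 0.25, "te": 0.08,
               "et": 0.05, "eh": 0.01, "ht": 0.01}
FORBIDDEN = 1e-9  # stand-in probability for digrams missing from the table

def score(text):
    """Sum of log-probabilities of each adjacent letter pair in text."""
    return sum(math.log(DIGRAM_PROB.get(a + b, FORBIDDEN))
               for a, b in zip(text, text[1:]))

def best_mapping(class_sequence, letters):
    """Try every class-to-letter assignment and keep the best-scoring one."""
    classes = sorted(set(class_sequence))
    best = None
    for perm in permutations(letters, len(classes)):
        mapping = dict(zip(classes, perm))
        text = "".join(mapping[c] for c in class_sequence)
        s = score(text)
        if best is None or s > best[0]:
            best = (s, mapping, text)
    return best[1], best[2]

# Six keystrokes that fell into three sound classes, in typing order.
mapping, text = best_mapping([0, 1, 2, 0, 1, 2], "eht")
```

With this table the winning assignment reads "thethe", because "th" and "he" are far more likely neighbours than pairs such as "ht" or "eh".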
“Quite readable”
But Tygar's team also applies spelling rules, which
correct words from the transcription that do not make sense, such as
“tre”, and grammar, which cuts out unlikely combinations such as
“the a”. Word options that have fewer than one-quarter of the
characters corrected are then kept and the rest discarded, and the digram
algorithm is reapplied.
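
The one-quarter filter can be sketched as follows. This is a minimal illustration under stated assumptions: the word list is a hypothetical stand-in for a real dictionary, candidates are restricted to words of the same length (since each recorded sound is one keystroke), and the function name `correct` is invented for the example. The grammar check is not modelled.

```python
WORDLIST = ["keyboard", "password", "sound", "the"]  # hypothetical tiny dictionary

def correct(word, wordlist=WORDLIST, max_changed=0.25):
    """Return the closest same-length dictionary word, or None if the fix
    would change a quarter or more of the characters."""
    candidates = [w for w in wordlist if len(w) == len(word)]
    if not candidates:
        return None

    def changed(candidate):
        # Number of positions where the candidate differs from the input.
        return sum(a != b for a, b in zip(candidate, word))

    best = min(candidates, key=changed)
    if changed(best) / len(word) < max_changed:
        return best
    return None

kept = correct("keybosrd")   # one of eight characters changes
dropped = correct("tre")     # one of three characters changes
```

Here "keybosrd" is kept as "keyboard", while "tre" is discarded for this round because fixing it would change a third of its characters. Only the kept words would feed the next pass of the digram step.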
The process is repeated and the result is a mapping
with 92% of the characters correct. These can then be used to transcribe further
recordings. “It’s quite readable,” writes Edward Felten, a
computer scientist at Princeton University in New Jersey, in his
blog.
In an experiment, Tygar’s team found that even
random assortments of characters, such as passwords, could be estimated
using the software. Provided they were buried in a lot of good English, they
found that about one in 20 times the software would get the password completely
right, arriving at good approximations the rest of the time.
Although this does not sound very accurate, it is
trivial for a hacker to try 20 different possible passwords. But Schneier points
out that this is unnecessary: “If I can read everything you type, I
don’t care about your password.”
There are limitations as to what the software can
transcribe as it currently recognises only lower case letters and the
“space”, “full stop”, “comma” and
“enter” keys. Tygar believes the algorithm could be tweaked to
recognise the delete key because people use it in a very distinctive way, often
hitting it up to 10 times in a row.
But numbers are harder as there are no forbidden
combinations. It could be done, he says, perhaps using other techniques such as
a knowledge of the spatial location of the numbers on the keyboard and relating
that to the level of similarity in the sounds.
Parked car
The algorithm was also trained with a cellphone
ringing in the background, and was still successful. But a second typist in the
room might pose a problem, says Tygar, who envisions a hacker parking a car
outside an open window, and leaving a long-range microphone trained on the
target, to record keystrokes.
The algorithm has been programmed in English, but
most other languages should be possible as they contain similar digrams and
spelling and grammar rules, Tygar says. “We have not tried experiments in
other languages, but we don’t see any barrier at all,” he
adds.
The paper will be presented at the Conference on
Computer and Communications Security in Alexandria, Virginia in November
2005.
Posted: Mon - November 21, 2005 at 10:53 PM