EECE 5699: Computer Hardware and System Security Lab 3 Fall 2022
Lab 3 - Acoustic Emanation Attacks
1 Description
In addition to power consumption and EM emanations, other physical side channels, including acoustic and thermal ones, can also leak information effectively. In this lab, you will implement an attack that steals a secret from an audio recording of keyboard typing. You will first need a profile of keystroke characteristics, built by preprocessing a set of audio recordings of each character. You will use this profile to train a neural network, and then run the neural network on the audio recording of the secret typing for inference. In the end, you will submit the secret flag you have stolen from the given audio recording.
All audio recordings used in this lab were collected from a Dell keyboard using the microphone on a set of iPhone headphones. The audio files contain one channel of data sampled at 44.1 kHz with 16-bit samples. Each file in the training set records one key being typed more than 100 times. The secret audio file contains a recording of the secret typing, an 8-character passphrase. Your job is to create a neural net using the files in the training set, and then use your trained neural net to recover the passphrase. You may not recover all eight characters, but you can make a guess based on what you have recovered.
Matlab and Python both have rich machine learning and signal processing toolboxes/packages. You can pick either one. The Python package scikit-learn is a fantastic tool for data science beginners. It hides the complications but is more than sufficient to complete this lab. Here is a list of files provided to you:
1. Training dataset: 26 keys with 100 clicks per key. Example file name: a.wav
2. Testing data: 26 keys with 8 clicks per key. Example file name: a-test.wav
3. Secret audio recording. File name secret0.wav
4. A function to extract keystroke touch peaks from the audio recording files: extractKeyStroke.m for Matlab users and extractKeyStroke.py for Python users.
2 Accept Your Assignment on GitHub
As in prior labs, please click the following link to accept the assignment: https://classroom.github.com/a/lxU1cbA9
3 Preprocessing Raw Data
In this step, you will preprocess the raw data in the audio recording files and prepare it as input for the neural net. Each input to the neural net consists of one measurement and one label. The measurement is an audio recording of one key typing event. The label is the corresponding character for the key, but you can also assign a number to each character. For example, you can assign 0 to a and 1 to b.
Each file in the training set contains more than 100 measurements, and its filename is the label. You will need to create 100 inputs from each file and feed them to your neural net.
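The label assignment described above can be sketched in Python as follows. This is a minimal illustration and not part of the provided lab files: the names `CHAR_TO_LABEL`, `label_for_file`, and `labels_for_file` are hypothetical, and the sketch assumes the files are named after their key (a.wav, a-test.wav, and so on) as in the file list.

```python
import os
import string

# Letters a-z in order, so 'a' -> 0, 'b' -> 1, ..., 'z' -> 25.
CHAR_TO_LABEL = {ch: i for i, ch in enumerate(string.ascii_lowercase)}


def label_for_file(path):
    """Return the integer label encoded in a file name such as 'a.wav'."""
    stem = os.path.splitext(os.path.basename(path))[0]
    key = stem.split("-")[0]  # also handles names like 'a-test.wav'
    return CHAR_TO_LABEL[key]


def labels_for_file(path, count=100):
    """One label per extracted keystroke: all keystrokes in a file share a label."""
    return [label_for_file(path)] * count
```

For example, `labels_for_file("b.wav")` yields a list of 100 ones, to be paired row by row with the 100 measurements extracted from that file.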
To view an audio file, you can use the following commands in Matlab. Similar functions are available in Python. The following plots were generated in Matlab.
>> rawSound = audioread('a.wav');
>> plot(rawSound)
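For Python users, a rough equivalent of the Matlab commands above is sketched below using only the standard library. It assumes a mono, 16-bit PCM file, as described for this lab, and scales samples to the range [-1, 1) as Matlab's audioread does by default. The function name `read_wav_mono16` is hypothetical.

```python
import array
import sys
import wave


def read_wav_mono16(path):
    """Read a mono 16-bit PCM WAV file.

    Returns (sample_rate, samples) with samples scaled to [-1.0, 1.0).
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected a mono, 16-bit WAV file")
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    pcm = array.array("h")  # signed 16-bit integers
    pcm.frombytes(raw)
    if sys.byteorder == "big":  # WAV sample data is little-endian
        pcm.byteswap()
    return sample_rate, [s / 32768.0 for s in pcm]


# Example usage (plotting assumes matplotlib is installed):
#   rate, raw_sound = read_wav_mono16("a.wav")
#   import matplotlib.pyplot as plt
#   plt.plot(raw_sound)
#   plt.show()
```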
Figure 1: The audio measurements for typing the key “a” 100 times (x-axis: sample index, 0 to 2.5 × 10^6; y-axis: amplitude, -0.8 to 0.8)
File “a.wav” is shown in Figure 1. As you can see, there are 100 high peaks, each of which marks a finger pushing the key. If you zoom in, you can see a single keystroke more clearly, as shown in Figure 2. The push peak and the release peak are about 100 ms apart. Moreover, the push peak region actually consists of a touch peak and a hit peak, and most of the information about a key is contained in the touch peak. You should extract the touch peaks for training the neural net rather than using the entire trace.
Figure 2: Zoomed-in view of a single keystroke from “a.wav” (x-axis: sample index, around 6.8 × 10^4; y-axis: amplitude, -0.25 to 0.15)
3.1 Extracting Input Data
Use the given function extractKeyStroke to parse and extract input data from an audio file. Read the function and understand what it does. You will need to provide a threshold and the number of keystrokes that you want to extract from the audio file. You may use 15 for the threshold and 100 for the number of keystrokes when the input is one of the training files a.wav through z.wav. If the threshold value is correct, the output of extractKeyStroke will be a 100 by 441 matrix, i.e. 100 keystrokes of 441 samples (10 ms at 44.1 kHz) each.
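To illustrate how fixed-width windows can be cut from a trace, here is a simple threshold-based sketch. It is not the provided extractKeyStroke function, whose threshold may be defined differently. The function name, the amplitude-based threshold, and the `gap` parameter (250 ms at 44.1 kHz) are all assumptions made for illustration.

```python
import numpy as np


def extract_windows_sketch(signal, threshold, max_count, width=441, gap=11025):
    """Cut fixed-width windows starting where |signal| first exceeds threshold.

    After each detection, skip `gap` samples so that the hit and release
    peaks of the same keystroke are not detected again.
    Returns an array of shape (n_found, width) with n_found <= max_count.
    """
    signal = np.asarray(signal, dtype=float)
    windows = []
    i = 0
    last_start = len(signal) - width
    while i <= last_start and len(windows) < max_count:
        if abs(signal[i]) > threshold:
            windows.append(signal[i:i + width])
            i += max(gap, width)  # jump past the rest of this keystroke
        else:
            i += 1
    if not windows:
        return np.empty((0, width))
    return np.vstack(windows)
```

If fewer rows come back than requested, the threshold is probably too high for that recording. If windows start on background noise, it is probably too low.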
Once the peaks are extracted, you can use the fft function to transform the time-domain samples into the frequency domain. With the default FFT settings, the output is a matrix of the same size as the input, but in the frequency domain. Note that Matlab's fft operates along the columns of a matrix by default, so check that each keystroke lies along the dimension being transformed.
>> input = abs(fft(peak))
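A Python counterpart of the command above might look like the following sketch. It assumes `peaks` is a NumPy array with one keystroke per row (for example 100 by 441) and states the axis explicitly so that the transform runs along each keystroke. The function name `to_frequency_domain` is hypothetical.

```python
import numpy as np


def to_frequency_domain(peaks):
    """Return the FFT magnitude of each row; the shape matches the input."""
    peaks = np.asarray(peaks, dtype=float)
    return np.abs(np.fft.fft(peaks, axis=1))
```

For real-valued input the magnitude spectrum is symmetric, so the second half of each row carries no extra information. Using `np.fft.rfft` instead would keep only the 221 non-redundant bins of a 441-sample window.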
4 Building a Neural Net
You can use a neural net such as patternnet, newpr, or others in Matlab for your attack. To use patternnet, run the following commands:
>> trainFcn = 'trainscg';
>> net = patternnet(hiddenLayerSize, trainFcn);
>> net = train(net, input, target);
In Matlab, if you have access to a decent computer, you can use trainbr as your training function instead of trainscg. It takes a lot of memory and can be quite slow, but it is generally better for complex problems. Use the following command to see the list of available training functions in Matlab.
>> help nntrain
In Python, with scikit-learn, you can use the following code:
from sklearn.neural_network import MLPClassifier
mlp = MLPClassifier(hidden_layer_sizes=(hiddenLayerSize,), verbose=True)
mlp.fit(inputs, target)
You may need to adjust hiddenLayerSize to increase the accuracy of your neural net. If applicable, you can also try adding more hidden layers to the neural net to get a better-performing model.
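As an illustration of adding hidden layers with scikit-learn, the sketch below trains a two-layer MLPClassifier on synthetic stand-in data. The layer sizes (64, 32), the iteration limit, and the random data are arbitrary assumptions for demonstration, not recommended settings for this lab. In practice `inputs` would hold the FFT magnitudes and `target` the integer labels.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for real features: 3 classes, 30 samples each, 20 features.
n_classes, n_per_class, n_features = 3, 30, 20
centers = rng.normal(scale=5.0, size=(n_classes, n_features))
inputs = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
target = np.repeat(np.arange(n_classes), n_per_class)

# Each entry in hidden_layer_sizes adds one hidden layer of that width.
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
mlp.fit(inputs, target)

predicted = mlp.predict(inputs)            # one label per row
probabilities = mlp.predict_proba(inputs)  # one row of class probabilities per sample
```

Accuracy measured on the training data itself is optimistic, so compare configurations on held-out keystrokes such as the test files.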
5 Accuracy Test
Once you have trained your neural net, it is time to test it. Use the test audio files, such as a-test.wav. Each of these audio files contains at least 8 keystrokes of the letter indicated in the file name. Use your trained neural net to classify each keystroke and report the following for each of these audio files:
1. How many times the probability of the correct key is the highest
2. How many times the probability of the correct key is the second largest
3. How many times the probability of the correct key is the third largest
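One way to compute these three counts from class probabilities is sketched below. It assumes `probabilities` has one row per keystroke and one column per key, in the same order as the integer labels (with scikit-learn this could come from `predict_proba`, whose columns follow `classes_`). The function name `rank_counts` is hypothetical.

```python
import numpy as np


def rank_counts(probabilities, true_label, top=3):
    """Count how often true_label is ranked 1st, 2nd, ..., top-th by probability.

    Returns a list of `top` integers; keystrokes ranked lower are not counted.
    """
    probabilities = np.asarray(probabilities, dtype=float)
    # Column indices sorted from most to least probable, per row.
    order = np.argsort(-probabilities, axis=1, kind="stable")
    counts = [0] * top
    for row in order:
        rank = int(np.where(row == true_label)[0][0])  # 0 means most probable
        if rank < top:
            counts[rank] += 1
    return counts
```

Calling this once per test file, with `true_label` taken from the file name, gives one row of the accuracy table per key.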
6 Attack
You will need to extract eight peaks from the secret file, secret0.wav, and transform those peaks into the frequency domain. The process is the same as the one used to extract data points for training your neural net. With the secret peaks extracted, use your neural net to recover the secret. Hint: the secret is an English word.
>> secret_output = net(secret_input); % with patternnet
In Python:
secret_output = mlp.predict(secret_input) # with scikit-learn
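Since some characters may be misclassified, it can help to list the top few candidate letters for each of the eight positions before guessing the English word. The sketch below assumes `secret_probabilities` is an 8-by-26 array of class probabilities (for example from `mlp.predict_proba(secret_input)`) whose columns follow the a = 0, ..., z = 25 labelling. The function name `top_candidates` is hypothetical.

```python
import string

import numpy as np


def top_candidates(secret_probabilities, top=3):
    """Return, for each keystroke, the `top` most probable letters (best first)."""
    probs = np.asarray(secret_probabilities, dtype=float)
    order = np.argsort(-probs, axis=1, kind="stable")[:, :top]
    return ["".join(string.ascii_lowercase[j] for j in row) for row in order]
```

Each returned string lists the candidates for one position, so reading the first letters gives the most likely passphrase and the remaining letters suggest substitutions where that guess is not a word.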
7 What You Need to Turn In
To receive credit for this lab, you will need to turn in all your code on GitHub, along with a PDF containing the following items:
1. A table showing the accuracy test results for all 26 keys
2. The recovered secret
2022-10-18