Detecting Software Vulnerability using Deep Learning
Originally, this project applied for computing resources to detect software vulnerabilities using deep learning. However, the project changed direction, and the computing resources were used for our AI security research instead. The current project title is "Auditing privacy leakage in ASRs (Automatic Speech Recognition Systems)".
With the rapid development of deep learning techniques, voice services implemented on various Internet of Things (IoT) devices are becoming increasingly popular. In this project, we examine user-level membership inference in the problem space of voice services by designing an audio auditor that verifies, under strict black-box access, whether a specific user had unwillingly contributed audio used to train an automatic speech recognition (ASR) model. Using a user-level representation of the input audio data and the corresponding transcribed text, our trained auditor is effective at user-level auditing. We also observe that an auditor trained on specific data generalizes well regardless of the ASR model architecture. We validate the auditor on ASR models built with LSTM, RNN, and GRU architectures on two state-of-the-art pipelines: the hybrid ASR system and the end-to-end ASR system.
Principal investigator
Chao Chen email@example.com
Area of science
Systems used
Topaz and Zeus
We define user-level membership inference as follows: when the target model is queried with a user's data, that user is a user-level member of the training set if any of the user's data is in the target model's training set, even if the queried data themselves are not members of the training set. The challenges of user-level membership inference on ASRs via black-box access are:
1) Lack of information about the target model. Under strict black-box access, little is known about the target model's performance, so it is hard for shadow models to mimic the target model.
2) User-level inference requires a higher level of robustness than record-level inference. Unlike record-level inference, user-level inference needs to consider the speaker's voice characteristics.
3) ASR systems have complicated learning architectures, which makes membership inference with shadow models computationally intensive and time-consuming.
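The distinction between record-level and user-level membership defined above can be made concrete with a small sketch. This is only an illustration of the definition, not the project's code: the function names and the dictionary layout (user id mapped to a set of record ids) are assumptions, and a real auditor under black-box access cannot inspect the training set like this.

```python
from typing import Dict, Set

# Hypothetical layout: user id -> ids of that user's records in the training set.
TrainingSet = Dict[str, Set[str]]


def is_record_level_member(user_id: str, record_id: str, training: TrainingSet) -> bool:
    """True only if this exact record was used for training."""
    return record_id in training.get(user_id, set())


def is_user_level_member(user_id: str, training: TrainingSet) -> bool:
    """True if the user contributed at least one record to the training set,
    regardless of which of the user's records is used as the query."""
    return len(training.get(user_id, set())) > 0


training_set: TrainingSet = {"alice": {"a1", "a2"}}

# A new recording "a9" from alice is not a record-level member,
# but alice is still a user-level member.
print(is_record_level_member("alice", "a9", training_set))
print(is_user_level_member("alice", training_set))
print(is_user_level_member("bob", training_set))
```

The ground truth computed here is what the auditor tries to predict from the target model's outputs alone.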
In this project, we design and evaluate our audio auditor to help users determine whether their audio records have been used to train an ASR model without their consent. We investigate two types of target ASR models: a hybrid ASR system and an end-to-end ASR system. Given an audio signal as input, both models transcribe speech into written text. The auditor queries the target model via strict black-box access in order to infer user-level membership. The target model will behave differently depending on whether the audio it transcribes comes from its training set or from other datasets. Thus, one can analyze the transcriptions and use the outputs to train a binary classifier as the auditor. As our primary focus is to infer user-level membership, instead of using the ranked lists of several top output results, we use only one text output, the user's speaking speed, and the input audio's true transcription when analyzing the transcription outputs.
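As a rough sketch of how the three inputs named above (one text output, the speaking speed, and the true transcription) could be turned into features for a binary classifier, the following computes a word error rate and a words-per-second speed for each record and averages them per user. The specific features, the averaging step, and all function names are assumptions made for illustration; the source does not specify how the auditor's features are built.

```python
from typing import List, Sequence, Tuple


def word_edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two word sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def record_features(true_text: str, asr_text: str, duration_s: float) -> Tuple[float, float]:
    """Return (word error rate, speaking speed in words per second) for one record."""
    ref = true_text.lower().split()
    hyp = asr_text.lower().split()
    wer = word_edit_distance(ref, hyp) / max(len(ref), 1)
    speed = len(ref) / duration_s
    return wer, speed


def user_features(records: List[Tuple[str, str, float]]) -> List[float]:
    """Average the per-record features over all of one user's query records.
    Each record is (true transcription, ASR output, audio duration in seconds)."""
    feats = [record_features(*rec) for rec in records]
    n = len(feats)
    return [sum(f[0] for f in feats) / n, sum(f[1] for f in feats) / n]


alice_queries = [
    ("the cat sat", "the cat sat", 1.5),
    ("hello world", "hello", 1.0),
]
print(user_features(alice_queries))
```

Feature vectors like these, labelled as member or non-member using shadow models, could then be passed to any off-the-shelf binary classifier to act as the auditor.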
As noted above, this project requires tremendous computing resources, as we need to train several speech recognition models using LSTM, RNN, and GRU architectures to evaluate our audio auditor. Each model took weeks of training even with GPUs. Without Pawsey's support, this project would not have been successful.
List of Publications
Miao, Y., Xue, M., Chen, C., Pan, L., Zhang, J., Kaafar, D., Zhao, B.Z.H., and Xiang, Y., 2020. The Audio Auditor: User-Level Membership Inference in Internet of Things Voice Services. Submitted to The 20th Privacy Enhancing Technologies Symposium.
Miao, Y., Zhao, B.Z.H., Xue, M., Chen, C., Pan, L., Zhang, J., Kaafar, D., and Xiang, Y., 2019. The Audio Auditor: Participant-Level Membership Inference in Voice-Based IoT. In ACM Conference on Computer and Communications Security (CCS) Workshop on Privacy Preserving Machine Learning.