DeepEar: Robust Smartphone Audio Sensing in Unconstrained Acoustic Environments using Deep Learning

Cited by: 202
Authors
Lane, Nicholas D. [1]
Georgiev, Petko [2]
Qendro, Lorena [3]
Affiliations
[1] Bell Labs, Cambridge, England
[2] Univ Cambridge, Cambridge CB2 1TN, England
[3] Univ Bologna, I-40126 Bologna, Italy
Source
PROCEEDINGS OF THE 2015 ACM INTERNATIONAL JOINT CONFERENCE ON PERVASIVE AND UBIQUITOUS COMPUTING (UBICOMP 2015) | 2015
Keywords
Mobile Sensing; Deep Learning; Audio Sensing; Speech Recognition
DOI
10.1145/2750858.2804262
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
080201 [Mechanical Manufacturing and Automation];
Abstract
Microphones are remarkably powerful sensors of human behavior and context. However, audio sensing is highly susceptible to wild fluctuations in accuracy when used in the diverse acoustic environments (such as bedrooms, vehicles, or cafes) that users encounter on a daily basis. Towards addressing this challenge, we turn to the field of deep learning, an area of machine learning that has radically changed related audio modeling domains like speech recognition. In this paper, we present DeepEar, the first mobile audio sensing framework built from coupled Deep Neural Networks (DNNs) that simultaneously perform common audio sensing tasks. We train DeepEar with a large-scale dataset including unlabeled data from 168 place visits. The resulting learned model, involving 2.3M parameters, enables DeepEar to significantly increase inference robustness to background noise beyond conventional approaches present in mobile devices. Finally, we show DeepEar is feasible for smartphones by building a cloud-free DSP-based prototype that runs continuously, using only 6% of the smartphone's battery daily.
Pages: 283-294
Page count: 12