Abstract
Performance optimization in speaker recognition is a challenging task in the field of voice-based human-computer interaction. Many studies have shown that deep neural network methods outperform other classifiers. However, these methods have many hyperparameters and require extensive tuning to achieve optimal performance on different supervised learning tasks. In this paper, we show that choosing a good combination of hyperparameters can significantly improve the performance of a deep neural network trained with stochastic gradient descent for automatic speaker recognition, even in a noisy environment. The hyperparameters analyzed are the learning rate and the input- and hidden-layer dropout rates.
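To make the three hyperparameters concrete, the following is a minimal sketch (not the paper's implementation) of a one-hidden-layer network trained with stochastic gradient descent, exposing the learning rate, the input-layer dropout rate, and the hidden-layer dropout rate as explicit parameters. All names, layer sizes, and the toy data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict(x, W1, W2):
    # Clean forward pass (no dropout), as used at evaluation time.
    return sigmoid(W2 @ sigmoid(W1 @ x))

def train_step(x, y, W1, W2, lr, p_in, p_hid):
    """One SGD update on a single (x, y) pair with inverted dropout.

    lr    -- learning rate
    p_in  -- input-layer dropout rate
    p_hid -- hidden-layer dropout rate
    """
    # Input-layer dropout: randomly zero input features, rescaling the
    # survivors so the expected activation is unchanged.
    mask_in = (rng.random(x.shape) >= p_in) / (1.0 - p_in)
    x_d = x * mask_in
    h = sigmoid(W1 @ x_d)
    # Hidden-layer dropout, same inverted-dropout scheme.
    mask_hid = (rng.random(h.shape) >= p_hid) / (1.0 - p_hid)
    h_d = h * mask_hid
    out = sigmoid(W2 @ h_d)
    # Backpropagate a squared-error loss through both layers.
    d_out = (out - y) * out * (1.0 - out)
    d_hid = (W2.T @ d_out) * mask_hid * h * (1.0 - h)
    W2 -= lr * np.outer(d_out, h_d)
    W1 -= lr * np.outer(d_hid, x_d)

# Toy usage: 4 input features, 8 hidden units, 2 output classes.
W1 = rng.normal(0.0, 0.5, (8, 4))
W2 = rng.normal(0.0, 0.5, (2, 8))
x = rng.random(4)
y = np.array([1.0, 0.0])

loss_before = 0.5 * np.sum((predict(x, W1, W2) - y) ** 2)
for _ in range(300):
    train_step(x, y, W1, W2, lr=0.5, p_in=0.2, p_hid=0.5)
loss_after = 0.5 * np.sum((predict(x, W1, W2) - y) ** 2)
print(loss_before, loss_after)
```

In a real speaker-recognition setting these three values would be searched jointly (e.g. over a grid), since, as the abstract notes, their combination rather than any single value drives the performance gain.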