We propose an instantaneous amplitude (IA)-based model for speech signal representation. The model avoids the difficulty of handling time-varying phases and makes it straightforward to apply an optimization procedure that brings the synthetic signal as close as possible to the original. Experiments show that speech synthesized with this technique is of excellent quality and perceptually almost indistinguishable from the original. Initial work on coding the IA model parameters yields high-quality synthesized speech at an average bit rate of 8 kbit/s.