PROBLEM TO BE SOLVED: To provide a time-frequency mask estimator learning device that trains a time-frequency mask estimator using the real part and the imaginary part of the short-time Fourier transform (STFT) of an observed signal.

SOLUTION: The device includes a time-frequency mask estimator and a learning unit. The time-frequency mask estimator includes: a CNN processing unit that executes convolutional neural network processing on values in which the real part of the STFT spectrum of the observed signal and the corresponding imaginary part are each treated as real numbers; a BN processing unit that executes batch normalization, using a parameter common to the normalization operation, on the real part of the STFT spectrum of the observed signal and the corresponding imaginary part treated as a real number; and a GLU processing unit that executes gated linear unit processing on a value obtained by combining the real part of the STFT spectrum of the observed signal and the corresponding imaginary part treated as a real number. The learning unit learns the parameters of the time-frequency mask estimator so as to minimize a cost function between a known target sound and the value obtained by multiplying a known observed signal, formed by superimposing the known target sound and known noise, by the time-frequency mask.

[Selected drawing] Figure 1
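The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the claimed implementation: the CNN stage is omitted, and the frame length, hop size, Hann window, mean-squared-error cost, and all function names (`stft`, `to_real_channels`, `shared_batch_norm`, `glu`, `masked_cost`) are assumptions made for illustration. In particular, reading "a parameter common to the normalization operation" as a single mean and variance shared by the real and imaginary channels is one possible interpretation of the abstract.

```python
import numpy as np


def stft(x, frame_len=64, hop=32):
    """Naive STFT: Hann-windowed frames followed by a real FFT.

    Returns a complex array of shape (frames, bins).
    Frame length, hop and window are illustrative assumptions.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop: i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=-1)


def to_real_channels(spec):
    """Treat the real and imaginary parts as two real-valued channels."""
    return np.stack([spec.real, spec.imag])  # shape (2, frames, bins)


def shared_batch_norm(x, eps=1e-5):
    """Normalize with ONE mean and variance shared by both channels.

    Assumption: this is one reading of the 'common parameter' wording.
    """
    return (x - x.mean()) / np.sqrt(x.var() + eps)


def glu(a, b):
    """Gated linear unit: linear path a gated by sigmoid(b)."""
    return a * (1.0 / (1.0 + np.exp(-b)))


def masked_cost(mask, observed_spec, target_spec):
    """Assumed cost: mean squared error between mask * observed and target."""
    return np.mean(np.abs(mask * observed_spec - target_spec) ** 2)


# Known target sound and known noise, superimposed into a known observed signal.
rng = np.random.default_rng(0)
t = np.arange(512)
target = np.sin(2 * np.pi * 8 * t / 64)
noise = 0.3 * rng.standard_normal(512)
observed = target + noise

observed_spec = stft(observed)
target_spec = stft(target)
noise_spec = stft(noise)

# Real and imaginary parts as real-valued features, then shared normalization.
features = to_real_channels(observed_spec)
normalized = shared_batch_norm(features)

# Illustrative gating only: channel 0 is the linear path, channel 1 the gate.
gated = glu(normalized[0], normalized[1])

# Cost of a trivial all-ones mask; training would adjust the estimator's
# parameters so that its predicted mask drives this value down.
cost_identity = masked_cost(np.ones(observed_spec.shape), observed_spec, target_spec)
```

With an all-ones mask the cost reduces to the mean noise power in the STFT domain, because the STFT is linear; a trained estimator is expected to produce a mask with a lower cost than this baseline.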