From a plurality of received audio signals, a signal interval is detected in which a talker collision exists between at least a first and a second audio signal. A processor receives the positive detection result and, in response, processes at least one of the colliding audio signals to make it perceptually distinguishable. A mixer mixes the audio signals to provide an output signal, in which each processed signal replaces the corresponding received signal. In exemplary embodiments, the signal content involved in the talker collision is separated in frequency or in time. The invention may be useful in a conference hosting system.
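The detect, process, and mix sequence described above can be illustrated with a minimal sketch. It is not the claimed method: the abstract does not say how a collision is detected or how the separation is carried out, so the sketch assumes a simple per-frame energy threshold for detection and a plain delay of the second signal for separation in time. All function names, the frame length, and the threshold are hypothetical.

```python
from typing import List, Sequence


def frame_energies(signal: Sequence[float], frame_len: int) -> List[float]:
    """Mean energy of each complete, non-overlapping frame."""
    return [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def detect_collision_frames(first: Sequence[float], second: Sequence[float],
                            frame_len: int, threshold: float) -> List[int]:
    """Indices of frames in which both signals are active at once.

    Assumption: "active" means frame energy above a fixed threshold.
    """
    e1 = frame_energies(first, frame_len)
    e2 = frame_energies(second, frame_len)
    return [k for k, (a, b) in enumerate(zip(e1, e2))
            if a > threshold and b > threshold]


def mix(signals: Sequence[Sequence[float]]) -> List[float]:
    """Sample-wise sum; shorter signals are treated as zero-padded."""
    n = max(len(s) for s in signals)
    return [sum(s[i] if i < len(s) else 0.0 for s in signals)
            for i in range(n)]


def process_and_mix(first: Sequence[float], second: Sequence[float],
                    frame_len: int = 4, threshold: float = 0.01) -> List[float]:
    """Detect a collision, separate the second signal in time, then mix.

    Simplification: everything in the second signal from the start of the
    collision onward is delayed by the length of the collision span. This
    does not check whether the delayed content collides again later.
    """
    collisions = detect_collision_frames(first, second, frame_len, threshold)
    if not collisions:
        return mix([first, second])  # negative detection: mix unchanged

    start = collisions[0] * frame_len
    end = (collisions[-1] + 1) * frame_len
    delay = end - start

    # The processed signal replaces the received second signal in the mix.
    processed = list(second[:start]) + [0.0] * delay + list(second[start:])
    return mix([first, processed])


if __name__ == "__main__":
    first = [1.0] * 8 + [0.0] * 8                  # talker 1: samples 0-7
    second = [0.0] * 4 + [0.5] * 8 + [0.0] * 4     # talker 2: samples 4-11
    print(detect_collision_frames(first, second, 4, 0.01))
    print(process_and_mix(first, second))
```

In this toy input the two talkers overlap in one frame; after processing, the second talker's speech begins only once the first has stopped, so the mixed output contains no overlapping speech. Separation in frequency, the other option the abstract names, would replace the delay step with a spectral shift or filtering of one signal.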