Speech processing can extract audio features that characterize each speaker individually.
These features describe temporal intonation patterns (prosody) or voice quality characteristics.
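
As an illustration only (not the group's actual pipeline), the following sketch uses the open-source librosa library to compute a few such features: the fundamental frequency contour and frame energy as prosodic cues, and MFCC averages as a coarse proxy for voice quality. Function and variable names are assumptions chosen for the example.

    # Minimal sketch: speaker-characterizing features with librosa (assumed toolkit).
    import numpy as np
    import librosa

    def extract_features(path):
        y, sr = librosa.load(path, sr=16000)       # mono audio at 16 kHz
        # Frame-level pitch contour (NaN on unvoiced frames)
        f0, voiced_flag, voiced_prob = librosa.pyin(
            y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
        )
        rms = librosa.feature.rms(y=y)[0]          # frame-level energy
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
        return {
            "f0_mean": float(np.nanmean(f0)),      # average pitch, unvoiced frames ignored
            "f0_std": float(np.nanstd(f0)),        # pitch variability (intonation range)
            "energy_mean": float(rms.mean()),
            "mfcc_mean": mfcc.mean(axis=1),        # 13-dim spectral summary
        }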

The robust analysis of such features, together with the development of dedicated models, can find application in many areas of human activity.

We are currently developing this technology within two main fields:

  • The analysis of dysarthric speech, to improve speech recognition systems and diagnostic tools for dysarthric speech evaluation. One of the objectives is to optimize automatic speech recognition systems for dysarthric speakers.
  • Emotional speech processing
    The automatic analysis of audio speech features can be used not only for speech recognition but also to acquire information about a speaker's emotions, mood and personality. Research in this field aims at developing specific models that describe these dimensions from speech; a minimal sketch of this idea follows the list.
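
The sketch below shows, under stated assumptions, how utterance-level feature vectors (such as the output of the extract_features example above) could feed a standard scikit-learn classifier to predict emotion labels. The data, labels and model choice are placeholders, not the group's method.

    # Hedged sketch: emotion recognition from acoustic features with scikit-learn.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Placeholder data: one row of acoustic features per utterance, with annotated labels.
    rng = np.random.default_rng(0)
    X = rng.random((20, 16))                                     # feature matrix (dummy values)
    y = rng.choice(["neutral", "happy", "sad", "angry"], size=20)

    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    model.fit(X, y)                     # train on annotated speech features
    print(model.predict(X[:3]))         # predicted emotion labels for three utterances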

Applications range from the monitoring of worker stress in the workplace, to the detection of harmful situations, to the analysis of psychiatric or neurological diseases.

We are currently developing tools for the design of a clinical decision support system for the diagnosis of psychiatric diseases, such as bipolar disorder and attention deficit hyperactivity disorder, with the goal of improving therapeutic interventions.
