Speech and Signal Processing

Part of the Data Analytics lab, the Speech and Signal Processing group focuses on automatic speech recognition and prosody analysis of the large volumes of voice recordings generated across diverse Xerox Services businesses, with the aim of deriving new, actionable insights.

Project Themes

  • Audio analytics
  • Healthcare

Analytics for Audio

XRCI is developing technology to understand speech and audio using state-of-the-art speech recognition systems that are heavily based on Hidden Markov Models and Gaussian Mixture Models (HMM-GMM), as well as new acoustic modeling paradigms. The primary goal is to enable high-performance, actionable, large-scale speech analytics at medium to deep granularity on large call volumes.

In addition to traditional automatic speech recognition, XRCI is interested in understanding the meta-information available in a live conversation or discourse, and in adapting language and acoustic models to the speaker and the domain of discourse.

Analytics for Healthcare

Analytics in healthcare services can play a big role in making hospitals more efficient. Analytical systems complement the efforts of caregivers, help patients manage their illnesses better, and assist healthy individuals in maintaining a healthy lifestyle.

XRCI’s ongoing signal processing project in healthcare addresses remote monitoring of vital signs and remote disease diagnostics. XRCI has developed technology for non-contact, non-intrusive capture of body vitals using camera-based devices and sophisticated signal processing. We are investigating the use of different types of cameras, such as simple webcams, thermal cameras, depth cameras, and hyperspectral cameras, to detect respiratory rate, heart rate, oxygen saturation, and other parameters. Measuring these parameters requires detailed modelling and accurate calibration of the signals against ground truth from traditional healthcare devices.
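As a rough illustration of how a rate such as heart rate might be recovered from a camera-derived signal, the sketch below finds the dominant frequency in a one-dimensional trace (for example, the mean pixel intensity of a skin region in each video frame). This is a generic spectral-peak approach and is not taken from XRCI's system. The function name `estimate_rate_per_minute`, the band limits, and the synthetic trace are all assumptions made for the example.

```python
import numpy as np


def estimate_rate_per_minute(trace, fps, low_hz=0.7, high_hz=4.0):
    """Return the dominant periodic rate (cycles per minute) in a 1-D trace.

    Hypothetical helper: `trace` holds one sample per video frame, `fps` is
    the frame rate, and [low_hz, high_hz] is an assumed plausible band for
    the physiological signal of interest.
    """
    x = np.asarray(trace, dtype=float)
    x = x - x.mean()                      # remove the constant offset
    windowed = x * np.hanning(len(x))     # taper to reduce spectral leakage
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)

    band = (freqs >= low_hz) & (freqs <= high_hz)
    if not band.any():
        raise ValueError("trace too short for the requested frequency band")

    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz


# Synthetic stand-in for a camera trace: a 1.2 Hz oscillation plus noise
# on top of a constant brightness level (assumed values, not real data).
fps = 30.0
t = np.arange(600) / fps                  # 20 seconds of frames
rng = np.random.default_rng(0)
trace = 100.0 + 0.5 * np.sin(2 * np.pi * 1.2 * t) + 0.1 * rng.standard_normal(t.size)

rate = estimate_rate_per_minute(trace, fps)
print(f"estimated rate: {rate:.1f} cycles per minute")
```

The synthetic trace oscillates at 1.2 Hz, i.e. 72 cycles per minute, and a 20-second window at 30 frames per second gives a frequency resolution of 0.05 Hz (3 cycles per minute). In practice, an estimate like this would still need to be calibrated against a reference device, as noted above.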