
Interaction Analysis over Speech for Call Centre

Veerus Dmello, Parth Kholkute, Rhea Kolhapurkar, Nicholas Patric
Computer Department
St. Francis Institute of Technology
Mumbai, India

Ms. Ankita Karia
Assistant Professor, Computer Department
St. Francis Institute of Technology
Mumbai, India

Abstract - The aim is to apply well-known data mining techniques to the problem of predicting the quality of interactions such as those conducted in call centres, and to the problem of predicting the quality of service. The analysis of call centre conversations will provide useful insights for enhancing Call Centre Analytics to a level that will enable new metrics and key performance indicators (KPIs) beyond the standard approach. The main operations will be speaker diarization, speech to text, agent analysis, emotion recognition and hot-topic analysis. RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) is the database used for emotion analysis, and CMU Sphinx is used for speech to text transcription. Emotional analysis uses Convolutional Neural Networks to recognize emotions. The agent analysis and the hot topic and root cause analysis use Natural Language Processing (NLP) techniques. All of this will happen in real time while the call is taking place. The data gathered will be stored in a database for easy retrieval and for maintaining records.

Index Terms - Agent Analysis, Emotion recognition, Hot topic analysis, Interaction mining, Speech to Text

I. INTRODUCTION
This paper presents a call monitoring system to assess calls live and help call centres to collect data to analyse their service and gain insights from them. In our increasingly industrialized and globalized world, a large number of companies include call centres in their structures and more than $300 billion is spent annually on call centres around the world. For a customer, addressing the call centre actually means addressing the company itself, and any negative experience on the part of the customer can lead to the rejection of company products and services. Such data analysis can help improve the quality of customer service and lower the costs.

Call centre optimization is an important part of customer relationship management, which consists of people, processes, technology and strategies. Service quality of a call centre is the result of a comparison of actual service performance and customer expectations. Evaluating the service quality offered by a customer service agent to a customer is more difficult than evaluating product quality.

Call centres provide services for many types of sectors such as telecommunication, finance, transportation, health, automotive etc. Several studies have proposed various approaches and solutions for the problem of evaluating agent performance. Performance evaluation in call centres is generally performed by listening to randomly selected recorded calls and evaluating the words one by one in the related conversation. There is an obvious demand for automatic performance evaluation systems to reduce employee costs and to increase time efficiency. Takeuchi has analysed the recorded calls from a rental car reservation office with Trigger Segment Detection to find whether a customer has the intention of booking a car or not. This system is used to analyse the content of call centre conversations and detect the main issue addressed in the call. None of the existing methods covers the full range of call centre analysis or gives worthwhile insights on this problem.

Call Centre Analytics is aimed at solving the above issue by enabling tapping into the content of conversations. Purely text-based content analysis approaches are highly sensitive to input quality, and conversational input is fundamentally different from text. Therefore, conversations should be treated differently. First, we seek to diarize the speech so as to separate the components of the customer and the agent. Thereafter, we use a mixture of emotional analysis and speech to text transcription while performing agent analysis through Natural Language Processing (NLP) technology, which needs to be
adapted to and be robust enough to deal with the conversational domain in order to achieve acceptable performance. Moreover, the level of analysis of conversation cannot be set to semantics only. It must consider the purpose of language in its context, i.e., pragmatics. Our approach to Call Centre Analytics is based on Interaction Mining. Interaction Mining is a new research field aimed at extracting useful information from conversations. In contrast to Text Mining, Interaction Mining is more robust, tailored for the conversational domain, and slanted towards pragmatic and discourse analysis. All these details are stored in a database that will contain data about the current and previous calls to further aid the agents.

II. LITERATURE SURVEY
Call centres provide services for many types of sectors such as telecommunication, finance, transportation, health, automotive etc. Several studies have proposed various approaches and solutions for the problem of evaluating agent performance. Performance evaluation in call centres is generally performed by listening to randomly selected recorded calls and evaluating the words one by one in the related conversation. There is an obvious demand for automatic performance evaluation systems to reduce employee costs and to increase time efficiency. Takeuchi has analysed the recorded calls from a rental car reservation office with Trigger Segment Detection to find whether a customer has the intention of booking a car or not. Mishne has proposed a call centre monitoring system that uses text analytics and information retrieval methods. The system is used to analyse the content of call centre conversations and detect the main issue addressed in the call. The project presented a speech analytics system that adapted automatic speech recognition and text mining technologies [1]. Minnucci (2004) reports that the metrics most required by call centre managers are indeed the qualitative ones, topped by Call Quality (100%) and Customer Satisfaction (78%) [2].

However, these performance metrics are difficult to implement with an adequate level of accuracy. For instance, the Baird study (2004) points out that for Customer Satisfaction, accuracy can be “negatively affected by insufficient number of administered surveys per agent resulting in not enough samples of individual agent’s work to constitute a representative sample. The result could be an unfair judgment of the agent’s performance and allocations of bonuses based more upon chance, good fortune than merit.” Accuracy is defined as a true indication, and it depends on the actual level of performance attainment, especially with regard to statistical validity [3].

Current approaches to Call Centre Analytics are mostly based on Speech Analytics and Text Mining, which is essentially Search and Sentiment Analysis. Recorded speech is first indexed and searched against a set of negative terms and relevant topics. There are currently two main approaches for speech indexing: i) Phonetic Transcription and ii) Large Vocabulary Conversational Speech Recognition (LVCSR). Another common approach to the analysis of call centre data is that of automatic call categorisation through supervised machine learning (Gilman et al. 2004; Zweig et al. 2006; Takeuchi et al. 2009). These methods failed to provide satisfactory results even in very broad categories. The problem still lies in data sparseness and in the huge amount of training data needed to achieve reasonable discriminatory power. Getting huge training data is not an option, also because training is highly influenced by domain specificity. Transferring trained models from one domain to another would be problematic [4].

III. CHALLENGES IDENTIFIED
The challenges identified are as follows:

A. Lack of a concrete dataset of calls
Since the call centre data collected by various organizations are confidential and contain privileged information, there is no dataset available that contains calls and the interaction between the agent and the customer. This hinders the measurement of system performance during evaluation.

B. Real time call analysis
The system proposed will perform real time analysis. Call centres have a call module that is the platform used to take and make calls. Since access to this call module is not available, the analysis will require temporary software that will act as the call module or replace it for the evaluation of the system proposed.

IV. PROBLEM DEFINITION
Call centre optimization is an important part of customer relationship management, which consists of people, processes, technology and strategies. Service quality of a call centre is the result of a comparison of actual service performance of the agent and customer satisfaction. The problem is to extract data that will help to reduce total call time and improve the service provided by the call centre, i.e. to get the real time emotion of the speaker and the agent, mine data from the speech to text transcription, and use them to rate the agent’s performance.

V. PROPOSED SYSTEM METHODOLOGY

A. System Architecture

Fig. 1: Architecture block diagram for the proposed system
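
The data flow of Fig. 1 can be sketched as a small pipeline: diarization first, then emotion analysis and speech to text on each speaker's audio, then the text-based analyses, then storage. This is only an illustrative sketch and not the authors' implementation. All names (`CallReport`, `analyse_call`, `demo`) are hypothetical, every stage is passed in as a stand-in function, and the choice to feed the agent transcript to agent analysis and the customer transcript to hot-topic analysis is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class CallReport:
    """Results collected for a single call (field names are hypothetical)."""
    emotions: Dict[str, str] = field(default_factory=dict)     # speaker -> emotion label
    transcripts: Dict[str, str] = field(default_factory=dict)  # speaker -> transcript text
    agent_scores: Dict[str, float] = field(default_factory=dict)
    hot_topics: List[str] = field(default_factory=list)


def analyse_call(
    audio: Any,
    diarize: Callable[[Any], Dict[str, Any]],
    recognise_emotion: Callable[[Any], str],
    transcribe: Callable[[Any], str],
    score_agent: Callable[[str], Dict[str, float]],
    find_hot_topics: Callable[[str], List[str]],
    store: Callable[[CallReport], None],
) -> CallReport:
    """Run the stages in the order suggested by the block diagram."""
    report = CallReport()
    # 1. Speaker diarization: split the call into agent and customer audio.
    segments = diarize(audio)
    # 2. Emotion analysis and speech to text run on each speaker's audio.
    for speaker, segment in segments.items():
        report.emotions[speaker] = recognise_emotion(segment)
        report.transcripts[speaker] = transcribe(segment)
    # 3. Text-based analyses consume the transcripts (assumed split:
    #    agent text for agent analysis, customer text for hot topics).
    report.agent_scores = score_agent(report.transcripts["agent"])
    report.hot_topics = find_hot_topics(report.transcripts["customer"])
    # 4. Persist the results.
    store(report)
    return report


def demo():
    """Wire the pipeline with trivial stand-ins; the 'audio' here is already text."""
    database: List[CallReport] = []
    report = analyse_call(
        audio={"agent": "good morning how may i help", "customer": "my bill is wrong"},
        diarize=lambda call: call,
        recognise_emotion=lambda segment: "neutral",
        transcribe=lambda segment: segment,
        score_agent=lambda text: {"greeting_score": 1.0 if "good morning" in text else 0.0},
        find_hot_topics=lambda text: [word for word in text.split() if word == "bill"],
        store=database.append,
    )
    return report, database
```

Passing the stages in as functions keeps each module independently replaceable and unit-testable; a real system would substitute the diarization, CNN, speech recognition and NLP components for the lambdas in `demo`.
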
As shown in Fig. 1, the system will process the live call. Speaker diarization will be performed on the call to break it down into two speaker components. Customer speech audio and agent speech audio will be given for Emotional Analysis, where the detected emotions are shown with the help of a graph. The audio will then be given for Speech to Text Transcription, where the output is stored in a text file. The output from the text file will be given as input for Agent Analysis and Hot Topic Analysis, where performance results of the agent will be generated and stored in the database.

B. Proposed technical solution
The solution is divided into modules; the output from all the modules will contribute to the final output, which will then be stored in a database.

Speaker Diarization – It is the process of partitioning an input audio stream into homogeneous segments according to the speaker identity. The partitions will be agent and customer and will enable us to analyze each of them individually.

Emotional Recognition - Gauge the emotional state of your customers by analyzing their voices for tell-tale variations in pitch or tone. In essence, the software determines the emotional tone behind a series of words, used to gain an understanding of the attitudes, opinions and emotions expressed. Emotion analysis is done by converting the audio into spectrograms and predicting the emotion from them.

Speech to Text Transcription - Speech to text converts the speech into text, which is stored as a .txt file. This file is then used to analyze the agents.

Agent Analysis - The agent is analyzed using the speech to text transcription on the following criteria (KPI factors) -
o Greeting score
o Number of slang words
o Number of repeated sentences
o Closing score
o Banned words
o Call length
o Whether the problem was solved or not
The above details will be stored in a .xml file for storage purposes and easy retrieval. Each .xml file will then be used to aid in further optimizing agent performance.

Hot topics with root cause analysis - Uncover and track the most frequent topics mentioned by customers, which aids in identifying trending customer satisfaction issues.

VI. PERFORMANCE EVALUATION PARAMETER
The individual modules will be unit tested according to criteria set by us. The unit tests include, but are not restricted to, test sets for all the modules.
The integrated system will evaluate scripted calls to provide results. These results will be evaluated against customer feedback. The customer feedback will help validate the performance of the system. Another evaluation parameter will include an independent team to individually go through each call recording and evaluate it by themselves. This will be measured against the real time evaluation done by the system to validate its performance.

VII. EXPERIMENTAL SET UP

Hardware requirements –
The hardware will require a high-performance CPU and GPU to perform the complex operations in the software model.
Minimum requirements -
CPU: Intel Core2 Quad Q6600 @ 2.4 GHz (or AMD Phenom 7950 Quad-Core, AMD Athlon II X4 620 equivalent)
RAM: 2 GB

Software requirements –
Microsoft Windows 10/8.1/8/7
Python 3.6 or above

The set ups for individual modules are -

Emotional Analysis –
Database in use for testing and training - RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song).
Method - Conversion to spectrogram, extraction of features and MFCCs, and use of a convolutional neural network.

Agent Analysis –
Database in use for testing and training - Raw database received from STT.
Method - Using algorithms such as cosine similarity, sequence matching, etc., to provide insights on the KPI factors mentioned in the proposed solution.

Speech to text transcription –
Database in use for testing and training - CMU Sphinx, which is a continuous-speech, speaker-independent recognition system making use of hidden Markov acoustic models (HMMs).
Method - Glob for selection of files and use of CMU Sphinx for speech recognition.

Speaker Diarization –
Method – Individually recognize the two different audio frequencies and use an RNN to automatically separate them into two different audio streams.

Hot topic analysis –
Method – Using NLP algorithms like cosine similarity, sequence matching, etc.

REFERENCES
[1] Betül Karakus, Galip Aydin, “Call Center Performance Evaluation Using Big Data Analytic.”, International Symposium on Networks, Computers and Communications (ISNCC), 2016, pp. 1-2.
[2] Vincenzo Pallotta, Rodolfo Delmonte, Lammert Vrieling, David Walker, “Interaction Mining: the new frontier of Call Center Analytics.”, In: Lai C., Semeraro G., Vargiu E. (eds) New Challenges in Distributed Information Filtering and Retrieval. Studies in Computational Intelligence, vol 439. Springer, Berlin, Heidelberg, 2013, pp. 91-111.
[3] Baird H, “Ensuring Data Validity Maintaining Service Quality in the Contact Center.”, Telecom Directions LLC, 2004, pp 3.
[4] Sathit Prasomphan, “Detecting Human Emotion via Speech Recognition by using Speech Spectrogram.”, IEEE International Conference on Data Science and Advanced Analytics (DSAA) Paris, France, IEEE 2015, pp 66-73.
