1. Introduction
Effective management of an organization depends on the right person having timely access to the
right information about that organization; this enables monitoring of activities and assessment
of the organization's performance [1]. Access to such information still poses a challenge in
organizations: information systems collect and process vast amounts of data in various forms that
are not readily perceptible to decision makers [1]. Decisions in organizations are made by human
beings and not by management information systems. As a result, the presentation of data plays a
very important role in any decision-making process, and for a strategy based on any decision to
be successful, the information has to be comprehensive and perceptive [2][3]. It is argued that
Management Information Systems (MIS) have been supporting organizations in their different tasks;
however, today the majority of these systems have undergone significant depreciation [4]. This is
because such systems have not met decision makers' expectations, such as: making decisions under
pressure; monitoring competition; possessing information on their organizations that includes
different points of view; and carrying out constant analyses of numerous data and considering
different variants of organization performance [4][5].
Business Intelligence (BI) systems come to the rescue of decision makers. Business
Intelligence is regarded as the process of taking large amounts of data, analyzing it, and
presenting a high-level set of reports that summarize that data into the basis of business actions,
enabling management to make vital day-to-day business decisions [6]. The implementation of
business intelligence systems can contribute to improved information quality in various ways:
faster access to information, easier querying and analysis, a higher level of interactivity, and
improved data consistency due to data integration processes, among others [7]. The benefits of
having a Business Intelligence system cannot be overemphasized; universities, like other
organizations, also require such systems.
Copyright 2015 The Authors www.eChallenges.org Page 1 of 7
Today more and more organizations are turning towards BI for making better business
decisions. This is supported by a Gartner report which states that BI applications have been
ranked the top technology priority for four years in a row [8]. In learning institutions
specifically, it is argued that new data analytics approaches are creating new ways of
understanding trends and behaviours in students that can be applied in improving learning
design, strengthening student retention, providing early warning signals concerning individual
students and helping to personalise the learner's experience [9].
It has been posited that closed, commercial BI tools have dominated the market as opposed
to open source ones [10]. Despite the popularity of BI in industry, there has been little focus
on its implementation in institutions of higher learning, especially in Kenya.
This study describes an open source implementation of a Business Intelligence system, taking
the case of the Technical University of Kenya (TUK). TUK was chosen as the study area because
one of the authors had over three years of working experience at the University as a software
developer. The challenges that the University was facing with regard to access to valuable,
correct, timely, and actionable information depicting the whole business picture for effective
decision making are elaborated. The most challenging decisions included determining: student
admission trends; students qualified for various scholarships in the university; success factors
of the various programmes offered; budgets for various departments; the best performing faculty
or school in the university; the student retention level; staff employment legal requirements;
needy students for hostel accommodation; and expansion of the university, among others.
According to Gartner's maturity model for Business Intelligence [11], TUK could be said to
be at the tactical stage of BI maturity: the management seemed to be starting to invest in BI;
metrics were used only at departmental level; most of the data, tools, and applications were in
silos; and users did not seem skilled enough to take advantage of a BI system. In this regard,
issues relating to information access for decision making could be broadly grouped into two
areas: data acquisition, storage, cleaning, integration, and summarization; and presentation and
timely distribution of the generated information to the right users.
The challenge in data acquisition was brought about by the presence of multiple data sources
held in silos across departments and existing in various formats. The acquisition process was
mainly manual, and in most cases done by IT officers. Unstructured data existed in log files, web
sites, emails, social media, and official documents like memos; it was difficult to store, clean,
integrate, and summarize this kind of data together with structured data from the MIS. The
decision makers needed information from these multiple disconnected data sources across all
departments.
Presentation and timely distribution of information to the right users was also a challenge; too
much time and money was spent on generating the required reports, which were produced using
spreadsheets. Valuable team members were busy creating reports instead of making decisions on
the report data. On one occasion, a retreat of some senior managers had to be organized to
collate data on staff ethnic balance, which was required urgently by a Government agency. These
reports were often rigid rather than dynamic, preventing data analysis and drill-down. Data
quality was poor and not validated, and therefore the data were not fully trusted; concerns over
data accuracy eroded the confidence needed to make important decisions. Some reports were
available only monthly or quarterly, whereas users needed them on demand.
2. Objectives
The main aim of this study was to implement a Business Intelligence system for the Technical
University of Kenya using open source tools. The specific objectives were:
1. To investigate appropriate open source tools in the implementation of the BI system.
2. To design and develop the University data warehouse.
3. To perform analytics pertaining to the University.
4. To implement dashboards for different BI users.
[Figure: diagram not recovered; surviving labels: Integrated Data, Data Mining (R), Information]
4. Technology Description
4.1 Big Data Analytics
In today's businesses, increasing standards, automation, and technologies have led to vast
amounts of data becoming available. Data warehouses, together with improved extract, transform
and load (ETL) and reporting technologies [15], enable Business Intelligence by sifting through
large amounts of data, extracting pertinent information, and turning that information into
knowledge upon which actions can be taken. It has lately been noted that the cost of data
acquisition and data storage has declined significantly, increasing the appetite of businesses
to acquire very large volumes of data in order to extract as much competitive advantage from
them as possible [14]. This has led to the birth of the term Big Data, which denotes datasets
whose size is beyond the ability of typical database software tools to capture, store, manage,
and analyze [16]. Big Data Analytics is the process of analysing and examining large volumes of
data of a variety of types so as to uncover hidden patterns, unknown correlations, and other
useful information, with the aim of helping organisations to make better decisions and possibly
gain competitive advantages [17]. Big Data Analytics can be carried out using these seven
techniques: association rule learning; classification tree analysis; genetic algorithms; machine
learning; regression analysis; sentiment analysis; and social network analysis [18]. In the
education sector, Big Data Analytics can be applied to: predict student success or dropout by
analysing data from Learning Management Systems (LMS) and other student online systems; assess
e-book and mobile device utilization and its effect on students by analysing book usage, course
content, and content presentation; and track the finance and budgeting of educational
institutions to identify new market opportunities [19]. Furthermore, it is argued that
institutions can apply data mining techniques and analytics to gain an understanding of topics
such as administrative and instructional applications, recruitment, admission processing,
financial planning, donor tracking, and student performance monitoring [20].
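As a brief illustration of one of the listed techniques, regression analysis, the sketch below
fits a straight-line trend to yearly counts by ordinary least squares and extrapolates one year
ahead. The yearly figures and the function names (fit_line, predict) are hypothetical and chosen
only for illustration; they are not data or code from any institution.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def predict(model, x):
    """Evaluate the fitted line at x."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical yearly enrolment counts (illustrative only).
years = [2011, 2012, 2013, 2014]
enrolled = [1000, 1100, 1200, 1300]

model = fit_line(years, enrolled)
forecast_2015 = predict(model, 2015)
print(f"slope per year: {model[1]:.1f}, forecast for 2015: {forecast_2015:.0f}")
```

A straight line is the simplest possible choice here; real enrolment data would usually call for
checking the fit and considering other model families before relying on a forecast.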
4.2 Hadoop Framework
Hadoop is an Apache open source framework that allows for the distributed processing of large
data sets across clusters of computers using simple programming models, designed to scale up
from single servers to thousands of machines, each offering local computation and storage [21].
Hadoop is composed of two main components: Hadoop Distributed File System (HDFS) and
MapReduce. HDFS is a distributed file system designed to run on low-cost commodity hardware and
is highly fault-tolerant [22]. MapReduce is a programming model and an associated implementation
for processing and generating large datasets that is amenable to a broad variety of real-world
tasks [23]. Together, HDFS provides reliable distributed storage and MapReduce provides the
analysis system, with a high degree of fault tolerance as a cluster grows from a few servers to
hundreds or thousands of computers [25]. Hadoop also includes an ecosystem of other projects
built on top of HDFS and MapReduce that help achieve specific operations on the platform.
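The MapReduce programming model can be illustrated in a single process, without a Hadoop
cluster. The sketch below is plain Python that mimics the map, shuffle, and reduce phases to
count records per key. The record layout ("student_id,year,programme") and all function names are
assumptions made for illustration; they are not part of the Hadoop API.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    """Group values by key, as the framework does between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def admission_mapper(record):
    # Hypothetical record layout: "student_id,year,programme".
    _, year, _ = record.split(",")
    return [(year, 1)]


def count_reducer(key, values):
    return sum(values)


records = [
    "S001,2013,BSc IT",
    "S002,2013,BEng Civil",
    "S003,2014,BSc IT",
]
counts = reduce_phase(shuffle(map_phase(records, admission_mapper)), count_reducer)
print(counts)
```

On a real cluster the map and reduce functions run in parallel on different machines and the
framework performs the shuffle; only the two user-supplied functions resemble what is shown here.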
5. Results
The Hadoop framework and R were used in the implementation of the BI system. A data warehouse
was designed and developed using Apache Hive, one of the Apache projects under Hadoop.
Data were extracted from internal sources (such as the MIS and the student portal), official
documents like memos, and external sources. The total data collected amounted to a few
gigabytes. Data were loaded into HDFS in the form in which they were retrieved. Appropriate
schemas on the data warehouse were used in transferring data from HDFS to the data warehouse.
With this, it was possible to develop a fully working data warehouse using Apache Hive and
Impala. Data analysis was carried out on the data warehouse using ad-hoc queries and data
visualization. The data warehouse powered by Hive was accessed through Hue. A number of
analytics were achieved. Among the interesting reports generated were those on student admission
trends, student neediness profiling for accommodation and scholarships, and staff composition.
Using a generalized linear model in R, the future student admission pattern could be predicted;
because some data were missing, the reason behind the admission decline was not clear. When
student data from the Student MIS were combined with other sources, such as the student online
portal and the financial system (Sage Pastel), a near complete student profile was generated,
which enabled the Directorate of Student Support Services to determine the neediness levels of
various students. This enabled faster allocation of hostel rooms to students. The neediness
level list was anticipated to be used as the basis for allocating the various scholarships that
are usually granted. On the admission trend, information on regional distribution was revealed:
admissions from one of the counties thought to be popular with the University seemed to be
declining. The staff composition report enabled the institution to submit an urgent report
required by the Auditor General and the Kenyan Senate; previously, this would have required the
officers involved to work overtime. Different dashboards were created for different user needs.
The dashboards covered course applications, student ranking by neediness level, and student
admissions. These dashboards were interactive enough to allow users to drill down and roll up.
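Roll-up and drill-down amount to aggregating the same facts at coarser or finer levels of a
dimension. The following sketch shows the idea in plain Python on a handful of made-up admission
records. The field names, county and school values, and counts are hypothetical and do not come
from the TUK data warehouse, and the helper functions are illustrative rather than the study's
implementation.

```python
from collections import Counter

# Hypothetical fact rows: one per admitted student.
admissions = [
    {"year": 2013, "county": "Nairobi", "school": "Engineering"},
    {"year": 2013, "county": "Nairobi", "school": "Computing"},
    {"year": 2013, "county": "Kiambu", "school": "Computing"},
    {"year": 2014, "county": "Nairobi", "school": "Computing"},
    {"year": 2014, "county": "Kiambu", "school": "Engineering"},
]


def aggregate(rows, dimensions):
    """Count rows grouped by the given dimension names."""
    return Counter(tuple(row[d] for d in dimensions) for row in rows)


def drill_down(rows, dimensions, filters):
    """Keep only rows matching the filters, then group by the finer dimensions."""
    selected = [r for r in rows if all(r[k] == v for k, v in filters.items())]
    return aggregate(selected, dimensions)


# Roll-up: totals per year only.
by_year = aggregate(admissions, ["year"])
# One level finer: totals per year and county.
by_year_county = aggregate(admissions, ["year", "county"])
# Drill-down: inside (2013, Nairobi), break the count down by school.
nairobi_2013 = drill_down(admissions, ["school"], {"year": 2013, "county": "Nairobi"})

print(by_year)
print(by_year_county)
print(nairobi_2013)
```

In a Hive-backed warehouse the same groupings would typically be written as GROUP BY queries
over the fact table; the in-memory version above is only an analogy for what a dashboard does
when a user changes the level of detail.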
6. Business Benefits
Understanding where the value of information technology lies, and how to measure that value,
has remained an important issue for both managers and academics [27], and hence it was crucial
to evaluate the business benefits of the BI implementation in TUK. Business intelligence (BI) is
one area of IT in which traditional evaluation techniques may perform poorly, as many of the
benefits are strategic, and consequently not easily quantifiable [28]. Most benefits were thus
intangible. This study used the Process Model with six critical factors, a newer approach that
acknowledges traditional evaluation problems; its data warehousing focus is closely related to
BI systems and might therefore prove useful [29], as shown in Table 1.
Table 1: Critical Factors for Evaluating Business Benefits
Critical Factor | Explanation | Perceived Evidence
Problems | Evaluating intangible benefits | Delays in report generation; difficulty in data consolidation due to multiple data sources available in different formats; rigid reports, not dynamic; preventing data analysis and drill-down; poor data quality.
Economic | Determined | Extent of holding expensive retreats for certain report generation; staff
7. Conclusions
This study managed to implement a BI system for the Technical University of Kenya through
the use of open source tools, mainly Hadoop and R. Reports that could take weeks or months to
generate were now available at the click of a button. The student profiling enabled analysis of
students from various points of view through slicing, dicing, and drill-downs. Although the top
management was not involved in the study, some tangible results were still achieved. Due to
constraints during the study, various departments were not integrated into the data warehouse;
it would be interesting to incorporate these departmental data, and even external sources like
social media. Although the authors were also interested in the open source technical
implementation of the system, involving top management such as the Vice Chancellor and the
Deputy Vice Chancellors in the study would likely have improved the success of the BI
implementation. It would also be interesting to make the system near real-time by integrating
the BI system with its sources.
The issues in TUK are evident in other typical organizations in Kenya. In effect, the BI
system implementation approach could be replicated in other organizations. Universities in
Kenya have fairly similar business processes; hence this implementation approach would
apply to a typical university in Kenya. Business questions such as admission trends and student
neediness profiling for accommodation and scholarships would be the same in other universities.
The implementation model can be applied in other sectors like health care, transport, cyber
security, national security, county governance, banking, and insurance, among others. Possible
analytics include: fraud detection by sifting through bank transaction logs; detection of
network intrusion by analysing network logs; reporting of possible terrorist attacks by analysing
bank transactions and surveillance camera videos; public opinion on government services through
sentiment analysis of social media data; Internet of Things (IoT) analytics, due to the presence
of connected devices (mobile phones, fridges, laptops, watches); and faster detection of loan
defaulters by analysing the borrowing history of applicants, among others. The study experienced
challenges such as a long learning curve for certain Hadoop products like Mahout and for R
manipulation; difficult server configuration
References
[1] T. H. Davenport and L. Prusak, Working knowledge: How organizations manage what they know. Harvard
Business Press, 1998.
[2] J. P. Herring, "The role of intelligence in formulating strategy," Journal of Business Strategy, pp. 54-60, 1992.
[3] S. Malik, Enterprise dashboards design and best practices, 1st ed. New Jersey: Wiley, 2005.
[4] C. M. Olszak and E. Ziemba, "Approach to building and implementing business intelligence systems,"
Interdisciplinary Journal of Information, Knowledge, and Management, vol. 2, pp. 134-148, 2007.
[5] G. Muhammad, J. Ibrahim, Z. Bhatti, and A. Waqas, "Business Intelligence as a Knowledge
Management Tool in Providing Financial Consultancy Services," American Journal of Information
Systems, vol. 2, no. 2, pp. 26-32, 2014.
[6] R. Stackowiak, J. Rayman, and R. Greenwald, Oracle Data Warehousing and Business Intelligence Solutions.
Indianapolis: Wiley Publishing, Inc, 2007.
[7] A. Popovic, P. S. Coelho, and J. Jaklic, "The Impact of Business Intelligence System Maturity on Information
Quality," Information Research, p. aer417, 2009.
[8] Gartner Research. (2009, Jun.) Gartner Research. [Online]. http://www.gartner.com/newsroom/id/1017812
[9] N. Karen, J. A. Clark, I. D. Stoodley, and T. A. Creagh, "Establishing a framework for transforming student
engagement, success and retention in higher education institutions," Queensland University of Technology,
Sydney, Final Report, 2014.
[10] M. Golfarelli, Open source BI platforms: a functional and architectural comparison. Berlin Heidelberg:
Springer, 2009.
[11] B. Burton, "Results of Business Intelligence and Performance Management Maturity Survey," Gartner Inc.
Research, 2009.
[12] L. Wise, Using Open Source Platforms for Business Intelligence: Avoid Pitfalls and Maximize ROI. Newnes,
2012.
[13] R. Kimball and M. Ross, The data warehouse toolkit: the complete guide to dimensional modeling. John
Wiley & Sons, 2011.
[14] C. Surajit, U. Dayal, and V. Narasayya, "An overview of business intelligence technology," Communications
of the ACM, vol. 44, no. 8, 2011.
[15] J. Ranjan, "Business intelligence: concepts, components, techniques and benefits," Journal of Theoretical and
Applied Information Technology, pp. 60-70, 2009.
[16] M. Minelli, M. Chambers, and A. Dhiraj, Big data, big analytics: emerging business intelligence and analytic
trends for today's businesses. John Wiley & Sons, 2012.
[17] S. Miller. (2013) Singapore Management University. [Online]. http://ink.library.smu.edu.sg/podcasts/8/
[18] Talegaon and A. P. Shubhada, "ANALYTICS OF BIG DATA," COMPUSOFT, An international journal of
advanced computer technology, vol. 3, no. 10, Oct. 2014.
[19] F. Kalota, "Applications of Big Data in Education," International Journal of Social, Behavioral, Educational,
Economic and Management Engineering, vol. 9, no. 5, 2015.
[20] A. G. Picciano, "The Evolution of Big Data and Learning Analytics in American Higher Education," Journal
of Asynchronous Learning Networks, vol. 6, no. 3, pp. 9-20, 2012.
[21] Apache Hadoop. (2015) Hadoop. [Online]. http://hadoop.apache.org/
[22] D. Borthakur. (2008, Jan.) Hadoop Apache Project. [Online].
http://hadoop.apache.org/common/docs/current/hdfs-design.pdf
[23] J. Dean and G. Sanjay, "MapReduce: simplified data processing on large clusters," Communications of the
ACM, vol. 51, no. 1, pp. 107-113, 2008.
[24] T. Condie, N. Conway, P. Alvaro, and J. Hellerstein, "MapReduce Online," NSDI, vol. 10, no. 4, p. 20, 2010.
[25] B. Oancea and R. M. Dragoescu, Integrating R and Hadoop for Big Data Analysis. arXiv preprint
arXiv:1407.4908., 2014.
[26] The R Project for Statistical Computing. (2015) The R Project for Statistical Computing. [Online].
http://www.r-project.org
[27] M. Davern and R. Kauffman, "Discovering Potential and Realizing Value from Information," Journal of
Management Information Systems Spring, pp. 121-143, 2000.
[28] Z. Irani and P. D. E. Love, "The Propagation of Technology Management Taxonomies for Evaluating,"
Journal of Management Information Systems, vol. 17, no. 3, pp. 161-177, 2001.
[29] M. Gibson, A. David, J. Ilona, and A. Melbourne, "Evaluating the Intangible Benefits of Business Intelligence:
Review & Research Agenda," in Proceedings of the 2004 IFIP International Conference on Decision Support
Systems (DSS2004): Decision Support in an Uncertain and Complex World, Prato, Italy, 2004.