
UBICC Journal

Ubiquitous Computing and Communication Journal

Volume 4 · Number 3 · July 2009 · ISSN 1992-8424

Special Issue on ICIT 2009 Conference – Applied Computing

UBICC Publishers © 2009


Ubiquitous Computing and Communication Journal
Co-Editor Dr. AL-Dahoud Ali

Book: 2009 Volume 4


Publishing Date: 07-30-2009
Proceedings
ISSN 1992-8424

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or
parts thereof is permitted only under the provisions of the copyright law 1965, in its current version, and
permission for use must always be obtained from UBICC Publishers. Violations are liable to prosecution under
the copyright law.

UBICC Journal is a part of UBICC Publishers


www.ubicc.org

© UBICC Journal
Printed in South Korea

Typesetting: Camera-ready by author, data conversion by UBICC Publishing Services, South Korea

UBICC Publishers

Guest Editor’s Biography
Dr. Al-Dahoud is an associate professor at Al-Zaytoonah University, Amman, Jordan.
He received his PhD from La Sabianza/Italy and Kiev Polytechnic/Ukraine in 1996.
He has worked at Al-Zaytoonah University since 1996. He has served as a visiting professor at many
universities in Jordan and the Middle East, and as a supervisor of master's and PhD degrees in computer science. He
established ICIT in 2003 and has been its program chair ever since. He was the Vice President of the
IT committee in the Ministry of Youth, Jordan, in 2005 and 2006. Al-Dahoud was the General Chair of ICITST-2008,
June 23–28, 2008, Dublin, Ireland (www.icitst.org).
He has directed and led many projects sponsored by NUFFIC/Netherlands:
- The Tailor-made Training 2007 and "On-Line Learning & Learning in an Integrated Virtual Environment" 2008.
His hobby is conference organization, so he participates in the following conferences as general chair, program
chair, session organizer or member of the publicity committee:
- ICITs, ICITST, ICITNS, DepCos, ICTA, ACITs, IMCL, WSEAS, and AICCSA
Journal activities: Al-Dahoud has worked as Editor in Chief, guest editor or member of the editorial board of the
following journals:
Journal of Digital Information Management, IAJIT, Journal of Computer Science, Int. J. Internet Technology and
Secured Transactions, and UBICC.
He has published many books and journal papers, and has participated as a speaker in many conferences worldwide.
UBICC Journal
Volume 4, Number 3, July 2009

SPECIAL ISSUE ON ICIT 2009 CONFERENCE: APPLIED COMPUTING

538 The influence of blended learning model on developing leadership skills of school administrators
Tufan Aytaç

544 Towards the implementation of temporal-based software version management in Universiti Darul Iman Malaysia
Mohd Nordin Abdul Rahman, Azrul Amri Jamal, Wan Dagang Wan Ali

551 Effective digital forensic analysis of the NTFS disk image
Mamoun Alazab, Sitalakshmi Venkatraman, Paul Watters

559 Job and application-level scheduling in distributed computing
Victor V. Toporkov

571 Least and greatest fixed points of a while semantics function
Fairouz Tchier

585 Case studies in thin client acceptance
Paul Doyle, Mark Deegan, David Markey, Rose Tinabo, Bossi Masamila, David Tracey

599 An interactive composition of workflow applications based on UML activity diagram
Vousra Bendaly Hlaoui, Leila Jemni Ben Ayed

609 How to map perspectives
Gilbert Ahamer


THE INFLUENCE OF BLENDED LEARNING MODEL ON DEVELOPING
LEADERSHIP SKILLS OF SCHOOL ADMINISTRATORS

Tufan AYTAÇ
The Ministry of National Education, Ankara, TURKEY
taytac1@yahoo.com

ABSTRACT

The use of the b-learning approach in in-service education activities in the Turkish education system is
becoming increasingly important. Generally, traditional education and computer based
education applications are used in in-service education activities. Blended learning (b-learning)
combines online learning with face-to-face learning. The goal of blended learning is to provide the most
efficient and effective learning experience by combining learning environments. The purpose of this
research is to find out the effect of the b-learning approach on developing administrators’ leadership skills.
To identify the school administrators’ educational needs and their existing leadership
skills, a needs assessment questionnaire was applied to 72 school administrators selected from
33 primary schools in 11 regions of Ankara, the capital city. According to the descriptive statistical analysis
of the questionnaire results, an in-service training programme was prepared for the development of school
administrators’ leadership skills. The school administrators were separated into three groups:
computer based learning (CBL) (25 participants), blended learning (BL) (23 participants) and
traditional learning (TL) (24 participants). These groups were trained separately in these three
different learning environments using the in-service training programme. According to the
pre-test, post-test and achievement score means, the BL group’s scores were the highest
when compared to the TL and CBL groups. As a result of this research, in terms of achievement and
effectiveness, b-learning was found to be the most effective learning environment when compared to the
others. Findings from both learners and tutors strongly suggest that blended learning is a viable alternative
delivery method for in-service education activities.¹

Keywords: Blended Learning, e-Learning, Information Technology, In-service education

1 INTRODUCTION

Blended learning (b-learning or hybrid learning) is the combination of e-learning and the traditional
education approach. Blended learning combines online learning with face-to-face learning. The goal of
blended learning is to provide the most efficient and effective learning experience by combining different
learning environments. b-Learning stands at the forefront in terms of interactivity with the target
learner group, enrichment of the learning process and integration of technology into education [1,2,3,16,21].

E-learning has had an interesting impact on the learning environment. Blended learning is the most logical
and natural evolution of our learning agenda. It suggests an elegant solution to the challenges of tailoring
learning and development. It represents an opportunity to integrate the innovative and technological
advances offered by online learning with the interaction and participation offered in the best of
traditional learning [20].

The blended learning approach builds on the strengths of both traditional education and computer based
education instead of using one or the other on its own.

The basic characteristics of blended learning, which reflect the values of 21st century education, are [2]:
- Providing a new way of learning and teaching,
- Teaching how to learn,
- Creating digital learners,
- Being more economical,
- Focusing on technology and communication,
- Improving project-based learning,
- And improving the teaching process.

¹ This research project article has been supported by The Scientific and Technological Research Council of
Turkey (TÜBİTAK) (SOBAG 1001 Programme).

UbiCC Journal – Volume 4 No. 3 538



Blended learning practices provide project based learning opportunities for active learning and interaction
among learners, and in particular provide a way to meet the educational needs of the learners. Blended
learning programs may include several forms of learning tools, such as real-time virtual/collaboration
software, self-paced web-based courses, electronic performance support systems (EPSS) embedded within the
learning-task environment, and knowledge management systems. Blended learning contains various event-based
activities, including face-to-face learning, e-learning, and self-paced learning activities. Blended
learning often occurs as a mixture of traditional instructor-led training, synchronous online training,
asynchronous self-paced study, and structured task based training from a teacher or mentor. The aim of
blended learning is to combine the best of classroom face-to-face learning experiences with the best of
online learning experiences. Overall, blended learning refers to the integration (or the so-called
blending) of e-learning tools and techniques with traditional face-to-face teaching delivery methods. The
two important factors here are the time spent on online activities and the amount of technology utilized
(see Figure 1, Concept of Blended Learning).

Fig. 1: Concept of Blended Learning

If two or more of the learning environments stated above are used to teach an educational objective, it can
be said that blended learning is realized. However, blended learning means more than showing a web page
during a classroom lesson and immediately using the information on that page to explain the lesson. Blended
learning is a learning environment that combines face-to-face learning and web-based distance learning.

Blended learning overcomes this limitation of an e-learning only approach [12]. Today blended learning
primarily functions as a replacement for or extension of face-to-face environments. For instance, it might
be used to foster learning communities, extend training events, offer follow-up resources in a community of
practice, access guest experts, provide timely mentoring or coaching, present online lab or simulation
activities, and deliver prework or supplemental course materials. While such uses may be unique and
engaging, they are not exactly novel [13].

Figure 2: A Blend of Learning Theories

By applying the learning theories of Keller, Gagné, Bloom, Merrill, Clark and Gery (see Figure 2), five key
ingredients emerge as important elements of a blended learning process [3,4,6,7,8,9,10,11,12,15,16,19]:
1. Live Events: Synchronous, instructor-led learning events in which all learners participate at the same
time, such as in a live “virtual classroom.”
2. Self-Paced Learning: Learning experiences that the learner completes individually, at his own speed and
on his own time, such as interactive, Internet-based or CD-ROM training.
3. Collaboration: Environments in which learners communicate with others, for example, e-mail, threaded
discussions or online chat.
4. Assessment: A measure of learners’ knowledge. Pre-assessments can come before live or self-paced events,
to determine prior knowledge, and post-assessments can occur following live or self-paced learning events,
to measure learning transfer.
5. Performance Support Materials: On-the-job reference materials that enhance learning retention and
transfer, including PDA downloads, and printable references, summaries, and job aids.

2 PURPOSE

The purpose of this research is to find out the effects of the b-learning approach on developing school
administrators’ leadership skills.

3 RESEARCH DESIGN

To determine the school administrators’ educational needs regarding leadership skills, a needs assessment
questionnaire was applied to 72 school administrators selected from 33 primary schools in 11 regions within
Ankara, the capital city. According to the results of this questionnaire, an in-service training programme
for developing school administrators’ leadership skills was prepared.
According to the results of the needs assessment, the leadership skills most needed by school
administrators were human relations in administration, basic management skills for school principals, job
satisfaction in organizations, and motivation.
After that, the content and learning activities of the "School Administrators Leadership Skills Development
In-service Programme" were prepared. In addition, course notes were prepared as training materials to be
distributed to the participants in the form of CD-ROM and printed documents.
The school administrators were separated into three groups: Computer Based Learning (CBL) (25
participants), Blended Learning (BL) (23 participants) and Traditional Learning (TL) (24 participants).
These groups were trained according to three different methods using the prepared education programme. Each
group was given a two-day course.
Before the in-service training, the school administrators in the BL group accessed the digital content and
studied the learning activities included in the "School Administrators Leadership Skills Development
In-service Programme", which was prepared using the Moodle learning management system software and published
on the http://beg.meb.gov.tr:8088/ website.
The school administrators in the BL group logged in to the http://beg.meb.gov.tr:8088/ webpage using the
usernames and passwords given to them three weeks before the in-service training. The interface of the
website is shown in Figure 3. The school administrators in this group shared information, chatted, and
studied activities with their colleagues and the subject area specialist about the content and learning
activities included in the site whenever they wanted. As online learners, school administrators built their
confidence and learning processes as they got used to working independently online. Blended learning
activities included online knowledge gathering and construction in teams or groups, publishing of
electronic content, interactive elements like online brainstorming, discussion, several forms of feedback,
evaluation and assessment, as well as other blended learning techniques. Lecturers posted messages to the BL
group as a whole and to each administrator individually to meet their need for support. They posted
explanations to guide learners in more complex tasks, and encouraged them to communicate, to do their
individual assignments, and to use the Moodle platform tools they had at their disposal to facilitate their
work. Tutors checked and marked the online assignments, filled in learners’ performance reports, and wrote
feedback on their performance in their online portfolios. Lecturers followed the school administrators’
learning improvements and gave encouragement when the motivation level began to falter. After that, this
group was trained by a lecturer who was a subject area specialist. The lecturer trained this group using
face-to-face education, computer based education and the online training website prepared with the Moodle
software.

Figure 3: The Moodle interface

On the other hand, all the in-service training content and activities were taught to the CBL group by a
lecturer with the aid of a computer and projector. Finally, the TL group was trained in a traditional way
using a blackboard.
A multiple choice test made up of 20 questions was applied to the groups to investigate their achievement
in leadership skills. This test was shown to content experts to establish its content validity. To test
whether the differences among the three groups’ score means were statistically significant, one-way ANOVA
and the Scheffé test were used. The test was applied to all groups as a pre-test at the beginning and as a
post-test at the end of the in-service training [5]. The blended learning model used in the research
process is shown in Figure 4.

Figure 4: The Process of Blended Learning Model
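The one-way ANOVA used in this design compares the variance between groups with the variance within groups. As a minimal sketch of that arithmetic only (not the authors' own analysis procedure), the hypothetical helper `anova_f_from_sums` below recomputes an F statistic from sums of squares. The inputs are the pre-test sums of squares reported in the results (278.668 between groups, 180.207 within groups) together with 3 groups and 72 participants.

```python
def anova_f_from_sums(ss_between, ss_within, n_groups, n_total):
    """Return (df_between, df_within, F) for a one-way ANOVA.

    Hypothetical helper for illustration: F is the ratio of the
    between-group mean square to the within-group mean square.
    """
    df_between = n_groups - 1          # 3 groups -> 2
    df_within = n_total - n_groups     # 72 participants -> 69
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return df_between, df_within, ms_between / ms_within


# Pre-test sums of squares as reported in the paper's results.
df_b, df_w, f_value = anova_f_from_sums(278.668, 180.207, n_groups=3, n_total=72)
print(f"F({df_b}, {df_w}) = {f_value:.2f}")
```

Under these inputs the ratio comes out at roughly 53.35, which agrees with the F value reported for the pre-test scores. A post hoc test such as Scheffé's is then needed to identify which pairs of groups differ.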


4 RESULTS

When the three groups’ pre-test score means were compared, significant differences were found among them
(F(2, 69) = 53.350, p < .01) (Table 1).

Table 1: The One-Way ANOVA Results on the Difference Between Groups According to Pre-Test Scores

Source of Variance   Sum of Squares   df   Mean Square   F        Sig.   Mean Difference
Between groups       278.668          2    139.334       53.350   .000   CBL-BL, TL-BL
Within groups        180.207          69   2.612
Total                458.875          71

When the three groups were compared statistically, the BL group’s pre-test score mean (x̄ = 12.87) was found
to be statistically higher than those of the other two groups (CBL x̄ = 9.12, TL x̄ = 8.29). The reason might
be that the BL group was better prepared and more successful than the others, since they had earlier studied
all the content and activities prepared with the Moodle software and published on the internet.

Table 2: The One-Way ANOVA Results on the Difference Between Groups According to Post-Test Scores

Source of Variance   Sum of Squares   df   Mean Square   F        Sig.   Mean Difference
Between groups       544.539          2    272.270       90.610   .000   CBL-BL, CBL-TL, TL-BL
Within groups        207.336          69   3.005
Total                751.875          71

When the school administrators’ post-test score means were compared among the three groups, a significant
difference was found between them (F(2, 69) = 90.610, p < .01) (Table 2). There were also statistically
significant differences among the BL group’s post-test score mean (x̄ = 17.35), the CBL group’s post-test
score mean (x̄ = 12.44) and the TL group’s post-test score mean (x̄ = 10.79). In particular, the BL group
administrators’ post-test score mean is the highest of all.
The difference between the school administrators’ pre-test and post-test scores was calculated to obtain
their achievement scores. There was a significant difference among the groups’ achievement score means
(F(2, 69) = 18.086, p < .01) (Table 3). The BL group’s achievement mean (x̄ = 4.48) was higher than the CBL
(x̄ = 3.28) and TL (x̄ = 2.50) groups’ achievement means.

Table 3: The One-Way ANOVA Results on the Difference Between Groups According to Achievement Scores
(Difference between pre-test and post-test)

Source of Variance   Sum of Squares   df   Mean Square   F        Sig.   Mean Difference
Between groups       46.540           2    23.270        18.086   .000   CBL-BL, TL-BL
Within groups        88.779           69   1.287
Total                135.319          71

The BL group’s pre-test, post-test and achievement score means were the highest when compared to the TL and
CBL groups. The reason might be that the BL group was better prepared than the others, since they studied
the content and activities published with the Moodle software before the other groups. They also experienced
both face-to-face and computer based learning environments.

5 CONCLUSION

The b-learning model was more effective in developing the leadership skills of school administrators than
computer based education and traditional learning.
As a result of this research, in terms of time, cost and effectiveness, b-learning was found to be the most
effective method in comparison with the other approaches. In particular, it appeared necessary to make
greater use of the b-learning approach in the in-service training of administrators and teachers. Effective
use of b-learning approaches is required for integrating education with information technologies, enriching
the learning-teaching process, implementing face-to-face education, providing computer based learning,
realizing hands-on learning and individualizing learning.
In this research, the blended learning arrangements involved e-mentoring or e-tutoring. The role of the
e-mentor/tutor is critical, as it requires a transformation to that of learning facilitator. Being both
teachers and online tutors has introduced beneficial qualitative changes in teachers’ roles, but it has also
meant a quantitative increase in the number of hours dedicated to learners. Lecturers spent less time in
face-to-face classes than in the online environment (Moodle platform).
The Moodle programme, which was used in the blended learning approach, has great potential to create a
successful blended learning experience by providing a plethora of excellent tools that can be used to
enhance conventional classroom instruction, in hybrid courses, or in any distance learning arrangements
[18].
Finally, lecturers identified learners who might be experiencing particular problems and helped them address
their weaknesses in remedial work sessions if necessary.


We observed that b-learning opportunities for teaching objectives make learning entertaining, fun, lasting
and economical in an effective way. In this sense, in our view, trainers should use the b-learning
environment for the effective integration of ICT in learning and teaching.
Last year, the Turkish Ministry of National Education In-service Training Department implemented more than
700 in-service training courses. The use of b-learning methodology, especially in these in-service training
courses, will enrich and support their learning-teaching process. More projects on the use of b-learning in
in-service training should be supported and carried out.
In particular, the initiatives of the Turkish Ministry of National Education for improving schools’
information technologies and internet infrastructure, distributing authoring software to teachers, and
developing the education portal and its content, together with Moodle and similar learning management
system software, should be used to support the use of b-learning in in-service training. School
administrators state that b-learning approaches will be used more effectively in the class. All school
administrators’ comments regarding the blended course were positive.

The positive aspects of the blended learning course activities used in this research are listed below:
- Improvement in the quantity and/or quality of the communications among the school administrators in the
discussion board or online groups and in face-to-face activities in the classroom.
- Good cooperative learning activities.
- Blended learning was more effective than classroom learning alone. Higher learner value and impact; the
effectiveness was greater than for non-blended approaches. Learners like b-learning approaches.
- Rapid accessibility of b-learning content and activities (every time, everywhere).
- Improved relationships between tutors and students.
- The immediate feedback that could be given to school administrators.
- Flexibility in scheduling and timetabling of course work.
- An increase in the time actually spent on face-to-face activities in the classroom.
- Cost effectiveness for both the accrediting learning institution and the learner.
Reduced cost, reduced training time, and the ability to easily update training materials offer additional
compelling reasons for educators to embrace blended learning [22].

According to the school administrators’ opinions, some problems were encountered in this research, as
listed below:
- Some technical web and internet problems in accessing the Moodle platform.
- The failure of the online PowerPoint presentation of lecture material to meet some school administrators’
expectations.
- Some school administrators’ lack of enthusiasm for being in a blended learning course.
- Limited knowledge in the use of technology.
- Blended learning takes time for both the instructor and the learner to adapt to this relatively new
concept in delivering instruction.
In particular, it can be concluded that all in-service training could be taught more effectively by using
the b-learning approach. The technological leadership role of the school administrators is very important
for the success of the b-learning approach.
The features of blended learning models are of vital importance for applying individual learning and active
learning. According to some authors, “a blend is an integrated strategy for delivering on promises about
learning and performance” [17].
In sum, findings from both learners and tutors strongly suggest that blended learning is a viable
alternative delivery method for courses. By supporting blended learning, in-service education courses in
particular can lead in the effective use of technology for teaching and learning, and pioneer the right mix
of face-to-face and online communication practices that will enhance learning effectiveness [19]. The
results of this research support all of these points. To develop the technological leadership of school
administrators, b-learning approaches should be used effectively. Blended learning offers opportunities for
in-service school administrators, in-service teachers and their learners.

REFERENCES

[1] Aytaç, T. Eğitimde Bilişim Teknolojileri. Asil Yayın Dağıtım, pp. 48-53, (2006).
[2] Aytaç, T. The Influence of B-Learning Model on Developing Leadership Skills of Education Administrators
Research Education Programme, pp. 48-53, (2006).
[3] Singh, H. “Building Effective Blended Learning Programs”, Educational Technology, Vol. 43, Number 6,
pp. 51-54, November – December, (2003).
[4] Oliver, M. and Trigwell, K. “Can ‘Blended Learning’ Be Redeemed?”, E-Learning, Vol. 2, Number 1, pp. 17,
(2005).
[5] Büyüköztürk, Ş. Sosyal Bilimler İçin Veri Analizi El Kitabı. İstatistik, Araştırma deseni SPSS
Uygulamaları ve Yorum, 8. Baskı, PegemA Yayıncılık, pp. 40-53, Ankara, (2007).
[6] Bonk, C. J.; Olson, T. M.; Wisher, R. A. and Orvis, K. L. “Learning from Focus Groups: An Examination
of Blended Learning”, Journal of Distance Education, Vol. 17, No. 3, pp. 100, (2002).
[7] Marsh, J. How to Design Effective Blended Learning. www.brandon-hall.com. Accessed: 15 February 2009.
[8] Orhan, F., Altınışık, S. A. and Kablan, Z. “Karma Öğrenme (Blended Learning) Ortamına Dayalı Bir
Uygulama: Yıldız Teknik Üniversitesi Örneği”, IV. Uluslararası Eğitim Teknolojileri Sempozyumu, 24-26
November 2004, Sakarya, Vol. 1, pp. 646-651, (2004).
[9] Dracup, Mary. “Role Play in Blended Learning: A Case Study Exploring the Impact of Story and Other
Elements”, Australasian Journal of Educational Technology, 24(3), pp. 294-310, (2008).
[10] Cooper, G. and Heinze, A. “Centralization of Assessment: meeting the challenges of Multi-year Team
Projects in Information Systems Education”, Journal of Information Systems Education, 18, 3, pp. 345-356,
(2007).
[11] Heinze, A. Lecturer in Information Systems,
http://www.aheinze.me.uk/Blended_Learning_Higher_Education.html. Accessed: 15 February 2009.
[12] Langley, Amanda. “Experiential Learning, E-Learning and Social Learning: The EES Approach to
Developing Blended Learning”, The Fourth Education in a Changing Environment Conference Book, Edited by
Eamon O’Doherty, Informing Science Press, pp. 171-172, (2007).
[13] Bonk, C. J. & Graham, C. R. (Eds.). “Future Directions of Blended Learning In Higher Education and
Workplace Learning Settings”, Handbook of blended learning: Global Perspectives, local designs. San
Francisco, CA: Pfeiffer Publishing, (2004).
[14] Carman, Jared M. Blended Learning Design: Five Key Ingredients, Director, Product Development
KnowledgeNet, October 2002, www.brandon-hall.com. Accessed: 15 February 2009.
[15] Derntl, M. and Motschnig-Pitrik, Renate. A Layered Blended Learning Systems Structure, Proceedings of
I-KNOW ’04, Graz, Austria, June 30 - July 2, (2004).
[16] Bañadosa, Emerita. Blended-learning Pedagogical Model for Teaching and Learning EFL Successfully
Through an Online Interactive Multimedia Environment, CALICO Journal, Vol. 23, No. 3, pp. 533-550, (2006).
[17] Rosset, A., Douglis, F. & Frazee, R. V. Strategies for building blended learning. Learning Circuits.
Retrieved August 13, 2007, from http://www.learningcircuits.org/2003/jul2003/rossett.htm.
[18] Brandl, K. Are you ready to moodle?. Language, Learning & Technology, Vol. 9, No. 2, pp. 16-23, May
(2005).
[19] Blended Learning Pilot Project, Final Report for 2003-2004 and 2004-2005, Rochester Institute of
Technology. (2004). Blended Learning Pilot Project: Final Report for the Academic Year 2003 – 2004.
Retrieved Feb 5, 2009, from http://distancelearning.rit.edu/blended/Files/BlendedPilotFinalReport2003_04.pdf.
[20] Thorne, K. Blended Learning: How to Integrate Online and Traditional Learning. United States, Kogan
Page, (2004).
[21] Rovai, Alfred P. and Jordan, Hope M. “Blended Learning with Traditional and Fully Online Graduate
Courses”, International Review of Research in Open and Distance Learning, 2004. Retrieved Sept 27, 2008,
from http://www.irrodl.org/content/v5.2/rovaijordan.html.
[22] G. Thorsteinsson and T. Page. “Blended Learning Approach to Improve In-Service Teacher Education In
Europe Through The Fiste Comenius 2.1. Project”, Ict in Education: Reflections and Perspectives, Bucharest,
June 14-16, (2007).


TOWARDS THE IMPLEMENTATION OF TEMPORAL-BASED
SOFTWARE VERSION MANAGEMENT AT UNIVERSITI DARUL IMAN
MALAYSIA

M Nordin A Rahman, Azrul Amri Jamal and W Dagang W Ali


Faculty of Informatics
Universiti Darul Iman Malaysia, KUSZA Campus
21300 K Terengganu, Malaysia
mohdnabd@udm.edu.my, azrulamri@udm.edu.my, wan@udm.edu.my

ABSTRACT
Integrated software is very important for the university in managing its day-to-day operations. This integrated software
goes through an evolution process: users request changes, and new versions are eventually created.
Software version management is the process of identifying and keeping track of different versions of software.
This process becomes more complex when the software is distributed across many locations. This
paper presents a temporal-based software version management model. The model is implemented specifically for
managing software versions in the Information Technology Centre, Universiti Darul Iman Malaysia. Temporal elements
such as valid time and transaction time are the main attributes to be inserted into the software version
management database. These two attributes help the people involved in the software process to
organize data and perform monitoring activities more efficiently.

Keywords: version management, temporal database, valid time, transaction time.

1. INTRODUCTION

Software evolution is concerned with modifying software once it is delivered to a customer. Software
managers must devise a systematic procedure to ensure that different software versions may be retrieved
when required and are not accidentally changed. Controlling the development of different software versions
can be a complex task, even for a single author to handle. This task is likely to become more complex as
the number of software authors increases, and more complex still if those software authors are distributed
geographically with only limited means of communication, such as electronic mail, to connect them.
Temporal based data management has been a hot topic in the database research community for the last couple
of decades. As a result of this effort, a large infrastructure of data models, query languages and index
structures has been developed for the management of data involving time [11]. Nowadays, a number of
software systems have adopted the concepts of temporal database management, such as artificial intelligence
software, geographic information systems and robotics. Temporal management aspects of any object could
include:

- The capability to detect change, such as the amount of change in a specific project or object over a
certain period of time.
- The use of data to conduct analysis of past events, e.g., the change of valid time for the project or
version due to any event.
- Keeping track of the status of all transactions in the project or object life cycle.

Universiti Darul Iman Malaysia (UDM) is the first full university on the East Coast of Malaysia, located in
the state of Terengganu. It was set up on 1st January 2006. UDM has two campuses, KUSZA Campus and City
Campus. Another new campus, known as Besut Campus, will be in operation soon. To date, KUSZA Campus has six
faculties and City Campus has three faculties. The university also has an Information Technology Centre
(ITC-UDM) whose purpose is to develop and maintain the university information systems and information
technology infrastructure.
In this paper, we concentrate on the modelling of temporal-based software version management. Based on the
model, a simple web-based application has been developed and proposed for use by ITC-UDM. The rest of the
paper is organized as follows: the next section reviews the concept of temporal data management. Section 3
discusses the current techniques in software version management. Current issues in software version
management at ITC-UDM are discussed in Section 4. The specifications of the proposed temporal-based software
version management model are explained in Section 5. The conclusion is given in Section 6.

UbiCC Journal – Volume 4 No. 3 544


Special Issue on ICIT 2009 Conference - Applied Computing

2. TEMPORAL DATA CONCEPT

To date, transaction time and valid time are the two well-known kinds of time usually considered in the literature of temporal database management [2, 4, 6, 9, 10, 11, 12]. The valid time of a database fact is the time when the fact is true in the miniworld [2, 6, 9, 10]. In other words, valid time concerns the evaluation of data with respect to the application reality that the data describe. Valid time can be represented with single chronon identifiers (e.g., event time-stamps), with intervals (e.g., interval time-stamps), or as valid time elements, which are finite sets of intervals [9]. Meanwhile, the transaction time of a database fact is the time when the fact is current in the database and may be retrieved [2, 6, 9, 10]. This means that the transaction time is the evaluation time of data with respect to the system where the data are stored. Supporting transaction time is necessary when one would like to roll back the state of the database to a previous point in time. [9] proposed that four implicit times can be derived from valid time and transaction time:

• valid time – valid-from and valid-to
• transaction time – transaction-start and transaction-stop

Temporal information can be classified into two divisions: absolute temporal and relative temporal [9]. Most of the research in temporal databases has concentrated on temporal models with absolute temporal information. To extend the scope of the temporal dimension, [12] presented a model which allows relative temporal information, e.g., “event A happened before event B and after January 01, 2003”. [12] suggests several temporal operators that could be used for describing relative temporal information: {equal, before, after, meets, overlaps, starts, during, finishes, finished-by, contains, started-by, overlapped-by and met-by}.

In various temporal research papers, time-elements are divided into two categories: intervals and points [6, 9, 11]. If T denotes a nonempty set of time-elements and d denotes a function from T to R+, the set of nonnegative real numbers, then:

\[
\text{time\_element } t =
\begin{cases}
\text{interval}, & \text{if } d(t) > 0 \\
\text{point}, & \text{otherwise}
\end{cases}
\]

According to this classification, the set of time-elements, T, may be expressed as T = I ∪ P, where I is the set of intervals and P is the set of points.

3. RELATED TOOLS IN SOFTWARE VERSION MANAGEMENT

In a distributed software process, good version management combines systematic procedures and automated tools to manage different versions in many locations. Most methods of version naming use a numeric structure [5]. Identifying versions of the system appears to be straightforward. The first version and release of a system is simply called 1.0, and subsequent versions are 1.1, 1.2 and so on. Meanwhile, [3] suggests that every new version produced should be placed in a different directory or location from the old version, so that accessing a version becomes easier and more effective. Besides that, should this method be implemented using a suitable database management system, the concept of lock access could be used to prevent overlapping processes. At present, there are many software evolution management tools available on the market. Selected tools are described as follows:

• Software Release Manager (SRM) – SRM is free software supported on most UNIX and LINUX platforms. It supports software version management for distributed organizations. In particular, SRM tracks dependency information to automate and optimize the retrieval of system components as well as versions.
• Revision Control System (RCS) – RCS uses the concept of tree structures. Each branch in the tree represents a variant of the version. These branches are numbered by their order of entry into a system database. RCS records details of any transaction made, such as the author, date and reason for the update.
• Change and Configuration Control (CCC) – CCC is one of the complete tools for software configuration management. It provides a good platform for identification, change control and status accounting. CCC allows simultaneous work on the same version via virtual copies. These can be merged, and changes can be applied across configurations.
• Software Management System (SMS) – SMS covers all aspects of software configuration management, such as version control, workspace management, system modelling, derived object management and change detection in the repository. SMS possesses the desired characteristics, providing resources for version control of systems and having a good user interface.
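The interval/point classification of time-elements described in Section 2 can be illustrated with a minimal Python sketch. It assumes that the condition separating the two cases is d(t) > 0, and the `TimeElement` class and `partition` helper are hypothetical names introduced only for this illustration; they are not part of the model described in the paper.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple


@dataclass(frozen=True)
class TimeElement:
    """A time-element t, represented here by a start and an end instant."""
    start: datetime
    end: datetime

    def duration(self) -> timedelta:
        # Plays the role of d(t): a nonnegative duration.
        if self.end < self.start:
            raise ValueError("end must not precede start")
        return self.end - self.start

    def kind(self) -> str:
        # Assumed rule: interval if d(t) > 0, point otherwise.
        return "interval" if self.duration() > timedelta(0) else "point"


def partition(elements: List[TimeElement]) -> Tuple[List[TimeElement], List[TimeElement]]:
    """Split T into (I, P), so that T is the union of intervals and points."""
    intervals = [t for t in elements if t.kind() == "interval"]
    points = [t for t in elements if t.kind() == "point"]
    return intervals, points
```

For example, a time-element whose start and end are both 30 July 2009 is classified as a point, while one running from 1 January 2009 to 30 July 2009 is classified as an interval.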


4. THE SOFTWARE VERSION MANAGEMENT ISSUES IN ITC-UDM

Three divisions have been formed at ITC-UDM. These divisions and their functions are as follows:

• Infrastructure and Application Systems (AIS) – to develop and maintain the university software and to maintain the university computer network.
• Technical and Services (TS) – to support the maintenance of information technology hardware, training, multimedia services and the help desk.
• Administration and Procurement (AP) – to manage the daily operation of ITC-UDM, such as administration and procurement.

Each division is headed by a division leader and supported by several information technology officers, assistant information technology officers and technicians. All the university software modules are developed and maintained by the AIS Division. Figure 1 depicts the main software modules managed by ITC-UDM. Thousands of source code files have been produced by the division. Therefore, it is not easy for the division to manage all those artefacts.

Figure 1: University Software Modules (Academic Module, Human Resource Module, Student Affairs Module, Finance Module, Department of Development)

From a study done by the authors, the following main weaknesses have been found in the current approach used by ITC-UDM to manage all versions of the source code produced:

• No systematic procedure is used for managing software versions, and it is difficult to recognize the valid time of each version.
• The current approach does not consider relative temporal information in representing the valid time of each version.
• The current approach maintains only a current-view version, in which an existing version is overwritten by a new incoming version during an update.

Based on the problems mentioned, we strongly believe that the development of a temporal-based software version management tool for ITC-UDM could provide the following benefits:

• Supporting project and software managers in planning, managing and evaluating version management.
• Assigning timestamps (absolute and relative) to each transaction provides transaction-time database functionality, that is, retaining all previously current database states and making them available for time-based queries.
• Increasing the effectiveness and efficiency of the collaborative software version management process.

5. THE MODEL

Version control is one of the main tasks in software configuration management. Every software version has its own valid time. The collection of software versions should be organized in a systematic way, both for retrieval efficiency and to recognize the valid time of those versions. Besides the use of a unique identifier for each version, time-stamping also needs to be embedded into the version management database.

5.1 The Temporal-Based Version Management Specifications
The temporal elements involved in the model are transaction time (tt), absolute valid time (avt) and relative valid time (rvt), denoted as TE = {tt, avt, rvt}. Transaction time is a date-stamp that represents the transaction in which a new valid time for a version is recorded in the application database. Absolute valid time is represented by two attributes, valid-from and valid-until, and also uses date-stamping. Meanwhile, relative valid time, which involves a time interval, is represented by a combination of temporal operators, OPERATORs = {op1, op2, op3, …, opn}, and one or more defined events, denoted as EVENTs = {event1, event2, event3, …, eventn}. This model considers only five temporal operators, denoted as OPERATORs = {equal, before, after, meets, met_by}. Table 1 gives the general definitions of the temporal operators based on time intervals and time points. Figure 2 shows the


organization of temporal elements involved in software version management. If we have a software with a set of versions denoted as V = {v1, v2, v3, …, vn}, then the model is:

\[
\mathrm{TEMPORAL}(v_i \in V) \subseteq (tt \cap avt \cap rvt)
\]

where

\[
\begin{aligned}
avt &= [\text{avt-from}, \text{avt-until}], \\
rvt &= [\text{rvt-from}, \text{rvt-until}], \\
\text{rvt-from} &= \{\{op_i \in \mathrm{OPERATORs}\} \cap \{event_i \in \mathrm{EVENTs}\}\}, \\
\text{rvt-until} &= \{\{op_i \in \mathrm{OPERATORs}\} \cap \{event_i \in \mathrm{EVENTs}\}\}.
\end{aligned}
\]

Thus, if the software has a set of feature attributes Ai, then a complete scheme for temporal-based software version management can be written as:

S = {A1, A2, A3, …, An, tt, avt-from, avt-until, rvt-from, rvt-until}

where Ai is an attribute name of a version, tt ∈ P, and avt-from, avt-until, rvt-from and rvt-until ∈ T.

Table 2 shows the temporal-based version-record management representing KEWNET’s software version history. For example, KEWNET Ver. 1.1 has been updated three times. The first time, the version was recorded on tt3 with an absolute valid time from avf2 to avu3 and a relative valid time from rvf2 to rvu3. For the second update, on tt4, the absolute valid time is from avf2 to avu4 and the relative valid time is from rvf2 to rvu4. The version then had another change request, and therefore has a new absolute valid time from avf2 to avu5 and a relative valid time from rvf2 to rvu5. This transaction is recorded on tt5.
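The scheme S can be sketched as a small Python record type. This is only an illustration under stated assumptions: the class name `VersionRecord`, the use of a free-text `description` as the sole feature attribute Ai, and the encoding of each relative bound as an (operator, event) pair are hypothetical choices, and the events in the usage note are invented for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple

# The five temporal operators considered by the model.
OPERATORS = {"equal", "before", "after", "meets", "met_by"}

# Assumed encoding of a relative bound: (operator, name of a defined event).
RelativeBound = Tuple[str, str]


@dataclass(frozen=True)
class VersionRecord:
    """One row of the scheme S = {A1..An, tt, avt-from, avt-until, rvt-from, rvt-until}."""
    version: str              # feature attribute, e.g. "1.1"
    description: str          # feature attribute (hypothetical)
    tt: date                  # transaction time: when this valid time was recorded
    avt_from: date            # absolute valid time, lower bound
    avt_until: date           # absolute valid time, upper bound
    rvt_from: RelativeBound   # relative valid time, lower bound
    rvt_until: RelativeBound  # relative valid time, upper bound

    def __post_init__(self) -> None:
        if self.avt_until < self.avt_from:
            raise ValueError("avt-until must not precede avt-from")
        for operator, event in (self.rvt_from, self.rvt_until):
            if operator not in OPERATORS:
                raise ValueError(f"unsupported temporal operator: {operator}")
            if not event:
                raise ValueError("a relative bound needs a named event")
```

A record could then be created as `VersionRecord("1.1", "bug-fix release", date(2009, 3, 1), date(2009, 3, 1), date(2009, 6, 30), ("after", "release of version 1.0"), ("before", "start of semester registration"))`.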

Table 1: The definitions of temporal operators based on time point and time interval

Temporal Operator   Time Point              Time Interval
equal               t = {(t = ti) ∈ T}      τ = {(τ = τi) ∈ T}
before              τ = {(τ < ti) ∈ T}      τ = {(τ < τi) ∈ T}
after               τ = {(τ > ti) ∈ T}      τ = {(τ > τi) ∈ T}
meets               τ = {(τ ≤ ti) ∈ T}      τ = {(τ ≤ τi) ∈ T}
met_by              τ = {(τ ≥ ti) ∈ T}      τ = {(τ ≥ τi) ∈ T}

Figure 2: Temporal elements in software version management (a software version carries a transaction time and a valid time; the valid time is either absolute or relative, each with a From and an Until bound)


Table 2: Version-Record for KEWNET software

Ver #   tt     avt-from   avt-until   rvt-from   rvt-until
1.0     tt1    avf1       avu1        rvf1       rvu1
1.0     tt2    avf1       avu2        rvf1       rvu2
1.1     tt3    avf2       avu3        rvf2       rvu3
1.1     tt4    avf2       avu4        rvf2       rvu4
1.1     tt5    avf2       avu5        rvf2       rvu5
1.2     tt6    avf3       avu6        rvf3       rvu6
1.2     tt7    avf3       avu7        rvf3       rvu7
2.0     tt8    avf4       avu8        rvf4       rvu8
2.0     tt9    avf4       avu9        rvf4       rvu9
2.1     tt10   avf5       avu10       rvf5       rvu10
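The pattern in Table 2, where each update adds a new row instead of overwriting the previous one, can be sketched as an append-only history in Python. This is a minimal sketch rather than the prototype's implementation: the `VersionHistory` class, its method names and the restriction to absolute valid time are assumptions made for illustration.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    version: str
    tt: date          # transaction time of this update
    avt_from: date    # absolute valid time, lower bound
    avt_until: date   # absolute valid time, upper bound


class VersionHistory:
    """Append-only store: earlier valid times are retained, never overwritten."""

    def __init__(self) -> None:
        self._rows: List[Row] = []

    def record(self, version: str, tt: date, avt_from: date, avt_until: date) -> None:
        # Transaction times are assumed to arrive in nondecreasing order.
        if self._rows and tt < self._rows[-1].tt:
            raise ValueError("transaction time must not go backwards")
        self._rows.append(Row(version, tt, avt_from, avt_until))

    def history(self, version: str) -> List[Row]:
        """All recorded valid times of one version, oldest first."""
        return [row for row in self._rows if row.version == version]

    def as_of(self, version: str, when: date) -> Optional[Row]:
        """The row that was current in the database at transaction time `when`."""
        current = None
        for row in self._rows:
            if row.version == version and row.tt <= when:
                current = row
        return current
```

Recording three updates of version 1.1 yields three rows from `history("1.1")`, and `as_of` returns the row that was current on a given date, which corresponds to the roll-back capability that transaction time is meant to provide.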

5.2 The Temporal-Based Version Management Functionality
To carry out experiments validating the proposed model, a client-server prototype has been developed. The prototype has three main modules: register version, update the version valid time and queries.

During the register version process, the software manager needs to record the basic information of the software version. The attributes to be keyed in by the software manager can be denoted as Av = {version code, date release, version description, origin version code, version id}. Figure 3 illustrates the sample screen used to register the basic information of the software version.

Figure 3: Register the software version

On completion of a new software version registration, the software manager needs to update its valid time. This can be done using the module update the version valid time, illustrated in Figure 4. The attributes for this module are AT = {version code, transaction date, description, date start, date end, time start, time end, update by, position}. The attribute transaction date is the current date and is auto-generated by the server.

The software manager needs to use this form to record any change in the valid time of a software version. The tool also allows the user to query the database. Users can browse the version valid time and status of any registered software, as shown in Figure 5. Meanwhile, Figure 6 shows the output of a query for the full history of valid time and status of a software version.


Figure 4: Update the software version valid time

Figure 5: The software version valid time report

Figure 6: The transaction records of a version


6. CONCLUSION

In practical software version management, it is frequently important to retain a complete record of the past and current valid times of a version's states. The record of an old valid time of a software version must not be replaced or overwritten during the updating process. Hence, this paper introduces a new model for software version management based on temporal elements. An important point is that temporal aspects such as valid time and transaction time are stamped on each software version, so that monitoring and conflict management can be carried out easily.

Based on the proposed model, a prototype has been developed. The prototype will be trialled in ITC-UDM. It will be used to monitor and keep track of the evolution of software versions, system modules and software documents in the university’s software. For further improvement, we are currently investigating related issues, including combining the model with change request management, considering more temporal operators and developing a standard temporal model for all configuration items in software configuration management.

References:

[1] E. Bertino, C. Bettini, E. Ferrari and P. Samarati. “A Temporal Access Control Mechanism for Database Systems”, IEEE Trans. on Knowledge and Data Engineering, 8, 1996, 67 – 79.
[2] C. E. Dyreson, W. S. Evans, H. Lin and R. T. Snodgrass. “Efficiently Supporting Temporal Granularities”, IEEE Trans. on Knowledge and Data Engineering, Vol. 12 (4), 2000, 568 – 587.
[3] G. M. Clemm. “Replacing Version Control With Job Control”, ACM – Proc. 2nd Intl. Workshop on Software Configuration Management, 1989, 162 – 169.
[4] D. Gao, C. S. Jensen, R. T. Snodgrass and M. D. Soo. “Join Operations in Temporal Databases”, The Very Large Database Journal, Vol. 14, 2005, 2 – 29.
[5] A. Dix, T. Rodden and I. Sommerville. “Modelling Versions in Collaborative Work”, IEE – Proc. Software Engineering, 1997, 195 – 206.
[6] H. Gregerson and C. S. Jensen. “Temporal Entity-Relationship Models – A Survey”, IEEE Trans. on Knowledge and Data Engineering, 11, 1999, 464 – 497.
[7] A. Gustavsson. “Maintaining the Evaluation of Software Objects in an Integrated Environment”, ACM – Proc. 2nd Intl. Workshop on Software Configuration Management, 1989, 114 – 117.
[8] A. Havewala. “The Version Control Process: How and Why it can save your project”, Dr. Dobb’s Journal, 24, 1999, 100 – 111.
[9] C. S. Jensen and R. T. Snodgrass. “Temporal Data Management”, IEEE Trans. on Knowledge and Data Engineering, 11, 1999, 36 – 44.
[10] K. Torp, C. S. Jensen and R. T. Snodgrass. “Effective Timestamping in Database”, The Very Large Database Journal, Vol. 8, 1999, 267 – 288.
[11] B. Knight and J. Ma. “A General Temporal Theory”, The Computer Journal, 37, 1994, 114 – 123.
[12] B. Knight and J. Ma. “A Temporal Database Model Supporting Relative and Absolute Time”, The Computer Journal, 37, 1994, 588 – 597.
[13] A. Lie. “Change Oriented Versioning in a Software Engineering Database”, ACM – Proc. 2nd Intl. Workshop on Software Configuration Management, 1989, 56 – 65.
[14] H. Mary. “Beyond Version Control”, Software Magazine, 16, 1996, 45 – 47.


EFFECTIVE DIGITAL FORENSIC ANALYSIS OF THE NTFS DISK IMAGE

Mamoun Alazab, Sitalakshmi Venkatraman, Paul Watters

University of Ballarat, Australia


{m.alazab, s.venkatraman, p.watters} @ballarat.edu.au

ABSTRACT
Forensic analysis of the Windows NT File System (NTFS) could provide useful
information leading towards malware detection and presentation of digital
evidence for the court of law. Since NTFS records every event of the system,
forensic tools are required to process an enormous amount of information related
to user / kernel environment, buffer overflows, trace conditions, network stack, etc.
This has led to imperfect forensic tools that are practical for implementation and
hence become popular, but are not comprehensive and effective. Many existing
techniques have failed to identify malicious code in hidden data of the NTFS disk
image. This research discusses the analysis technique we have adopted to
successfully detect maliciousness in hidden data, by investigating the NTFS boot
sector. We have conducted experimental studies with some of the existing popular
forensics tools and have identified their limitations. Further, through our proposed
three-stage forensic analysis process, our experimental investigation attempts to
unearth the vulnerabilities of NTFS disk image and the weaknesses of the current
forensic techniques.

Keywords: NTFS, forensics, disk image, data hiding.

1 INTRODUCTION

Digital forensics is the science of identifying, extracting, analyzing and presenting digital evidence that has been stored in digital electronic storage devices, for use in a court of law [1, 2, 3]. While forensic investigation attempts to provide a full description of a digital crime scene, in computer systems the primary goals of digital forensic analysis are fivefold: i) to identify all the unwanted events that have taken place, ii) to ascertain their effect on the system, iii) to acquire the necessary evidence to support a lawsuit, iv) to prevent future incidents by detecting the malicious techniques used and v) to recognize the motives and intentions of the attacker for future predictions [2, 4]. The general components of the digital forensic process are acquisition, preservation and analysis [5].

Digital electronic evidence could be described as the information and data of investigative value that are stored by an electronic device [6]. This research focuses on the abovementioned third goal of acquiring the necessary evidence of intrusions that take place on a computer system. In particular, this paper investigates the digital forensic techniques that could be used to analyze and acquire evidence from the most commonly used file system on computers, namely, the Windows NT File System (NTFS).

Today, the NTFS file system is the basis of the predominant operating systems in use, such as Windows 2000, Windows XP, Windows Server 2003, Windows Server 2008, Windows Vista and Windows 7, and is supported even in most free UNIX distributions [7, 8, 9]. Hence, malware writers try to target NTFS, as this could affect more computer users. Another compelling reason for the strong relationship between computer crime and the NTFS file system is the lack of literature that unearths the vulnerabilities of NTFS and the weaknesses of present digital forensic techniques [10]. This paper attempts to fill this gap by studying the techniques used in the analysis of the NTFS disk image. Our objectives are i) to explore the NTFS disk image structure and its vulnerabilities, ii) to investigate different commonly used digital forensic techniques, such as signatures, data hiding and timestamps, and their weaknesses, and iii) finally to suggest improvements in static analysis of the NTFS disk image.

2 FORENSIC ANALYSIS PROCESS

In this section, we describe the forensic analysis process we adopted to achieve the abovementioned objectives of this research work. We conducted an empirical study using selected digital forensic tools that are predominantly used in practice. Several factors, such as effectiveness, uniqueness and robustness in analyzing the NTFS disk image, were considered in selecting the tools / utilities required
Today, NTFS file system is the basis of considered in selecting the tools / utilities required


for this empirical study. Since each utility performs some specific function, a collection of such tools was necessary to perform a comprehensive set of functions. Hence, the following forensic utilities / tools were adopted to conduct the experimental investigation in this research work:
i) Disk imaging utilities such as dd [11] or dcfldd V1.3.4-1 [12] for obtaining a sector-by-sector mirror image of the disk;
ii) Evidence collection using utilities such as Hexedit [13], Frhed 1.4.0 [14] and Strings V2.41 [15] to inspect the binary code of the NTFS disk image;
iii) NTFS disk analysis using software tools such as The Sleuth Kit (TSK) 3.01 [16], Autopsy [17] and NTFSINFO v1.0 [18] to explore and extract intruded data as well as hidden data for performing forensic analysis.

For the experimental investigation of the effectiveness of the above tools, we created test data on a Pentium(R) Core(TM) 2 Duo CPU, 2.19 GHz, 2.98 GB of RAM, running Windows XP Professional with an NTFS file system partition. In this pilot empirical study, we focused on the boot sector of the NTFS disk image. We adopted the following three stages to perform digital forensic analysis in a comprehensive manner:
Stage 1: Hard disk data acquisition,
Stage 2: Evidence searching and
Stage 3: Analysis of NTFS file system.

2.1 Stage 1 - Hard Disk Data Acquisition
As the first stage in forensic analysis, we used the dcfldd utility developed by Nicholas Harbour and the dd utility from George Garner to acquire the NTFS disk image from the digital electronic storage device. These utilities were selected for investigation since they provide simple and flexible acquisition tools. The main advantage of using these tools is that we could extract the data in or between partitions to a separate file for further analysis. In addition, they provide built-in MD5 hashing features that allow the analyst to calculate, save and verify MD5 hash values. In digital forensic analysis, hashing is important to ensure data integrity, to identify which data values have been maliciously changed and to explore known data objects [19].

2.2 Stage 2 - Evidence searching
The next stage involved searching for evidence of system tampering. Evidence of intrusion could be gained by looking for known signatures and timestamps, as well as by searching for hidden data [20]. In this stage, we used the Strings command by Mark Russinovich, the Frhed hex editor by Rihan Kibria and the WinHex hex editor by X-Ways Software Technology AG to detect a keyword or phrase in the disk image.

2.3 Stage 3 - Analysis of NTFS File System
In the final stage of the experimental study, we analyzed the data obtained from the NTFS disk image that contributed towards meaningful conclusions of the forensic investigation. We adopted a collection of tools, such as The Sleuth Kit (TSK) and Autopsy Forensic Browser by Brian Carrier and NTFSINFO v1.0 from Microsoft Sysinternals by Mark Russinovich, to perform different aspects of the NTFS file system analysis.

3 FORENSIC INVESTIGATION STEPS

Many aspects must be taken into consideration when conducting a computer forensic investigation. Investigators adopt different approaches while examining a crime scene. From the literature, we find five steps adopted: policy and procedure development, evidence assessment, evidence acquisition, evidence examination, and documenting and reporting [26]. In our proposed approach for digital forensic investigation, we adopted the following nine steps, as shown in Figure 1:

Step 1: Policy and Procedure Development – In this step, the tools needed at the digital scene are determined as part of administrative considerations. All aspects of policy and procedure development are considered to determine the mission statement, skills and knowledge, funding, personnel requirements, evidence handling and support from management.

Step 2: Hard Disk Acquisition – This step involves forensic duplication, achieved by obtaining an NTFS image of the original disk using the dd tool. This step obtains a sector-by-sector mirror image of the disk, and the output image file is created as Image.dd.

Step 3: Check the Data Integrity – This step ensures the integrity of the acquired data by reporting a hash value. We used the MD5 tool to guarantee the integrity of the original media and the resulting image file.

Step 4: Extract MFT in the Boot Sector – In this step, the MFT is extracted from the boot sector. We analyzed the MFT using the WinHex hex editor and checked the number of sectors allocated to the NTFS file system using NTFSINFO.

Step 5: Extract $Boot file and Backup boot sector – In this step, the $Boot file is extracted to investigate hidden data. We analyzed the hidden data in the $Boot metadata file using WinHex, TSK and Autopsy.

Step 6: Compare Boot sector and Backup – A comparison of the original and backup boot sectors is performed in this step. We obtained two further images from the original image using the dd tool. The output consisted of two image files named backupbootsector.dd and bootsector.dd. We analyzed these two image files using the WinHex hex editor, TSK and Autopsy.

Step 7: Check the Data Integrity – In this step, the integrity of the data is verified again as a test of congruence. We applied MD5 hashing to the two created image files to check the data integrity.

Step 8: Extract the ASCII and UNICODE – This step involves extracting the ASCII and UNICODE characters from the binary files in the disk image. We used the Strings command and keyword search to match text or hexadecimal values recorded on the disk. Through keyword search, we could even find files that contain specific words.

Step 9: Physical Presentation – In this final step, all the findings from the forensic investigation are documented. It involves presenting the digital evidence through documentation and reporting procedures.

Figure 1: Forensic investigation steps

4 BOOT SECTOR ANALYSIS OF NTFS

4.1 NTFS Disk Image
As mentioned in the previous section, the first step to be adopted by a digital forensic investigator is to acquire a duplicate copy of the NTFS disk image before beginning the analysis. This is to ensure that the data on the original devices are not changed during the analysis. Therefore, the disk image must be kept isolated from the original infected computer in order to extract the evidence that could be found on the electronic storage devices. By conducting investigations on the disk image, we could unearth any hidden intrusions, since the image captures the invisible information as well [21]. The advantages of analyzing disk images are that investigators can: a) preserve the digital crime scene, b) obtain the information in slack space, c) access unallocated space, free space and used space, d) recover file fragments, hidden or deleted files and directories, e) view the partition structure and f) get the date-stamp and ownership of files and folders [3, 22].

4.2 Master File Table
To investigate how intrusions result in data hiding, data deletion and other obfuscations, it is essential to understand the physical characteristics of the Microsoft NTFS file system. The Master File Table (MFT) is the core of NTFS, since it contains details of every file and folder on the volume and allocates two sectors for every MFT entry [23]. Hence, a good knowledge of the MFT layout structure also facilitates the disk recovery process. Each MFT entry has a fixed size of 1 KB (the MFT record size can be identified at byte offset 64 in the boot sector). We provide the MFT layout and represent the plan of the NTFS file system in Figure 2. The main purpose of NTFS is to facilitate reading and writing of file attributes, and the MFT enables a forensic analyst to examine in some detail the structure and working of the NTFS volume. Therefore, it is important to understand how attributes are stored in an MFT entry.

The key feature to note is that an MFT entry contains attributes that can have any format and any size. Further, as shown in Figure 2, every MFT entry contains an entry header, allocated in the first 42 bytes of the file record, followed by attributes, each consisting of an attribute header and attribute content. The attribute header is used to identify the size, name and flag value. The attribute content can reside in the MFT entry, immediately following the attribute header, if its size is less than 700 bytes (known as a resident attribute); otherwise the attribute content is stored in external clusters called a cluster run (known as a non-resident attribute). This is because the MFT entry is 1 KB in size and hence cannot fit anything that occupies more than 700 bytes.
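The boot sector fields mentioned in Section 4.2 and the integrity and comparison checks of Steps 3, 6 and 7 can be illustrated with a short Python sketch. It is not the tooling used in the study (which relied on dd, WinHex, TSK and an MD5 utility). The function names are hypothetical, the field offsets follow the commonly documented NTFS boot sector layout, and the comparison assumes that the image holds a single NTFS partition whose backup boot sector occupies its final sector.

```python
import hashlib
import struct
from typing import BinaryIO, Dict

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def parse_ntfs_boot_sector(sector: bytes) -> Dict[str, int]:
    """Decode a few little-endian fields of an NTFS boot sector."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("boot sector must be at least 512 bytes")
    if sector[3:11] != b"NTFS    ":
        raise ValueError("OEM ID is not 'NTFS    '")
    (bytes_per_sector,) = struct.unpack_from("<H", sector, 0x0B)
    sectors_per_cluster = sector[0x0D]
    total_sectors, mft_cluster, mirror_cluster = struct.unpack_from("<QQQ", sector, 0x28)
    (raw_record_size,) = struct.unpack_from("<b", sector, 0x40)  # byte offset 64
    (serial_number,) = struct.unpack_from("<Q", sector, 0x48)
    bytes_per_cluster = bytes_per_sector * sectors_per_cluster
    if raw_record_size > 0:
        # Positive value: record size expressed in clusters.
        bytes_per_mft_record = raw_record_size * bytes_per_cluster
    else:
        # Negative value n: record size is 2**|n| bytes (e.g. -10 gives 1024).
        bytes_per_mft_record = 1 << -raw_record_size
    return {
        "bytes_per_sector": bytes_per_sector,
        "bytes_per_cluster": bytes_per_cluster,
        "total_sectors": total_sectors,
        "mft_start_cluster": mft_cluster,
        "mft_mirror_cluster": mirror_cluster,
        "bytes_per_mft_record": bytes_per_mft_record,
        "serial_number": serial_number,
    }


def md5_of_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """MD5 of everything from the current position to the end of the stream."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def boot_sectors_match(image: BinaryIO) -> bool:
    """Compare the MD5 of the first sector with that of the last sector."""
    image.seek(0)
    primary = image.read(SECTOR_SIZE)
    image.seek(-SECTOR_SIZE, 2)  # assumed location of the backup boot sector
    backup = image.read(SECTOR_SIZE)
    return hashlib.md5(primary).hexdigest() == hashlib.md5(backup).hexdigest()
```

A typical use would open the acquired image in binary mode, pass its first 512 bytes to `parse_ntfs_boot_sector`, and call `boot_sectors_match` on the open file. A mismatch between the two digests indicates only that the two copies differ, so the differing bytes still need to be examined in a hex editor.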


Figure 2: MFT layout structure

4.3 Boot Sector Analysis and Results
We performed boot sector analysis by investigating the metadata files that
describe the file system. Following the steps described in the previous
section (Figure 1), we first created an NTFS disk image of the test computer
using the dd utility in order to investigate the boot sector. We then ran the
NTFSINFO tool on the disk image; its output, given in Table 1, shows the boot
sector of the test device and information about the on-disk structure. Such
data structure examination enables us to view the MFT information, allocation
size, volume size and metadata files. We extracted useful information such as
the size of clusters, sector numbers in the file system, starting cluster
address of the MFT, the size of each MFT entry and the serial number given
for the file system.

Table 1: NTFS Information Details.

Volume Size
-----------
Volume size             : 483 MB
Total sectors           : 991199
Total clusters          : 123899
Free clusters           : 106696
Free space              : 416 MB (86% of drive)

Allocation Size
----------------
Bytes per sector        : 512
Bytes per cluster       : 4096
Bytes per MFT record    : 1024
Clusters per MFT record : 0

MFT Information
---------------
MFT size                : 0 MB (0% of drive)
MFT start cluster       : 41300
MFT zone clusters       : 41344 - 56800
MFT zone size           : 60 MB (12% of drive)
MFT mirror start        : 61949

Meta-Data files

From the information gained above, we followed the steps in Figure 1 to
analyze the boot sector image. As shown in Figure 3, we performed an analysis
of the data structure of this boot sector, and the results of the
investigation conducted using existing forensic tools are summarized in
Table 2. From these results, we could conclude that the existing forensic
tools do not check for possible infections in certain hidden data of the boot
sector. Hence, in the next section we describe the hidden data analysis
technique that we adopted.

5 HIDDEN DATA ANALYSIS AND RESULTS

Recent cyber crime trends are to use different obfuscation techniques, such
as disguising file names, hiding attributes and deleting files, to intrude
into the computer system. Since the Windows operating system does not zero
the slack space, it becomes a vehicle to hide data, especially in the $Boot
file. Hence, in this study, we have analyzed the hidden data in the $Boot
file structure. The $Boot entry is stored in a metadata file at the first
cluster in sector 0 of the file system, called $Boot, from where the system
boots. It is the only metadata file that has a static location, so it cannot
be relocated. Microsoft allocates the first 16 sectors of the file system to
$Boot and only half of these sectors contain non-zero values [3].
In order to investigate the NTFS file system, one requires substantial
knowledge and experience to analyze the data structure and the hidden data
[24]. The $Boot metadata file structure is located in MFT entry 7 and
contains the boot sector of the file system. It contains information about
the size of the volume, clusters and the MFT. The $Boot metadata file
structure has four attributes, namely, $STANDARD_INFORMATION, $FILE_NAME,
$SECURITY_DESCRIPTOR and $DATA. The $STANDARD_INFORMATION attribute contains
general information such as flags, owner, security ID and the last accessed,
written, and created times. The $FILE_NAME attribute contains the file name
in UNICODE, the size and temporal information as well. The
$SECURITY_DESCRIPTOR attribute contains information about the access control
and security properties. Finally, the $DATA attribute contains the file
contents. The attribute values for the test sample are shown in Table 2 as an
illustration. To obtain them, we used the following TSK command:

   istat -f ntfs c:\image.dd 7

From our investigations of the resulting attribute values, we find that the
$Boot data structure of the NTFS file system could be used to hide data.
Analyzing the hidden data in the boot sector could therefore provide useful
information for digital forensics. The size of the data that could be hidden
in the boot sector is limited by the number of non-zero sectors that
Microsoft allocated in the first 16 sectors of the file system. The data
could be hidden in the $Boot metadata files without raising suspicion and
without affecting the functionality of the system [25].

Table 2: Results of $Boot Analysis

MFT Entry Header Values:
Entry: 7   Sequence: 7
$LogFile Sequence Number: 0
Allocated File
Links: 1

$STANDARD_INFORMATION Attribute Values:
Flags: Hidden, System
Owner ID: 0
Created:       Mon Feb 09 12:09:06 2009
File Modified: Mon Feb 09 12:09:06 2009
MFT Modified:  Mon Feb 09 12:09:06 2009
Accessed:      Mon Feb 09 12:09:06 2009

$FILE_NAME Attribute Values:
Flags: Hidden, System
Name: $Boot
Parent MFT Entry: 5   Sequence: 5
Allocated Size: 8192   Actual Size: 8192
Created:       Mon Feb 09 12:09:06 2009
File Modified: Mon Feb 09 12:09:06 2009
MFT Modified:  Mon Feb 09 12:09:06 2009
Accessed:      Mon Feb 09 12:09:06 2009

Attributes:
Type: $STANDARD_INFORMATION (16-0)   Name: N/A     Resident       size: 48
Type: $FILE_NAME (48-2)              Name: N/A     Resident       size: 76
Type: $SECURITY_DESCRIPTOR (80-3)    Name: N/A     Resident       size: 116
Type: $DATA (128-1)                  Name: $Data   Non-Resident   size: 8192
01

Analysis of the $Boot data structure of the NTFS file system will identify
any hidden data. The analyst should start by comparing the boot sector with
the backup boot sector. The boot sector and the backup boot sector are
supposed to be identical; otherwise there is some data hidden in the $Boot
data structure. One method is to check the integrity of the backup boot
sector and the boot sector by calculating the MD5 checksum of each. A
difference between the checksums indicates that there is some hidden data. We
performed this comparison by running the following commands on the $Boot
image file and the backup boot image:

   dd if=image.dd bs=512 count=1 skip=61949 of=c:\backupbootsector.dd
      --md5sum --verifymd5 --md5out=c:\hash1.md5

   dd if=image.dd bs=512 count=1 of=c:\bootsector.dd
      --md5sum --verifymd5 --md5out=c:\hash2.md5

We found that hidden data in the $Boot data structure could not be detected
directly by the existing tools used in this study, and manual inspections
were required alongside these forensic tools. Hence, through the analysis
conducted with various existing utilities and tools, we arrived at the
following results:

i)   Since NTFS stores all events that take place on a computer system, a
     huge amount of data analysis is required while scanning the entire NTFS
     disk image for forensic purposes. In this empirical study, by merely
     focusing on the hidden data of the $Boot file, we have shown that a
     variety of tools and utilities had to be adopted along with manual
     inspections. Hence, it takes an enormous amount of time to analyze the
     data derived with such tools.

ii)  The existing forensic tools are not comprehensive and effective in
     identifying recent computer threats. Not all computer infections are
     detected by forensic tools; in particular, intrusions in the form of
     hidden data in the $Boot file go unchecked.

iii) It was mandatory to perform manual investigations alongside the existing
     tools. By adopting a manual inspection of the $Boot file using the
     three-stage approach of (1) hard disk acquisition, (2) evidence
     searching and (3) analysis of the NTFS file system, we could
     successfully identify hidden data in the $Boot file.

iv)  Intelligent search techniques could be adopted to extract the ASCII and
     UNICODE characters from binary files in the disk image, on either the
     full file system image or just the unallocated space, which could speed
     up the process of identifying hidden data.

v)   One of the main reasons for having varying tools is that Microsoft has
     different versions of the NTFS file system to be catered for. While
     Windows XP and Windows Server 2003 use the same NTFS version, Windows
     Vista uses the NTFS 3.1 version [7]. The new NTFS 3.1 has changed the
     on-disk structure. For example, the location of the volume boot record
     is at physical sector 2,048. Most of the existing tools do not work with
     all the different versions of the NTFS file system, and hence a
     comprehensive tool is warranted to cope with these changes.
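The boot sector comparison described in this section can also be sketched
programmatically. The following is a minimal illustration, not the procedure
used in the study: it hashes two 512-byte sectors of a raw image held in
memory and reports whether they are identical. The function names
sector_md5 and boot_sectors_match are hypothetical, and the index of the
backup sector is left as a parameter because its location depends on the
volume under examination.

```python
import hashlib

SECTOR_SIZE = 512  # bytes per sector, as reported in Table 1


def sector_md5(image: bytes, sector_index: int,
               sector_size: int = SECTOR_SIZE) -> str:
    """Return the MD5 hex digest of one sector of a raw disk image."""
    start = sector_index * sector_size
    sector = image[start:start + sector_size]
    if len(sector) != sector_size:
        raise ValueError(f"sector {sector_index} lies outside the image")
    return hashlib.md5(sector).hexdigest()


def boot_sectors_match(image: bytes, backup_sector_index: int) -> bool:
    """Compare the boot sector (sector 0) with its backup copy.

    A mismatch is only an indicator that one copy was altered; it does not
    say which copy, or what the extra data is.
    """
    return sector_md5(image, 0) == sector_md5(image, backup_sector_index)
```

For a real acquisition, the image bytes would be read from the file produced
by dd (for example image.dd), and a mismatch would then be followed by a
manual inspection of both sectors in a hex editor.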


Figure 3: Analysis of the test boot Sector

Table 2: Results from the analysis of the test boot sector.

Byte Range  Size  Description                                 Value        Action / Result
0 -- 2      3     Jump to boot code                           9458411      If bootable, jump. If non-bootable, used to store error message
3 -- 10     8     OEM Name – System ID                        NTFS
11 -- 12    2     Bytes per sector                            512
13 -- 13    1     Sectors per cluster                         8
14 -- 15    2     Reserved sectors                            0            Unused – Possible Infection
16 -- 20    5     Unused                                      0            Unused – Possible Infection
21 -- 21    1     Media descriptor                            0
22 -- 23    2     Unused                                      0            Unused – Possible Infection
24 -- 25    2     Sectors per track                           63           No Check – Possible Infection
26 -- 27    2     Number of heads                             255          No Check – Possible Infection
28 -- 31    4     Unused                                      32           No Check – Possible Infection
32 -- 35    4     Unused                                      0            Unused – Possible Infection
36 -- 39    4     Drive type check                            80 00 00 00  For USB thumb drive
40 -- 47    8     Number of sectors in file system (volume)   0.47264 GB
48 -- 55    8     Starting cluster address of $MFT            4*8=32
56 -- 63    8     Starting cluster address of MFT Mirror      61949
                  $DATA attribute
64 -- 64    1     Size of record - MFT entry                  2^10=1024
65 -- 67    3     Unused                                      0            Unused – Possible Infection
68 -- 68    1     Size of index record                        01h
69 -- 71    3     Unused                                      0            Unused – Possible Infection
72 -- 79    8     Serial number                               C87C8h
80 -- 83    4     Unused                                      0            Unused – Possible Infection
84 -- 509   426   Boot code                                   ~
510 -- 511  2     Boot signature                              0xAA55
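The byte ranges in the table above lend themselves to a simple automated
check. The sketch below is an illustration under stated assumptions, not a
tool used in the study: it decodes a few fields at the offsets listed in the
table (little-endian, as is usual for NTFS on-disk structures) and reports
which of the ranges marked "Unused – Possible Infection" contain non-zero
bytes. The names UNUSED_RANGES, parse_boot_sector and nonzero_unused_ranges
are hypothetical.

```python
import struct

# Ranges the table marks "Unused – Possible Infection" (inclusive bounds).
UNUSED_RANGES = [(14, 15), (16, 20), (22, 23), (32, 35),
                 (65, 67), (69, 71), (80, 83)]


def parse_boot_sector(sector: bytes) -> dict:
    """Decode selected boot sector fields at the offsets given in the table."""
    if len(sector) != 512:
        raise ValueError("expected a 512-byte boot sector")
    bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", sector, 11)
    total_sectors, mft_cluster, mirror_cluster = struct.unpack_from("<QQQ", sector, 40)
    (signature,) = struct.unpack_from("<H", sector, 510)
    return {
        "oem_name": sector[3:11].decode("ascii", errors="replace").rstrip(),
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sectors_per_cluster,
        "total_sectors": total_sectors,
        "mft_start_cluster": mft_cluster,
        "mft_mirror_start_cluster": mirror_cluster,
        "signature": signature,
    }


def nonzero_unused_ranges(sector: bytes) -> list:
    """Return the unused ranges that hold at least one non-zero byte."""
    return [(lo, hi) for lo, hi in UNUSED_RANGES if any(sector[lo:hi + 1])]
```

An empty result from nonzero_unused_ranges does not prove that the sector is
clean, since data may also be placed in fields that tools do not validate
(the "No Check" rows) or in the remaining sectors allocated to $Boot.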


6 CONCLUSIONS AND FUTURE RESEARCH DIRECTIONS

Recent methods adopted by computer intruders, attackers and malware target
hidden and deleted data so that they can evade virus scanners and become even
more difficult to identify using existing digital forensic tools. This paper
has attempted to explore the difficulties involved in digital forensics,
especially in conducting NTFS disk image analysis, and to propose an
effective digital forensic analysis.
In this empirical study, we have found that the boot sector of the NTFS file
system could be used by computer attackers as a vehicle to hide data, as
there is a potential weakness. We have emphasized the knowledge and
importance of file systems for digital forensics, as several techniques to
hide data, such as slack space and hidden attributes, have recently been
adopted by attackers. This is an important NTFS file system weakness to be
addressed, and research in this domain could lead to an effective solution
for the open problem of detecting new malicious code that makes use of such
an obfuscated mode of attack. We have shown that the existing forensic
software tools are not competent enough to comprehensively detect all hidden
data in boot sectors.
As a first step to address this problem, we have proposed a three-stage
forensic analysis process consisting of nine steps to facilitate the
experimental study. We have reported the results gathered by following these
proposed steps. By adopting effective search techniques, we were successful
in identifying some unknown malicious hidden data in the $Boot file that were
undetected by current forensic tools.
In this pilot study we adopted a few forensic techniques and effective manual
inspections of the NTFS file image. Our future research direction is to
automate the proposed process so as to facilitate forensic analysis of the
NTFS disk image in an efficient and comprehensive manner. We plan to extract
and extrapolate malware signatures effectively as well as intelligently for
any existing and even new malware that uses hidden and obfuscated modes of
attack. We would automate the knowledge of how to extract data from hidden
data structures and how to reclaim deleted data, and we believe this would
extensively benefit the digital evidence collection and recovery process.

7 REFERENCES

[1] M. Reith, C. Carr & G. Gunsch: An examination of digital forensic models,
    International Journal of Digital Evidence, 1, pp. 1-12 (2002).
[2] M. Alazab, S. Venkatraman & P. Watters: Digital forensic techniques for
    static analysis of NTFS images, Proceedings of ICIT2009, Fourth
    International Conference on Information Technology, IEEE Xplore (2009).
[3] B. Carrier: File system forensic analysis, Addison-Wesley Professional,
    USA, (2008).
[4] S. Ardisson: Producing a Forensic Image of Your Client’s Hard Drive? What
    You Need to Know, Qubit, 1, pp. 1-2 (2007).
[5] M. Andrew: Defining a Process Model for Forensic Analysis of Digital
    Devices and Storage Media, Proceedings of SADFE2007, Second International
    Workshop on Systematic Approaches to Digital Forensic Engineering,
    pp. 16-30 (2007).
[6] E Investigation: Electronic Crime Scene Investigation: A Guide for First
    Responders, US Department of Justice, NCJ, (2001).
[7] A. Svensson: Computer Forensic Applied to Windows NTFS Computers,
    Stockholm's University, Royal Institute of Technology, (2005).
[8] NTFS, http://www.ntfs.com, 22/2/2009.
[9] D. Purcell & S. Lang: Forensic Artifacts of Microsoft Windows Vista
    System, Lecture Notes in Computer Science, Springer, 5075, pp. 304-319
    (2008).
[10] T. Newsham, C. Palmer, A. Stamos & J. Burns: Breaking forensics
    software: Weaknesses in critical evidence collection, Proceedings of the
    2007 Black Hat Conference, (2007).
[11] DD tool, George Garner’s site, Retrieved January, 2009 from
    http://users.erols.com/gmgarner/forensics/.
[12] DCFL tool, Nicholas Harbour, http://dcfldd.sourceforge.net/, accessed on
    14/1/2009.
[13] WinHex tool, X-Ways Software Technology AG, Retrieved January, 2009 from
    http://www.x-ways.net/winhex/.
[14] FRHED tool, Raihan Kibria site, http://frhed.sourceforge.net/,
    14/1/2009.
[15] STRINGS, Mark Russinovich, Retrieved January, 2009 from
    http://technet.microsoft.com/en-us/sysinternals/bb897439.aspx.
[16] TSK tools, Brian Carrier site, http://www.sleuthkit.org/sleuthkit/,
    14/1/2009.
[17] Autopsy tools, Brian Carrier site, Retrieved January, 2009 from
    http://www.sleuthkit.org/autopsy/.
[18] NTFSINFO tool, Mark Russinovich, Retrieved January, 2009 from
    http://technet.microsoft.com/en-au/sysinternals/bb897424.aspx.
[19] V. Roussev, Y. Chen, T. Bourg & G. Richard: Forensic file system hashing
    revisited, Digital Investigation, Elsevier, 3, pp. 82-90 (2006).
[20] K. Chow, F. Law, M. Kwan & K. Lai: The Rules of Time on NTFS File
    System, Proceedings of the Second International Workshop on Systematic
    Approaches to Digital Forensic Engineering, pp. 71-85 (2007).
[21] K. Jones, R. Bejtlich & C. Rose: Real digital forensics: computer
    security and incident response, Addison-Wesley Professional, USA, (2008).
[22] H. Carvey: Windows Forensic Analysis DVD Toolkit, Syngress Press, USA,
    (2007).
[23] L. Naiqi, W. Yujie & H. QinKe: Computer Forensics Research and
    Implementation Based on NTFS File System, CCCM'08, ISECS International
    Colloquium on Computing, Communication, Control, and Management, (2008).
[24] J. Aquilina, E. Casey & C. Malin: Malware Forensics Investigating and
    Analyzing Malicious Code, Syngress Publishing, USA, (2008).
[25] E. Huebner, D. Bem & C. Wee: Data hiding in the NTFS file system,
    Digital Investigation, Elsevier, 3, pp. 211-226 (2006).
[26] S. Hart, J. Ashcroft & D. Daniels: Forensic examination of digital
    evidence: a guide for law enforcement, National Institute of Justice
    NIJ-US, Washington DC, USA, Tech. Rep. NCJ, (2004).


JOB AND APPLICATION-LEVEL SCHEDULING IN DISTRIBUTED COMPUTING

Victor V. Toporkov
Computer Science Department, Moscow Power Engineering Institute,
ul. Krasnokazarmennaya 14, Moscow, 111250 Russia
ToporkovVV@mpei.ru

ABSTRACT
This paper presents an integrated approach for scheduling in distributed computing
with strategies as sets of job supporting schedules generated by a critical works
method. The strategies are implemented using a combination of job-flow and
application-level techniques of scheduling within virtual organizations of Grid.
Applications are regarded as compound jobs with a complex structure containing
several tasks co-allocated to processor nodes. The choice of the specific schedule
depends on the load level of the resource dynamics and is formed as a resource
request, which is sent to a local batch-job management system. We propose a
scheduling framework and compare diverse types of scheduling strategies using
simulation studies.

Keywords: distributed computing, scheduling, application level, job flow,
metascheduler, strategy, supporting schedules, task, critical work.

1 INTRODUCTION

The fact that a distributed computational environment is heterogeneous and
dynamic, along with the autonomy of processor nodes, makes it much more
difficult to manage and assign resources for job execution at the required
quality level [1].
When constructing a computing environment based on the available resources,
e.g. in the model which is used in the X-Com system [2], one normally does not
create a set of rules for resource allocation, as opposed to constructing
clusters or Grid-based virtual organizations. This is reminiscent of some
techniques implemented in the Condor project [3, 4]. Non-clustered Grid
resource computing environments use a similar approach. For example, @Home
projects, which are based on the BOINC system, realize cycle stealing, i.e.
they use either idle computers or idle cycles of a specific computer.
Another similar approach is related to the management of distributed
computing based on resource broker assignment [5-11]. Besides the Condor
project [3, 4], one can also mention several application-level scheduling
projects: AppLeS [6], APST [7], Legion [8], DRM [9], Condor-G [10], and
Nimrod/G [11].
It is known that scheduling jobs with independent brokers, or
application-level scheduling, allows adapting resource usage and optimizing a
schedule for the specific job, for example, decreasing its completion time.
Such approaches are important because they take into account details of job
structure and users' resource load preferences [5]. However, when independent
users apply totally different criteria for application optimization along
with job-flow competition, it can degrade resource usage and integral
performance, e.g. system throughput, processor node load balance, and job
completion time.
An alternative way of scheduling in distributed computing, based on virtual
organizations, includes a set of specific rules for resource use and
assignment that regulates mutual relations between users and resource owners
[1]. In this case only job-flow level scheduling and allocation efficiency
can be increased. Grid-dispatchers [12] or metaschedulers act as managing
centres, as in the GrADS project [13]. However, the joint computing nature of
virtual organizations creates a number of serious challenges. Under such
conditions, when different applications are not isolated, it is difficult to
achieve desirable resource performance: execution of the user’s processes can
have an unpredictable impact on the execution time of other neighbouring
processes. Therefore, some research focuses on the creation of virtual-machine
based virtual Grid workspaces by means of specialized operating systems,
e.g., in the new European project XtreemOS (http://www.xtreemos.eu).
Inseparability of the resources makes it much more complicated to manage jobs
in a virtual organization, because the presence of local job-flows launched
by owners of processor nodes should be taken into account. Dynamic load
balancing of different job-flows can be based on economic principles [14]
that support a fairshare division model for users and owners. The presence of
actual job-flows requires forecasting the resource state and reserving
resources [15], for example by means of the Maui cluster scheduler simulation
approach or methods implemented in systems such as GARA, Ursala, and Silver
[16].
The above-mentioned works are related to either job-flow scheduling problems
or application-level scheduling.
The fundamental difference between them and the approach described here is
that the resultant dispatching strategies are based on the integration of
job-flow management methods and compound job scheduling methods on processor
nodes. This increases the quality of service for the jobs and the efficiency
of resource usage in the distributed environment.
It is considered that the job can be compound (multiprocessor) and that the
tasks included in the job are heterogeneous in terms of computation volume
and resource need. In order to complete the job, one should co-allocate the
tasks to different nodes. Each task is executed on a single node, and it is
supposed that the local management system interprets it as a job accompanied
by a resource request.
On one hand, the structure of the job is usually not taken into account. The
rare exception is the Maui cluster scheduler [16], which allows a single job
to contain several parallel, but homogeneous (in terms of resource
requirements) tasks. On the other hand, there are several resource-query
languages. Thus, JDL from WLMS (http://edms.cern.ch) defines alternatives and
preferences when making a resource query, and ClassAds extensions in Condor-G
[10] allow forming resource queries for dependent jobs. The execution of
compound jobs is also supported by the WLMS scheduling system of the gLite
platform (http://www.glite.org), though the resource requirements of specific
components are not taken into account.
What sets our work apart from other scheduling research is that we consider
coordinated application-level and job-flow management as a fundamental part
of the effective scheduling strategy within the virtual organization.
The state of the distributed environment, the dynamics of its configuration,
and users' and owners' preferences create the need to build multifactor and
multicriteria job managing strategies [17-20]. Availability of heterogeneous
resources, data replication policies [12, 21, 22] and multiprocessor job
structure for efficient co-allocation between several processor nodes should
be taken into account.
In this work, the multicriteria strategy is regarded as a set of supporting
schedules in order to cover possible events related to resource availability.
The outline of the paper is as follows.
In Section 2, we provide details of application-level and job-flow scheduling
with a critical works method and strategies as sets of possible supporting
schedules.
Section 3 presents a framework for integrated job-flow and application-level
scheduling.
Simulation studies of coordinated scheduling techniques and results are
discussed in Section 4.
We conclude and point to future directions in Section 5.

2 APPLICATION-LEVEL AND JOB-FLOW SCHEDULING STRATEGIES

2.1 Application-Level Scheduling Strategy
The application-level scheduling strategy is a set of possible resource
allocation and supporting schedules (distributions) for all N tasks in the
job [18]:

   Distribution := < <Task 1/Allocation i, [Start 1, End 1]>, …,
                     <Task N/Allocation j, [Start N, End N]> >,

where Allocation i, j is the processor node i, j for Task 1, N, and
Start 1, N and End 1, N are the start and stop times of Task 1, N execution.
The time interval [Start, End] is treated as the so-called walltime (WT),
defined at the resource reservation time [15] in the local batch-job
management system.
Figure 1 shows some examples of job graphs in strategies with different
degrees of distribution, task details, and data replication policies [19].
The first type of strategy, S1, allows scheduling with fine-grain
computations and multiple data replicas; the second type, S2, is one with
fine-grain computations and a bounded number of data replicas; and the third
type, S3, implies coarse-grain computations and constrained data replication.
The vertices P1, …, P6, P23, and P45 correspond to tasks, while D1, …, D8,
D12, D36, and D78 correspond to data transmissions. The transition from graph
G1 to graphs G2 and G3 is performed through lumping of tasks and reduction of
the parallelism level.
The job graph is parameterized by prior estimates of the duration Tij of
execution of a task Pi on a processor node nj of the type j, of relative
volumes Vij of computations on a processor (CPU) of the type j, etc.
(Table 1).
It should be mentioned that such estimates are also necessary in several
methods of priority scheduling, including backfilling in the Maui cluster
scheduler.

Figure 1: Examples of job graphs.

Figure 2 shows fragments of strategies of types S1, S2, and S3 for the jobs
in Fig. 1.
The duration of all data transmissions is equal to one unit of time for G1,
while the transmissions D12 and D78 require two units of time and the
transmission D36 requires four units of time for G2 and G3.
We assume that the lumping of tasks is characterized by summing the values of
the corresponding parameters of the constituent subtasks (see Table 1).

Table 1: User's task estimations.

Tij, Vij   Tasks
           P1   P2   P3   P4   P5   P6
Ti1         2    3    1    2    1    2
Ti2         4    6    2    4    2    4
Ti3         6    9    3    6    3    6
Ti4         8   12    4    8    4    8
Vij        20   30   10   20   10   20

Supporting schedules in Fig. 2a present a subset of a Pareto-optimal strategy
of the type S1 for tasks Pi, i=1, …, 6 in G1.
The Pareto relation is generated by a vector of criteria CF, LLj, j=1, …, 4.
A job execution cost-function CF is equal to the sum of Vij/Ti, where Ti is
the real load time of processor node j by task Pi, rounded up to the nearest
integer. Obviously, the actual solving time Ti for a task can be different
from the user estimation Tij (see Table 1).
The processor node load level LLj is the ratio of the total time of usage of
the node of the type j to the job run time. The schedules in Fig. 2b and
Fig. 2c are related to strategies S2 and S3.

2.2 Critical Works Method
Strategies are generated with a critical works method [20].
The gist of the method is a multiphase procedure. The first step of any phase
is the scheduling of a critical work, that is, the longest (in terms of
estimated execution time Tij for task Pi) chain of unassigned tasks, along
with the best combination of available resources. The second step is
resolving collisions caused by conflicts between tasks of different critical
works competing for the same resource.

Figure 2: Fragments of scheduling strategies S1 (a), S2 (b), S3 (c).
[Diagram: node allocations over the time axis 0-20. Recovered contents:]
(a) n1: P1, P2, P4 (LL1=0.35); n2: P5 (LL2=0.10); n3: P3 (LL3=0.15);
    n4: P6 (LL4=0.50); CF=41
    n1: P1, P2, P6 (LL1=0.35); n2: idle (LL2=0); n3: P3, P4 (LL3=0.65);
    n4: P5 (LL4=0.50); CF=37
    n1: P2, P4, P6 (LL1=0.35); n2: P5 (LL2=0.10); n3: P3 (LL3=0.15);
    n4: P1 (LL4=0.50); CF=41
(b) n1: P1, P2, P4, P6 (LL1=0.60); n2: idle (LL2=0); n3: P3 (LL3=0.15);
    n4: P5 (LL4=0.20); CF=39
(c) n1: P1, P23, P45, P6 (LL1=1); n2, n3, n4: idle (LL2=LL3=LL4=0); CF=25

For example, there are four critical works 12, 11, 10, and 9 time units long
(including data transfer time) on the fastest processor nodes of the type 1
for the job graph G1 in Fig. 1 (see Table 1):

   (P1, P2, P4, P6), (P1, P2, P5, P6),
   (P1, P3, P4, P6), (P1, P3, P5, P6).

The schedule with CF=37 has a collision (see Fig. 2a), which occurred due to
simultaneous attempts of tasks P4 and P5 to occupy processor node n4. This
collision is further resolved by the allocation of P4 to the processor node
n3 and P5 to the node n4.
Such reallocations can be based on virtual organization economics: in order
to take a higher-performance processor node, the user should “pay” more.
Cost functions can be used in economic models [14] of resource distribution
in virtual organizations. It is worth noting that the full cost in CF is not
calculated in real money, but in some conventional units (quotas), as for
example in corporate non-commercial virtual organizations. The essential
point is that the user should pay an additional cost in order to use a more
powerful resource or to start the task faster. The choice of a specific
schedule from the strategy depends on the state and load level of processor
nodes, and on data storage policies.

2.3 Examples of Scheduling Strategies
Let us assume that we need to construct a conditionally optimal strategy of
the distribution of processors according to the main scheme of the critical
works method from [20] for a job represented by the information graph G1 (see
Fig. 1). Prior estimates for the duration Tij of processing tasks P1, …, P6
and relative computing volumes Vij for four types of processors are shown in
Table 1, where i = 1, …, 6; j = 1, …, 4. The number of processors of each
type is equal to 1. The duration of all data exchanges D1, …, D8 is equal to
one unit of time. The walltime is given to be WT = 20. The criterion of
resource-use efficiency is a cost function CF. We take a prior estimate for
the duration Tij that is the nearest to the limit time Ti for the execution
of task Pi on a processor of type j, which determines the type j of the
processor used.
The conflicts between competing tasks are resolved through unused processors,
which, being used as resources, are accompanied by a minimum value of the
penalty cost function that is equal to the sum of Vij/Tij (see Table 1) for
competing tasks.
It is required to construct a strategy that is conditionally minimal in terms
of the cost function CF for the upper and lower boundaries of the maximum
range for the duration Tij of the execution of each task Pi (see Table 1). It
is a modification of the strategy S1 with fine-grain computations, active
data replication policy, and the best and worst execution time estimations.
The strategy with a conditional minimum with respect to CF is shown in
Table 2 by schedules 1, 2, and 3 (Ai is the allocation of task Pi,
i = 1, …, 6) and the scheduling diagrams are shown in Fig. 2a.
The strategies that are conditionally maximal with respect to criteria LL1,
LL2, LL3, and LL4 are given in Table 2 by the cases 4-7; 8, 9; 10, 11; and
12-14, respectively. Since there are no conditional branches in the job graph
(see Fig. 1), LLj is the ratio of the total time of usage of a processor of
type j to the walltime WT of the job completion.
The Pareto-optimal strategy involves all schedules in Table 2. The schedules
2, 5, and 13 have resolved collisions between tasks P4 and P5.
Let us assume that the load of processors is such that the tasks P1, P2, and
P3 can be assigned with no more than three units of time on the first and
third processors (see Table 2). The metascheduler runs through the set of
supporting schedules and chooses a concrete variant of resource distribution
that depends on the actual load of processor nodes.
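The way a metascheduler might run through the supporting schedules under such
a load constraint can be sketched as follows. This is a simplified
illustration, not the paper's implementation: the dictionary TABLE_2 holds
the durations and allocations transcribed from Table 2, and the function name
admissible_schedules and its parameters are hypothetical. The constraint is
read here as "P1, P2, and P3 each take at most three time units and run on
node type 1 or 3".

```python
# Schedule number -> (durations T1..T6, allocations A1..A6), from Table 2.
TABLE_2 = {
    1: ((2, 3, 3, 2, 2, 10), (1, 1, 3, 1, 2, 4)),
    2: ((2, 3, 3, 10, 10, 2), (1, 1, 3, 3, 4, 1)),
    3: ((10, 3, 3, 2, 2, 2), (4, 1, 3, 1, 2, 1)),
    4: ((2, 3, 3, 2, 2, 10), (1, 1, 3, 1, 2, 1)),
    5: ((2, 3, 3, 10, 10, 2), (1, 1, 3, 4, 1, 1)),
    6: ((2, 11, 11, 2, 2, 2), (1, 4, 1, 1, 2, 1)),
    7: ((10, 3, 3, 2, 2, 2), (1, 1, 3, 1, 2, 1)),
    8: ((2, 11, 11, 2, 2, 2), (1, 4, 2, 1, 2, 1)),
    9: ((10, 3, 3, 2, 2, 2), (2, 1, 2, 1, 2, 1)),
    10: ((2, 11, 11, 2, 2, 2), (1, 3, 4, 1, 2, 1)),
    11: ((10, 3, 3, 2, 2, 2), (3, 1, 3, 1, 2, 1)),
    12: ((2, 3, 3, 2, 2, 10), (1, 1, 3, 1, 2, 4)),
    13: ((2, 3, 3, 10, 10, 2), (1, 1, 3, 3, 4, 1)),
    14: ((10, 3, 3, 2, 2, 2), (4, 1, 3, 1, 2, 1)),
}


def admissible_schedules(table, tasks=(0, 1, 2), max_duration=3,
                         allowed_nodes=(1, 3)):
    """Return the schedule numbers whose listed tasks satisfy the constraint."""
    selected = []
    for number, (durations, allocations) in sorted(table.items()):
        if all(durations[t] <= max_duration and allocations[t] in allowed_nodes
               for t in tasks):
            selected.append(number)
    return selected
```

Applied to the rows of Table 2, this filter keeps schedules 1, 2, 4, 5, 12,
and 13. Choosing one of them would still require the current state of all
four processors, which the sketch does not model.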

Table 2: The strategy of the type MS1.


          Duration           Allocation         Criteria
Schedule  T1 T2 T3 T4 T5 T6  A1 A2 A3 A4 A5 A6  CF LL1 LL2 LL3 LL4
1 2 3 3 2 2 10 1 1 3 1 2 4 41 0.35 0.10 0.15 0.50
2 2 3 3 10 10 2 1 1 3 3 4 1 37 0.35 0 0.65 0.50
3 10 3 3 2 2 2 4 1 3 1 2 1 41 0.35 0.10 0.15 0.50
4 2 3 3 2 2 10 1 1 3 1 2 1 41 0.85 0.10 0.15 0
5 2 3 3 10 10 2 1 1 3 4 1 1 38 0.85 0 0.15 0.50
6 2 11 11 2 2 2 1 4 1 1 2 1 39 0.85 0.10 0 0.55
7 10 3 3 2 2 2 1 1 3 1 2 1 41 0.85 0.10 0.15 0
8 2 11 11 2 2 2 1 4 2 1 2 1 39 0.30 0.65 0 0.55
9 10 3 3 2 2 2 2 1 2 1 2 1 41 0.35 0.75 0 0
10 2 11 11 2 2 2 1 3 4 1 2 1 41 0.30 0.10 0.55 0.55
11 10 3 3 2 2 2 3 1 3 1 2 1 41 0.35 0.10 0.60 0
12 2 3 3 2 2 10 1 1 3 1 2 4 41 0.35 0.10 0.15 0.50
13 2 3 3 10 10 2 1 1 3 3 4 1 39 0.35 0 0.65 0.50
14 10 3 3 2 2 2 4 1 3 1 2 1 41 0.35 0.10 0.15 0.50
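The criteria columns of Table 2 can be partly recomputed from the duration
and allocation columns. The sketch below is an illustration under an
assumption that is not stated explicitly in the text: each ratio Vij/Ti is
rounded up to an integer before summation. With this reading the sketch
reproduces the CF values of most rows (for example schedules 1 to 4), while
schedules 5, 10, and 13 list a higher CF than this formula gives, so the
exact rule used for those rows may differ. The names cost_function and
load_levels are hypothetical.

```python
VOLUMES = (20, 30, 10, 20, 10, 20)  # Vij for P1..P6, from Table 1
WALLTIME = 20                       # WT given in Section 2.3


def cost_function(durations, volumes=VOLUMES):
    """Sum of Vij/Ti over all tasks, each ratio rounded up (assumption)."""
    return sum((v + t - 1) // t for v, t in zip(volumes, durations))


def load_levels(durations, allocations, node_types=4, walltime=WALLTIME):
    """Share of the walltime during which each node type is busy."""
    busy = [0] * node_types
    for duration, node in zip(durations, allocations):
        busy[node - 1] += duration  # node types are numbered from 1
    return [round(b / walltime, 2) for b in busy]


# Schedule 1 of Table 2.
durations_1 = (2, 3, 3, 2, 2, 10)
allocations_1 = (1, 1, 3, 1, 2, 4)
cf_1 = cost_function(durations_1)
ll_1 = load_levels(durations_1, allocations_1)
```

The load levels follow directly from the definition of LLj as the ratio of
the total usage time of a node type to the walltime WT, so they do not depend
on the rounding assumption.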


Then, the metascheduler should choose the schedules 1, 2, 4, 5, 12, and 13 as possible variants of resource distribution. However, the concrete schedule should be formulated as a resource request and implemented by the batch processing system subject to the state of all four processors and the possible runtimes of tasks P4, P5, and P6 (see Table 2).

Suppose that we need to generate a Pareto-optimal strategy for the job graph G2 (see Fig. 1) over the whole range of the duration Ti of each task Pi, where the step of change is taken to be no less than the lower boundary of the range for the highest-performance processor. The Pareto relation is generated by the vector of criteria CF, LL1, …, LL4. The remaining initial conditions are the same as in the previous example. The strategies that are conditionally optimal with respect to the criteria CF, LL1, LL2, LL3, and LL4 are presented in Table 3 by the schedules 1-4, 5-12, 13-17, 18-25, and 26-33, respectively. The Pareto-optimal strategy does not include the schedules 2, 5, 12, 14, 16, 17, 22, and 30.

Let us consider the generation of a strategy for the job represented structurally by the graph G3 in Fig. 1, whose parameters are obtained by summing the values given in Table 1 for tasks P2, P3 and for tasks P4, P5. As a result of the resource distribution for the model G3, the tasks P1, P23, P45, and P6 turn out to be assigned to one and the same processor of the first type. Consequently, the costs of data exchanges D12, D36, and D78 can be excluded. Because there can be no conflicts in this case between processing tasks (see Fig. 1), the scheduling obtained before the exclusion of exchange procedures can be revised.
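The idea of a Pareto relation over the criteria vector (CF, LL1, …, LL4) can be illustrated with a minimal sketch. It assumes that the cost CF is minimized and the load levels LL1-LL4 are maximized; this orientation, the function names, and the choice of four sample rows from Table 3 are assumptions made for illustration, and the sketch is not claimed to reproduce the full Pareto relation used in the paper.

```python
from typing import Dict, List, Tuple

# Criteria vector of one schedule: (CF, LL1, LL2, LL3, LL4).
Criteria = Tuple[float, float, float, float, float]


def dominates(a: Criteria, b: Criteria) -> bool:
    """True if a is at least as good as b everywhere and differs from b.

    Assumption: CF (index 0) is minimized, LL1..LL4 are maximized.
    """
    a_key = (-a[0],) + a[1:]
    b_key = (-b[0],) + b[1:]
    return all(x >= y for x, y in zip(a_key, b_key)) and a_key != b_key


def pareto_front(schedules: Dict[int, Criteria]) -> List[int]:
    """Return the numbers of the schedules not dominated by any other one."""
    return sorted(
        number
        for number, criteria in schedules.items()
        if not any(dominates(other, criteria) for other in schedules.values())
    )


# Criteria values of schedules 2, 5, 9, and 24 as listed in Table 3.
rows: Dict[int, Criteria] = {
    2: (41, 0.40, 0.30, 0.15, 0.0),
    5: (43, 0.60, 0.10, 0.15, 0.0),
    9: (41, 0.60, 0.10, 0.15, 0.0),
    24: (41, 0.40, 0.30, 0.20, 0.0),
}

front = pareto_front(rows)
print(front)  # [9, 24]
```

In this sample, schedule 2 is dominated by schedule 24 (same cost, higher LL3) and schedule 5 by schedule 9 (lower cost, same load levels), which agrees with schedules 2 and 5 being absent from the Pareto-optimal strategy.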

Table 3: The strategy of the type S2.


Columns: schedule number; durations T1-T6; allocations A1-A6; criteria CF, LL1-LL4.
No. T1 T2 T3 T4 T5 T6 A1 A2 A3 A4 A5 A6 CF LL1 LL2 LL3 LL4
1 2 3 3 4 4 3 1 1 3 2 4 1 39 0.40 0.20 0.15 0.20
2 4 3 3 2 2 3 2 1 3 1 2 1 41 0.40 0.30 0.15 0
3 4 3 3 3 3 2 2 1 3 1 3 1 40 0.40 0.20 0.30 0
4 5 3 3 2 2 2 2 1 3 1 2 1 43 0.35 0.35 0.15 0
5 2 3 3 2 2 5 1 1 3 1 2 1 43 0.60 0.10 0.15 0
6 2 3 3 4 4 3 1 1 3 1 4 1 39 0.60 0 0.15 0.20
7 2 3 3 5 5 2 1 1 3 1 4 1 41 0.60 0 0.15 0.25
8 2 6 6 2 2 2 1 2 1 1 2 1 42 0.60 0.30 0 0
9 4 3 3 2 2 3 1 1 3 1 2 1 41 0.60 0.10 0.15 0
10 4 3 3 3 3 2 1 1 3 1 3 1 40 0.60 0 0.30 0
11 4 4 4 2 2 2 1 1 4 1 2 1 41 0.60 0.10 0 0.20
12 5 3 3 2 2 2 1 1 3 1 2 1 43 0.60 0.10 0.15 0
13 2 6 6 2 2 2 1 2 4 1 2 1 43 0.30 0.40 0 0.30
14 4 3 3 2 2 3 2 1 2 1 2 1 41 0.40 0.45 0 0
15 4 3 3 3 3 2 2 1 2 1 2 1 40 0.40 0.50 0 0
16 4 4 4 2 2 2 2 1 2 1 2 1 41 0.40 0.50 0 0
17 5 3 3 2 2 2 2 1 2 1 2 1 43 0.35 0.50 0 0
18 2 3 3 2 2 5 1 1 3 1 2 2 43 0.35 0.35 0.15 0
19 2 3 3 4 4 3 1 1 3 2 3 1 39 0.40 0.20 0.35 0
20 2 3 3 5 5 2 1 1 3 2 3 1 40 0.35 0.25 0.40 0
21 2 6 6 2 2 2 1 2 3 1 2 1 42 0.30 0.40 0.30 0
22 4 3 3 2 2 3 2 1 3 1 2 1 41 0.40 0.30 0.15 0
23 4 3 3 3 3 2 2 1 3 1 3 1 40 0.40 0.20 0.30 0
24 4 4 4 2 2 2 2 1 3 1 2 1 41 0.40 0.30 0.20 0
25 5 3 3 2 2 2 2 1 3 1 2 1 43 0.35 0.35 0.15 0
26 2 3 3 2 2 5 1 1 3 1 2 2 43 0.35 0.35 0.15 0
27 2 3 3 4 4 3 1 1 3 2 4 1 39 0.40 0.20 0.15 0.20
28 2 3 3 5 5 2 1 1 3 2 4 1 40 0.35 0.25 0.15 0.25
29 2 6 6 2 2 2 1 2 4 1 2 1 42 0.30 0.40 0 0.30
30 4 3 3 2 2 3 2 1 3 1 2 1 41 0.40 0.30 0.15 0
31 4 3 3 3 3 2 2 1 3 1 3 1 40 0.40 0.20 0.30 0
32 4 4 4 2 2 2 2 1 4 1 2 1 41 0.40 0.30 0 0.20
33 5 3 3 2 2 2 2 1 3 1 2 1 43 0.35 0.35 0.15 0
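A strategy such as the one in Table 3 is a set of supporting schedules from which a concrete variant is chosen according to the state of the processors. The sketch below shows one possible selection rule under strong simplifying assumptions: a processor is treated as either busy or free for the whole job, and the cheapest feasible schedule wins. The function name `choose_schedule` and the tie-breaking by schedule number are hypothetical; only the allocations and CF values are taken from Table 3.

```python
from typing import Dict, Optional, Set, Tuple

# (allocation A1..A6, cost CF) of schedules 1, 6, 10, and 15 from Table 3.
STRATEGY: Dict[int, Tuple[Tuple[int, ...], int]] = {
    1: ((1, 1, 3, 2, 4, 1), 39),
    6: ((1, 1, 3, 1, 4, 1), 39),
    10: ((1, 1, 3, 1, 3, 1), 40),
    15: ((2, 1, 2, 1, 2, 1), 40),
}


def choose_schedule(
    strategy: Dict[int, Tuple[Tuple[int, ...], int]], busy: Set[int]
) -> Optional[int]:
    """Pick the cheapest schedule that uses no busy processor.

    Simplification: availability is a static set of busy processor numbers,
    not a set of time slots. Ties on CF are broken by schedule number.
    """
    feasible = [
        (cost, number)
        for number, (allocation, cost) in strategy.items()
        if not busy & set(allocation)
    ]
    return min(feasible)[1] if feasible else None


print(choose_schedule(STRATEGY, busy=set()))   # 1
print(choose_schedule(STRATEGY, busy={4}))     # 10
print(choose_schedule(STRATEGY, busy={3, 4}))  # 15
print(choose_schedule(STRATEGY, busy={1}))     # None
```

Keeping several admissible variants is what lets the selection fall back to schedule 10 or 15 when processor 4 becomes unavailable, instead of failing outright.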


Table 4: The strategy of the type S3.


Columns: schedule number; durations T1, T23, T45, T6; allocations A1, A23, A45, A6; criteria CF, LL1-LL4.
No. T1 T23 T45 T6 A1 A23 A45 A6 CF LL1 LL2 LL3 LL4
1 2 8 6 4 1 1 1 1 25 1 0 0 0
2 4 8 3 5 1 1 1 1 24 1 0 0 0
3 6 4 6 4 1 1 1 1 24 1 0 0 0
4 8 4 3 5 1 1 1 1 27 1 0 0 0
5 10 4 3 3 1 1 1 1 29 1 0 0 0
6 11 4 3 2 1 1 1 1 32 1 0 0 0

The results of the distribution of processors are presented in Table 4 (A23, A45 are allocations, and T23, T45 are run times for tasks P23 and P45). Schedules 1-6 in Table 4 correspond to the strategy that is conditionally minimal with respect to CF with LL1 = 1. Consequently, there is no sense in generating conditionally maximal schedules with respect to criteria LL1, …, LL4.

2.4 Coordinated Scheduling with the Critical Works Method

The critical works method was developed for application-level scheduling [19, 20]. However, it can be further refined to build multifactor and multicriteria strategies for job-flow distribution in virtual organizations. This method is based on dynamic programming and therefore uses some integral characteristics, for example the total resource usage cost for the tasks that compose the job. At the same time, the method of critical works can also be assigned to the priority scheduling class. There is no conflict between these two facts, because the method is dedicated to task co-allocation of compound jobs.

Let us consider a simple example. Fig. 3 represents two jobs with walltimes WT1 = 110 and WT2 = 140 that are submitted to a distributed environment with 8 CPUs. If the jobs are submitted one by one, the metascheduler (Section 3) will also schedule them one by one. It will guarantee that every job is scheduled within the defined time interval and in the most efficient way in terms of a selected cost function, and it will maximize the average load balance of CPUs on a single-job scale (Fig. 4). Job-flow execution will be finished at WT3 = 250. This is an example of application-level scheduling, and no integral job-flow characteristics are optimized in this case.

Figure 3: Sample jobs.

To combine application-level scheduling and job-flow scheduling and to fully exploit the advantages of the proposed approach, one can submit both jobs simultaneously, or store them in a buffer and execute the scheduling for all jobs in the buffer after a certain amount of time (buffer time). If the metascheduler gets more than one job to schedule, it runs the developed mechanisms that optimize the whole job-flow (two jobs in this example). In that case the metascheduler will still try to find an optimal schedule for each single job as described above and, at the same time, it will try to find the best job assignment so that the average load of CPUs is maximized on a job-flow scale.

Fig. 5 shows that both jobs are executed within WT4 = WT2 = 140, that every data dependency is taken into account (e.g., for the second job, task P2 is executed only after tasks P0, P4, and P1 are ready), and that the final schedule is chosen from the generated strategy with the lowest cost function.

Priority scheduling based on queues is not, in our opinion, an efficient way of co-allocating multiprocessor jobs. Besides, there are several well-known side effects of this approach in cluster systems such as LL, NQE, LSF, PBS and others.


For example, the traditional First-Come-First-Serve (FCFS) strategy leads to idle resources. Another strategy, which ranks jobs according to specific properties such as computational complexity, for example Least-Work-First (LWF), leads to severe resource fragmentation and often makes it impossible to execute some jobs due to the absence of free resources. In distributed environments these effects can lead to unpredictable job execution time and thereby to unsatisfactory quality of service.

Figure 4: Consequential scheduling.

In order to avoid this, many projects have components that make schedules supported by preliminary resource reservation mechanisms [15, 16].

Figure 5: Job-flow scheduling.

One to mention is the Maui cluster scheduler, where a backfilling algorithm is implemented. A remote Grid resource reservation mechanism is also supported in the GARA, Ursala and Silver projects [16]. Here, only one variant of the final schedule is built, and it can become irrelevant because of changes in the local job-queue, transport delays, etc. The strategy is a kind of preparation of possible activities in distributed computing based on supporting schedules (see Fig. 2, Tables 2, 3 and 4) and reactions to the events connected with resource assignment and advance reservations [15, 16]. The more factors, considered as formalized criteria, are taken into account in strategy generation, the more complete the strategy is in the sense of coverage of possible events [18, 19]. The choice of the supporting schedule [20] depends on the utilization state of processor nodes, the data storage and relocation policies specific to the environment, the structure of the jobs themselves, and user estimations of completion time and resource requirements.

It is important to mention that users can submit jobs without the information about the task execution order that is required by existing schedulers like the Maui cluster scheduler, where only queues are supported. The implemented mechanisms of our approach support a complex structure for the job, which is represented as a directed graph, so the user should only provide data dependencies between tasks (i.e., the structure of the job). The metascheduler will generate the schedules to satisfy the users' needs by providing optimal plans for jobs (application-level scheduling) and the needs of the resource owners by optimizing the defined characteristics of the job-flow for the distributed system (job-flow scheduling).

3 METASCHEDULING FRAMEWORK

In order to implement effective scheduling and allocation to heterogeneous resources, it is very important to group user jobs into flows according to the strategy type selected and to coordinate job-flow and application-level scheduling. A hierarchical structure (Fig. 6) composed of a job-flow metascheduler and subsidiary job managers, which cooperate with local batch-job management systems, is a core part of the scheduling framework proposed in this paper. It is assumed that the specific supporting schedule is realized and the actual allocation of resources is performed by the batch processing system. This schedule is implemented on the basis of a user resource request that states requirements for the types and characteristics of resources (memory and processors) and for the system software; such a request is generated, for example, by the script of the job entry instruction qsub. Therefore, the formation and support of scheduling


strategies should be conducted by the metascheduler, an intermediary link between the job flow and the batch processing system.

Figure 6: Components of metascheduling framework. (Diagram: job-flows i, j, k enter the metascheduler, which passes them to job managers for strategies Si, Sj and Sk; the job managers control computer nodes grouped into computer node domains.)

The advantages of hierarchically organized resource managers are obvious; e.g., the hierarchical job-queue-control model is used in the GrADS metascheduler [13] and the X-Com system [2]. A hierarchy of intermediate servers decreases the idle time of the processor nodes, which can be caused by transport delays or by unavailability of the managing server while it is dealing with other processor nodes. A tree-view manager structure in the network environment of distributed computing makes it possible to avoid deadlocks when accessing resources. Another important aspect of computing in heterogeneous environments is that processor nodes with similar architecture, contents, and administration policy are grouped together under the job manager control.

Users submit jobs to the metascheduler (see Fig. 6), which distributes job-flows between processor node domains according to the selected scheduling and resource co-allocation strategy Si, Sj or Sk. This does not mean that these flows cannot “intersect” each other on nodes: a special reallocation mechanism is provided. It is executed on the higher-level manager or on the metascheduler level. Job managers support and update strategies based on cooperation with local managers and on a simulation approach for job execution on processor nodes. The innovation of our approach consists in mechanisms of dynamic job-flow environment reallocation based on scheduling strategies. The nature of distributed computational environments itself demands the development of multicriteria and multifactor strategies [17, 18] of coordinated scheduling and resource allocation.

The dynamic configuration of the environment, the large number of resource reallocation events, the users' and resource owners' needs, as well as the virtual organization policy of resource assignment should be taken into account. The scheduling strategy is formed on the basis of formalized efficiency criteria, which make it possible to reflect economic principles [14] of resource allocation by using relevant cost functions and to solve the load balance problem for heterogeneous processor nodes. The strategy is built by using methods of dynamic programming [20] in a way that allows optimizing scheduling and resource allocation for the set of tasks comprising the compound job. In contrast to previous works, we consider the scheduling strategy as a set of admissible supporting schedules (see Fig. 2, Tables 2 and 3). The choice of the specific variant depends on the load level dynamics of the resources and is formed as a resource query, which is sent to a local batch-job processing system.

One of the important features of our approach is resource state forecasting for timely updates of the strategies. It allows implementing mechanisms of adaptive job-flow reallocation between processor nodes and domains, and also means that there is no longer a fixed task assignment to a particular processor node. While one part of the job can be sent for execution, the other tasks comprising the job can migrate to other processor nodes according to the updated co-allocation strategy. A similar schedule correction procedure is also supported in the GrADS project [13], where a multistage job control procedure is implemented: making the initial schedule, its correction during the job execution, and metascheduling for a set of applications. The downside of this approach is that it is based on the creation of a single schedule, so the metascheduler stops working when no additional resources are available, and the job-queue is then set to waiting mode. The possibility of strategy updates allows a user, being integrated into the economic conditions of the virtual organization, to affect the job start time by changing resource usage costs. In fact, this means that the job-flow dispatching strategy is modified according to new priorities, which provides competitive functioning and dynamic job-flow balance in a virtual organization with inseparable resources.

4 SIMULATIONS STUDIES AND RESULTS

4.1 Simulation System

We have implemented an original simulation environment (Fig. 7) of the metascheduling framework (see Fig. 6) to evaluate efficiency indices of different scheduling and co-allocation strategies. In contrast to well-known Grid simulation systems such as ChicSim [12] or OptorSim [23], our simulator MetaSim generates


multicriteria strategies as a number of supporting schedules for metascheduler reactions to the events connected with resource assignment and advance reservations.

Strategies for more than 12000 jobs with a fixed completion time were studied. Every task of a job had randomized completion time estimations, computation volumes, data transfer times and volumes. These parameters differed between tasks by a factor of 2 to 3. Processor nodes were selected in accordance with their relative performance. For the first group of “fast” nodes the relative performance was equal to 0.66, …, 1; for the second and the third groups it was 0.33, …, 0.66 and 0.33 (“slow” nodes), respectively. The number of nodes conformed to the job structure, i.e., the task parallelism degree, and varied from 20 to 30.

4.2 Types of Strategies

We have studied the strategies of the following types:
S1 – with fine-grain computations and active data replication policy;
S2 – with fine-grain computations and a remote data access;
S3 – with coarse-grain computations and static data storage;
MS1 – with fine-grain computations, active data replication policy, and the best- and worst-case execution time estimations (a modification of the strategy S1).

The strategy MS1 is less complete than the strategy S1 or S2 in the sense of coverage of events in the distributed environment (see Tables 2 and 3). However, the important point is the generation of a strategy by efficient and economic computational procedures of the metascheduler. The type S1 has more computational expenses than MS1, especially for simulation studies of integrated job-flow and application-level scheduling. Therefore, in some experiments with integrated scheduling we compared strategies MS1, S2, and S3.

4.3 Application-Level Scheduling Study

We have conducted statistical research of the critical works method for application-level scheduling with the above-mentioned types of strategies S1, S2, S3. The main goal of the research was to estimate the possibility of forecasting application-level schedules with the critical works method without taking into account independent job flows. For 12000 randomly generated jobs there were 38% admissible solutions for the S1 strategy, 37% for S2, and 33% for S3 (Fig. 8). This result is to be expected: application-level schedules implemented by the critical works method were constructed for available resources not assigned to other independent jobs.

Along with this, there is a distribution of conflicts over the processor nodes that have different performance (“fast” nodes are 2-3 times faster than “slow” ones): 32% for “fast” nodes and 68% for “slow” ones in S1, 56% and 44% in S2, and 74% and 26% in S3 (Fig. 9). This may be explained as follows. The higher the degree of task distribution in an environment with an active data transfer policy, the lower the probability of collision between tasks on a specific resource.

In order to implement an effective scheduling and resource allocation policy in the virtual organization, we should coordinate the application and job-flow levels of the scheduling.

Figure 7: Simulation environment of hierarchical scheduling framework based on strategies.
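The node configuration described in Section 4.1 can be sketched as a small random generator. This is only an illustration and not the MetaSim implementation: the function name `sample_environment`, the group label "medium", the uniform choice of a group for each node, and the uniform distribution inside each range are all assumptions. Only the performance ranges and the 20 to 30 node count come from the text.

```python
import random
from typing import Dict, List, Tuple

# Relative performance ranges quoted in Section 4.1 for the three node groups.
# The label "medium" is a hypothetical name for the second group.
GROUPS: Dict[str, Tuple[float, float]] = {
    "fast": (0.66, 1.0),
    "medium": (0.33, 0.66),
    "slow": (0.33, 0.33),
}


def sample_environment(rng: random.Random) -> List[Tuple[str, float]]:
    """Draw between 20 and 30 nodes, each with a group and a relative performance.

    Assumption: groups are chosen uniformly and performance is uniform
    within the range of the chosen group.
    """
    node_count = rng.randint(20, 30)
    nodes = []
    for _ in range(node_count):
        group = rng.choice(sorted(GROUPS))
        low, high = GROUPS[group]
        nodes.append((group, rng.uniform(low, high)))
    return nodes


env = sample_environment(random.Random(0))
print(len(env), env[0])
```

A fixed seed keeps a simulated environment reproducible across runs, which is convenient when different strategies are to be compared on the same set of nodes.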


Figure 8: Percentage of admissible application-level schedules (panels for S1, S2, and S3).

Figure 9: Percentage of collisions for “fast” processor nodes in application-level scheduling (panels for S1, S2, and S3).

4.4 Job-Flow and Application-Level Scheduling Study

For each simulation experiment, such factors as job completion “cost”, task execution time, scheduling forecast errors (start time estimation), strategy time-to-live (the time interval of acceptable schedules in a dynamic environment), and average load level for strategies S1, MS1, S2, and S3 were studied.

Figure 10 shows load level statistics of processor nodes with different performance, which reveals the pattern of specific resource usage when strategies S1, S2, and S3 are used with coordinated job-flow and application-level scheduling.

The strategy S2 performs best in terms of load balancing for different groups of processor nodes, while the strategy S1 tries to occupy “slow” nodes, and the strategy S3 the processors with the highest performance (see Fig. 10).

Figure 10: Processor node load level in strategies S1, S2, and S3. (Average node load level, %, for each strategy, grouped by relative processor node performance: 0.66-1, 0.33-0.66, and 0.33.)

Factor quality analysis of S2, S3 strategies for the whole range of execution time estimations for the


selected processor nodes, as well as of the modification MS1, for which best- and worst-case execution time estimations were taken, is shown in Figures 11 and 12.

Figure 11: Job completion cost and task execution time in strategies MS1, S2, and S3. (Relative job completion cost and relative task execution time, each on a scale from 0 to 1.)

Lowest-cost strategies are the “slowest” ones like S3 (see Fig. 11); they are also the most persistent in terms of time-to-live (see Fig. 12).

Figure 12: Time-to-live and start deviation time in strategies MS1, S2, and S3. (Relative time-to-live and the ratio of start time deviation to job run time, each on a scale from 0 to 1.)

The strategies of the type S3 try to monopolize processor resources with the highest performance and to minimize data exchanges. At the same time, the “fastest”, most expensive and most accurate strategies like S2 are less persistent. Less accurate strategies like MS1 (see Fig. 12) provide longer task completion time than more accurate ones like S2 (Fig. 11), which include more possible events associated with processor node load level dynamics.

5 CONCLUSIONS AND FUTURE WORK

The related works on scheduling problems are devoted either to job scheduling problems or to application-level scheduling. The gist of the approach described here is that the resultant dispatching strategies are based on the integration of job-flow and application-level techniques. It allows increasing the quality of service for the jobs and the efficiency of resource usage in the distributed environment.

Our results are promising, but we have to bear in mind that they are based on simplified computation scenarios; e.g., in our experiments we use the FCFS management policy in local batch-job management systems. The research results on strategy characteristics cited above were obtained by simulation of a global job-flow in a virtual organization. The inseparability condition for the resources requires additional research and a simulation approach to local job passing, as well as the development of methods for forecasting the load level of local processor nodes. Different job-queue management models and scheduling algorithms (FCFS modifications, LWF, backfilling, gang scheduling, etc.) can be used here. Along with this, local administering rules can be implemented.

One of the most important aspects here is that advance reservations have an impact on the quality of service. Some studies (particularly the one at Argonne National Laboratory) show that preliminary reservation nearly always increases queue waiting time. Backfilling decreases this time. With the FCFS strategy, waiting time is shorter than with LWF. On the other hand, the estimation error for the starting time forecast is bigger with FCFS than with LWF. Backfilling as implemented in the Maui cluster scheduler includes an advanced resource reservation mechanism and guarantees resource allocation. It leads to an increasing difference between the desired reservation time and the actual job starting time when the local request flow is growing. Some of the quality aspects and the job-flow load balance problem are associated with dynamic priority changes, when a virtual organization user changes the execution cost for a specific resource.

All of these problems require further research.

ACKNOWLEDGEMENT. This work was supported by the Russian Foundation for Basic Research (grant no. 09-01-00095) and by the State Analytical Program “The higher school scientific potential development” (project no. 2.1.2/6718).

6 REFERENCES

[1] I. Foster, C. Kesselman, and S. Tuecke: The Anatomy of the Grid: Enabling Scalable Virtual Organizations, Int. J. of High Performance Computing Applications, Vol. 15, No. 3, pp. 200-222 (2001)
[2] V.V. Voevodin: The Solution of Large Problems in Distributed Computational Media, Automation and Remote Control, Vol. 68, No. 5, pp. 32-45 (2007)
[3] D. Thain, T. Tannenbaum, and M. Livny: Distributed Computing in Practice: the Condor Experience, Concurrency and Computation: Practice and Experience, Vol. 17, No. 2-4, pp. 323-356 (2004)


[4] A. Roy and M. Livny: Condor and Preemptive Resume Scheduling, In: J. Nabrzyski, J.M. Schopf, and J. Weglarz (eds.): Grid resource management. State of the art and future trends, Kluwer Academic Publishers, pp. 135-144 (2003)
[5] V.V. Krzhizhanovskaya and V. Korkhov: Dynamic Load Balancing of Black-Box Applications with a Resource Selection Mechanism on Heterogeneous Resources of Grid, In: 9th International Conference on Parallel Computing Technologies, Springer, Heidelberg, LNCS, Vol. 4671, pp. 245-260 (2007)
[6] F. Berman: High-performance Schedulers, In: I. Foster and C. Kesselman (eds.): The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann, San Francisco, pp. 279-309 (1999)
[7] Y. Yang, K. Raadt, and H. Casanova: Multiround Algorithms for Scheduling Divisible Loads, IEEE Transactions on Parallel and Distributed Systems, Vol. 16, No. 8, pp. 1092-1102 (2005)
[8] A. Natrajan, M.A. Humphrey, and A.S. Grimshaw: Grid Resource Management in Legion, In: J. Nabrzyski, J.M. Schopf, and J. Weglarz (eds.): Grid resource management. State of the art and future trends, Kluwer Academic Publishers, pp. 145-160 (2003)
[9] J. Beiriger, W. Johnson, H. Bivens et al.: Constructing the ASCI Grid, In: 9th IEEE Symposium on High Performance Distributed Computing, IEEE Press, New York, pp. 193-200 (2000)
[10] J. Frey, I. Foster, M. Livny et al.: Condor-G: a Computation Management Agent for Multi-institutional Grids, In: 10th International Symposium on High-Performance Distributed Computing, IEEE Press, New York, pp. 55-66 (2001)
[11] D. Abramson, J. Giddy, and L. Kotler: High Performance Parametric Modeling with Nimrod/G: Killer Application for the Global Grid?, In: International Parallel and Distributed Processing Symposium, IEEE Press, New York, pp. 520-528 (2000)
[12] K. Ranganathan and I. Foster: Decoupling Computation and Data Scheduling in Distributed Data-intensive Applications, In: 11th IEEE International Symposium on High Performance Distributed Computing, IEEE Press, New York, pp. 376-381 (2002)
[13] H. Dail, O. Sievert, F. Berman et al.: Scheduling in the Grid Application Development Software project, In: J. Nabrzyski, J.M. Schopf, and J. Weglarz (eds.): Grid resource management. State of the art and future trends, Kluwer Academic Publishers, pp. 73-98 (2003)
[14] R. Buyya, D. Abramson, J. Giddy et al.: Economic Models for Resource Management and Scheduling in Grid Computing, J. of Concurrency and Computation: Practice and Experience, Vol. 14, No. 5, pp. 1507-1542 (2002)
[15] K. Aida and H. Casanova: Scheduling Mixed-parallel Applications with Advance Reservations, In: 17th IEEE International Symposium on High-Performance Distributed Computing, IEEE Press, New York, pp. 65-74 (2008)
[16] D.B. Jackson: GRID Scheduling with Maui/Silver, In: J. Nabrzyski, J.M. Schopf, and J. Weglarz (eds.): Grid resource management. State of the art and future trends, Kluwer Academic Publishers, pp. 161-170 (2003)
[17] K. Kurowski, J. Nabrzyski, A. Oleksiak, and J. Weglarz: Multicriteria Aspects of Grid Resource Management, In: J. Nabrzyski, J.M. Schopf, and J. Weglarz (eds.): Grid resource management. State of the art and future trends, Kluwer Academic Publishers, pp. 271-293 (2003)
[18] V. Toporkov: Multicriteria Scheduling Strategies in Scalable Computing Systems, In: 9th International Conference on Parallel Computing Technologies, Springer, Heidelberg, LNCS, Vol. 4671, pp. 313-317 (2007)
[19] V.V. Toporkov and A.S. Tselishchev: Safety Strategies of Scheduling and Resource Co-allocation in Distributed Computing, In: 3rd International Conference on Dependability of Computer Systems, IEEE CS Press, pp. 152-159 (2008)
[20] V.V. Toporkov: Supporting Schedules of Resource Co-Allocation for Distributed Computing in Scalable Systems, Programming and Computer Software, Vol. 34, No. 3, pp. 160-172 (2008)
[21] M. Tang, B.S. Lee, X. Tang, et al.: The Impact of Data Replication on Job Scheduling Performance in the Data Grid, Future Generation Computing Systems, Vol. 22, No. 3, pp. 254-268 (2006)
[22] N.N. Dang, S.B. Lim, and C.K. Yeo: Combination of Replication and Scheduling in Data Grids, Int. J. of Computer Science and Network Security, Vol. 7, No. 3, pp. 304-308 (2007)
[23] W.H. Bell, D.G. Cameron, L. Capozza et al.: OptorSim - A Grid Simulator for Studying Dynamic Data Replication Strategies, Int. J. of High Performance Computing Applications, Vol. 17, No. 4, pp. 403-416 (2003)


Least and greatest fixed points of a while semantics function
Fairouz Tchier
Mathematics department,
King Saud University
P.O.Box 22452
Riyadh 11495, Saudi Arabia
ftchier@hotmail.com
May 1, 2009

Abstract

The meaning of a program is given by specifying the function (from input to output) that corresponds to the program. The denotational semantic definition thus maps syntactical things into functions. A relational semantics is a mapping of programs to relations. We consider that the input-output semantics of a program is given by a relation on its set of states. In a nondeterministic context, this relation is calculated by considering the worst behavior of the program (demonic relational semantics). In this paper, we concentrate on while loops. We present some interesting results about the fixed points of the while semantics function $f(X) = Q \vee P \mathbin{\Box} X$, where $P^{<} \wedge Q^{<} = \emptyset$. By taking $P := t \mathbin{\Box} B$ and $Q := t^{\sim}$, one gets the demonic semantics we have assigned to while loops in previous papers. We show that the least angelic fixed point is equal to the greatest demonic fixed point of the semantics function.

Keywords: Angelic fixed points, demonic fixed points, demonic functions, while loops, relational demonic semantics.

1 Relation Algebras

Both homogeneous and heterogeneous relation algebras are employed in computer science. In this paper, we use heterogeneous relation algebras whose definition is taken from [8, 27, 28].

(1) Definition. A relation algebra $A$ is a structure $(B, \vee, \wedge, -, \circ, {}^{\smile})$ over a non-empty set $B$ of elements, called relations. The unary operations $-$ and ${}^{\smile}$ are total, whereas the binary operations $\vee$, $\wedge$, $\circ$ are partial. We denote by $B_{\vee R}$ the set of those elements $Q \in B$ for which the union $R \vee Q$ is defined, and we require that $R \in B_{\vee R}$ for every $R \in B$. If $Q \in B_{\vee R}$, we say that $Q$ has the same type as $R$. The following conditions are satisfied.

(a) $(B_{\vee R}, \vee, \wedge, -)$ is a Boolean algebra, with zero element $0_R$ and universal element $1_R$. The elements of $B_{\vee R}$ are ordered by inclusion, denoted by $\le$.

(b) If the products $P \circ R$ and $Q \circ R$ are defined, so is $P \circ Q^{\smile}$. If the products $P \circ Q$ and $P \circ R$ are defined, so is $Q^{\smile} \circ R$. If $Q \circ R$ exists, so does $Q \circ P$ for every $P \in B_{\vee R}$.

(c) Composition is associative: $P \circ (Q \circ R) = (P \circ Q) \circ R$.


(d) There are elements ${}_{R}\mathrm{id}$ and $\mathrm{id}_{R}$ associated to every relation $R \in B$. ${}_{R}\mathrm{id}$ behaves as a right identity and $\mathrm{id}_{R}$ as a left identity for $B_{\vee R}$.

(e) The Schröder rule $P \circ Q \le R \Leftrightarrow P^{\smile} \circ -R \le -Q \Leftrightarrow -R \circ Q^{\smile} \le -P$ holds whenever one of the three expressions is defined.

(f) $1 \circ R \circ 1 = 1$ iff $R \neq 0$ (Tarski rule).

If $R^{\smile} \in B_{\vee R}$, then $R$ is said to be homogeneous. If all $R \in A$ have the same type, the operations are all total and $A$ itself is said to be homogeneous.

For simplicity, the universal, zero, and identity elements are all denoted by $1$, $0$, $\mathrm{id}$, respectively. Another operation that occurs in this article is the reflexive transitive closure $R^{*}$. It satisfies the well-known laws

$$R^{*} = \bigvee_{i \ge 0} R^{i} \quad \text{and} \quad R^{*} = \mathrm{id} \vee R \circ R^{*} = \mathrm{id} \vee R^{*} \circ R,$$

where $R^{0} = \mathrm{id}$ and $R^{i+1} = R \circ R^{i}$. From Definition 1, the usual rules of the calculus of relations can be derived (see, e.g., [8, 10, 28]).

The notion of Galois connections is very important in what follows. There are many definitions of Galois connections [?]. We choose the following one [2].

(2) Definition. Let $(S, \le_{S})$ and $(S', \le_{S'})$ be two preordered sets. A pair $(f, g)$ of functions, where $f : S \to S'$ and $g : S' \to S$, forms a Galois connection iff the following formula holds for all $x \in S$ and $y \in S'$:

$$f(x) \le_{S'} y \Leftrightarrow x \le_{S} g(y).$$

The function $f$ is called the lower adjoint and $g$ the upper adjoint.

2 Monotypes and Related Operators

In the calculus of relations, there are two ways of viewing sets as relations; each of them has its own advantages. The first is via vectors: a relation $x$ is a vector [28] iff $x = x \circ 1$. The second way is via monotypes [2]: a relation $a$ is a monotype iff $a \le \mathrm{id}$. The set of monotypes $\{a \mid a \in B_{\vee R}\}$, for a given $R$, is a complete Boolean lattice. We denote by $a^{\sim}$ the monotype complement of $a$.

The domain and codomain of a relation $R$ can be characterized by the vectors $R \circ 1$ and $R^{\smile} \circ 1$, respectively [15, 28]. They can also be characterized by the corresponding monotypes. In this paper, we take the last approach. In what follows we formally define these operators and give some of their properties.

(3) Definition. The domain and codomain operators of a relation $R$, denoted respectively by $R^{<}$ and $R^{>}$, are the monotypes defined by the equations

(a) $R^{<} = \mathrm{id} \wedge R \circ 1$,

(b) $R^{>} = \mathrm{id} \wedge 1 \circ R$.

These operators can also be characterized by Galois connections (see [2]). For each relation $R$ and each monotype $a$,

$$R^{<} \le a \Leftrightarrow R \le a \circ 1,$$
$$R^{>} \le a \Leftrightarrow R \le 1 \circ a.$$

The domain and codomain operators are linked by the equation $R^{>} = (R^{\smile})^{<}$, as is easily checked.

(4) Definition. Let $R$ be a relation and $a$ be a monotype. The monotype right residual and monotype left residual of $a$ by $R$ (called factors in [5]) are defined respectively by

(a) $a \mathbin{/\!\bullet} R := ((1 \circ a)/R)^{>}$,

(b) $R \mathbin{\backslash\!\bullet} a := (R \backslash (a \mathbin{\Box} 1))^{<}$.

An alternative characterization of residuals can also be given by means of a Galois connection as follows [1]:

$$b \le a \mathbin{/\!\bullet} R \Leftrightarrow (b \mathbin{\Box} R)^{>} \le a,$$
$$b \le R \mathbin{\backslash\!\bullet} a \Leftrightarrow (R \circ b)^{<} \le a.$$

We make extensive use of the complement of the domain of a relation $R$, i.e., the monotype $a$ such that $a = R^{<\sim}$. To avoid the notation $R^{<\sim}$, we adopt the notation


$R^{\prec} := R^{<\sim}$.

Because we assume our relation algebra to be complete, least and greatest fixed points of monotonic functions exist. We cite [12] as a general reference on fixed points.

Let $f$ be a monotonic function. The following properties of fixed points are used below:

(a) $\mu f = \bigwedge \{X \mid f(X) = X\} = \bigwedge \{X \mid f(X) \le X\}$,
(b) $\nu f = \bigvee \{X \mid f(X) = X\} = \bigvee \{X \mid X \le f(X)\}$,
(c) $\mu f \le \nu f$,
(d) $f(Y) \le Y \Rightarrow \mu f \le Y$,
(e) $Y \le f(Y) \Rightarrow Y \le \nu f$.

In what follows, we describe notions that are useful for the description of the set of initial states of a program for which termination is guaranteed. These notions are progressive finiteness and the initial part of a relation.

In terms of points, a relation $R$ is progressively finite iff there is no infinite chain $s_{0}, s_{1}, \ldots$ such that $s_{i} R s_{i+1}$ for all $i \ge 0$; that is, there is no set of points $y$ whose elements are the starting points of some path of infinite length. Equivalently, for every point set $y$, $y \le R \circ y \Rightarrow y = 0$.

The least set of points that are the starting points of paths of finite length, i.e. from which we can proceed only finitely many steps, is called the initial part of $R$ and is denoted by $I(R)$. This topic is of interest in many areas of computer science and mathematics, and is related to recursion and the induction principle.

(5) Definition.

(a) The initial part of a relation $R$, denoted $I(R)$, is given by
$I(R) = \bigwedge \{a \mid a \le \mathrm{id} : a /\!\bullet R = a\} = \bigwedge \{a \mid a \le \mathrm{id} : a /\!\bullet R \le a\} = \mu(a : a \le \mathrm{id} : a /\!\bullet R)$,
where $a$ is a monotype.

(b) A relation $R$ is said to be progressively finite [28] iff $I(R) = \mathrm{id}$.

The description of $I(R)$ by the formulation $a /\!\bullet R = a$ shows that $I(R)$ exists: the function $(a : a \le \mathrm{id} : a /\!\bullet R)$ is monotonic in its first argument and the set of monotypes is a complete lattice, so it follows from the fixed point theorem of Knaster and Tarski that this function has a least fixed point. Progressive finiteness of a relation $R$ is the same as well-foundedness of $R^{\smile}$. Then, $I(R)$ is a monotype. In a concrete setting, $I(R)$ represents the set of points which are not the origins of infinite paths (by $R$).

A relation $R$ is progressively finite iff, for every monotype $a$, $a \le (R \circ a)^{<} \Rightarrow a = 0$; equivalently, $\nu(a : a \le \mathrm{id} : (R \circ a)^{<}) = 0$; equivalently, $\mu(a : a \le \mathrm{id} : a /\!\bullet R) = \mathrm{id}$.

The next theorem involves the function $w_{a}(X) := Q \vee P \circ X$, which is closely related to the description of iterations. The theorem highlights the importance of progressive finiteness in the simplification of fixed point-related properties.

(6) Theorem. Let $f(X) := Q \vee P \circ X$ be a function. If $P$ is progressively finite, the function $f$ has a unique fixed point, which means that $\nu(f) = \mu(f) = P^{*} \circ Q$ [1].

As the demonic calculus will serve as an algebraic apparatus for defining the denotational semantics of nondeterministic programs, we define the demonic operators in what follows.

3 Demonic refinement ordering

We now define the refinement ordering (demonic inclusion) we will be using in the sequel. This ordering induces a complete join semilattice, called a demonic semilattice. The associated operations are demonic join ($\sqcup$), demonic meet ($\sqcap$) and demonic composition ($\Box$). We give the definitions and needed properties of these operations, and illustrate them with simple examples. For more details on relational demonic semantics and demonic operators, see [5, 8, 6, 7, 14].

(7) Definition. We say that a relation $Q$ refines a relation $R$ [23], denoted by $Q \sqsubseteq R$, iff $R^{<} \circ Q \le R$ and $R^{<} \le Q^{<}$.

(8) Proposition. Let $Q$ and $R$ be relations. Then

(a) The least upper bound (wrt $\sqsubseteq$) of $Q$ and $R$ is
$Q \sqcup R = Q^{<} \circ R^{<} \circ (Q \vee R)$.
If $Q^{<} = R^{<}$, then $\sqcup$ and $\vee$ coincide, i.e. $Q \sqcup R = Q \vee R$.


(b) If $Q$ and $R$ satisfy the condition $Q^{<} \wedge R^{<} = (Q \wedge R)^{<}$, their greatest lower bound is $Q \sqcap R = Q \wedge R \vee Q^{\prec} \circ R \vee R^{\prec} \circ Q$; otherwise, the greatest lower bound does not exist. If $Q^{<} \wedge R^{<} = 0$ then we have $\sqcap$ and $\wedge$ coincide, i.e. $Q \sqcap R = Q \wedge R$.

For the proofs see [9, 14].

(9) Definition. The demonic composition of relations $Q$ and $R$ [5] is $Q \mathbin{\Box} R = (R^{<} /\!\bullet Q) \circ Q \circ R$.

In what follows we present some properties of $\Box$.

(10) Theorem.

(a) $(P \mathbin{\Box} Q) \mathbin{\Box} R = P \mathbin{\Box} (Q \mathbin{\Box} R)$,
(b) $R$ total $\Rightarrow Q \mathbin{\Box} R = Q \circ R$,
(c) $Q$ function $\Rightarrow Q \mathbin{\Box} R = Q \circ R$.

See [5, 6, 7, 14, 35].

Monotypes have very simple and convenient properties. Some of them are presented in the following proposition.

(11) Proposition. Let $a$ and $b$ be monotypes. We have

(a) $a = a^{\smile} = a^{2}$,
(b) $a \mathbin{\Box} b = a \wedge b = b \mathbin{\Box} a$,
(c) $a \vee a^{\sim} = \mathrm{id}$ and $a \wedge a^{\sim} = 0$,
(d) $a \le b \Leftrightarrow b^{\sim} \le a^{\sim}$,
(e) $a^{\sim} \mathbin{\Box} b^{\sim} = (a \vee b)^{\sim}$,
(f) $(a \wedge b)^{\sim} = (a \mathbin{\Box} b)^{\sim} = a^{\sim} \vee b^{\sim}$,
(g) $a \mathbin{\Box} b^{\sim} \vee b = a \vee b$,
(h) $a \le b \Leftrightarrow a \mathbin{\Box} 1 \le b \mathbin{\Box} 1$.

In previous papers [14, 13, 31, 35], we found the semantics of the while loop given by the following graph:

[Figure: graph with nodes e and s and edges labelled P and Q]

(a) $S(R) = I(P) \circ [(P \vee Q)^{<} /\!\bullet P^{*}] \circ P^{*} \circ Q$, with the restriction

(b) $P^{<} \wedge Q^{<} = 0$.

Our goal is to show that the operational semantics (a) is equal to the denotational one, which is given as the greatest fixed point of the semantic function $Q \vee P \mathbin{\Box} X$ in the demonic semilattice. In other words, we have to prove the next equation:

(a) $S(R) = \bigsqcup \{X \mid X \sqsubseteq Q \vee P \mathbin{\Box} X\}$;

by taking $P := t \mathbin{\Box} B$ and $Q := t^{\sim}$, one gets the demonic semantics we have assigned to while loops in previous papers [14, 35]. Other similar definitions of while loops can be found in [19, 25, 29].

Let us introduce the following abbreviations:

(12) Abbreviation. Let $P$, $Q$ and $X$ be relations subject to the restriction $P^{<} \wedge Q^{<} = 0$ (b) and $x$ a monotype. The abbreviations $w_{d}$, $w_{a}$, $w_{<}$, $a$ and $l$ are defined as follows:

$w_{d}(X) := Q \vee P \mathbin{\Box} X$,
$w_{a}(X) := Q \vee P \circ X$,
$w_{<}(x) := Q^{<} \vee (P \mathbin{\Box} x)^{<} = Q \vee (P \mathbin{\Box} x)^{<}$,
$a := (P \vee Q)^{<} /\!\bullet P^{*}$,
$l := I(P)$.

(Mnemonics: the subscripts $a$ and $d$ stand for angelic and demonic, respectively; the subscript $<$ refers to the fact that $w_{<}$ is obtained from $w_{d}$ by composition with $<$; the monotype $a$ stands for abnormal, since it represents states from which abnormal termination is not possible; finally, $l$ stands for loop, since it represents states from which no infinite loop is possible.)

In what follows we will be concerned with the fixed points of $w_{a}$, $w_{<}$ and $w_{d}$.

(13) Theorem. Every fixed point $Y$ of $w_{a}$ (Abbreviation 12) verifies $P^{*} \circ Q \le Y \le P^{*} \circ Q \vee l^{\sim} \mathbin{\Box} 1$, and the bounds are tight (i.e. the extremal values are fixed points).

The next lemma investigates the relationship between fixed points of $w_{<}$ and those of $w_{d}$ (cf. Abbreviation 12).
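Before turning to the lemma, the bounds of Theorem 13 can be checked on a small finite example. The Python sketch below is only an illustration under stated assumptions: relations are modelled as sets of pairs over a four-element state space, the relations `P` and `Q` are made-up data satisfying $P^{<} \wedge Q^{<} = 0$, and the helper names (`compose`, `star`, `initial_part`, `w_a`) are hypothetical and do not come from the paper.

```python
def compose(q, r):
    """Relational composition q ∘ r on sets of (source, target) pairs."""
    return frozenset((x, z) for (x, y) in q for (y2, z) in r if y == y2)


def star(r, universe):
    """Reflexive transitive closure r*: grow id with r ∘ X until stable."""
    closure = frozenset((x, x) for x in universe)
    while True:
        bigger = closure | compose(r, closure)
        if bigger == closure:
            return closure
        closure = bigger


def initial_part(r, universe):
    """I(r) as a set of states: least fixed point of a ↦ a/•r."""
    safe = set()
    while True:
        # a/•r keeps exactly the states whose r-successors all lie in `safe`.
        grown = {x for x in universe if all(y in safe for (s, y) in r if s == x)}
        if grown == safe:
            return safe
        safe = grown


# Hypothetical example data (an assumption, not taken from the paper).
U = range(4)
P = frozenset({(0, 1), (1, 0), (2, 3)})  # states 0 and 1 can loop forever
Q = frozenset({(3, 3)})                  # exit step; dom(P) and dom(Q) are disjoint


def w_a(x):
    """Angelic function w_a(X) = Q ∨ P ∘ X of Abbreviation 12."""
    return Q | compose(P, x)


l = initial_part(P, U)                   # states with no infinite P-path
lower = compose(star(P, U), Q)           # P* ∘ Q
upper = lower | frozenset((x, y) for x in set(U) - l for y in U)  # P* ∘ Q ∨ l∼ ∘ 1
middle = lower | {(0, 2), (1, 2)}        # a fixed point strictly between the bounds

for y in (lower, middle, upper):
    assert w_a(y) == y                   # all three are fixed points of w_a
assert lower < middle < upper            # proper inclusions
print(sorted(l), sorted(lower))
```

Because `P` is not progressively finite here (states 0 and 1 form a cycle), the fixed point of `w_a` is not unique, which is consistent with the hypothesis of Theorem 6.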


(14) Lemma. Let $h(X) := (P \vee Q)^{\prec} \vee (P \circ X)^{<}$ and $h_{1}(x) := (P \vee Q)^{\prec} \mathbin{\Box} 1 \vee P \circ x$.

(a) $Y = w_{d}(Y) \Rightarrow w_{<}(Y^{<}) = Y^{<}$,
(b) $w_{<}(Y^{<}) = Y^{<} \Rightarrow h(Y^{\prec}) = Y^{\prec}$,
(c) $h(Y^{\prec}) = Y^{\prec} \Rightarrow h_{1}(Y^{\prec} \mathbin{\Box} 1) = Y^{\prec} \mathbin{\Box} 1$.

(15) Lemma. Let $Y$ be a fixed point of $w_{d}$ and $b$ be a fixed point of $w_{<}$ (Abbreviation 12). The relation $b \mathbin{\Box} Y$ is a fixed point of $w_{d}$.

(16) Lemma. If $Y$ and $Y'$ are two fixed points of $w_{d}$ (Abbreviation 12) such that $Y^{<} = Y'^{<}$ and $Y^{<} \circ P$ is progressively finite, then $Y = Y'$.

The next theorem characterizes the domain of the greatest fixed point, wrt $\sqsubseteq$, of the function $w_{d}$. This domain is the set of points for which normal termination is guaranteed (no possibility of abnormal termination or infinite loop).

(17) Theorem. Let $W$ be the greatest fixed point, wrt $\sqsubseteq$, of $w_{d}$ (Abbreviation 12). We have $W^{<} = a \mathbin{\Box} l$.

The following theorem is a generalization to a nondeterministic context of the while statement verification rule of Mills [24]. It shows that the greatest fixed point $W$ of $w_{d}$ is uniquely characterized by conditions (a) and (b), that is, by the fact that $W$ is a fixed point of $w_{d}$ and by the fact that no infinite loop is possible when the execution is started in a state that belongs to the domain of $W$. Note that we also have $W^{<} \le a$ (see Theorem 17), but this condition is implicitly enforced by condition (a). Half of this theorem (the $\Leftarrow$ direction) is also proved by Sekerinski (the main iteration theorem [29]) in a predicative programming set-up.

(18) Theorem. A relation $W$ is the greatest fixed point, wrt $\sqsubseteq$, of the function $w_{d}$ (Abbreviation 12), iff the following two conditions hold:

(a) $W = w_{d}(W)$,
(b) $W^{<} \le l$.

In what follows we give some applications of our results.

4 Application

In [6, 7], Berghammer and Schmidt propose abstract relation algebra as a practical means for the specification of data types and programs. Often, in these specifications, a relation is characterized as a fixed point of some function. Can demonic operators be used in the definition of such a function? Let us now show with a simple example that the concepts presented in this paper give useful insights for answering this question.

In [6, 7], it is shown that the natural numbers can be characterized by the relations $z$ and $S$ (zero and successor) and the laws

(a) $\emptyset \ne z = zL \wedge zz^{\smile} \subseteq I$ ($z$ is a point),
(b) $SS^{\smile} = I \wedge S^{\smile}S \subseteq I$ ($S$ is a one-to-one mapping),
(c) $Sz = \emptyset$ ($z$ has no predecessor),
(d) $L = \bigcap \{x \mid z \cup S^{\smile}x = x\}$ (generation principle).

For the rest of this section, assume that we are given a relation algebra satisfying these laws. In this algebra, because of the last axiom, the inequation

(a) $z \cup S^{\smile}X \subseteq X$

obviously has a unique solution for $X$, namely, $X = L$. Because the function $g(X) := z \cup S^{\smile}X$ is $\cup$-continuous, this solution can be expressed as

(a) $L = \bigcup_{n \ge 0} g^{n}(\emptyset) = \bigcup_{n \ge 0} (S^{\smile})^{n} z$,

where $g^{0}(\emptyset) = \emptyset$, $g^{n+1}(\emptyset) = g(g^{n}(\emptyset))$, $(S^{\smile})^{0} = I$ and $(S^{\smile})^{n+1} = S^{\smile}(S^{\smile})^{n}$. However, it is shown in [6, 7] that $z \sqcup S^{\smile} \mathbin{\Box} X \subseteq X$, obtained by replacing the join and composition operators in (a) by their demonic counterparts, has infinitely many solutions. Indeed, from $Sz = \emptyset$ and the Schröder rule, it follows that

(a) $z \cap S^{\smile}L = \emptyset$,

so that, by definition of demonic join (8(a)) and demonic composition (9), $z \sqcup S^{\smile} \mathbin{\Box} X = (z \cup S^{\smile} \mathbin{\Box} X) \cap z \cap (S^{\smile} \mathbin{\Box} X)L \subseteq z \cap S^{\smile}L = \emptyset$. Hence, any relation $R$ is a solution to $z \sqcup S^{\smile} \mathbin{\Box} X \subseteq X$. Looking at previous papers [14, 32, 33, 34, 31], one


immediately sees why it is impossible to reach $L$ by joining anything to $z$ (which is a point and hence is an immediate predecessor of $\emptyset$), since this can only lead to $z$ or to $\emptyset$.

Let us now go 'fully demonic' and ask what is a solution to $z \sqcup S^{\smile} \mathbin{\Box} X \sqsubseteq X$. By the discussion above, this is equivalent to $\emptyset \sqsubseteq X$, which has a unique solution, $X = \emptyset$. This raises the question whether it is possible to find some fully demonic inequation similar to (a), whose solution is $X = L$. Because $L$ is in the middle of the demonic semilattice, there are in fact two possibilities: either approach $L$ from above or from below.

For the approach from above, consider the inequation

$X \sqsubseteq z \sqcap S^{\smile} \mathbin{\Box} X$.

Using Theorem 10(c), we have $z \sqcap S^{\smile} \mathbin{\Box} X = z \sqcap S^{\smile}X$, since $S^{\smile}$ is deterministic (axiom a(b)). From (a), $z \subseteq \overline{S^{\smile}L}$; this implies $z \subseteq \overline{S^{\smile}XL}$ and $S^{\smile}X \subseteq \overline{z}$, so that, by definition of $\sqcap$,

$z \sqcap S^{\smile}X = (z \cap S^{\smile}X) \cup (z \cap \overline{S^{\smile}XL}) \cup (\overline{z} \cap S^{\smile}X) = z \cup S^{\smile}X$.

This means that 4 reduces to

(a) $X \sqsubseteq z \cup S^{\smile}X$.

By definition of refinement (7), this implies that $z \cup S^{\smile}XL \subseteq XL$; this is a variant of (a), thus having $XL = L$ as only solution. This means that any solution to 4 must be a total relation. But $L$ is total and in fact is the largest (by $\sqsubseteq$) total relation. It is also a solution to 4 (since by axiom a(d), $z \cup S^{\smile}L = L$) so that $L = \bigsqcup \{X \mid X \sqsubseteq z \sqcap S^{\smile} \mathbin{\Box} X\}$; that is, $L$ is the greatest fixed point in $(B_{L}, \sqsubseteq)$ of $f(X) := z \sqcap S^{\smile} \mathbin{\Box} X$. Now consider $\bigsqcap_{n \ge 0} (S^{\smile})^{n} \mathbin{\Box} z$, where $(S^{\smile})^{n}$ is here an $n$-fold demonic composition defined by $(S^{\smile})^{0} = I$ and $(S^{\smile})^{n+1} = S^{\smile} \mathbin{\Box} (S^{\smile})^{n}$. By axiom a(b), $S^{\smile}$ is deterministic, so that, by 10(c) and associativity of demonic composition, $(S^{\smile})^{n} \mathbin{\Box} z = (S^{\smile})^{n} z$. Hence,

It is easy to show that for any $n \ge 0$, $(S^{\smile})^{n} z$ is a point (it is the $n$-th successor of zero) and that $m \ne n \Rightarrow (S^{\smile})^{m} z \ne (S^{\smile})^{n} z$. Hence, in $(B_{L}, \sqsubseteq)$, $\{(S^{\smile})^{n} z \mid n \ge 0\}$ (i.e. $\{(S^{\smile})^{n} \mathbin{\Box} z \mid n \ge 0\}$) is the set of immediate predecessors of $\emptyset$; looking at [31] shows how the universal relation $L$ arises as the greatest lower bound $\bigsqcap_{n \ge 0} (S^{\smile})^{n} \mathbin{\Box} z$ of this set of points. Note that, whereas there is a unique solution to (a), there are infinitely many solutions to 4 (equivalently, to (a)), for example $\bigsqcup_{n \ge k} S^{n}$ $(= \bigcup_{n \ge k} S^{n})$, for any $k$.

For the upward approach, consider

$z^{\smile} \sqcup X \mathbin{\Box} S \sqsubseteq X$.

Here also there are infinitely many solutions to this inequation; in particular, any vector $v$, including $\emptyset$ and $L$, is a solution to 4. Because $(B_{L}, \sqsubseteq)$ is only a join semilattice, it is not at all obvious that the least fixed point of $h(X) := z^{\smile} \sqcup X \mathbin{\Box} S$ exists. It does, however, since the following derivation shows that $\bigsqcup_{n \ge 0} z^{\smile} \mathbin{\Box} S^{n}$ $(= \bigsqcup_{n \ge 0} h^{n}(z^{\smile})$, where $h^{0}(z^{\smile}) = z^{\smile})$ is a fixed point of $h$ and hence is obviously the least solution of 4. Because $z^{\smile}$ and $S$ are mappings, property 10(c) implies that $z^{\smile} \mathbin{\Box} S^{n} = z^{\smile}S^{n}$, for any $n \ge 0$. But $z^{\smile}S^{n}$ is also a mapping (it is the inverse of the point $(S^{\smile})^{n} z$) and hence is total, from which, by Proposition 8(a) and equation (a),

$\bigsqcup_{n \ge 0} z^{\smile} \mathbin{\Box} S^{n} = \bigsqcup_{n \ge 0} z^{\smile}S^{n} = \bigcup_{n \ge 0} z^{\smile}S^{n} = \left(\bigcup_{n \ge 0} (S^{\smile})^{n} z\right)^{\smile} = L^{\smile} = L$.

This means that $L$ is the least upper bound of the set of mappings $\{z^{\smile} \mathbin{\Box} S^{n} \mid n \ge 0\}$. Again, a look at [31] gives some intuition to understand this result, after recalling that mappings are minimal elements in $(B_{L}, \sqsubseteq)$ (though not all mappings have the form $z^{\smile} \mathbin{\Box} S^{n}$).

Thus, building $L$ from below using the set of mappings $\{z^{\smile} \mathbin{\Box} S^{n} \mid n \ge 0\}$ is symmetric to building it from above using the set of points $\{(S^{\smile})^{n} \mathbin{\Box} z \mid n \ge 0\}$.

5 Conclusion

We presented a theorem that can also be used to find the fixed points of functions of the form $f(X) := Q \vee P \mathbin{\Box} X$ (no restriction on the domains of $P$ and $Q$). This theorem can also be applied to program verification and construction (as in the preceding example). Half of this theorem (the $\Leftarrow$ direction) is also proved by Sekerinski (the main iteration theorem [29]) in a predicative programming set-up. Our theorem is more general because there is no restriction on the domains of the relations $P$ and $Q$.
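As a concrete companion to the fixed-point results summarized above, the following Python sketch evaluates the while-loop semantics $I(P) \circ [(P \vee Q)^{<} /\!\bullet P^{*}] \circ P^{*} \circ Q$ on a small finite state space and checks that the result is a fixed point of $w_{d}(X) = Q \vee P \mathbin{\Box} X$ whose domain is $a \mathbin{\Box} l$ (Theorem 17). It is a sketch under stated assumptions: relations are sets of pairs, monotypes are represented as plain sets of states, the data `P` and `Q` are invented for illustration and satisfy $P^{<} \wedge Q^{<} = 0$, and all function names are hypothetical. The code does not check that this fixed point is the greatest one.

```python
def compose(q, r):
    """Relational composition q ∘ r on sets of (source, target) pairs."""
    return frozenset((x, z) for (x, y) in q for (y2, z) in r if y == y2)


def dom(r):
    """Domain r^<, represented as a plain set of states."""
    return {x for (x, _) in r}


def restrict(states, r):
    """Monotype composition a ∘ r: keep the pairs of r that start in `states`."""
    return frozenset((x, y) for (x, y) in r if x in states)


def residual(target, r, universe):
    """Monotype residual target/•r: states whose r-successors all lie in target."""
    return {x for x in universe if all(y in target for (s, y) in r if s == x)}


def star(r, universe):
    """Reflexive transitive closure r*: grow id with r ∘ X until stable."""
    closure = frozenset((x, x) for x in universe)
    while True:
        bigger = closure | compose(r, closure)
        if bigger == closure:
            return closure
        closure = bigger


def initial_part(r, universe):
    """I(r): least fixed point of a ↦ a/•r, i.e. states with no infinite r-path."""
    safe = set()
    while True:
        grown = residual(safe, r, universe)
        if grown == safe:
            return safe
        safe = grown


def demonic_compose(q, r, universe):
    """Demonic composition q □ r = (r^< /• q) ∘ q ∘ r (Definition 9)."""
    return compose(restrict(residual(dom(r), q, universe), q), r)


# Hypothetical example data (an assumption, not taken from the paper).
U = range(6)
P = frozenset({(0, 1), (1, 3), (1, 4), (2, 2), (5, 3)})  # loop body; 2 loops forever
Q = frozenset({(3, 3)})                                  # exit step; 4 is a dead end
assert not dom(P) & dom(Q)                               # restriction P^< ∧ Q^< = 0

l = initial_part(P, U)                    # no infinite loop possible
a = residual(dom(P | Q), star(P, U), U)   # no abnormal termination possible
angelic = compose(star(P, U), Q)          # P* ∘ Q
W = restrict(l & a, angelic)              # I(P) ∘ [(P ∨ Q)^< /• P*] ∘ P* ∘ Q

assert W == Q | demonic_compose(P, W, U)  # W is a fixed point of w_d
assert dom(W) == a & l                    # W^< = a □ l
print(sorted(angelic), sorted(W))
```

In this example the angelic relation `P* ∘ Q` still relates states 0 and 1 to the exit state, whereas `W` drops them because the dead-end state 4 is reachable from both.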


The approach to demonic input-output relation presented here is not the only possible one. In [19, 20, 21], infinite looping has been treated by adding to the state space a fictitious state $\bot$ to denote nontermination. In [8, 18, 22, 26], the demonic input-output relation is given as a pair (relation, set). The relation describes the input-output behavior of the program, whereas the set component represents the domain of guaranteed termination.

We note that the preponderant formalism employed until now for the description of demonic input-output relation is the wp-calculus. For more details see [3, 4, 17].

References

[1] Backhouse, R. C. and Doornbos, H.: Mathematical Induction Made Calculational. Computing science note 94/16, Department of Mathematics and Computer Science, Eindhoven University of Technology, The Netherlands, 1994.

[2] Backhouse, R. C., Hoogendijk, P., Voermans, E. and van der Woude, J.: A Relational Theory of Datatypes. Research report, Department of Mathematics and Computer Science, Eindhoven University of Technology, The Netherlands, 1992.

[3] R. J. R. Back: On the correctness of refinement in program development. Thesis, Department of Computer Science, University of Helsinki, 1978.

[4] R. J. R. Back and J. von Wright: Combining angels, demons and miracles in program specifications. Theoretical Computer Science, 100, 1992, 365–383.

[5] Backhouse, R. C. and van der Woude, J.: Demonic Operators and Monotype Factors. Mathematical Structures in Comput. Sci., 3(4), 417–433, Dec. (1993). Also: Computing Science Note 92/11, Department of Mathematics and Computer Science, Eindhoven University of Technology, The Netherlands, 1992.

[6] Berghammer, R.: Relational Specification of Data Types and Programs. Technical report 9109, Fakultät für Informatik, Universität der Bundeswehr München, Germany, Sept. 1991.

[7] Berghammer, R. and Schmidt, G.: Relational Specifications. In C. Rauszer, editor, Algebraic Logic, volume 28 of Banach Center Publications. Polish Academy of Sciences, 1993.

[8] Berghammer, R. and Zierer, H.: Relational Algebraic Semantics of Deterministic and Nondeterministic Programs. Theoretical Comput. Sci., 43, 123–147 (1986).

[9] Boudriga, N., Elloumi, F. and Mili, A.: On the Lattice of Specifications: Applications to a Specification Methodology. Formal Aspects of Computing, 4, 544–571 (1992).

[10] Chin, L. H. and Tarski, A.: Distributive and Modular Laws in the Arithmetic of Relation Algebras. University of California Publications, 1, 341–384 (1951).

[11] Conway, J. H.: Regular Algebra and Finite Machines. Chapman and Hall, London, 1971.

[12] Davey, B. A. and Priestley, H. A.: Introduction to Lattices and Order. Cambridge Mathematical Textbooks. Cambridge University Press, Cambridge, 1990.

[13] J. Desharnais, B. Möller, and F. Tchier: Kleene under a demonic star. 8th International Conference on Algebraic Methodology And Software Technology (AMAST 2000), May 2000, Iowa City, Iowa, USA, Lecture Notes in Computer Science, Vol. 1816, pages 355–370, Springer-Verlag, 2000.

[14] Desharnais, J., Belkhiter, N., Ben Mohamed Sghaier, S., Tchier, F., Jaoua, A., Mili, A. and Zaguia, N.: Embedding a Demonic Semilattice in a Relation Algebra. Theoretical Computer Science, 149(2):333–360, 1995.


[15] Desharnais, J., Jaoua, A., Mili, F., Boudriga, N. and Mili, A.: A Relational Division Operator: The Conjugate Kernel. Theoretical Comput. Sci., 114, 247–272 (1993).

[16] Dilworth, R. P.: Non-commutative Residuated Lattices. Trans. Amer. Math. Soc., 46, 426–444 (1939).

[17] E. W. Dijkstra: A Discipline of Programming. Prentice-Hall, Englewood Cliffs, N.J., 1976.

[18] H. Doornbos: A relational model of programs without the restriction to Egli-Milner monotone constructs. IFIP Transactions, A-56:363–382. North-Holland, 1994.

[19] C. A. R. Hoare and J. He: The weakest prespecification. Fundamenta Informaticae IX, 1986, Part I: 51–84, 1986.

[20] C. A. R. Hoare and J. He: The weakest prespecification. Fundamenta Informaticae IX, 1986, Part II: 217–252, 1986.

[21] C. A. R. Hoare et al.: Laws of programming. Communications of the ACM, 30:672–686, 1986.

[22] R. D. Maddux: Relation-algebraic semantics. Theoretical Computer Science, 160:1–85, 1996.

[23] Mili, A., Desharnais, J. and Mili, F.: Relational Heuristics for the Design of Deterministic Programs. Acta Inf., 24(3), 239–276 (1987).

[24] Mills, H. D., Basili, V. R., Gannon, J. D. and Hamlet, R. G.: Principles of Computer Programming. A Mathematical Approach. Allyn and Bacon, Inc., 1987.

[25] Nguyen, T. T.: A Relational Model of Demonic Nondeterministic Programs. Int. J. Foundations Comput. Sci., 2(2), 101–131 (1991).

[26] D. L. Parnas: A Generalized Control Structure and its Formal Definition. Communications of the ACM, 26:572–581, 1983.

[27] Schmidt, G.: Programs as Partial Graphs I: Flow Equivalence and Correctness. Theoretical Comput. Sci., 15, 1–25 (1981).

[28] Schmidt, G. and Ströhlein, T.: Relations and Graphs. EATCS Monographs in Computer Science. Springer-Verlag, Berlin, 1993.

[29] Sekerinski, E.: A Calculus for Predicative Programming. In R. S. Bird, C. C. Morgan, and J. C. P. Woodcock, editors, Second International Conference on the Mathematics of Program Construction, volume 669 of Lecture Notes in Comput. Sci. Springer-Verlag, 1993.

[30] Tarski, A.: On the calculus of relations. J. Symb. Log. 6, 3, 1941, 73–89.

[31] F. Tchier: Sémantiques relationnelles démoniaques et vérification de boucles non déterministes. Doctoral thesis, Département de Mathématiques et de statistique, Université Laval, Canada, 1996.

[32] F. Tchier: Demonic semantics by monotypes. International Arab Conference on Information Technology (Acit2002), University of Qatar, Qatar, 16-19 December 2002.

[33] F. Tchier: Demonic relational semantics of compound diagrams. In: Jules Desharnais, Marc Frappier and Wendy MacCaull, editors. Relational Methods in Computer Science: The Québec seminar, pages 117-140, Methods Publishers 2002.

[34] F. Tchier: While loop demonic relational semantics monotype/residual style. 2003 International Conference on Software Engineering Research and Practice (SERP03), Las Vegas, Nevada, USA, 23-26 June 2003.

[35] F. Tchier: Demonic Semantics: using monotypes and residuals. IJMMS 2004:3 (2004) 135-160. (International Journal of Mathematics and Mathematical Sciences)

[36] M. Walicki and S. Meldal: Algebraic approaches to nondeterminism: An overview. ACM Computing Surveys, 29(1), 1997, 30-81.

[37] L. Xu, M. Takeichi and H. Iwasaki: Relational semantics for locally nondeterministic programs. New Generation Computing 15, 1997, 339-362.


CASE STUDIES IN THIN CLIENT ACCEPTANCE

Paul Doyle, Mark Deegan, David Markey, Rose Tinabo, Bossi Masamila, David Tracey
School of Computing, Dublin Institute of Technology, Ireland
WiSAR Lab, Letterkenny Institute of Technology
{paul.doyle, mark.deegan, david.markey}@dit.ie,{rose.tinabo, bossi.masamila}@student.dit.ie
david.tracey@lyit.ie

ABSTRACT
Thin Client technology boasts an impressive range of financial, technical and
administrative benefits. Combined with virtualisation technology, higher
bandwidth availability and cheaper high performance processors, many believe
that Thin Clients have come of age. But despite a growing body of literature
documenting successful Thin Client deployments there remains an undercurrent
of concern regarding user acceptance of this technology and a belief that greater
efforts are required to understand how to integrate Thin Clients into existing,
predominantly PC-based, deployments. It would be more accurate to state that
the challenge facing the acceptance of Thin Clients is a combination of
architectural design and integration strategy rather than a purely technical issue.
Careful selection of services to be offered over Thin Clients is essential to their
acceptance. Through an evolution of three case studies the user acceptance issues
were reviewed and resolved resulting in a 92% acceptance rate of the final Thin
Client deployment. No significant bias was evident in our comparison of user
attitudes towards desktop services delivered over PCs and Thin Clients.

Keywords: Thin Clients, Acceptance, Virtualisation, RDP, Terminal Services.

1 INTRODUCTION

It is generally accepted that in 1993 Tim Negris coined the phrase “Thin Client” in response to Larry Ellison’s request to differentiate the server centric model of Oracle from the desktop centric model prevalent at the time. Since then the technology has evolved from a concept to a reality with the introduction of a variety of hardware devices, network protocols and server centric virtualised environments. The Thin Client model offers users the ability to access centralised resources using full graphical desktops from remotely located, low cost, stateless devices. While there is sufficient literature in support of Thin Clients and their deployment, the strategies employed are not often well documented. To demonstrate the critical importance of how Thin Clients perform in relation to user acceptance we present a series of case studies highlighting key points to be addressed in order to ensure a successful deployment.

1.1 Research Aim
The aim of this research has been to identify a successful strategy for Thin Client acceptance within an educational institute. There is sufficient literature which discusses the benefits of Thin Client adoption, and while this was referenced it was not central to the aims of this research as the barrier to obtaining these benefits was seen to be acceptance of the technology. Over a four year period, three Thin Client case studies were run within the Dublin Institute of Technology with the explicit aim of determining the success factors in obtaining user satisfaction. The following data criteria were used to evaluate each case study in addition to referencing the Universal Theory of User Acceptance Testing (UTUAT) [1].

1) Login events on the Thin Clients.
2) Reservation of the Thin Client facility.
3) The cost of maintaining the service.

1.2 Paper Structure
In section 2 we review the historical background and trends of Thin Client technology to provide an understanding of what the technology entails. Section 3 discusses the case for Thin Clients within existing literature including a review of deployments within industry and other educational institutes. Section 4 provides details of the three case studies discussing their design, evaluating the results, and providing critical analysis. Section 5 takes a critical look at all of the data and sections 6 and 7 provide conclusions and identify future work. This paper is aimed at professionals within educational institutes seeking ways to realize the benefits of Thin Client computing while maintaining the support and acceptance of users. It provides a balance between


the hype of Thin Clients and the reality of their deployment.

2 THIN CLIENT EVOLUTION

The history of Thin Clients is marked by a number of overly optimistic predictions that it was about to become the dominant model of desktop computing. In spite of this there have been a number of marked developments in this history, along with those of desktop computing in general, which are worth reviewing to set the context for examining the user acceptance of this technology. Thin Clients have established a role in desktop computing although not quite the dominant one initially predicted. These developments have usually been driven by increases in processing power (and reductions in processor costs) in line with Moore's law, but the improvements in bandwidth and storage capacity are having an increasing effect on desktop computing and on Thin Client computing [2], driving the move towards more powerful lower cost desktops but also the possibilities of server virtualisation and Thin Client computing with the ability to run Thin Clients over WANs.

The first wave of computing was one where centralised mainframe computers provided the computing power as a shared resource which users accessed using dumb terminals which provided basic text based input and output and then limited graphics as they became graphics terminals. These mainframes were expensive to purchase and were administered by specialists in managed environments and mostly used for specific tasks such as performing scientific calculations and running highly specialised bespoke payroll systems.

The next wave was that of personal computing, whereby users administered their own systems which provided a platform for their personal applications, such as games, word-processor, mail and personal data. Since then the personal computer has undergone a number of significant changes, but the one of most interest was the nature of the interface provided to the user, which has grown into a rich Graphical User Interface where the Personal Computer became a gateway to the Internet with the Web browser evolving into a platform for delivery of rich media content, such as audio and video.

This move from a mainframe centralised computing model to a PC distributed one resulted in a number of cost issues related to administration. This issue was of particular concern for corporate organizations, in relation to licensing, data security, maintenance and system upgrades. For these cost reasons and the potential for greater mobility for users, the use of Thin Clients is often put forward as a way to reduce costs using the centralised model of the Thin Client architecture. This also offers lower purchase costs and reduces the consumption of energy [3].

The challenge faced by Thin Client technology is to deliver on these lower costs and mobility, while continuing to provide a similarly rich GUI user experience to that provided by the desktop machine (a challenge helped by improved bandwidth, but latency is still often a limiting factor [4]) and the flexibility with regard to applications they have on their desktop. Typically, current Thin Client systems have an application on a server (generally Windows or Linux) which encodes the data to be rendered into a remote display protocol. This encoded data is sent over a network to a Thin Client application running on a PC or a dedicated Thin Client device to be decoded and displayed. The Thin Client will send user input such as keystrokes to the application on the server. The key point is that the Thin Client does not run the code for the user's application, but only the code required to support the remote display protocol.

While the term Thin Client was not used for dumb terminals attached to mainframes in the 1970's, the mainframe model shared many of the attributes of Thin Client computing. It was centralised, the mainframe ran the software application and held the data (or was attached to the data storage) and the terminal could be shared by users as it did not retain personal data or applications, but displayed content on the screen as sent to it by the mainframe. From a desktop point of view, the 1980's were dominated by the introduction and adoption of the Personal Computer.

Other users requiring higher performance and graphics used Unix Workstations from companies like Apollo and Sun Microsystems. The X Window System [5] was used on many Workstations and X terminals were developed as a display and input terminal and provided a lower cost alternative to a Unix Workstation, with the X terminal connecting to a central machine running an X display manager. As such, they shared some of the characteristics of a Thin Client system, although the X terminal ran an X Server making it more complicated than Thin Client devices.

The 1990's saw the introduction of several remote display protocols, such as Citrix's ICA [6], Microsoft's RDP [7] and AT&T's VNC [8] for Unix, that took advantage of the increasing bandwidth available on a LAN to provide a remote desktop to users.

Terminal Services was introduced as part of Windows NT4.0 in 1996 and it offered support for the Remote Desktop Protocol (RDP) allowing access to Windows applications running on the Server, giving users access to a desktop on the Server using an RDP client on their PC. RDP is now offered on a range of Windows platforms [9]. Wyse and vendors such as Ncomputing launched terminals, which didn't run the Windows operating system, but accessed Windows applications on a Windows Server using RDP, which is probably still the


dominant role of dedicated hardware Thin Clients. Similarly VNC is available on many Linux and Unix distributions and is commonly used to provide remote access to a user's desktop. These remote display protocols face increasing demands for more desktop functionality and richer media content, with ongoing work required in how, where and when display updates are encoded, compressed or cached [10]. Newer remote display protocols such as THINC have been designed with the aim of improving these capabilities [11].

In 1999, Sun Microsystems took the Thin Client model further with the SunRay, which was a simple network appliance, using its own remote display protocol called ALP. Unlike some of the other Thin Clients which ran their own operating system, SunRay emphasized its completely stateless nature [12]. This stateless nature meant that no session information or data was held or even cached (not even fonts) on the appliance itself and enabled its session mobility feature, whereby a smart card was used to identify a user with a session so that with the smart card the user could log in from any SunRay connected to the session's server and receive the desktop as it was previously.

Many of these existing players have since focused on improving their remote desktop protocols and support for multimedia or creating new hardware platforms. There have also been some newer arrivals like Pano Logic and Teradici who have developed specific client hardware to create “zero” clients, with supporting server virtualisation to render the remote display protocols. Also, there are a number of managed virtual desktops hosted in a data centre now being offered.

One of the drivers behind Thin Client Technology, particularly when combined with a dedicated hardware device, is to reduce the cost of the client by reducing the processing requirement to that of simply rendering content, but a second driver (and arguably more important one) is to gain a level of universality by simplifying the variations in the client side environment. This has been met in a number of new ways using Virtual Machine players and USB memory in Microsoft's research project “Desktop on a Keychain” (DOK) [13] and also the Moka5 product [14], allowing the mobility (and security) benefits attributed to Thin Clients. This can be enhanced with the use of network storage to cache session information [15].

It can be seen that Thin Clients have evolved along with other desktop computing approaches, often driven by the same factors of increasing processing power, storage capacity and bandwidth. However, newer trends that are emerging with regard to virtualisation, internet and browser technologies, together with local storage, present new challenges and opportunities for Thin Client technology to win user acceptance. As Weiser said in 1999 in this new era, “hundreds or thousands of computers do our bidding. The relationship is the inverse of the mainframe era: the people get the air conditioning now, and the nice floors, and the computers live out in cyberspace and sit there waiting eagerly to do something for us”. [16]

3 THE CASE FOR THIN CLIENTS

There are many stated benefits for Thin Clients, all of which are well documented [17][18]. While there is no single definitive list, and potential system designers may have different aims when considering Thin Clients, these benefits should be clearly understood prior to embarking on any deployment and are discussed below.

3.1 Reduced cost of software maintenance
The administrative cost benefit of the Thin Client model, according to Jern [19], is based on the simple observation that there are fewer desktop images to manage. With the combination of virtualisation environments and Windows Terminal Services (WTS) systems it would not be uncommon for twenty five or more desktop environments to be supported from a single installation and configuration. This reduces the number of upgrades and customizations required for desktop images in computer laboratories where the aim is to provide a consistent service from all systems. Kissler and Hoyt [20] remind us that the “creative use of Thin Client technology can decrease both management complexity and IT staff time.” In particular they chose Thin Client technology to reduce the complexity of managing a large number of kiosks and quick-access stations in their new thirty three million dollar library. They have also deployed Thin Client devices in a range of other roles throughout Valparaiso University in Indiana. Golick [21] on the other hand suggests that the potential benefits of a Thin Client approach include the lower mean time to repair (MTTR) and lower distribution costs. It is interesting to note that he does suggest that the potential cost savings for hardware are a myth, but that administration savings still make a compelling case for using Thin Client technology.

3.2 Enhanced Security
Speer and Angelucci [22] suggest that security concerns should be a major factor in the decision to adopt Thin Client systems and this becomes more apparent when referencing the Gartner Thin Client classification model. The Thin Client approach ensures that data is stored and controlled at the data centre hosting the Thin Client devices. It is easy to argue that the user can retain the mobility of laptops but with enhanced security, as the data is not mobile, just the access point. The argument is even easier to make when we consider recent high-profile cases of the theft of unencrypted laptops containing sensitive medical or financial records. The freedom
conferred on users of corporate desktop and laptop PCs undermines the corporation’s obligations in relation to data privacy and security. Steps taken to protect sensitive data on user devices are often too little and too late. Strassmann [23] states that the most frequent use of a personal computer is for accessing web applications and that the Thin Client model demonstrates significantly lower security risks for the corporation. Five security justifications for adopting the Thin Client model were proposed.

1) Zombie Prevention
2) Theft Dodging
3) File Management
4) Software Control
5) Personal Use Limitations

Strassmann concedes that Thin Clients are not necessarily best for every enterprise and every class of user, but for enterprises with a large number of stationary “non-power” users, “Thin Clients may present the best option in terms of security, cost effectiveness and ease of management.”

3.3 User Mobility
User mobility can refer to the ability of a user to use any device, typically within the corporation’s intranet, as a desktop where the user will see a consistent view of the system, for example, SunRay hot-desking. While user profiles in Microsoft Windows support this, it is often only partially implemented. Session mobility can be viewed as the facility for users to temporarily suspend or disconnect their desktop session and to have it reappear, at their request, on a different device at a later time. This facility removes the need for users to log out or to boot up a desktop system each time they wish to log in. Both of these potential features of Thin Client technologies help to break the sense of personal ownership that users often feel for their desktop or laptop computers. It is this sense of personal ownership which makes the maintenance and replacement of corporate PCs a difficult task, and this feeling of ownership and control is often a reason why users resist the adoption of a centrally controlled Thin Client to replace their desktop, whereas this is exactly why IT management may want to adopt it.

3.4 Environmental Costs
In the article “An Inefficient Truth” Plan [24] reveals a series of “truths” supported by a number of case studies directed at the growing costs of Information and Communication Technologies. One such case study is of Reed Managed Services where 4,500 PCs were replaced with Thin Clients, and a centralised blade server providing server based virtualised desktops. Savings are reported as follows:

1) 5.4 million kWh reduction
2) 2,800 tonnes of CO2 saved annually
3) Servers reduced by a factor of 20
4) IT budget cut by a fifth

Indeed there are many deployments focused on obtaining energy savings through the use of Thin Clients. In a case study where SunRay systems were introduced into Sparkasse, a public German bank, Bruno-Britz [25] reports that the savings in electricity costs alone were enormous. The University of Oxford has deployed SunRay Thin Client devices in their libraries citing the cooler and quieter operation as factors in their decision. These devices, having no local hard disk and no fan, operate at a lower temperature and more quietly than traditional PCs. This characteristic has environmental implications from noise, cooling and power consumption perspectives.

3.5 Summary of Benefits
In summary, we can extract the benefits observed within literature and case studies as follows:

1) Increased security as data is maintained centrally
2) Reduced cost of hardware deployment and management and faster MTTR
3) Reduced administration support costs
4) Environmental cost savings
5) Reduced cost of software maintenance
6) Reduced cost of software distribution
7) Zero cost of local software support
8) The ability to leverage existing desktop hardware and software
9) Interface portability and session mobility
10) Enhanced Capacity planning
11) Centralised Usage Tracking and Capacity Planning

3.6 Thin Clients vs. Fat Clients
Thin Client technology has evolved in sophistication and capability since the middle of the 1990s; however, the “thickness” (the amount of software and administration required on the access device) of the client is a source of distinction for many vendors [26][27]. Regardless of “thickness”, Thin Clients require less configuration and support when compared to Fat Clients (your typical PC). In the early 1990s Gartner provided a client-server reference design shown in Figure 1. This design provides clarity for the terms “thin” and “fat” clients by viewing applications in terms of the degree of data access, application and presentation logic present on the server and client sides of the network. The demand for network based services such as email, social networking and the World Wide Web has driven bandwidth and connectivity requirements to higher and higher levels of reliability and performance [28]. As we progress to an “always on”
network infrastructure the arguments focused against Thin Clients based on requiring an offline mode of usage are less relevant. The move from Fat Client to Thin Client is however often resisted as individuals find themselves uncomfortable with the lack of choice provided when the transition is made, as observed by Wong et al. [29].

Figure 1: Gartner Group Client/Server Reference Design

4 CASE STUDIES

No matter how well documented the benefits of Thin Clients may be, there is still an issue of acceptance to be addressed. While it may be tempting to assume that the implementation of technology is a technical issue and that simply by building solutions a problem is effectively solved, evidence would point to the contrary, as there can often be a disparity between what is built and what is required or needed. Too often requirements gathering, specification definition and user consultation are forgotten in the rush to provide new services which are believed to be essential. In essence the notion of “if we build it they will come” is adopted, inevitably causing confusion and frustration for both service provider and the user. For example, during Sun Microsystems’ internal deployment of its own SunRay Thin Client solution many groups and functions sought exemptions from the deployment as they believed that their requirements were sufficiently different to the “generic user” to warrant exclusion from the project. The same arguments still exist today and it is often those with a more technical understanding of the technology who are the agents of that technology’s demise. By providing interesting and often creative edge cases which identify the limitations of a technology, they can, by implication, tarnish it as an incomplete and flawed technology. In the case of Thin Clients, it should be accepted that there are tradeoffs to be made. One of the appealing aspects of the Fat Client is its ability to be highly flexible, which facilitates extensive customization. However, not every user will require that flexibility and customization. Thin Clients are not going to be a silver bullet addressing all users' needs all of the time.

All three case studies were evaluated under the following headings in order to allow a direct comparison between each. These criteria were selected to ensure that there was a balance between the user acceptance of the technology and the technical success of each deployment.

1) Login events on the Thin Clients
2) Reservation of the Thin Client facility
3) The cost of maintaining the service

Figure 2: Case Study 1

4.1 DIT Case Study 1
In 2005 the DIT introduced the SunRay Thin Client technology into the School of Computing. In a similar approach to many other technology deployments the strengths of the technology were reviewed and seen as the major selling points of the deployment. In the case of SunRay there was a cheap appliance available which would provide the service of graphical based Unix desktops. Centralised administration ensured that the support costs would be low and the replacement requirements for systems for the next five years would be negligible. In essence the technological and administrative advantages were the focus of this deployment. Few of the services offered within the existing PC infrastructure were included in the deployment. This deployment sought to offer new services to students and introduced Thin Clients for the first time to both students and staff.

4.1.1 Design
A single laboratory was identified for deploying the SunRay systems and all PCs in that lab were replaced with SunRay 150 devices. A private network interconnect was built which ensured that all data sent from the clients traversed a private network to the SunRay server. The initial design of this case study is shown in Figure 2 and it allowed students within this new Thin Client lab access to the latest version of Solaris using a full screen graphical environment as opposed to an SSH command-line Unix shell which was the traditional method still used from existing computing laboratories. A new authentication system was introduced based on LDAP which required students to have a new username and password combination which was different to the credentials already in use within the Active Directory domain used for the existing PC network. The reason for this alternative authentication process was the difficulty of authenticating on a Unix system using Active Directory. Once the server was running, the Thin Client laboratory was ready to provide graphical based Unix login sessions at a considerably reduced price when compared to an investment of Unix workstations for each desk. In total 25 Thin Client devices were installed which were all connected to a single Solaris server. In summary the key components within the design were as follows:

1) The service was on a private network
2) New authentication process was introduced
3) New storage mechanism was introduced
4) Devices were all in the same location
5) Service provided was a CDE desktop on Solaris
6) Graphical desktops running on Linux servers also accessible

4.1.2 Results
The login events are a measure of the general activity of the devices themselves and were considered to be a reasonable benchmark for comparison with existing laboratories within the institute. One interesting point is that the comparison of facilities is not necessarily relevant when the facilities provide different services. The fact that Unix instead of Windows was provided meant that, with the exception of those taking courses involving Unix, the majority of students were unfamiliar with the technology and did not seek to use the systems.

Login events on the Thin Clients:
The login events were extracted from the Solaris server by parsing the output of the last command which displays the login and logout information for users which it extracts from the /var/adm/wtmpx file. The number of login events per day was calculated and plotted in the graph shown in Fig. 3. Immediately obvious was the low use of the system. Given that the nature of the service did not significantly change over the course of the three years that the system was in place with the exception of semester activity in line with student presence in the institute, it is clear that there was low utilization of the service. The graph shows raw data plotted, where login events were less than 10 per day.

Figure 3: User Login Events (login events per day, Feb 05 to Feb 08)

Reservation of the Thin Client Facility:
Each laboratory may be reserved by staff for the delivery of tutorial sessions and exercises. The hourly reservations for this laboratory were reduced as a result of the introduction of Thin Clients with only 1 to 2 hours being reserved per day. One of the primary reasons for the reduction in the use of this facility was the fact that it had now become special purpose and the bookings for the room were limited to the courses which could be taught within it.

The Cost of Maintaining the Service:
A detailed analysis of cost savings associated with the introduction of Thin Clients within our institute and specifically the costs associated with this case study was performed by Reynolds and Gleeson [30]. In their study they presented evidence of savings in relation to the cost of support, the cost of deployment and a basic analysis of the power consumption costs. They review both the system and the software distribution steps associated with Thin Clients and PC systems and present a point of quantifiable comparison between the two. Key findings of this analysis were as follows:

1) Time spent performing system upgrades and hardware maintenance was reduced to virtually zero as no hardware or software upgrades were required.
2) A single software image was maintained at the central server location and changes were made available instantly to all users.
3) No upgrade costs were incurred on the Thin Clients or server hardware. All systems have
remained in place throughout both case studies. The devices in this lab are now 8 years old and are fulfilling the same role today as they did when first installed.
4) The Thin Client lab is a low power consumption environment due to the inherent energy efficiency of the Thin Client hardware over existing PCs. This can provide up to 95% energy savings when compared to traditional PCs [24].

4.1.3 Analysis
There has been extensive research in the area of user acceptance of technology, but perhaps the most relevant work in this area is the Unified Theory of Acceptance and Use of Technology (UTAUT) [1] which identifies four primary constructs or factors:

a) Performance Expectancy
b) Effort Expectancy
c) Social Influence
d) Facilitating Conditions

While there are additional factors such as Gender, Age and Experience, within the student populations these are for the most part reasonably consistent and will be ignored. It should be stressed that although the UTAUT was developed for an industry based environment it is easily adapted for our purposes. It was felt that this model serves as a relevant reference point when discussing the performance of the case studies.

Clearly Case Study 1 failed to gain acceptance despite belief that it would in fact be highly successful at its inception. We review the case study under the four UTAUT headings to identify the source of the user rejection of the Thin Clients.

a) Performance Expectancy
This factor is concerned with the degree to which the technology will assist in enhancing a user's own performance. Clearly, however, the services provided an advantage to those students who wished to use Unix systems. Since the majority of courses are based on the Windows operating system it would be reasonable to assume that there was no perceived advantage in using a system which was not 100% compatible with the productivity applications used as part of the majority of courses.
b) Effort Expectancy
This factor is concerned with the degree of ease associated with the use of the system. One of the clear outcomes of Case Study 1 was that students rejected the Unix systems as it was seen to be a highly complex system, requiring additional authentication beyond what was currently used in traditional laboratories.
c) Social Influence
This is defined as the degree to which there is a perception of how others will view or judge them based on their use of the system. Clearly by isolating the devices and having them associated with specialized courses, there was no social imperative to use the labs. Unix as a desktop was relatively uncommon in the School at the time of the case study and there would have been a moderate to strong elitist view of those who were technical enough to use the systems.
d) Facilitating Conditions
This is defined as the degree to which an individual believes in the support for a system. At first glance this does not appear to be a significant factor considering that the services were created by the support team and there was considerable vested interest in seeing it succeed. However additional questions asked by the UTAUT include the issue of compatibility with systems primarily used by the individual.

Each of the UTAUT factors can be considered significant for Case Study 1. Many of the issues raised hang on the fundamental issue that the new services offered on the Thin Client were different to existing services and for all practical purposes seen as incompatible with the majority of systems available to students elsewhere. The fact that the technology itself may have worked flawlessly, and may have delivered reduced costs, was irrelevant as the service remained underutilized. Given that the reason for this lack of acceptance was potentially inherent in the implementation of services and not due to failings in the technology itself, it was clear that a second case study was required which would address the issue of service.

Figure 4: Case Study 2
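
The login-event counts reported for Case Study 1 were obtained by parsing the output of the last command. The following is a minimal sketch of that kind of parsing, not the authors' script: the sample text, the user and host names, and the function name logins_per_day are all hypothetical, and the line layout (user, terminal, host, then a "Mon Feb  7 10:15" style start time) is an assumption about typical last output.

```python
import re
from collections import Counter

# Matches the start timestamp of a session, e.g. "Mon Feb  7 10:15".
_STAMP = re.compile(
    r"\b(Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +(\d{1,2}) \d{2}:\d{2}"
)

# Pseudo-entries that are not user logins (assumed names).
_SKIP = {"reboot", "shutdown", "wtmp", "wtmpx"}


def logins_per_day(last_output: str) -> Counter:
    """Count login events per (month, day) in the text printed by `last`."""
    counts = Counter()
    for line in last_output.splitlines():
        fields = line.split()
        if not fields or fields[0] in _SKIP:
            continue  # blank line, reboot record or trailer line
        match = _STAMP.search(line)
        if match:
            counts[(match.group(2), int(match.group(3)))] += 1
    return counts


# Hypothetical sample in the assumed layout.
SAMPLE = """\
alice     pts/3        sunray01         Mon Feb  7 10:15 - 11:02  (00:47)
bob       pts/4        sunray02         Mon Feb  7 11:30 - 12:10  (00:40)
alice     pts/3        sunray01         Tue Feb  8 09:05 - 09:50  (00:45)
reboot    system boot                   Tue Feb  8 08:00

wtmp begins Tue Feb  1 08:00
"""

if __name__ == "__main__":
    for (month, day), n in sorted(logins_per_day(SAMPLE).items()):
        print(month, day, n)
```

Because last does not print the year, a multi-year record such as the one plotted in Fig. 3 would need the year to be tracked separately; the sketch leaves that out.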

4.2 Case Study 2
The second case study is a modification of the basic implementation of the first case study with changes focused on increasing student acceptance of the Thin Client facility. Removing the Unix centric nature of the existing service was central to the system redesign. It was decided that additional services could be easily and cheaply offered to the Thin Client environment providing users with the ability to access more compatible services from within the Thin Client environment. Figure 4 identifies the key components within the design.

4.2.1 Design
The most important addition to the second case study was the provision of additional services which were similar to those available in PC labs. This was to ensure that students could use this facility and have an experience on a par with the PC labs. A new domain was created where Unix and Windows shared a common authentication process. Due to difficulties integrating Unix and the existing Windows authentication process, the new Domain was built on the LDAP system with SAMBA providing the link between the new Windows Terminal Servers and the LDAP system. While students could now use the same username and password combination for Windows and Unix systems this was not integrated into the existing Windows authentication process. Students were still required to have two sets of credentials, the first for the existing PC labs, and the second for access to a new domain containing a number of Windows Terminal Servers and the original graphical Unix desktop. While the Thin Clients now provided Windows and Unix graphical desktops, the new Windows Domain was also accessible from existing PC labs via RDP connections to the Terminal Servers. This allowed classes to be scheduled either inside or outside of the Thin Client laboratory. In addition to providing Windows Terminal Services (WTS), student owned virtual machines were now also available. Due to the fact that most services were now available from all locations, the ease of access to the services from within the Thin Client lab was improved by providing users with a menu of destinations upon login. This new login script effectively provided a configurable redirection service to the WTS and Virtualisation destinations using the rdesktop utility [31] which performed a full screen RDP connection to specified destinations. An interesting outcome of this destination chooser was that any RDP based destination could be included regardless of the authentication process used. This would however require a second authentication process with the connecting service. The new services provided were as follows:

a) A general purpose Windows Terminal Server with mounted storage for all students and staff.
b) Course specific Windows Terminal Servers for courses where there were specific software requirements not common to all students.
c) Individual Virtualised desktops for students in specific modules where administration rights were required.
d) All services were made available from both the Thin Client and PC labs as they were available over the Remote Desktop Protocol (RDP).
e) Provisioning of an easy access point to all services from within the Thin Client environment which was not available from PC systems.

4.2.2 Results
The data gathered for Case Study 2 was evaluated under the same three headings as Case Study 1.

1) Login events on the Thin Clients
2) Reservation of the Thin Client facility
3) The cost of maintaining the service

Figure 5: User Login Event Comparison (login events per day, 08 Feb to 05 Apr, Case Study 1 and Case Study 2)

Login events on the Thin Clients:
Figure 5 shows a comparison of activity during the same time period for the two case studies. To identify trends in the data a displacement forward moving average was performed on the data as shown in Eq. (1).

(1)

It is clear that for the same time period there was a significant increase in the use of the system as the number of login events increased by a factor of 4. Once again the login events were extracted from the Solaris server by parsing the output of the last command.

Reservation of the Thin Client Facility:
The changes to the Thin Client facility were announced at the start of the second academic semester as a PC upgrade and the number of room bookings increased as shown in Figure 6 from 6 hours a week to 20 hours a week. This was due to the use of the room as a Windows based laboratory
using the new WTS and virtualisation services.

Figure 6: Thin Client Room Reservations (hours per day, Monday to Friday, Case Study 1 and Case Study 2)

The Cost of Maintaining the Service:
All of the benefits observed from the first case study were retained within this case study. The addition of terminal services reduced the reliance of students on Fat Client installations. Students are now using virtual machines and terminal servers on a regular basis from all labs.

4.2.3 Analysis
This second case study certainly saw an improvement over its earlier counterpart and students and staff could now access more familiar services from the Thin Client lab. Given the dramatic increase relative to the earlier results it could be stated that the introduction of the more familiar services increased the acceptance of the facility. Both case studies demonstrated equally well that it is possible to obtain the total cost of ownership benefits using a Thin Client model, but the services offered have a dramatic effect on user acceptance. It is useful to review the outcome in relation to the UTAUT.

a) Performance Expectancy
Given that new services such as personalised virtual machines were now available, staff and students could identify a clear advantage to the system where administration rights could be provided in a safe manner, allowing more complex and previously unsupported activities to take place. For example, MSc students taking the Advanced Internet module could now build and administer full web servers which could remain private to the student, ensuring that no other student could access or modify a project which was a work in progress.
b) Effort Expectancy
Considerable improvements were made in this case study to allow users to access well known environments from both the Thin Clients and PC systems. Students who were taught modules using the new WTS or virtual environments were trained on how to access the systems, and once they used them they continued to do so throughout the year. Those who did not have modules being taught using these new services were still required to go through a new login/access process which was not well documented. For example within the Thin Client labs the new username/password combination was required to access the choice of destinations from the devices. This acted as a barrier to use even though emails were sent to students and information on how to access these accounts was posted in the labs. Usernames were based on existing student ID numbers.
c) Social Influence
Little changed in this case study for those who did not have a teaching requirement based on the new services.
d) Facilitating Conditions
With the provision of WTS services and virtual machines which provided Windows environments the issue of compatibility was reduced. However two issues remained which were not addressed. Firstly, while users could now share a common data store between systems on this new domain, there was no pre-packaged access to the data store on the existing PC domain. While it was technically possible to combine both under a single view, this required user intervention and additional training which was not provided. Secondly, the sequence of steps required to access choices from the Thin Clients was a non-standard login process which now required a second login, the first of which was at a Unix graphical login screen. For many this initial login step remained as a barrier to using the system.

The most striking result from this case study is that while the second case study demonstrated a significant increase in acceptance and use, the PC environments remained the system of choice for students, as shown in Figure 7. In this graph we show the typical use of a PC laboratory within the same faculty. Thin Client use remained less than one third of the use of the busiest computer laboratory. Thin Clients are shown to be capable of providing services equally well to both Windows and Unix users by introducing the ability of students to access their own private desktop from many locations; however, this feature alone was not enough to entice users from the existing PC infrastructure. Clearly the introduction of virtualisation to the infrastructure allowed new services to be developed and used from Thin and Fat clients which could be seen as a potential for migrating users to a Thin Client/Virtualisation model, which indeed is a future planned initiative. The results show a definite increase in the use of the Thin Client facilities with data being gathered from the same period over both case studies to eliminate any bias which might occur due to module schedule
differences at different time periods during the year. The timing and method used to announce the changes were critical to the increase in acceptance. The announcement of the systems as a PC upgrade removed some of the barriers which existed for users who did not feel comfortable with a Unix environment but failed to attract a majority of the students.

[Figure 7 chart: login events per day (0 to 80), 08 Feb to 05 Apr, for PC Lab 1, Case Study 2 and Case Study 1]
Figure 7: Comparison with PC Computer Labs

4.3 Case Study 3

The third case study was designed using the experiences of the first two case studies and was extended beyond the School of Computing. It was aimed at demonstrating the capability of the Thin Client technology in two different demographic environments: the first was one of the Institute Libraries, where PCs were used by students from many different faculties, and the second was within the Business faculty, where computer system use was provided in support of modules taught within that faculty. This case study expressed the following aims at the outset:
1) To demonstrate the use of Thin Client technology within the student population and determine the level of student acceptance of that technology.
2) To implement a number of alternative technologies in order to provide a point of comparison with respect to their overall performance and acceptance.
3) To determine the capability of the existing network infrastructure to support Thin Clients.

4.3.1 Design
Unlike the previous case studies the aim was to insert Thin Clients into the existing environment as invisibly as possible. This meant that existing authentication processes were to be maintained. There were two different authentication processes in place which needed to be supported, Novell Client for the Business faculty and Active Directory for the Library. In both cases a WTS system was built which joined the respective domain. Applications were installed on the Thin Client in order to mirror those that were present on existing PCs in the chosen locations. It was essential that the Thin Clients were not identifiable by students if at all possible, and that they were co-located with existing PC systems. To ensure that all devices behaved in a manner consistent with PCs, they had to boot and present the same login screen as would be expected on a PC in the same location. To achieve this all Thin Client devices with the exception of the SunRay systems used a Preboot Execution Environment (PXE) [32] boot process to connect to a Linux Terminal Server Project (LTSP) server. The server redirected the user session to the correct WTS using rdesktop, where the user was presented with a Windows login screen identical to those on adjacent PC systems.

The SunRay systems were run in Kiosk mode, which allowed the boot sequence to redirect the session to a WTS, also via the rdesktop utility. The WTS were installed on a VMWare ESX Server to allow rollback and recovery of the servers. This however was not central to the design of the case study and only served as a convenience in sharing hardware resources between multiple servers. The only concern was the potential performance of the WTS under a virtualised model. Given that the applications were primarily productivity applications such as word processing and browsing, and that the maximum number of users allowable on any WTS was 25 (based on the number of devices which were directly connected to the WTS), this was considered to be within the acceptable performance range of the architecture. This assumption was tested prior to the case study being made accessible to students, with no specific issues raised that would warrant further restructuring of the architecture.

Seventy five Thin Clients were deployed in six locations. The Thin Client devices used are shown in Figure 8 and Table 1.

Figure 8: Case Study 3
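
The redirection step in this design, where a booted client is handed to the Windows Terminal Server for its location through rdesktop, can be pictured with a small sketch. It is only an illustration under stated assumptions: the host names, location keys and function names below are hypothetical and do not come from the case study, and the only details taken from the text are the use of rdesktop and the ceiling of 25 users per WTS. The `-f` option asks rdesktop for a full-screen session.

```python
# Hypothetical sketch: pick the WTS for a Thin Client's location and build
# the rdesktop command that the booted client would launch.
MAX_USERS_PER_WTS = 25  # ceiling stated in the text

# Assumed mapping of locations to terminal servers (all names are invented).
WTS_BY_LOCATION = {
    "library": "wts-library.example.org",
    "business-lab-1": "wts-business-1.example.org",
    "business-lab-2": "wts-business-2.example.org",
}


def rdesktop_command(location):
    """Return the argument list for a full-screen RDP session at a location."""
    host = WTS_BY_LOCATION[location]
    return ["rdesktop", "-f", host]


def within_capacity(devices_by_location):
    """Check that no WTS would serve more devices than the stated ceiling."""
    load = {}
    for location, count in devices_by_location.items():
        host = WTS_BY_LOCATION[location]
        load[host] = load.get(host, 0) + count
    return all(n <= MAX_USERS_PER_WTS for n in load.values())
```

For example, `rdesktop_command("library")` yields the argument list for a full-screen session on the assumed Library server, and `within_capacity` reports any mapping that would place more than 25 devices on one server.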


Table 1: Thin Clients deployed

Device            Boot Mode      Quantity
Dell GX260        PXE Boot PC    15
Dell FX 160       PXE Boot TC    25
HP T5730          PXE Boot TC    8
Fujitsu FUTRO S   PXE Boot TC    2
SunRay 270        SunRay         25

4.3.2 Linux Terminal Server Project

LTSP works by configuring PCs or suitable Thin Clients to use PXE-Boot to obtain the necessary kernel and RDP client used as part of this project. These are obtained from a TFTP server whose IP address is provided as a DHCP parameter when the client PXE-Boots. As part of the DHCP dialogue, devices configured to PXE-Boot are given settings by the DHCP server. These include the TFTP Boot Server Host Name and the Bootfile Name.

The necessary settings were configured on each of the DHCP servers serving the relevant locations within the DIT so as to point any PXE-Boot devices to the relevant LTSP boot server and to specify the kernel to be loaded by the PXE-Boot client. Using these settings the PXE-Boot clients load a Linux kernel and then an RDP client which connects to one of the three WTS used as part of this case study.

[Figure 9 chart: login events per day (0 to 140), 17 Apr to 15 May, for Library, Lab 2 and Lab 1]
Figure 9: User Login Event Comparison

4.3.3 Results
Use of the Thin Clients was recorded using login scripts on the Windows Terminal Servers which recorded login and logout events. As expected the use of the Library systems exceeded the use of the laboratories but both were in line with typical use patterns expected for each location. What was immediately obvious was that each location had a higher utilization than the previous two case studies but comparable with the PC labs shown in Figure 9. One of the difficulties with the comparison however is that the final case study was performed at a different point in the teaching semester and use of the systems declined as students prepared for examinations. Lab 1 was a “quiet lab” located remotely from the primary labs within the Business faculty and traditionally did not have high use. Lab 2 was a more central location and again as expected this exhibited greater user activity. The systems remained in operation continually for the period of the case study, which was over one month, during which data was collected from the three WTS systems.

4.3.4 User Survey

Once the case study was running, a desktop satisfaction survey which employed the Likert scale [33] was conducted to obtain feedback from students using the Thin Client systems. The design of the questionnaire was such that students were asked to identify their desktop using a colour coded system which was known only to the authors. Each of the Thin Clients and a selection of PC systems (which were not PXE booted) were targeted for the survey to allow a comparative analysis between all Thin Clients and existing PC systems to be performed. The survey did not reference Thin Clients in any of the questions but rather sought feedback on application use and overall satisfaction with the performance of the system through a series of questions. There were 234 responses recorded for the survey. The key questions in the survey were as follows.

1) Please rate the overall performance of the machine you are currently using
2) Please identify the primary reason you used this computer
3) How would you rate your overall satisfaction with this desktop?
4) Would you use this desktop computer again?

[Figure 10 chart: user satisfaction ratings (60% to 80%) for All Applications and Browsers, by device: PC-Fat Client, SunRay, PXE Boot PC, HP TC, Dell TC]
Figure 10: User satisfaction rating of desktop performance

The issue of overall performance was broken down by the device used, which was identified using the colour coded scheme described earlier. Figure 10 represents the average rating of satisfaction reported by users broken down by device and primary application in use. Since over 50% of responses identified “Browsing” as the primary reason for using the machine there are two satisfaction ratings provided as a point of comparison. Figure 11 shows the combined rating of users' responses to overall satisfaction with desktop, desktop performance and application performance.
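
Per-device averages of the kind plotted in these figures can be derived from raw questionnaire data with a few lines of code. The sketch below is an illustration rather than the authors' analysis: it assumes a five-point Likert scale and that a rating is expressed as a percentage of the scale maximum, neither of which is stated in the paper, and the sample responses and function name are invented.

```python
from collections import defaultdict

LIKERT_MAX = 5  # assumption: five-point scale (not stated in the paper)


def mean_satisfaction(responses):
    """Return the mean rating per group as a percentage of the scale maximum.

    `responses` is an iterable of (group, rating) pairs, where the group is
    the device type and the rating is an integer from 1 to LIKERT_MAX.
    """
    totals = defaultdict(lambda: [0, 0])  # group -> [sum of ratings, count]
    for group, rating in responses:
        totals[group][0] += rating
        totals[group][1] += 1
    return {g: 100.0 * s / (n * LIKERT_MAX) for g, (s, n) in totals.items()}


# Invented sample data: (device identified by colour code, rating)
sample = [("PC", 4), ("PC", 5), ("SunRay", 4), ("SunRay", 3), ("Dell TC", 4)]
ratings = mean_satisfaction(sample)
```

Grouping by a (device, primary application) tuple instead of the device alone would give the kind of split between all applications and browsers that the survey reports.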
[Figure 11 chart: user satisfaction ratings (66% to 80%) by device: PC, SunRay, PXE Boot, HP, DELL]
Figure 11: Combined rating of desktop performance

4.3.5 Analysis
This final case study, while shorter in length than the other case studies, demonstrated significant progress in user acceptance. As part of the survey users were asked if they would consider reusing the system and as can be seen in Figure 12 there was significant support for the systems.

The small number of responses representing those who did not wish to reuse the system cited USB performance as the primary cause of their dissatisfaction. It was identified early in the testing of the Thin Clients that all systems performed noticeably slower than the PC systems in this respect. Questions regarding the primary storage method used by students were added to the survey, as was a satisfaction rating. From the results in Figure 13 it is clear that while the PC systems did perform better when users primarily used USB storage, the satisfaction in storage performance for all other options was comparable. The HP satisfaction rate had a low survey response rate and hence was not considered significant in our analysis given the small number of data points.

[Figure 12 chart: YES 92%, NO 8%]
Figure 12: User Response "Would you use this system again"

[Figure 13 chart: user satisfaction ratings (40% to 100%) for Non-USB Storage and USB Only, by device: PC, SunRay, PXE Boot, HP, Dell]
Figure 13: Storage Satisfaction Rating

By making the Thin Clients as invisible as possible and comparing satisfaction and user access to the existing PC systems it was clear that for the majority of users there was no apparent change to the services provided. Integrating into the existing authentication process was an essential feature of this case study, as was the presenting of a single authentication process at the WTS login screen. Efforts were also made to ensure that the applications installed on the WTS were configured to look and feel the same as those on the standard PC. As with the previous case studies it is useful to review the case study in relation to the UTUAT.

a) Performance Expectancy
With the exception of increasing the number of desktops in the Library, the primary deployment mainly replaced existing systems, so users were not provided with any reminders that they were using a different system. In effect there was no new decision or evaluation by the user to address the questions which were relevant in the previous case studies.
b) Effort Expectancy
The reuse of the existing login/access procedure, which was well known and part of the normal process for students using existing PC systems, again allowed for this factor to become mainly irrelevant. Usernames, passwords, applications and system behaviour were identical to those on the PCs.
c) Social Influence
Without perceiving a difference in service, social influence as a factor was also eliminated. Only the SunRay systems had different keyboards and screens, and as these screens were of higher resolution than existing PCs they were if anything seen as a more popular system.
d) Facilitating Conditions
Unlike the previous case studies support for the facility was more complex. Different levels of expertise and engagement were required. Thin


Clients were now part of a larger support structure where many individuals were not core members of the technical team who built the systems. However given that only three support calls were raised during the case study there was little pressure on this factor either. The calls raised were not in fact directly related to the Thin Client devices, but rather the network and the virtual environments used to host the centralised servers.

5 CRITICAL ANALYSIS

The UTUAT provides a useful reference point in understanding some of the factors affecting acceptance of the Thin Clients. In the first case study the primary barrier to acceptance was the incompatibility of the new system with the existing system. Students were not motivated to use the new system as there were few advantages to doing so and considerable effort in learning how to use it. The second case study, while more successful, still failed to gain acceptance despite the expansion of services offered being comparable with existing Windows services. The session mobility and access from anywhere feature, while useful, did not overcome the resistance of users to migrate to the Thin Clients. Thin Clients still required separate credentials and the login process was still different to the PC systems. The third and final case study was designed to provide the same existing services as the PC, only using a centralised server and Thin Client model. No new services for the user were provided. The primary aim was to have the systems indistinguishable from the existing installation of PCs, effectively running a blind test for user acceptance. Once the users accepted the new systems, further machines could be deployed quickly and cheaply. The total cost of ownership and centralised support savings demonstrated in the first two case studies were just as relevant in the third case study.

6 CONCLUSION

There is considerable literature in support of Thin Client technology, and while there may be debate regarding the finer points of its advantages the issue has been and continues to be one of acceptance. Acceptance for Thin Clients as a technology is often confused with the non-technical issues arising from the deployment. The UTUAT helps distinguish between technical and non-technical issues and, as shown within our case studies, the way in which the technology was presented to the user had a higher impact on acceptance than had the technology itself. This point is highlighted by the fact that the Thin Client devices which were not widely used in the first case study were integrated seamlessly into the third case study. These three case studies provide data centric analysis of user acceptance and identify the evolving designs of our deployments. To gain acceptance of Thin Clients within an educational institute our case studies identified these key factors.

1) Locate the Thin Clients among the existing PC systems, do not separate them or isolate them.
2) Ensure that the login process and credentials used are identical to the existing PC systems.
3) Ensure that the storage options are identical to the existing PC systems.
4) Focus on providing exactly the same services that already exist as opposed to focusing on new services.

By ensuring we ran a blind test on the user population where Thin Clients co-existed with PC systems, and where the services offered were indistinguishable by the user, we were able to show a user satisfaction rating of 92%. No significant bias was evident in our comparison of user attitudes of desktop services delivered over PCs and Thin Clients.

7 FUTURE WORK

Additional case studies are planned which will focus on acceptance of Thin Clients within the academic staff population and will evaluate the relevance of some of the proposed core technological advantages within that environment such as session mobility, Desktop as a Service, and dynamic lab reconfiguration and remote access using WAN and not just LAN environments.

8 REFERENCES

[1] V. Venkatesh, M.G. Morris, G.B. Davis, and F.D. Davis, “User acceptance of information technology: Toward a unified view,” MIS Quarterly, 2003, pp. 425-478.
[2] J.D. Northcutt, “CYB Newslog - Toward Virtual Computing Environments.”
[3] D. Tynan, “Think thin,” InfoWorld, Jul. 2005.
[4] S.J. Yang, J. Nieh, M. Selsky, and N. Tiwari, “The Performance of Remote Display Mechanisms for Thin-Client Computing,” in Proceedings of the 2002 USENIX Annual Technical Conference, 2002.
[5] T. Richardson, F. Bennett, G. Mapp, and A. Hopper, “Teleporting in an X window system environment,” IEEE Personal Communications Magazine, vol. 1, 1994, pp. 6-13.
[6] Citrix Systems, “Citrix MetaFrame 1.8 Backgrounder,” Jun. 1998.


[7] Microsoft Corporation, “Remote Desktop Protocol: Basic Connectivity and Graphics Remoting Specification,” Technical White Paper, Redmond, 2000.
[8] T. Richardson, Q. Stafford-Fraser, K. Wood, and A. Hopper, “Virtual network computing,” Internet Computing, IEEE, vol. 2, 1998, pp. 33-38.
[9] Microsoft Corporation, “Microsoft Windows NT Server 4.0, Terminal Server Edition: An Architectural Overview,” Jun. 1998.
[10] J. Nieh, S.J. Yang, and N. Novik, “A comparison of thin-client computing architectures,” Network Computing Laboratory, Columbia University, Technical Report CUCS-022-00, 2000.
[11] R.A. Baratto, L.N. Kim, and J. Nieh, “Thinc: A virtual display architecture for thin-client computing,” Proceedings of the twentieth ACM symposium on Operating systems principles, ACM New York, NY, USA, 2005, pp. 277-290.
[12] B.K. Schmidt, M.S. Lam, and J.D. Northcutt, “The interactive performance of SLIM: a stateless, thin-client architecture,” Proceedings of the seventeenth ACM symposium on Operating systems principles, Charleston, South Carolina, United States: ACM, 1999, pp. 32-47.
[13] M. Annamalai, A. Birrell, D. Fetterly, and T. Wobber, Implementing Portable Desktops: A New Option and Comparisons, Microsoft Corporation, 2006.
[14] “MokaFive, Virtual Desktops,” http://www.mokafive.com/.
[15] R. Chandra, N. Zeldovich, C. Sapuntzakis, and M.S. Lam, “The Collective: A cache-based system management architecture,” Proceedings of the 2nd USENIX Symposium on Networked Systems Design and Implementation (NSDI’05).
[16] M. Weiser, “How computers will be used differently in the next twenty years,” Security and Privacy, 1999. Proceedings of the 1999 IEEE Symposium on, 1999, pp. 234-235.
[17] M. Jern, “"Thin" vs. "fat" visualization clients,” Proceedings of the working conference on Advanced visual interfaces, L'Aquila, Italy: ACM, 1998, pp. 270-273.
[18] S. Kissler and O. Hoyt, “Using thin client technology to reduce complexity and cost,” Proceedings of the 33rd annual ACM SIGUCCS conference on User services, ACM New York, NY, USA, 2005, pp. 138-140.
[19] M. Jern, “"Thin" vs. "fat" visualization clients,” L'Aquila, Italy: ACM, 1998, pp. 270-273.
[20] S. Kissler and O. Hoyt, “Using thin client technology to reduce complexity and cost,” New York, NY, USA: ACM, 2005, pp. 138-140.
[21] J. Golick, “Network computing in the new thin-client age,” netWorker, vol. 3, 1999, pp. 30-40.
[22] S.C. Speer and D. Angelucci, “Extending the Reach of the Thin Client,” Computers in Libraries, vol. 21, 2001, pp. 46 - .
[23] P.A. Strassmann, “5 Secure Reasons for Thin Clients,” Baseline, 2008, p. 27.
[24] G.A. Plan, “An inefficient truth,” PC World, 2007.
[25] M. Bruno-Britz, “Bank Sheds Pounds,” Bank Systems & Technology, vol. 42, 2005, p. 39.
[26] “Sun Ray White Papers,” http://www.sun.com/sunray/whitepapers.xml.
[27] B.K. Schmidt, M.S. Lam, and J.D. Northcutt, “The interactive performance of SLIM: a stateless, thin-client architecture,” Charleston, South Carolina, United States: ACM, 1999, pp. 32-47.
[28] S. Potter and J. Nieh, “Reducing downtime due to system maintenance and upgrades,” San Diego, CA: USENIX Association, 2005, pp. 6-6.
[29] I. Wong-Bushby, R. Egan, and C. Isaacson, “A Case Study in SOA and Re-architecture at Company ABC,” 2006, p. 179b.
[30] G. Reynolds and M. Gleeson, “Towards the Deployment of Flexible and Efficient Learning Tools: The Thin Client,” The Proceedings of the 4th China-Europe International Symposium on Software, Sun Yat-Sen University, Guangzhou, China, 2008.
[31] “rdesktop: A Remote Desktop Protocol client.”
[32] B. Childers, “PXE: not just for server networks anymore!,” Linux J., vol. 2009, 2009, p. 1.
[33] R. Likert, “A Technique for the Measurement of Attitudes,” Archives of Psychology, vol. 140, 1932, pp. 1–55.


AN INTERACTIVE COMPOSITION OF
WORKFLOW APPLICATIONS BASED ON UML
ACTIVITY DIAGRAM

Yousra Bendaly Hlaoui, Leila Jemni Ben Ayed


Research Unit of Technologies of Information and Communication
Tunis, Tunisia
Yousra.bendalyhlaoui@esstt.rnu.tn
Leila.jemni@fsgt.rnu.tn

ABSTRACT
In today's distributed applications, semi-automatic and semantic composition of
workflows from Grid services is becoming an important challenge. We focus in
this paper on how to interactively model and compose workflow applications from
Grid services without considering lower-level descriptions of the Grid environment.
To reach this objective, we propose a Model-Driven Approach for developing such
applications based on semantic and syntactic descriptions of the services available
on the Grid, together with the abstract description provided by the UML activity
diagram language. As there are particular needs for modeling composed workflows
interactively from Grid services, we propose to extend the UML activity diagram
notation. These extensions deal with additional information allowing an interactive
and semi-automatic composition of workflows. In addition, this domain specific
language contains appropriate data to describe matched Grid services that are
useful for the execution of the obtained workflows.

Keywords: Grid services, interactive, semantic, composition, workflow application, UML activity diagrams.

1 INTRODUCTION

Today’s distributed applications [23] are developed by integrating web or Grid services [13, 14] in a workflow. Due to the very large number of available services and the existence of different possibilities for constructing a workflow from matching services, building such applications is usually a non-trivial task for a developer. This problem requires finding and orchestrating appropriate Grid services in a workflow. Therefore, we propose an approach that allows semi-automatic and interactive composition of workflow applications from Grid services. To describe and model workflow applications we use UML [25] activity diagrams. Recently, several solutions were proposed to compose applications from Grid services, such as the works presented in [8, 17, 18]. However, the proposed solutions need interaction with the user and guidelines or rules in the design of the composed applications. Consequently, the resulting source code is neither re-usable nor does it promote dynamic adaptation facilities as it should. For applications composed of Grid services, we need an abstract view not only of the offered services but also of the resulting application [31]. This abstraction on the one hand allows the reuse of the elaborated application and on the other reduces the complexity of the composed applications. There are several architectural approaches for distributed computing applications [22] which ease the development process. However, these approaches need rigorous development methods to promote the reuse of components in future Grid application development [16]. Past experience has shown that using structured engineering methods eases the development process of any computing system and reduces the complexity of building large Grid applications [22].

To reduce this complexity and allow the reuse of Grid service applications, we adopt a model-driven approach [24]. Thus we introduce in this paper a new approach to build workflow applications interactively by following the OMG's principles of the MDA in the development process [2, 3, 4].

In this approach [2, 3, 4], our focus is to compose and model workflows from existing Grid services, which represents the main aspect in the development of Grid service applications. The workflow modeling identifies the control and data flows from one depicted Grid service's operation to the next to build and compose the whole application. To model and express the composed workflow of Grid services, we use as abstract language the


activity diagrams of UML [25]. The provided model forms the Platform Independent Model (PIM) of the proposed MDA approach. This model is more understandable for the user than XML [35] based workflow description languages like BPEL4WS [15], which represent the Platform Specific Model (PSM).

This paper is organized as follows. Section 2 presents the related work. Section 3 introduces the different components of the composition system. Section 4 specifies our proposed UML profile, composition patterns and the different steps of the interactive composition process. Finally, Section 5 concludes the paper and proposes areas for further research.

2 RELATED WORK

Many works were carried out in the field of Grid and Web services composition, such as the works presented in [8, 17, 18, 19, 20, 28, 29, 30]. In [28] the authors were interested in the semi-automatic composition of web services and proposed a validation approach based on the semantic descriptions of services and on a logic based language to describe and validate the resulting composite Web services. However, the resulting composed web service is not clear for a user who is not familiar with logic based languages. In our contribution, we propose a solution not only to compose workflows from available Grid services, but also to provide graphical and comprehensive models of the resulting workflows. In the same framework, the authors in [29] proposed a composition approach for Web services based on Symbolic Transition Systems (STS). They developed a sound and complete logical approach for identifying the existence of an available composition. They emphasized the abstract representation of the composition request (the goal of the composition) and the representation of the resulting composite Web service. For the representation, the authors used UML state machine diagrams [25], which are suitable only to describe a sequence of component services without addressing the other forms of matching services in a workflow such as parallel branches or and-branches. On the other hand, the UML activity diagrams that we use in our modelling approach support all kinds of workflow composition patterns [10] such as parallelism, split and fork. The authors in [19, 20, 30] proposed a Model Driven Approach for composing Web services manually. They relied on UML activity diagrams to describe the composite Web service and on UML class diagrams to describe each available Web service. The user depicts the suitable Web service and matches it in the workflow representing the composite Web service using UML activity diagrams. This approach would have been better if the composition were automatically elaborated, since the number of available services is increasing, along with the existence of several forms and manners to compose such services.

Based on a domain ontology description, we lead the user through the composition process. Also, we provide for this user a graphical interface based on a domain specific UML language for automatic Grid service composition. This UML profile [5] is based on stereotypes, tagged values and workflow patterns [5] that we propose to ensure the automatic composition. In the field of Grid services composition the most closely related work is that presented by Gubala et al. in [8, 17, 18]. In this work, the authors developed a tool for semi-automatic and assisted composition of scientific Grid application workflows. The tool uses domain specific knowledge and employs several levels of workflow abstractness in order to provide a comprehensive representation of the workflow for the user and to lead him in the process of possible solution construction, dynamic refinement and execution. The originality of our contribution is that firstly we save the user the effort of dynamic refinement and execution, as we propose a Model Driven Approach which separates the specific model from the independent model.

Secondly, we use UML activity diagrams to deliver the functionality in a more natural way for the human user. The use of UML activity diagrams in the description of workflow applications is argued in several works such as those presented in [1, 10, 12, 27]. Thus, the advantage of UML activity diagrams is that they provide an effective visual notation and facilitate the analysis of workflow composition.

In our approach, we propose a UML profile for systematically composing a workflow application from Grid services [5].

3 THE INTERACTIVE WORKFLOW COMPOSITION SYSTEM

The system allows an interactive and semantic composition of workflows from Grid services. As shown in figure 1, the system is composed of three components: a Grid Services workflows composer, an ontological Grid Services registry and a workflows execution system, which we also call the activity machine.

3.1 The Grid services workflow composer
This system is composed of three components: the composition tool, the transformation tool and the verification tool.

3.1.1 The composition tool
It provides a graphical interface in the form of a UML activity diagram editor allowing the user an interactive, systematic and semantic workflow


Figure 1: Different components of the workflow composition system

Composition [6]. This composition is based on the translates the activity diagram into a Hyper-Graph
composition process which will be detailed in (HG). This HG will be translated as well by the
section 4.3. In the Grid registry, services are transformation tool into a NuSMV format file
described in an ontological form with statements according to a relative semantic. The details of
regarding the service operation's inputs, outputs, these semantics may not be relevant to the topic for
pre-conditions and effects (the IOPE set) [26]. which the paper is submitted. However these details
Through these notions, the composition system is could be made available.
able to match different grid service’s operations
into a workflow following a reverse traversal 3.1.3 The verification tool
approach. Thus, and by associating the required Checking errors in design models like UML
data with the produced output, the composer activity diagrams is essential since correcting an
constructs a data flow between Grid service’s error while the system is alive and running is
operations using our workflow composition usually very costly [21]. Consequently, workflow
patterns and UML profile[5]. The composer may activity diagram models should be spotted and
also use a specific notion of effect that may bind corrected as early as possible [6].
two operations together with non-data dependency. Several techniques are used in the field of
If the Grid registry fails to find the right operation, behavioural design verification such as theorem
the composition process stops. Otherwise, the proving and model checking [11]. The latter is the
composition process will stop when all workflow most useful because it is fully automatic and gives
dependencies are resolved. feedback in case of detected errors. It verifies
The request is sent to the Ontological Grid whether some given finite state machine satisfies
Registry in the form of SPARQL query [34]. This some given property specified in temporal logic [9].
language provides a higher-level access to the For activity diagrams, symbolic model checking
ontology transcribed knowledge for the automatic has proven to be an efficient verification technique
discovery and semantic matching of services. [11]. Thus, our verification tool is based on
Therefore, once the workflow model is built, it NuSMV symbolic model checker [9] that supports
should be validated and verified to ensure its strong fairness property which is necessary to be
reliability before being executed and reused as sub- verified in a workflow model to obtain realistic
workflow. results. With the model checker, arbitrary
propositional requirements can be checked against
3.1.2 The transformation tool the input model. If a requirement fails to hold, an
To support the verification and the execution of error trace is returned by the model checker. The
workflow models described in UML activity transformation tool translates systematically the
diagrams (UML-AD), the transformation tool error trace into an activity diagram trace by high-

UbiCC Journal – Volume 4 No. 3 601


Special Issue on ICIT 2009 Conference - Applied Computing

lighting a corresponding path in the given activity


diagram.

3.2 The Grid registry
During the workflow composition process, the Grid registry provides the composer system with the description of the services currently available, and it provides reasoning capabilities that enable proper matchmaking of service inputs and outputs.
The Grid registry [6] is an ontological distributed repository of Grid services and sub-workflows. This registry is responsible for storing and managing documents which contain descriptions of the syntax and semantics of services and their operations, expressed in an RDF file [33]. The semantic Web is making available technologies which support automated knowledge sharing. In particular, several existing initiatives such as OWL-S [26] show that ontologies have a key role in automating service discovery and composition. That knowledge is based on semantic descriptions of service classes published by the service developers and provided in the Grid environment [16]. Our Grid registry is based on an ontological description of services and workflows.
The service ontology [7] provides concepts and properties that allow the description and matchmaking of available services. A part of this ontology is common to all services and is based on the standard semantic web service description ontology OWL-S [26], which ensures interoperability with existing services. Apart from the common ontology, there is a domain-specific part of the ontology. The domain service ontology [7] allows users to extend the common ontology schema in order to provide a better specification of services as well as of their inputs and outputs. For these we define a data ontology [7], which provides concepts and properties for describing service inputs and outputs.
Ontology alignment [7] is a process for finding semantic relationships among the entities of ontologies. Its main activity is to find similar concepts in the ontologies being aligned, in order to map them. The measures for similarity computation can be divided into two general groups, namely lexical measures and structural measures. Lexical measures are based on surface similarity, such as the title or label of entities. In contrast, structural measures try to recognize similarities by considering the structures of the ontology graphs. The most advanced similarity algorithms combine multiple similarity measures to obtain more information about concept similarity. In our Grid registry, we adopt an approach using a combination of lexical and structural similarity [7].
We use similarity measures for mapping the domain ontology as an initial selection, and the selection is then refined using the structural similarity method [7].

3.3 The workflow execution system
The reliable workflow model is sent to the workflow execution system [6], which produces implementation code for handling control flow and data flow. The activity diagram describing the workflow model is translated into a specific XML file which is the input of the execution system. A workflow execution system executes the different workflow activities specified in the workflow XML document in the correct order and with their required input and output data. The execution of an activity corresponds to the invocation of a Grid service's operation. The workflow execution system monitors these activities using the tagged-value information expressed in the activities, but it does not perform them. An activity of the activity diagram modelling the workflow represents a state of the workflow execution system in which the system waits for an invoked Grid service operation to complete its work. Hence, the semantics of activity diagrams defined for the verification describe the behaviour of the execution system. When the system enters a state corresponding to a Grid service invocation node or activity a_i, it invokes a piece of behaviour that is executed by the service or by the system environment. While the system is in a_i (activity a_i is active), it waits for the termination event of the invoked piece of behaviour. When the termination event occurs, the system reacts by executing the outgoing edge E: it leaves E's sources and enters E's targets, and the execution process continues for the other activity nodes until the final node is reached.

4 UML BASED INTERACTIVE COMPOSITION OF WORKFLOWS FROM GRID SERVICES

In order to match and compose different Grid service operations, we need to analyze the constructs of workflow models at a higher abstraction level. Since UML [25] is the core of the MDA [24], we use its activity diagram language to model composed workflows. The composition system provides the user with a graphical interface to compose a request using a UML profile specific to the domain of systematically composing workflows from Grid services.

4.1 UML Profile for composing workflows
In this section, we present our UML profile, which is based on a Domain Specific Language (DSL) for customizing UML activity diagrams for the systematic composition of workflows from Grid services [5].
In our DSL (see Figure 2), an activity of a UML activity diagram represents a Grid service's operation, while object flows represent the types of results which flow from one activity to another.


Effects binding two operations are represented by control flows [5].
The name of an activity in the diagram represents the name of the Grid service's operation. This name must be specified because a Grid service could have more than one operation (together often called its interface), all of which are specified in its WSDL file [32].
There are two different types of activities: yet-unresolved activities and established activities of the composed workflow. The former represent the need for a Grid service's operation to be inserted in order to complete the workflow. The latter represent abstract operations that are already included in the workflow.
As there are two different activity types in a Grid service workflow model, an activity needs to be typed and specified. To this end, we propose to use the DSL modelling element invoke to stereotype an established activity, which is used to invoke an external Grid service's operation, and yet-unresolved to stereotype activities which are not yet resolved. Object nodes of an established activity are stereotyped as data. Unknown inputs and outputs of a yet-unresolved activity are stereotyped as unknown.
In our UML profile, an object node can be related to a final node, as the composed workflow of a Grid application should always deliver a result.

4.2 UML-AD composition patterns
We identify, in this section, how UML activity diagrams support some of the basic Grid service composition patterns [5]. These patterns are essential in the systematic building of workflow applications from Grid services. The use of these patterns depends on the number of retrieved Grid service operations and on their inputs and outputs [5]. These operations are the results of the semantic search performed by the ontological Grid service registry. This search is triggered by a request issued by the composition system in order to complete an unresolved activity in the workflow. The Grid service registry provides zero, one or more operations producing the intended output. The retrieved operations are inserted into the workflow interactively with the user.

4.2.1 Sequence Pattern
When the Grid registry provides one Grid service's operation that is able to produce the required result, or the user selects one operation from the provided operation set, the composition system uses the sequence pattern to insert the operation in the workflow. In this case, as illustrated in Figure 3, a single abstract operation or activity (e.g. GridService1Operation1) is inserted in the workflow model described in the UML-AD language. This operation may also require some data for itself (e.g. GridService1Operation1Input) and thus it may introduce a new unresolved dependency (e.g. the yet-unresolved stereotyped activity). So, we use a follow-up method to build a simple pipeline-type sequential workflow: a sequence pattern.

Figure 2: Meta-model of Grid service workflow composition specific UML activity diagram language
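The stereotypes of the profile described in Section 4.1 can be pictured as a small data model. The following Python sketch is only an illustration under stated assumptions: the class names, the `is_well_typed` helper and its rule are hypothetical and are not taken from the meta-model of Figure 2; only the stereotype names (invoke, yet-unresolved, data, unknown) and the example identifiers come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ActivityStereotype(Enum):
    INVOKE = "invoke"                  # established activity: invokes a Grid service operation
    YET_UNRESOLVED = "yet-unresolved"  # placeholder for an operation still to be found


class ObjectStereotype(Enum):
    DATA = "data"        # object node of an established activity
    UNKNOWN = "unknown"  # not yet known input/output of a yet-unresolved activity


@dataclass
class ObjectNode:
    name: str
    stereotype: ObjectStereotype


@dataclass
class Activity:
    name: str  # name of the Grid service operation
    stereotype: ActivityStereotype
    inputs: List[ObjectNode] = field(default_factory=list)
    outputs: List[ObjectNode] = field(default_factory=list)

    def is_well_typed(self) -> bool:
        """Assumed rule: every object node of an invoke activity is data.

        Yet-unresolved activities are not constrained in this sketch,
        because their object nodes may be either data or unknown.
        """
        if self.stereotype is ActivityStereotype.INVOKE:
            nodes = self.inputs + self.outputs
            return all(n.stereotype is ObjectStereotype.DATA for n in nodes)
        return True
```

A yet-unresolved activity would then be replaced by an invoke activity once the registry has returned a suitable operation.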


A sequence pattern is composed of sequential activities which are related by a control flow (non-data operation dependency) or an object flow (data operation dependency).

Figure 3: The sequence pattern

4.2.2 And-branches pattern
The and-branches pattern is introduced when the inserted operation, represented by an abstract UML activity, has more than one input. This pattern is based on the Synchronization pattern presented in [9].
This pattern starts with object nodes, representing alternative operation inputs, which flow to a join node. The latter is linked to the abstract Grid service's operation. This operation introduces some unresolved dependencies in the workflow. Semantically, several service instances are invoked in parallel threads and the join waits for all flows to finish. As illustrated in Figure 4, the operation GridService1Operation1 of the Grid service needs two input data, GridService1Operation1Input1 and GridService1Operation1Input2. The corresponding pattern produces two parallel threads in the workflow.

Figure 4: And-branches pattern

4.2.3 Alternative branches pattern
When the Grid registry provides more than one operation able to produce the required result, and the user does not select one of them, the composition requires a specific pattern: the alternative branches pattern.
This pattern combines the Exclusive Choice and Simple Merge patterns presented in [9]. In this pattern, each alternative service's operation is linked to an object node representing the required input, and all of them flow to a merge construct. Semantically, several service instances are invoked in parallel threads and the merge only waits for the first flow to finish. We distinguish, in Figure 5, two different Grid service operations, GridService1Operation1 and GridService2Operation1, providing the same output data DataOutput.

Figure 5: Alternative branches pattern

4.2.4 Alternative services pattern
When composing workflows from Grid services, a specific matching based on semantic comparison could provide two or more different Grid services, each of them performing the required operation. In such a case, and when the user does not choose one of the retrieved Grid service operations, the composition system uses the alternative services pattern to include the operations in the workflow model.
In this pattern, the Grid service's operation to insert is modelled by a composite super-activity with a specified input data object and a specified output data object (Figure 6). The super-activity is stereotyped as AlternativeServiceInstance to indicate that its task may be accomplished by a set of alternative service instances. These alternative service instances are described by sub-activities. The sub-activities are Grid service instances and are thus stereotyped as invoke. It is up to the decision mechanism of the workflow execution engine to choose which service instance is invoked and executed at such a workflow node. In Figure 6, the data DataOutput is provided by the GridServiceOperation service operation, whose provider could be GridService1Operation1 or GridService2Operation2.
Figure 6: Alternative services pattern
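The way the composer picks among the four patterns of Section 4.2 can be summarised as a small decision function. The Python sketch below is a hedged illustration, not the authors' implementation: the `Operation` record and `choose_pattern` are hypothetical names, and the rule used to separate alternative services from alternative branches (all candidates share the same inputs) is an assumption made here, since the text does not state the criterion explicitly.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Operation:
    """Hypothetical record for an operation returned by the Grid registry."""
    name: str
    inputs: Tuple[str, ...]
    output: str


def choose_pattern(candidates: Sequence[Operation],
                   user_choice: Optional[Operation] = None) -> str:
    """Return the name of the composition pattern to apply."""
    if not candidates:
        return "none"  # registry found nothing: composition stops
    if user_choice is not None or len(candidates) == 1:
        # Exactly one operation is inserted.
        op = user_choice if user_choice is not None else candidates[0]
        return "and-branches" if len(op.inputs) > 1 else "sequence"
    # Several operations and no user choice: insert all of them.
    # Assumption: identical input sets mean interchangeable service instances.
    same_inputs = len({frozenset(op.inputs) for op in candidates}) == 1
    return "alternative-services" if same_inputs else "alternative-branches"
```

For instance, a single candidate with two inputs would lead to the and-branches pattern, while two candidates with different inputs and no user selection would lead to the alternative branches pattern.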


4.3 The composition process


Figure 7 illustrates the scenario of the composition process of workflows from available Grid services. This composition is based on the domain-specific UML activity diagram language presented in Section 4.1. In the following, we comment on the different process steps of the scenario presented in Figure 7.

Step 1: The user builds a composition request by specifying what kind of outcome or result is expected from the execution of the workflow application.

Step 2: The composition system analyses the desired output and sends a SPARQL query to the Grid registry, whose ontologies describe the available Grid services. The composer asks the Grid registry for a Grid service's operation having the specified result as output.

Step 3: If the required operation is not found, or all unknown results are resolved, then the composition process stops.

Step 4: If the required operation is found, then the system displays its characteristics to the user to confirm the choice. The registry may provide more than one operation. In such a case the user can choose, from the given list, the operation to insert in the workflow model. If the user does not select an operation, then the system inserts all the given operations using one of the composition patterns presented in Section 4.2. The composer chooses the right composition pattern according to the number of retrieved operations and their inputs and outputs.

Step 5: For each input of an inserted operation, the system defines one unresolved dependency as a workflow activity which is not yet established. This activity depends on some Grid service's operation. For each unresolved dependency the composer asks the user whether to continue the composition process. If the response is positive, the composer re-executes the process from Step 2 to resolve the current unresolved dependency.

Figure 7: Scenario of the interactive composition of Grid service workflows based on UML activity diagrams
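The steps of this scenario amount to a backward traversal from the requested result towards the inputs. The following Python sketch illustrates that loop under several assumptions: the registry is replaced by a plain in-memory dictionary rather than a SPARQL-queried ontology, the interactive user choice is replaced by taking the first candidate, and the operations `TrafficFlowSimulator` and `PathLengthCalculator` are hypothetical names invented for the example. Only `AirPollutionEmissionCalculator` and the data names come from the illustration in Section 5.

```python
from collections import deque
from typing import Dict, List, Sequence, Set, Tuple

# (operation name, names of its input data); a stand-in for registry answers.
Candidate = Tuple[str, Sequence[str]]
Registry = Dict[str, List[Candidate]]
Step = Tuple[str, Tuple[str, ...], str]  # (operation, inputs, produced output)


def compose(requested: str, registry: Registry,
            user_inputs: Set[str]) -> Tuple[List[Step], List[str]]:
    """Return the inserted operations and any dependencies left unresolved."""
    workflow: List[Step] = []
    pending = deque([requested])  # Step 1: the requested result
    resolved: Set[str] = set()
    while pending:
        data = pending.popleft()
        if data in resolved or data in user_inputs:
            continue  # already produced, or supplied directly by the user
        candidates = registry.get(data, [])  # Step 2: ask the registry
        if not candidates:  # Step 3: nothing found, composition stops
            return workflow, [data, *pending]
        name, inputs = candidates[0]  # Step 4: stand-in for the user's choice
        workflow.append((name, tuple(inputs), data))
        resolved.add(data)
        pending.extend(inputs)  # Step 5: one new dependency per input
    return workflow, []


registry: Registry = {
    "PollutionEmission": [("AirPollutionEmissionCalculator",
                           ["TrafficFlowFile", "PathsLenght-File"])],
    "TrafficFlowFile": [("TrafficFlowSimulator", ["VehiculeType"])],   # hypothetical
    "PathsLenght-File": [("PathLengthCalculator", ["StartZonzId"])],   # hypothetical
}
workflow, unresolved = compose("PollutionEmission", registry,
                               {"VehiculeType", "StartZonzId"})
```

In this sketch, data listed in `user_inputs` never triggers a registry lookup, which mirrors the object nodes that have no operation provider in the example of Section 5.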


5 Illustration of the interactive workflow composition

In the following, we illustrate the composition process through an example from the domain of city traffic pollution analysis. This application, as presented in [23], targets the computation of traffic air pollutant emission in an urban area.

Step 1: Figure 8 shows an example of an initial workflow that represents a composition request for the results of the pollutant emission due to the city traffic. The desired result, PollutionEmission, is described by the rectangle representing the object node in the corresponding activity diagram.

Figure 8: Initial workflow as a composition request

Step 2: Figure 9 represents the workflow of the computation of traffic air pollution analysis after one step of composition. The service's operation delivering the PollutionEmission result is AirPollutionEmissionCalculator. This operation is the result of the query sent by the composer to the ontological Grid registry. The operation requires two inputs, TrafficFlowFile and PathsLenght-File; thus it introduces two unresolved dependencies in the activity diagram modelling the composed workflow.

Figure 9: An example of workflow after one step of composition

Step 3: For every dependency that needs to be resolved, i.e. a yet-unresolved activity, the composer contacts the ontological registry in order to find suitable service operations that may produce the required result. The services are described in an ontological form with statements regarding the service operation's inputs, outputs, preconditions and effects (the IOPE set) [26]. Through these notions, the composer system is able to match different operations into a workflow following a reverse traversal approach. Thus, by associating the required data with the produced output, the composer constructs a data flow between operations using workflow patterns and our UML profile [5]. The composer may also use a specific notion of effect that may bind two operations together with a non-data dependency. In [10], five basic control patterns were defined to be supported by all workflow languages and workflow products. These patterns are the Sequence, Parallel Split, Synchronization, Exclusive Choice and Simple Merge patterns. Figure 10 represents the example of the city traffic analysis workflow after the full composition activity. It involves several Grid service operations, sequence branches, parallel split branches, simple merge branches and a loop [5]. The loop appears in the workflow diagram because the application iterates in order to analyze the possible traffic. The figure also shows how UML activity diagrams support the five basic patterns in the composition-specific domain of Grid service workflows [5]. In the example, some of the object nodes or input data, such as VehiculeType and StartZonzId, are given by the user of the application; they do not have an operation provider. This illustrates the interaction between our composition system and the user.

Figure 10: The workflow application after the full composition


6 CONCLUSION

In this paper, we have presented an approach for interactively composing workflows from Grid services [2, 3, 4, 6]. This composition is based on a UML profile for customizing UML activity diagrams to compose and model workflows [5], and on composition patterns [5] as well. The interactive composition process was illustrated through an example from the city traffic pollution analysis domain [23]. We have developed and implemented most of the presented components of the composition system.
Currently, we are working on the implementation of the workflow execution system that invokes and executes the retrieved Grid service instances and manages the control and data flows in a run-time environment according to our proposed activity diagram semantics.

7 REFERENCES

[1] R. Bastos, D. Dubugras, A. Ruiz: Extending UML Activity Diagram for Workflow modelling in Productions Systems, In Proc. of the 35th Annual Hawaii International Conference on System Sciences, HICSS'02, IEEE CS Press (2002).

[2] Y. Bendaly Hlaoui, L. Jemni Ben Ayed: Toward an UML-based composition of grid services workflows, In Proc. of the 2nd international workshop on Agent-oriented software engineering challenges for Ubiquitous and Pervasive Computing, AUPC'08, ACM Digital Library, pp. 21-28 (2008).

[3] Y. Bendaly Hlaoui, L. Jemni Ben Ayed: An extended UML activity Diagram for Composing Grid Services Workflows, In Proc. of the IEEE international Conference on Risks and Security of Internet and Systems, CriSIS'08, Tozeur, Tunisia, pp. 207-212 (2008).

[4] Y. Bendaly Hlaoui, L. Jemni Ben Ayed: A MDA approach for semi automatic grid services workflows composition, In Proc. of the IEEE international conference on Industrial Engineering and engineering Management, IEEM'08, pp. 1433-1437 (2008).

[5] Y. Bendaly Hlaoui, L. Jemni Ben Ayed: Patterns for Modeling and Composing Workflows from Grid Services, In Proc. of the 11th International Conference on Enterprise Information Systems, ICEIS'2009, Milan, Italy, LNBIP, Springer-Verlag, Vol. 24, pp. 615-626 (2009).

[6] Y. Bendaly Hlaoui, L. Jemni Ben Ayed: An Interactive Composition of UML-AD for the Modelling of Workflow Applications, In Proc. of the 4th International Conference on Information Technology, ICIT'2009, Amman, Jordan (2009).

[7] Y. Bendaly Hlaoui, L. Jemni Ben Ayed: Ontological Description of Grid Services Supporting Automatic Workflow Composition, In Proc. of the International Conference on Web and Information Technologies, ICWIT'2009, Kerkennah, Tunisia, ACM SIGAPP.fr, IHE éditions, pp. 233-243 (2009).

[8] M. Bubak, R. Guballa, M. Kapalka, M. Malawski, K. Rycerz: Workflow Composer and service registry for grid applications, Journal of Future Generation Computer Systems, Vol. 21, pp. 79-86 (2005).

[9] A. Cimatti, E. Clarke, A. Tacchella: NuSMV version 2: An opensource tool for symbolic model checking, In Proc. of the International Conference on Computer-Aided Verification, CAV'02, Lecture Notes in Computer Science, Springer-Verlag (2002).

[10] M. Dumas, A. H. M. ter Hofstede: UML Activity Diagrams as a Workflow Specification Language, In UML'2001 Conference, Toronto, Ontario, Canada, Lecture Notes in Computer Science (LNCS), Springer-Verlag, Heidelberg, Germany (2001).

[11] R. Eshuis: Semantics and verification of UML Activity Diagrams for Workflows Modelling, PhD thesis, University of Twente (2002).

[12] R. Eshuis, R. Wieringa: Comparing Petri net and Activity diagram variants for workflow modelling: A Quest for Reactive Petri Nets, Petri Net technology for communication based Systems, LNCS, Springer-Verlag (2003).

[13] I. Foster, D. Berry, A. Djaoui, A. Grimshaw, B. Horn, H. Kishimoto, F. Maciel, A. Savy, F. Siebenlist, R. Subramaniam, J. Treadwell, J. Von Reich: The Open Grid Services Architecture, Version 1.0 (2004).

[14] I. Foster, C. Kesselman: Grid Services for Distributed System Integration, Journal of IEEE Computer, Vol. 35, No. 6, pp. 37-46 (2004).

[15] T. Gardner: UML modelling of automated Business Processes with a Mapping to BPEL4WS, In Proc. of the European Conference on Object Oriented Programming (2003).


[16] C. Goble, D. de Roure: The grid: an application of the semantic web, ACM SIGMOD Record, Special section on semantic web and data management, Vol. 31, No. 4, pp. 65-70 (2002).

[17] T. Gubala, D. Herezlak, M. Bubak, M. Malawski: Semantic Composition of Scientific Workflows Based on the Petri Nets Formalism, In Proc. of the Second IEEE International Conference on e-Science and Grid Computing, e-Science'06 (2006).

[18] R. Guballa, A. Hoheisel, F. First: Highly Dynamic workflow Orchestration for scientific Applications, CoreGRID Technical Report, Number TR-0101 (2007).

[19] R. Gronomo, I. Solheim: Towards Modelling Web Service Composition in UML, In The 2nd International Workshop on Web Services: Modelling, Architecture and Infrastructure, Porto, Portugal (2004).

[20] R. Gronomo, MC. Jaeger: Model Driven Semantic Web Service Composition, In Proc. of the 12th Asia-Pacific Software Engineering Conference, APSEC'05 (2005).

[21] M. Laclavik, E. Gatial, Z. Balogh, O. Habala, G. Nguyen, L. Hluchy: Experience Management Based on Text Notes, In Proc. of e-Challenges 2005 Conf. (2005).

[22] W. Li, C. Huang, Q. Chen, H. Bian: A Model-Driven Aspect Framework for Grid Service Development, In Proc. of the IEEE International Conference on Internet and Web Applications and Services, ICIW'06, pp. 139-146 (2006).

[23] M. Masetti, S. Bianchi, G. Viano: Application of K-Wf Grid technology to Coordinated Traffic Management. http://grid02.softeco.it/site/project-info.html

[24] Model Driven Architecture (MDA). Document number omrsc/2001-07-01 (2001).

[25] Object Management Group. UML 2.0 Superstructure Specification. July (2005).

[26] OWL-S: Semantic Markup for Web Services. The OWL Services Coalition. OWL-S version 2.0.S.

[27] Pllana, T. Fahringer, J. Testori, S. Benkner, I. Brandic: Towards an UML Based Graphical Representation of Grid Workflow Application, In Proc. of the 2nd European Across Grids Conference, Nicosia, Cyprus, Springer-Verlag (2004).

[28] J. Rao, P. Kungas, M. Matskin: Logic-based web service composition: from service description to process model, In Proc. of the IEEE International Conference on Web Services, ICWS 2004, San Diego, California, USA (2004).

[29] E. Sirin, J. Hendler, B. Parsia: Semi automatic composition of web services using semantic descriptions, In Proc. of the ICEIS-2003 Workshop on Web Services, Modeling, Architecture and Infrastructure, Angers, France (2003).

[30] D. Skogan, R. Gronomo, I. Solheim: Web Service Composition in UML, In Proc. of the 8th Intl Enterprise Distributed Object Computing Conference, EDOC'04 (2004).

[31] M. Smith, T. Friese, B. Freisleben: Model Driven Development of Service-Oriented Grid Applications, In Proc. of the IEEE Asia-Pacific Conference on Services Computing, APSCC'06 (2006).

[32] Web Services Description Language (WSDL) 1.1. W3C Note 15 March (2001).

[33] W3C: Resource Description Framework (RDF) Model and Syntax Specification, report num. TR/1999/REC-rdf-syntax-19990222 (1999).

[34] W3C: SPARQL Query Language for RDF, report (2008).

[35] M. J. Young: XML Step by Step, Microsoft Press, ISBN: 2-84082-812-X (2001).


HOW TO MAP PERSPECTIVES

Gilbert Ahamer, Adrijana Car, Robert Marschallinger, Gudrun Wallentin, Fritz Zobl
Institute for Geographic Information Science at the Austrian Academy of Sciences
ÖAW/GIScience, Schillerstraße 30, A-5020 Salzburg, Austria
gilbert.ahamer@oeaw.ac.at, adrijana.car@oeaw.ac.at, robert.marschallinger@oeaw.ac.at,
gudrun.wallentin@oeaw.ac.at, fritz.zobl@oeaw.ac.at

ABSTRACT
“Perspectives” are seen as the basic element of realities. We propose different
methods to “map” time, space, economic levels and other perceptions of
reality. IT allows views on new worlds. These worlds arise by applying new
perspectives to known reality. IT helps to organise the complexity of the
resulting views.

Key Words: Geographic Information Science, mapping, time, space, perception.

0. LET’S START TO THINK

0.1 Our world is the entirety of perceptions. (Our world is not the entirety of facts.)

Figure 0: The human being perceives the world.

Hence, every individual lives in a different world (Fig. 0).

0.2 The “indivisible unit”, the atom (ατομος¹) of reality, is equal to one (human) perspective. Our world is made up of a multitude of perceptions, not of a multitude of realities and not of a multitude of atoms (Fig. 1).

Figure 1: The “primordial soup” of living, before the advent of (social) organisms: uncoordinated perspectives, uncoordinated world views.

0.3 In order to share one’s own conception with others, “writing” was invented. Similarly, complex structures, such as landscapes, are “mapped”. To map means to write structures.

1. WRITING HELPS TO BECOME AWARE

We ask: Is it possible to map = write
1. the distribution of material facts and elements in geometric space? (physics)
2. the distribution of factual events in global time? (history)
3. the distribution of real-world objects across the Earth? (geography)
4. the distribution of elements along material properties? (chemistry)
5. the distribution of growth within surrounding living conditions²? (biology)
6. the distribution of persons acting in relationships? (sociology)
7. the distribution of individuals between advantage and disadvantage? (economics)
8. the distribution of perspectives within feasible mindsets? (psychology)
9. the distribution of living constructs along selectable senses? (theology)

We see: awareness results from reflection (Fig. 2).

Figure 2: Fundamental dimensions, along which to coordinate individual world views when reflecting. (Axes: space (x, y, z); themes: matter, events, objects, elements, living conditions, personalities, advantages, perspectives, sense; time = t.)

¹ what cannot be split any further (Greek)
² životné prostredie (Slovak): living environment


2. TIME CAN BE
1. an attribute of space (a very simple historic GISystem)
2. an independent entity (Einstein’s physics)
3. the source of space (cosmology).
In terms of GIS, item 2.1 is expressed as “t is one of the components of geo-data”ⁱ (Fig. 3).

Figure 3: The where-what-when components of geo-data, also known as triad (Peuquet 2002: 203).

Time can be understood as
• establishing an ordinal scale for events
• driving changes (= Δ) of realities
• something that unfortunately does not appear on paper.

A proposed solution is to map changing realities (Δ) instead of mapping time.
Time is replaced by what it produces. This is indicated in Fig. 4.

Figure 4: The projection of time (t) onto the effects of time (the changes Δ) can apply to any science. (Axes: space (x, y, z); themes: Δ matter (e.g. its path), Δ events, Δ objects, Δ elements, Δ living conditions, Δ personalities, Δ advantages, Δ perspectives, Δ sense; time = t, with the annotation “project!”.)

This idea flips = projects the t axis onto one of the vertical axes. Time then means: how maps are changed by the envisaged procedures.
Such procedures modify the variables along the axes, be they of physical (gravity force) or of social nature (war).

A classical example is Minard’s map of Napoleon’s 1812 campaign into Russia³ (Fig. 5a, b).

Figure 5: Notions of path in a geo-space: (a) Minard’s map of human losses during Napoleon’s 1812 campaign into Russia; and (b) its geo-visualisation in a time cube (Kraak, 2009).

Further examples such as landslides in geology, growth of plants, energy economics, economics will be shown in chapter 7.

For implementing the idea of projecting the t axis onto the Δ axis we need a clear insight into how time quantitatively changes reality.
In other words: we need a model which (explaining how processes occur) determines the representation of time (Fig. 6). Examples are landslides in geology, ΔGDP/cap, plant growth.

One cannot perceive time (never!), only its effects: what was perceived in this time span (duration)⁴?
This is why the t axis is projected onto another axis denoting the effect of elapsed time; what this means to the individual sciences is shown in Fig. 4.

Very similarly, in physics nobody can feel force, only its effect (deformation, acceleration), and still forces have undisputedly been a key concept for centuries.

What is time? Just a substrate for procedures.
What is space? Just hooks into perceived reality.

We retain from this chapter 2 that we need a clear model of how elapsing time changes reality. Then we can map time as suggested: by its effects.

³ Patriotic War (in Russian): Отечественная война
⁴ T. de Chardin’s (1950) concept of durée (French).
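The proposal of this chapter, mapping the changes Δ instead of time itself, can be made concrete with a very small computation. The Python sketch below is only an illustration under stated assumptions: it represents each map as a hypothetical grid of numbers indexed by time, and it uses a plain cell-wise difference between successive snapshots as the “effect of time”. The chapter argues that a process model should determine this effect; the simple difference used here is a stand-in for such a model, and the function name `change_maps` is invented for the example.

```python
from typing import Dict, List, Tuple

Grid = List[List[float]]  # one thematic value per (x, y) cell


def change_maps(snapshots: Dict[float, Grid]) -> List[Tuple[float, float, Grid]]:
    """Replace the t axis by its effects: one delta map per pair of snapshots.

    Returns a list of (t_start, t_end, delta) where delta holds, for each
    cell, the value at t_end minus the value at t_start.
    """
    times = sorted(snapshots)
    result = []
    for t0, t1 in zip(times, times[1:]):
        before, after = snapshots[t0], snapshots[t1]
        delta = [[b - a for a, b in zip(row_a, row_b)]
                 for row_a, row_b in zip(before, after)]
        result.append((t0, t1, delta))
    return result
```

Each returned delta grid can then be drawn like any other map, so that the drawing shows what time produced rather than time itself.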


3. HOW TO WRITE TIME?

The big picture shows us various examples:
1. as a wheel (see the Indian flag): revolving zodiacs, rounds in stadiums, economic cycles, Kondratieff’s waves
2. as an arrow (see Cartesian coordinates): directed processes, causal determinism, d/dt, d²/dt²
3. as the engine for further improvement (evolutionary economics): decrease vs. increase in global income gaps, autopoietic systems, self-organisation
4. as the generator of new structures (institution building, political integration, progressive didactics): new global collaborative institutions, peer-review, culture of understanding, self-responsible learning, interculturality
5. as evolving construct (music).
From this chapter 3 we only keep in mind that the concepts to understand and represent time are fundamentally and culturally different.

Figure 6: All data representations require models.

4. HOW TO WRITE SPACE?

The big picture shows us various examples:
1. as a container of any fact and any process (geography and GIS)
2. as result of human action (landscape planning)
3. as evolving construct (architecture).

Examples span space as
• received and prefabricated versus
• final product of one’s actions, namely:
1. spaces as the key notion for one’s own science: everything that can be geo-referenced means GIS
2. space as the product of human activity
3. expanding space into state space: the entirety of possible situations is represented by the space of all “state vectors” which is suitable only if procedures are smooth.

The main thesis here is: the “effects of time” are structurally similar in many scientific disciplines, and they often imply “changes in structures” too.

Information Technology (IT) is already providing scientific tools to visualise such structures.

5. HOW TO MAP SPACE AND TIME?

The detailed picture: it is obvious that a choice must be made for one mode of representation and for one view of one scientific discipline:
1. (x, y; t): cartography, GIS (Fig. 7)
2. (x, y, z; t): geology
3. (x, y, z; vx, vy, vz; t): landslides
4. (x, y, z; biospheric attributes; t): ecology, tree-line modelling
5. (countries; economic attributes; GDP/cap) or (social attributes; structural shifts; elapsing evolutionary time): economic and social facts in the “Global Change Data Base”⁶ (Fig. 8)
6. perceiving rhythms and structures: (only) these are “worth recognising”: music, architecture, fine arts.

[Figure 7 labels: objects seen by geographers; harmonised world views; axes x, y, z (space), themes, time = t]
Figure 7: Harmonising world views: GIS reunites world views by relating everything to its location.

Different sciences may have considerably different outlooks on reality (Fig. 8). A humble attitude of recognising facts⁵ instead of believing in the theories one’s own discipline offers can empower people to survive even in the midst of other scientific specialties: Galileo’s (1632) spirit: give priority to observation, not to theories!
This is the essential advantage of geography as a science: geographers describe realities, just as they appear. Such a model-free concept of science has promoted the usefulness of GIS tools to people independent of personal convictions, scientific models or theories.

[Figure 8 labels: objects seen by economists; harmonised world views; axes x, y, z (space), themes, time = t]
Figure 8: Different but again internally harmonised world views: explain facts from another angle.

⁵ datum (Latin): what is given (unquestionable)
⁶ This GCDB is described in Ahamer (2001)
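The world views listed in Section 5 differ mainly in which components accompany the location and the time stamp. The following minimal Python sketch makes that explicit for views 1 to 4. The class name `Observation`, its field names and the method `world_view` are illustrative assumptions and do not come from the paper or from any GIS library.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Observation:
    """One georeferenced record (hypothetical structure, for illustration only)."""
    x: float
    y: float
    t: float
    z: Optional[float] = None                               # height, used from view 2 onwards
    velocity: Optional[Tuple[float, float, float]] = None   # (vx, vy, vz), view 3
    attributes: Dict[str, float] = field(default_factory=dict)  # thematic attributes, view 4

    def world_view(self) -> str:
        """Return the tuple notation of Section 5 that matches the filled-in fields."""
        spatial = ["x", "y"]
        if self.z is not None:
            spatial.append("z")
        label = ", ".join(spatial)
        if self.velocity is not None:
            label += "; vx, vy, vz"
        if self.attributes:
            label += "; attributes"
        return f"({label}; t)"


# Invented sample values, not data from the paper.
map_point = Observation(x=13.0, y=47.8, t=2009)
landslide_point = Observation(x=13.0, y=47.8, z=1200.0, t=2009,
                              velocity=(0.01, 0.0, -0.02))
print(map_point.world_view())
print(landslide_point.world_view())
```

Under these assumptions the same record type serves cartography, geology and landslide monitoring; only the set of filled-in fields changes, which mirrors the choice of "one mode of representation" described above.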


6. WHAT IT DOES, DID, AND COULD DO

6.1 IT helps to organise the multitude of views (= perceptions) onto data that are generated by humans:
• IT constructs world views, such as: GIS, history, economics, geology, ecology etc.
• IT has already largely contributed to demolishing traditional limitations of space and time:
  o Space: tele(-phone, -fax, -vision), virtual globes (Longley et al., 2001)
  o Time: e-learning, asynchronous web-based communication, online film storage (Andrienko & Andrienko 2006).

6.2 This paper investigates non-classical modes of geo-representation.
We would like to point out that there are two already well-established fields that offer solutions to mapping (space and time, Fig. 9) views:
Scientific and information visualisation are branches of computer graphics and user interface design which focus on presenting data to users, by means of interactive or animated digital images. The goal of this field⁷ is usually to improve the understanding of the data presented. If the data presented refers to human and physical environments, at geographic scales of measurement, then we talk about Geovisualisation, e.g. (MacEachren, Gahegan et al. 2004; Dykes, MacEachren et al. 2005, Dodge et al., 2008).

Figure 9: Time series and 3 spatio-temporal data types (http://www.crwr.utexas.edu/gis/gishydro05/).

6.3 IT could develop tools that are then interchangeable across scientific disciplines, e.g. landslides that may structurally resemble institutional and economic shifts (see 7.1).

IT could prompt scientists to also look at data structures from other disciplines.
Whatever the disciplines may be, the issues are structures and structural change!

7. EXAMPLES

The authors are members of the “Time and Space” project at their institution named “Geographic Information Science”⁸, a part of which explores the cognitive, social, and operational aspects of space & time in GIScience.
This includes models of both social and physical space and consequences thereof for e.g. spatial analysis and spatial data infrastructures. We investigate how space and time are considered in these application areas, and how well the existing models of space and time meet their specified needs (see e.g. Fig. 9). This investigation is expected to identify gaps. Analysis of these gaps will result in improved or new spatio-temporal concepts particularly in support of the above mentioned application areas.

7.1 Sliding realities: geology
The notion of the path in geography (x, y, t) is extended by the z axis (see item 5.2) which produces a map of “time”: Fig. 9 (Zobl, 2009).

Figure 9: Geology takes the (x, y, z; t) world view.

The “effect of time” is sliding (luckily in the same spatial dimensions x, y, z): we take the red axis in Fig. 10. Space itself is sufficiently characteristic for denoting the effects of time.

Figure 10: These effects of time occur in space, most helpfully. Source: Brunner et al. (2003).

⁷ http://en.wikipedia.org/wiki/Scientific_Visualization
⁸ The overarching aim of the GIScience Research Unit is to integrate the “G” into Information Sciences (GIScience, 2009)


7.2 Slices of realities: geology

Despite the lucky coincidence that the effect of time (Δx, Δy, Δz) occurs in the same space (x, y, z) we try to produce slides carrying more information (item 5.3) and hence resort to the so-called attributes mentioned in Fig. 9 such as grey shades or colours.
The speed of sliding (d/dt x, d/dt y, d/dt z) is denoted both by horizontal offsets and whitish colours in the spaghettis (Marschallinger, 2009) of Fig. 11.

Figure 11: The (x, y, z; vx, vy, vz; t) view of a landslide process (shades of grey mean speed v).

7.3 Slide shows
How to map spatial realities that are no longer isotropic displacement vectors of space itself? For the example of changing tree lines in the Alps (Wallentin, 2009) a slide show is used to present the change of growth patterns made up of the multitude of individual agents (= trees = dots in Fig. 12). Moving spatial structures are depicted as a film of structures (item 5.4).

Figure 12: The (x, y, z; biospheric attributes; t) view of the Alpine tree line (above) and its shift induced by climate change as a slide show (below).

In such processes, which involve independent behaviour of autonomous agents (here: trees), it becomes seemingly difficult to apply a transformation of space itself, e.g. d/dt(x, y, z).

7.5 Global deforestation

One key driver for global change is deforestation; it is easy to map as a change of land use category of a given area (Fig. 13).

Figure 13: The (x, y, z; Δ biospheric attributes; t) view of the global deforestation process in megatons of carbon. Above: map of carbon flow, below: time series of GCDB data per nation symbolically geo-referenced by the location of their capitals.

This representation is analogous to Fig. 11. In both, the focus shifts from maps(t) → maps(t, Δt).
Interest includes temporal dynamics:
t = colour (above); Δt = height+colour (below), enriching the purely spatial interest.
Even if the aim is to enlarge the scope of the information delivered from the static map (Fig. 13 above) to the “dynamic map” (Fig. 13 below), readers will remain unsatisfied because no insight into the dynamic properties of deforestation is provided (Fig. 18).
Increasingly, the viewer’s focus turns further from “facts” to “changes of facts”, to “relationships with driving parameters⁹” and to (complex social and political) “patterns¹⁰”.

7.6 Realities beyond slides

But what if the information belongs to the social or economic realm (Fig. 14)? How to depict economic levels, education or policies?

Figure 14: Example for graphic notation: one (hypothesised) parameter per nation (seen across the Jordan = ).

⁹ see the suggested scenarios for water demand, water supply and water quality (Ahamer, 2008)
¹⁰ Patterns: name of the journal of the American Society for Cybernetics ASC
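The (x, y, z; vx, vy, vz; t) view of Section 7.2 needs a velocity for every monitored point. The sketch below shows one way such a speed of sliding (d/dt x, d/dt y, d/dt z) could be estimated from repeated position measurements, using simple backward finite differences. The function names and the sample coordinates are assumptions made for illustration; the paper does not state how the velocities behind Fig. 11 were derived.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def sliding_velocity(times: Sequence[float],
                     positions: Sequence[Point]) -> List[Point]:
    """Estimate (vx, vy, vz) between consecutive epochs by finite differences.

    times     -- strictly increasing measurement epochs
    positions -- (x, y, z) of one monitored point at each epoch
    Returns one velocity vector per interval, i.e. len(times) - 1 entries.
    """
    if len(times) != len(positions):
        raise ValueError("times and positions must have the same length")
    if len(times) < 2:
        raise ValueError("at least two epochs are needed")
    velocities: List[Point] = []
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        p0, p1 = positions[i - 1], positions[i]
        vx, vy, vz = ((b - a) / dt for a, b in zip(p0, p1))
        velocities.append((vx, vy, vz))
    return velocities


def speed(v: Point) -> float:
    """Magnitude of a velocity vector, the quantity a grey shade could encode."""
    return math.sqrt(sum(c * c for c in v))


# Invented sample: three epochs (in days) of one monitored point (in metres).
epochs = [0.0, 2.0, 4.0]
track = [(0.0, 0.0, 0.0), (2.0, 0.0, -4.0), (2.0, 6.0, -4.0)]
for v in sliding_velocity(epochs, track):
    print(v, speed(v))
```

Mapping the resulting speed to a grey value or a colour would then give the attribute-based slide described above.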


7.7 Mapping social processes

Social processes in social organisms can be described by the intensity of four different communicational dimensions (Fig. 15) along time:
S = info, A = team, T = debate, B = integration.
This type of writing (Fig. 16) resembles a score in musical notation¹¹ and was invented for the web-based negotiation game “Surfing Global Change” (SGC); its rules are published in Ahamer (2004).
The elementary particle of humanity’s progress – consensus building – is trained by SGC.
In this case, IT contributed to making communication independent from space and time: a web-platform enables asynchronous worldwide interaction of participants.

Figure 15: Four basic components of any social procedure: learning information (Soprano S), forming a team (Alto A), debating (Tenor T), and integrating opposing views (Bass B).

Figure 16: A map of social processes in 4 dimensions during a negotiation procedure in a university course: participants show varying activity levels.

8. TRANSFORMATION OF COORDINATES

8.1 All the above examples have shown that
• various “spaces” can be thought of
• it would be suitable to enlarge the notion of “time”.
8.2 Suitably, a transformation of coordinates from time to “functional time” may be thought of.
8.3 In chapter 2, we already suggested regarding time as the substrate for procedures. Consequently, different “times” can be applied to different procedures. As an example, in theoretical physics, the notion of “Eigentime¹²” is common and means the system’s own time.
8.4 Similar to the fall line in the example of landslides in chapter 7.1 (red in Fig. 10), the direction of the functional time is the highest gradient of the envisaged process. This (any!) time axis is just a mental, cultural construction.
8.5 According to chapter 2 (Fig. 6) a clear understanding (mental model) is necessary to identify the main “effect of time”. We see that such an understanding can be culturally most diverse. Just consider the example of economic change:
• optimists think that the global income gap decreases with development
• pessimists believe that it increases, hampering global equity.
8.6 Therefore, any transformation of coordinates bears in itself the imponderability of complex social assumptions about future global development and includes a hypothesis on the global future.
8.7 Still, a very suitable transformation is
t → GDP/capita
(Fig. 17) both because of good data availability and increased visibility of paths of development.
GDP/cap resembles evolutionary time.

[Figure 17 panels: time t = real time: complex graphic structure; GDP/cap ≈ evolutionary time of development: simpler graphical structure]
Figure 17: A suitable transformation of time uses the economic level, measured as GDP per capita.

¹¹ partitura (Italian): score (in music)
¹² literally (German): the own time (of the system)
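The transformation t → GDP/capita of item 8.7 can be read as re-indexing each nation's time series by its economic level, so that nations are compared at equal GDP/cap rather than in equal years. The following minimal Python sketch works under that reading. The function names, the record layout and all numbers are invented for illustration; they are not GCDB data, and the paper does not prescribe this procedure.

```python
from typing import List, Optional, Tuple

Record = Tuple[int, float, float]   # (year, GDP per capita, indicator value)
Path = List[Tuple[float, float]]    # (GDP per capita, indicator value)


def to_functional_time(series: List[Record]) -> Path:
    """Drop the calendar year and order the records by GDP per capita.

    Sorting embodies the hypothesis named in item 8.6: development is
    treated as progress along the GDP/cap axis, even if GDP/cap fell
    in some calendar years.
    """
    return sorted((gdp, value) for _, gdp, value in series)


def indicator_at(path: Path, gdp_level: float) -> Optional[float]:
    """Linearly interpolate the indicator at a given GDP/cap level.

    Returns None if the level lies outside the range covered by the path.
    """
    for (g0, v0), (g1, v1) in zip(path, path[1:]):
        if g0 <= gdp_level <= g1:
            if g1 == g0:
                return v0
            weight = (gdp_level - g0) / (g1 - g0)
            return v0 + weight * (v1 - v0)
    return None


# Invented example: share of one fuel (percent) in a nation's energy demand.
nation_a = [(1961, 1000.0, 80.0), (1976, 2000.0, 60.0), (1991, 4000.0, 40.0)]
path_a = to_functional_time(nation_a)
print(indicator_at(path_a, 3000.0))
```

Plotting several such paths over the common GDP/cap axis is one way to obtain the "simpler graphical structure" that Fig. 17 contrasts with the plot over real time.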


8.8 The strategic interest of such a transformation is “pattern recognition”, namely to perceive structures in data of development processes more easily.
Examples for such “paths of development” are shown in Fig. 18 for the example of fuel shares in energy economics.

Figure 18: Structural shift of percentages of various fuels in all nations’ energy demand 1961-91. Data source: GCDB (Ahamer, 2001).

8.9 It is suggested here that such a transformation occurs implicitly during many mapping endeavours.
This is legitimate, but care must be taken to take into account the (silently) underlying model of human development.
8.10 A suitable transformation of coordinates can make it easier to see and communicate evolutionary structures, as it enables common views of humans and is therefore helpful for global consensus building.
8.11 Also the “effects of time” are projected into a common system of understanding which might give hope to facilitate common thinking independently of pre-conceived ideologies.
This plan creates the “common reference system of objects”.
8.12 This paper suggests enlarging the concept of
• “globally universal geo-referencing” (one of the legacies of IT)
to
• “globally universal view-referencing”
• or “globally universal referencing of perspectives”¹³.

Fig. 19 illustrates this step symbolically.

9. A FUTURISTIC VISION

9.1 Building on the vision of “Digital Earth” (Gore, 1998), the deliberations in this paper might eventually lead to the vision of “Digital Awareness”: the common perspective on realities valid for the global population, aided by (geo)graphic means.
9.2 The primordial element of (human and societal) evolution is consensus building. Without ongoing creation of consensus, global “evolutionary time” is likely to fall back.

The futuristic vision is to map global awareness.

Figure 19: The global society perceives the world.

9.3 Much as the georeferenced satellites which circulate around the world produce a “Google, Virtual [or similar] Earth”, the individual spectators in Fig. 19 circle around the facts – and they create a “common virtual perception”: an
IIS = Interperspective Information System.

[Figure 20 labels: the entirety seen by all global citizens; entirety of world views; axes x, y, z, time = t; space of themes]
Figure 20: Divergent perceptions circulate around earthly realities. The entirety of world views creates the IIS (Interperspective Information System).

¹³ The facts themselves may well be delivered by endeavours such as Wikipedia but here it refers to the perspective on facts! A huge voluntarily generated database on people’s perceptions, views and opinions would be needed.


9.4 Do we just mean interdisciplinarity? No. Nor do we simply refer to people looking into any direction. Fig. 21 shows the difference to IIS.

Figure 21: This is not IIS.

9.5 The science of the third millennium will allow dealing with a multitude of world views and world perspectives (see Tab. 1) with an emphasis on consensus building.
When learning, the emphasis lies on social learning and may also make use of game-based learning (such as the web-based negotiation game “Surfing Global Change”) which allows experimenting with world views without any risk involved.

Table 1: The science of the third millennium encompasses multiple perspectives

              element          interaction        perspective
              (19th cent.)     (20th cent.)       (21st cent.)
single ones   Mechanics        Logics             Teaching
manifold      Thermodynamics   Systems analysis   Social learning, gaming, IIS

9.6 A suitable “common effort¹⁴” for a peaceful future of humankind would involve developing tools and visual aids in order to understand the opinions of other citizens of the globe.
The future is dialogue.
Or else there will be no future.

10. CONCLUSION

Sciences are similar to “languages” spoken by people: they differ globally. Understanding others’ languages is essential for sustainable global peace.
Human perceptions are also strongly influenced by underlying models, assumptions and preconceived understandings.
Studying geo-referenced data sets (GIS) can help to bridge interperceptional gaps.

For the transformation of world views – to make them understandable – it is necessary to know about
• the “effect of time”, namely the “path along the continuum of time” which a variable is expected to take
• the speakers’ underlying model of a complex techno-socio-economic nature
• the resulting perception of other humans.

A future task and purpose of IT could be to combine the multitude of (e.g. geo-referenced) data and to rearrange it in an easily understandable manner for the viewpoints and perspectives of another scientific discipline or just another human being. Such a system is called Interperspective Information System IIS.

Merging a multitude of perspectives to form a common view of the entire global population is the target of an IIS.
Symbolically, a “Google Earth”-like tool would eventually develop into a “Google World Perspective”-like tool, or a “Virtual Earth”-like tool would become a “Virtual Perspective” tool encompassing all (scientific, social, personal, political, etc.) views in an easily and graphically understandable manner.
In the above futuristic vision, IT can/should(!) become a tool to facilitate consensus finding. It can rearrange the same data for a new view.

Symbolically speaking: similar to Google Earth which allows one to view the same landscape from different angles, a future tool would help to navigate the world concepts, the world views and the world perspectives of the global population.
IT can reorganise extremely large data volumes (if technological growth rates continue) and could eventually share these according to the viewpoint of the viewer.

Such a step of generalisation would lead from “Geographic Information Science” to “Interperspective Information Science”, implying the change of angles of perception according to one’s own discipline.

¹⁴ (jihad in Arabic) also means: common effort of a society


REFERENCES
Ahamer, G. (2001), A Structured Basket of Models for Global Change. In: Environmental Information Systems in Industry and Public Administration (EnvIS), ed. by C. Rautenstrauch and S. Patig, Idea Group Publishing, Hershey, 101-136, http://www.oeaw-giscience.org/ProjectFactSheets/ProjectFactSheet_GlobalChange.pdf.
Ahamer, G., Wahliss, W. (2008), Baseline Scenarios for the Water Framework Directive. Ljubljana, WFD Twinning Project in Slovenia, http://www.oeaw-giscience.org/ProjectFactSheets/ProjectFactSheet_EU_SDI.pdf.
Andrienko, N., Andrienko, G. (2006), Exploratory Spatial Analysis. Springer.
Brunner, F.K., Zobl, F., Gassner, G. (2003), On the Capability of GPS for Landslide Monitoring. Felsbau 2/2003, 51-54.
de Chardin, T. (1950), La condition humaine [Der Mensch im Kosmos]. Beck, Stuttgart.
Dodge, M., McDerby, M., Turner, M. (eds.) (2008), Geographic Visualisation. Wiley.
Dykes, J., MacEachren, A., et al. (2005), Exploring Geovisualization. Oxford, Elsevier.
Galileo, G. (1632), Dialogo sopra i due massimi sistemi del mondo, tolemaico, e copernicano. Fiorenza.
GIScience (2008), Connecting Real and Virtual Worlds. Poster at AGIT’08, http://www.oeaw-giscience.org/index.php?option=com_content&task=blogcategory&id=43&Itemid=29.
Gore, A. (1998), Vision of Digital Earth, http://www.isde5.org/al_gore_speech.htm.
Kraak (2009), Minard’s map. www.itc.nl/personal/kraak/1812/3dnap.swf
Longley, P.A. et al. (2001), Geographic Information. Science and Systems. Wiley.
MacEachren, A. M., Gahegan, M., et al. (2004), Geovisualization for Knowledge Construction and Decision Support. IEEE Computer Graphics & Applications 2004 (1/2): 13-17.
Marschallinger, R. (2009), Analysis and Integration of Geo-Data. http://www.oeaw-giscience.org/.
Peuquet, D. J. (2002), Representations of Space and Time. New York, The Guilford Press.
Wallentin, G. (2009), Ecology & GIS. Spatiotemporal modelling of reforestation processes. See http://www.oeaw-giscience.org/images/stories/Downloads/pecha%20kucha%20technoz%20day.pdf
Zobl, F. (2009), Mapping, Modelling and Visualisation of georelevant processes. http://www.oeaw-giscience.org/.

i GIScience goes way beyond this view of time and space (considering time as function) because it allows for much more complex queries and analyses.


