
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/311843670

On the Design of An Ontology Based Access Control Model : A Digital Library Perspective

Thesis · August 2015


DOI: 10.13140/RG.2.2.33787.52001/1


1 author:

Subhasis Dasgupta
University of California, San Diego

All content following this page was uploaded by Subhasis Dasgupta on 23 December 2016.



On the Design of An Ontology Based
Access Control Model : A Digital
Library Perspective

Thesis submitted by : Subhasis Dasgupta

A thesis submitted in partial fulfilment of the requirements
for the degree of Doctor of Philosophy (Engineering)

in the

Department of Computer Science and Engineering,


Faculty Council of Engineering & Technology
JADAVPUR UNIVERSITY
Kolkata, India
2015

JADAVPUR UNIVERSITY
Kolkata - 700032, India

Index No: 91/12/E

1. Title of the thesis : On the Design of An Ontology Based Access
Control Model : A Digital Library Perspective

2. Name, Designation & Institution of the Supervisors :

(a) Prof. Aditya Bagchi,
Professor, Electronics and Communications Sciences Unit,
Indian Statistical Institute,
203, B. T. Road
Kolkata 700108
India
(b) Prof. Chandan Mazumdar,
Professor, Department of Computer Science & Engineering
Jadavpur University,
188, Raja S. C. Mallick Road,
Kolkata, 700032
India

3. List of Publications:

(a) Subhasis Dasgupta, Aditya Bagchi: Conflicts between User and Concept Hierarchies
for Controlled Access to a Polyhierarchic Ontology - A Digital Library Perspective.
IEEE Transactions on Dependable and Secure Computing (communicated)
(b) Subhasis Dasgupta, Pinakpani Pal, Chandan Mazumdar, Aditya Bagchi: Resolving
Authorization Conflicts by Ontology Views for Controlled Access to a Digital Library.
Journal of Knowledge Management, Vol. 19, Issue 1 (2015), pp. 45-59
(c) Subhasis Dasgupta, Aditya Bagchi: Controlling Access to a Digital Library
Ontology - A Graph Transformation Approach. International Journal of Next-Generation
Computing (IJNGC) 5(1) (2014), pp. 22-42
(d) Subhasis Dasgupta, Aditya Bagchi: A Graph-Based Formalism for Controlling Access
to a Digital Library Ontology. Computer Information Systems and Industrial
Management - 11th IFIP TC 8 International Conference, CISIM 2012, Venice, Italy,
September 26-28, Lecture Notes in Computer Science, Volume 7564, pp. 111-122, Springer
(e) Subhasis Dasgupta, Aditya Bagchi: Resolving Conflicts between Role-Hierarchy and
Concept-Hierarchy in a Digital Library Ontology. Emerging Applications of Information
Technology (EAIT), 2012 Third International Conference on, pp. 443-446,
Nov. 30 - Dec. 1, 2012
(f) Subhasis Dasgupta, Aditya Bagchi: Controlled Access over Documents for Concepts
Having Multiple Parents in a Digital Library Ontology. Computer Information
Systems - Analysis and Technologies - 10th International Conference, CISIM 2011,
Kolkata, India, December 14-16, 2011. Proceedings: Communications in Computer and
Information Science (CCIS), Volume 245, pp. 277-285, Springer

4. List of Presentations in National / International Conferences:

• A Graph-Based Formalism for Controlling Access to a Digital Library Ontology.
Computer Information Systems and Industrial Management - 11th IFIP TC 8
International Conference, CISIM 2012, Venice, Italy, September 26-28
• Resolving Conflicts between Role-Hierarchy and Concept-Hierarchy in a Digital
Library Ontology. Emerging Applications of Information Technology (EAIT),
Dec 2012, Indian Statistical Institute, Kolkata
• Controlled Access over Documents for Concepts Having Multiple Parents in
a Digital Library Ontology. Computer Information Systems - Analysis and
Technologies - 10th International Conference, CISIM 2011, Kolkata, India,
December 14-16, 2011

Certificate from the Supervisors

This is to certify that the thesis entitled “On the Design of An Ontology Based
Access Control Model : A Digital Library Perspective” submitted by Shri Subhasis
Dasgupta, who got his name registered on 21st November 2012 for the award of the
Doctor of Philosophy (Engineering) degree of Jadavpur University, is absolutely
based upon his own work under the supervision of Prof. Aditya Bagchi and Prof.
Chandan Mazumdar and that neither his thesis nor any part of the thesis has been
submitted for any degree/diploma or any other academic award anywhere before.
This thesis fulfils all the requirements for its submission.

1............................................. 2.......................................................
(Aditya Bagchi) (Chandan Mazumdar)
“Simplicity is a great virtue but it requires hard work to achieve it and education to
appreciate it. And to make matters worse: complexity sells better.”

Edsger W. Dijkstra
Abstract
On the Design of An Ontology Based Access Control Model : A Digital
Library Perspective

A Digital Library (DL), like any other library management system, supports documents
related to different subject areas. These documents may be collected in one place or may
be distributed across many different repositories, seamlessly integrated to offer a
composite view to a user. The metadata structure provided for bibliographic search makes
the source of a document transparent to any user of the library. In a library, related
materials are usually referred to by a common index. The most common classification
systems, such as Dewey Decimal Classification (DDC) or Library of Congress
Classification (LCC), provide a strict classification of artefacts. With the help of an
ontology, the concepts of a digital library can be arranged according to the knowledge
relationships among them. Hence, recent digital library research recommends ontology as
a means of improving the indexing and cataloguing of the digital library. According to
the digital library community, polyhierarchic structure should be present in a digital
library ontology. However, existing access control mechanisms for the semantic web do
not support polyhierarchic structure. This thesis describes the problem of polyhierarchic
structure for digital library ontology and designs an access control model for such a
structure. The proposed model provides a view-based solution for users and user-groups.
The model uses an existing graph transformation mechanism to establish the theoretical
part of ontology administration. A view generation algorithm is proposed for generating
user-group specific secured views. The model also considers the existence of both a
user-group/subject hierarchy and a concept/object hierarchy, to study possible conflicts
between them. After analysing the various conflicts between these two hierarchies,
possible solutions are proposed for resolving them. In the implementation part, a testbed
has been developed using various open source tools. The relevant algorithms have been
tested with a large data set available from SNAP (Stanford Network Analysis Platform).
The source code of the implementation is available on GitHub¹. The thesis also proposes
a security model for the most modern digital libraries, where collaborative development
is possible. The update model offers a provision and obligation based secure locking
model for the digital library.

¹ OBAC: https://github.com/dsubhasis/obac

Dedicated to my parents and wife
Acknowledgements
I must thank my supervisors, Prof. Aditya Bagchi of the Indian Statistical Institute and
Prof. Chandan Mazumdar of Jadavpur University, India, for providing the opportunity
to study for a Ph.D. It was a great trust on their part, for which I am extremely grateful:
the last four years have been very rewarding both intellectually and personally. Their
experience, technical ability, patience, generosity and attention to detail have made a
tremendous contribution to the quality and accuracy of my work.
I am grateful for the support and genuine, if somewhat apprehensive, interest of my
family and friends. I am grateful to my parents, my wife and all my family members.
Very special thanks go to Mrs. Bagchi for her kindness and for providing us with delicious
snacks and tea during each critical argument session with Prof. Bagchi.
I am grateful to Dr. Pinakpani Pal for the interest and enthusiasm he has shown in
my work. I am also thankful to my colleagues in the project; particular thanks go to
Raghu, Roukna and Moumita. My attempts to explain my work to them can only have
improved the clarity of my arguments. I am also extremely grateful to Prof. Dipti
Prasad Mukherjee and Prof. Bhabotosh Chanda for their advice and support. I
am also thankful to all the staff of the Electronics and Communication Sciences Unit,
Indian Statistical Institute. I am also grateful to my friends at the Indian Statistical
Institute, Soumita, Sanchayan, Mrinmoy, Pulak, Bikash, Umer, Jija, Bapi and others, for
the wonderful experiences.
I would like to thank Prof. Ernesto Damiani, University of Milan, Italy, for the time
he spent discussing his work with me despite his busy schedule, and for sharing the
benefit of his experience and expertise in access control. I am also grateful to Prof.
Sushil Jajodia, George Mason University, U.S.A., for his expert advice and comments on
access control issues. I would like to express my respect and thanks to the anonymous
reviewers of my papers.
I am grateful to Prof. Bimal Roy, Ex-Director of the Indian Statistical Institute, and Prof.
Sanghamitra Bandyopadhyay, Director, Indian Statistical Institute, for providing
facilities and support.
I am sincerely thankful to Prof. Sivaji Bandyopadhyay, Dean, Faculty of Engineering,
Jadavpur University, and the F.E.T. Office for the regular official procedures. I am also
grateful to Prof. Debesh Das, Head of the Department, Computer Science and
Engineering, Jadavpur University.
My special thanks go to Prof. Nandini Mukherjee, Prof. Sanjay Saha, Dr. Bibhash Chandra
Dhara and Dr. Anirban Sengupta of Jadavpur University, and Dr. Sushanta Karmakar
of IIT Guwahati, for helping on various research issues.

Contents

Abstract
Acknowledgements
Contents
List of Figures
List of Tables

1 Introduction
1.1 Introduction
1.2 Contribution of The Thesis
1.3 Outline of The Thesis

2 Digital Library Past and Present
2.1 Digital Library Research : A Brief History
2.1.1 Evolution of Digital Library
2.1.2 Research and Practice
2.1.3 Evolving the Web beyond simple linking
2.1.4 Document Annotation Systems
2.2 Semantic Web and Digital Library
2.2.1 European Data Model and Linked Data For Digital Library
2.3 Modeling of Digital Library
2.3.1 5-S Model of digital Library
2.3.2 The DELOS Digital Library Reference Model
2.4 Contemporary Data Model of Digital Library
2.4.1 Data Model of 5-S Incorporating Complex Object of Digital Library
2.4.2 DRM : Extension of DELOS Data Model
2.4.3 Digital Library Metadata
2.5 Further Development In Digital Library

3 Relevance of Polyhierarchic Structure in Digital Library Ontology
3.1 Polyhierarchic Structure In Ontology
3.2 Polyhierarchic Structure for Digital Library Ontology
3.2.1 Relevance of Document Classes
3.2.2 Semantic Relationships

4 Access Control Issues for Structured Metadata and Semantic Digital Library
4.1 Access Control Preliminaries
4.2 Access Control for Modeling Web
4.2.1 Empirical Data Dissemination Technique
4.2.2 XML based Information Dissemination
4.2.3 Authorisation for Access in XML
4.3 Access Control for Semantic Web
4.4 Access Control Model for Digital Library
4.4.1 Access Control for Polyhierarchic Structure
4.5 Proposed Access Control Model
4.5.1 Credential of Digital Library
4.5.2 Authorisation and Policy Specification
4.5.3 Access to Document Class

5 Graph Transformations and View Generations
5.1 Graph Transformation and Access Control
5.1.1 Type Graph and State Graph
5.1.2 Administrative Operations on a Concept Hierarchy using Graph Transformations
5.1.3 Decidability of Authorisation and Relevance of View Creation
5.1.4 Properties of Ontology Views

6 Access Control Specification and Conflict Management
6.1 Introduction
6.1.1 Object and Subject Hierarchy
6.1.2 Completeness of The rules
6.2 Conflict and Conflict Management
6.2.1 Conflict Resolution Against Update

7 Architectural View and Implementation Details
7.1 Anatomy of The Digital Library Deployment
7.2 Execution Sequence and Process Flow
7.3 Design of The View Generation Systems
7.3.1 Implementation of the Algorithm
7.3.2 Database Setup
7.3.3 Security Engine Setup
7.4 Time Complexity Analysis
7.5 Data Set & Performance Analysis
7.5.1 Performance Results

8 Update Policy for Digital Library
8.1 Introduction
8.1.1 Addition and Update Issues
8.1.2 Authorisation Conflicts in Update Operations
8.1.3 Provisional authorisation Module (PAM)
8.1.3.1 Provision and Obligation Set
8.2 Conflicts and Resolutions using Locking Protocols
8.2.1 Partial Ordering and Priority of Locks
8.2.2 Conflicts and Management
8.2.3 PO Monitoring
8.3 Architectural Details

9 Conclusion

Bibliography
List of Figures

2.1 5-S Model DL Generation process
2.2 DELOS Digital Library Reference Model

3.1 Example ontology structure with Poly-Hierarchy
3.2 An augmented ontology structure with the concept Database
3.3 Document Classes Under “Database”
3.4 Document Class Abstraction

4.1 Implicit and Explicit authorisation

5.1 Graph Morphism Basics
5.2 The Type Graph of Concept Hierarchy
5.3 The State Graph of Concept Hierarchy
5.4 Add user
5.5 Remove User
5.6 Add Concept
5.7 Alter Concept
5.8 Assign Permission
5.9 Revoke Permission
5.10 Sub-Graph Removal
5.11 Node Obfuscation
5.12 Creation of View

6.1 User-group Hierarchy
6.2 Object and Subject Hierarchy

7.1 Deployment Diagram of Digital Library
7.2 Sequence Diagram of Digital Library Systems
7.3 Class Diagram of The View Object : Data Dissemination
7.4 Class Diagram of The View Object Security Enforcement
7.5 Complete Class Diagram
7.6 Sequence Diagram
7.7 Virtuoso Admin Console
7.8 Performance Analysis
7.9 Execution time with increasing view size
7.10 Performance Monitoring Data from Zabbix
7.11 Web Based Load Test

8.1 Provisional Access Manager

List of Tables

3.1 Document Class

6.1 Conflicting User-Group Table
6.2 Conflict Free User-Group Table

8.1 The compatibility matrix

Chapter 1

Introduction

1.1 Introduction

Information security for multi-user systems deployed over a network has been an important
concern for many years. Recently, the National Institute of Standards and Technology
(NIST) has suggested that the goals of information system security include
confidentiality, integrity, availability, accountability and assurance. The main
components of a security model are authentication, audit, and access control [42, 81].
This thesis focuses on an access control model for a polyhierarchic ontology. In general,
access control provides a mechanism for an organisation to enforce constraints on access
to its resources. Constraints are therefore an important and powerful mechanism for
enforcing security. For example, an organisation can use constraints to specify
higher-level policies that restrict access to its various elements. From the access
control perspective, an organisation is a collection of interrelated subjects (elements
trying to access) and objects (elements to be accessed). An authorisation provides a
mapping between a subject and an object, together with the associated constraints, which
are often defined as access rights.
In this thesis, an access control model for Semantic Web applications, with specific
attention to digital library ontology metadata, is proposed. The model considers a
digital library ontology as a collection of related concepts connected through semantic
links and arranged as a directed acyclic graph (DAG). Moreover, the inheritance property
considered in this work ensures that information flows from lower concepts to higher
concepts along the hierarchy, whereas any authorisation or restriction specified on a
concept is inherited in the opposite direction. In other words, a restriction imposed on
a higher concept is inherited by the lower concepts along the hierarchy. Since the
research is mainly concerned with digital library ontology, the library has been treated
as an open system from the access control point of view. By default, all concepts can be
accessed by all users. If a user is not permitted to access a concept, an explicit
negative authorisation has to be imposed.
The thesis also proposes and designs a new digital library architecture that supports a
polyhierarchic ontology structure, in which a child concept representing an
interdisciplinary subject area can have multiple parent concepts. Because the proposed
architecture admits polyhierarchy, the underlying hierarchical structure becomes a
directed acyclic graph instead of a tree. The presence of multiple parent concepts for a
child concept provides a document classification facility according to the interest of
each parent concept. Thus a user accessing the child through a path involving one of the
parent concepts has a smaller search space, limited to the document classes relevant to
that parent.
The research then proposes an access control mechanism that controls access to different
concepts by different users, depending on the authorisations available to each user. The
authorisations of a user depend on the credentials of that user. Users with similar
credentials are therefore placed in a user-group, and authorisations assigned to a
user-group are inherited by all its members. The proposed model thus provides better
knowledge representation and the possibility of faster document search for modern digital
libraries, with controlled access to the system. It is further shown that the proposed
access control model may give rise to an undecidability problem. A user-group specific
view generation mechanism has been developed to solve it.
After the access control model is proposed for a single user-group, it is extended to
multiple user-groups, which are themselves placed in a hierarchy. Such hierarchies are
very common in UNIX and other group-based systems. This extension increases the complexity
of the access control system, since access to a concept now involves an interplay of two
hierarchies: concept and user-group. The resolution of access control conflicts arising in
this system also considers issues related to separation of duty (SOD). In the design, the
work considers static SOD and resolves all the associated conflicts.
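The open-system policy described above can be illustrated with a minimal Python sketch. It is not the access control model developed in this thesis: the concept names, the `CHILDREN` map and both function names are hypothetical, and the sketch assumes the simplest reading of downward inheritance, in which a negative authorisation on a concept reaches every concept below it, even when a lower concept also has another, unrestricted parent. How the multiple-parent case should actually be treated is the subject of later chapters.

```python
from collections import deque

# Hypothetical toy ontology: each concept maps to its child concepts.
# "Bioinformatics" has two parents, so the structure is a DAG, not a tree.
CHILDREN = {
    "Science": ["Computer Science", "Biology"],
    "Computer Science": ["Database", "Bioinformatics"],
    "Biology": ["Bioinformatics"],
    "Database": [],
    "Bioinformatics": [],
}


def descendants(concept, children):
    """Return the concept together with every concept reachable below it."""
    seen = {concept}
    queue = deque([concept])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def accessible_concepts(children, negative_authorisations):
    """Open policy: a concept is accessible unless a denial reaches it.

    Assumption for this sketch: a denial on a concept is inherited by all
    of its descendants, whichever parent path leads to them.
    """
    denied = set()
    for concept in negative_authorisations:
        denied |= descendants(concept, children)
    return set(children) - denied
```

With a negative authorisation on Biology, `sorted(accessible_concepts(CHILDREN, {"Biology"}))` evaluates to `['Computer Science', 'Database', 'Science']`. Bioinformatics is withheld under the stated assumption even though its other parent, Computer Science, remains accessible.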

1.2 Contribution of The Thesis

The entire thesis stands on four major ideas. The ideas are as follows:

• Modelling Digital Library Metadata as a Polyhierarchic Ontology : Polyhierarchic
structure is very common in semantic data modelling, and several research reports
have observed it in their applications. For example, Spampinato and Zangara [142]
have attempted to develop a Web information system for digital humanities work in
Classical Antiquity and to publish data according to Linked Open Data principles.
They used the “Bibliotheca Iuris Antiqui” (BIA) system, a vertical framework in the
Ancient Roman Law domain, to design the application. In the modelling section, they
found that the BIA Thesaurus is polyhierarchical. In another interesting work,
Bodenreider et al. [25] have shown that polyhierarchic structure is also necessary
for the semantic modelling of biomedical data. A Digital Library (DL), like any other
library management system, supports documents related to different subject areas.
These documents may be collected in one place or may be distributed across many
different repositories, seamlessly integrated to offer a composite view to a user.
The metadata structure provided for bibliographic search makes the source of a
document transparent to any user of the library. In a library, related documents are
usually referred to by a common index. The most common classification systems are
Dewey Decimal Classification (DDC) [131] and Library of Congress Classification
(LCC) [37]. These systems place every document in one of a set of distinct classes
and ignore the underlying knowledge relationships of the document. Hence, if a
document contains multi-disciplinary knowledge, it should be available under all the
concerned parent concepts, which requires a representation structure with multiple
parents. The current model considers the polyhierarchic structure of digital library
metadata.

• Access Control Model for Polyhierarchic Structure : Access control for semantic
data has drawn the interest of many research groups. From standalone web servers to
enterprise security, researchers have devised various techniques for fine-grained
access control. The main contributions of those works relate to tree-structured data
such as XML [16, 43, 58], or to OWL/RDF [52, 85] data restricted to a tree
structure. Modern data representation techniques such as OWL, RDF and JSON are
capable of representing polyhierarchic structure. Moreover, many application areas,
such as digital libraries, online social networks and biological networks, need
polyhierarchic structure. However, access control for polyhierarchic structure is
hitherto unexplored, and the available technology cannot handle its security
requirements efficiently. This work considers the access control requirements of the
polyhierarchic structure.

• User Group Based View Generation : In an organisation such as a university or
research institute, users are arranged into several groups, such as Student, Research
Scholar, Teaching Assistant and Faculty. There may also be different verticals, such
as departments, schools, the academic office, the examination cell and external
collaborations. A user may be a member of one or more such groups. For example, a
student may be a member of a department and also be assigned to a particular research
group. Hence, these groups are interrelated, and authorisations are inherited from a
super-group to a sub-group. The interaction between the group hierarchy and the
ontology hierarchy, and the conflicts arising out of this interplay, have been
studied in detail.

• Administrative Scope and Conflict Resolution : The administrative scope is divided
into two major parts: managing the digital library ontology and managing users and
user-groups. The model adopts a graph transformation approach for both. Management of
the ontology includes the addition, deletion and alteration of concepts; management
of users includes addition, deletion and permission management. Administrative
functions also handle conflict of interest and separation of duties.
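As a rough illustration of the ontology management operations listed above, the following Python sketch adds a concept under several parents while keeping the hierarchy acyclic. The thesis formalises administration with graph transformation rules (Chapter 5); this sketch is not that formalism. The adjacency-map representation, the example concepts and the names `reaches` and `add_concept` are assumptions made only for the example.

```python
def reaches(children, start, target):
    """Return True if target can be reached from start via child links."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(children.get(node, []))
    return False


def add_concept(children, concept, parents):
    """Attach concept under every listed parent, keeping the graph a DAG.

    All checks run before any change is made, so a rejected request
    leaves the hierarchy untouched.
    """
    for parent in parents:
        if parent not in children:
            raise KeyError(f"unknown parent concept: {parent}")
        # An edge parent -> concept closes a cycle exactly when the parent
        # is already reachable from the concept.
        if reaches(children, concept, parent):
            raise ValueError(f"{parent} -> {concept} would create a cycle")
    children.setdefault(concept, [])
    for parent in parents:
        if concept not in children[parent]:
            children[parent].append(concept)


# Hypothetical hierarchy: an interdisciplinary concept gets two parents.
ontology = {"Science": ["Computer Science", "Biology"],
            "Computer Science": [], "Biology": []}
add_concept(ontology, "Bioinformatics", ["Computer Science", "Biology"])
```

After the call, Bioinformatics appears under both Computer Science and Biology. A later request such as `add_concept(ontology, "Science", ["Bioinformatics"])` raises `ValueError`, because it would place the root below one of its own descendants.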

Hence, the summary of the contribution is as follows :

• Polyhierarchic Structure of Ontology Based Digital Library : It provides better
knowledge representation for present-day digital libraries, where new
interdisciplinary subject areas keep being introduced. Concepts representing
interdisciplinary subject areas have multiple parents; consequently, the library
ontology introduces a new set of nodes representing document classes and thus
provides a faster search mechanism.

• Access Control Model for Digital Library : A new access control model has been
introduced for the ontology structure, in which a user is authorised to access a
concept node only if the user's credentials support it.

• Group Based View : A group-based view generation algorithm has been developed
so that a user belonging to a group accesses the ontology through the view designated
for that group. This avoids any possibility of undecidability in authorisation
specification.

• Conflicts among User-Group and Concept Hierarchies : Conflicts between the
user-group hierarchy and the concept hierarchy have been analysed and modelled in
this work, and possible solutions have been proposed.

• Implementation and Performance Monitoring : The implementation of the proposed
model is presented with UML diagrams, and a link to the Git repository (Java) is
given. Moreover, the performance (time, CPU, memory) of the algorithm has been
monitored on a relevant dataset collected and cleaned from a standard source, SNAP.

• Update Model : A collaborative learning environment is a feature of the modern
digital library. Hence, a model of secured update for collaborative learning has
been proposed.
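The group-based view mentioned above can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not the view generation algorithm of Chapter 5: the group names, concept names and function names are hypothetical, denials are assumed to be inherited from a super-group by its sub-groups, and a denied concept is assumed to hide every concept below it.

```python
# Hypothetical user-group hierarchy: each group maps to its super-groups.
SUPER_GROUPS = {"Faculty": [], "Member": [], "Student": ["Member"]}

# Hypothetical negative authorisations stated directly on a group.
GROUP_DENIALS = {"Faculty": set(), "Member": {"Chemistry"}, "Student": set()}

# Hypothetical concept hierarchy: concept -> child concepts (a DAG).
CONCEPT_CHILDREN = {
    "Science": ["Physics", "Chemistry"],
    "Physics": ["Physical Chemistry"],
    "Chemistry": ["Physical Chemistry"],
    "Physical Chemistry": [],
}


def effective_denials(group, super_groups, group_denials):
    """Collect denials stated on the group and on all of its super-groups."""
    denials, stack, seen = set(), [group], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        denials |= group_denials.get(current, set())
        stack.extend(super_groups.get(current, []))
    return denials


def group_view(group, super_groups, group_denials, concept_children):
    """Return the part of the concept hierarchy visible to the group."""
    hidden = set()
    stack = list(effective_denials(group, super_groups, group_denials))
    while stack:  # a denied concept hides everything below it (assumption)
        concept = stack.pop()
        if concept not in hidden:
            hidden.add(concept)
            stack.extend(concept_children.get(concept, []))
    return {
        concept: [child for child in kids if child not in hidden]
        for concept, kids in concept_children.items()
        if concept not in hidden
    }
```

Here `group_view("Student", SUPER_GROUPS, GROUP_DENIALS, CONCEPT_CHILDREN)` returns `{'Science': ['Physics'], 'Physics': []}`: the denial on Chemistry is inherited from Member, and Physical Chemistry disappears with it although Physics is still visible. Whether a multi-parent concept should be hidden in such a case is exactly the kind of question the conflict analysis in this thesis addresses, so the behaviour shown is a simplifying choice of the sketch.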

1.3 Outline of The Thesis

In Chapter Two, a brief history of the evolution of the digital library is studied. It
covers the period from the first digital library proposals published by the NSF during
1994-98 to the modern ontological digital libraries proposed by various research groups.
It also covers important aspects of this evolution, such as document annotation, semantic
web-based document repositories, distributed content management and metadata processing.
The digital library is a multidisciplinary research area; hence the first few years of
digital library research were very much ad hoc in nature. Later, a few groups from
research laboratories in multiple disciplines designed and published complete models of
the digital library. Chapter Two covers some of those models.
In Chapter Three, the most important concept, the polyhierarchic structure, and its
relevance to digital library ontology are discussed. The discussion includes the structure
and the semantic relationships among different concepts. This chapter also introduces
digital library ontology with a formal model and an example.
Chapter Four is about access control mechanisms and existing technology. In this chapter,
existing fine-grained access control mechanisms for XML and OWL are discussed. The chapter
also provides a formal representation of the access control model.
Chapter Five discusses the details of the view generation algorithm and the administrative
scope. The administrative model uses graph transformation to perform operations such as
add, alter and delete. Details of these operations, with algorithms, are provided in this
chapter.
In Chapter Six, the multiuser model with a group hierarchy is introduced. The interaction
of the group hierarchy with the ontology hierarchy is the primary idea of this chapter.
Conflict of interest and separation of duties are discussed for this model, and the
resolution of conflicts is also provided.
In Chapter Seven, the implementation model of the view generation algorithm is discussed.
The model is implemented in Java, and its components are described using UML. The
algorithm has been tested using SNAP data, and performance monitoring details such as
CPU, memory and time are shown.
In Chapter Eight, an extension of the model to support the collaborative writing facility
of modern digital libraries is discussed. This chapter provides a provision and obligation
based locking protocol to ensure security and integrity during updates of the digital
library.
Chapter 2

Digital Library Past and Present

2.1 Digital Library Research : A Brief History

Perceptions of digital libraries vary and change over time. Different research groups have
conceptualised them in several ways. For librarians, a digital library is another form of a
physical library. For computer scientists, it is a text-based information system deployed
over a network, or an interactive multimedia-based network system. For end users, however,
a digital library is a web-based information system with a collection of documents in the
form of texts and other multimedia elements. This breadth has made the digital library a
multidisciplinary research domain involving research groups from different disciplines.
As a consequence, during the first 20 years of digital library research, many ideas
evolved on an ad hoc basis [34, 75, 76].

2.1.1 Evolution of Digital Library

In the United States, digital library research initiatives were counted among the most
significant projects under the High-Performance Computing and Communications Initiative,
and were acknowledged as a “National Challenge Application Area.” During the period
1994 to 1998, research in this area involved three US agencies and a few universities
[53], [54], [55]. After the successful completion of Phase I, the initiative was extended
to Phase II with eight agencies. Digital Library Initiatives I and II, funded by the
National Science Foundation (NSF), USA, and other agencies, integrated technical, social,
behavioural and economic research to develop a prototype digital library architecture.
The NSF-funded project also produced two further significant outcomes [27, 28]:

• Digital library is a set of electronic resources and associated technical capabilities
for creating, searching and using information. More precisely, it is an extension and
enhancement of information storage and retrieval systems that manipulate digital
data in any medium (such as text, images, sounds, static or dynamic images),
and it is deployed over a distributed network. The content of a digital library
includes data, metadata that describes various aspects of the data
(e.g. representation, creator, owner, reproduction rights) and also metadata that
consists of links or relationships to other data or metadata, whether internal or
external to the digital library.

• Digital library is usually constructed, collected and organised by (and for) a
community of users. Its functional capabilities support the information needs and
benefits of that community, in which individuals and groups interact with each
other by utilising data, information and knowledge, resources and systems. In this
sense, digital libraries are the extension, enhancement and integration of a variety of
information institutions as physical places, where resources are selected, collected,
organised, preserved and accessed in support of a user community. These information
institutions include, among others, libraries, museums, archives and schools,
but digital libraries also extend to and serve other community settings, including
classrooms, offices, laboratories, homes and public spaces.

Digital library research is of interest to many diverse research communities worldwide
[106][108][110][159][23][27][95]. Consequently, an international digital library program
was recently declared by the National Science Foundation. The United Kingdom has
put forward the Electronic Libraries Programme (eLib)¹. Moreover, in recent
decades, many other nations, such as Italy, Germany and India, have taken initiatives
and granted funds [156] [144][48][27].

¹ http://ukoln.bath.ac.uk/elib/

2.1.2 Research and Practice

Requirements of the modern digital library have been documented by several authors [7,
47, 66, 121, 157]. The Association of Research Libraries (ARL) and some other organisations
have framed digital library requirements. Many organisations, such as the Association
for Computing Machinery (ACM), the Institute of Electrical and Electronics Engineers
(IEEE), Elsevier and Springer, have published their repositories and documents in the form
of digital libraries, and most of the leading universities have built their own online
digital repositories [3–6]. For example, a system designed for this purpose went online in
August 1991. This system, originally named the e-print archive and later popularly known
as arXiv, was built to make scientific communication more effective and economical,
although it was restricted to the physics community. The arXiv system opened the way to
dealing with the social and economic issues related to open access to the outputs of
publicly funded research, which was later stated in the Berlin Declaration [34]. In spite of
its technical restrictions, it can be seen as a prototype of the modern institutional digital
library, i.e. a system providing functionalities for managing self-publications [111].
Likewise, many research groups have made significant innovations in the digital library
domain to address different issues. Early examples such as the Electronic Thesis and
Dissertation Repository (ETDR), the “Archive of Cognitive Sciences Paper” and “Research
Papers in Economics (RePEc)” were launched during 1996-97. All these initiatives are the
founding stones of the Open Archive Initiatives (ETDR) [56, 57], [51, 94, 100]. These
pioneering ideas have also enriched other areas such as information storage and retrieval,
as well as storage technologies. They also helped in defining digital library metadata,
i.e. data about the documents. Essentially, during that time the digital library research
community started to build the requirements of the new digital library design, which
focused on providing the functionality of a traditional library, i.e. acquisition,
cataloguing (or indexing), search and discovery of information, in the context of a
distributed system [14]. Common elements of the digital library have been identified by
ARL; they can be studied for historical perspective and give a very high-level view of the
digital library. The assumptions are:

• The digital library is not a single entity.

• The digital library requires technology to link the resources of many.

• The linkages between the many digital libraries and information services are trans-
parent to the end users.

• Universal access to digital libraries and information services is a goal.

• Digital library collections are not limited to document surrogates: they extend to
digital artefacts that cannot be represented or distributed in printed formats.

All these different definitions widened the scope of the digital library in different
directions and encouraged research on digital library development in different
disciplines. As a matter of fact, digital library research has moved beyond library
science to include various other research areas such as information science, web
technology, search and security. In the process of collecting and arranging these
components, digital library research has encountered several new challenges, which
encompass the information-related activities of multiple participating institutions.

2.1.3 Evolving the Web beyond simple linking

Development of hypermedia has structured information as a semantic network of nodes
connected by semantic links. It helps in representing a document as an interlinked
structure instead of the linear or sequential structure required for print media [22]. It
has given rise to research efforts on portable document formats such as XML, EPUB
and mobi [141]. For example, the “X-Ray” application of Amazon Kindle creates an
intelligent index of the content of e-books, which increases the readability and flexibility
of e-books over normal documents. Furthermore, recent developments in the semantic web
and hypertext systems have given digital library research a new dimension.
The Web provides a rudimentary infrastructure for publishing digital libraries through
various types of semantically associated metadata, over static links [22, 60, 139]. The
semantic web has already demonstrated its capability to set up bi-directional links and
connected components. Additionally, current computational capability supports secure data
transfers and secure queries on web devices using safe channels. Storage technology has
further improved to support easy and reliable storage facilities. For instance, most
e-book readers provide large storage for the user, and some of them also use cloud
storage for keeping documents and increasing availability on demand. Many organisations
have already adopted distributed and heterogeneous communications through
web services. For example, some universities in the U.S. have set up a project to share
library data [145]. They have set up a peer-to-peer channel and a materialised view of each
active peer, which is a very contemporary example of resource sharing in the digital
library domain.

2.1.4 Document Annotation Systems

Document annotation systems have evolved for semantic documents. Annotation can be done
either automatically [46] or manually [83]. In both cases, semantic annotation of
documents can help to qualify their contents, enable search and retrieval, and support
collaboration. The purpose of an annotation system is to annotate each segment in a
document with the most specific concepts. Readers then activate these annotations as a
guide to access the required information [113]. Originally, annotation was meant for
highlighting and writing comments in the margins of a document during reading.
Computer-based annotations, however, can also be used for a variety of tasks [30].
Annotation can be used for collaborative writing and many other supportive uses of
documents [12]. Furthermore, retrieval of contents, versioning of documents and security
parameters can be handled through document annotations. Many commercial products have
adopted document annotation technology for different purposes. For example, Microsoft
Office and Lotus Notes provide various annotation-based facilities, ranging from inserting
comments to recording a user’s modification details. Similarly, some organisations use
document annotation technology to secure their intellectual property. For instance,
CommonSpace, one of the most popular Peer-Review Evaluation Process (PREP) systems, uses
this facility for interaction between instructors and students. It is a collaborative system
where a student can write a paper and the teacher can respond to the paper. The
students can revise the paper based on the comments received. The teacher can then
view all versions of the paper with comments attached.
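The idea of annotating each segment of a document with a concept, and letting readers use
the annotations as a guide, can be illustrated with a minimal sketch. The names
Annotation and segments_for, and the concept labels, are assumptions made for this
illustration; they are not taken from any of the annotation systems cited above.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    start: int          # character offset where the annotated segment begins
    end: int            # character offset just past the end of the segment
    concept: str        # the most specific concept assigned to the segment
    author: str         # who made the annotation
    comment: str = ""   # optional free-text remark, as in a margin note


def segments_for(concept, text, annotations):
    """Return the text segments annotated with the given concept."""
    return [text[a.start:a.end] for a in annotations if a.concept == concept]


first = "Access control restricts who may read a document."
second = "Ontologies organise concepts."
text = first + " " + second

annotations = [
    Annotation(0, len(first), "AccessControl", "reviewer"),
    Annotation(len(first) + 1, len(text), "Ontology", "reviewer", "expand this"),
]

ontology_segments = segments_for("Ontology", text, annotations)
```

Storing offsets rather than copies of the text keeps the document itself unchanged, which
is one simple way to let several users annotate the same version.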

2.2 Semantic Web and Digital Library

It is important to understand how Semantic Web deployment has influenced information
management. It has also contributed significantly in areas such as resource discovery,
metadata manipulation and interoperability [91]. The semantic web represents digital
library components in web standards and provides interfaces for controlling the data sets
through a standard Application Program Interface (API) such as the Web Ontology Language
(OWL) API or Apache Jena. Some major projects related to digital library ontology are
BRICKS, SIMILE, Fedora, JeromeDL and Talia. Both BRICKS [71] and Fedora [124] provide a
basic architecture through which digital library applications can be acquired and deployed.
The idea of the BRICKS project is to design, develop and maintain a user- and
service-oriented infrastructure to share knowledge in the cultural heritage field. It also
offers many useful services on top of which each content provider can prepare its own
application and user interface. Likewise, Fedora is an interoperable, distributed
repository service that can serve as a fundamental component in an open digital library
infrastructure. Furthermore, the functionality of Fedora is enhanced through the SPARQL
Protocol and RDF Query Language (SPARQL). The SIMILE project focused on interoperability
among digital assets, schemata, ontologies and metadata. Although SIMILE is not a
full-fledged digital library system, it provides a set of tools for discovering and
navigating digital resources on the Web, and was originally motivated by DSpace in order
to leverage and extend its facilities.
JeromeDL is more focused on developing a social semantic digital library, which provides
resilient interpretability and flexibility for users [96]. JeromeDL uses the Resource
Description Framework (RDF)/OWL for storing metadata, along with MarcOnt as a mediator,
which is managed by a corresponding RDF repository (Sesame). In addition, it provides
a query interface through an RDF/OWL query service such as SPARQL [97].

2.2.1 Europeana Data Model and Linked Data For Digital Library

The Europeana Data Model (EDM) provides a flexible data model that has been used
for digital library, archival storage, cultural heritage and other applications. EDM was
introduced to replace the conventional Europeana Semantic Elements (ESE), and it provides
a more sophisticated and usable data framework. EDM has been funded by the
European Commission under the “eContentplus” program 2008, which is the continuation of
the EDLnet thematic network that created the European Digital Library (EDL) Foundation and
the European prototype [9]. The design principles underlying the EDM are based
on the core functionality. The best practices of the Semantic Web and Linked Data efforts
are the most significant contributions of this project. EDM started in 2005 and by the
end of 2006, the European Commission had designated the “European Digital Library” as
a flagship project within the i2010 strategy [123]. The aim of this initiative
was to deliver integrated access to millions of digital objects from cultural collections,
distributed across a wide range of museums, archives and libraries in Europe.

2.3 Modeling of Digital Library

This section reviews contemporary digital library models. The first and most
significant model is the 5-S model. It considers all the utilitarian aspects of the digital
library, from digital objects and metadata to the grouping of users with similar interests
into virtual organisations. On the other hand, the DELOS Digital Library Reference Model
(DRM) is more focused on developing a query language for a core digital
library management system at the level of complexity required. Both models consider an
ontological representation of the digital library based on the RDF/OWL structure.

2.3.1 5-S Model of digital Library

The 5-S model of digital library [64, 65] is a formal model that defines high-level
abstractions such as digital objects, collections, repositories and services. The 5-S model
of digital library can be represented as follows:

• Stream : Streams are sequences of elements of an arbitrary type. A Stream can
be a set of streams, as well.

• Structure : A structure specifies the way different parts of a library are arranged
or organised. In digital libraries, structures can represent hypertexts, taxonomies,
system connections, user relationships, containment, data-flows, and workflows, to
cite a few. Structs is a set of structures, which can be represented as (G, φ) where
G = (V, E) is a directed graph and φ : (V ∪ E) → L is a labelling function.

• Spaces : A space is any set of objects together with operations on those objects
that obey certain rules. Despite the generality of this definition, spaces are
mathematical constructs. Operations and rules associated with such constructs define
properties of a space. Spaces are distinguished by the operations on their elements.
Digital libraries can use many types of spaces for indexing, visualising, and other
services that they perform.

• Scenarios : A scenario is a story that describes possible ways to use a system to
accomplish some function that the user desires. Scenarios are useful as part of the
process of designing information systems. More technically, a scenario is a sequence
of events that represents changes in the computational state of the digital library.
$Scs = \{sc_1, sc_2, \ldots, sc_d\}$ is a set of scenarios, where each
$sc_k = \langle e_{1k}(p_{1k}), e_{2k}(p_{2k}), \ldots, e_{d_k k}(p_{d_k k}) \rangle$ is a
sequence of events, each of which can have a number of parameters $p_{ik}$.

• Society : A society is a set of entities and activities and the relationships between
them. It is a tuple $(C, R)$, where $C$ is a set of communities and $R$ is a set of
relationships among communities. $SM = \{sm_1, sm_2, \ldots, sm_j\}$ and
$Ac = \{ac_1, ac_2, \ldots, ac_r\}$ are two such communities, where the former is a set of
service managers responsible for running DL services and the latter is a set of actors
that use those services.

• Collections : A collection $C = \{do_1, do_2, \ldots, do_k\}$ is a set of digital objects.
Each digital object is a tuple of a unique identifier, stream, structure and structured
stream.

• Catalogs : Catalogs are sets of metadata catalogs, where each catalog is a
descriptive metadata specification.

• Repository : A repository is a set of catalogs and collections. It is assumed that
there exist operations to manipulate them (e.g., get, store, delete).

• Service : A service $Se_k = \{sc_{1k}, \ldots, sc_{s_k k}\}$ is described by a set of
related scenarios.
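
The definitions above can be made concrete with a small sketch. It is only an
illustration under stated assumptions: the class names Structure, Event and
DigitalObject, the helper make_service, and the sample handle and labels are all
hypothetical and are not part of the 5-S model or of 5SL.

```python
from dataclasses import dataclass, field


@dataclass
class Structure:
    """A structure (G, phi): a directed graph G = (V, E) with a labelling function phi."""
    vertices: set
    edges: set      # set of (source, target) pairs
    labels: dict    # phi: maps every vertex and every edge to a label in L

    def is_well_formed(self):
        # Every edge must join known vertices, and phi must label all of V and E.
        endpoints_ok = all(s in self.vertices and t in self.vertices
                           for s, t in self.edges)
        labelled = all(x in self.labels for x in self.vertices | self.edges)
        return endpoints_ok and labelled


@dataclass
class Event:
    """One event e(p) of a scenario, with its parameters p."""
    name: str
    params: tuple = ()


@dataclass
class DigitalObject:
    """A digital object: identifier, streams, structures and structured streams."""
    handle: str
    streams: list
    structures: list
    structured_streams: list = field(default_factory=list)


def make_service(*scenarios):
    """A service is described by a set of related scenarios (event sequences)."""
    return [list(scenario) for scenario in scenarios]


toc = Structure(
    vertices={"thesis", "ch1", "ch2"},
    edges={("thesis", "ch1"), ("thesis", "ch2")},
    labels={"thesis": "document", "ch1": "chapter", "ch2": "chapter",
            ("thesis", "ch1"): "contains", ("thesis", "ch2"): "contains"},
)
etd = DigitalObject(handle="hdl:example/1", streams=[b"%PDF..."], structures=[toc])
collection = {etd.handle: etd}   # a collection is a set of digital objects

search = [Event("submit_query", ("ontology",)), Event("return_results")]
browse = [Event("open_collection"), Event("list_objects")]
searching_service = make_service(search, browse)
```

Keeping the labelling function as a plain dictionary keyed by both vertices and edges
mirrors the domain V ∪ E of φ directly.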

Besides the formal definition, to improve acceptability and interoperability, the model
attempts to use existing standard specification sublanguages for representing digital
library (DL) concepts. It uses XML syntax and XML/RDF schema for identifying
the structure and metadata, MIME types to encode streams, and RuleML and MathML for
representing event scenarios and evaluating them. The general process of automatic
creation of DLs, with a particular application, is shown in Figure 2.1. The conceptual
design of a digital library is normally preceded by a 5S analysis. Declarative
specifications in 5SL are then fed into a DL generator to produce tailored DLs, suitable
for specific platforms and requirements. The DL generator is compatible with MARIAN,
which is a standard digital library interface.

Figure 2.1: 5-S Model DL Generation process

2.3.2 The DELOS Digital Library Reference Model

DELOS is another model of digital library [31–33]. It provides a three tier structure for
Library and the Library Management System.

• Digital Library (DL) : A virtual organisation that comprehensively collects,
handles, and preserves digital content, and offers it to its user communities with the
required functionalities and specifications.

• Digital Library System (DLS) : A software system that is established on a defined
(perhaps distributed) architecture and offers all the functionalities required by a
particular Digital Library.

• Digital Library Management System (DLMS) : A generic software system that
provides the appropriate software infrastructure, both:

– to produce and administer a Digital Library System incorporating the suite
of functionalities considered foundational for Digital Libraries and
– to integrate additional software for offering more refined, specialised, or
advanced functionalities.

DLMS may provide a mechanism to make a platform for Digital Library Systems with
all its operational and technological prerequisites.

Figure 2.2: DELOS Digital Library Reference Model

• Extensible Digital Library System : A complete Digital Library System that is
fully operational with respect to a defined core suite of functionalities.

• Digital Library System Warehouse : A collection of software components that
encapsulate the core suite of DL functionalities, and a set of tools that can be
utilised to merge these ingredients in different ways (in Lego®-like fashion) to
offer tailored integration of functionalities.

• Digital Library System Generator : A highly parameterised software system that
encapsulates templates covering a wide range of functionalities, including a defined
core suite of DL functionality as well as any advanced functionality that has been
deemed appropriate to conform to the needs of a certain specific application field.

• Content : The Content concept encompasses the data and information that the
Digital Library handles and makes available to its users. The concept encompasses the
diverse range of information objects, including such resources as objects, annotations,
and metadata. For example, metadata have a central role in the handling
and use of information objects, as they provide information critical to their
syntactical, semantic, and contextual interpretation.

• User : The User concept covers the various actors (whether human or machine)
entitled to interact with Digital Libraries.

• Functionality : The Functionality concept encapsulates the services that a Digital
Library offers to its different users, including classes of users or individual users.

• Quality : The Quality concept represents the parameters that can be used to
characterise and evaluate the content and behaviour of a Digital Library.

• Policy : The Policy concept represents the set or sets of conditions, rules, terms and
regulations governing interaction between the Digital Library and users, whether
virtual or real.

• Architecture: The Architecture concept refers to the Digital Library System entity
and represents a mapping of the functionality and content offered by a Digital
Library onto hardware and software components.

The digital library model of DELOS has been adopted by several European research groups
and is also appreciated by several other agencies around the world. It aims at providing
common semantics that can be used unambiguously across and between different application
areas, both to explain and organise existing digital library systems and to support
the evolution of research and development in this area. The main contribution of the
DELOS project was to establish the conceptual design of a Digital Library.

2.4 Contemporary Data Model of Digital Library

2.4.1 Data Model of 5-S Incorporating Complex Object of Digital Library

The content of a digital library may vary across different kinds of objects, from a
simple text document to books and multimedia objects. However, the 5-S model accommodates
all kinds of objects and places them in one of three classes:

• Atomistic : A single file in a preferred format (made up from a single or multiple
data/files). An electronic thesis and dissertation (ETD) in PDF format is an
example.

• Compound : Multiple content files in different formats representing a single digital
object. An example is an ETD that has text files and image files or other data
in non-textual format.

• Complex : A network of digital objects within a repository, having at least one
of these data as a digital object. An example is an ETD composed of a software tool
developed during the research, along with the respective PDF files.

A Complex Object (CO) is a single entity that contains multiple objects. Several
significant research efforts have been reported on the characteristics of COs. Krafft et al.
[93] and Cheung et al. [38] have developed models for complex objects and for the
encapsulation of various datasets and resources within a single unit, for publishing and
exchange. Hence, a complex object is an aggregation of similar types of objects that can be
grouped together for a particular purpose.
The formal definition of a complex object can be represented through the 5-S model [64] as
follows.
A complex digital object in the 5S framework is a tuple
$cdo = (h, SCDO = DO \cup SM, S)$ where
$h \in H$, with $H$ a set of universally unique handles;
$DO = \{d_1, d_2, d_3, \ldots, d_n\}$, where each $d_i$ is a digital object or another complex object;
$SM = \{sm_1, sm_2, sm_3, \ldots, sm_n\}$ is a set of streams;
$S$ is a structure that composes the CO ($cdo$ into its parts in $SCDO$).
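
The tuple above can be sketched as a small recursive data structure. This is a minimal
illustration, assuming that plain strings stand in for simple digital objects and that the
structure S is a mapping from a part name to its child parts; the class name
ComplexObject, its methods and the sample handles are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ComplexObject:
    handle: str                                     # h, drawn from the handle set H
    objects: list = field(default_factory=list)     # DO: digital objects or nested COs
    streams: list = field(default_factory=list)     # SM: streams
    structure: dict = field(default_factory=dict)   # S: part name -> list of child parts

    def scdo(self):
        # SCDO = DO ∪ SM: everything that the structure S can arrange.
        return self.objects + self.streams

    def handles(self):
        # Collect the handles of this object and of all nested complex objects.
        found = [self.handle]
        for part in self.objects:
            if isinstance(part, ComplexObject):
                found.extend(part.handles())
        return found


# An ETD made of a PDF file plus a software tool that is itself a complex object.
tool = ComplexObject("hdl:example/tool", objects=["tool.py", "README"])
etd = ComplexObject(
    "hdl:example/etd",
    objects=["thesis.pdf", tool],
    streams=[b"abstract text"],
    structure={"etd": ["thesis.pdf", "hdl:example/tool"]},
)
```

Because an element of DO may itself be a complex object, the definition is recursive,
which the handles method makes explicit.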

The types of streams in 5SL are restricted to some basic streams of the model, although
they can be extended with new types of stream in future. A compound object may have
more than one type of stream/object compared to an atomistic object. According to its
nature, an object can be identified as SimpleObject or ComplexObject. Moreover, the CO
extension in 5SL can be used to specify DLs of COs using the 5S Graph tool [92]. The
following extensions were included in the new 5S Graph meta-model:

• The original component Document was renamed to SimpleObject.

• The new 5SL extension Object was added.

• The new 5SL extension for ComplexObject was added. The ComplexObject con-
cept was shortened to ComplexObj so the ComplexObj word can be viewed in its
entirety in the 5S Graph interface.

2.4.2 DRM : Extension of DELOS Data Model

DRM [117] is an extension of the DELOS data model. In this sub-section, the central idea of
this data model is explained. The DRM aims at providing an ontology for digital
libraries, defining all relevant concepts in an informal, yet precise language. These
concepts are grouped in the DRM under six main categories: content, user, functionality,
quality, policy, and architecture, which have been described in the DELOS model [31, 32].
This model is more focused on the user’s perspective than on a generalised view like the
5-S model. Specifically, the user’s activities in a digital library are considered in the
following ways:

• Create a new, complex object by re-using other existing objects as its content

• Provide representations of a created object.

• Describe an object of interest according to some vocabulary.


• Discover objects of interest based on content or description.

• View the representations, the content or the description of an object.

• Identify an object of interest, in the sense of assigning an identity to it.

Versioning of objects is another major issue in digital libraries. However, versioning
and editing of documents have been left out of the scope of the DRM model. DRM
defines the digital library on the basis of a natural formalism, namely logic, and then
shows how to implement the logic using standard mechanisms. The model therefore
focuses on the ground truth of a digital library, which distinguishes it from an ordinary
information retrieval system. An information retrieval model also assigns a degree of
importance to each object retrieved by a query. In contrast, the digital library model
presented by DRM does not go as far as specifying the degree of relevance of retrieved
objects with respect to a query, but only which objects qualify in response to a
query.

• Object(o) : The basic premise of this model starts from Digital Library Object,
which is a piece of information in digital form such as a PDF document, a JPEG
image, a text, a Uniform Resource Identifier (URI), and so on. As such, a digital
object can be processed by a computer; for instance, it can be stored in memory
and displayed on a screen [35, 116].

• Identifier(i) : An identifier identifies an object and is itself an object. Hence, if
an object is used to identify another object, then the former is called an identifier.
Identifiers are digital objects and are used to refer to other digital or non-digital
objects. A digital library is defined on a subset of o that may or may not include
identifiers. When needed, an identifier can be taken from the set ID and used in
the digital library. The identifiers in a digital library may refer to digital objects
residing in the library or in a different digital library. An identifier cannot refer to
an identifier.

• Representation : The representation of an object is important for perceiving the object.
The representation in the DRM model is indicated by view(o). The function view is
constructed incrementally during the operation of the digital library, whenever a user
inserts an object o into it. Hence, for each object o there exists a representation,
which is denoted by view(o). Unlike a concrete object, which has a unique
representation, an identifier i may have any number of representations or no
representation at all. Each representation of i conveys a different perception of the
object o referred to by i. In addition, i may hold other representations of o. Each such
representation must be the view of some digital object different from o. For example,
i may contain references to the book ‘The Alchemist’ written by the eminent author
Paulo Coelho, which may be the URI of the document or simple file system references.
Hence, the representation of i must include view(o), i.e. the view of the object in
e-book or PDF format, which is suitable for the book.

• Content : The content of an object o is the collection of the parts of that object;
each such object is called a part of o. For example, each chapter of a book
is a part of that book. Similarly, each paragraph is a part of the chapter, and
each sentence is a part of the paragraph. In real applications, content takes more
sophisticated forms offering, for instance, the distinction between re-usable and
non-re-usable content, or allowing the parts of an object to be ordered.

• Descriptions : Descriptions are central to a digital library, and they are
primarily metadata. Desc(d, i) is the typical representation of a description in the
DRM model. Desc expresses the fact that d is the identifier of the description
associated with the object referred to by i. It is important to notice that
identifiers allow the digital library to hold descriptions of concrete objects that
are not in the digital library (i.e., residing in a different digital library), as well
as descriptions of non-digital objects. A description consists of two components: 1. A
class in a description captures a salient feature that is modelled as a separate
concept; typical concepts for describing digital objects may be image or video. 2.
A property-value pair in a description captures an attribute.
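
The three features listed above (representation, content and description) can be sketched
as simple relations. This is an illustration only, not the DRM axiomatisation: the class
DigitalLibrary, its method names and the sample identifiers are assumptions, and the
part-of relation is assumed to be acyclic.

```python
class DigitalLibrary:
    def __init__(self):
        self.view = {}          # object or identifier -> list of representations
        self.part_of = set()    # (part, whole) pairs
        self.desc = set()       # (d, i): d identifies a description of the object named by i
        self.classes = {}       # d -> set of class names in the description
        self.properties = {}    # d -> {property: value} pairs in the description

    def insert(self, obj, representation):
        # view is built up incrementally as objects are inserted.
        self.view.setdefault(obj, []).append(representation)

    def add_part(self, part, whole):
        self.part_of.add((part, whole))

    def describe(self, d, i, classes=(), **properties):
        self.desc.add((d, i))
        self.classes.setdefault(d, set()).update(classes)
        self.properties.setdefault(d, {}).update(properties)

    def parts(self, whole):
        # All direct and indirect parts of an object (assumes no cycles).
        direct = {p for p, w in self.part_of if w == whole}
        result = set(direct)
        for p in direct:
            result |= self.parts(p)
        return result

    def query(self, cls=None, **conditions):
        # Return the identifiers that qualify; no relevance ranking is computed.
        hits = set()
        for d, i in self.desc:
            if cls is not None and cls not in self.classes.get(d, set()):
                continue
            props = self.properties.get(d, {})
            if all(props.get(k) == v for k, v in conditions.items()):
                hits.add(i)
        return hits


dl = DigitalLibrary()
dl.insert("urn:book:alchemist", "alchemist.epub")
dl.add_part("chapter1", "urn:book:alchemist")
dl.add_part("paragraph1", "chapter1")
dl.describe("d1", "urn:book:alchemist", classes={"Book"}, author="Paulo Coelho")
```

The query method returns a set rather than a ranked list, which reflects the point that
the model only decides which objects qualify for a query.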

Using the above features, the DRM model provides a query language [117] in first-order
logic. The model also uses a first-order language for describing a digital library with
three fundamental features: representation, content, and description. Each of these
features is expressed in the language via a set of predicates, and the semantics is
determined by the axioms of the theory. The digital library is defined as one particular
type of model of the axioms, and the query language is used for knowledge discovery. Using
the above model, the digital library can be mapped to a Semantic Web-based application,
often termed a semantic digital library.

2.4.3 Digital Library Metadata

A digital library contains a managed collection of information in the form of various kinds
of digital objects such as documents, e-books and multimedia, together with associated
services such as indexing, document delivery and authorisation control. Information in the
digital library is manifested in the form of digital objects, and the information about
those objects is held as metadata. Although the distinction between data and metadata depends
upon the context, metadata commonly appears as information about the data in a
structured format. The most common type of metadata is descriptive metadata,
which occurs in catalogues and indexes and includes summaries of the digital library objects. The
descriptive metadata specification is inspired by developments in the modern metadata area
related to the Semantic Web, ontology data structures, and the Resource Description
Framework (RDF) [115]. Each digital library query can be arranged in the
form of a triple and submitted to the metadata search engine. For example, if a query
is to search for all the books written by the author ‘Paulo Coelho’, then the query will
be Query(?book, hasAuthor, “Paulo Coelho”). The search engine will find all the
documents to which ‘Paulo Coelho’ is linked by the hasAuthor relationship.
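
The triple-style query above can be illustrated with a minimal sketch. The metadata triples, the `book:` identifiers and the `query` helper are all hypothetical and introduced only for illustration; a real metadata search engine would normally use an RDF store rather than a Python list.

```python
from typing import Optional

Triple = tuple[str, str, str]

# Hypothetical metadata triples (subject, predicate, object); not taken from the thesis.
METADATA: list[Triple] = [
    ("book:TheAlchemist", "hasAuthor", "Paulo Coelho"),
    ("book:Brida", "hasAuthor", "Paulo Coelho"),
    ("book:Ulysses", "hasAuthor", "James Joyce"),
    ("book:TheAlchemist", "hasLanguage", "Portuguese"),
]


def query(subject: Optional[str], predicate: Optional[str], obj: Optional[str],
          triples: list[Triple] = METADATA) -> list[Triple]:
    """Return every triple matching the pattern; None acts as a variable such as ?book."""
    return [
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


# Query(?book, hasAuthor, "Paulo Coelho")
books = [s for s, _, _ in query(None, "hasAuthor", "Paulo Coelho")]
print(books)
```

Leaving a position as `None` plays the role of the variable `?book`; the other two positions must match exactly.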

2.5 Further Development In Digital Library

Various research communities are currently trying to evolve the digital library
into a more interactive system by improving human interaction, collaborative writing,
intelligent and multilingual recommendation systems, etc. Efficient search techniques and
recommendation systems are evolving with text analytics, NER indexing [84, 86], topic
modelling [150], etc. At the same time, some efforts have also been made to provide special
assistance to visually and hearing impaired people [39, 79, 149]. Document collection and
acquisition is another important area in present-day digital library research. This research
effort includes accumulating documents from heterogeneous data models and sources,
cleaning, storing, indexing and finally query processing [154].
Collaborative writing for a digital library is a relatively new issue and is gaining traction
among users [72, 104]. A writing and publishing facility with peer reviewing in the
digital library will reduce the editing time compared to the conventional
publishing procedure. Moreover, publishing through a digital library will be helpful for
rapid research and development for a larger audience.
Currently, many government agencies are building large nationwide knowledge repositories
to facilitate knowledge services in an organised manner. Apart from the United States
and the European Union, other countries such as India, China and South Africa are taking
initiatives to build national digital library facilities. Hence, new challenges, research
issues, and development problems are evolving. In the Indian context in particular, a
prototype of the National Digital Library [125] has already been developed under the
umbrella of the Indian Institute of Technology, Kharagpur. However, researchers have
identified issues such as storage heterogeneity, semantic search, and multilingual search
in the Indian context [44]. The first version of the Indian National Digital Library is now
available online at https://ndl.iitkgp.ac.in.

Chapter 3

Relevance of Polyhierarchic
Structure in Digital Library
Ontology

3.1 Polyhierarchic Structure In Ontology

Examples of polyhierarchic structures in building taxonomies have been well reported in
various journals and other publications [29, 101, 119, 151]. An ontology is one of the efficient
ways to model a polyhierarchic structure [11]. Extensive use of polyhierarchic structures
in ontology representation can also be found in [15]. A polyhierarchic ontology is a
representation that permits an object to be a member of a class which is a direct subclass
of more than one superclass. For example, the class “horse” may be represented as a
subclass of Equus, a zoological term, as well as a subclass of “racing animals,” “farm
animals,” and “four-legged animals.” The class “book” may be represented as a
subclass of “works of literature,” as well as a subclass of “wood pulp materials” and “inked
products” [15]. Ontologies are not subject to the analytical limitations
imposed by standard classification algorithms. In an ontology, a data object can be an
instance of many different kinds of classes; thus, a class does not define the essence of
the object as it does in a classification. In an ontology, the assignment of an object to
a class and the behaviour of the members of a class are determined by rules. An object
belongs to a class when it behaves like other members of the class according to a rule
set defined by the ontologist. Every class, subclass, and superclass is defined by rules,
and rules can be programmed in software. Moreover, the classifications were created
and implemented at a time when scientists did not have powerful computers
capable of handling sophisticated and complex structures such as polyhierarchies and of
building complex ontologies.

3.2 Polyhierarchic Structure for Digital Library Ontology

In the earlier section, the relevance of the polyhierarchic structure in ontology has been
discussed. Such a polyhierarchic ontology structure has already been considered for other
applications [99, 143, 151]. The use of an ontology as a metadata structure for a digital library
has already been discussed in [65]. However, that work did not consider a polyhierarchic structure.
The use of a polyhierarchic ontology for digital library metadata is an important
contribution of the present thesis. Under a polyhierarchic ontology, a concept representing a
subject area may become the subclass of more than one subject area. This consideration
becomes apparent for appropriate knowledge representation of an environment that
involves interdisciplinary subject areas. A digital library conventionally classifies documents on
the basis of a rigid, predefined hierarchical structure. This method is followed by both the
standard document classification systems, Dewey Decimal Classification (DDC) [131]
and Library of Congress Classification (LCC) [37], which place any document in one of
their classes. However, for a digital library (DL) ontology, documents are placed in
classes, usually called concepts in the ontology domain, and those concepts are then
linked by appropriate semantic relationships. In the case of a digital library (DL), a
document may include text, audio, video or any other multimedia instances. However, the earlier
effort on DL ontology [65] considered a strict hierarchy of concepts. As a result, the
underlying structure becomes a tree. In the case of a polyhierarchic DL ontology, the
underlying structure becomes a Directed Acyclic Graph (DAG).
In order to explain the importance of, and the requirement for, a polyhierarchic ontology, a
running example is used throughout the thesis. For example, in a DL ontology
a concept named Database may be defined as the sub-concept of three other concepts:
Computer Science and Engineering (CS), Geographic Information System (GIS) and
Biology/Bio-Informatics (BIO). This consideration changes the underlying structure of
the ontology from a tree to a DAG, as shown in Figure 3.1. Here, the edges connecting
different concept nodes are represented by two semantic relationships, Is-A or
part-of. In other words, in an ontology, different node pairs may be linked by different
semantic relationships. However, in this thesis, an edge between any two concept nodes
is represented by only one semantic relationship, isSubClassOf. This semantic
relationship connects the hierarchy of concepts representing different subject areas. The
augmented representation is shown in Figure 3.2.
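
The change from a tree to a DAG can be checked mechanically. The following is a minimal sketch, assuming the isSubClassOf edges are stored as a mapping from each concept to its direct parents. Only the Database and Bio-Informatics edges come from the running example; the other edges and all helper names are assumptions made for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# isSubClassOf edges: child -> set of direct parent concepts.
IS_SUBCLASS_OF = {
    "Database": {"CS", "GIS", "BIO"},   # from the running example
    "BIO": {"Science"},                 # from the running example
    "CS": {"Engineering"},              # assumed for illustration
    "GIS": {"Science"},                 # assumed for illustration
    "Science": {"DigitalLibrary"},      # assumed for illustration
    "Engineering": {"DigitalLibrary"},  # assumed for illustration
}


def is_dag(edges):
    """A hierarchy is a DAG exactly when a topological order exists."""
    try:
        list(TopologicalSorter(edges).static_order())
        return True
    except CycleError:
        return False


def is_tree(edges):
    """A strict hierarchy additionally allows at most one parent per concept."""
    return is_dag(edges) and all(len(parents) <= 1 for parents in edges.values())


def ancestors(concept, edges):
    """All super-concepts reachable by following isSubClassOf transitively."""
    seen, stack = set(), [concept]
    while stack:
        for parent in edges.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


print(is_dag(IS_SUBCLASS_OF), is_tree(IS_SUBCLASS_OF))
print(sorted(ancestors("Database", IS_SUBCLASS_OF)))
```

Because Database has three direct parents, `is_tree` fails while `is_dag` still holds, which is the structural difference the running example is meant to show.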
Now, the three parent concepts of Database may have distinct or even overlapping user
communities. As a result, any document under Database may be of interest to more
than one of the user communities related to the three parent concepts. If a
concept has multiple parent concepts, members of the different user communities corresponding
to the different parent concepts may wish to access different sets of documents, even
when those documents are covered by the same child concept. Consequently, the documents covered
by a concept can be categorised into a number of document classes. Figure 3.2 shows an
environment where documents covered by the concept Database may be contributed by,
or be of interest to, users of any of the parent concepts. In fact, a document under
the child concept Database can belong to one or more of the parent
concepts.

Figure 3.1: Example ontology structure with Poly-Hierarchy

3.2.1 Relevance of Document Classes

In a search engine like Google, relevant web pages are retrieved against a keyword. Using
its own page-ranking algorithm, the search engine retrieves the web pages in descending order of
importance. If any such ranking algorithm is applied to any of the concepts of a DL
ontology, the documents associated with that concept can also be retrieved in descending
order of importance. In addition, in a DL ontology, each user submits a credential at
the time of registration. This credential, among other things, reveals the areas of
interest of a user. The importance and use of credentials will be discussed in detail later.
Looking back at the running example shown in Figure 3.2, let a user Ui accessing the
concept Database have revealed in his/her credentials that his/her areas of interest are CS
and GIS. The ranking algorithm might place the documents
primarily of interest to BIO users at the top of the list and thus unnecessarily delay
the search process of Ui. So it would be beneficial for the users if the documents under
the concept Database were pre-classified according to the user communities defined by
its parents. This is required only for child concepts that have multiple parent
concepts. In the case of a single parent concept, a child concept will have all its documents
placed in only one document class. Documents under a child concept having n
parents can be classified into (2^n − 1) categories, one for each non-empty subset of the
parents. So, the documents under the Database
concept in Figure 3.2 can be classified into (2^3 − 1) or seven categories. Figure 3.3 shows
the Venn diagram for the documents corresponding to the concept Database having
three parent concepts, as mentioned earlier. The situation depicted in Figures 3.2 and
3.3 is very common in the case of a DL, and a document under the concept Database may
be of interest to the users of CS/GIS/BIO or any combination of them. Representing
an interdisciplinary subject area as a child concept of all its possible parent concepts
not only provides a better knowledge representation but also helps different user
communities to access their relevant documents faster.

Figure 3.2: An augmented ontology structure with the concept Database

Figure 3.3: Document Classes Under “Database”

As mentioned in an earlier work (Adam et al., 2002), the concerned research group
proposed to have a credential verifier
that allows a user to create a user profile. This credential in turn can provide relevant
information about possible subject areas in which a user may be interested. For instance,
a user from the GIS community, while accessing the concept Database as shown in Figure
3.2, would possibly be interested in accessing documents belonging to the document classes
2, 4, 6 and 7, as shown in Figure 3.3. Likewise, different user communities would
access different (partially overlapping) document classes. Consequently, a user from the
GIS community will get the set of documents relevant for him/her, and this set will be
different from the set of documents considered relevant for a user coming from the
computer science or bioinformatics community. This advantage will not be available if
the ontology hierarchy is considered to be a tree rather than a DAG. Moreover, in the
proposed system, the presence of three parents for the concept Database will give rise to
(2^3 − 1) = 7 document classes. The document classes relevant for each of the three user
communities representing the three parent concepts of Database are shown in Figure 3.4
and Table 3.1.

Figure 3.4: Document Class Abstraction

Table 3.1: Document Class

Document Class    Access
1                 CS
2                 GIS
3                 BIO
4                 CS, GIS
5                 CS, BIO
6                 BIO, GIS
7                 CS, BIO, GIS

Before discussing the semantic relationships among the different
nodes of the DL ontology, the related technological preliminaries are described below:

1. Ontology: An ontology is a directed graph O = (C, L), where the nodes are concepts
(C) and the edges (L) are relations. The fundamental objective of an ontology is to
form a semantic-network-based system that organises concepts in a directed graph
structure and provides a mechanism to search for a concept in such a structure
through which a given schema element is referred. It also finds other related
elements/concepts in the ontology. If O is an ontology represented as O = (C, L),
then C is the set of concepts, and L is the set of links between pairs of concepts
representing their semantic relationships.

• C is the set of concepts. A concept Cq ∈ C is defined as a 4-tuple of the form
  {n(Cq), P(Cq), Fp(Cq), Q̂}, where:
    ◦ n(Cq) is the name of the concept.
    ◦ P(Cq) is the set of properties of Cq.
    ◦ Fp(Cq) is the set of featuring properties of Cq, with Fp(Cq) ⊂ P(Cq).
    ◦ Q̂ is the domain reference of the concept Cq, or the reference to a cluster
      of schema elements from which Cq is derived.

• L is the set of links. The following types of links are maintained for concepts
  in the ontology:
    ◦ kind-of links between pairs of concepts represent the subsumption relationship.
    ◦ association links represent generic semantic relationships that can be established
      between concepts (e.g., part-of, owns, causes, isSubClassOf and so on).
    ◦ An instance-of link defines an instance of a concept; e.g., for a schema element
      el, el ∈ Cl implies that el is an instance of Cl.

Remark 3.1. Each ontology has a set of properties, each classified as either an object
property or a data property. A data property describes the data, and an object
property relates concepts. All domain properties of a concept c can be
represented as P(c) = DP(c) ∪ OP(c) [126], where DP(c) is the set of data properties
and OP(c) is the set of object properties. The proposed DL ontology has two types
of links to represent semantic relationships among nodes: isSubClassOf and
hasContributedTo. These are object properties. In Figure 3.2, BioInformatics
(isSubClassOf) Science, so the OP is (C_BioInformatics.(isSubClassOf)) ∈ {Science}.

2. Concept: Each ontology has a set of semantically related concepts, C(o) =
{c1, c2, ..., cn}. Figure 3.2 shows a Digital Library (DL) ontology, hence
C(DL) = {c_Database, c_BioInformatics, c_ComputerSCandEngg, ..., c_DigitalLibrary}.

3. Concept Hierarchy: The ontology structure, as described in this research effort,
is a DAG, where a concept may have more than one parent concept. Concepts
in an ontology O can be represented as a partially ordered set. Given two concepts
(Science, BioInformatics) ∈ (DigitalLibrary), if isSubClassOf(BioInformatics) =
Science, then BioInformatics is more specialised than Science. This can also be
denoted as C_BioInformatics ≺ C_Science.

4. Document Class: Depending on the number of first-degree parents, the documents
in a concept are classified into several classes. If a concept has n parents,
then the concept has (2^n − 1) document classes. In Figure 3.2, the
concept Database has three parent concepts, so the documents covered by the
concept Database have (2^3 − 1) = 7 possible document classes.

5. Document Annotation: Gonçalves et al. have published some work on the ontological
representation of a digital library [65]. A concept of an ontology can be identified
by its URI. In the present system, a document is identified by its concept URI
suffixed with a unique document class and document-id.
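
The document-class count in item 4 and the annotation scheme in item 5 can be sketched together. This is a minimal illustration, assuming that document classes are numbered by enumerating the non-empty subsets of the parent concepts, smallest subsets first, which reproduces the numbering of Table 3.1. The URI suffix format in `document_uri` and the example base URI are hypothetical; the thesis does not fix a concrete syntax here.

```python
from itertools import combinations


def document_classes(parents):
    """Number every non-empty subset of the parent concepts, smallest subsets first."""
    subsets = [
        frozenset(combo)
        for size in range(1, len(parents) + 1)
        for combo in combinations(parents, size)
    ]
    return {number: subset for number, subset in enumerate(subsets, start=1)}


def classes_for_community(classes, community):
    """Document classes that are relevant to one parent (user) community."""
    return [number for number, subset in classes.items() if community in subset]


def document_uri(concept_uri, class_number, document_id):
    # Hypothetical suffix format: concept URI + document class + document-id.
    return f"{concept_uri}/class-{class_number}/doc-{document_id}"


classes = document_classes(["CS", "GIS", "BIO"])
print(len(classes))                           # 2^3 - 1 classes
print(classes_for_community(classes, "GIS"))  # classes relevant to GIS users
print(document_uri("http://example.org/dl/Database", 4, 17))
```

With three parents the sketch yields seven classes, and the GIS community is associated with classes 2, 4, 6 and 7, in agreement with the running example.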

3.2.2 Semantic Relationships

The digital library (DL) ontology considered in this thesis supports two semantic
relationships: isSubClassOf (a concept-to-concept relationship) and hasContributedTo
(a document-class-to-concept relationship):

Definition 3.2 (Property: isSubClassOf). For any three concepts C1, C2 and C3
and any document x, the relationship isSubClassOf has the following properties:

• Reflexive: ∀x {x ∈ C1 ∧ C1 (isSubClassOf) C2 → x ∈ C2}

• Transitive: ∀C1 ∀C2 ∀C3 (C1 (isSubClassOf) C2 ∧ C2 (isSubClassOf) C3 → C1 (isSubClassOf) C3)

• Non-Commutative: ∀C1 ∀C2 (C1 (isSubClassOf) C2 → ¬(C2 (isSubClassOf) C1))

Example 3.1. isSubClassOf can be considered a parent-child relationship and
defines a partially ordered set. According to Figure 3.2, “Bio-Informatics” isSubClassOf
“Science”, i.e. “Bio-Informatics” is a subconcept of the superconcept “Science”. Though
isSubClassOf is a non-commutative property, it is transitive. So, if
B isSubClassOf A and C isSubClassOf B, then C isSubClassOf A as well.

Definition 3.3 (Property: hasContributedTo). For a concept Ci and a document
class di, di hasContributedTo Ci means that di is a collection of documents classified by the
concept Ci. The hasContributedTo relation is non-commutative.

Example 3.2. If di hasContributedTo Ci and Ci isSubClassOf Cj, then the number
of such parent concepts Cj determines the number of document classes di that will be
created under the child concept Ci. So, according to the running example, the concept Database,
having three parent concepts CS, GIS and BIO, will have (2^3 − 1) or 7 document classes
under it. In the case of a single parent concept, the child concept will have only one
document class under it.

As mentioned earlier, the document classes relevant for each of the three user communities
representing the three parent concepts of Database are shown in Figure 3.4 and Table
3.1. However, Figure 3.4 draws the links as if the seven document
classes were connected directly to the parent concepts. In fact, the document
classes are linked only to the child concept Database, through the semantic link
hasContributedTo. Figure 3.4 shows them differently in order to explain which parent concepts are
involved with which document classes. The idea is shown clearly in the associated Table
3.1. The generic definitions of the semantic relationships for a DL ontology are:

• Inferable (⇒): The relationship Ci ⇒ Cj signifies that a concept Ci infers the
existence of another concept Cj. In other words, Cj is inferable from Ci. The Inferable
relationship is non-commutative, i.e. (Ci ⇒ Cj) ≠ (Cj ⇒ Ci), but transitive, i.e. if
Ci ⇒ Cj and Cj ⇒ Ck, then Ci ⇒ Ck.
Since isSubClassOf is basically an Inferable relationship, the properties enumerated
in Definition 3.2, i.e. the Reflexive, Transitive and Non-Commutative properties,
are applicable to the Inferable relationship in general.

• Partially Inferable (⇀): The relationship Ci ⇀ Cj signifies that a concept Ci can
partially infer another concept Cj. This relationship is also non-commutative. It
is applicable to concepts having multiple parents. As discussed
earlier, the concept Database is partially inferable from the three parent concepts
BIO, GIS and CS. So if a user accesses the child concept Database through
the parent concept GIS, then Database will be partially inferred, giving access to
documents under the document classes 2, 4, 6 and 7 only, as depicted in Figure
3.4 and the associated Table 3.1.

• Non Inferable (⊣): The relationship Ci ⊣ Cj signifies that a concept Ci cannot infer
the existence of another concept Cj. Considering the concept hierarchy of the
ontology structure, the non-inferable relationship Ci ⊣ Cj signifies that there is no
path from concept node Ci to concept node Cj. So, according to Figure 3.2, the
concept Bio-Informatics or BIO is non-inferable from the concept Engineering
(Engineering ⊣ BIO).
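
The three relationships can be told apart by a simple graph traversal. The sketch below is one possible reading of the definitions, not the thesis's algorithm: it assumes that inference is traced downward along parent-to-child edges, that a reachable concept with more than one direct parent is only partially inferable, and that a missing path means non-inferable. The placement of Engineering, Science and GIS in the `CHILDREN` mapping is assumed for illustration and is not taken from Figure 3.2.

```python
# parent -> direct child concepts (hypothetical fragment of the DL ontology)
CHILDREN = {
    "Science": {"BIO", "GIS"},
    "Engineering": {"CS"},
    "CS": {"Database"},
    "GIS": {"Database"},
    "BIO": {"Database"},
}


def has_path(source, target, children=CHILDREN):
    """Depth-first search for a directed path from source to target."""
    stack, seen = [source], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(children.get(node, ()))
    return False


def parent_count(concept, children=CHILDREN):
    """Number of direct parents of a concept."""
    return sum(concept in kids for kids in children.values())


def relationship(source, target, children=CHILDREN):
    """Classify the pair under the assumptions stated in the lead-in."""
    if not has_path(source, target, children):
        return "non-inferable"
    if parent_count(target, children) > 1:
        return "partially inferable"
    return "inferable"


print(relationship("GIS", "Database"))
print(relationship("Science", "BIO"))
print(relationship("Engineering", "BIO"))
```

Under these assumptions, Database is partially inferable when reached through GIS, while there is no path from Engineering to BIO, matching the two examples given in the list above.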

In the object hierarchy of the proposed model, an ontology contains concepts, a
concept contains document classes and a document class contains documents. Similar to
the semantic relationship Inferable (⇒) connecting the concepts of the ontology
hierarchy, document, document-class and concept are related by the semantic relationship
Inclusion (γ). So a document is connected to a document-class, and a document-class
is connected to a concept, by the semantic relationship Inclusion. The semantic
relationship hasContributedTo, as defined earlier, is basically an Inclusion relationship. This
relationship also maintains the reflexive property and is non-commutative. So for a
document di, a document-class dcj and a concept Ck,

• Reflexive Property: di γ dcj ∧ dcj γ Ck → di γ Ck

Similar to the Inclusion relation for the object hierarchy, the membership of a user in a
user-group is captured by the Assignment (α) relation.

• Non-commutative: The Assignment (α) relation is non-commutative.
∀u1 ∀u2 {α(u1, u2) → ¬α(u2, u1)}

• Transitive: Assignment (α) supports transitivity.
∀u1 ∀u2 ∀u3 {α(u1, u2) ∧ α(u2, u3) → α(u1, u3)}

Since the Assignment relation is non-commutative and supports transitivity, it defines not only
the membership of a user in a user-group but also the membership of a user-group in
a super user-group, and thus defines a hierarchy (i.e. the user-group hierarchy). The present
chapter considers a single user-group. The effects of the presence of two different hierarchies,
the object/concept hierarchy and the subject/user-group hierarchy, will be considered in a later
chapter.
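
How a transitive, non-commutative Assignment relation induces a user-group hierarchy can be sketched as follows. The user and group names and both helper functions are hypothetical, chosen only to illustrate the two stated properties.

```python
# Direct Assignment (alpha) pairs: member -> groups it is directly assigned to.
# All names are hypothetical.
ASSIGNED_TO = {
    "alice": {"GIS-students"},
    "GIS-students": {"GIS-users"},
    "GIS-users": {"registered-users"},
}


def all_groups(member, assigned_to=ASSIGNED_TO):
    """Transitive closure: every group reachable from member via alpha."""
    reached, stack = set(), [member]
    while stack:
        for group in assigned_to.get(stack.pop(), ()):
            if group not in reached:
                reached.add(group)
                stack.append(group)
    return reached


def is_non_commutative(assigned_to=ASSIGNED_TO):
    """True when no pair (u1, u2) holds in both directions, even transitively."""
    return all(
        member not in all_groups(group, assigned_to)
        for member in assigned_to
        for group in all_groups(member, assigned_to)
    )


print(sorted(all_groups("alice")))
print(is_non_commutative())
```

Transitivity is what lets a user inherit membership of every super user-group, and the non-commutativity check rules out cyclic group definitions.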
The ontology hierarchy is built using the semantic relationships Inferable and Inclusion.
Later, it will be shown that these relationships, along with the Assignment
relation, are also used for authorisation inference in the proposed access control model.
The effects of the polyhierarchic structure for a DL ontology can be summarised as:

1. Better knowledge representation for interdisciplinary subject areas.

2. Classification of documents into (2^n − 1) document classes for a child concept
having n parent concepts, thus providing faster access to the documents
relevant for the users of each of the parent concepts.

Having established the relevance of a polyhierarchic ontology for representing the metadata
structure of a digital library, the next chapter deals with the access control issues for the
same. The features of the proposed access control model are also explained there.

Chapter 4

Access Control Issues for
Structured Metadata and
Semantic Digital Library

4.1 Access Control Preliminaries

Early studies on access control [67, 102] considered three entities: a set of objects
O, the resources to be protected; a set of subjects S, the entities that try to access the
members of O; and a set of access rights A (read, write, delete, update etc.) defining
the mode of access. So, a tuple (s, o, a) = true signifies that the subject s (s ∈ S)
can access the object o (o ∈ O) with the access right a (a ∈ A). The tuple itself
is an authorisation specification and part of the security policy. Later, the model was
extended to include negative authorisation [17, 77] to express explicit denial of access to
an object by a subject. Consequently, the authorisation tuple was also extended to
include a sign, making it (s, o, a, sg), where sg ∈ {+, −}, giving rise to positive or negative
authorisation. In a system or an application environment involving many
objects and subjects, it is difficult to maintain individual authorisations for each subject
with respect to each object for each type of access right. In addition, for each access
request, the appropriate authorisation needs to be verified. Starting from the Access Control
Matrix, many models and mechanisms have been developed for imposing access control
and then verifying each access request against the specified authorisations. An application
area supporting both positive and negative authorisations may even give rise to the problem
of undecidability in authorisation verification, as discussed in [70]. In other words, there
can be authorisation conflicts where, for the same subject-object combination, both
positive and negative authorisations may be inferred for the same access right. In order to
avoid such conflicts, different choices of access control policies have been adopted [109].
For example, in a closed policy, any access to be allowed must be specified explicitly and will be
of the form (s, o, a, +) = true. This is a positive authorisation. Similarly, in an open
policy all accesses are available by default. To stop any access, an explicit denial has
to be specified, of the form (s, o, a, −) = true. This is a negative authorisation.
However, in real life it may be difficult to find an application environment that is
absolutely closed or absolutely open. So, if a new access control model is proposed, or
if a known model is applied to a new application area, it is necessary to establish that
the environment is free from the problem of undecidability. Otherwise, if any such
problem exists, there must be methods to identify it and then to resolve or
avoid it.
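
The signed authorisation tuple and the closed and open policies can be illustrated with a small sketch. It handles only explicitly stored tuples (s, o, a, sg); conflicts that arise through inference along hierarchies are not modelled. The class, the function names and the sample subjects and objects are assumptions made for illustration.

```python
from typing import NamedTuple


class Authorisation(NamedTuple):
    subject: str
    obj: str
    right: str
    sign: str  # "+" for positive, "-" for negative authorisation


def conflicts(auths):
    """(subject, object, right) triples that carry both a positive and a negative sign."""
    signs = {}
    for a in auths:
        signs.setdefault((a.subject, a.obj, a.right), set()).add(a.sign)
    return {key for key, found in signs.items() if found == {"+", "-"}}


def is_allowed(auths, subject, obj, right, policy="closed"):
    """Closed policy: allow only on explicit '+'. Open policy: allow unless explicit '-'."""
    key = (subject, obj, right)
    if key in conflicts(auths):
        raise ValueError(f"conflicting authorisations for {key}")
    if policy == "closed":
        return Authorisation(subject, obj, right, "+") in auths
    if policy == "open":
        return Authorisation(subject, obj, right, "-") not in auths
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical authorisation base.
AUTHS = [
    Authorisation("alice", "doc1", "read", "+"),
    Authorisation("bob", "doc1", "read", "-"),
    Authorisation("carol", "doc2", "write", "+"),
    Authorisation("carol", "doc2", "write", "-"),
]

print(is_allowed(AUTHS, "alice", "doc1", "read", policy="closed"))
print(is_allowed(AUTHS, "bob", "doc1", "write", policy="open"))
print(conflicts(AUTHS))
```

The sketch raises an error on a conflicting pair instead of picking a winner, since choosing a resolution strategy is exactly the policy decision discussed above.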

4.2 Access Control for Modeling Web

4.2.1 Empirical Data Dissemination Technique

The problem of securing web data by providing fine-grained access control on the web has
attracted many research groups [107]. One of the most popular web servers, the Apache
server, provides an access control list through a configuration file (access.conf) containing
the list of users and corresponding hosts (IP addresses), i.e. (host, user) pairs, and
specifying whether a request for connection to the server is allowed or forbidden. Users
are identified by user or user-group name, and passwords are specified in a UNIX-style
password file. This type of access control specification can authorise a user’s access down
to the file level. Depending upon the security requirements, the administrator can specify access
for a particular user, user-group, domain, etc. Apache is the most common security
control mechanism used in various web domains, but it has the following limitations:

• As the granularity of access control in the Apache server is limited to the file
level, it is not possible to restrict access to certain portions of a file. This
limitation may force protection requirements to affect data organisation at the
file-system level. For instance, a file containing data with different access control
requirements will have to be split into more than one file.

• Maintenance and management of the Apache security specification are difficult.
In this model, security specifications are written in a configuration file, so after
any update or change in the security specification, the file needs to be reloaded.
This increases the cost of management and affects the overall maintenance cost of
the organisation.

• As the configuration file is written in a simple English-like language without
encryption, it increases the overall vulnerability of the security system.

• Since the access control specifications follow simple logical expressions, it is difficult
to implement higher-order logic.

A few efforts have already been made to improve the Apache system. It has become
possible to implement more fine-grained access control on HTML documents using their
tag structure [133]. However, HTML does not have the provision to model semantic
relationships. Some other approaches, such as the EIT SHTTP [129] scheme, explicitly
represent authorisations within documents by using security-related HTML tags. Every
document may have associated security (meta)tags describing the authorisations on the
document. This seems to be the right direction towards the construction of a more
powerful access control mechanism, though HTML cannot model any semantic relationship
in the information structure.

4.2.2 XML based Information Dissemination

Improvements in markup languages have removed many limitations of HTML and given
rise to XML. The main advantages are in the modelling of unstructured data and the
introduction of semantic relationships in the web framework. Consequently, security issues in the XML
environment have become an important research topic in the access control area. Damiani
et al. [43] have developed an access control model which analyses the DTD from a
structural perspective and enforces security policy down to the document tag level. This work
is very close to the research effort of Bertino and Ferrari [16]. Another significant work
has been done by Gabillon and Bruno [58]. In spite of their inherent similarities, these models
differ in some basic assumptions. Damiani et al. [43] have used separate hierarchies for
storing subjects and objects. The granularity of the model can be extended down to the
innermost element of the XML hierarchy. This model considers both positive and
negative authorisations, with positive as the default. As a result, the model
marks all elements in its structure with a negative or positive sign. On the other hand,
Gabillon and Bruno [58] have represented objects as a path tree, which provides more
flexibility than the previous model. However, it is difficult to model complex
security policies based on XPath. The model is restricted to the read privilege only and assumes that a user
has no access to the DTD. The model developed by Bertino and Ferrari [16] is quite similar to
the earlier model, but its granularity can be extended down to the attribute level.
Details of authorisation for access in the XML environment are given in the next subsection.

4.2.3 Authorisation for Access in XML

In a server, a set Auth is defined to hold all the authorisation specifications.
Authorisations are specified for all the objects stored in that server. An authorisation
is specified as:

Definition 4.1 (Authorisation). An authorisation f ∈ Auth is defined as a five-tuple
(subject, object, access right, sign, type), where subject ∈ Authorised subject set and
object ∈ (URI or XPath). The access right is either read or write, sign ∈ {+ve, −ve} and
type ∈ {LDH, RDH, L, R, LD, RD, LS, RS} specifies the type of authorisation (Local DTD
Hard, Recursive DTD Hard, Local, Recursive, Local DTD, Recursive DTD, Local Soft,
and Recursive Soft, respectively).

Subject : Damiani et al. [43] represent a subject on the basis of its identity and
the location from which a request originates. The location can be expressed
with reference to either its numeric IP address (e.g., 150.100.30.8) or its
symbolic/domain name (e.g., cardiology.hospital.com). So, a subject requesting access is
characterised by a tuple (user-id, IP-address), where user-id is the identity with
which a user is connected to the server and IP-address is either numeric or
symbolic. Instead of a single subject, a group of subjects can be represented by the
wildcard character *. For example, the IP address 151.100.*.* denotes all the machines
belonging to the network 151.100. Similarly, there can be a user-group instead of
a single user. User-groups together with their membership relationships, and IP
addresses with patterns, form partially ordered sets (hierarchies).

Object : A set Obj of uniform resource identifiers (URIs) denotes the resources to be
protected. For XML documents, URIs can be extended with path expressions,
which are used to identify the elements and attributes within a document. In
particular, the W3C proposal uses the XPath language for the identification of internal
components of an XML document. Considerable advantages can be derived by adopting
a standard language. First, the syntax and semantics of the language are known
to potential users and are well studied. Secondly, several tools are already available
that can easily be used to produce a functional system.
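
The five-tuple of Definition 4.1 and the wildcard-based subject description can be sketched as below. This is an illustration under stated assumptions, not the implementation of Damiani et al.: the class name `XmlAuthorisation`, the use of shell-style matching from Python's `fnmatch` module for the `*` patterns, and the example URI are all hypothetical, and user-group membership is not modelled.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

AUTH_TYPES = {"LDH", "RDH", "L", "R", "LD", "RD", "LS", "RS"}


@dataclass(frozen=True)
class XmlAuthorisation:
    subject: tuple   # (user-id pattern, IP address or domain-name pattern)
    obj: str         # URI, optionally extended with a path expression
    right: str       # "read" or "write"
    sign: str        # "+" or "-"
    auth_type: str   # one of AUTH_TYPES

    def __post_init__(self):
        if self.right not in {"read", "write"}:
            raise ValueError(f"unknown access right: {self.right}")
        if self.sign not in {"+", "-"}:
            raise ValueError(f"unknown sign: {self.sign}")
        if self.auth_type not in AUTH_TYPES:
            raise ValueError(f"unknown authorisation type: {self.auth_type}")


def subject_matches(pattern, requester):
    """Match a (user-id, location) requester against a pattern that may contain '*'."""
    user_pattern, location_pattern = pattern
    user, location = requester
    return fnmatchcase(user, user_pattern) and fnmatchcase(location, location_pattern)


rule = XmlAuthorisation(("*", "151.100.*.*"), "http://example.org/records.xml", "read", "+", "R")
print(subject_matches(rule.subject, ("alice", "151.100.7.9")))
print(subject_matches(rule.subject, ("alice", "150.100.30.8")))
print(subject_matches(("tom", "*.hospital.com"), ("tom", "cardiology.hospital.com")))
```

A pattern such as 151.100.*.* covers every machine in the network 151.100, so the set of patterns is partially ordered by the subjects each one covers.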

Definition 4.2 (Path Expression). A path expression on a document tree is a
sequence of element names or predefined functions separated by the delimiter / (slash):
l1/l2/.../ln. Path expressions may terminate with an attribute name as the last
term of the sequence.
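
A path expression of this form can be resolved against a document tree with Python's standard `xml.etree.ElementTree` module. The sketch below is a simplified illustration: it assumes that a final attribute term is written with a leading `@`, as in XPath, and it does not support predefined functions. The `select` helper and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def select(root, path):
    """Resolve l1/l2/.../ln from the document root; a final '@attr' term selects an attribute."""
    terms = [t for t in path.split("/") if t]
    attribute = terms.pop()[1:] if terms and terms[-1].startswith("@") else None
    if not terms or terms[0] != root.tag:
        return []
    nodes = [root]
    if len(terms) > 1:
        # ElementTree resolves the remaining element names relative to the root.
        nodes = root.findall("/".join(terms[1:]))
    if attribute is None:
        return nodes
    return [n.get(attribute) for n in nodes if attribute in n.attrib]


# Hypothetical document tree: a department element with two patient elements.
root = ET.Element("department", name="cardiology")
for patient_id, patient_name in [("p1", "A. Roy"), ("p2", "B. Sen")]:
    patient = ET.SubElement(root, "patient", id=patient_id)
    ET.SubElement(patient, "name").text = patient_name

names = [element.text for element in select(root, "/department/patient/name")]
ids = select(root, "/department/patient/@id")
print(names)
print(ids)
```

An authorisation object can then be written as a URI followed by such a path, which is what allows protection down to individual elements and attributes.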

XML authorisation model controls access at all levels of granularity. The object gran-
ularity for which authorisations can be specified spans from the DTD (set of document
instances) to single elements/attributes within individual documents, where elements
and attributes can be referenced by means of path expressions. Authorisations can be
either positive, permitting access or negative, expressing denial. The reason for having
both positive and negative authorisations is to provide a simple and effective way to
specify authorisations applicable to sets of subjects/objects with support for exceptions
[78]. Authorisations can either be local or recursive. A local authorisation on an ele-
ment will be applicable only to that element and its attributes. On the other hand, a
recursive authorisation assigned to an element percolates down progressively through its
sub-elements along the tree structure. So an authorisation on a DTD, i.e. at the schema
level, will be applicable to all documents covered by it. However, a more specific autho-
risation may be assigned to any of the sub-elements different from the one assigned to
a parent node of the hierarchy. So, authorisations propagate until overridden by a con-
flicting authorisation on a more specific object. This facility is important for enterprise
level authorisation specification. Since large enterprises are often organised into multiple
domains, protection requirements may be specified both at the level of the enterprise,
stating general regulations that should hold and at the level of specific domains (part
of the enterprise) where, according to a local policy, additional constraints may need to
be specified or some constraints may need to be relaxed. Organisations specify autho-
risations with respect to DTDs; specific sites can specify authorisations with respect to
individual documents (instance-level authorisations) as well as with respect to DTDs.
The two types of DTD-level authorisations have complementary roles in increasing access
control flexibility. Organisation DTD-level authorisations stated by a central authority
can be effectively used to implement corporate wide access control policies on document
classes. Site DTD-level authorisations specified by departmental authorities allow for
department wide access control policies complementing the corporate ones. Moreover,
they alleviate administration chores by allowing concise specification of site-wide autho-
risations. For instance, suppose that a hospital is composed of different wards, each of
which is responsible for managing specific XML documents. This is a situation where
both hospital level authorisation and ward level special authorisation requirements need
to be implemented.
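
The interplay of local and recursive authorisations with the "most specific object wins" principle can be sketched as follows. This is a simplified reading of the description above, not the algorithm of [43]: elements are identified by slash-separated paths, attributes, subjects and the DTD/instance distinction are ignored, and the hospital paths are hypothetical.

```python
def effective_sign(path, authorisations):
    """Return '+', '-' or None for the element identified by `path`.

    `authorisations` maps a path to (sign, kind), where kind is 'local'
    (the element itself only) or 'recursive' (the element and its subtree).
    An authorisation on the element itself is the most specific one;
    otherwise the nearest ancestor with a recursive authorisation decides.
    """
    if path in authorisations:
        return authorisations[path][0]
    parts = path.split("/")
    # Walk up the ancestors, nearest first.
    for depth in range(len(parts) - 1, 0, -1):
        ancestor = "/".join(parts[:depth])
        sign, kind = authorisations.get(ancestor, (None, None))
        if kind == "recursive":
            return sign
    return None


# Hypothetical hospital-wide and ward-level authorisations.
AUTHS = {
    "hospital": ("+", "recursive"),
    "hospital/ward": ("+", "local"),
    "hospital/ward/patient/diagnosis": ("-", "recursive"),
}

print(effective_sign("hospital/ward/patient/name", AUTHS))
print(effective_sign("hospital/ward/patient/diagnosis/notes", AUTHS))
```

In this sketch the hospital-level permission propagates to every element until the more specific negative authorisation on the diagnosis subtree overrides it.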

4.3 Access Control for Semantic Web

Semantic web provides some more abstraction over XML in terms of sharing meta-
information about objects available for machine processing using languages like the
Web Ontology Language (OWL). OWL describes meta-objects, their attributes and
relationships among objects that are typically exchanged between servers and clients.
Such meta-information is mostly expressed as a graph, where nodes are concepts and edges
are semantic relationships, and it is usually expressed using the Resource Description
Framework (RDF).
Access control to XML documents [16, 43, 58] is the foundation on which access control
for the Semantic Web has been built. However, when applied to the Semantic Web, the
existing access control models for XML have to be extended to take into consideration the
higher layers of the Semantic Web such as RDF and ontologies. Hence access control has
to be extended from document or DTD level to concept level of an ontology. For example,
it is natural that one would like to allow access to web data related to ‘sex’ only for people
aged 18 and above, or to deny access to information on ‘chemical weapons’ for people
from certain countries. Instead of specifying authorisations over each related document
or DTD, the Semantic Web approach specifies access control over concepts like ‘sex’ and
‘chemical weapons’, which automatically enforces it on all data instances. Qin and Atluri
[126] have proposed a model for access control mechanism of semantic web component,
where the concepts are defined in ontologies. They define ontologies as the semantic
schema for the web data, which are annotated by the concepts in ontologies and attached
as instances to them. Incidentally, different XML documents structured under different
DTDs may still be related semantically to the same concept in an ontology. So, access to
all these documents and DTDs can be regulated and enforced in a consistent manner
through one concept-level access control model. This model is a better alternative under
the infrastructure of the Semantic Web and complementary to the element-level access
control for XML. Similar to authorisation propagation from DTD to documents to inner
tags in XML environment, in semantic web, authorisation propagates through concept
hierarchy along the semantic relationships connecting them. This work is comparable to
the research effort of Kaushik et al. [85]. They have proposed a policy-based disclosure
control framework for safe sharing of sensitive ontologies. In this model they prevent
the disclosure of sensitive portions of an ontology, selectively hide the names of concepts and/or
relationships while disclosing the overall structure of the ontology, and replace them with
desensitised names, thereby allowing the access controller to spread deceptive information.
Other significant works have been done by Kagal et al. [82] and Farkas et al. [52].

4.4 Access Control Model for Digital Library

Considerable research efforts have been made for controlling access to traditional cen-
tralised and distributed databases [18, 73, 78, 158] and for federated DBMS as well
[45, 80, 87]. These models are not suitable for controlling access to a semantic digital
library, whose requirements may be listed as:

• In a digital library, the authorisation model should be based on user characteristics and
object content. Traditional models, on the other hand, are based on user and object
mapping.

• A digital library is more focused on content-based retrieval. Hence, the security
model should be efficient enough for content-based authorisation specification.
Such authorisation should also be able to control access to different portions of a
document.

• Digital library authorisation should address the modern digital library require-
ments, like supporting ontology or semantic web, subject and object hierarchy
and flexible structure like polyhierarchy.

The first access control model in this area was proposed by Samarati et al. [134]. Here,
documents are organised as unstructured data pages connected through links. Authorisation
may be specified either on an entire document or on selected portions of
it. Documents are identified by individual document-ids. Though the documents are
interlinked, there is no grouping of documents or creation of a hierarchy by subject
areas/concepts. As a result, the model is not suitable for a large digital library. The second
important effort was made by Adam et al. [8], who introduced users’ credential management.
However, considering the history of research efforts, Winslett et al. [155] were the first to
identify the need to use users’ credentials in the access control model. The idea was
that in a distributed system users must provide information about themselves. The access
control mechanism would then decide about the type of documents to which a user or a set
of users may or may not get access. Further, users’ credentials could also help in categorising
users into different user-groups and might create a hierarchy of user-groups. The model
proposed by Winslett et al. [155] did not have any formal way of specifying credentials.
Later, Adam et al. [8] improved the system by providing a specification language for
credential management. There are other significant research efforts as well. A broker
method was implemented in [62]. Here an access request is sent to the appropriate authorisation
system and access is provided depending on the credentials of users. The model
may even use a simple user-name and password as credentials. Later, the model was
extended further [118] by incorporating certificates in user credentials. In a different work,
a hierarchical model was proposed by Baru and Rajasekar [13]. The model
focused on organising digital library security administration components like super user,
collection administrator and curator. There are some other significant approaches to digital
library security. Gladney [61] proposed the extension of mandatory access
control for a digital library. The model, however, does not consider dynamic changes of
users’ privileges. A document protection approach for the digital library using keychain
management, called “Cryptolope”, has been proposed by Kohl et al. [90]. A trust-based
mechanism for accessing a digital library has been suggested by Skogsrud et al. [140], where
users’ credentials include trust relationships. With a different approach, a trust model
for accessing a digital library has also been developed by Ray and Chakraborty [128].

4.4.1 Access Control for Polyhierarchic Structure

Importance of using polyhierarchic ontology for digital library metadata has already
been highlighted in Chapter 3. However, the polyhierarchic nature of a digital library introduces
new security challenges. Earlier security models are strictly hierarchic, where one
node has only one parent. This is true for XML documents and for most of the currently
available semantic web ontologies. As a result, the underlying structure is a tree, whereas
it will be a Directed Acyclic Graph (DAG) for a polyhierarchic ontology. Though not
used for digital libraries, some work on the polyhierarchic structure can be found in
different domains [36, 152, 153]. Caseau [36] represented object oriented model using
polyhierarchic structure. However, dynamic addition and deletion of nodes have not
been considered in this model. Later, van Bommel and Beck [152] and van Bommel and
Wang [153] have used polyhierarchy to represent lattice structure. This model, however,
has considered dynamic updates. None of these models has discussed the security challenges
for a polyhierarchic structure. Chapter 3 has discussed a polyhierarchic metadata
structure for a digital library where a concept having $n$ parent nodes will have $(2^n - 1)$
possible document classes under it. This thesis has primarily discussed the access control
problems for such a structure. Possible access control requirements for a polyhierarchic
ontology for digital library metadata are:

1. Representation problem: The well-known XML structure cannot represent a polyhierarchy.
OWL/RDF can potentially be used for modelling such a structure,
but it would not model access control policies. If some other facility like XACML
is used for storing security policies, arrangements need to be made so that the OWL
structure and the XACML policy store can communicate.

2. Implicit and Explicit Policy Management: Most of the access control policies
should be implicitly specified with a few explicit policies. Using the explicit policies
and definite set of derivation rules, policies for all nodes of the ontology hierarchy
will have to be derived. This will reduce the size of policy store. At the same
time, it is necessary to show that the access control policy for each concept/user
combination is decidable and the overall security model is safe.

3. Policy propagation: Since ontology structure is used for knowledge representation,
definite security policy propagation rules need to be specified for the semantic
relationships present among different nodes of the ontology.

4. Policy propagation and Conflict Analysis: Users can be categorised into user-
groups building user-group hierarchy. There also exists concept hierarchy of the
ontology. Now interaction of these two hierarchies for policy propagation may give
rise to different conflicts that need to be resolved as well.

4.5 Proposed Access Control Model

Initially the proposed model deals with a single user/user-group. Item 1 above is an
implementation-related issue and will be covered in Chapter 7. This chapter, however, will
explain how a new user is placed in a user-group and then how policy derivation is made
from a set of implicit and explicit policies for a single user/user-group. In other words,
Items 2 and 3 will be discussed with respect to a single hierarchy, i.e. the concept hierarchy
of the ontology. The user-group hierarchy and its conflicts with the concept hierarchy, as
mentioned in Item 4 above, will be discussed in a later chapter.
Before discussing authorisation and policy specification, the next subsection deals
with the credential specification of a user and its role in identifying the user's user-group. The
credential manager, while registering a new user, determines the user-group(s) of the
concerned user from the user's credential. The same credential manager serves the purpose of
authentication while admitting an existing user trying to access the library.

4.5.1 Credential of Digital Library

The credential of a user provides the set of attributes that define user characteristics
apart from the user’s identity, and it is used for authentication purposes. A detailed discussion of
the digital library credential mechanism can be found in [8, 128]. Credentials are assigned
during the user’s addition and updated either automatically or through administrative
intervention depending upon the user’s portfolio. In order to sign up, a user gives certain
In fact, a user’s credentials contain properties relevant to and hence mapped to one or
more user-group(s). Credential may be formally defined as:

Definition 4.3 (Credential Function). A credential function is a function which generates
a credential value from a user’s attributes:
$C_f \rightarrow f(u_{id}, \text{name}, \text{age}, \text{designation}, \text{department}, \text{year of admission}, \text{email}, \text{address}, \ldots)$.

Definition 4.4 (Credential Type). Credential type is a pair $C_t \rightarrow \{C_f, g_{id}\}$ which maps
a credential to a group id, where $C_f$ is the credential value generated by the credential function
and $g_{id}$ is the group id.

Example 4.1 (Credential Function). A credential function may be of the form: $C_f(10001) \rightarrow$
f("10001", "Jack", "22", "Student", "Computer Science", "2015", "jack@abc.com",
"24 S N Bose Hall"). Here, $C_f$ represents the credential value generated from the
user’s input. User id "10001" is the user’s unique id generated by the system. "Jack"
is the name of the user, whose age is "22". He is a "Student" of "Computer Science" of year
"2015". His email address and postal address are the next two consecutive fields.
$C_f$ is the unique credential value generated by the system and stored in LDAP for
standard authentication.

Example 4.2 (Credential Type). For the user of Example 4.1, the system will decide the proper
user-group. Let the group-id for the concerned student be "1111034". So, the
Credential Type will be $C_t \rightarrow (C_f(10001), \texttt{"1111034"})$.

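Definition 4.5 (Credential). A credential will be a three-tuple $C \rightarrow \{u_{id}, C_t\}$, where
$u_{id}$ is the unique user id and $C_t \in \{\text{Credential Types}\}$. Expanding $C_t$ gives
$C \rightarrow \{u_{id}, C_t\} \rightarrow \{u_{id}, C_f, g_{id}\} \rightarrow
\{u_{id}, f(u_{id}, \text{name}, \text{age}, \text{designation}, \text{department}, \text{year of admission}, \text{email}, \text{address}), g_{id}\}$,
where $g_{id} \in \{\text{Groups}\}$.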
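
Definitions 4.3 to 4.5 can be pictured with a small data-structure sketch. The class names, the `register` helper and the lookup table that maps a designation to a group id are hypothetical; the thesis does not state how the credential manager chooses the group, so the table is only a placeholder for that decision. The attribute values are those of Examples 4.1 and 4.2.

```python
from typing import NamedTuple


class CredentialValue(NamedTuple):
    """Attributes fed to the credential function (C_f in Definition 4.3)."""
    uid: str
    name: str
    age: str
    designation: str
    department: str
    year_of_admission: str
    email: str
    address: str


class Credential(NamedTuple):
    """The expanded three-tuple (u_id, C_f, g_id) of Definition 4.5."""
    uid: str
    value: CredentialValue
    gid: str


# Hypothetical rule: the group id is looked up from the designation only.
GROUP_BY_DESIGNATION = {"Student": "1111034"}


def register(value: CredentialValue) -> Credential:
    """Pair a credential value with a group id (the credential type C_t)."""
    gid = GROUP_BY_DESIGNATION[value.designation]
    return Credential(value.uid, value, gid)


jack = register(CredentialValue(
    "10001", "Jack", "22", "Student", "Computer Science",
    "2015", "jack@abc.com", "24 S N Bose Hall"))
print(jack.uid, jack.gid)
```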

Example 4.3. Combining Example 4.1 and Example 4.2, the Credential will be $C \rightarrow$
(10001, f("10001", "Jack", "22", "Student", "Computer Science", "2015",
"jack@abc.com", "24 S N Bose Hall"), "1111034").

4.5.2 Authorisation and Policy Specification

Different formal methodologies have already been proposed for modeling security re-
quirements [77] [26]. This thesis has adopted a modified version of the model proposed
earlier for enterprise security solution [137][136] [138]. Here the components for access
control policy specification are:

Definition 4.6 (Subject). A Subject may refer to an individual user or a user-group,
$S = \{s_i\}$, $i \geq 1$.

Definition 4.7 (Object). An Object can be a document, a document class covering a
set of documents, a concept covering one or more document classes, or the
entire ontology, $O = \{o_j\}$, $j \geq 1$.

Definition 4.8 (Access rights). In this thesis, the proposed access control model has
considered read and browse access rights only, $a_k \in \{\text{read}, \text{browse}\}$. A valid user can
browse through all the nodes by default but would need positive authorisation to read
any document under a concept, $A = \{a_k\}$, $k \geq 1$.

Definition 4.9 (Sign). Signs are of two types, +ve or −ve: positive for access permission
and negative for explicit denial. $V = \{v_t\}$, $t \geq 1$, with $v_t \in \{+, -\}$.

So an authorisation in the system is defined by a four-tuple $(s, o, a, v)$ signifying that
subject $s$ is authorised to access object $o$ with access right $a$ and sign $v$.

Definition 4.10 (Authorisation). Authorisation is a 4-tuple function $(s, o, a, v)$ which,
when specified, will be treated as true, where $s$ is an instance of subject $S$, $o$ is an
instance of object $O$, $a$ is the access right and $v$ is the sign:
$f : \{s \times o \times a \times v\} \rightarrow \text{true}$ (by default).

The easiest way to implement authorisations would be to specify every authorisation explicitly
as a 4-tuple function $f : (s, o, a, v) \rightarrow \text{true}$ for each (user, concept) combination
and save it in the repository or authorisation base. However, that would create a very large
authorisation base, and the authorisation verification process would also be slow. In order to
reduce the number of explicitly specified authorisations, the proposed model employs
an authorisation inference mechanism using the concept hierarchy. Consequently, a
few authorisations are explicitly specified and the other authorisations are inferred from
them. So, the model considers two types of authorisation, explicit and implicit, also
defined as strong and weak authorisation respectively in [127] [77] [26]. An explicit
authorisation is specified explicitly as a 4-tuple function and not inferred
otherwise. Referring to Figure 4.1, explicit authorisation specifications at nodes 1 and
5 are of the form:

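• $f : (\texttt{"jack"}, \texttt{"1"}, \texttt{"read"}, \texttt{"+"}) \rightarrow \text{true}$

• $f : (\texttt{"jack"}, \texttt{"5"}, \texttt{"read"}, \texttt{"-"}) \rightarrow \text{true}$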

Now to reduce the number of explicitly specified authorisations, first the users are placed
into different user-groups as mentioned earlier. From the access point of view, members of
one user-group are considered to have the same access requirements, i.e. same set of
access rights on same set of concepts. So, specifying authorisations for one user-group
will suffice for all the members of that user-group. With this being the first type of
inference mechanism for authorisation, second type allows a concept to inherit authori-
sations from its parent(s). In case of multiple parents, this type of inheritance may give
rise to authorisation conflicts. A child concept may inherit positive authorisation from
one parent node and a negative authorisation from another parent node for the same
access right and same user-group. Authorisation inference mechanism must resolve this
problem to avoid any authorisation conflict and consequent problem of undecidability
of authorisation at any node of the ontology. Authorisations and their combinations are
specified as access control policy. Rules for policy inference/inheritance are not only to
be properly defined but need to be established as sound and complete.

[Figure: concept hierarchy with nodes 1 to 9, each marked as an explicit or implicit
authorisation carrying a positive (+) or negative (-) sign]

Figure 4.1: Implicit and Explicit authorisation

Definition 4.11 (Authorisation Base). Authorisation base is the collection of explicit
authorisations using the authorisation function $f : \{s \times o \times a \times v\}$.

Definition 4.12 (Explicit Authorisation). An explicit authorisation is a 4-tuple function
$f : (s, o, a, v)$, which is specified and stored in the authorisation base.

Definition 4.13 (Implicit Authorisation). An implicit authorisation is a dependent
function $\bar{f} \rightarrow f : (s, o, a, v)$, where $f$ is an explicit authorisation and $\bar{f}$ is derived using
the policy rule set.

In Figure 4.1, nodes 1 and 5 carry explicit authorisations, of
which 1 is a positive authorisation and 5 is a negative authorisation. Implicit authorisations
at nodes (2, 3, 4, 6, 7, 8, 9) have been derived from explicit authorisations 1 and
5. The authorisation base stores only explicit authorisations; implicit authorisations are
derived using a given rule set.
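
The idea of storing only explicit authorisations and deriving the rest can be sketched in a few lines of Python. The edges of Figure 4.1 are not recoverable from the text, so the `PARENT` map below is an assumed single-parent tree over nodes 1 to 9, chosen only for illustration; the two explicit entries are the ones listed for nodes 1 and 5, and the helper `derive_sign` is hypothetical. Multiple parents are not handled in this sketch.

```python
# Assumed tree over nodes 1..9 (child -> parent); NOT taken from Figure 4.1.
PARENT = {2: 1, 3: 1, 4: 2, 5: 2, 6: 5, 7: 5, 8: 6, 9: 6}

# Authorisation base: explicit authorisations only, keyed by (subject, node, right).
EXPLICIT = {
    ("jack", 1, "read"): "+",
    ("jack", 5, "read"): "-",
}


def derive_sign(subject, node, right):
    """Return the sign for a node: its own explicit authorisation if any,
    otherwise the one inherited from the nearest ancestor that has one."""
    current = node
    while current is not None:
        sign = EXPLICIT.get((subject, current, right))
        if sign is not None:
            return sign
        current = PARENT.get(current)
    return None


derived = {n: derive_sign("jack", n, "read") for n in range(1, 10)}
print(derived)
```

Under the assumed tree, two stored tuples are enough to decide the read authorisation for all nine nodes.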
Before formally defining access control policy and the rule set for their derivation, the inference
mechanism for authorisation using the concept hierarchy needs to be explained.
In the object hierarchy of the proposed model, an ontology contains concepts, a con-
cept contains document classes and a document class contains documents. However,
access control has been implemented up to the document class level. In other words,
a user-group having access to a document class will get access to all the documents
covered by it. Chapter 3 has defined two semantic relationships among these objects:
isSubClassof which determines the set of concepts inferable from a particular concept
in the hierarchy, and hasContributedTo that determines which document-class is in-
cluded in which concept. General principles of inference mechanism for authorisation
specification are listed as:

1. All member users of a user-group will inherit the authorisations to different con-
cepts assigned to the user-group. Assignment relation assigns a user to a user-
group.

2. The access control model primarily considers only two access rights: read and
browse. Now browse is considered to be available to all users/user-groups by default
for all concepts in the ontology hierarchy. This is required for running the traversal
algorithm through the ontology hierarchy without any authorisation problem. So
an authorisation for any (user-group, concept) combination either specifies/infers a
positive authorisation for read operation or a negative authorisation for the same.

3. Since the application area is a library, the access control model has considered an
open system. So, initially all concepts are accessible to all users. However during
registration as an authorized user, the credential manager determines the user-
group(s) where a user is to be placed. Depending on the properties of a user-group,
some of the concepts may be explicitly assigned with negative authorisations for
the concerned user-group and all users belonging to the user-group inherit these
authorisation related constraints. So explicit authorisations are usually negative
authorisations.
However, if for a user-group it becomes necessary to permit access to a concept
while both its parent and child concepts are having negative authorisations, an
explicit positive authorisation may have to be specified.

4. When a user-group is assigned with an authorisation, positive or negative, to a concept
$C_i$, all concepts inferable from $C_i$ inherit the same authorisation. The Inferable ($\Rightarrow$)
relationship, as described in the previous chapter, is used to determine these inherited
concepts.

5. A positive read authorisation to a document-class is inferred only if its covering
concept connected by the Inclusion relationship has a positive read authorisation.
Once a document-class is assigned with a positive read authorisation, all
the documents covered by the document-class can be read. However, if a concept
$C_i$ has $n$ parent concepts, $C_i$ will have $(2^n - 1)$ document-classes under
it. Some of these document-classes will be explicitly assigned with negative read
authorisation when some of the parent concepts have negative authorisations,
even when $C_i$ has a positive authorisation.

Access control policies and the authorisation inference rules can now be defined formally.
Any single authorisation may be considered as a policy primitive and further access
control policies may be formed by combining different primitives using the operators:
And ($\wedge$), Or ($\vee$), Exclusive-Or ($\oplus$) and Implication ($\Rightarrow$). Exclusive-Or ($\oplus$) allows the
implementation of Separation of Duty principles [63]; e.g. $(s_1, o_1, a_1, +) \oplus (s_1, o_2, a_1, +)$
implies that user-group $s_1$ has been granted the access right $a_1$ on object $o_2$ as well as
on object $o_1$, but the two are not to be applied simultaneously.

Definition 4.14 (Policy). A policy $P$ defines authorisation from subject $s$ to object $o$
for a ground truth. A policy is either

• a single primitive, or

• a combination of multiple primitives using logical operators.

Example 4.4 (Policy). A system having two objects $o_x$ and $o_y$, and a user-group $u_i$, may
have the following policies: $\beta : \{(o_x, u_i, a_k, +) \oplus (o_y, u_i, a_k, +), (o_y, u_i, a_j, +)\}$. Here, the
policy set $\beta$ has two elements, separated by a comma. The first element implies that user-group
$u_i$ has access right $a_k$ on object $o_x$ and object $o_y$, but not to be applied simultaneously, whereas
the second element of $\beta$ implies that user-group $u_i$ has access right $a_j$ on object $o_y$. Note that
in this example and in the derivation rules, tuples are written in the order (object, user-group,
access right, sign).
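
One possible reading of the Exclusive-Or element of Example 4.4 is sketched below. The interpretation of "not to be applied simultaneously" as "the two rights may not be in use at the same time", the classes `Auth` and `Xor`, and the function `may_exercise` are all assumptions made for this illustration; the thesis does not prescribe an enforcement procedure at this point.

```python
from typing import NamedTuple


class Auth(NamedTuple):
    """A policy primitive in the order (object, user-group, access right, sign)."""
    obj: str
    group: str
    access: str
    sign: str


class Xor(NamedTuple):
    """Two primitives combined with Exclusive-Or (Separation of Duty)."""
    first: Auth
    second: Auth


X = Auth("o_x", "u_i", "a_k", "+")
Y = Auth("o_y", "u_i", "a_k", "+")
Z = Auth("o_y", "u_i", "a_j", "+")
BETA = [Xor(X, Y), Z]  # the policy set of Example 4.4


def may_exercise(policy_set, in_use, request):
    """Return True if `request` is granted by the policy set and does not
    clash with an Exclusive-Or partner that is already in use."""
    allowed = False
    for element in policy_set:
        if isinstance(element, Xor):
            if request in element:
                other = element.second if request == element.first else element.first
                if other in in_use:
                    return False
                allowed = True
        elif element == request:
            allowed = True
    return allowed


print(may_exercise(BETA, set(), X), may_exercise(BETA, {X}, Y))
```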

Since the concept hierarchy supports the transitivity property and hence can support inheritance
of authorisation, if a user-group $u_i$ enjoys an explicitly specified authorisation set $\beta$,
then after inheriting authorisations along with the implicit authorisations, $u_i$
will have a derived authorisation set $\beta_{der}$ which is a super-set of $\beta$. The derivation rules
are specified below:

• [Rule 1] Reflexivity Rule: All tuples of $\beta$ are inherited by $\beta_{der}$.

• [Rule 2] Inheritance Rule [Direct]: For concept hierarchy in the ontology,
$((o_y, u_i, a_k, s_l) \in \beta) \wedge \gamma(o_x, o_y) \Rightarrow (o_x, u_i, a_k, s_l) \in \beta_{der}$.

• [Rule 3] Inheritance Rule [Mutual Exclusivity]:
$(((o_x, u_j, a_k, s_l) \oplus (o_y, u_j, a_k, s_l)) \in \beta) \wedge \alpha(u_i, u_j) \Rightarrow ((o_x, u_i, a_k, s_l) \oplus (o_y, u_i, a_k, s_l)) \in \beta_{der}$.

• [Rule 4] Override Rule:
$((o_x, u_i, a_k, s_l) \in \beta_{der}) \wedge ((o_x, u_i, a_k, \neg s_l) \in \beta) \Rightarrow ((o_x, u_i, a_k, \neg s_l) \in \beta_{der}) \wedge ((o_x, u_i, a_k, s_l) \notin \beta_{der})$.

• [Rule 5a] Inheritance rule for Multiple Parents: If for a user-group $u_i$, the policy
set applicable to object $o_A$ is $\beta_A$ and the same for object $o_B$ is $\beta_B$, and object $o_c$ is
a child of both $o_A$ and $o_B$, then $(\beta_A \cup \beta_B) \subseteq \beta_{C_{der}}$, where $\beta_C$ is the policy set
explicitly specified for the object $o_c$ and $\beta_{C_{der}}$ is the total derived policy set for $o_c$.

• [Rule 5b] Conflict Rule under Multiple inheritance:
$(((o_A, u_i, a_k, s_l) \in \beta_A) \wedge ((o_B, u_i, a_k, \neg s_l) \in \beta_B)) \Rightarrow (((o_c, u_i, a_k, s_l) \vee (o_c, u_i, a_k, \neg s_l)) \in \beta_{der})$.

According to Rule-1, $\beta$ covers the authorisations explicitly specified for each (concept,
user-group) combination. For a user-group, other authorisations are inherited through
the concept hierarchy in the ontology. This gives rise to $\beta_{der}$.
Similarly, for Rule-2, if an object $o_x$ in the ontology is included in another object $o_y$
(the relation $\gamma(o_x, o_y)$), as in the case of a document-class and its corresponding concept,
then any explicitly specified authorisation that allows a user-group $u_i$ to access $o_y$ will also
provide access to $o_x$ with the same access right.
Rule-3 covers the situation when any Separation of Duty (Exclusive-Or operation) is
specified for a user-group $u_j$. Any user $u_i$ assigned to $u_j$ (the relation $\alpha(u_i, u_j)$) inherits
the same Separation of Duty property as explained earlier.
Rule-4 signifies that in case of any conflict of authorisation between the policy sets
$\beta$ and $\beta_{der}$, an explicit specification of authorisation (member of $\beta$) will override an
authorisation obtained by inheritance from implicit authorisation (member of $\beta_{der}$).
Rule-5a and Rule-5b consider the cases of having multiple parents in the concept hierarchy of
the DL ontology, i.e. multiple inheritance situations. If an object/concept has multiple
parent concepts in the ontology hierarchy and any user-group $u_i$ has access to all the
parent concepts, then $u_i$ will inherit all the authorisations it has on all the parent
concepts while accessing the child object. If authorisations derived/inherited from parent
concepts are in conflict, the OR combination of them will be applicable. So long as at least one
of the parent concepts provides positive authorisation for a particular access right to $u_i$,
the members of $u_i$ will be able to access the child concept with the same access right.
However, access to the document class nodes will be controlled by the authorisations
inherited from the parent concepts.
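
A compact sketch of how Rule-2, Rule-4, Rule-5a and Rule-5b might interact on a polyhierarchy is given below. It is an illustration under stated assumptions rather than the thesis implementation: the function `derived_sign`, the concept names and the extra concept `o_D` are hypothetical, the hierarchy is assumed to be acyclic, and a concept with no parents and no explicit authorisation falls back to a positive default in line with the open-system principle stated earlier.

```python
def derived_sign(concept, group, access, parents, explicit, default="+"):
    """Derive the sign of (concept, group, access) in a concept DAG.

    Rule-4: an explicit authorisation on the concept overrides inheritance.
    Rule-2 / Rule-5a: otherwise the signs of all parents are inherited.
    Rule-5b: conflicting inherited signs are OR-combined, so one '+' suffices.
    """
    key = (concept, group, access)
    if key in explicit:
        return explicit[key]
    inherited = [derived_sign(p, group, access, parents, explicit, default)
                 for p in parents.get(concept, [])]
    if not inherited:
        return default  # open system: accessible unless stated otherwise
    return "+" if "+" in inherited else "-"


# Hypothetical hierarchy: o_C has two parents, o_D has one.
PARENTS = {"o_C": ["o_A", "o_B"], "o_D": ["o_B"]}
EXPLICIT = {("o_A", "u_i", "read"): "+", ("o_B", "u_i", "read"): "-"}

print(derived_sign("o_C", "u_i", "read", PARENTS, EXPLICIT))
print(derived_sign("o_D", "u_i", "read", PARENTS, EXPLICIT))
```

Adding an explicit negative tuple for `o_C` to `EXPLICIT` would make the first call return a negative sign, which is the override behaviour of Rule-4.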

4.5.3 Access to Document Class

While Rule-5b resolves conflicts among implicit authorisations at the subconcept level
derived by inheritance from multiple superconcepts, it is necessary to formulate rules
for selectively accessing the document classes. As discussed in the previous section and
shown in Figure 3.3 and Figure 3.4, a child concept having $n$ parents will have $(2^n - 1)$
document classes under it. Table 3.1 has also shown the document classes which are
of interest to the users of each parent concept. The proposed access control model has
used this situation to extend controlled access to the document classes. So, a user-group
$u_i$ having read permission for documents under CS only will have access rights
to document classes 1, 4, 5 and 7 covered by the child concept DB (shown in Figure
3.4). Documents under the other document classes (2, 3 and 6) will not be available to $u_i$.
In other words, members of any user-group having access to CS only will get access to
documents relevant to CS and not to others. So the policies applicable to the parents of
concept DB are:
$(\text{CS}, u_i, \text{read}, +)$, $(\text{GIS}, u_i, \text{read}, -)$ and $(\text{BIO}, u_i, \text{read}, -)$
Since for the library, the access control model has been considered to follow an open
system, all objects are supposed to have +ve authorisation by default. So the -ve
authorisations of user-group $u_i$ for objects GIS and BIO are to be explicitly specified.
Now according to Rule-5b, the access policy for child concept DB by implicit authorisations
will be:

• $((\text{DB}, u_i, \text{read}, +) \vee (\text{DB}, u_i, \text{read}, -) \vee (\text{DB}, u_i, \text{read}, -))$

While the first +ve authorisation has been inherited from the parent concept CS, the next
two -ve authorisations are inherited from GIS and BIO. Since, according to Rule-5a
and Rule-5b, a child concept having multiple parent concepts infers the union of the
authorisations inherited from all its parents, the above OR combination of the relevant
authorisations will provide a +ve authorisation to child concept DB. So in the proposed
access control model, in case of multiple parent concepts, a child concept will inherit a
+ve authorisation if at least one of the parents has a +ve authorisation.
Now according to item 5 of the general principles of inference mechanism mentioned
above, inheritance of authorisation by the Inclusion relationship between a concept $C_i$
and its associated document class $d_j$ will be ensured by the policy:

• [Rule 6] $((C_i, u_i, \text{read}, +) \Rightarrow (d_j, u_i, \text{read}, +))$

In the example described above, concept DB, having 3 parent concepts, has 7 document
classes $dc_1$ to $dc_7$ under it. Now since DB has a +ve authorisation for the read
operation (as derived from Rule-5a and Rule-5b), normally according to Rule-6, user-group
$u_i$ will be able to access all the document classes under DB and thus will be able
to access all the documents as well. However, DB has inherited +ve authorisation from
CS only, so according to Figure 3.4 and Table 3.1, user-group $u_i$ should get access to
document classes $dc_1$, $dc_4$, $dc_5$ and $dc_7$ only. So access to document classes $dc_2$, $dc_3$ and
$dc_6$ is to be blocked by assigning explicit negative authorisations to them as:

• $((dc_2, u_i, \text{read}, -), (dc_3, u_i, \text{read}, -), (dc_6, u_i, \text{read}, -))$
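
The selection of document classes in the DB example can be reproduced with a short sketch. It assumes that each of the $(2^n - 1)$ document classes corresponds to one non-empty combination of parent concepts, and that the classes are numbered by combination size and then by the parent order CS, GIS, BIO. This numbering is an assumption chosen because it agrees with the classes listed in the text (1, 4, 5 and 7 readable; 2, 3 and 6 blocked); Table 3.1 itself is not reproduced here, and the function names are hypothetical.

```python
from itertools import combinations


def document_classes(parents):
    """Number the 2^n - 1 non-empty combinations of parent concepts from 1."""
    classes = {}
    number = 1
    for size in range(1, len(parents) + 1):
        for combo in combinations(parents, size):
            classes[number] = frozenset(combo)
            number += 1
    return classes


def split_by_access(parents, parent_signs):
    """Return (readable, blocked) class numbers for one user-group.

    A class is taken to be readable when at least one parent concept that
    contributes to it carries a positive read authorisation.
    """
    readable, blocked = [], []
    for number, members in document_classes(parents).items():
        if any(parent_signs[p] == "+" for p in members):
            readable.append(number)
        else:
            blocked.append(number)
    return readable, blocked


PARENTS_OF_DB = ["CS", "GIS", "BIO"]
SIGNS = {"CS": "+", "GIS": "-", "BIO": "-"}
readable, blocked = split_by_access(PARENTS_OF_DB, SIGNS)
print(readable, blocked)
```

The blocked list is exactly the set of classes that would receive explicit negative authorisations in the bullet above.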

Soundness and completeness of the rule set have not been considered in this chapter.
A later chapter, which considers both user-group and concept hierarchies along with their
conflicts in authorisation specifications, will discuss the matter in detail.
The next chapter provides the mathematical formalism of graph transformation that helps in
explaining the different mechanisms of the proposed access control model applied on the
digital library ontology.
Chapter 5

Graph Transformations and View Generations

5.1 Graph Transformation and Access Control

This section discusses the formal method of graph transformation, i.e. transformation
steps and rules. A formal introduction to the graph-based formalism can be found in [41],
and an RBAC implementation of the model has been developed by Koch et al. [89]. In
this method, a graph represents a state of a system and a rule is a transformation {r :
L → R}, where both L and R are graphs, i.e. the left-hand side is transformed into the
right-hand side by the graph transformation rule. The basic idea of graph transformation
[41] considers a production rule p : (L → R), where L and R are called the left-hand
and right-hand side, respectively, as a finite schematic description of a potentially infinite
set of direct derivations. In the present context, L is the original graph and R is the
transformed graph after applying a relevant access control policy. The rule {r : L → R}
consists of an injective partial mapping from the left-hand side to the right-hand side, among
the set of nodes rn and the set of links/relations re. Each mapping should be compatible
with the graph structure and the type of the node. Figure 5.1 shows the intermediate
states of transformation, where each graph production rule p : (L → R) defines the
partial correspondence between the elements of the left-hand side and the right-hand side
on the basis of the rule, determining which edges should be present and which edges should
be deleted. A match m : (L → R) for a production p is a graph homomorphism, mapping
nodes and edges of L to R in such a way that the graphical structure and the levels are
preserved. In Figure 5.1, production rule p1 : (L1 → G1) is applied on the graph G1. The
numbers written on the vertices and edges indicate the partial correspondence
L1 → G1. Now, $G \overset{p,m}{\Longrightarrow} H$ denotes a direct derivation where p is
applied to G, leading to the graph H. H is obtained by replacing the occurrence of L in G by R.

Figure 5.1: Graph Morphism Basics
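
As a minimal sketch of how a rule r : L → R acts on a state graph, the following Python fragment deletes what appears only in L, creates what appears only in R and preserves the rest. The `Graph` class, the edge label `controls` and the identity match on node names are all assumptions made for illustration; they are not part of the formalism in [41].

```python
from dataclasses import dataclass, field

@dataclass
class Graph:
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)   # (source, label, target) triples

def apply_rule(graph, left, right):
    """Apply a rule r: L -> R to `graph`, assuming the match is the identity on node names.

    Elements in L but not in R are deleted, elements in R but not in L are created,
    and elements common to both sides are preserved.
    """
    if not (left.nodes <= graph.nodes and left.edges <= graph.edges):
        raise ValueError("left-hand side does not occur in the graph")
    nodes = (graph.nodes - (left.nodes - right.nodes)) | right.nodes
    edges = (graph.edges - (left.edges - right.edges)) | right.edges
    # Drop edges left dangling by a deleted node.
    edges = {(s, l, t) for (s, l, t) in edges if s in nodes and t in nodes}
    return Graph(nodes, edges)

# Hypothetical state G: the administrator au controls user u1.
G = Graph({"au", "u1"}, {("au", "controls", "u1")})
# Hypothetical rule: the administrator is preserved, u2 and its control edge are created.
L = Graph({"au"}, set())
R = Graph({"au", "u2"}, {("au", "controls", "u2")})
H = apply_rule(G, L, R)
print(sorted(H.nodes), sorted(H.edges))
```

The derived graph H keeps everything in G that the rule does not mention, which is the sense in which the occurrence of L is replaced by R.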

5.1.1 Type Graph and State Graph

Here, the ontological structure of a Digital Library and the access control policies
imposed on it are represented by a type graph. In a type graph the edges are directed,
i.e. each edge runs from a source node to a target node. Each node represents a node
type and each edge also has an edge type. Figure 5.2 represents the type graph of the
proposed access control model for the DL ontology. The type graph provides the node types
au, u, p, c and dc. Node type au represents the administrative user. An administrative
user may or may not be an actual user of a digital library but will administer
or control all other node types. The model presented in this thesis considers a single
administrator, i.e. a centralized ontology structure accessed by all users of the Digital
Library. A distributed environment with multiple administrators will be considered as a
future research effort. Nodes of type u represent users. Node types c and dc are the
concepts and document classes respectively. Node type p represents permissions that cover
all the access rights available along with any other administrative permission needed.
Here, the words permission and access right have been used synonymously. Edges from
node type au to node types u, p and c represent the administrative control of the
administrative user on other node types. An edge from c to dc represents the document
classes under a concept. A combination of edges that connect u to p and c to p represents
the type of authorisation given to a user for accessing a concept. These are all explicit
authorisations; inherited access rights will be discussed later. A self loop on
node type c represents the concept hierarchy. A type graph is a pattern for a class of
graphs. A graph G is a member of this class if each node and edge in G has a
corresponding node type and edge type in the type graph. Each such member graph, having
one or more instances of the different node types described above, represents a system
state and is thus called a state graph. Figure 5.3 represents one such state graph. It
shows different instances of the permitted node types present in the system state
described. For example, u1 and u2 are two users. Since the system developed
so far considers only the read permission, the state graph has only one p type node.
While user u1 is yet to get any authorisation to access any concept, user u2 has got the
permission (read in this case) to access the concept c3. Edges running from one concept
to another represent the isSubClassOf relationship. So, c2 isSubClassOf c1, c3 isSubClassOf
c2, etc. Concept c8 has two parent concepts, c3 and c7. So it has (2^2 − 1), i.e. 3 document
classes dc1, dc2 and dc3 under it. An edge running from a c type node to a dc type node
represents the hasContributedTo relationship as explained earlier.

Figure 5.2: The Type Graph of Concept Hierarchy

Figure 5.3: The State Graph of Concept Hierarchy
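
The membership test for a state graph can be sketched as follows. The edge directions in `TYPE_EDGES` are read from the prose description of the type graph and are therefore an assumption, as are the helper names and the small state-graph fragment used as input.

```python
# Edge types of the type graph, as read from the description above.
# Assumption: directions follow the prose (au -> u/p/c, c -> dc, u -> p, c -> p, c -> c).
TYPE_EDGES = {
    ("au", "u"), ("au", "p"), ("au", "c"),
    ("c", "dc"), ("u", "p"), ("c", "p"), ("c", "c"),
}
NODE_TYPES = {"au", "u", "p", "c", "dc"}

def conforms(node_type, edges):
    """Return True if every node and edge of a state graph has a type in the type graph.

    node_type maps each node to its type; edges is a set of (source, target) pairs.
    """
    if any(t not in NODE_TYPES for t in node_type.values()):
        return False
    return all((node_type[s], node_type[t]) in TYPE_EDGES for s, t in edges)

def document_class_count(parent_count):
    """Number of document classes under a concept with the given number of parents."""
    return 2 ** parent_count - 1

# Hypothetical fragment of a state graph: c8 has parents c3 and c7, and u2 may read c3.
node_type = {"au": "au", "u2": "u", "read": "p", "c3": "c", "c7": "c", "c8": "c",
             "dc1": "dc", "dc2": "dc", "dc3": "dc"}
edges = {("au", "u2"), ("u2", "read"), ("c3", "read"),
         ("c8", "c3"), ("c8", "c7"),
         ("c8", "dc1"), ("c8", "dc2"), ("c8", "dc3")}

print(conforms(node_type, edges), document_class_count(2))
```

An edge such as one running from a document class to a user has no counterpart in the type graph, so a graph containing it would not be a valid state graph.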
5.1.2 Administrative Operations on a Concept Hierarchy using Graph Transformations

This section will show how graph transformation rules can represent the basic admin-
istrative operations relevant to the proposed access control model. Since the proposed
system so far has considered a centralised digital library ontology under a single admin-
istrative domain, a single administrative user type node au has been shown in all the
graphs.

Algorithm 1 User Modification Algorithm

procedure UserEdit    ▷ Adds or removes a user from the Users List
    function AddUser(user)
        if user ∉ USER then
            USER ← USER ∪ {user}
            Permission ← Permission ∪ {user → ∅}
        end if
    end function
    function RemoveUser(user)
        if user ∈ USER then
            USER ← USER \ {user}
            Permission ← Permission \ {user → Permission(user)}
        end if
    end function
end procedure
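
A minimal Python sketch of Algorithm 1 is given below. The class name `UserRegistry` and the representation of the permission mapping as a dictionary of sets are assumptions made for illustration only.

```python
class UserRegistry:
    """Keeps the user set and the per-user permission sets in step."""

    def __init__(self):
        self.users = set()
        self.permission = {}          # user -> set of permissions

    def add_user(self, user):
        # A new user starts with an empty permission set.
        if user not in self.users:
            self.users.add(user)
            self.permission[user] = set()

    def remove_user(self, user):
        # Removing a user also drops the permission set attached to it.
        if user in self.users:
            self.users.remove(user)
            self.permission.pop(user, None)

registry = UserRegistry()
registry.add_user("u1")
registry.add_user("u1")      # adding twice has no further effect
registry.remove_user("u2")   # removing an unknown user is a no-op
print(registry.users, registry.permission)
```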

• Add User : The graph transformation rule for add user has an empty left-hand
side, while on the right-hand side the administrator adds a new user. Figure
5.4 represents the scenario. A new user is created with no permission attached
(i.e. with an empty permission set associated).

• Remove User : Remove user is another simple rule. Figure 5.5 shows the
remove user operation. This operation removes a user by deleting a user type
node. Hence the left-hand side of the figure shows a user type node while
the right-hand side is empty. The permission set attached to the removed user is
automatically removed as well.

Figure 5.4: Add user

Figure 5.5: Remove User

Algorithm 2 Ontology Concepts Administration Algorithm

procedure ConceptEdit    ▷ Adds, alters or removes concepts
    function AddConcept(Concept, ListOfParents, ListOfChildren)
        if Concept ∉ O then    ▷ O is the ontology
            Create(Concept)
        end if
        for each Parent from ListOfParents do
            Add(Concept, isSubClassOf, Parent)
        end for
        Classify(Concept, ListOfParents)    ▷ hasContributedTo links will be created
        for each Child from ListOfChildren do
            Add(Child, isSubClassOf, Concept)
            ParentList ← getListOfParent(Child)    ▷ Finds the complete list of parents, including the new Concept
            Classify(Child, ParentList)
        end for
    end function
    function AlterConcept(Concept, ListOfParentConceptAdd, ListOfParentConceptRemove, ListChildConceptAdd, ListChildConceptRemove)
        for each parent from ListOfParentConceptRemove do
            remove(Concept, isSubClassOf, parent)
        end for
        for each child from ListChildConceptRemove do
            remove(child, isSubClassOf, Concept)
            ListOfParentOfChildNode ← getListOfParent(child)    ▷ Finds the complete list of parents of the child
            Classify(child, ListOfParentOfChildNode)
        end for
        AddConcept(Concept, ListOfParentConceptAdd, ListChildConceptAdd)
    end function
end procedure

• Add Concept : Addition of a new concept to the ontological hierarchy is done by the
add concept rule. Any new concept added to the concept hierarchy will be connected
to its immediate parent node(s) and immediate child node(s)
by isSubClassOf links. So the rule should also introduce the required edges during the
{L → R} transformation. In addition, each new concept added to the ontology
will have at least one document class (dc type node) created and connected to the
new concept by the hasContributedTo relationship. However, if a new concept is added
as a parent of a child concept Ci already having one or more parent concept(s),
more than one new document class will be created, depending on the number of
parent concepts of Ci. Figure 5.6 shows the addition of a new concept C8 as the
parent of child concept C5 that already has three parent concepts. The left side of the
state graph shows that the administrator node au is proposing the addition of a new
concept C8, which will be the child concept of C1 and the parent concept of C5.
So, from the ontological perspective, C5 isSubClassOf C8 and C8 isSubClassOf C1.
The proposed concept and the required edges are shown by dotted lines on
the left-hand side. After transformation, the right-hand side shows the corresponding
permanent edges. Before addition of the new concept C8, concept C5 had three
parent concepts, C3, C4 and C7. So there were (2^3 − 1) = 7 document classes
under C5, shown by the nodes dc1 to dc7 on the left-hand side of Figure 5.6.
After addition of concept C8 as the fourth parent concept of C5, there will
be (2^4 − 1) = 15 document classes under C5, creating additional document classes
dc8 to dc15 as shown on the right-hand side of Figure 5.6.

• Alter Concept : Since this research effort deals with a library application, deletion
of a concept (usually representing a subject area) has not been considered.
However, the concept hierarchy or the ontology structure may change. Alteration
of a concept actually means restructuring of the concept hierarchy. It is always
possible that a concept/subject area in a Digital Library, as a result of its development,
encroaches into the area of another subject. For example, a particular
style of storage and retrieval of geographical entities using computers ultimately
gave rise to Geographical Information Systems (GIS). So, GIS becomes a branch
of both Geography and Information Technology. Hence in the DL ontology, an alteration
needs to be made to show the GIS concept as a child of both Geography and
Information Technology. In other words, an existing concept may change
its position, thereby changing its parent and child concepts and also
causing changes in the document classes under it. Figure 5.7 represents a state
graph on the application of the alter concept rule. As shown on the left-hand side,
concept C9, which was originally a child of the concept C8 with no child concept
of its own, has now been placed as a child of C1 and a parent of C5, disconnecting
it from C8. The proposal made on the left-hand side has been made permanent on
the right-hand side by graph transformation. In addition, the concept C9, being
added as the fourth parent of the concept C5, gives rise to additional document
classes dc8 to dc15 under C5 for the same reason as explained for the add
concept rule. However, concept C9 retains its own document class dc20 even
after transformation.

Figure 5.6: Add Concept

Figure 5.7: Alter Concept



Figure 5.8: Assign Permission

Figure 5.9: Revoke Permission

• Assign Permission : This operation grants a permission (only read in this case)
to a user. Figure 5.8 shows the assign permission operation. The left-hand
side proposes an authorisation that attaches permission type node p to the user type
node u. Hence, the administrative node au proposes a link from node p to node
u, shown by a dotted line. If this permission assignment is allowable, then after
the graph transformation the edge connecting p and u becomes permanent, as shown
on the right-hand side.

• Revoke Permission : This operation revokes a permission from a user. So the edge
from a p type node to a u type node, shown on the left-hand side of Figure
5.9, is removed by the graph transformation rule, leaving only the user u on the
right-hand side.
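
The effect of the add concept and alter concept rules on the number of document classes can be sketched as below. The class `ConceptHierarchy` and its method names are hypothetical, and the sketch only tracks parent sets and the resulting count 2^k − 1; it does not create the individual dc nodes or the hasContributedTo edges.

```python
class ConceptHierarchy:
    """Concept DAG in which a concept with k parents owns 2^k - 1 document classes."""

    def __init__(self):
        self.parents = {}        # concept -> set of parent concepts

    def document_class_count(self, concept):
        # Assumption: a root concept still owns a single document class.
        return max(1, 2 ** len(self.parents[concept]) - 1)

    def add_concept(self, concept, parents=(), children=()):
        self.parents.setdefault(concept, set()).update(parents)
        for child in children:
            self.parents.setdefault(child, set()).add(concept)

    def alter_concept(self, concept, add_parents=(), remove_parents=(),
                      add_children=(), remove_children=()):
        self.parents[concept] -= set(remove_parents)
        for child in remove_children:
            self.parents[child].discard(concept)
        self.add_concept(concept, add_parents, add_children)

h = ConceptHierarchy()
for root in ("C1", "C3", "C4", "C7"):
    h.add_concept(root)
h.add_concept("C5", parents=["C3", "C4", "C7"])
before = h.document_class_count("C5")
h.add_concept("C8", parents=["C1"], children=["C5"])
after = h.document_class_count("C5")
print(before, after)
```

With three parents C5 has 7 document classes, and adding C8 as a fourth parent raises the count to 15, matching the example of Figure 5.6.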

5.1.3 Decidability of Authorisation and Relevance of View Creation

The problem of decidability in inferring authorisations has been an area of interest for many
researchers. A seminal work establishing the theoretical foundation of the decidability
problem was reported long ago [70]. Many studies related
to this issue were made later, and any new proposal for access control has to show that
undecidability of authorisation at any point of access has been avoided. An important
contribution in connection with the Role Based Access Control model has been made in [88].
Earlier, it has been shown that inheritance of implicit authorisation by a child concept
from multiple parent concepts may give rise to the problem of undecidability. Inheritance
of both positive and negative authorisations against the same access right at a child
concept DB (Ref. Figure 3.3 and Figure 3.4) not only makes the authorisation at DB
undecidable, it also makes the authorisation at all nodes inferable from DB (the entire
sub-ontology under DB) undecidable. Rule-5a and Rule-5b in Chapter 4
have been inserted in the rule set for the authorisation inference mechanism to avoid this
undecidability problem. The required proof is given in a later chapter.
The logical OR combination of all the implicit authorisations inherited from multiple parent
concepts makes the authorisation for the child concept decidable, and a positive
authorisation inherited from any of the parent concepts makes the authorisation at
the child concept positive.

Algorithm 3 Permission Modification Algorithm

procedure PermissionUpdate    ▷ Assigns and removes permissions
    function AssignPermission(user, Permission)
        if Permission ∉ PERMISSION(user) then
            PERMISSION(user) ← PERMISSION(user) ∪ {Permission}
        end if
    end function

    function RemovePermission(user, Permission)
        if Permission ∈ PERMISSION(user) then
            PERMISSION(user) ← PERMISSION(user) \ {Permission}
        end if
    end function
end procedure
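
Algorithm 3 can also be read in terms of the state graph, where assigning a permission adds the edge between a p type node and a u type node and revoking removes it. The following is a small sketch of that reading; representing the edges as (permission, user) pairs and the function names are assumptions made for illustration.

```python
def assign_permission(edges, user, permission):
    """Add the edge linking `permission` to `user`, if it is not already present."""
    return edges | {(permission, user)}

def revoke_permission(edges, user, permission):
    """Remove the edge linking `permission` to `user`; other edges are untouched."""
    return edges - {(permission, user)}

def permissions_of(edges, user):
    """Collect the permissions currently attached to `user`."""
    return {p for (p, u) in edges if u == user}

edges = set()
edges = assign_permission(edges, "u2", "read")
edges = assign_permission(edges, "u2", "read")   # assigning twice changes nothing
granted = permissions_of(edges, "u2")
edges = revoke_permission(edges, "u2", "read")
print(granted, permissions_of(edges, "u2"))
```

Because the edges form a set, both operations are idempotent, which mirrors the membership checks in Algorithm 3.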
Another way of solving the problem of undecidability is the creation of user-group specific
ontology views. Some researchers have already worked on the creation of ontology
views and processing queries on them [120] [135]. A user-group specific view keeps
only those concept nodes for which the concerned user-group has positive authorisation (no
negative authorisation), either explicitly specified or obtained by inheritance. After the
registration of a new user and verification of the credentials submitted by him/her, the ontology
management system places the user in the relevant user-group(s) and he/she is attached
to the related ontology view(s). Since the membership of a user in a user-group remains
unchanged till his/her credentials are changed, the user-group specific view can be kept
in the local server of a user/user-group and access to digital library metadata can be
made locally instead of accessing the total ontology. Though implementation details for
view generation and ontology management are given in Chapter 7, the View-Creation
Algorithm needs to be explained here. However, before explaining the algorithm, some basic
operations done on the ontology for the creation of such views need to be discussed. These
operations are:

Figure 5.10: Sub-Graph Removal

1. Branch Removal: Whenever a user tries to access a node/concept, the security policy
server returns the corresponding authorisation of that user for that particular
node/concept depending on his/her user-group. Repeating the discussion made
earlier, it is important to note that the research effort in this thesis has considered
that a library would be accessed primarily in an open environment. In other words,
unless otherwise blocked, i.e. denied access, a user/user-group will have positive
authorisation, by default, to all concepts with their documents immediately after
a successful login. Depending on the user-group in which a user is placed after
analysing his/her credentials, he/she may get a negative authorisation to a node/concept.
Such negative authorisations are created by assigning explicit denial
to some nodes/concepts and then by inheritance from parent nodes. If
a user/user-group has negative authorisation (implicit or explicit) to a concept
node, all nodes inferable from that node also inherit that negative authorisation.
The branch removal algorithm, during user-group specific view creation,
removes those nodes of the ontology that have negative authorisation for the
concerned user-group. The associated links are also removed. Thus, the sub-graph
of the ontology structure that a user/user-group is not supposed to access because of
negative authorisation is not present in the corresponding view. Figure 5.10
explains the situation. The left side of the figure shows the ontological structure before
branch removal. In Figure 5.10, node-1 has positive authorisation. Hence all
nodes below node-1 (i.e. inferable from node-1) inherit the same positive
authorisation. Now, node-5 has an explicit negative authorisation. So once again,
nodes inferable from node-5, i.e. nodes 6, 7 and 8, inherit the same negative
authorisation. The branch removal algorithm removes these nodes and the associated
links to generate the user-specific view, as shown in the right-hand portion of
Figure 5.10. As mentioned earlier, +ve authorisation is available by default, hence
specifying -ve authorisation should have been sufficient. However, in Figure 5.10,
+ve authorisation has been shown for the purpose of clarity only. It will be shown
later that even explicit +ve authorisation will be required in some other operation.

2. Concept Obfuscation: The branch removal algorithm retains only those
concepts/nodes in the user-group specific view for which the concerned user-group
has positive authorisation. As a result, the undecidability problem is avoided.
However, the same algorithm may leave the view as a collection of disconnected
sub-graphs. Node obfuscation is required to solve this problem. Figure 5.11
explains the situation. The left side of the figure shows the ontology structure before
node obfuscation. In Figure 5.11, node-1 has positive authorisation. Hence all
nodes below node-1 (i.e. inferable from node-1) inherit the same positive
authorisation. Now, node-5 has an explicit negative authorisation. So once again,
nodes inferable from node-5, i.e. nodes 6, 7 and 8, should inherit the same negative
authorisation. However, nodes 6 and 8 are in turn assigned explicit positive
authorisations. So the branch removal algorithm would remove nodes 5 and 7 with their
associated links and leave nodes 6 and 8 as isolated nodes, detached from the original
ontology structure. To avoid such a situation, node-5 is retained in the view to connect
nodes 6 and 8. However, since the concerned user has a negative authorisation
for the concept represented by node-5, it is obfuscated. The right side of Figure 5.11
shows the user-specific view after node obfuscation. When a user query traverses
the ontology structure, it identifies the existence of node-5 but neither gets its
identity nor can access any document covered by it. This problem may occur for
controlled access to any ontology structure and it is quite common for semantic
web applications [126][85]. The author of this thesis is not sure whether such a
situation would occur in a Digital Library application, but has nevertheless decided to
extend the facility. It may become useful if the model proposed in this thesis is used for
any other semantic web application. Note that node obfuscation occurs only
if explicit +ve authorisation is allowed after assigning -ve authorisation to one or
more nodes, causing such nodes to lie between two positively authorised nodes.

3. Partial Access: This situation happens when a child node/concept has multiple
parents and the child node is partially inferable from them. Referring back to
Figure 3.2, concept DB has three parent concepts, CS, GIS and BIO. So, DB
is partially inferable from all three of them. Now if, from the credentials of a
user, he/she is assigned to a user-group which can access only the CS concept
as the parent of the DB concept, then the system would impose explicit negative
authorisations on concepts GIS and BIO. The concerned user, through the positive
authorisation to CS and the inherited positive authorisation to DB, will get access
to document classes 1, 4, 5 and 7 only, as shown in Figure 3.4 and the associated
Table 3.1. So, the branch removal algorithm will remove the GIS and BIO nodes
and their children nodes other than DB. Continuing the example, document class
nodes of the DB concept not relevant for the CS concept will also be removed, keeping
only the CS related document classes (i.e. document classes 1, 4, 5 and 7). As a
result of this branch removal, in the resultant view, the children of the DB concept
will have positive authorisation inherited from DB. The concept DB, on the
other hand, will have positive authorisation inherited from CS in spite of the -ve
authorisations inherited from the other two parent nodes, according to Rule-5a and
Rule-5b of the authorisation inference mechanism discussed in Chapter 4. Figure 5.12
shows an example ontology where, for a particular user-group, +ve and -ve authorisations
are clearly indicated in Figure (a) on the left side. Figure (b) on the right side
shows the created view along with the obfuscated nodes.

Figure 5.11: Node Obfuscation

Figure 5.12: Creation of View
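
Branch removal and node obfuscation can be sketched together for the single-parent case. The tree shape below is an assumption that only loosely follows Figures 5.10 and 5.11 (nodes 6, 7 and 8 are taken to be direct children of node-5), and the function names are hypothetical.

```python
def effective_auth(node, children, explicit, inherited="+", result=None):
    """Propagate authorisations top-down; an explicit sign overrides the inherited one."""
    if result is None:
        result = {}
    sign = explicit.get(node, inherited)
    result[node] = sign
    for child in children.get(node, []):
        effective_auth(child, children, explicit, sign, result)
    return result

def build_view(root, children, explicit):
    """Return {node: 'visible' | 'obfuscated'}; nodes missing from the result are removed."""
    auth = effective_auth(root, children, explicit)
    view = {}

    def visit(node):
        # Visit children first; a negative node survives only to connect positive descendants.
        kept_below = [visit(c) for c in children.get(node, [])]
        if auth[node] == "+":
            view[node] = "visible"
            return True
        if any(kept_below):
            view[node] = "obfuscated"
            return True
        return False

    visit(root)
    return view

# Hypothetical tree (the exact shape of the figures is assumed).
children = {1: [2, 3, 5], 3: [4], 5: [6, 7, 8]}

branch_removed = build_view(1, children, {1: "+", 5: "-"})
obfuscated = build_view(1, children, {1: "+", 5: "-", 6: "+", 8: "+"})
print(sorted(branch_removed), obfuscated[5], 7 in obfuscated)
```

In the first call the whole branch under node-5 disappears; in the second, node-5 is kept in obfuscated form so that nodes 6 and 8 stay connected, while node-7 is still removed.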

The View creation Algorithm is given below:


Algorithm 4 View Creation algorithm

Input : Set of Ontology data O and User u
Output : User Specific view Ō
function createView(O, u)
    Ō = {}    ▷ Initialization of Empty Ontology
    for i = 0 to n do
        color[i] ← WHITE    ▷ color is the node colouring used to differentiate visits. Three colours are used: WHITE, GRAY and BLACK.
        mCountNode[i] ← 0    ▷ mCountNode is a hash list storing the number of visits remaining for a multiple-parent concept.
        mStatusNode[i] ← 0    ▷ mStatusNode is a hash list storing the permission status flag.
    end for
    repeat
        for each i
            v ← Oi
            OntologyAccess(v, u)    ▷ Initialization of Ontology Access Algorithm.
            i++
    until i  n
end function
function OntologyAccess(v, u)    ▷ Recursive function of Pre-order Traversal
    color[v] ← GRAY
    mCount[v] ← GETPARENTS(v)
    α ← GetXacmlService(v, u)
    π ← getListOfChildren(v)
    δ ← 0
    passOver ← 0
    if (π.size() = ∅ ∧ color = Black) then
        removeNode(v, Ō)    ▷ removeNode removes the node from the view.
    else
        Ō ← v
    end if
    if α ≠ true then
        v ← nodeObfuscationService(v)
    end if
    for i = 0 to π.size() do
        passOver ← 0
        µ ← π[i]
        mCount[µ] ← GETPARENTS(µ)
        if mCount[µ]  1 then
            mCount[v]−−
            if α ≠ true ∧ mStatusNode[µ] ≠ 1 ∧ mCount[µ] ≠ 0 then
                Remove Link of v
                RemoveHasContributed(µ, Ō)
            else if α ≠ false then
                passOver ← 1
            else if α ≠ true ∧ mStatusNode[µ] ≠ 0 ∧ mCount[µ]  0 then
                RemoveHasContributed(µ, Ō)
                µ ← nodeObfuscationService(µ)
            end if
        else if mCount[µ]  1 ∨ passOver ≠ 0 then
            if α ≠ true then
                δ ← 1
                if getListOfChildren(µ) = ∅ then
                    color[µ] ← Black
                else if getListOfChildren(µ) ≠ ∅ then
                    Ψ ← getListOfChildren(µ)
                    n = Ψ.size()
                    flag ← 0
                    repeat
                        p ← Ψ[i]
                        if color[p] = Gray then
                            flag ← 1
                        end if
                        i++
                    until (flag != 1 | n = 0)
                    if flag = 0 then
                        color[µ] ← Black
                    end if
                    OntologyAccess(µ, u)
                end if
            end if
        end if
    end for
end function

Salient features of the view creation algorithm are listed below:

Input: Digital Library Ontology O, and the credentials and access rights of a user-group u.

Output: User-group specific Ontology view.

1. A node will be added to the user-group view once it is visited.

2. If a node is not a leaf node, the algorithm will continue traversing to its children. The
algorithm follows pre-order traversal rules.

3. If a leaf node has a single parent and does not have +ve authorisation for the
user-group u, the node will be deleted from the user-group view with all its document
classes.

4. If a node has single parent, has negative authorisation for user-group u and is not
a leaf node, then the node will be obfuscated. Its document classes will not be
available to the concerned user-group.

5. If a node has single parent and has positive permission for user-group u, all its
document classes will be added to user-group view.

6. If a node Ci has multiple parents, then the node will be traversed completely to
reach its child nodes only after visiting Ci from all its parent nodes to find the
inherited authorisations from all its parents.

7. If a node Ci has multiple parents and any one of the parents has positive
authorisation, node Ci will inherit positive authorisation, which will also be propagated
to the child nodes of Ci. Other parent nodes of Ci having negative authorisation
will be deleted from the view.

8. If a node Ci has multiple parents and all parent nodes have negative authorisations,
then Ci will be obfuscated and traversed. Similarly the negative permission will
be propagated to its child nodes unless there is any explicit positive authorisation.

9. Starting from the first node with negative authorisation, all nodes inferable from
it will be obfuscated till a leaf node is encountered. Then the entire path with
negative authorisations, starting from the first node with negative authorisation
to the leaf node, will be deleted from the view. However, if any node in between,
including the leaf, has an explicit positive authorisation, the intermediate nodes with
negative authorisations will be obfuscated.

10. The algorithm will continue till all the nodes of the ontology are visited.
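
Features 6 to 8 above (a multi-parent node is finalised only after all its parents have been visited, and it is positive if any parent is positive) can be sketched as follows. This sketch uses a queue-based topological order with a remaining-visit counter instead of the recursive pre-order traversal of Algorithm 4, so it illustrates the inheritance logic only; the function name and the sample ontology fragment are hypothetical.

```python
from collections import deque

def inherited_authorisations(parents, explicit, default="+"):
    """Compute the effective sign of each concept in a DAG.

    parents maps each concept to the list of its parents; explicit holds explicit signs.
    A concept is finalised only after all its parents (its remaining-visit counter
    reaches zero); without an explicit sign it is '+' if any parent is '+'.
    """
    children = {c: [] for c in parents}
    for concept, ps in parents.items():
        for p in ps:
            children[p].append(concept)
    remaining = {c: len(ps) for c, ps in parents.items()}   # visits still expected
    queue = deque(c for c, n in remaining.items() if n == 0)
    auth = {}
    while queue:
        concept = queue.popleft()
        if concept in explicit:
            auth[concept] = explicit[concept]
        elif not parents[concept]:
            auth[concept] = default            # open environment: roots default to '+'
        else:
            auth[concept] = "+" if any(auth[p] == "+" for p in parents[concept]) else "-"
        for child in children[concept]:
            remaining[child] -= 1
            if remaining[child] == 0:
                queue.append(child)
    return auth

# Hypothetical fragment: DB has parents CS, GIS and BIO; DBMS is a child of DB.
parents = {"CS": [], "GIS": [], "BIO": [], "DB": ["CS", "GIS", "BIO"], "DBMS": ["DB"]}
auth = inherited_authorisations(parents, {"GIS": "-", "BIO": "-"})
all_negative = inherited_authorisations(parents, {"CS": "-", "GIS": "-", "BIO": "-"})
print(auth["DB"], auth["DBMS"], all_negative["DB"])
```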

5.1.4 Properties of Ontology Views

User-group specific views satisfy some properties that ensure the soundness and safety
of the view creation process.

1. Creation of ontology views resolves authorisation conflicts. This has already been
explained with an example.

2. The number of views generated by the view creation algorithm is finite. If the ontology
has n concepts, the number of views can be at most 2^n. However, the actual number
of views will be much smaller, since each created view depends on the connection
structure of the ontology hierarchy and not all nodes are reachable/inferable from
all other nodes.

3. The view created against a given set of authorisations is unique. This is a necessary
safety property for view creation. If the same view can be created by more than one
set of authorisations, security policies can be compromised.

Further explanations and necessary proofs are given in the next chapter.
Chapter 6

Access Control Specification and Conflict Management

6.1 Introduction

The discussion so far has been restricted to the control of access to the ontology hierarchy
of a digital library for the members of a single user-group. However, a library management
system should have the provision to define more than one user-group, arising out
of the different credentials of different users and their different requirements for accessing
the ontology structure. These user-groups can also form a hierarchy where one user-group
may become the sub-group of another larger user-group. Different users from
distinct user-groups will make queries on the digital library and thus will access the
DL ontology. As a result of the interaction between the two hierarchies of the DL ontology,
the user-group hierarchy and the concept hierarchy (both are DAGs), conflicts may arise in the
access control policy implemented for the library. The main motivation of this chapter is
to identify such conflicts and to offer solutions for them. It also shows that the proposed
access control model is free from any undecidability when the access control mechanism
may specify both positive and negative authorisations.
In real life a user may not always access ontology metadata just as an individual;
he/she may be a member of a user-group or may be assigned to a role, as in a role
based access control system. So from the authorisation point of view a subject can be of
three types: a user, a user-group or a role. A role is defined as a named collection
of privileges/access rights needed to perform specific activities in a system. Users
are assigned to roles. Roles may maintain a hierarchy where a parent role inherits
authorisations from the child roles connected to it.
A user-group, on the other hand, is a set of users placed in a group where all members
have the same set of {object, access right} combinations. A user-group may be
defined as the sub-group of another user-group, forming a hierarchy. However, here
a child user-group inherits authorisations from its parent user-group(s), opposite to the
direction of inheritance defined for a role hierarchy. Both role and user-group hierarchies
are acyclic in nature. Moreover, a particular user may be a member of more than one
user-group and, similarly, a user may be assigned to more than one role. As a result,
the underlying structure of both hierarchies is a Directed Acyclic Graph (DAG).
Distinctions between user-group and role hierarchies have been discussed in [77]. In
the access control model proposed in this thesis, the only access right considered is Read.
So a subject can either get a positive authorisation to read an object or a negative
authorisation for explicit denial. Browse has been considered as a default access right
available to all subjects for all concepts in the ontology. It ensures that the access control
mechanism applied to the system will not hinder any ontology traversal algorithm. However,
browse by default does not allow access to the document class level; read authorisation
is required for that. Since only one access right is available, this thesis has considered
the user-group hierarchy only. Figure 6.1 shows one such hierarchy. The example has been
borrowed from [77]. Figure 6.1 shows that some users (Jeremy, George and Lucky)
are members of more than one user-group. Such users inherit authorisations
from more than one user-group, which may give rise to authorisation conflicts. Chapters
4 and 5 have discussed the access control model against a single user-group. The next
sub-section, however, considers the presence of both subject and object hierarchies.
The present chapter then discusses their possible conflicts and
suggests ways of resolving them.

Figure 6.1: User-group Hierarchy (user-groups: Public, Citizens, CS-Dept, Non-Citizens, Eng-Dept, CS-Faculty; users: Lucky, Mike, Sam, George, Jim, Mary, Jeremy)

6.1.1 Object and Subject Hierarchy

In the object hierarchy of the proposed model, an ontology contains concepts, a concept
contains document classes and a document class contains documents. So the hierarchy
is captured by an Inclusion(γ) relation. For ease of representation in the algebraic
expression, in this chapter, both Inferable(⇒) and Inclusion(γ) relationships of
Chapter 3 have been merged into one.

• Non-commutative : Inclusion relation is non-commutative. ∀x∀y{γ(x, y) → ¬γ(y, x)}

• Transitive : Inclusion supports transitivity. ∀x∀y∀z{γ(x, y) ∧ γ(y, z) → γ(x, z)}

Since the inclusion relation is non-commutative and transitive, the objects
related by inclusion define a hierarchy, i.e. the concept/object hierarchy.
Apart from the object hierarchy, there is also the user-group hierarchy explained earlier.
Users are grouped into classes termed user-groups, and these user-groups
form the user-group/subject hierarchy. Access rights (permissions) to access
specific objects are assigned to user-groups. A user inherits the access permissions of the
user-group to which it has been assigned. A user may be assigned to multiple
user-groups, as shown in Figure 6.1; this may give rise to conflicting authorisations,
which are explained later. Similar to the Inclusion relation for the object hierarchy,
entities in the user-group hierarchy are connected by the Assignment(α) relationship
explained in Chapter 3.

• Non-commutative : Assignment relation is non-commutative. ∀u1∀u2{α(u1, u2) → ¬α(u2, u1)}

• Transitive : Assignment supports transitivity. ∀u1∀u2∀u3{α(u1, u2) ∧ α(u2, u3) → α(u1, u3)}

Since the assignment relation is non-commutative and transitive, the user-groups
related by assignment define a hierarchy, i.e. the user-group hierarchy.
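The two properties stated for Inclusion and Assignment can be checked mechanically. The following Python sketch is illustrative only and is not part of the thesis implementation: it computes the transitive closure of a set of direct (child, parent) edges and tests the non-commutative property on the result. The function names and the example edges are assumptions made for illustration; the actual edges of Figure 6.1 are not reproduced here.

```python
from itertools import product


def transitive_closure(pairs):
    """Return the smallest transitive relation containing `pairs`."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so that the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def is_non_commutative(relation):
    """True when (x, y) in the relation implies (y, x) is not in it."""
    return all((y, x) not in relation for (x, y) in relation)


# Hypothetical direct assignments alpha(child, parent); the edges are assumed.
alpha = {("CS-Faculty", "CS-Dept"), ("CS-Dept", "Eng-Dept"), ("Eng-Dept", "Public")}
closure = transitive_closure(alpha)

# A cyclic edge set violates the property after closure, so it is not a hierarchy.
cyclic = transitive_closure({("a", "b"), ("b", "a")})
```

If the closure of the direct edges is still non-commutative, the edge set is acyclic and therefore describes a hierarchy in the sense used above.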
The above relationships, explained in Chapter 3, have already been used in the authorisation/policy
specification and derivation rules discussed in Chapter 4. However, Chapter
4 considered only the concept hierarchy. These rules now have to be extended for two
hierarchies that may partially conflict. Both the user-group hierarchy and the concept
hierarchy of the ontology structure are polyhierarchic, i.e. a concept in the ontology hierarchy
can have multiple parents, as shown in Figure 3.2, and a user/user-group can also
have multiple parents, as shown in Figure 6.1. It is therefore possible for a user/user-group
to have conflicting authorisations to a concept, which need to be resolved so that there

is no decidability problem in inferring authorisation. Moreover, both hierarchies
are transitive and hence support inheritance of authorisation. So, if
a user-group ui holds an explicitly specified authorisation set β, then after inheriting
authorisations it will have a derived authorisation set βder, which is a superset of β.
Chapter 4 has already enumerated the authorisation derivation rules considering only
the concept/object hierarchy. This chapter augments those rules to include both the
object and subject hierarchies. The augmented set of authorisation derivation rules is:

• [Rule 1] Reflexivity Rule: All tuples of β are inherited by βder.

• [Rule 2a] Inheritance Rule [Direct]: For the user-group hierarchy the rule is
((ox, uj, ak, sl) ∈ β) ∧ α(ui, uj) ⇒ (ox, ui, ak, sl) ∈ βder.

• [Rule 2b] Inheritance Rule [Direct]: For the concept hierarchy in the ontology the rule
is ((oy, ui, ak, sl) ∈ β) ∧ γ(ox, oy) ⇒ (ox, ui, ak, sl) ∈ βder.

• [Rule 3] Inheritance Rule [Mutual Exclusivity]:
(((ox, uj, ak, sl) ⊕ (oy, uj, ak, sl)) ∈ β) ∧ α(ui, uj) ⇒ ((ox, ui, ak, sl) ⊕ (oy, ui, ak, sl)) ∈ βder.

• [Rule 4] Override Rule:
((ox, ui, ak, sl) ∈ βder) ∧ ((ox, ui, ak, ¬sl) ∈ β) ⇒ ((ox, ui, ak, ¬sl) ∈ βder) ∧ ((ox, ui, ak, sl) ∉ βder).

• [Rule 5a] Inheritance rule for Multiple Parents (Concept Hierarchy only): If, for
a user-group ui, the policy set applicable to object oA is βA and that for object
oB is βB, and object oc is a child of both oA and oB, then (βA ∪ βB) ⊆ βCder,
where βC is the policy set explicitly specified for the object oc and βCder is the
total derived policy set for oc with respect to user-group ui.

• [Rule 5b] Conflict Rule under Multiple inheritance (Concept Hierarchy only):
(((oA, ui, ak, si) ∈ βA) ∧ ((oB, ui, ak, ¬si) ∈ βB)) ⇒ (((oc, ui, ak, si) ∨ (oc, ui, ak, ¬si)) ∈ βder).

• [Rule 6] Inheritance rule for Multiple Parents (User-Group Hierarchy only):
((ox, uj, ak, sl) ∈ βuj) ∧ ((ox, ut, ak, ¬sl) ∈ βut) ∧ α(ui, uj) ∧ α(ui, ut) ⇒ ((ox, ui, ak, sl) ⊕ (ox, ui, ak, ¬sl)) ∈ βuider.

According to Rule-1, β covers the authorisations explicitly specified for each (concept,
user-group) combination. For a user-group, all other authorisations are inherited either
through the user-group hierarchy or through the concept hierarchy in the ontology, which
gives rise to βder.
Rule-2a and Rule-2b provide the inheritance rules applicable to the user-group and concept
hierarchies respectively. According to Rule-2a, if a user-group ui is assigned to

another user-group uj, i.e. ui is a subset of uj, then any authorisation explicitly specified
for the superset uj will be inherited by its subset ui and will be included in its βder.
Similarly, for Rule-2b, if an object/concept ox in the ontology is included in another object
oy, i.e. ox is a child concept of oy, then any authorisation that allows a user-group
ui to access oy also provides access to ox for the same access right.
Rule-3 covers the situation where a Separation of Duty constraint (Exclusive-Or operation) is
specified for a user-group uj. Any user-group ui assigned to uj (i.e. ui is a subset of uj)
inherits the same Separation of Duty property.
Rule-4 signifies that, in case of a conflict of authorisation between the policy sets β
and βder, an explicit specification of authorisation (member of β) overrides an
authorisation obtained by inheritance (member of βder).
Rule-5a and Rule-5b consider concepts with multiple parents in the concept hierarchy
of the DL ontology, i.e. multiple inheritance situations. If an object/concept ox has
multiple parent concepts in the ontology hierarchy, then any user-group ui accessing
the child object ox inherits all the authorisations (+ve or −ve) it has on all
the parent concepts. If the authorisations derived/inherited
from the parent concepts are in conflict, their OR combination applies. This
follows because the child concept inherits the union of authorisations from its
parent concepts. As long as at least one of the parent concepts provides a positive
authorisation for a particular access right to ui, the members of ui will be able to access the
child concept with the same access right. However, access to the document class nodes
will be controlled by the authorisations inherited from the parent concepts.
Rule-6 deals with multiple inheritance in the user-group hierarchy. As shown in
Figure 6.1 and the associated explanation, a user-group may be the child/subset of more
than one super user-group. As a result, the child node in the user-group hierarchy
inherits authorisations from more than one parent node, which may give rise to a conflict of
authorisations. In the rule, a user-group ui is a sub-group of two other
user-groups uj and ut and inherits two conflicting authorisations. Following the Separation
of Duty principle described in [63], the concept of a session (between login and logout) for
accessing the digital library has been introduced. In a session, members of user-group
ui inherit authorisations either from uj or from ut (Exclusive-Or operation), but not from
both. The choice has to be made at login time. This automatically avoids
the conflicts that arise in the user-group hierarchy from the presence of multiple
parent user-groups.
Incidentally, Access to Document Class, as explained in sub-section 4.5.3 of Chapter 4,
remains unaltered even in the presence of two separate hierarchies, since conflicts
between the two hierarchies (user-group and concept) do not affect access to the document
classes below once a (user-group, concept) combination has a positive authorisation.
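The interplay of Rule-1, Rule-2a, Rule-2b and Rule-4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the derivation algorithm of the thesis: tuples follow the (object, user-group, access right, sign) order of the rules, the sign is modelled as a boolean, inheritance is applied transitively through both hierarchies, and the explicit set β is assumed to be free of contradictions. The names `ancestors` and `derive`, and the example edges α(u2, u1) and γ(DB, Bio), are chosen for illustration.

```python
def ancestors(node, edges):
    """All nodes reachable from `node` by following (child, parent) edges."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for child, parent in edges:
            if child == current and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def derive(beta, alpha, gamma):
    """Derive beta_der from explicit tuples (object, group, right, sign)."""
    groups = {g for edge in alpha for g in edge} | {t[1] for t in beta}
    concepts = {c for edge in gamma for c in edge} | {t[0] for t in beta}
    derived = set(beta)  # Rule 1: every explicit tuple is kept
    for obj, grp, right, sign in beta:
        # Rule 2a: sub-groups of grp inherit; Rule 2b: sub-concepts of obj inherit.
        heir_groups = {g for g in groups if grp in ancestors(g, alpha)} | {grp}
        heir_concepts = {c for c in concepts if obj in ancestors(c, gamma)} | {obj}
        for g in heir_groups:
            for c in heir_concepts:
                derived.add((c, g, right, sign))
    # Rule 4: an explicit tuple removes the inherited tuple of opposite sign.
    for obj, grp, right, sign in beta:
        derived.discard((obj, grp, right, not sign))
    return derived


beta = {("Bio", "u1", "read", True), ("DB", "u1", "read", False)}
alpha = {("u2", "u1")}   # u2 is assigned to (is a sub-group of) u1
gamma = {("DB", "Bio")}  # DB is a child concept of Bio
derived = derive(beta, alpha, gamma)
```

In this example the explicit denial of u1 on DB survives and the inherited positive tuple is dropped. Opposite-sign pairs that remain in the result, such as those of u2 on DB, are inherited conflicts of the kind that Rule-5b and Rule-6 address.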

6.1.2 Completeness of the Rules

In order to justify the rules enumerated above, it is necessary to establish that all the
security requirements are met by the given set of rules. However the proposed system
has made some assumptions.

Completeness assumptions:

• Each concept in the ontology hierarchy is accessed by at least one user-group in
the user-group hierarchy.

• Each node pair (parent and child concept) in the ontology hierarchy is accessed
by at least one user-group in the user-group hierarchy.

The completeness assumptions ensure that each node and each edge of the ontology
hierarchy are accessed by some user-group.

Authorisation assumptions:

• Since the application domain is a digital library, it is assumed that the system normally
operates as an open system, i.e. all concepts are accessible unless an
explicit negative authorisation is specified.

• The entry point to the DL ontology is assumed to be a hypothetical root node that
accepts a user (through standard user-code/password verification). The user
is then assigned to a user-group, either by creating its credentials (in the case of a new user)
or by verifying the credentials already stored in the system (in the case of an existing
registered user).

• A user may belong to more than one user-group and for implementing Separation
of Duty, the concerned user has to declare its chosen user-group for his/her session
of accessing the library.

Earlier in this thesis, the proposed access control model highlighted the need for
creating ontology views. When such a view is created for a user-group, it
contains only those concepts to which the concerned user-group has access. So, the conflict
arising out of the multiple inheritance situation highlighted by Rule-6 above will not be
encountered. Creation of a user-group specific view also ensures that any conflict in the
inheritance of authorisation, as depicted in Rule-5b, is avoided. Nevertheless, it is

Figure 6.2: Object and Subject Hierarchy
(Diagram not reproduced. Panel "User Graph Hierarchy": nodes u1, u2, u3, u4, u5. Panel "Concept Hierarchy": nodes DL, GIS, Bio, CS, DB.)

necessary to show that the rule set defined is sound and complete with respect to the
access control requirements, and that the authorisation at any concept node with respect to any
user-group is decidable.

6.2 Conflict and Conflict Management

The possible sources of conflict in the proposed access control system are:

• User-Group to User-Group Conflict: Two conflicting privileges should not be assigned to
or inherited by the members of a particular user-group through the user-group
hierarchy. Figure 6.2 shows a user-group hierarchy and a concept hierarchy.
If user-groups u1 and u2 are assigned the authorisations <u1, Bio, read, +>
and <u2, Bio, read, −>, members of user-group u2 will simultaneously get positive
and negative access to the concept Bio, the former by inheritance from u1 and the latter by
direct assignment, making the authorisation at u2 undecidable.

• Concept to Concept Conflict: Two conflicting privileges should not be assigned to
or inherited by the members of a particular user-group through concept hierarchy.
There can be two such situations:

– Conflict between explicit authorisation and implicit authorisation: Let the
user-group u1 be assigned with two different authorisations, <u1, Bio, read, +>
and <u1, DB, read, −>. Here, members of user-group u1 will have conflicting
authorisations for accessing the concept DB, since implicit authorisation
inherited from concept Bio will offer a positive authorisation for accessing
DB, while a direct/explicit assignment will provide a negative authorisation
for accessing concept DB.

– Conflict between two implicit authorisations: Let the user-group u1 be assigned
with two different authorisations, <u1, Bio, read, −> and <u1, CS, read, +>.
Here, user-group u1 will inherit both positive and negative authorisations for
the common child concept DB from two different parent concepts Bio and
CS.

• Combined Conflict of User-Group and Concept: This is a combination of the previous two
conflicts, i.e. a conflict caused by both the user-group and the concept hierarchy. Let the
user-groups u1 and u2 be assigned the authorisations <u1, Bio, read, +>
and <u2, GIS, read, −>. For the concept DB, user-group u2 then has conflicting
authorisations. On the one hand, <u1, Bio, read, +> → <u2, Bio, read, +> (inheritance
through user-group hierarchy) → <u2, DB, read, +> (inheritance through
concept hierarchy). On the other hand, <u2, GIS, read, −> →
<u2, DB, read, −> (inheritance through concept hierarchy).
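The conflict sources listed above share one test: a (user-group, concept, access right) combination for which both signs can be inferred. A minimal Python sketch of that test follows. It is an illustration, not the thesis implementation; the function name `find_conflicts` is an assumption, and the input lists only the tuples that the combined-conflict example in the text spells out, in the <user-group, concept, access right, sign> order used in this section.

```python
from collections import defaultdict


def find_conflicts(authorisations):
    """Return the (group, concept, right) keys that carry both '+' and '-'."""
    signs = defaultdict(set)
    for group, concept, right, sign in authorisations:
        signs[(group, concept, right)].add(sign)
    return sorted(key for key, seen in signs.items() if seen == {"+", "-"})


# Tuples from the combined-conflict example, explicit and inherited.
derived = {
    ("u1", "Bio", "read", "+"),  # explicit
    ("u2", "Bio", "read", "+"),  # inherited through the user-group hierarchy
    ("u2", "DB", "read", "+"),   # inherited through the concept hierarchy
    ("u2", "GIS", "read", "-"),  # explicit
    ("u2", "DB", "read", "-"),   # inherited through the concept hierarchy
}
conflicts = find_conflicts(derived)
```

Here the only conflicting combination is user-group u2 on concept DB for the read access right, which matches the example.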

Lemma 6.1. Under the given set of rules, authorisation for any (user-group, concept)
combination is decidable.

Proof. For a user-group, undecidability at a particular concept node arises out of the
different types of conflicts discussed earlier. A conflicting condition is one where, for
a (user-group, concept) combination, both positive and negative authorisations can be
inferred, either by direct assignment (explicit authorisation) or by inheritance (implicit
authorisation). It has already been discussed that conflict may arise out of User-Group
to User-Group Interaction, Concept to Concept Interaction, and Combined Interaction of
User-Group and Concept. However, the discussion above has revealed that the conflicts are
mainly of two types: conflict between explicit and implicit authorisations, and conflict
between two or more implicit authorisations. The lemma may be proved
by contradiction. So, if a state of undecidability is allowed to occur for a
(user-group, concept) combination, it is necessary to show that the existing rule set can
resolve such a conflict and bring the system out of the state of undecidability.

• Conflict between Explicit and Implicit authorisation:
If, at any concept node, the authorisation directly assigned to a user-group ui
(explicit authorisation) conflicts with the authorisation inherited (implicit
authorisation) through the concept hierarchy, the user-group hierarchy or a combination of both,
the condition of undecidability is resolved by Rule-4 (Override Rule), where the
explicit authorisation is given preference over the inherited or implicit authorisation.

• Conflict between two or more Implicit authorisations:
If conflict arises out of two or more implicit authorisations inherited by a child

concept ci from its multiple parent concepts (through concept hierarchy) for a
user-group ui, the condition of undecidability is resolved by Rule-5b, where the
inherited authorisations are combined by a logical OR operator. As a result,
any implicit positive authorisation provides access to the concerned child
concept ci. Children of user-group ui and children of concept node ci inherit
such positive authorisation.
Conflict of implicit authorisations through the user-group hierarchy is not permitted
by Rule-6, since the Exclusive-OR condition between implicit authorisations
enforces Separation of Duty. A user at login time (the beginning of an access
session) has to declare the user-group under which it will make its access.

So, all (user-group, concept) combinations under the given rule set are decidable.
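The resolution order used in the proof can be summarised as a small decision function. The Python sketch below is a hedged illustration, not the algorithm of the thesis: it assumes that the explicit sign and the list of signs inherited from parent concepts are already known for each (user-group, concept) pair, and it models the login-time declaration of Rule-6 as a `session_group` argument. The function names, the data layout and the group memberships are all hypothetical.

```python
def decide(explicit, inherited):
    """Resolve one (user-group, concept) pair to a single boolean decision."""
    if explicit is not None:
        return explicit        # Rule-4: explicit overrides inherited
    if inherited:
        return any(inherited)  # Rule-5b: OR over the parent concepts
    return True                # open system: accessible unless denied


def decide_for_user(policies, memberships, user, session_group, concept):
    """Apply the session choice of Rule-6, then resolve the pair."""
    if session_group not in memberships[user]:
        raise ValueError("user is not a member of the declared user-group")
    explicit, inherited = policies.get((session_group, concept), (None, []))
    return decide(explicit, inherited)


# Hypothetical data: (explicit sign or None, signs inherited from parent concepts).
policies = {
    ("u1", "DB"): (False, [True]),       # explicit denial beats inherited grant
    ("u2", "DB"): (None, [True, False]),  # conflicting parents, OR-combined
}
memberships = {"Jeremy": {"u1", "u2"}}

as_u1 = decide_for_user(policies, memberships, "Jeremy", "u1", "DB")
as_u2 = decide_for_user(policies, memberships, "Jeremy", "u2", "DB")
as_u2_other = decide_for_user(policies, memberships, "Jeremy", "u2", "GIS")
```

Because exactly one user-group is active per session and each branch returns a single value, every (user-group, concept) pair yields one decision, which is the decidability claim of the lemma in miniature.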

Lemma 6.2. The policy or policy primitive combination used to access a particular document class is
unique.

Proof. This is a necessary safety property, as explained in [88]. If a user-group can reach,
i.e. access, a concept node by more than one policy primitive combination joined by
logical operators, the security of access can easily be compromised. So, in the proposed
access control model, it is first necessary to show that access to a concept node by
a user-group is safe, and then that the safety property is also maintained for access to any
document class covered by a concept node. This proof can also be attempted by contradiction.
So, as the first proposition, assume that a concept node can be reached by more than one
policy primitive combination.

• If a concept node ci has only one parent, then ci inherits only one set of
authorisations besides its own explicit authorisations. It follows directly
from Lemma 6.1 above that any conflict at the parent node is resolved by
the rule set previously specified. So the authorisations inherited from a single parent
are always unique. Extending the proposition, the children of ci also inherit
a unique set of authorisations.

• If a concept node ci has multiple parent nodes, unless there is any conflicting
authorisation, a user-group ui will either be permitted to access ci or it will be
an explicit denial. In case of conflict again, it is shown in Lemma 6.1 that such
conflicts can be resolved.

• If a user inherits conflicting authorisations from more than one user-group where
it is a member, login time declaration will decide which user-group to be made
effective for the concerned user for the login session under consideration (Separation
of Duty).

So, for a (user-group, concept) combination, the policy primitive combination permitting access is
unique. If a concept node has only one parent, it will have only one document class
under it, and access to it will also be made by a unique policy primitive combination.
However, it is still necessary to justify that, for any concept node having multiple parent
nodes, access to a document class is also made by a unique policy primitive
combination. Referring to Figure 3.2 and Table 3.1, access to a
document class can be made via more than one parent concept node. Considering the
running example discussed in this thesis, the policy primitive combinations to access
document class 7 (dc7) by a user-group ui are
{((CS, ui, read, ∗sg).(DB, ui, read, ∗sg)) ∨ ((BIO, ui, read, ∗sg).(DB, ui, read, ∗sg)) ∨ ((GIS, ui, read, ∗sg).(DB, ui, read, ∗sg))},
where ∗sg → (sg ∨ ¬sg) and the “.” operator implies inheritance. If a conflict arises for
inheritance through different paths because of multiple parents, Rule-4 and Rule-5b
resolve it, as discussed in Lemma 6.1.

Theorem 6.3. The proposed rule set is sound and complete.

Proof. Soundness of the rule set is justified when it ensures that access to all concept
nodes and document classes is safe and decidable. Lemma 6.1 and Lemma 6.2
provide the required justification. The rule set can be considered complete if the total
policy derived for a (user-group, concept) combination is complete. In other words,
after the application of the given set of rules, no other policy/policy primitive can be
obtained. The different cases for membership of βder discussed earlier can be considered
for this purpose. Membership of the derived set of policies βder for a (user-group, concept)
combination can be either by explicit authorisation or by inheritance (implicit authorisation).
For a (user-group, concept) combination:

• All explicit authorisations in β should be members of βder. This is ensured by Rule-1
(Reflexivity Rule).

• Authorisations available to superconcept(s) are inherited by subconcept(s).
Similarly, authorisations available to super user-group(s) are inherited by sub
user-group(s). Rule-2a and Rule-2b ensure such inheritance.

• Any conflict, either through the user-group hierarchy or the concept hierarchy, is handled
by Lemma 6.1 or Lemma 6.2 to ensure decidability and safety for all (user-group,
concept) combinations.

Since there is no other way to derive authorisation for a (user-group, concept) combina-
tion, the proposed rule set is complete.

6.2.1 Conflict Resolution Against Update

The discussion so far ensures that, for a given ontology hierarchy and user-group hierarchy,
all possible conflicts can be resolved. However, both the concept and user-group
hierarchies may change and consequently introduce new authorisation conflicts. So
it is necessary to propose a method by which the access control system remains conflict
free against any possible change. The possible changes are:

• Addition of new policy primitives.

• Addition of new authorisations to existing user-groups.

• Restructuring of Concept hierarchy

– Addition of concepts
– Addition of edges

• Restructuring of User-Group hierarchy

– Addition of user-groups
– Addition of edges

Definition 6.4 (Non Conflicting Updates). If any change in the system, by introducing
new policy primitives or new assignments of existing policy primitives or any structural
change in any of the hierarchies, can ensure that no new conflicting authorisations are
encountered in any of the two hierarchies, then the corresponding update in the system
is non conflicting.

Depending on the structure of the two existing hierarchies, i.e. the concept and user-group
hierarchies, some nodes may never conflict under any possible authorisation update. For
example, non-inferable node pairs will never be in conflict. Apart from user-group
to user-group conflicts, all other conflicts arising after updates can be taken care of by the
rule set already defined. Since Separation of Duty needs to be ensured between two
conflicting user-groups, the system should monitor during each update whether such conflicts are
being generated.

Definition 6.5. Two user groups ui and uj will be non-conflicting, if between them, no
conflicting explicit or implicit authorisations can be inferred.

Determination of the non-conflicting set has been proposed in [122][130]. It constructs an
n × n conflict matrix, where n is the number of user-groups. If there is a conflict
between user-groups i and j, the element Ci,j is set to 1; otherwise it is 0. In order to determine

Table 6.1: Conflicting User-Group Table

        1   2   3   4   5   6   7   8   9   10
  1     0   0   0   0   0   0   0   0   0   0
  2     0   0   0   0   0   0   0   0   0   0
  3     0   0   0   0   0   0   0   0   0   0
  4     0   0   0   0   0   0   0   0   0   0
  5     0   0   0   0   0   0   0   0   0   0
  6     0   0   0   0   0   0   0   1   1   0
  7     0   0   0   0   0   0   0   0   1   1
  8     0   0   0   0   0   0   1   0   0   0
  9     0   0   0   0   0   0   1   1   0   0
  10    0   0   0   0   0   0   1   1   0   0

Table 6.2: Conflict Free User-Group Table

        1   2   3   4   5   6   7   8   9   10
  1     1   1   1   1   1   1   1   1   1   1
  2     1   1   1   1   1   1   1   1   1   1
  3     1   1   1   1   1   1   1   1   1   1
  4     1   1   1   1   1   1   1   1   1   1
  5     1   1   1   1   1   1   1   1   1   1
  6     1   1   1   1   1   1   1   0   0   1
  7     1   1   1   1   1   1   1   1   0   0
  8     1   1   1   1   1   1   0   1   1   1
  9     1   1   1   1   1   1   0   0   1   1
  10    1   1   1   1   1   1   0   0   1   1

conflict-free user-groups, a dual of the conflict matrix is created, in which an element with
value 1 denotes a conflict-free user-group pair. During an update, the system should ensure
that any new authorisation assignment or structural change preserves Separation
of Duty, i.e. the conflict-free user-groups of the dual matrix. Table 6.1 shows the conflict
matrix for 10 user-groups and Table 6.2 shows its dual, i.e. the conflict-free matrix.
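The conflict matrix and its dual can be sketched as follows. This Python fragment is an illustration under stated assumptions rather than the method of [122][130]: user-groups are numbered from 1, conflicts are supplied as ordered (i, j) pairs, and the four-group example is hypothetical and does not reproduce Table 6.1. The function names are chosen for illustration.

```python
from itertools import combinations


def build_conflict_matrix(n, conflicting_pairs):
    """C[i][j] = 1 when user-groups i and j (1-based) conflict, else 0."""
    matrix = [[0] * n for _ in range(n)]
    for i, j in conflicting_pairs:
        matrix[i - 1][j - 1] = 1
    return matrix


def dual(matrix):
    """Flip every entry: 1 now marks a conflict-free user-group pair."""
    return [[1 - cell for cell in row] for row in matrix]


def preserves_separation(conflict_free, groups):
    """True when every pair in `groups` is conflict-free in both directions."""
    return all(
        conflict_free[i - 1][j - 1] == 1 and conflict_free[j - 1][i - 1] == 1
        for i, j in combinations(sorted(groups), 2)
    )


# Hypothetical example with four user-groups where groups 3 and 4 conflict.
conflict = build_conflict_matrix(4, [(3, 4), (4, 3)])
conflict_free = dual(conflict)
```

An update that would let a user act through user-groups 3 and 4 in the same session fails the `preserves_separation` check, whereas a combination such as groups 1 and 2 passes.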
Chapter 7

Architectural View and Implementation Details

The implementation model has two major blocks. The first is the standalone library
that generates the secured views, and the other is the web-based layered architecture
that provides the web service interface of the view management system. Inside the
standalone library block, the view generation model is the main component. The view
management component initiates queries through the query processing unit. It
also provides the interfaces through which the policy store can be accessed. The policy
store is the collection of security policies mapped to ground logic (like read, write, etc.).
All the semantic queries have been written to the SPARQL specification
and fetch data from the Virtuoso database, which is the default ontology store in this
model. The model has been considered as an open system and the policy store contains
a set of explicit rules. The current implementation uses XACML [114]
based policy specification with the SUN XACML engine. For the web interface, the model
relies on a Java RESTful implementation with JSON 2.0. As a result, the developer
can choose from a range of tools to generate the client interfaces, and a client can
connect from a range of devices such as Android-based tablets, iOS-based devices, Windows
devices and so on.

7.1 Anatomy of The Digital Library Deployment

The overall functionality of a Digital Library system includes various components and processes.
However, it is necessary to understand the overall deployment of the architecture
before going into component and process details. Figure 7.1 represents the overall
scenario and is followed by the description of each component.

Chapter 7. Implementation Model 74

Figure 7.1: Deployment Diagram of Digital Library
(Diagram not reproduced. Components: Digital Library Repository, Metadata Storage, Metadata Server, XACML Storage, Security Engine, Organisational Hierarchy, View Server, Digital Library Gateway, Admin, Organisational Firewall and Users’ Groups, connected by links numbered 1 to 12, including 8a, 8b and 8c.)

• Organizational Firewall : The organizational firewall is the entry point of the digital
library setup. Users’ requests first arrive at this point. The firewall is
assumed to be a standard enterprise firewall that protects the systems from various
kinds of network attacks [160]. The present thesis does not cover firewall related
activity but assumes that the firewall has been properly configured to monitor
incoming and outgoing traffic.

• Digital Library Gateway : The digital library gateway contains two important
components. The first is the credential server, which provides users’ credentials to access
the digital library (DL). During the registration process, a user supplies some mandatory
information to the credential server by filling in a registration form. For example, it
may include name, registration number, department name, roll number etc. The
registration form contains information related to the identity of a particular user.
The credential manager analyses the information either automatically by some rule set

or through administrative intervention. After the verification, access is granted
and the system assigns the user to a user-group and returns the user-group-id with
the credential-id to the user. Once the credential is verified, the credential manager
allows the user’s client to establish a session and to send the
user’s requests to the View Manager. Apart from authentication, the gateway server
also balances the load on the view server. Since the web layer interface of the view
server has been implemented on top of Java RESTful [68] web services with
Apache Tomcat version 7 [10], the Apache mod_jk based load balancing API is
the natural choice. In short, the Digital Library Gateway forwards only authenticated
requests to the View Server after credential verification, and it also controls
the traffic load on the View Server to avoid a bottleneck at the next level.

• View Server : The View Server plays the central role in this model. It
generates and manages users’ views. The View Server is connected with the
Metadata Server and the Security Engine through links 3 and 4 respectively. In the
process of generating views, the View Server executes queries at the Metadata Server
using the SPARQL specification. The View Server also sends queries to
the Security Engine through API calls. In response, the Security Engine sends
back the authorisation value. Using both data sets, the View Server initiates
view creation using the View Generation Algorithm, which has been described in an
earlier chapter. The View Server is exposed to the Digital Library Gateway through Java
Web services and deployed over the Tomcat Servlet Container. The number of
servers can be increased in anticipation of load by using a Tomcat cluster.

• Metadata Server : The metadata server contains a consolidated collection of digital
library metadata. Figure 7.1 depicts the connections between the metadata server
and various digital libraries through links 8a, 8b, and 8c. So, the metadata server
contains a consolidated view of the metadata of all the participating digital libraries. The
underlying structure of the ontology data residing on the metadata server is represented
using OWL specifications. The Virtuoso database in the metadata server stores
all the data in a typical graph format. It also provides seamless integration
of the query processing engine with the SPARQL specification. The Virtuoso web engine
exposes the query API as services. Normally, a client sends queries through
secure connections from remote locations. In this model, JDBC has been used for
client-server communication. Each query is answered in the triplet format.
The description of query and response will be given with the view generation algorithm.

• Security Engine : The security engine stores the security policies and ground rules.
A security policy is a mapping among subject, object and permissions. The security
engine can be accessed through the XACML API. The view server sends a subject

name and an object name with the requested operation (access right) to the security engine.
In response, the security engine sends back the policy authorisation value. The security
engine is connected with the Policy storage and the Organisational hierarchy storage. The
Policy Storage stores policies in XACML. In this model, to keep the volume
of policies small, the policy repository stores explicit policies only. All the implicit policies are
derived during operation through policy derivation algorithms. The organisation
structure provides the connections and the arrangement among various user-groups.
However, the ontology structure within the organisation is available at the organisational
structure repository. The various components of the security policy and
security algorithm will be discussed in a later section.

• Metadata Storage : Metadata storage is a network storage that stores the
digital library metadata in OWL format. The Virtuoso server maintains this storage.
Its location and other parameters can be configured using the Virtuoso
configuration file.

• XACML Storage : XACML storage stores the policy files. The XACML Policy Decision
Point (PDP) resides at the Security Engine and the XACML Storage is mounted
on the same server. Each time an authorisation request is initiated, the PDP loads the
policy files. The Policy Enforcement Point (PEP) builds the authorisation request and
sends it to the PDP.

• Organisational Hierarchy Storage : The organisational hierarchy storage holds the
organisation’s user-group hierarchy. The current implementation model assumes that
the required number of user-groups has been created. Moreover,
according to access priority and importance, the groups have been arranged into a partially
ordered set. For instance, the user-group “Computer Engineering” is a subgroup
of “Engineering”. Hence, a user assigned to “Computer Engineering” will inherit all
the privileges assigned to “Engineering” and also get the extra privileges assigned
to “Computer Engineering” only.
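The privilege inheritance described for the organisational hierarchy can be sketched as a walk up a partially ordered set of user-groups. The Python fragment below is a minimal illustration and not the stored format or API of the implementation: the dictionaries, the privilege strings and the function name `effective_privileges` are assumptions, and the hierarchy is assumed to be acyclic.

```python
def effective_privileges(group, parents, privileges):
    """Collect the privileges of `group` and of every super-group above it."""
    result = set(privileges.get(group, set()))
    for parent in parents.get(group, []):
        # Assumes an acyclic hierarchy, so the recursion terminates.
        result |= effective_privileges(parent, parents, privileges)
    return result


# Hypothetical hierarchy and privilege assignments.
parents = {"Computer Engineering": ["Engineering"]}
privileges = {
    "Engineering": {"read:Engineering collection"},
    "Computer Engineering": {"read:Computing collection"},
}

ce_privileges = effective_privileges("Computer Engineering", parents, privileges)
eng_privileges = effective_privileges("Engineering", parents, privileges)
```

A member of “Computer Engineering” ends up with both privilege sets, while a member of “Engineering” keeps only its own.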

This section has provided brief descriptions of the important components present in the
model. In the next section, the execution sequence is described through a sequence
diagram to explain the overall process flow. After the execution flow, each component
is described with relevant examples.

7.2 Execution Sequence and Process Flow

Figure 7.2 depicts the various processes of the digital library system. The web layer of
the model generates two types of requests: the registration request and the access
request. Here, users are end users who access the digital library remotely over a secured
(encrypted) channel using standard web requests (in our case, RESTful web services).
The first participant in Figure 7.2, marked “Web Service Client”, represents the
clients’ applications.

During the registration process, a user sends a request to the credential manager with
mandatory information such as department, registration number, roll number and email
(Link 1 in Figure 7.2). The credential manager reviews that information either
automatically, using a rule set, or through administrative intervention. Once access is
granted, the system assigns the user to a user-group and returns the user-group-id
together with the credential-id to the user (Link 2 in Figure 7.2). The user-group-id
ensures that the user accesses the library through a particular user-group or a set of
non-conflicting user-groups. The credential-id holds the reference to the user’s
credential in the database (in this case, Lightweight Directory Access Protocol (LDAP)).
The credential-id and the user-group-id jointly define the user’s authorisation set in
the system; the user-group-id alone is used for simple authentication. A more secure
arrangement can be achieved by replacing the user-group-id and the credential-id with
encrypted key files. In that situation, the user uploads those key files through web
services while accessing the digital library.

Once access is approved and the credential-id is generated, the user can send requests
from the client to the credential manager residing at the digital library gateway.
Link 3 in Figure 7.2 shows the user’s request with proper credential details. The user
initiates a session request by passing these credentials. After verifying them, the
credential manager allows the user’s client to establish the session and to send the
user’s request to the view server.

The view server generates, maintains and manages a view for every user-group of the
system. If a user arrives with a new group, the view server generates a new view for
that group. A view is the collection of all concepts permitted for a particular
user-group; it forms a directed acyclic subgraph of the original ontology. The XACML
policy store maintains only explicit policy specifications, to reduce the size of the
policy database. All implicit authorisations are derived by the authorisation derivation
rules at the time of view generation. The view server accesses explicit authorisations
through XACML API calls. A permission value can be inherited from a concept by its
immediate children. Information about the concept hierarchy is available from the
metadata server. Apart from the concept hierarchy, permissions can also propagate through
the organisational user-group hierarchy, so the view algorithm also reads the
organisation hierarchy from the graph stored in the “Organisational Hierarchy Storage”.
Link 4 in Figure 7.2 represents the user’s request for a view after authentication by the
credential server. Link 6 of the same diagram shows the view server’s interaction with
the digital library metadata storage. The view server sends SPARQL queries through the
JDBC driver and receives the answers in triple format.

Figure 7.2: Sequence Diagram of Digital Library Systems (participants: Web Service
Client, Credential Manager, View Manager, XACML Engine, OWL Metadata repository;
messages: 1. Registration Request; 2. Credential id with group id; 3. Query for view;
4. Approved query; 5. Authorisation Check; 6. Metadata access using SPARQL query;
7. user’s view; 8. user’s view)

If the user logs in for the first time, the client pulls the entire view specified for
the user’s group. The first access therefore needs more setup time, depending on the size
of the view for the user-group assigned to the user. From the next access onwards, the
client only verifies the version of its local view against the current version of the
view held by the system and applies changes, if any. Once the view has been transferred
to the client system, the user runs queries on the local view. In this system, queries
use SPARQL syntax and views are expressed in OWL. If an error occurs in the client-side
view, the client sends a request to rebuild its view. Client subscriptions can be
invalidated by the credential manager; in that case, the system automatically deletes
the client’s view. If necessary, the view generation server may push an updated view to
the client’s system.
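
The client-side decisions described above (full pull on first login, version check on
later accesses, deletion when the subscription is invalidated) can be summarised in a
small sketch. The function name, the version values and the action labels are
hypothetical; the actual client protocol is not specified at this level of detail here.

```python
def plan_view_sync(local_version, server_version, subscription_valid=True):
    """Decide what a client should do with its local view.

    local_version is None when the client has never downloaded a view.
    Returns one of 'delete', 'pull_full', 'update' or 'none' (labels are assumptions).
    """
    if not subscription_valid:
        return "delete"      # subscription invalidated by the credential manager
    if local_version is None:
        return "pull_full"   # first login: pull the entire view of the group
    if local_version != server_version:
        return "update"      # later access: apply the changes only
    return "none"            # local view is current; query it directly
```

The sketch only separates the cases; how changes are computed and transferred is left
open.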

7.3 Design of The View Generation Systems

The view generation algorithm has been described in an earlier chapter. This section
describes its components in detail, with class and sequence diagrams of the algorithm.
The central programming goal is to build a web-based view generation service on top of a
servlet. The programming work has been divided into two distinct parts. The first part
is the development of the core algorithm, and the second part is the web-based
deployment. The first part of this section illustrates the view algorithm, with a brief
working description of the XACML server. The later part describes the Virtuoso data
storage and the web services deployment.

7.3.1 Implementation of the Algorithm

The current implementation and design rely on Java-based, object-oriented modelling.
The source code and the Java package (obac-1.0.jar) of the algorithm, the detailed
Javadoc documentation and the Unified Modeling Language (UML) modelling files are
available in the code repository¹.

Figure 7.3 represents the partial class diagram focused on data access from the graph
and on data dissemination. Figure 7.4 represents the security enforcement, and
Figure 7.5 is the complete UML diagram of the view generation algorithm. In the code
base, the UserRootView class is the facade of the view generation algorithm. The method
getUserView() can be called from outside to get the view. The most important parameters
of getUserView() are the ViewObject, the credential of the user and the graph details.
The ViewObject is passed to the method empty; by the end of the execution, the method
has populated it with all the information required to generate the user’s view. This
ViewObject instance is the output of the method. The ViewObject class also instantiates
the visitLog class, which stores the traversal log during the execution. The ontology is
traversed using a non-recursive breadth-first search (BFS) algorithm. During each pass,
the algorithm records information such as unvisited children, multiple parents, visited
nodes with positive permission and visited nodes with negative permission. The visitLog
class is used for storing this information.

The UserRootView class also accesses the ontology storage to obtain data; at present the
default storage is Virtuoso. VirtDataAccess is the connector class for the Virtuoso
storage, and it is accessed through the StorageAccess interface. VirtDataAccess uses the
JDBC driver to establish the connection, and UserRootView uses the Model interface to
extract the reply to the query. Sample queries are given in Listing 7.1. After extracting
the concepts’ information from the query result, UserRootView obtains the authorisation
values of the concepts for a particular user-group. The XACML API returns 0, 1, 2 or 3 in
reply to an authorisation query, representing allow, deny, error and unspecified
respectively. As the XACML policy database stores only explicit authorisation values, an
authorisation query may refer to concepts that are unspecified at the XACML policy
server. However, a concept for which no authorisation is available at the policy store
does not necessarily remain unauthorised for the user-group concerned: in this model, a
concept may inherit authorisation either through the concept hierarchy, i.e. from its
parent concepts, or through the user-group hierarchy.

¹ OBAC https://github.com/dsubhasis/obac

Figure 7.3: Class Diagram of The View Object : Data Dissemination

Figure 7.4: Class Diagram of The View Object Security Enforcement

In Figure 7.4, the getUserView() method of the UserRootView class creates an instance
of the class SecureAccessXACML. Inside SecureAccessXACML, getPermission() is the method
that initiates the authorisation check. The procedure needs to traverse the user-groups
to generate implicit authorisation values. Similarly, TraverseOntology is the procedure
that traverses the concept hierarchy and stores the relevant authorisation information
in an instance of the OntologyObject class. The XACMLRequestGenerator generates an XACML
request and sends it to the XACML server; the Sun XACML API has been used here. The
getPermission() method returns the reply to that request to getUserView(). The reply
belongs to a set of numerical values very similar to the XACML reply, and it determines
the authorisation status of the particular concept from a group’s perspective. In this
way, using BFS, UserRootView generates the list of accessible children and the other
information necessary to build the view.
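
The traversal described above can be sketched as follows. This is a simplified
illustration, not the code of UserRootView: the function name, the dictionary-based
graph and the derivation rule are assumptions. The rule assumed here is that a concept
without an explicit decision is permitted if at least one of its parents is permitted,
and that a concept is processed only after all of its parents have been visited. The
numeric codes are the ones listed in the text (code 2, error, is not modelled).

```python
from collections import deque

ALLOW, DENY, UNSPECIFIED = 0, 1, 3  # codes as listed in the text


def generate_view(children, root, explicit):
    """Non-recursive BFS over a concept DAG.

    children: concept -> list of child concepts
    explicit: concept -> ALLOW or DENY for explicitly specified concepts only
    Returns: permitted concept -> sorted list of the permitted parents through
    which it is reached (relevant for polyhierarchic concepts).
    """
    parents = {}
    for parent, kids in children.items():
        for kid in kids:
            parents.setdefault(kid, []).append(parent)
    pending = {node: len(ps) for node, ps in parents.items()}  # unvisited parents
    decision, view = {}, {}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        permitted_parents = [p for p in parents.get(node, []) if decision[p] == ALLOW]
        code = explicit.get(node, UNSPECIFIED)
        if code == UNSPECIFIED:
            # assumed implicit rule: inherit from any permitted parent
            code = ALLOW if permitted_parents else DENY
        decision[node] = code
        if code == ALLOW:
            view[node] = sorted(permitted_parents)
        for child in children.get(node, []):
            pending[child] -= 1
            if pending[child] == 0:  # all parents visited, safe to process
                queue.append(child)
    return view
```

For a hierarchy in which a has the children b, c and d, all three are parents of e, f is
a child of e, and the only explicit decisions are ALLOW on a and DENY on d, the sketch
returns a, b, c, e (reached through b and c) and f, and leaves d out of the view.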
Figure 7.5 shows the composite view of the class diagram. The isi.ecsu.view.struct
package contains the main object structure, while the main implementation of the view
algorithm is in isi.ecsu.view.struct.impl. The security-related algorithms and functions
are in the isi.ecsu.view.security package. Apart from these three major packages, the
code base contains some utility classes under isi.ecsu.util. The dataset has been
transformed as per the requirements of testing. The sequence diagram of the functional
workflow is shown in Figure 7.6. In the first step of the sequence diagram, the
ViewGenerate class provides the rView method of the ViewGenerationModule through the
View interface. This method is called from the web services, and clients’ view
generation requests arrive through it. However, only a query or call that has been
verified and approved by the credential manager can initiate this module. Inside the
UserRootView class, the method getUserView() is the facade of the algorithm; it invokes
the virtConnect() procedure to create a connection string for the graph. virtConnect()
is a method of the VirtDataAccess class. The other significant method in the same class
is executeQuery(), which is used for querying the Virtuoso database. getUserView() calls
getPermission() of SecureAccessXACML to use the security functionality.

Figure 7.5: Complete Class Diagram

Figure 7.6: Sequence Diagram

# Access all child concepts
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?cls
FROM <http://dltest.org>
WHERE { ?cls rdfs:subClassOf
<http://www.iwi-iuk.org/material/RDF/Schema/Class/scf#DL-Concept19> }

Listing 7.1: Sample SPARQL Request File
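
A query of this shape can be assembled for any concept. The helper below is a minimal
sketch; the function name and the IRI check are assumptions made for illustration and
are not part of the obac code base.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def child_concepts_query(graph_iri, concept_iri):
    """Build a SPARQL query selecting the direct subclasses of concept_iri."""
    for iri in (graph_iri, concept_iri):
        # reject characters that are not allowed inside <...> in SPARQL
        if any(ch in iri for ch in '<>" {}|\\^`'):
            raise ValueError(f"not a valid IRI: {iri!r}")
    return (
        f"PREFIX rdfs: <{RDFS}>\n"
        "SELECT ?cls\n"
        f"FROM <{graph_iri}>\n"
        f"WHERE {{ ?cls rdfs:subClassOf <{concept_iri}> }}"
    )
```

Calling the helper with the graph http://dltest.org and the concept IRI of Listing 7.1
produces the same query text as the listing, without the comment line.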

7.3.2 Database Setup

In this model, the Virtuoso database has been used extensively. Several ontology servers
are available, such as Jena SDB, Virtuoso Server, D2R Server and OpenRDF Sesame.
Detailed comparisons of those servers are available in [132][24]. Moreover, the Berlin
SPARQL Benchmark [1] benchmarks² Virtuoso, BigOWLIM, BigData and Jena TDB with datasets
ranging from 10 million to 150 billion triples within the Explore and Business
Intelligence use cases, and recommends Virtuoso for OWL and RDF storage. The performance
of Virtuoso and its easy availability under open licensing are the key reasons for
choosing it. The installation document available with the server is easy to read and
follow³, and the software can be downloaded from various sources. Since UNIX is our
execution environment, the 64-bit source code was the natural choice⁴. In the test bed,
the Virtuoso server has been installed on Unix and connected to a large data storage.
The server can be configured and tuned by changing the virtuoso.ini file. The Virtuoso
server also comes with an interactive admin console, which can be accessed over HTTP on
port 8890. Figure 7.7 shows the Virtuoso web console. The Virtuoso server also supports
a command-line console and a bulk loader.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?cls
FROM <http://isirole.org>
WHERE { ?cls rdfs:subClassOf
<http://www.semanticweb.org/subhasis/ontologies/2014/6/isi-kolkata-ontology-15#Business>
}

² http://wifo5-03.informatik.uni-mannheim.de/bizer/berlinsparqlbenchmark/
³ http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/
⁴ http://sourceforge.net/projects/virtuoso/files/virtuoso/7.2.1/virtuoso-opensource-win-x64-20150625.zip/download

Figure 7.7: Virtuoso Admin Console

7.3.3 Security Engine Setup

XACML is an XML-based language for access control that has been standardised by OASIS.
XACML describes both an access control policy language and a request/response language.
The policy language is used to express access control policies (who can do what, when).
The request/response language expresses queries about whether a particular access
should be allowed (requests) and describes answers to those queries (responses). The
Security Engine of this model relies on the Sun XACML API [2]. To keep the model simple,
all the policies have initially been stored in XML files. In a typical XACML usage
scenario, a subject (e.g. a human user or a workstation) wants to take some action on a
particular resource. The subject submits its query to the entity protecting the resource
(e.g. a file system or a web server). This entity is called a PEP. The PEP forms a
request (using the XACML request language) based on the attributes of the subject,
action, resource and other relevant information. The PEP then sends this request to a
PDP, which examines the request, retrieves the policies (written in the XACML policy
language) that are applicable to this request, and determines whether access should be
granted according to the XACML rules for evaluating policies. That answer (expressed in
the XACML response language) is returned to the PEP, which can then allow or deny access
to the requester.
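
The division of work between PEP and PDP can be illustrated with a toy model. This is
not the Sun XACML API and does not parse XACML documents: the policy table, the function
names and the deny-by-default behaviour of the PEP are assumptions made for illustration.
The four reply codes are the ones listed in Section 7.3.1.

```python
ALLOW, DENY, ERROR, UNSPECIFIED = 0, 1, 2, 3

# Hypothetical explicit policies: (subject group, resource, action) -> decision
POLICIES = {
    ("Engineering", "DL-Concept19", "read"): ALLOW,
    ("Engineering", "DL-Concept20", "read"): DENY,
}


def pdp_evaluate(request):
    """PDP: find the applicable policy for the request and return a decision code."""
    try:
        key = (request["subject"], request["resource"], request["action"])
    except KeyError:
        return ERROR  # malformed request
    return POLICIES.get(key, UNSPECIFIED)


def pep_handle(subject, resource, action):
    """PEP: build the request, ask the PDP, and enforce the answer.

    In this sketch only an explicit ALLOW grants access.
    """
    request = {"subject": subject, "resource": resource, "action": action}
    return pdp_evaluate(request) == ALLOW
```

In the model of this thesis an unspecified reply is not final, since the authorisation
may still be derived through the concept or user-group hierarchy; the toy PEP above
omits that step.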

7.4 Time Complexity Analysis

The steps of the view generation algorithm are shown in Figure 7.8. In the figure,
column A shows the DAG traversal, column B shows the polyhierarchy stack and column C
shows the view. In column C, a.∗ represents a concept node a and all document classes
under a; similarly, e.∗(b) represents the concept e and the document classes contributed
to e by its parent b. The view generation algorithm takes the ontology hierarchy, with
its concepts and document classes, as an input graph G = (V, E). Here V is the set of
vertices, or concepts, and E is the set of edges, or the semantic links among the
concepts. The model uses a modified breadth-first search (BFS) algorithm for graph
traversal and view generation. Usually, the time complexity of BFS is O(V + E) [40].
However, because of the polyhierarchy and the document classes covered by the
polyhierarchic nodes, the traversal takes longer. If k is the maximum number of parent
nodes of a polyhierarchic node, then in the worst case each such node requires (2k − 1)
document class traversals. So the worst-case complexity of view generation is
O((V + E) ∗ (2k − 1)).

Figure 7.8: Performance Analysis (column A: DAG traversal; column B: polyhierarchy
stack; column C: view; final view in column C: a.∗ b.∗ c.∗ e.∗(b,c) f)

The view generation algorithm also accesses the XACML server to check access
authorisations. The XACML policy server maintains the list of security policies as an
ordered set, so each access has logarithmic complexity. The worst-case complexity of
accessing policies is therefore O(log(V)).
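
The logarithmic bound follows from binary search over an ordered collection. The sketch
below assumes, purely for illustration, that the policies are held in memory as a list
of keys sorted in ascending order with a parallel list of decisions; the actual policy
server may organise its policies differently.

```python
from bisect import bisect_left


def find_policy(sorted_keys, decisions, key):
    """Binary search for key in sorted_keys; O(log n) comparisons.

    sorted_keys must be in ascending order; decisions[i] belongs to sorted_keys[i].
    Returns the decision, or None if no explicit policy exists for key.
    """
    i = bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return decisions[i]
    return None
```

With one policy per concept, the list has at most V entries for a given user-group, which
gives the O(log(V)) bound stated above.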

7.5 Data Set & Performance Analysis

The digital library service has gone through several levels of code quality assurance
testing before deployment. To ensure code quality, PMD⁵ has been used. PMD is a source
code analyser [74] that finds common programming flaws such as unused variables, empty
catch blocks and unnecessary object creation. Another source code analyser, Checkstyle⁶
[148], has been used to check Java syntax correctness. The test data has been collected
from the SNAP data collections [105], which contain several large graphs. The DBLP
computer science bibliography data has been extracted and transformed as per the
requirements. The original graph contains 317080 nodes and 1049866 edges. After
reduction and transformation, the test data used contains around 50000 nodes in a
directed acyclic graph. The code base is available on GitHub⁷. The code has been deployed
with Tomcat version 7.0 on an Intel Xeon E5 server with 64 GB RAM, a 7200-rpm SATA hard
drive and Mac OS X version 10.10.4. Data has been stored with the Virtuoso server on a
LaCie storage device, 4 TB in size with a speed of 7200 rpm. The memory dedicated to the
query analyser of the Virtuoso server is 32 GB.

⁵ http://sourceforge.net/projects/pmd/

Figure 7.9: Execution time with increasing view size

7.5.1 Performance Results

The performance results are depicted in Figure 7.9 and Figure 7.10. Figure 7.9 plots
view generation time against view size. As the view size increases, the time taken by
the algorithm increases linearly. At some points, however, the trend is non-linear and
decreasing because of the policy size. A large number of explicit policies requires less
time to execute, because no time is spent determining implicit rules. Hence, the data
points 1513, 24702, 2314 and 9173 show an increasing trend in the graph. The ratio of
implicit to explicit policies should therefore be taken care of during the design and
maintenance of the system; an imbalance in this ratio may have a serious impact on the
system. Figure 7.10 shows the CPU utilisation and the available memory of the system. It
shows that the view generation process depends heavily on memory and CPU. In the graph,
low available memory means that the system is generating a view; each generation also
occupies a significant amount of memory. A simple web-based load test has also been
carried out, and Figure 7.11 shows its result. Parallel requests were generated through
web service calls; as the load increases, the execution time also increases.

⁶ http://checkstyle.sourceforge.net
⁷ https://github.com/dsubhasis/obac

Figure 7.10: Performance Monitoring Data from Zabbix

Figure 7.11: Web Based Load Test

Chapter 8

Update Policy for Digital Library

8.1 Introduction

In a digital library, reading is the primary activity. Most libraries offer “read” as
the only service for ordinary users. Recently, beyond the traditional read
functionality, collaborative writing has been studied by various research groups
[50, 112, 146, 147]. For example, Theng et al. [147] have discussed the design aspects
of a digital library for children and concluded that dynamic environments that enable
write and review facilities are more useful for encouraging active involvement of the
audience. Similarly, another study by EL-Deghaidy and Nouby [50] concludes that
collaborative learning is more effective than traditional learning. This chapter
introduces update operations for the digital library. Here, the object is a document and
the subject is a user, who may be an author, a reader, a volume editor, a library
administrator, etc. Since the documents of the digital library reside on various
physical servers and expose their metadata as an ontology, the current approach provides
a security-related locking mechanism on the metadata to control access. All the locking
rules are divided into two categories: provisions and obligations. The users of the
digital library access a document with the approval of the provision and obligation
sets (PO sets).

8.1.1 Addition and Update Issues

The read-only access control model for the digital library ontology has modified the
earlier models by extending access control from the concept level to the document class
level. However, the addition of a new document or the modification of an existing
document requires access control even at the individual document level.

Since the digital library ontology is used as the metadata for searching documents in a
digital library, the addition of a new document or the modification of an existing
document has to be reflected appropriately in the ontology. The different types of
requirements are:

• Addition of new document: Any request for the addition of a new document has two
phases: first, the verification of the credentials of the author(s), and second, the
classification of the new document so that it can be placed under the appropriate
concept in the ontology hierarchy. The library administration would take the help of
a suitable credential verifier and IR engine for this purpose, and both are assumed
to be available [49]. After the new document has been added to the appropriate
repository, the ontology, as the metadata, is updated and the document identifier
(DOI) is added to the appropriate document class and the corresponding concept.

• Modification of Existing Document: It has been assumed that the author(s) of a
document has/have the authorisation to write on an existing document, i.e. to modify
the document concerned. Whenever a new document is added, the access policy set adds
appropriate authorisations so that the author(s) of the new document can access and
update the document in future if necessary. So an author can read, update and then
save the modified document he/she authored earlier. Saving a modified document is
equivalent to a commit operation in a database transaction. The digital library
ontology may also undergo structural modifications such as the addition of a concept,
alteration of the position of an existing concept in the hierarchy, reclassification
of documents, etc.

8.1.2 Authorisation Conflicts in Update Operations

The extended XACML framework in Figure 8.1 provides a static security engine that stores
the appropriate positive authorisations for write/update related operations, of the form
(s, o, a, +), where a ∈ {write, update, change}. Here, write signifies the addition of a
new document, update the modification of an existing document, and change a change in
the ontology structure. These authorisations may give rise to different types of
conflicts that need to be resolved for successful update/write operations using the
digital library ontology. Possible types of conflicts are:

• Multiple Author Conflicts: If a document has multiple authors, each of them would
have the positive authorisation for updating the document. A conflict may occur if
more than one author of the same document tries to modify/update the document
simultaneously.

• Editor and Author Conflict: In an edited volume, the editor(s) may override certain
access rights of the author(s) of a document present in the edited volume. For
example, an editor may specify a time limit for updating a document present in the
edited volume. Any author exceeding the time limit would not be able to submit
his/her modified document even when he/she has the authorisation to write/update
his/her document.

• Administrative Conflicts : The write authorisation of an author may be suspended
when the administrator wishes to make any structural change in the ontology, as
discussed earlier.

However, when an author is actually submitting a modified document (similar to a commit
operation) and the corresponding metadata in the ontology is being changed, no other
operation is permitted until the commit operation has completed successfully. These
requirements clearly indicate that a static security engine with a list of
authorisations alone is not sufficient to control the update mechanism in a digital
library ontology.

Apart from the static security engine, any update request may have to satisfy some
conditions which are dynamic in nature, i.e. query specific. Thus each update request,
apart from having the required authorisations, should pass through a system that also
verifies the dynamic conditions prevailing at the time the update request is processed.
These dynamic conditions/constraints are called provisions and obligations. The
authorisation mechanism described for the read-only system is extended to include a
Provisional Authorisation Module (PAM) to model these conditions.

8.1.3 Provisional Authorisation Module (PAM)

Authorisations using provisions and obligations have been proposed by Bettini et al. and
Kudo et al. [19] [98] in two different papers. The paper by Bettini et al. focuses on
the management of security policy, whereas the paper by Kudo et al. focuses on
architectural issues. In the model proposed in this paper, the authors (Kudo and Hada)
have borrowed these concepts to resolve the conflicts using the PAM along with the
XACML-based security engine. The XACML-based security engine is necessary to verify a
required authorisation with a simple yes/no response. The Provisional Authorisation
Module (PAM), the main contribution of this paper, is responsible for handling dynamic
conditions. PAM and XACML together generate the final permission for writing/modifying
a document.

8.1.3.1 Provision and Obligation Set

Depending on the dynamic conditions behind a write request, these conditions are
classified into pre-conditions and post-conditions [19], termed provisions and
obligations respectively. In the proposed system, a global set containing all provisions
and obligations has been defined and termed the PO set. As described by Bettini et al.
[19–21], the PO set contains variables and constants that are mapped to a ground
algebra, where each PO formula consists of PO atoms, a disjunction of PO formulas or a
conjunction of PO formulas.

Definition 8.1 (Provisions). Provisions are the set of conditions P with respect to an
access request req, where the system will allow req to execute if
{∃P : P ≠ ∅ : P → true}.

Example 8.1. As mentioned earlier, in an authorisation (s, o, a, +) for update
operations, the access right a must belong to the set {write, update, change}, where
write signifies the addition of a new document, update the modification of an existing
document, and change a change in the ontology structure. So in a digital library, an
author a_i holding an authorisation (a_i, d_j, update, +) will be allowed to update a
document d_j iff the administrator is not executing a change operation on the document
class dc_k in which d_j is placed. The change operation may involve associating the
document class dc_k with a different concept from the one it was associated with
earlier. So the condition in the provision P is
P = {(change flag for document class dc_k) ≠ true}. The system will allow the user to
update the document if {P → true}.

Definition 8.2 (Obligations). Obligations are the set of conditions O which should not
be violated during the execution of the access request req. The system will allow req
to execute so long as it does not violate the condition {∃O : O ≠ ∅ : O → true}.

Example 8.2. If an editor sets a condition that the authors included in the edited
volume concerned have to submit the updates of their respective documents by a specific
date, then the obligation set will be O = {lastdate = ‘date : time’}. In this case the
req, as mentioned earlier, will be invalidated after the specified date.

So, in the proposed access control model for update operations, in addition to the
components Subject, Object, Access Right and Sign of the authorisation specification
described in section 3, the Provision and Obligation sets are required as additional
components. The extended authorisation function is:

• Authorisation : Authorisation is a function f() → S × O × A × Sign × θ, where
θ ∈ PO.

Definition 8.3 (Provision and Obligations Set). Given a set of variables V, a set of
constants C and predicate symbols P, the Provisions and Obligations set PO is the set of
all mappings from rules to the ground algebra.

The rules of the PO set can be defined in terms of a BNF grammar:

P = r | V.V | V + V | V.C | C    (8.1)

Example 8.3 (Example of Rules). If V ∈ {x, y, z} and C ∈ {a, b, c}, some arbitrary
rules are:
P1(x) → P2(x, y) | P3(z) | P1(a) | λ
P2(x, y) → P2(x, a) | P3(y)
P3(y) → P3(b) | λ
P2(x, a) → P2(c, b) | P3(z)
P3(z) → λ
Here, {P1(a), P2(c, b), P3(b)} etc. are members of the provision and obligation set and
λ is the terminal condition.

Definition 8.4 (Access Request). An access request can be defined as φ = (act, auth),
where act is the action or operation requested and auth is the supporting authorisation.
Hence, the access request function is φ() → act × auth.

Example 8.4 (Example access request). If a user u_i wants to “write” (w) on a
document/object d_j and his/her provision and obligation function is PO(), then the
authorisation is auth = (u_i, d_j, w, +, PO()). The access request will be
φ() → (w, auth) ⇒ (w, {u_i, d_j, w, +, PO()}).
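
The evaluation of an access request against its provisions and obligations can be
sketched as follows. The sketch is illustrative only: the PO component of the
authorisation is split into two lists of predicates, all listed conditions are required
to hold, and the context keys (change_flag, now, last_date) are hypothetical names
chosen to mirror Examples 8.1 and 8.2.

```python
from datetime import datetime


def change_flag_clear(ctx):
    """Provision of Example 8.1: no change operation on the document class."""
    return not ctx["change_flag"]


def before_deadline(ctx):
    """Obligation of Example 8.2: the update arrives by the editor's last date."""
    return ctx["now"] <= ctx["last_date"]


def evaluate_request(action, auth, ctx):
    """Evaluate an access request (action, auth) in the dynamic context ctx.

    auth = (subject, object, right, sign, provisions, obligations)
    """
    subject, obj, right, sign, provisions, obligations = auth
    if sign != "+" or right != action:
        return False  # the static authorisation does not cover the action
    if not all(p(ctx) for p in provisions):
        return False  # a pre-condition does not hold
    return all(o(ctx) for o in obligations)  # no obligation is violated
```

For an author holding (u1, d1, update, +) with the two conditions above, the request is
granted while the change flag is clear and the deadline has not passed, and refused
otherwise.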

8.2 Conflicts and Resolutions using Locking Protocols

As discussed earlier, the different conflicts among update operations in the digital
library ontology give rise to different provisions (preconditions) and obligations
(postconditions). A deadlock-free serialisation of concurrent transactions can mitigate
such problems effectively. The first type of update operation, i.e. Addition of New
Document, does not cause any conflict. However, a user trying to access the
corresponding document class should be alerted that a new document is being added. The
user concerned may prefer to access it later, so that the query can include the newly
added document as well. A document should be added only after the authority has reviewed
it and found it meaningful for the digital library. The user submits the document for
peer review; after the review, the document is classified on the basis of its content
and keywords and placed under the right concept with its DOI. From the next query
onwards, users will get that document in the result. This process is very simple, and
the requirement demands a lock equivalent to an intention to write, i.e. an Intention
Exclusive (IX) lock on the document class, as available in the Multiple Granularity
Locking Protocol [59], [103], [69]. Unlike other operations, the addition of a document
is conflict free. The remainder of this section discusses the conflicts and their
management.

8.2.1 Partial Ordering and Priority of Locks

The update functionality of the digital library assumes a partial ordering among locks
and users. In this model, four kinds of users have been assumed. A reader has the lowest
priority in the library, since he/she has only read access to a document. An author can
write/update a document for which he/she is authorised. An editor of an edited volume
containing many documents has supervisory access to the edited volume for which he/she
is authorised. An administrator has the highest priority in the digital library. So the
partial order of the users is as follows: (reader ≺ author ≺ editor ≺ administrator).
Hence, in the proposed model, a lock imposed by a higher priority user can override
the lock imposed by a lower priority user. For example, a Lock(author, IX) can be
overridden by Unlock(editor), because the editor has higher priority access than the
author, as (author ≺ editor). A compatibility matrix presenting lock priority is given in
Table 8.1.
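The priority ordering of users described above can be sketched in a few lines of Python. This is only an illustrative sketch: the names Role and can_override are hypothetical, and the rule that a requester must strictly outrank the current lock holder is an assumption drawn from the (author ≺ editor) example.

```python
from enum import IntEnum


class Role(IntEnum):
    """User kinds in increasing priority: reader < author < editor < administrator."""
    READER = 0
    AUTHOR = 1
    EDITOR = 2
    ADMINISTRATOR = 3


def can_override(holder: Role, requester: Role) -> bool:
    """Assumed rule: a requester overrides a lock only if it strictly outranks the holder."""
    return requester > holder


# An editor may override an author's lock, but not the other way round.
editor_over_author = can_override(Role.AUTHOR, Role.EDITOR)
author_over_editor = can_override(Role.EDITOR, Role.AUTHOR)
```

Using an integer-valued enumeration keeps the chain reader ≺ author ≺ editor ≺ administrator explicit, so the comparison operators express the priority directly.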

8.2.2 Conflicts and Management

• Multiple Author Conflicts : The first type of conflict is the Multiple Author Conflict.
If a document has more than one author and all of them have the access right to
update it, anyone updating the document should prevent the others from accessing
and updating the document simultaneously. Normally an Exclusive lock should
serve the purpose. However, in a digital library the update situation is different
from the update situation in a relational database. An author intending to update
a document written by him/her may not do it immediately after reading it. He/she
may get an Intension Exclusive Lock (IX) and may submit the update later. Any
other author of the same document (in the case of a multiple author document),
though having permission to write, would not get an IX lock since an IX lock
already exists. An Exclusive Lock (X) will be given to a user for a document only when
he/she is actually submitting the updated document and the system is actually
storing that document in the library. The X lock on the concerned document
will be removed at the end of the transaction storing the updated document. The
corresponding Document Class and the Concept will be locked with an Intension
Exclusive Lock (IX). Any user having an access right to read a document will get
a Shared (S) lock, and more than one user may hold it simultaneously. Even if a
document has an IX lock, it may be available for reading. However, the presence of
the IX lock will indicate to the reader that he/she is getting an older version which
is being updated by one of its authors. The presence of an X lock on a document will
definitely prevent a user from getting an S lock, since the document is actually being
updated at that time. Hence, three locks are required to resolve multiple author
conflicts:

– Exclusive Lock (X) : An exclusive lock on a document means that the document is
locked for update. An exclusive lock on a document class means that all the
documents under the class are implicitly locked.
– Intension Exclusive Lock (IX) : An Intension Exclusive Lock on a document
means that the document will be updated in the near future, i.e. the document
has been retrieved for editing. No other author will be able to start the update
process on the document.
– Shared Lock (S) : A shared lock means that someone is reading the document,
so an X lock cannot be imposed on it.

Algorithm 5 Update Document

function update(d, u)
    flag ← 0;
    lockStatus ← getLock(d);
    if (lockStatus ≤ (u, IX)) then
        Lock(d, u, IX);
        flag ← 1
    end if
    return flag
end function

Algorithm 5 represents the update document algorithm. The algorithm retrieves
the lockStatus using the getLock(d) function. If the lockStatus is empty, the
algorithm imposes the IX lock on the document. The lock table is updated through
the Lock() function. When the updated document is actually submitted to the
appropriate repository, the X lock is acquired, as shown in the Commit Document
algorithm (Algorithm 6).

Algorithm 6 Commit Document

function commitDoc(d, u)
    flag ← 0;
    lockDoc ← getLock(d);
    docClass ← getDocClass(d);
    docConcept[] ← getDocParentClass(d);
    if (lockDoc == IX) then
        UnLock(d, IX);
        Lock(docClass, u, X);
        Lock(docConcept, u, X);
        Lock(d, u, X);
        ▷ the updated document is stored while the X locks are held
        UnLock(docClass, u);
        UnLock(docConcept, u);
        UnLock(d, u);
        flag ← 1
    end if
    return flag;
end function

• Editor and Author Conflict : The editor of a volume has supervisory access to
the volume over its authors. A special lock called Editor’s Lock (EDL) has been
provided for the editor of a book/volume having multiple documents. This lock is
different from the standard IX lock. An author in an edited volume may have an
IX lock for his/her document. However, when he/she actually tries to update the
document by getting an X lock, the required X lock will be given only if the entire
edited volume does not have an EDL lock. This helps in implementing the required
obligation mentioned in an earlier example.

Algorithm 7 Volume Editor’s Lock

function editorLock(d[], u)
    flag ← 0;
    for each d do
        lockStatus ← getLock(d);
        if (lockStatus ≤ (u, EDL)) then
            UnLock(d, u);
            Lock(d, u, EDL);
            flag ← 1;
        end if
    end for
    return flag;
end function

Algorithm 8 Volume Editor’s Unlock

function editorUnLock(d[], u)
    for each d do
        lockDoc ← getLock(d);
        if (lockDoc ≤ (u, EDL)) then
            UnLock(d, u);
        end if
    end for
end function

• Administrative Conflicts : The last type of lock is the Administrative Lock
(ADM), available only to the system administrator. This lock is for restructuring
the ontology hierarchy. Addition/deletion of a concept and modification of the
position of a concept/document class in the ontology structure will be done with the
help of this lock. This lock has two important properties not present in any of the
other locks. First, an ADM lock on a concept ensures that all the concepts inferable
from that concept will also be locked with an ADM lock. Secondly, if a concept is
locked with an ADM lock, the lock will override any other lock present on that
concept, on the document classes under it and on the entire sub-graph covered by
the concerned concept.

Table 8.1: The compatibility matrix

         S     IX    X     EDL   ADM
  S      Y     Y     N     Y     N
  IX     Y     N     N     N     N
  X      N     N     N     N     N
  EDL    Y     Y     N     N     N
  ADM    N     N     N     N     N

Algorithm 9 Administrative Lock

function adminLock(c, LockStore)
    for each c do
        class[] ← getClass(c)
        for each class do
            listDoc[] ← getDoc(class);
            for each d from listDoc do
                lockDoc ← getLock(d);
                LockStore ← lockDoc
                if (lockDoc ≠ ∅) then
                    UnLock(d);
                end if
                Lock(d, ADM)
            end for
            lockClass ← getLock(class)
            LockStore ← lockClass
            if (lockClass ≠ ∅) then
                UnLock(class);
            end if
            Lock(class, ADM);
        end for
        lockConcept ← getLock(c);
        LockStore ← lockConcept;
        if (lockConcept ≠ ∅) then
            UnLock(c);
        end if
        Lock(c, ADM)
    end for
end function
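The cascading behaviour of the Administrative Lock can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in the thesis: the ontology fragment, the in-memory lock table and the name admin_lock are hypothetical, and only non-empty displaced locks are recorded in the lock store.

```python
from typing import Dict, List, Tuple

# Hypothetical ontology fragment: concept -> document classes -> documents.
ONTOLOGY: Dict[str, Dict[str, List[str]]] = {
    "concept_A": {"class_1": ["doc_1", "doc_2"], "class_2": ["doc_3"]},
}

# Hypothetical lock table: object id -> lock type held (absent means unlocked).
locks: Dict[str, str] = {"doc_1": "IX", "class_2": "S"}


def admin_lock(concepts: List[str]) -> List[Tuple[str, str]]:
    """Put ADM on each concept, its document classes and their documents.

    Returns the displaced locks, playing the role of LockStore in Algorithm 9.
    """
    lock_store: List[Tuple[str, str]] = []

    def override(obj: str) -> None:
        previous = locks.get(obj)
        if previous is not None:
            lock_store.append((obj, previous))  # remember the displaced lock
        locks[obj] = "ADM"                      # ADM replaces whatever was held

    for concept in concepts:
        for doc_class, docs in ONTOLOGY[concept].items():
            for doc in docs:
                override(doc)
            override(doc_class)
        override(concept)
    return lock_store


saved = admin_lock(["concept_A"])
```

After the call, every object under concept_A carries ADM, and saved lists the IX and S locks that were overridden.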

Thus the locking protocol for update operations of a digital library with provisions and
obligations is a variation of the well-known Multiple Granularity Locking protocol [59],
[103], [69], used here for the first time for access control purposes. The types of locks
proposed here are: Shared Lock (S), Exclusive Lock (X), Intension Exclusive Lock (IX),
Editor’s Lock (EDL) and Administrative Lock (ADM). The compatibility matrix shown in
Table 8.1 provides the relationship among these locks.
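A lock manager could consult the compatibility matrix as a simple lookup. The Python sketch below transcribes Table 8.1 row by row; reading each row as the lock already held and each column as the lock being requested is an assumption, as are the names COMPATIBLE and can_grant. The sketch does not model the priority-based override between users.

```python
# Table 8.1 transcribed row by row; assumed reading: row = lock already held,
# column = lock being requested.
LOCKS = ("S", "IX", "X", "EDL", "ADM")
ROWS = {
    "S":   "YYNYN",
    "IX":  "YNNNN",
    "X":   "NNNNN",
    "EDL": "YYNNN",
    "ADM": "NNNNN",
}
COMPATIBLE = {
    (held, requested): flag == "Y"
    for held, row in ROWS.items()
    for requested, flag in zip(LOCKS, row)
}


def can_grant(held_locks, requested):
    """Grant a request only if it is compatible with every lock already held."""
    return all(COMPATIBLE[(held, requested)] for held in held_locks)


# A document under IX can still be read, but a second author cannot get IX.
readers_allowed_during_edit = can_grant(["IX"], "S")
second_author_blocked = not can_grant(["IX"], "IX")
```

The two derived values match the behaviour described for multiple author conflicts: readers may still obtain an S lock on a document held under IX, while a second IX request is refused.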

8.2.3 PO Monitoring

Monitoring of provisions and obligations is necessary for proper functioning of the
controlled access mechanism of the digital library. Monitoring of obligations is even
more important, because a user failing to meet certain obligation requirements should
not be able to enjoy certain privileges. For example, if an editor specifies a deadline
for submitting new documents or for updating any existing document, each addition or
update request should be monitored to check whether the obligation has been met or
not. If the deadline is crossed, any user holding an IX lock on any document is not
allowed to submit his/her updates any more, and the EDL lock applied on the concerned
edited volume overrides the IX locks held on the documents within the edited volume.
So the system should store the PO conditions and verify them against each user access.

[Figure 8.1: Provisional Access Manager. Components shown: Administrative Module,
Static Security Engine, Security Engine, XACML Storage, PO Rule Storage, Provisional
Authorisation Module (PAM), User Module, Ontology Metadata Storage, and Lock Table
(fields: id (system generated), Object name, Object Type, Object Id, Lock Type, Owner).]
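The deadline obligation described in Section 8.2.3 can be sketched as a check performed on every submission. The Python fragment below is illustrative only; the deadlines store, the volume identifier and the function may_submit are hypothetical names, and the deadline value is made up for the example.

```python
from datetime import datetime
from typing import Dict

# Hypothetical PO rule store: edited volume -> submission deadline set by its editor.
deadlines: Dict[str, datetime] = {"volume_1": datetime(2015, 8, 31, 23, 59)}


def may_submit(volume: str, holds_ix: bool, now: datetime) -> bool:
    """An update may be submitted only by an IX holder before the deadline."""
    deadline = deadlines.get(volume)
    if deadline is not None and now > deadline:
        return False          # obligation missed: the EDL overrides the IX lock
    return holds_ix


on_time = may_submit("volume_1", True, datetime(2015, 8, 1))
too_late = may_submit("volume_1", True, datetime(2015, 9, 1))
```

Passing the current time as a parameter, rather than reading the clock inside the function, keeps the check easy to verify against stored PO conditions.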

8.3 Architectural Details

The architectural detail of the system developed for controlled access to the digital
library is shown in Fig 8.1.

• Security Engine: The security engine returns a security decision in response to
various security related API calls. The security engine has been deployed under
Web 2.0 secure web service calls and is capable of communicating with the various
components through secured, encrypted channels. The security engine has two
components: the Static Security Engine and the dynamic Provisional Authorisation
Module. On the basis of the results given by both components, the security engine
calculates the final authorisation decision.

• Static Security Engine: The static security engine is the conventional security
engine, which provides an authorisation value in response to a query. In this model,
XACML 2.0 has been considered as the security standard. The security engine is
connected to the XACML repository through a secured channel and sends back a static
yes/no answer to a query.

• PAM : PAM, or Provisional Authorisation Manager, mainly supports the dynamic
part of the security. PAM takes care of provision and obligation verification and
of addition/update issues.
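
How the security engine might combine the two components can be sketched as follows. The text does not specify the combination rule, so the conjunction used here (access only when the static engine permits and PAM confirms the provisions and obligations) is an assumption, and final_decision is a hypothetical name.

```python
from typing import Callable


def final_decision(static_permit: bool, provisions_met: Callable[[], bool]) -> bool:
    """Combine the static (yes/no) answer with the dynamic PAM check.

    Assumed rule: access is granted only when both components agree.
    """
    if not static_permit:
        return False              # the static engine already denied the request
    return provisions_met()       # PAM verifies the provisions and obligations


granted = final_decision(True, lambda: True)
denied_by_pam = final_decision(True, lambda: False)
denied_statically = final_decision(False, lambda: True)
```

Evaluating the static answer first means the dynamic check is skipped whenever the static policy has already denied the request.
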
Chapter 9

Conclusion

The present thesis has considered modelling a digital library environment that controls
access to different subject areas of the library depending on the credentials of different
users. The thesis has primarily considered a polyhierarchic ontology structure for
representing the digital library metadata. Considering the fact that an interdisciplinary
subject area may be placed as a concept under more than one parent concept, the proposed
model has defined a new set of nodes called Document Classes. Depending on the
credentials, a user may get access to some of the parent concepts and thereby gain access
to some of the document classes defined. The necessary access control model for this
purpose has been designed. Depending on the similarity in credentials, users are grouped
into user-groups and authorisations to access different concepts are assigned to these
user-groups. Member users of a user-group inherit the authorisations assigned to the
group. Depending on the authorisations, user-group based views have been generated. Once
the server is accessed for the login process, the required view is downloaded at the client
end and further search on the metadata can be done at the client end without
accessing the server. Unless there is a change in the authorisation set for a user-group,
the view for the group remains the same and no further download to the client end is done.
This process substantially reduces the communication cost of the metadata search process.
Since user-groups and concepts in the ontology structure give rise to two different
hierarchies, conflicts in authorisation arising out of the interplay of these two hierarchies
have been studied thoroughly. Considering the different possible conflicts, a set of rules
has been proposed to avoid/resolve the conflicts. It has also been shown that the rule
set is sound and complete and that authorisation at each concept node for each user-group
is decidable. A test bench has been developed to test the proposed model and the results
have been included in the thesis.
At the end of the thesis, an effort has been made to design a collaborative development
environment for a digital library. Conflicts among the different activities of users other
than reading have been resolved by using a variation of the multiple granularity locking
protocol. This part of the work needs further study by converting this model to a role
based access control system. For example, besides being a reader, a user can also be an
author of a document in an edited volume as well as its editor. So for the same document
a user may have more than one role. Addition of a role hierarchy would add further
conflicts, and a modified access control model is necessary. This problem will be
considered as a future research effort.
Other future efforts to extend the research activities covered in the present thesis are :

• Searching in Multiple Ontologies : A user may have access to different digital
libraries under different administrative domains. A document not accessible in one
library or under one administrative domain may be available in another library
environment. So the search in such an environment needs a process to navigate
from one library ontology metadata and administrative domain to another ontology
structure.

• Extending to Mobile Environment: It would be interesting if the metadata
search process could be extended to a mobile environment. In other words, user-group
specific views can be made available to mobile devices. Availability of such an
app will definitely be beneficial to mobile users accessing the digital library under
a controlled access environment.
Bibliography

[1] Berlin sparql benchmark 2013.

[2] Sun’s xacml implementation.

[3] Cambridge digital library (http://pudl.princeton.edu/), Cambridge University.

[4] The cornell university library(http://cdl.library.cornell.edu/), Cornell.

[5] Oxford digital library (odl) (http://cdl.library.cornell.edu/), Oxford.

[6] Princeton university digital library (http://pudl.princeton.edu/), Princeton.

[7] M. S. Ackerman and R. T. Fielding. Collection maintenance in the digital library.
In DL, 1995.

[8] N. R. Adam, V. Atluri, E. Bertino, and E. Ferrari. A content-based authorization
model for digital libraries. IEEE Transactions on Knowledge and Data Engineering,
14:296–315, 2002.

[9] N. Aloia, C. Concordia, and C. Meghini. Europeana v1.0, 2011. URL
http://puma.isti.cnr.it/linkdoc.php?icode=2011-A1-039&authority=cnr.isti&collection=cnr.isti&langver=en.

[10] Apache. Apache tomcat, May 2007. URL http://tomcat.apache.org/.

[11] M. E. Aranguren. Automatic maintenance of multiple inheritance ontologies, 2010.
URL http://ontogenesis.knowledgeblog.org/49.

[12] R. M. Baecker, D. Nastos, I. R. Posner, and K. L. Mawby. The user-centered
iterative design of collaborative writing software. In Proceedings of the INTERACT
’93 and CHI ’93 Conference on Human Factors in Computing Systems, CHI ’93,
pages 399–405, New York, NY, USA, 1993. ACM. ISBN 0-89791-575-5. doi:
10.1145/169059.169312. URL http://doi.acm.org/10.1145/169059.169312.


[13] C. Baru and A. Rajasekar. A hierarchical access control scheme for digital libraries.
In Proceedings of the third ACM conference on Digital libraries, pages 275–276.
ACM, 1998.

[14] N. J. Belkin. Understanding and supporting multiple information seeking
behaviors in a single interface framework. In Proceedings of Eighth DELOS
Workshop: User Interfaces in Digital Libraries. DELOS Working Group Report, number
99/W001, pages 11–18, 1998.

[15] J. Berman. Principles of Big Data: Preparing, Sharing, and Analyzing Complex
Information. Elsevier Science, 2013. ISBN 9780124047242. URL https://books.
google.co.in/books?id=gEho0DI8a2kC.

[16] E. Bertino and E. Ferrari. Secure and selective dissemination of xml docu-
ments. ACM Trans. Inf. Syst. Secur., 5:290–331, August 2002. ISSN 1094-9224.
doi: http://doi.acm.org/10.1145/545186.545190. URL http://doi.acm.org/10.
1145/545186.545190.

[17] E. Bertino, S. Jajodia, and P. Samarati. A non-timestamped authorization model
for data management systems. In Proceedings of the 3rd ACM Conference on
Computer and Communications Security, CCS ’96, pages 169–178, New York,
NY, USA, 1996. ACM. ISBN 0-89791-829-0. doi: 10.1145/238168.238211. URL
http://doi.acm.org/10.1145/238168.238211.

[18] E. Bertino, F. Buccafurri, E. Ferrari, and P. Rullo. A logic-based approach for
enforcing access control. Journal of Computer Security, 8(2, 3):109–139, 2000.

[19] C. Bettini, S. Jajodia, X. S. Wang, and D. Wijesekera. Provisions and
obligations in policy management and security applications. In Proceedings of the
28th International Conference on Very Large Data Bases, VLDB ’02, pages 502–513.
VLDB Endowment, 2002. URL http://dl.acm.org/citation.cfm?id=1287369.1287413.

[20] C. Bettini, S. Jajodia, X. S. Wang, and D. Wijesekera. Provisions and obligations
in policy rule management. J. Network Syst. Manage., 11(3):351–372, 2003.

[21] C. Bettini, S. Jajodia, X. S. Wang, and D. Wijesekera. Reasoning with advanced
policy rules and its application to access control. Int. J. on Digital Libraries,
4(3):156–170, 2004.

[22] M. Bieber, F. Vitali, H. Ashman, V. Balasubramanian, and H. Oinas-Kukkonen.
Fourth generation hypermedia: Some missing links for the world wide web.
International Journal of Human-Computer Studies, 47(1):31–65, 1997. URL
http://ijhcs.open.ac.uk/bieber/bieber.html, http://ijhcs.open.ac.uk/bieber/bieber.pdf.

[23] A. P. Bishop and S. L. Star. Social informatics for digital library use and infras-
tructure. In M. E. Williams, editor, Annual Review of Information Science and
Technology, number 31, pages 301–401+. Information Today, 1996.

[24] C. Bizer and A. Schultz. Benchmarking the performance of storage systems that
expose sparql endpoints. In Proceedings of the ISWC Workshop on Scalable
Semantic Web Knowledgebase, 2008.

[25] O. Bodenreider, T. C. Rindflesch, and A. Burgun. Unsupervised, corpus-based
method for extending a biomedical terminology. In Proceedings of the ACL-02
Workshop on Natural Language Processing in the Biomedical Domain - Volume 3,
BioMed ’02, pages 53–60, Stroudsburg, PA, USA, 2002. Association for
Computational Linguistics. doi: 10.3115/1118149.1118157. URL
http://dx.doi.org/10.3115/1118149.1118157.

[26] P. Bonatti, S. De Capitani di Vimercati, and P. Samarati. An algebra for
composing access control policies. ACM Trans. Inf. Syst. Secur., 5(1):1–35, Feb. 2002.
ISSN 1094-9224. doi: 10.1145/504909.504910. URL
http://doi.acm.org/10.1145/504909.504910.

[27] C. L. Borgman. What are digital libraries, who is building them and why? In
T. Aparac, T. Saracevic, P. Ingwersen, and P. Vakkari, editors, CoLIS. Benja
Publishing, Lokve, Croatia, 1999.

[28] C. L. Borgman, M. J. Bates, M. V. Bates, E. N. Efthimiadis, A. J. Gilliland-Swetland,
Y. B. Kafai, G. H. Kafai, and A. B. Maddox. Social aspects of digital
libraries. final report to the national science foundation; computer, information
science, and engineering directorate; division of information, robotics, and
intelligent systems; information technology and organizations program, Sept. 28 2006.
URL http://works.bepress.com/borgman/183.

[29] A. Brewer, W. Ding, K. Hahn, and A. Komlodi. The role of intermediary services in
emerging digital libraries. In Proceedings of the first ACM international conference
on Digital libraries, pages 29–35. ACM, 1996.

[30] J. J. Cadiz, A. Gupta, and J. Grudin. Using web annotations for asynchronous
collaboration around documents. In Proceedings of the 2000 ACM Conference on
Computer Supported Cooperative Work, CSCW ’00, pages 309–318, New York,
NY, USA, 2000. ACM. ISBN 1-58113-222-0. doi: 10.1145/358916.359002. URL
http://doi.acm.org/10.1145/358916.359002.

[31] L. Candela, D. Castelli, Y. Ioannidis, G. Koutrika, P. Pagano, S. Ross,
H. Schek, and H. Schuldt. The digital library manifesto, 2006. URL
http://puma.isti.cnr.it/linkdoc.php?icode=2006-A1-15&authority=cnr.isti&collection=cnr.isti&langver=en.

[32] L. Candela, D. Castelli, N. Ferro, G. Koutrika, C. Meghini, Y. Ioannidis,
P. Pagano, S. Ross, D. Soergel, M. Agosti, M. Dobreva, V. Katifori, and
H. Schuldt. The DELOS digital library reference model - foundations for
digital libraries, 2007. URL
http://puma.isti.cnr.it/linkdoc.php?icode=2007-A1-019&authority=cnr.isti&collection=cnr.isti&langver=en.

[33] L. Candela, D. Castelli, N. Ferro, G. Koutrika, C. Meghini, P. Pagano, S. Ross,
D. Soergel, M. Agosti, and M. Dobreva, editors. The DELOS Digital Library
Reference model. Foundations for digital Libraries (Version 0.98). ISTI-CNR at
Gruppo ALI, Pisa, 2008. URL http://eprints.port.ac.uk/4104/.

[34] L. Candela, D. Castelli, and P. Pagano. History, evolution, and impact of
digital libraries, 2011. URL
http://puma.isti.cnr.it/linkdoc.php?icode=2011-A1-001&authority=cnr.isti&collection=cnr.isti&langver=en.

[35] L. Candela, P. Manghi, and Y. Ioannidis. Fourth workshop on very large digital
libraries: on the marriage between very large digital libraries and very large data
archives. 40:61–64, 2011. URL
http://puma.isti.cnr.it/linkdoc.php?icode=2011-A0-069&authority=cnr.isti&collection=cnr.isti&langver=en.

[36] Y. Caseau. Efficient handling of multiple inheritance hierarchies. In OOPSLA,
pages 271–287, 1993.

[37] L. Chan. Library of Congress Classification as an Online Retrieval Tool: Potentials
and Limitations. Information Technology and Libraries, 5(3):181–92, 1986.

[38] K. Cheung, J. Hunter, A. Lashtabeg, and J. Drennan. SCOPE: A scientific
compound object publishing and editing system. IJDC, 3(2):4–18, 2008. doi:
10.2218/ijdc.v3i2.55. URL http://dx.doi.org/10.2218/ijdc.v3i2.55.

[39] L. B. Christensen and T. Stevns. Biblus–a digital library to support integration
of visually impaired in mainstream education. In International Conference on
Computers for Handicapped Persons, pages 36–42. Springer, 2012.

[40] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to
Algorithms, Second Edition. MIT Press, Sept. 2001.

[41] A. Corradini, U. Montanari, F. Rossi, H. Ehrig, R. Heckel, and M. Löwe. Algebraic
approaches to graph transformation. Part I: basic concepts and double pushout
approach, pages 163–245. World Scientific Publishing Co., Inc., River Edge, NJ,
USA, 1997. ISBN 98-102288-48. URL http://dl.acm.org/citation.cfm?id=278918.278928.

[42] J. Crampton. Authorization and antichains. PhD thesis, Birkbeck College, 2002.

[43] E. Damiani, S. De Capitani di Vimercati, S. Paraboschi, and P. Samarati. A
fine-grained access control system for xml documents. ACM Trans. Inf. Syst. Secur.,
5:169–202, May 2002. ISSN 1094-9224. doi: http://doi.acm.org/10.1145/505586.505590.
URL http://doi.acm.org/10.1145/505586.505590.

[44] P. P. Das, P. K. Bhowmick, S. Sarkar, A. Gupta, S. Chakraborty, B. Sutradhar,
S. Chattopadhyay, M. Ghosh, A. Basu, N. G. Chattopadhyay, and P. P.
Chakrabarti. National digital library: a platform for paradigm shift in education
and research in india. SCIENCE AND CULTURE, pages 4 – 11, JANUARY-FEBRUARY,
2016.

[45] S. De Capitani di Vimercati and P. Samarati. Access control in federated systems.
In Proceedings of the 1996 workshop on New security paradigms, pages 87–99.
ACM, 1996.

[46] S. Dill, N. Eiron, D. Gibson, D. Gruhl, R. Guha, A. Jhingran, T. Kanungo,
S. Rajagopalan, A. Tomkins, J. A. Tomlin, and J. Y. Zien. Semtag and seeker:
Bootstrapping the semantic web via automated semantic annotation. In
Proceedings of the 12th International Conference on World Wide Web, WWW ’03,
pages 178–186, New York, NY, USA, 2003. ACM. ISBN 1-58113-680-3. doi:
10.1145/775152.775178. URL http://doi.acm.org/10.1145/775152.775178.

[47] K. M. Drabenstott. Analytical review of the library of the future. Technical report,
Council Library Resources, Washington, DC, 1994.

[48] H. Dreher, H. Krottmaier, and H. A. Maurer. What we expect from digital li-
braries. J. UCS, 10(9):1110–1122, 2004. URL http://www.jucs.org/jucs_10_
9/what_we_expect_from.

[49] M. du Preez. Digital library technologies: Complex objects, annotation, ontologies,
classification, extraction and security. Online Information Review, 38(6):833–834,
2014.

[50] H. EL-Deghaidy and A. Nouby. Effectiveness of a blended e-learning cooperative
approach in an egyptian teacher education programme. Comput. Educ., 51(3):
988–1006, Nov. 2008. ISSN 0360-1315. doi: 10.1016/j.compedu.2007.10.001. URL
http://dx.doi.org/10.1016/j.compedu.2007.10.001.

[51] N. I. ElSherbiny. Secure digital libraries, July 21 2011. URL
http://scholar.lib.vt.edu/theses/available/etd-06302011-161547/.

[52] C. Farkas, V. Gowadia, A. Jain, and D. Roy. From xml to rdf: Syntax, semantics,
security, and integrity (invited paper). In P. Dowland, S. Furnell, B. Thuraisingham,
and X. Wang, editors, Security Management, Integrity, and Internal Control
in Information Systems, volume 193 of IFIP International Federation for
Information Processing, pages 41–55. Springer Boston, 2006. ISBN 978-0-387-29826-9.
doi: 10.1007/0-387-31167-X_3. URL http://dx.doi.org/10.1007/0-387-31167-X_3.

[53] E. A. Fox. Digital libraries. IEEE Computer, 26(11):79, Nov. 1993.

[54] E. A. Fox. Sourcebook on Digital Libraries: Report for the National Science Foun-
dation. Technical Report TR-93-35, Dept. of Computer Science, Virginia Tech,
Blacksburg, VA, Dec. 1993. URL http://fox.cs.vt.edu/pub/DigitalLibrary/.

[55] E. A. Fox. How to make intelligent digital libraries. Lecture Notes in Computer
Science, 869:27–??, 1994. ISSN 0302-9743.

[56] E. A. Fox, D. Hix, L. T. Nowell, D. J. Brueni, W. C. Wake, L. S. Heath, and
D. Rao. Users, user interfaces, and objects: Envision, a digital library. Journal
of the American Society for Information Science, 44(8):480–491, 1993. ISSN
1097-4571. doi: 10.1002/(SICI)1097-4571(199309)44:8<480::AID-ASI7>3.0.CO;2-B.
URL http://dx.doi.org/10.1002/(SICI)1097-4571(199309)44:8<480::AID-ASI7>3.0.CO;2-B.

[57] E. A. Fox, G. McMillan, and J. L. Eaton. The evolving genre of electronic theses
and dissertations. In HICSS, 1999. URL http://computer.org/proceedings/
hicss/0001/00012/00012004abs.htm.

[58] A. Gabillon and E. Bruno. Regulating access to xml documents. In Proceedings
of the Fifteenth Annual Working Conference on Database and Application
Security, Das’01, pages 299–314, Norwell, MA, USA, 2002. Kluwer Academic
Publishers. ISBN 1-4020-7041-1. URL http://dl.acm.org/citation.cfm?id=863742.863764.

[59] H. Garcia-Molina, J. D. Ullman, and J. Widom. Database systems - the complete
book (2. ed.). Pearson Education, 2009. ISBN 978-0-13-187325-4.

[60] M. R. De Giusti, G. L. Villarreal, A. Vosou, and J. P. Martínez. An ontology-based
context aware system for selective dissemination of information in a digital
library. 2, May 21 2010. URL http://arxiv.org/abs/1005.4008. Comment:
http://www.journalofcomputing.org.

[61] H. Gladney. Access control for large collections. ACM Transactions on Information
Systems (TOIS), 15(2):154–194, 1997.

[62] A. Glenn and D. Millman. Access management of web-based services-an
incremental approach to cross-organizational authentication and authorization. D-Lib,
1998.

[63] V. Gligor, S. L. Gavrila, and D. Ferraiolo. On the formal definition of
separation-of-duty policies and their composition. In Proceedings of IEEE Symposium on
Research in Security and Privacy, pages 172–183, 1998.

[64] M. A. Gonçalves, E. A. Fox, L. T. Watson, and N. A. Kipp. Streams, structures,
spaces, scenarios, societies (5s): A formal model for digital libraries. ACM Trans.
Inf. Syst., 22:270–312, April 2004. ISSN 1046-8188. doi:
http://doi.acm.org/10.1145/984321.984325. URL http://doi.acm.org/10.1145/984321.984325.

[65] M. A. Gonçalves, E. A. Fox, and L. T. Watson. Towards a digital library
theory: a formal digital library ontology. Int. J. Digit. Libr., 8:91–114, April 2008.
ISSN 1432-5012. doi: 10.1007/s00799-008-0033-1. URL
http://dl.acm.org/citation.cfm?id=1388355.1388357.

[66] M. A. Gonçalves, B. L. Moreira, E. A. Fox, and L. T. Watson. “what is a good
digital library?” – a quality model for digital libraries. Information Processing
& Management, 43(5):1416 – 1437, 2007. ISSN 0306-4573. doi:
http://dx.doi.org/10.1016/j.ipm.2006.11.010. URL
http://www.sciencedirect.com/science/article/pii/S030645730600197X. Patent Processing.

[67] G. S. Graham and P. J. Denning. Protection: Principles and practice. In
Proceedings of the May 16-18, 1972, Spring Joint Computer Conference, AFIPS ’72
(Spring), pages 417–429, New York, NY, USA, 1972. ACM. doi:
10.1145/1478873.1478928. URL http://doi.acm.org/10.1145/1478873.1478928.

[68] M. Hadley and P. Sandoz. Jax-rs: The java api for restful web services. Java
Specification Request (JSR) 311, October 2007.

[69] S. Haldar and D. K. Subramanian. A dynamic granularity locking protocol for
tree-structured databases. In Applied Computing, 1991., [Proceedings of the 1991]
Symposium on, pages 372–380, 1991. doi: 10.1109/SOAC.1991.143905.

[70] M. A. Harrison, W. L. Ruzzo, and J. D. Ullman. On protection in operating
systems. SIGOPS Oper. Syst. Rev., 9(5):14–24, Nov. 1975. ISSN 0163-5980. doi:
10.1145/1067629.806517. URL http://doi.acm.org/10.1145/1067629.806517.

[71] B. Haslhofer and P. Knežević. The bricks digital library infrastructure. In
S. Kruk and B. McDaniel, editors, Semantic Digital Libraries, pages 151–161.
Springer Berlin Heidelberg, 2009. ISBN 978-3-540-85433-3. doi:
10.1007/978-3-540-85434-0_11. URL http://dx.doi.org/10.1007/978-3-540-85434-0_11.

[72] P. Heslop, A. Preston, A. Kharrufa, M. Balaam, D. Leat, and P. Olivier.
Evaluating digital tabletop collaborative writing in the classroom. In Human-Computer
Interaction, pages 531–548. Springer, 2015.

[73] W.-K. Huang and V. Atluri. Analyzing the safety of workflow authorization mod-
els. In Database Security XII, pages 43–57. Springer, 1999.

[74] InfoEther. Pmd, 2015. URL https://pmd.github.io.

[75] Y. Ioannidis. Digital libraries at a crossroads. International Journal on Digital
Libraries, 5(4):255–265, 2005. ISSN 1432-5012. doi: 10.1007/s00799-004-0098-4.
URL http://dx.doi.org/10.1007/s00799-004-0098-4.

[76] Y. Ioannidis, D. Maier, S. Abiteboul, P. Buneman, S. Davidson, E. Fox, A. Halevy,
C. Knoblock, F. Rabitti, H. Schek, and G. Weikum. Digital library
information-technology infrastructures. International Journal on Digital Libraries,
5(4):266–274, 2005. ISSN 1432-5012. doi: 10.1007/s00799-004-0094-8. URL
http://dx.doi.org/10.1007/s00799-004-0094-8.

[77] S. Jajodia, P. Samarati, and V. S. Subrahmanian. A logical language for expressing
authorizations. In Proceedings of the 1997 IEEE Symposium on Security and
Privacy, SP ’97, pages 31–, Washington, DC, USA, 1997. IEEE Computer Society.
URL http://dl.acm.org/citation.cfm?id=882493.884380.

[78] S. Jajodia, P. Samarati, M. L. Sapino, and V. S. Subrahmanian. Flexible support
for multiple access control policies. ACM Trans. Database Syst., 26:214–260, June
2001. ISSN 0362-5915. doi: http://doi.acm.org/10.1145/383891.383894. URL
http://doi.acm.org/10.1145/383891.383894.

[79] N. B. Jariwala and B. Patel. Transliteration of digital gujarati text into printable
braille. In Communication Systems and Network Technologies (CSNT), 2015 Fifth
International Conference on, pages 572–577. IEEE, 2015.

[80] D. Jonscher and K. R. Dittrich. An approach for building secure database feder-
ations. In Proceedings of the 20th International Conference on Very Large Data
Bases, pages 24–35. Morgan Kaufmann Publishers Inc., 1994.

[81] J. Joshi, A. Ghafoor, W. G. Aref, and E. H. Spafford. Digital government security
infrastructure design challenges. Computer, 34(2):66–72, 2001.

[82] L. Kagal, T. Finin, and A. Joshi. A policy based approach to security for the
semantic web. In D. Fensel, K. Sycara, and J. Mylopoulos, editors, The Semantic
Web - ISWC 2003, volume 2870 of Lecture Notes in Computer Science, pages
402–418. Springer Berlin Heidelberg, 2003. ISBN 978-3-540-20362-9. doi: 10.1007/
978-3-540-39718-2_26. URL http://dx.doi.org/10.1007/978-3-540-39718-2_
26.

[83] J. Kahan and M.-R. Koivunen. Annotea: An open rdf infrastructure for shared
web annotations. In Proceedings of the 10th International Conference on World
Wide Web, WWW ’01, pages 623–632, New York, NY, USA, 2001. ACM. ISBN 1-
58113-348-0. doi: 10.1145/371920.372166. URL http://doi.acm.org/10.1145/
371920.372166.

[84] T. Kanan, X. Zhang, M. Magdy, and E. Fox. Big data text summarization for
events: A problem based learning course. In Proceedings of the 15th ACM/IEEE-
CS Joint Conference on Digital Libraries, pages 87–90. ACM, 2015.

[85] S. Kaushik, D. Wijesekera, and P. Ammann. Policy-based dissemination of partial
web-ontologies. In E. Damiani and H. Maruyama, editors, Proceedings of the
2nd ACM Workshop On Secure Web Services, SWS 2005, Fairfax, VA, USA,
November 11, 2005, pages 43–52. ACM, 2005. ISBN 1-59593-234-8. URL http:
//doi.acm.org/10.1145/1103022.1103030.

[86] E. H.-J. Kim, J. S. Oh, and M. Song. Exploring context-sensitive query reformula-
tion in a biomedical digital library. In International Conference on Asian Digital
Libraries, pages 94–106. Springer, 2015.

[87] W. Kim, N. Ballou, J. F. Garza, and D. Woelk. A distributed object-oriented
database system supporting shared and private databases. ACM Transactions on
Information Systems (TOIS), 9(1):31–51, 1991.

[88] M. Koch, L. V. Mancini, and F. Parisi-Presicce. Decidability of safety in graph-
based models for access control. In Proceedings of the 7th European Symposium on
Research in Computer Security, ESORICS ’02, pages 229–243, London, UK,
2002. Springer-Verlag. ISBN 3-540-44345-2. URL http://dl.acm.org/citation.
cfm?id=646649.699492.

[89] M. Koch, L. Mancini, and F. Parisi-Presicce. Graph-based specification of access
control policies. Journal of Computer and System Sciences, 71(1):1–33,
2005. ISSN 0022-0000. doi: 10.1016/j.jcss.2004.11.002. URL http://www.
sciencedirect.com/science/article/pii/S002200000400145X.

[90] U. Kohl, J. Lotspiech, and M. A. Kaplan. Safeguarding digital library contents
and users. D-lib Magazine, 3(9), 1997.

[91] D. Koutsomitropoulos, G. Solomou, A. Alexopoulos, and T. Papatheodorou.
Semantic web enabled digital repositories. International Journal on Digital Libraries,
10(4):179–199, 2009. ISSN 1432-5012. doi: 10.1007/s00799-010-0059-z. URL
http://dx.doi.org/10.1007/s00799-010-0059-z.

[92] N. P. Kozievitch, J. Almeida, R. da Silva Torres, N. J. Leite, M. A. Gonçalves,
U. Murthy, and E. A. Fox. Towards a formal theory for complex objects and
content-based image retrieval. JIDM, 2(3):321–336, 2011. URL http://seer.
lcc.ufmg.br/index.php/jidm/article/view/142.

[93] D. B. Krafft, A. Birkland, and E. J. Cramer. Ncore: Architecture and
implementation of a flexible, collaborative digital library. In Proceedings of the 8th
ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL ’08, pages 313–322,
New York, NY, USA, 2008. ACM. ISBN 978-1-59593-998-2. doi: 10.1145/1378889.
1378943. URL http://doi.acm.org/10.1145/1378889.1378943.

[94] T. Krichel. The rePEc database about economics. 2001.

[95] M. Krishnamurthy. Digital library services at the indian statistical institute.
The Electronic Library, 23(2):200–203, 2005. URL http://dx.doi.org/10.1108/
02640470510592898.

[96] S. R. Kruk, S. Decker, and L. Zieborak. Jeromedl – adding semantic web
technologies to digital libraries. In Proceedings of the 16th International Conference
on Database and Expert Systems Applications, DEXA’05, pages 716–725, Berlin,
Heidelberg, 2005. Springer-Verlag. ISBN 3-540-28566-0, 978-3-540-28566-3. doi:
10.1007/11546924_70. URL http://dx.doi.org/10.1007/11546924_70.

[97] S. R. Kruk, M. Synak, and K. Zimmermann. Marcont: Integration ontology for
bibliographic description formats. In Proceedings of the 2005 International Confer-
ence on Dublin Core and Metadata Applications: Vocabularies in Practice, DCMI
’05, pages 31:1–31:5. Dublin Core Metadata Initiative, 2005. ISBN 8489315442,
9788489315440. URL http://dl.acm.org/citation.cfm?id=1383465.1383504.

[98] M. Kudo and S. Hada. Xml document security based on provisional authorization.
In Proceedings of the 7th ACM Conference on Computer and Communications
Security, CCS ’00, pages 87–96, New York, NY, USA, 2000. ACM. ISBN 1-
58113-203-4. doi: 10.1145/352600.352613. URL http://doi.acm.org/10.1145/
352600.352613.

[99] J. L. Kulikowski. The role of ontological models in pattern recognition. In Computer
Recognition Systems, pages 43–52. Springer, 2005.

[100] C. Lagoze and H. V. D. Sompel. The open archives initiative: building a low-barrier
interoperability framework. In JCDL ’01: Proceedings of the 1st ACM/IEEE-
CS joint conference on Digital libraries, pages 54–62, 2001.

[101] P. Lambe. Organising knowledge: taxonomies, knowledge and organisational
effectiveness. Elsevier, 2014. ISBN 1780632002.

[102] B. W. Lampson. Protection. SIGOPS Oper. Syst. Rev., 8(1):18–24, Jan. 1974.
ISSN 0163-5980. doi: 10.1145/775265.775268. URL http://doi.acm.org/10.
1145/775265.775268.

[103] S.-Y. Lee and R.-L. Liou. A multi-granularity locking model for concurrency
control in object-oriented database systems. Knowledge and Data Engineering,
IEEE Transactions on, 8(1):144–156, 1996. ISSN 1041-4347. doi: 10.1109/69.
485643.

[104] C. Leeder and C. Shah. Library research as collaborative information seeking.
Library & Information Science Research, 38(3):202–211, 2016.

[105] J. Leskovec. Stanford large network dataset collection, 2015. URL https://snap.
stanford.edu/data/.

[106] D. M. Levy and C. C. Marshall. Going digital: A look at assumptions underlying
digital libraries. Commun. ACM, 38(4):77–84, Apr. 1995. ISSN 0001-0782. doi:
10.1145/205323.205346. URL http://doi.acm.org/10.1145/205323.205346.

[107] X. Li and Y. Xue. A survey on server-side approaches to securing web applications.
ACM Comput. Surv., 46(4):54:1–54:29, Mar. 2014. ISSN 0360-0300. doi: 10.1145/
2541315. URL http://doi.acm.org/10.1145/2541315.

[108] R. E. Lucier. Building a digital library for the health sciences: Information space
complementing information place. Bulletin of the Medical Library Association, 83:
346–350, July 1995.

[109] T. F. Lunt and E. B. Fernandez. Database security. SIGMOD Rec., 19(4):90–97,
Dec. 1990. ISSN 0163-5808. doi: 10.1145/122058.122069. URL http://doi.acm.
org/10.1145/122058.122069.

[110] C. Lynch and H. Garcia-Molina. Interoperability, scaling, and the digital libraries
research agenda: A report on the may 18-19, 1995 IITA digital libraries work-
shop, May 18-19 1995. URL http://www-diglib.stanford.edu/diglib/pub/
reports/iita-dlw/main.html.

[111] C. A. Lynch. Institutional Repositories: Essential Infrastructure for Scholarship
in the Digital Age. ARL bimonthly report, (226), Feb. 2003. URL http://www.
arl.org/resources/pubs/br/br226/br226ir.shtml.

[112] S. Makri, A. Blandford, and A. L. Cox. This is what i’m doing and why: Reflections
on a think-aloud study of dl users’ information behaviour. In Proceedings of the
10th Annual Joint Conference on Digital Libraries, JCDL ’10, pages 349–352, New
York, NY, USA, 2010. ACM. ISBN 978-1-4503-0085-8. doi: 10.1145/1816123.
1816177. URL http://doi.acm.org/10.1145/1816123.1816177.

[113] C. C. Marshall. Annotation: From paper books to the digital library. In Pro-
ceedings of the Second ACM International Conference on Digital Libraries, DL
’97, pages 131–140, New York, NY, USA, 1997. ACM. ISBN 0-89791-868-1. doi:
10.1145/263690.263806. URL http://doi.acm.org/10.1145/263690.263806.

[114] P. Mazzoleni, E. Bertino, and B. Crispo. Xacml policy integration algorithms:
not to be confused with xacml policy combination algorithms! In D. F. Ferraiolo
and I. Ray, editors, SACMAT, pages 219–227. ACM, 2006. URL http://dblp.
uni-trier.de/db/conf/sacmat/sacmat2006.html#MazzoleniBC06.

[115] E. Meena, A. Kumar, and L. Romary. An extensible framework for efficient
document management using rdf and owl. In Proceedings of the Workshop on
NLP and XML (NLPXML-2004): RDF/RDFS and OWL in Language Technology,
NLPXML ’04, pages 51–58, Stroudsburg, PA, USA, 2004. Association for Com-
putational Linguistics. URL http://dl.acm.org/citation.cfm?id=1621066.
1621074.

[116] C. Meghini and N. Spyratos. Rationale and some principles for a VLDL
data model, Sept. 2008. URL http://puma.isti.cnr.it/linkdoc.php?icode=
2008-A3-022&authority=cnr.isti&collection=cnr.isti&langver=en.

[117] C. Meghini, N. Spyratos, and J. Yang. A data model for digital libraries. Int.
J. on Digital Libraries, 11(1):41–56, 2010. URL http://dx.doi.org/10.1007/
s00799-011-0064-x.

[118] D. Millman. Cross-organizational access management: A digital library
authentication and authorization architecture. 1999.

[119] J. Niu. Hierarchical relationships in the bibliographic universe. Cataloging &
Classification Quarterly, 51(5):473–490, 2013.

[120] N. F. Noy. Semantic integration: A survey of ontology-based approaches. SIGMOD
Rec., 33(4):65–70, Dec. 2004. ISSN 0163-5808. doi: 10.1145/1041410.1041421.
URL http://doi.acm.org/10.1145/1041410.1041421.

[121] P. J. Nürnberg, R. Furuta, J. J. Leggett, C. C. Marshall, and F. M. S. III. Digital
libraries: Issues and architectures. In DL, 1995.

[122] M. Nyanchama and S. Osborn. The role graph model and conflict of interest. ACM
Trans. Inf. Syst. Secur., 2:3–33, February 1999. ISSN 1094-9224. doi: 10.1145/300830.
300832. URL http://doi.acm.org/10.1145/300830.300832.

[123] High Level Expert Group on Digital Libraries. Final report: Digital libraries:
Recommendations and challenges for the future, December 2009. URL
http://www.dlorg.eu/uploads/External%20Publications/HLG%20Final%
20Report%202009%20clean.pdf.

[124] S. Payette and C. Lagoze. Flexible and extensible digital object and reposi-
tory architecture (fedora). In Proceedings of the Second European Conference
on Research and Advanced Technology for Digital Libraries, ECDL ’98, pages
41–59, London, UK, 1998. Springer-Verlag. ISBN 3-540-65101-2. URL
http://dl.acm.org/citation.cfm?id=646631.696688.

[125] M. G. Pillai and P. Aparna. National library and information services infrastruc-
ture for scholarly content (n-list): A study. Paradigm Shift in Libraries, page 25,
2015.

[126] L. Qin and V. Atluri. Semantics aware security policy specification for the se-
mantic web data. Int. J. Inf. Comput. Secur., 4:52–75, February 2010. ISSN
1744-1765. doi: 10.1504/IJICS.2010.031859. URL http://dx.
doi.org/10.1504/IJICS.2010.031859.

[127] F. Rabitti, E. Bertino, W. Kim, and D. Woelk. A model of authorization for next-
generation database systems. ACM Trans. Database Syst., 16(1):88–131, Mar.
1991. ISSN 0362-5915. doi: 10.1145/103140.103144. URL http://doi.acm.org/
10.1145/103140.103144.

[128] I. Ray and S. Chakraborty. A framework for flexible access control in digital library
systems. In E. Damiani and P. Liu, editors, DBSec, volume 4127 of Lecture Notes
in Computer Science, pages 252–266. Springer, 2006. ISBN 3-540-36796-9. URL
http://dx.doi.org/10.1007/11805588_18.

[129] E. Rescorla and A. Schiffman. The secure hypertext transfer protocol, 1999.

[130] C. Ruan and V. Varadharajan. A graph theoretic approach to authorization
delegation and conflict resolution in decentralised systems. Distrib. Parallel Databases,
27(1):1–29, Feb. 2010. ISSN 0926-8782. doi: 10.1007/s10619-009-7044-9. URL
http://dx.doi.org/10.1007/s10619-009-7044-9.

[131] H. Saeed and A. S. Chaudhry. Using dewey decimal classification scheme (DDC)
for building taxonomies for knowledge organisation. Journal of Documentation,
58(5):578–583, 2002. ISSN 0022-0418. URL http://www.emeraldinsight.com/
journals.htm?articleid=864200&show=abstract.

[132] S. Sakr and G. Al-Naymat. Relational processing of rdf queries: A survey. SIG-
MOD Rec., 38(4):23–28, June 2010. ISSN 0163-5808. doi: 10.1145/1815948.
1815953. URL http://doi.acm.org/10.1145/1815948.1815953.

[133] P. Samarati and S. D. C. d. Vimercati. Access control: Policies, models, and
mechanisms. In Revised Versions of Lectures Given During the IFIP WG 1.7
International School on Foundations of Security Analysis and Design on Foun-
dations of Security Analysis and Design: Tutorial Lectures, FOSAD ’00, pages
137–196, London, UK, 2001. Springer-Verlag. ISBN 3-540-42896-8. URL
http://dl.acm.org/citation.cfm?id=646206.683112.

[134] P. Samarati, E. Bertino, and S. Jajodia. An authorization model for a distributed
hypertext system. Knowledge and Data Engineering, IEEE Transactions on, 8(4):
555–562, 1996.

[135] J. Seidenberg and A. Rector. Web ontology segmentation: analysis, classification
and use. In Proceedings of the 15th international conference on World Wide Web,
pages 13–22. ACM, 2006.

[136] A. Sengupta, C. Mazumdar, and A. Bagchi. A formal methodology for detection
of vulnerabilities in an enterprise information system. In CRiSIS 2009, Post-
Proceedings of the Fourth International Conference on Risks and Security of In-
ternet and Systems, Toulouse, France, October 19-22, 2009, pages 74–81, 2009.
doi: 10.1109/CRISIS.2009.5411976. URL http://dx.doi.org/10.1109/CRISIS.
2009.5411976.

[137] A. Sengupta, C. Mazumdar, and A. Bagchi. A formal methodology for detecting
managerial vulnerabilities and threats in an enterprise information system. J.
Network Syst. Manage., 19(3):319–342, 2011. doi: 10.1007/s10922-010-9180-y.
URL http://dx.doi.org/10.1007/s10922-010-9180-y.

[138] A. Sengupta, C. Mazumdar, and A. Bagchi. Specification and validation of
enterprise information security policies. In CUBE International IT Conference &
Exhibition, CUBE ’12, Pune, India - September 03 - 06, 2012, pages 801–808, 2012.
doi: 10.1145/2381716.2381868. URL http://doi.acm.org/10.1145/2381716.
2381868.

[139] S. B. Shum, E. Motta, and J. Domingue. Scholonto: an ontology-based digital
library server for research documents and discourse. Int. J. on Digital Libraries,
3(3):237–248, 2000. URL http://dx.doi.org/10.1007/s007990000034.

[140] H. Skogsrud, B. Benatallah, and F. Casati. A trust negotiation system for digital
library web services. International Journal on Digital Libraries, 4(3):185–207,
2004.

[141] W.-S. Sohn, S.-K. Ko, Y.-C. Choy, K.-H. Lee, S.-H. Kim, and S.-B. Lim. De-
velopment of a standard format for ebooks. In Proceedings of the 2002 ACM
Symposium on Applied Computing, SAC ’02, pages 535–540, New York, NY,
USA, 2002. ACM. ISBN 1-58113-445-2. doi: 10.1145/508791.508894. URL
http://doi.acm.org/10.1145/508791.508894.

[142] D. Spampinato and I. Zangara. Classical antiquity and semantic content man-
agement on linked open data. In Proceedings of the 1st International Workshop
on Collaborative Annotations in Shared Environment: Metadata, Vocabularies and
Techniques in the Digital Humanities, DH-CASE ’13, pages 13:1–13:7, New York,
NY, USA, 2013. ACM. ISBN 978-1-4503-2199-0. doi: 10.1145/2517978.2517992.
URL http://doi.acm.org/10.1145/2517978.2517992.

[143] M. Spies. An ontology modelling perspective on business reporting. Information
Systems, 35(4):404–416, 2010.

[144] A. M. Tammaro. Going digital in italy. Humboldt University Berlin, Germany,
Jan. 19 2008. URL http://edoc.hu-berlin.de/conferences/bobcatsss2008/
tammaro-anna-maria-121/PDF/tammaro.pdf.

[145] I. Tatarinov, Z. Ives, J. Madhavan, A. Halevy, D. Suciu, N. Dalvi, X. L. Dong,
Y. Kadiyska, G. Miklau, and P. Mork. The piazza peer data management project.
SIGMOD Rec., 32(3):47–52, Sept. 2003. ISSN 0163-5808. doi: 10.1145/945721.
945732. URL http://doi.acm.org/10.1145/945721.945732.

[146] Y. L. Theng, N. Mohd-Nasir, H. Thimbleby, G. Buchanan, and M. Jones.
Designing a children’s digital library with and for children. In Proceedings of the
Fifth ACM Conference on Digital Libraries, DL ’00, pages 266–267, New York,
NY, USA, 2000. ACM. ISBN 1-58113-231-X. doi: 10.1145/336597.336697. URL
http://doi.acm.org/10.1145/336597.336697.

[147] Y. L. Theng, N. Mohd-Nasir, G. Buchanan, B. Fields, H. Thimbleby, and N. Cassidy.
Dynamic digital libraries for children. In Proceedings of the 1st ACM/IEEE-
CS Joint Conference on Digital Libraries, JCDL ’01, pages 406–415, New York,
NY, USA, 2001. ACM. ISBN 1-58113-345-6. doi: 10.1145/379437.379738. URL
http://doi.acm.org/10.1145/379437.379738.

[148] P. Tools. Checkstyle, 2015. URL http://checkstyle.sourceforge.net.

[149] H. A. T. Tran. Challenges in the digital information era: Situation at the general
sciences library of hochiminh city. Library Management, 36(4/5):315–328, 2015.

[150] S. Tuarob, L. C. Pouchard, P. Mitra, and C. L. Giles. A generalized topic modeling
approach for automatic document annotation. International Journal on Digital
Libraries, 16(2):111–128, 2015.

[151] D. Tudhope, H. Alani, and C. Jones. Augmenting thesaurus relationships:
possibilities for retrieval. Journal of digital information, 1(8), 2006.

[152] M. F. van Bommel and T. J. Beck. Incremental encoding of multiple inheritance
hierarchies supporting lattice operations. Electron. Trans. Artif. Intell, 4(C):35–
49, 2000. URL http://www.ep.liu.se/ej/etai/2000/006/.

[153] M. F. van Bommel and P. Wang. Encoding multiple inheritance hierarchies for
lattice operations. Data Knowl. Eng, 50(2):175–194, 2004. URL http://dx.doi.
org/10.1016/j.datak.2003.12.001.

[154] M. Waugh, M. Donlin, and S. Braunstein. Next-generation collection management:
A case study of quality control and weeding e-books in an academic library.
Collection Management, 40(1):17–26, 2015.

[155] M. Winslett, N. Ching, V. E. Jones, and I. Slepchin. Using digital credentials on
the world wide web. Journal of Computer Security, 5(3):255–266, 1997.

[156] I. H. Witten. Visions of the digital library, Mar. 11 2002. URL
http://citeseer.ist.psu.edu/502866.html;http://www.cs.waikato.ac.
nz/~ihw/papers/01IHW-VisionsOfTheDL.pdf.

[157] H. I. Xie. Users’ evaluation of digital libraries (dls): Their uses, their criteria,
and their assessment. Inf. Process. Manage., 44(3):1346–1373, May 2008. ISSN
0306-4573. doi: 10.1016/j.ipm.2007.10.003. URL http://dx.doi.org/10.1016/
j.ipm.2007.10.003.

[158] X. Zhang, J. Park, F. Parisi-Presicce, and R. Sandhu. A logical specification for
usage control. In Proceedings of the ninth ACM symposium on Access control
models and technologies, pages 1–10. ACM, 2004.

[159] D. G. Zhao and A. Ramsden. The elinor electronic library. In Selected Papers
from the Digital Libraries, Research and Technology Advances, ADL ’95, pages
243–258, London, UK, 1996. Springer-Verlag. ISBN 3-540-61410-9. URL
http://dl.acm.org/citation.cfm?id=647695.731359.

[160] R. L. Ziegler. Linux-Firewalls. Markt + Technik Verl., München, 2002. ISBN
3-8272-6257-7, 978-3-8272-6257-8.
