Documente Academic
Documente Profesional
Documente Cultură
01
Guidelines for
Digitization of
Manuscripts
final digitization.qxd 27/12/05 2:55 PM Page 1
2 3
Introduction This document addresses the standards for creating of organisation such as Library of Congress, USA,
archival quality digital still images of manuscripts. These National Library of Australia, National Library of New
guidelines are meant for any organisation planning to Zealand helped our efforts to develop the workable stan-
digitise these materials. The guidelines specify factor dards. National Informatics Centre has conducted proto-
affecting image quality, file formats, storage and access type digitization of manuscripts of Orissa and Chennai
standards for images. using these specifications. Guidelines are indicative and
Guidelines are prescribed to maintain consistency, evolving in nature. As the technology advances these need
high quality scan and meeting the global standards. It is updating regularly.
advisable to use these guidelines to create images of long-
term usage and reduce the need to rescan material. Digitization Digital technology opens up a totally new perspective.
Out of scope of these guidelines are: The World Wide Web holds millions of websites and the
• digitization standards for audio and video recording Internet is market place for research, teaching, expression,
• born digital material publication and communication of information. Libr-
• conservation processes for preparation of material for aries and Archives are society’s primary information
digitization providers with respect to cataloguing and processing
• scholarly work required deciphering manuscripts like management. Besides preserving and providing access to
folio numbers etc. ‘born digital material’ a great number of libraries nowa-
days have also turned to creating digital surrogates from
Background India has the largest collection of manuscripts in the their existing resources. Digitization means acquiring,
world. They are spread all over the country and also converting, storing and providing information in a
abroad in different libraries, academic institutions, muse- computer format that is standardized, organized and
ums, temples and monasteries and in private collections. available from demand from common system. With
The rich manuscript wealth of India today faces a threat specialized scanner’s manuscripts are converted into
of survival. The invaluable heritage of India in the form of compressed digital formats and stored systematically for
manuscripts has to be documented, preserved and made future reference.
accessible to us and to succeeding generations.
National Mission of Manuscripts (NMM) has the Target Audience These standards are aimed at decision makers, library
primary objective of using digital technology to preserve managers, and curatorial and technical staff members.
the manuscripts for the posterity. There are no digitiza-
tion standards thus far available those the mission in its Why Digitize? The reasons for implementing a digitization project, or
massive digitization initiative can adopt. NMM and more precisely for digital conversion of non-digital
National Informatics Centre – a premier Govt. of India IT source material, are varied. The decision to digitize may
organisation – has studied the best practices being adopt- be in order to:
ed in several digitization projects by world libraries, • To increase access: this is the most obvious and primary
museums etc. In depth study of the digitization processes reason, where there is thought to be a high demand
final digitization.qxd 27/12/05 2:55 PM Page 4
4 5
from users and the library or source has the desire to • Principal reasons for digitization
improve access to a specific collection. • Criteria for selection
• To improve services to an expanding user’s group by
providing enhanced access to the institution’s resources Principal reasons For enhanced access
with respect to education, long life learning. Mainly for research purposes.
for digitization
• To reduce the handling and use of fragile or heavily
used original material and create a “back up” copy for To facilitate new forms of access and use
endangered material. The main purpose in this case is to enable the use of
original manuscripts that cannot be consulted in its ori-
Components • Selection Policy ginal form other than by visiting its specific repository,
and for manuscripts that has been damaged and where
• Conversion technology is needed to reveal its content or shape.
6 7
tion may present too many risks of further damage being computer to store and retrieve them. From these files
caused by handling to allow it to be scanned without spe- the computer can produce analog representations for
cial care, or some basic conservation techniques. on-screen display or printing. Because files with high-
Similarly, if the material being considered as a candidate resolution images are very large it may be necessary to
for digitization lacks detailed cataloguing or descriptive reduce the file size (compression) to make them more
data, it is essential for future access to such material to manageable both for computer and the user.
create such data, and it will therefore need to be consid- When a source document has been scanned, all data is
ered whether the necessary costs of doing this can be converted to a particular file format for storage. There are
included in the overall budget of the digitization project. a number of widely used image formats on the market.
Some of them are meant both for storage and compres-
Technical Conversion sion. Image files also include technical information stored
A digital image is an “electronic photograph” mapped as in an area of the file called the image “header”.
Requirements
a set of picture elements (pixels) and arrange according The goal of any digitization programme should be to
and to a predefined ratio of columns and rows. The number capture and present in digital formats the significant
Implementation of pixel in a given array defines the resolution of the informational content contained in a single source docu-
image. Each pixel has a tonal value depending on the ment or in a collection of such documents. To capture the
level of light reflecting from the source document to a significant parts, the quality assessments of the digital
charged-coupled device (CCD) with light-sensitive images have to be based on a comparison between those
diodes. When exposed to light they create a proportional digital images and original source documents that are to
electric charge, which through an analog/digital conver- be converted, not on some vaguely defined concept of
sion generates a series of digital signals represented in what is good enough to serve the immediate needs.
binary code. The smallest unit of data stored in a com-
puter is called a bit (binary digit). The number of bits Input Specification
used to represent each pixel in an image determines the
number of colours or shades of grey that can be repre- • The input documents are manuscripts of generally
sented in a digital image. This is called bit-depth. A2 – A4 size.
Digital images are also known as raster images to sep-
arate them from other type of electronic files such as vec- • Manuscripts are available at various repositories located
tor files in which graphic information is encoded as at different parts of the country and need to be
mathematics formulas representing lines and curves. digitised at the Digitization Centre to be located at site.
Source documents are transformed to bit-mapped
images by scanner or digital camera. During image cap- • Manuscripts are primarily available on Paper (various
ture these documents are “read” or scanned at a prede- types), Palm leaves, Bhoj Patra, Bamboo leaves, Pamera
fined resolution and bit-depth. The resulting digital files, Leaves, Cloth, Clay Tables, Perchments, Tamra Patra,
containing the binary digits (bits) for each pixel, are then Wooden Covers, Ivory Covers/Sheets, Wooden Beads,
formatted and tagged in a way that makes it easy for a Scrolls, Dear Skin, Micro film etc.
final digitization.qxd 27/12/05 2:55 PM Page 8
8 9
• They are generally very old and brittle and need special • Long, horizontal format requires special handling
and sophisticated handling techniques. considerations.
• Some manuscripts are having illustrations/charts creat- • To maintain the sequence of loose leaf manuscripts
ed using ancient inks, vegetable dyes, metals such as local scholar are to be engaged to remove the thread,
silver, gold etc. They are very likely to get oxidised with enumerate the pages and record the missing folios and
the effect of bright light and heat. rethread the manuscripts after digitization.
10 11
12 13
14 15
are stored. So the organization is in such a way that just Specification 2. Clean Master (Cleaned loss less compressed image)
by looking at the name one could tell about the manu-
script digitized. 3. Access Image (Derivative lossy image)
16 17
18 19
20