
Technical White Paper on Image Server

About Image Server


OmniDocs is a multi-tiered, highly scalable document management solution.
At the heart of OmniDocs storage management is the Image Server. It is responsible for the storage and retrieval of
documents and for the entire document lifecycle: moving documents from online to offline storage, data
caching, replication and so on. It is implemented in server-side Java and is available on Windows, Linux and Unix platforms.
The Image Server is designed to support LAN, WAN and Internet environments in which image storage is distributed
across multiple locations. The index information for the archived images is maintained in a centralized database server (as
depicted in the figure), which provides a single, centralized point of access to the images stored at multiple sites.

Image Server Organization


The Image Server consists of a Centralized Index Database (Image Server Database) and multiple Storage Management
Servers (SMS) located at multiple sites.

Storage Management Server (SMS)


The Storage Management Server manages the actual storage and retrieval of image data. Multiple SMS instances can be deployed for a
single Image Server. The SMS uses labels, which are created for the different storage media attached to it. These labels
are logical references to absolute paths on the media.
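A label can be thought of as a named alias for a directory on a storage medium. The following is a minimal Java sketch of that idea, not the product's actual API: the class name `LabelRegistry`, its methods and the label name used in the usage note are all hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: maps SMS labels to absolute paths on storage media.
class LabelRegistry {
    private final Map<String, Path> labels = new HashMap<>();

    // Registers a label; relative paths are rejected because a label refers to an absolute path.
    void register(String label, String absolutePath) {
        Path path = Paths.get(absolutePath);
        if (!path.isAbsolute()) {
            throw new IllegalArgumentException("Label must refer to an absolute path: " + absolutePath);
        }
        labels.put(label, path);
    }

    // Resolves a file name against the location the label refers to.
    Path resolve(String label, String fileName) {
        Path base = labels.get(label);
        if (base == null) {
            throw new IllegalArgumentException("Unknown label: " + label);
        }
        return base.resolve(fileName);
    }
}
```

With a label such as `ONLINE_DISK` registered, callers can refer to files by label and file name, so moving the media to another mount point would only require re-registering the label.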
Image Volumes and Volume Blocks
Image storage at a physical location is divided into logical storage units called Image Volumes. An Image Volume in turn
consists of multiple Image Volume Blocks. Each Volume Block is a physical file that holds a group of image
files. The Image Server builds this image data file for the Volume Block, and the file provides the actual
physical storage.
An Image Volume can be replicated across multiple sites. There is no limit on the physical storage size of an Image Volume.
Multiple Image Volumes can be defined for an SMS.

The above figure displays the basic design of the Image Server. Three SMS instances are running. The central SMS location has four volumes
whose home site is Site 1. vol1, vol2 and vol3 are replicated at Site 2, and vol1 and vol4 are replicated at Site 3.
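The layout described in the figure caption can be modelled as a small data structure. This is only an illustrative sketch: the class `VolumeLayout` and its methods are hypothetical, and the real Image Server keeps this information in its index database.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model: each volume has exactly one home site and zero or more replica sites.
class VolumeLayout {
    private final Map<String, String> homeSiteByVolume = new LinkedHashMap<>();
    private final Map<String, Set<String>> replicaSitesByVolume = new LinkedHashMap<>();

    void addVolume(String volume, String homeSite, String... replicaSites) {
        homeSiteByVolume.put(volume, homeSite);
        replicaSitesByVolume.put(volume, new LinkedHashSet<>(Arrays.asList(replicaSites)));
    }

    // Returns the volumes whose data is present at the given site, as home or as replica.
    Set<String> volumesAt(String site) {
        Set<String> result = new TreeSet<>();
        for (String volume : homeSiteByVolume.keySet()) {
            boolean isHome = homeSiteByVolume.get(volume).equals(site);
            boolean isReplica = replicaSitesByVolume.get(volume).contains(site);
            if (isHome || isReplica) {
                result.add(volume);
            }
        }
        return result;
    }

    // The configuration from the figure caption: four volumes with Site 1 as home site.
    static VolumeLayout figureExample() {
        VolumeLayout layout = new VolumeLayout();
        layout.addVolume("vol1", "Site 1", "Site 2", "Site 3");
        layout.addVolume("vol2", "Site 1", "Site 2");
        layout.addVolume("vol3", "Site 1", "Site 2");
        layout.addVolume("vol4", "Site 1", "Site 3");
        return layout;
    }
}
```

Calling `VolumeLayout.figureExample().volumesAt("Site 3")` yields vol1 and vol4, which matches the caption.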

Newgen Confidential
Site, Home Site, Preferred Site and Replica Site
A site is a physical location where images are stored. Each Image Archive is logically divided into one or more Image
Volumes, which can be replicated across various sites. This replication can be either automatic or manual.
• Home Site: Each Image Volume has a home site. This is where the documents of that Image Volume are added
by default.

• Preferred Site: A preferred site is the site from which the user prefers to retrieve documents. Multiple
preferred sites can be set and selected by priority, round-robin, etc.

• Replica Site: All sites other than the home site or preferred site are referred to as replica sites.

Image Server Database


The Image Server Database holds index information about sites, volumes, volume blocks and the documents stored
within each volume block. Index information about a document includes the Volume-Id and the Document-Id. This is a
centralized database, and the index information may pertain to documents available at local or remote sites. Typically, in
OmniDocs, there is a single Image Server Database for each Document Database.

Replication
Replication is the copying of documents from a home site to one or more replica sites. The Storage
Management Server is responsible for replication. Image Volumes can be configured for immediate replication or for
delayed replication.
A replica site is an alternate site to which the documents of the home site are replicated. A volume can have multiple replica
sites but only one home site.
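The difference between immediate and delayed replication can be sketched in a few lines of Java. This is an in-memory illustration under stated assumptions, not the SMS implementation: the names `ReplicationMode`, `ReplicatedVolume` and `runScheduledReplication` are hypothetical, and documents are represented only by their identifiers.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical names throughout; a volume is configured for one of two replication modes.
enum ReplicationMode { IMMEDIATE, DELAYED }

class ReplicatedVolume {
    private final ReplicationMode mode;
    private final Set<String> homeSiteDocuments = new HashSet<>();
    private final Map<String, Set<String>> replicaSiteDocuments = new HashMap<>();
    private final Queue<String> pendingReplication = new ArrayDeque<>();

    ReplicatedVolume(ReplicationMode mode, List<String> replicaSites) {
        this.mode = mode;
        for (String site : replicaSites) {
            replicaSiteDocuments.put(site, new HashSet<>());
        }
    }

    // Documents are always added at the home site first.
    void addDocument(String documentId) {
        homeSiteDocuments.add(documentId);
        if (mode == ReplicationMode.IMMEDIATE) {
            copyToReplicas(documentId);
        } else {
            pendingReplication.add(documentId); // copied later by the scheduled run
        }
    }

    // Stands in for the scheduled job that performs delayed replication.
    void runScheduledReplication() {
        String documentId;
        while ((documentId = pendingReplication.poll()) != null) {
            copyToReplicas(documentId);
        }
    }

    boolean isAvailableAt(String replicaSite, String documentId) {
        Set<String> documents = replicaSiteDocuments.get(replicaSite);
        return documents != null && documents.contains(documentId);
    }

    private void copyToReplicas(String documentId) {
        for (Set<String> documents : replicaSiteDocuments.values()) {
            documents.add(documentId);
        }
    }
}
```

In `DELAYED` mode a newly added document is visible only at the home site until `runScheduledReplication()` has run, which is why retrieval needs a fallback to the home site.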

Caching
The SMS supports caching of both read and write requests. For the write cache, volumes can be created on online
media and the volume blocks later moved to offline media. For the read cache, documents that reside on offline or
near-online media are copied to online media when they are retrieved, so that future accesses are served from online media.
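The read-cache behaviour can be illustrated with a small Java sketch. The source does not describe the cache's eviction policy, so the least-recently-used policy here is an assumption, as are the class name `ReadCache`, its methods, and the use of an in-memory map to stand in for online media.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical read cache: keeps recently retrieved documents on "online" storage (here, in memory).
class ReadCache {
    private final Function<String, byte[]> offlineReader;
    private final Map<String, byte[]> onlineCopies;
    private int offlineReadCount = 0;

    ReadCache(int capacity, Function<String, byte[]> offlineReader) {
        this.offlineReader = offlineReader;
        // Access-ordered LinkedHashMap that evicts the least recently used entry (LRU is an assumption).
        this.onlineCopies = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                return size() > capacity;
            }
        };
    }

    // Serves the document from the online copy if present; otherwise reads it
    // from offline media and keeps a copy online for future access.
    byte[] retrieve(String documentId) {
        byte[] cached = onlineCopies.get(documentId);
        if (cached != null) {
            return cached;
        }
        offlineReadCount++;
        byte[] data = offlineReader.apply(documentId);
        onlineCopies.put(documentId, data);
        return data;
    }

    int getOfflineReadCount() {
        return offlineReadCount;
    }
}
```

Retrieving the same document twice in a row touches the offline media only once; the second request is served from the online copy.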

Document Storage and Retrieval using Image Server


The Image Server is primarily responsible for effective storage and retrieval of document images. It supports distributed
storage and retrieval of documents.
The Image Server consists of a server component and a client component. User applications such as OmniDocs
Windows Desktop or OmniDocs Web Desktop interface with the Image Server client to perform document storage
and retrieval.

Document Storage
The OmniDocs document management system is built on the concepts of documents and folders. Folders can be organized
hierarchically, and documents reside inside a folder. Every folder has an Image Volume mapped to it. When a document is
added to a folder, it is added to the volume associated with that folder. Each added document gets a unique Image Server Index
(ISIndex), which is a combination of the Document-Id and the ID of the Image Volume on which it is stored. This helps to
uniquely identify the document across multiple sites.
For document storage, documents are always added to the home site and, depending upon the replication mechanism,
are replicated to the replica sites immediately or in a scheduled manner.
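The folder-to-volume mapping and the resulting ISIndex can be sketched as follows. The text only states that an ISIndex combines a Document-Id and an Image Volume ID; the record `IsIndex`, the class `FolderVolumeCatalog`, and the per-volume numbering of documents are assumptions made for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical value type combining the two parts of an ISIndex.
record IsIndex(long documentId, int volumeId) {}

class FolderVolumeCatalog {
    private final Map<String, Integer> volumeByFolder = new HashMap<>();
    private final Map<Integer, Long> lastDocumentIdByVolume = new HashMap<>();

    // Every folder has an Image Volume mapped to it.
    void mapFolder(String folderPath, int volumeId) {
        volumeByFolder.put(folderPath, volumeId);
    }

    // Adds a document to the volume mapped to the folder and returns its ISIndex.
    // Numbering documents separately within each volume is an assumption of this sketch.
    IsIndex addDocument(String folderPath) {
        Integer volumeId = volumeByFolder.get(folderPath);
        if (volumeId == null) {
            throw new IllegalArgumentException("No volume mapped to folder: " + folderPath);
        }
        long documentId = lastDocumentIdByVolume.merge(volumeId, 1L, Long::sum);
        return new IsIndex(documentId, volumeId);
    }
}
```

Because the volume ID is part of the index, two documents that happen to share a Document-Id in different volumes still have distinct ISIndex values.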

Document Retrieval
When requesting a connection, the Image Server client specifies the preferred site to which it wants to connect.
For document retrieval, the client specifies the ISIndex of the document it wants to retrieve. Whenever a request for
document retrieval is made, the Image Server checks whether the document resides on the client’s preferred site. If the
document resides on the preferred site, it is fetched from there; otherwise it is fetched from the home site of the volume. In
either case the document is accessed through a common Image Server client interface.
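The retrieval rule described above reduces to a single decision. A minimal sketch, assuming the server already knows which sites hold a copy of the document; the class and method names are hypothetical:

```java
import java.util.Set;

// Hypothetical helper that applies the retrieval rule: preferred site first, home site as fallback.
class RetrievalRouter {
    static String chooseSite(String preferredSite, String homeSite, Set<String> sitesHoldingDocument) {
        if (sitesHoldingDocument.contains(preferredSite)) {
            return preferredSite; // the document has been replicated to (or was added at) the preferred site
        }
        return homeSite; // documents are always added at the home site, so it always has a copy
    }
}
```

The fallback is safe because the home site is where every document of the volume is first added.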
Deploying Image Server

The actual Image Server deployment depends upon how the documents are organized in OmniDocs,
especially when the clients and servers span different geographical locations. In a multi-location
scenario, the central server resides at one location, and users from various geographical locations access the central
OmniDocs Server.
In such cases, the administrator can either configure the central site as the home site and replicate it at the local site, or
configure the local site as the home site and replicate it at the central site. Typically, this decision depends upon
the connectivity between the sites, the folder structure, and the likely pattern of access to the documents from a particular
location.

Central Site configured as Home Site

Document Addition
In this case, all documents added from the local location are actually added to the central server and then
subsequently replicated to the local site. Replication between the local and central storage servers requires direct TCP/IP
connectivity.

Document Retrieval
At the time of retrieval, however, data that is available locally is fetched locally to ensure optimal bandwidth usage.
When the user requests the central server to view a document, the central server checks whether the document has been
replicated to the preferred local site. If it has been replicated, the central server passes the appropriate status code, and the user
then connects to the local storage server to fetch the data. If, however, the document has not been replicated to the local
preferred site, the central server fetches the data from the central site itself and serves the user.

This configuration ensures that the user is guaranteed the document at the time of retrieval. If the replication has
happened, the user gets the document from the local site; otherwise, from the central site. It also ensures that remote
users can always retrieve the document (from the central site).
Typically, this is the recommended configuration for multi-location systems.

Local Site configured as Home Site

Document Addition
If the local site is the home site of the Image Volume, a document that is added goes to the local site, because
documents are always added to the home site. A replica site can be defined for this local site accordingly. The user can
configure the site at the central server location as the replica site. The documents are replicated to the replica site
immediately or in a scheduled manner, as defined by the user. Any number of replica sites can be defined for an Image
Volume.
The advantage of defining replicas is that document retrieval is optimized: if a document resides on a
replica site and that replica site is the user's preferred site, the document can be fetched efficiently from the
preferred site rather than from the home site.

Document Retrieval
When a document retrieval request is sent, the user's preferred site is searched first. If the document resides on the
preferred site, it is retrieved from there. If not, the document is retrieved from the central
site.
The disadvantage of this option is that documents are not available to remote sites unless they have been replicated
to the central server.

Deployment scenarios
This section describes a few multi-location deployment scenarios for OmniDocs, which make effective use of bandwidth.

The following table shows the communication protocols used between the different components of OmniDocs.

From             To            Connectivity Protocol
Desktop Client   JTS           TCP/IP
Desktop Client   SMS           TCP/IP
Web Browser      Web Server    HTTP / HTTPS
JTS              DB, IS DB     TCP/IP (JDBC)
Web Server       JTS           TCP/IP
Web Server       SMS           TCP/IP

Scenario-1:
OmniDocs Web and OmniDocs Desktop with Central Server Site and Replicated Local Site

Web server Location A - OmniDocs Web Server
Web server Location B - SMS Web Server (for retrieval of document data)
JTS – OmniDocs Document server (Java Transaction Server)
DB – JTS database
IS DB – Image Server Database
SMS – Storage Management Server

• Location A acts as the central server location, where a Web Server, JTS and the SMS for Site 1 are deployed.
• Location B is a remote location where an SMS Web Server and the SMS for Site 2 are deployed.
• TCP/IP connectivity is required between Locations A and B at the time of replication.
• Site 1 is configured as the home site for the users of Location A. For users of Location B, data is first added
to Site 1 and then replicated to Site 2 immediately or in a scheduled manner, depending upon the connectivity
available.
• For Location A users, Desktop clients connect directly to the JTS and SMS running at the central server
location, which are local to them. Web clients connect to the Web Server, which connects to the JTS and the
SMS for document addition and retrieval.
• Web clients at Location B connect to the Web Server at Location A. The document is always added to the
central site (Site 1). The document is then replicated from Site 1 to Site 2 by the
Replication Scheduler. When a web client at Location B sends a document retrieval request, the web server at the
central location checks whether the document has been replicated. If it has not been replicated,
the document is fetched from the central site. If the document has been replicated to any of the
user's preferred sites, the URL is redirected to the web server of that preferred site. In the current
scenario the URL is redirected to the SMS Web Server at Location B, which in turn fetches the document image
from the SMS running at Location B.
• Data retrieved by Location B users is served by the SMS of Site 2, which is local, resulting in effective
utilization of the corporate bandwidth.

Scenario-2:
OmniDocs Windows Desktop and OmniDocs Web with Central Server Site as Replicated Site
and Preferred Site as Home Site

Web server - OmniDocs Web Server


JTS – OmniDocs Document server (Java Transaction Server)
DB – JTS database
IS DB – Image Server Database
SMS – Storage Management Server

• Location A acts as the central server location, where a Web Server, JTS and the SMS for Site 2 are deployed.
• Location B is a remote location where a Web Server and the SMS for Site 1 are deployed. The Web Server
at Location B serves the HTTP requests of the web clients at Location B. The Web Server connects to the
JTS running at the central server location over TCP/IP.
• Permanent TCP/IP connectivity is required between Locations A and B.
• Site 1 is configured as the preferred site for the users of Location B. For users of Location B, data is first added
to Site 1 and then replicated to Site 2 immediately or in a scheduled manner.
• For Location A users, Desktop clients connect directly to the JTS and SMS running at the central server
location, which are local to them. Web clients connect to the Web Server, which connects to the JTS and the
SMS for document addition and retrieval.
• For Location B users, Desktop clients connect directly to the JTS at Location A over TCP/IP and to the SMS at
Location B. Web clients at Location B connect to the Web Server at Location B, which connects to
the JTS running at Location A and the SMS running at Location B.
• Data added and retrieved by Location B users is handled by the SMS of Site 1, which is local, resulting in
effective utilization of the corporate bandwidth.

Conclusion
In Scenario 1, documents are always added to the central site. For retrieval, bandwidth is optimized through
the concept of the preferred site: if the document is available locally, the document transfer is done locally.
For this to happen, however, a document added to the central server from a local site needs to be replicated back to the
respective local site. This is a scheduled operation, run when bandwidth requirements are low.
In Scenario 2, the document is added to the preferred site and replicated to the central site when bandwidth requirements
are low. The advantages of this approach of replicated storage and local retrieval are disaster recovery, low
bandwidth requirements and fast retrieval.
