Air transportation has become one of the most popular and convenient ways to travel. With the widespread use of airlines, flight selection and ticket booking have become very important to passengers, who need a convenient and fast information system for searching flight information and booking tickets. In this project, we design and implement a distributed flight ticket booking system. A distributed database system is well suited to flight ticket booking. First, since airports are located in various cities across the USA, flight and ticket information is easier to store and manage in a distributed database system. Second, by directing most query and update operations on flight and ticket information to the local database, a distributed database system can reduce the size of each database node without generating too much network traffic. Finally, replicating flight and ticket data across nodes can enhance the reliability and availability of the booking system.
Each node stores a part of the flight information, ticket information and passenger information of the whole distributed database system. Some data is replicated between these three airport databases to make the distributed database system more reliable. However, this high-level problem definition does not show the concrete problems clearly enough, so we list the issues we considered while approaching it.
1. Distribution Transparency: This is a desirable feature of a distributed database system. Our system should be fully transparent: users do not need to know where data is located, how relations are fragmented, or how data is replicated.
2. Database Schema: The overall database schema should clearly represent the data structure of a flight ticket booking system and allow data to be stored efficiently.
3. Communication: Communication among the nodes is the foundation of a distributed database. We build a communication layer to provide it.
4. Query Routing and Optimization: This covers how to forward a non-local operation to its target destination and, when multiple routes are available, how to choose the one with the lowest communication cost.
5. Data Fragment and Replication: Effective data fragmentation and replication policies are essential to achieve load balancing among the nodes, increased parallelism and enhanced availability. As stated above, data fragmentation and replication are fully transparent to the users.
In our current work, we have designed and implemented the features for the first four points and parts of the fifth.
Scalability: Because the agents are organized in a P2P fashion, the agent layer places no constraint on scalability. New local database nodes can be added to the system simply by configuring their routing information, and the routing information of the whole organization of agents can be reconfigured to optimize routing performance.
Routing selection and optimization: Through the pre-defined routing table, each agent chooses the route along which a non-local operation is forwarded to its destination. Based on the existing routing information, a new node can select its routes and easily join the organization, as Figure 2 illustrates.
Independence between GUI interfaces and Agent: Our implementations of the GUI and the Agent are fully independent. Different interface implementations can be used as long as they construct messages in the format the Agent expects. This allows the application's functionality to be extended incrementally.
ER model
Airport (Name, City, Addr, Tele No.)
Passenger (SSN, Name, Tele No., Gender)
Book (SSN, Flight No., Date, Type, Checkin)
Data Fragment
The partition strategy for the relation Flight: The relation Flight is horizontally partitioned. Its tuples are distributed over the five airport databases, each of which stores the tuples whose departure city is the city in which that airport database is deployed.
The partition strategy for the relation Airport: The relation Airport is horizontally partitioned. Each airport database stores the tuples that contain its own airport's information.
The partition strategy for the relation Passenger: The relation Passenger is horizontally partitioned. Each passenger tuple is stored in the database of an airport at which the passenger has booked a flight.
The partition strategy for the relation Book: The relation Book is horizontally partitioned. Each booking tuple is stored in the database of the airport in the departure city of the booked flight.
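The placement rules above can be sketched as a small lookup. This is only an illustration under stated assumptions: the class name, the method names, the lowercase node names and the way city names are normalized are all hypothetical and do not come from the report's implementation.

```java
import java.util.List;
import java.util.Locale;

/** Illustrative placement rule for the horizontally partitioned relations. */
final class FragmentPlacement {
    // Assumed node names, one per airport database (hypothetical spelling).
    static final List<String> NODES =
            List.of("atlanta", "seattle", "newyork", "chicago", "cleveland");

    /** A Flight tuple is stored on the node of its departure city. */
    static String nodeForFlight(String departureCity) {
        // Normalize "New York" to "newyork" so it matches the assumed node names.
        String node = departureCity.toLowerCase(Locale.ROOT).replace(" ", "");
        if (!NODES.contains(node)) {
            throw new IllegalArgumentException("no airport database for " + departureCity);
        }
        return node;
    }

    /** A Book tuple is stored with the flight it refers to. */
    static String nodeForBooking(String bookedFlightDepartureCity) {
        return nodeForFlight(bookedFlightDepartureCity);
    }

    public static void main(String[] args) {
        System.out.println(nodeForFlight("New York"));
        System.out.println(nodeForBooking("Seattle"));
    }
}
```

Because a booking is placed by the departure city of its flight, a booking and the seat count it updates would sit on the same node under this rule, which keeps the booking operation local to one database.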
Output: all the information about flights with the given flight number and departure date
2. Flight Information Retrieval-2
Input: from_city, to_city, date
Output: all the information about flights with the given departure city, destination city and departure date
3. Airport Information Retrieval-3
Input: airport's name
Output: information about the given airport
4. Booking Flights
Input: result of the flight information retrieval
Update: insert the booking information into the Book relation, update the number of available seats in the flight information, and insert the passenger's information into the Passenger relation if it does not exist yet.
5. Booking Information Retrieval-1
Input: passenger's SSN
Output: all the booking records of the given passenger
6. Passenger Check-in
Input: flight number, passenger's SSN
Update: change the check-in status in the passenger's booking information from not checked in to checked in.
System Implementation
Our system consists of two main modules, GUI and Agent. The GUI is mainly responsible for the interaction between users and our system. The Agent is responsible for query routing and for the interaction between our system and the database. Figure 3 gives an overview of the architecture of our system. The goal of the GUI is to receive users' requests, translate them into SQL sockets, and send them to the Agent below. It must also receive the results of the requests, translate them, and present them to the passengers with a correct and clear meaning. As the key module in the architecture of the distributed database system, the Agent works as follows: the Agent receives a database query request; the Agent then decides whether the request is an operation on the local database or should be forwarded to another Agent for a remote operation on another database; finally, the Agent receives the query results either from the local database or from another Agent. We discuss the functions of these two main modules in detail in this section.
Figure 3 Overview of the system architecture. GUI: Input, SQL Generator, SQL Statement, Destination Identify, SQL Socket (SQL+Type+Destination). Agent: Message Parser, Router, Local? (Y: Local Database; N: Other Agents).
A snapshot of GUI
GUI generates "Request" statement (labels: Input1 to Input4, Operation Type, Destination Identity, SQL Socket)
On the other hand, the GUI receives the results of requests from the Agent, translates them, and presents them to the users. Figure 6 illustrates this.
Figure 6 (labels: SQL Socket, Operation Type, Request Status, Request Result)
The GUI gets the result of a request from the Agent in the form of an SQL socket and passes it to the SQL Generator and Interpreter, which parses out the information the user needs. This information is then shown on the display panel of the GUI to the user who initiated the request. See the section on message format for the specific translation between a request or result and an SQL socket.
The position of the Agent in the whole system (labels: GUI, results, Agent, Database)
To fulfill the functional requirements of the Agent, we propose the following designs and techniques.
1. Socket programming is employed to communicate with the GUI and with other Agent peers. In particular, server sockets are used to implement the server function of the Agent.
2. Each Agent has a routing table to decide where every query request is sent.
3. Because the backend database system is SQL Server 2000, the JDBC-ODBC bridge is adopted to let the Agent interact with the database.
4. Java multithreading is used to handle concurrent requests.
The architecture of the Agent
The server module and the thread module are described below in pseudo-Java.
The Server module:

public class agent extends Thread {
    private ServerSocket listen_socket;

    public agent(int port) throws IOException {
        listen_socket = new ServerSocket(port);
        this.start();
    }

    // Accept connections forever; each client is served by its own event thread.
    public void run() {
        try {
            while (true) {
                Socket client_socket = listen_socket.accept();
                event new_client = new event(client_socket);
            }
        } catch (IOException e) {
            fail(e, "Exception while listening for connections from client");
        }
    }

    public static void main(String[] args) {
    }
}

The thread module (named as Event):

class event extends Thread {
    private Socket client;
    private DataInputStream in;
    private PrintStream out;

    public event(Socket coming_socket) {
        client = coming_socket;
        try {
            in = new DataInputStream(client.getInputStream());
            out = new PrintStream(client.getOutputStream());
        } catch (IOException e) {
            fail(e, "Exception while opening client streams");
        }
        this.start();
    }

    // Serve one request: answer it locally or forward it to the next hop.
    public void run() {
        try {
            while (true) {
                request = in.readLine();
                if (request is local operation) {
                    results = operation in local database;
                    encapsulate the results;
                    send the encapsulated results to out;
                    break;
                } else {
                    find the next hop;
                    socket = new Socket(next hop);
                    forward(request, socket);
                    encapsulated results = receive(socket, results);
                    send the encapsulated results to out;
                    break;
                }
            }
            close(client);
        } catch (IOException e) {
            fail(e, "Exception while handling a request");
        }
    }
}

The socket and JDBC-ODBC functions work well in our environment, and requests are forwarded correctly to their destinations.
Currently the message format is tentatively designed as below:
1. Encapsulated Request: %destination%operation_type%sql_statement%
For example:
%atlanta%select%select * from passenger%
%seattle%insert%insert.%
%seattle%update%update%
%seattle%delete%delete..%
2. Encapsulated Reply:
Select reply:
%select%ok%*RECORD1**RECORD2**RECORD3**RECORDn*%
RECORD: $column1$$column2$$column3$$...........$column n$
%select%error%
Insert reply:
%insert%ok%number of inserted items%
%insert%error%
Update reply:
%update%ok%number of updated items%
%update%error%
Delete reply:
%delete%ok%number of deleted items%
%delete%error%
The key point in the message format design is how to transfer a Java ResultSet object over sockets. We designed the format above for this purpose and implemented it in our system. By inspecting the fields of a message, the application can handle most outcomes of a database operation, including success and failure.
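The request and select-reply formats above can be sketched as a small encoder and decoder. This is a minimal illustration, not the report's code: the class and method names are hypothetical, and it assumes that column values are non-empty and contain none of the delimiter characters %, * and $, since the format defines no escaping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative codec for the %...% message format (names are hypothetical). */
final class MessageCodec {
    /** Builds %destination%operation_type%sql_statement%. */
    static String encodeRequest(String destination, String type, String sql) {
        return "%" + destination + "%" + type + "%" + sql + "%";
    }

    /** Returns {destination, operation_type, sql_statement}. */
    static String[] decodeRequest(String message) {
        if (message.length() < 2 || !message.startsWith("%") || !message.endsWith("%")) {
            throw new IllegalArgumentException("malformed request: " + message);
        }
        // Limit 3 keeps any further '%' inside the SQL text (e.g. in LIKE patterns).
        String[] parts = message.substring(1, message.length() - 1).split("%", 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException("malformed request: " + message);
        }
        return parts;
    }

    /** Builds %select%ok%*RECORD1**RECORD2*% with RECORD = $col1$$col2$. */
    static String encodeSelectReply(List<List<String>> rows) {
        StringBuilder sb = new StringBuilder("%select%ok%");
        for (List<String> row : rows) {
            sb.append('*');
            for (String column : row) {
                sb.append('$').append(column).append('$');
            }
            sb.append('*');
        }
        return sb.append('%').toString();
    }

    /** Parses a successful select reply back into rows of column values. */
    static List<List<String>> decodeSelectReply(String reply) {
        String prefix = "%select%ok%";
        if (!reply.startsWith(prefix) || !reply.endsWith("%")
                || reply.length() < prefix.length() + 1) {
            throw new IllegalArgumentException("not a successful select reply: " + reply);
        }
        String payload = reply.substring(prefix.length(), reply.length() - 1);
        List<List<String>> rows = new ArrayList<>();
        if (payload.isEmpty()) {
            return rows; // no records
        }
        // Drop the outer '*' pair, then split records on the "**" between them.
        String inner = payload.substring(1, payload.length() - 1);
        for (String record : inner.split("\\*\\*")) {
            if (record.length() < 2) {
                throw new IllegalArgumentException("malformed record: " + record);
            }
            // Drop the outer '$' pair, then split columns on the "$$" between them.
            String columns = record.substring(1, record.length() - 1);
            rows.add(Arrays.asList(columns.split("\\$\\$")));
        }
        return rows;
    }

    public static void main(String[] args) {
        String request = encodeRequest("atlanta", "select", "select * from passenger");
        System.out.println(request);
        System.out.println(Arrays.toString(decodeRequest(request)));
    }
}
```

The sketch makes one limitation of the format visible: a column value that contains `$$` or `**` would be split in the wrong place, so a production version would need escaping or length-prefixed fields.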
6. System Test
Because of hardware limitations, we deploy the five systems on two virtual machines in the VMware station. This is equivalent to deploying the five nodes on five different machines in the network, because each agent is identified by its IP address and port and all communication between agents is based on sockets. The testing environment for our system is illustrated in Figure 9.
Figure 9 The testing environment
Here are two scenarios that illustrate how the system works with different routing information:
1. In Figure 10, Atlanta can reach the node Seattle directly, so a request from Atlanta goes straight to Seattle.
2. In Figure 11, the network connection between Atlanta and Seattle is not available, so Atlanta uses other nodes, New York and Cleveland, as routers to forward the request to the destination Seattle.
Routing tables of newyork and atlanta (columns: Dest, Next Hop), with the request and reply paths
Figure 11 Scenario with an optimized routing table
In the testing phase, our system was easy to configure and ran stably in the environment.
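The hop-by-hop forwarding in the two scenarios can be sketched with per-node routing tables that map a destination to a next hop. The table entries below are reconstructed from the textual description of the scenarios, not copied from the figures, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative next-hop routing over per-node tables (names are hypothetical). */
final class RoutingSketch {
    // node -> (destination -> next hop)
    private final Map<String, Map<String, String>> tables = new HashMap<>();

    void addRoute(String node, String destination, String nextHop) {
        tables.computeIfAbsent(node, k -> new HashMap<>()).put(destination, nextHop);
    }

    /** Follows next-hop entries from source until the destination is reached. */
    List<String> trace(String source, String destination) {
        List<String> path = new ArrayList<>();
        String current = source;
        path.add(current);
        while (!current.equals(destination)) {
            Map<String, String> table =
                    tables.getOrDefault(current, Collections.emptyMap());
            String next = table.get(destination);
            // Stop on a missing entry or a routing loop instead of spinning forever.
            if (next == null || path.contains(next)) {
                throw new IllegalStateException(
                        "no loop-free route from " + current + " to " + destination);
            }
            current = next;
            path.add(current);
        }
        return path;
    }

    public static void main(String[] args) {
        // Direct scenario: Atlanta reaches Seattle in one hop.
        RoutingSketch direct = new RoutingSketch();
        direct.addRoute("atlanta", "seattle", "seattle");
        System.out.println(direct.trace("atlanta", "seattle"));

        // Detour scenario: the Atlanta-Seattle link is down.
        RoutingSketch detour = new RoutingSketch();
        detour.addRoute("atlanta", "seattle", "newyork");
        detour.addRoute("newyork", "seattle", "cleveland");
        detour.addRoute("cleveland", "seattle", "seattle");
        System.out.println(detour.trace("atlanta", "seattle"));
    }
}
```

In the real system each Agent would hold only its own table and forward the request over a socket. The sketch keeps all tables in one object only so that the full path can be printed.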
7. Future work
The current distributed flight ticket booking system is a prototype. We adopt routing selection and optimization policies as well as basic data fragmentation and replication. One open issue is how, as the number of local nodes increases, to represent routing-related information such as traffic information dynamically, and how to optimize routing in such complex scenarios. Another issue is concurrency control for replicated data. Simple protocols such as primary copy can be applied to the system. Finally, we want to reduce the system's vulnerability by introducing strategies for dealing with failed nodes and lost communication.
Acknowledgement
We thank Prof. Shamkant Navathe for his insightful suggestions about scoping the project and for his helpful comments on our proposal and mid-term report.