A database-management system (DBMS) is a collection of interrelated data and a set of programs to access those data. The collection of data, usually referred to as the database, contains information relevant to an enterprise. The primary goal of a DBMS is to provide a way to store and retrieve database information that is both convenient and efficient. Database systems are designed to manage large bodies of information. Management of data involves both defining structures for storage of information and providing mechanisms for the manipulation of information. In addition, the database system must ensure the safety of the information stored, despite system crashes or attempts at unauthorized access. If data are to be shared among several users, the system must avoid possible anomalous results. Because information is so important in most organizations, computer scientists have developed a large body of concepts and techniques for managing data.
View of Data
A database system is a collection of interrelated files and a set of programs that allow users to access and modify these files. A major purpose of a database system is to provide users with an abstract view of the data. That is, the system hides certain details of how the data are stored and maintained.
Although the simple structures at the logical level may involve complex physical-level structures, the user of the logical level does not need to be aware of this complexity. Database administrators, who must decide what information to keep in the database, use the logical level of abstraction.
Logical level: describes the data stored in the database and the relationships among the data.

    type instructor = record
        ID : string;
        name : string;
        dept_name : string;
        salary : integer;
    end;

View level: application programs hide details of data types. Views can also hide information (such as an employee's salary) for security purposes.
View level. The highest level of abstraction describes only part of the entire database. Even though the logical level uses simpler structures, complexity remains because of the variety of information stored in a large database. Many users of the database system do not need all this information; instead, they need to access only a part of the database. The view level of abstraction exists to simplify their interaction with the system. The system may provide many views for the same database.
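The logical and view levels described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed implementation: the view name instructor_public and the sample rows are assumptions made for the example, while the instructor fields follow the record shown earlier.

```python
import sqlite3

# Hypothetical in-memory database; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")

# Logical level: the instructor record from the text, expressed as a table.
conn.execute(
    "CREATE TABLE instructor ("
    " ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [
        ("10101", "Alice", "Comp. Sci.", 65000),
        ("12121", "Bob", "Finance", 90000),
    ],
)

# View level: expose only part of the data, hiding the salary column.
# The name instructor_public is an assumption, not taken from the source.
conn.execute(
    "CREATE VIEW instructor_public AS"
    " SELECT ID, name, dept_name FROM instructor"
)

cursor = conn.execute("SELECT * FROM instructor_public ORDER BY ID")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()

print(view_columns)
print(view_rows)
```

A user who is given access only to the view can read names and departments but never sees the salary column, which is the security use of views mentioned above.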
Database systems have several schemas, partitioned according to the levels of abstraction. The physical schema describes the database design at the physical level, while the logical schema describes the database design at the logical level. A database may also have several schemas at the view level, sometimes called subschemas, that describe different views of the database. Of these, the logical schema is by far the most important, in terms of its effect on application programs, since programmers construct applications by using the logical schema. The physical schema is hidden beneath the logical schema, and can usually be changed easily without affecting application programs. Application programs are said to exhibit physical data independence if they do not depend on the physical schema, and thus need not be rewritten if the physical schema changes.
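Physical data independence can be illustrated with a small sketch using Python's built-in sqlite3 module. Adding an index stands in here for a change to the physical schema; the table, index name, and rows are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-1", 500), ("A-2", 700), ("A-3", 900)],  # hypothetical rows
)

# The "application program": it is written against the logical schema only.
QUERY = (
    "SELECT account_number FROM account"
    " WHERE balance > 600 ORDER BY account_number"
)


def run_application_query(connection):
    """Run the unchanged application query and return its rows."""
    return connection.execute(QUERY).fetchall()


before = run_application_query(conn)

# Physical schema change: add an index. The logical schema is untouched.
conn.execute("CREATE INDEX idx_account_balance ON account (balance)")

after = run_application_query(conn)

print(before)
print(before == after)
```

The query text is identical before and after the index is created, and so are its results; only the way the system may choose to reach the rows can change.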
A data model is a collection of tools for describing:
  Data
  Data relationships
  Data semantics
  Data constraints
Relational model
Entity-Relationship data model (mainly for database design)
Object-based data models (object-oriented and object-relational)
Semistructured data model (XML)
Other, older models
Database Languages
A database system provides a data definition language to specify the database schema and a data manipulation language to express database queries and updates. In practice, the data definition and data manipulation languages are not two separate languages; instead they simply form parts of a single database language, such as the widely used SQL language.
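As a rough illustration of how data definition and data manipulation sit side by side in one language, the following sketch issues both kinds of SQL statement through Python's built-in sqlite3 module. The department table and its values are assumptions for the example, and sqlite_master is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: specify the schema.
conn.execute(
    "CREATE TABLE department (dept_name TEXT PRIMARY KEY, budget INTEGER)"
)

# Data manipulation: insert, update, and query the data.
conn.execute("INSERT INTO department VALUES ('Physics', 70000)")
conn.execute(
    "UPDATE department SET budget = budget + 5000"
    " WHERE dept_name = 'Physics'"
)
budget = conn.execute(
    "SELECT budget FROM department WHERE dept_name = 'Physics'"
).fetchone()[0]

# SQLite records the schema produced by the definition statement
# in its own catalog table, sqlite_master.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
]

print(budget)
print(tables)
```

Both kinds of statement go through the same interface, which reflects the point that they are parts of a single language rather than two separate ones.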
Data manipulation includes the modification of information stored in the database.

A data-manipulation language (DML) is a language that enables users to access or manipulate data as organized by the appropriate data model. There are basically two types:

Procedural DMLs require a user to specify what data are needed and how to get those data.
Declarative DMLs (also referred to as nonprocedural DMLs) require a user to specify what data are needed without specifying how to get those data.

Declarative DMLs are usually easier to learn and use than procedural DMLs. However, since a user does not have to specify how to get the data, the database system has to figure out an efficient means of accessing data. The DML component of the SQL language is nonprocedural.

A query is a statement requesting the retrieval of information. The portion of a DML that involves information retrieval is called a query language. Although technically incorrect, it is common practice to use the terms query language and data-manipulation language synonymously.

This query in the SQL language finds the name of the customer whose customer-id is 192-83-7465:

    select customer.customer-name
    from customer
    where customer.customer-id = '192-83-7465'

The query specifies that those rows from the table customer where the customer-id is 192-83-7465 must be retrieved, and the customer-name attribute of these rows must be displayed. If the query were run on the table in Figure 1.3, the name Johnson would be displayed.

Queries may involve information from more than one table. For instance, the following query finds the balance of all accounts owned by the customer with customer-id 192-83-7465:

    select account.balance
    from depositor, account
    where depositor.customer-id = '192-83-7465'
      and depositor.account-number = account.account-number
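The two queries can be tried out with Python's built-in sqlite3 module, as in the sketch below. Several things here are assumptions: the identifiers use underscores instead of hyphens so that SQLite accepts them, and every row except the pairing of 192-83-7465 with Johnson is invented for illustration. The final loop answers the second query procedurally, to contrast with the declarative form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id TEXT, customer_name TEXT);
CREATE TABLE account (account_number TEXT, balance INTEGER);
CREATE TABLE depositor (customer_id TEXT, account_number TEXT);
-- Hypothetical sample rows (only the Johnson row mirrors the text).
INSERT INTO customer VALUES ('192-83-7465', 'Johnson'), ('019-28-3746', 'Smith');
INSERT INTO account VALUES ('A-101', 500), ('A-201', 900), ('A-215', 700);
INSERT INTO depositor VALUES ('192-83-7465', 'A-101'),
                             ('192-83-7465', 'A-201'),
                             ('019-28-3746', 'A-215');
""")

TARGET = "192-83-7465"

# Declarative: state what is wanted; the system decides how to get it.
names = [
    row[0]
    for row in conn.execute(
        "SELECT customer.customer_name FROM customer"
        " WHERE customer.customer_id = ?",
        (TARGET,),
    )
]
balances = sorted(
    row[0]
    for row in conn.execute(
        "SELECT account.balance FROM depositor, account"
        " WHERE depositor.customer_id = ?"
        " AND depositor.account_number = account.account_number",
        (TARGET,),
    )
)

# Procedural: spell out how to find the same balances, step by step.
depositor_rows = conn.execute(
    "SELECT customer_id, account_number FROM depositor"
).fetchall()
account_rows = conn.execute(
    "SELECT account_number, balance FROM account"
).fetchall()

procedural_balances = []
for customer_id, owned_account in depositor_rows:
    if customer_id != TARGET:
        continue
    for account_number, balance in account_rows:
        if account_number == owned_account:
            procedural_balances.append(balance)
procedural_balances.sort()

print(names)
print(balances)
print(procedural_balances)
```

The nested loop fixes one particular access strategy in the program itself, whereas the SQL version leaves that choice to the database system.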
for each user. Relational database design, the design of the relational schema, is the first step in building a database application.
Since we have studied only the E-R model so far, we shall use it to develop the conceptual schema. Stated in terms of the E-R model, the schema specifies all entity sets, relationship sets, attributes, and mapping constraints. The designer reviews the schema to confirm that all data requirements are indeed satisfied and are not in conflict with one another.

A fully developed conceptual schema will also indicate the functional requirements of the enterprise. In a specification of functional requirements, users describe the kinds of operations (or transactions) that will be performed on the data. Example operations include modifying or updating data, searching for and retrieving specific data, and deleting data. At this stage of conceptual design, the designer can review the schema to ensure it meets functional requirements.

The process of moving from an abstract data model to the implementation of the database proceeds in two final design phases. In the logical-design phase, the designer maps the high-level conceptual schema onto the implementation data model of the database system that will be used. The designer uses the resulting system-specific database schema in the subsequent physical-design phase, in which the physical features of the database are specified. These features include the form of file organization and the internal storage structures.
The process of designing the general structure of the database:

Logical Design: deciding on the database schema. Database design requires that we find a good collection of relation schemas.
  Business decision: What attributes should we record in the database?
  Computer Science decision: What relation schemas should we have, and how should the attributes be distributed among the various relation schemas?
Physical Design: deciding on the physical layout of the database.
Normalization Theory:
  Formalize what designs are bad, and test for them.
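One formal tool that normalization theory uses to test designs is the functional dependency, which the notes above do not name, so it is introduced here as an assumption. The sketch below checks whether a dependency holds on a given set of rows; the function name fd_holds, the attribute names, and the rows are all hypothetical. It tests a single instance only and is not a normalization algorithm.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if, in these rows, equal lhs values imply equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[attribute] for attribute in lhs)
        value = tuple(row[attribute] for attribute in rhs)
        # setdefault stores the first value seen for a key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# A single wide relation that repeats the building for every instructor
# in the same department (hypothetical data).
consistent_rows = [
    {"name": "Alice", "dept_name": "Physics", "building": "Watson"},
    {"name": "Bob", "dept_name": "Physics", "building": "Watson"},
    {"name": "Carol", "dept_name": "Finance", "building": "Painter"},
]

# The same relation after one copy of the repeated fact was updated
# and the other was not.
inconsistent_rows = [
    {"name": "Alice", "dept_name": "Physics", "building": "Watson"},
    {"name": "Bob", "dept_name": "Physics", "building": "Taylor"},
    {"name": "Carol", "dept_name": "Finance", "building": "Painter"},
]

holds_before = fd_holds(consistent_rows, ["dept_name"], ["building"])
holds_after = fd_holds(inconsistent_rows, ["dept_name"], ["building"])

print(holds_before)
print(holds_after)
```

Because the building is stored once per instructor rather than once per department, a partial update can leave the data contradictory. This is the kind of flaw that a better distribution of attributes among relation schemas avoids.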
Entity-Relationship Model
  Models an enterprise as a collection of entities and relationships.
  Entity: a thing or object in the enterprise that is distinguishable from other objects. It is described by a set of attributes.
  Relationship: an association among several entities.
  Represented diagrammatically by an entity-relationship diagram.
The entity-relationship (E-R) data model perceives the real world as consisting of basic objects, called entities, and relationships among these objects. It was developed to facilitate database design by allowing specification of an enterprise schema, which represents the overall logical structure of a database. The E-R data model is one of several semantic data models; the semantic aspect of the model lies in its representation of the meaning of the data. The E-R model is very useful in mapping the meanings and interactions of real-world enterprises onto a conceptual schema. Because of this usefulness, many database-design tools draw on concepts from the E-R model. The E-R data model employs three basic notions: entity sets, relationship sets, and attributes.
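The three notions can be sketched in plain Python, purely as an illustration and not as a design method. The class names Customer, Account, and Depositor, together with all sample values, are assumptions for the example. Entities are objects described by attributes, an entity set is a set of such objects, and a relationship set is a set of associations among them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """An entity described by a set of attributes."""
    customer_id: str
    customer_name: str


@dataclass(frozen=True)
class Account:
    """A second kind of entity."""
    account_number: str
    balance: int


@dataclass(frozen=True)
class Depositor:
    """A relationship: an association between a customer and an account."""
    customer: Customer
    account: Account


# Entity sets (hypothetical members).
johnson = Customer("192-83-7465", "Johnson")
smith = Customer("019-28-3746", "Smith")
a101 = Account("A-101", 500)
a215 = Account("A-215", 700)
customers = {johnson, smith}
accounts = {a101, a215}

# Relationship set.
depositors = {Depositor(johnson, a101), Depositor(smith, a215)}

# Follow the relationship set to find the accounts associated with Johnson.
johnson_accounts = sorted(
    link.account.account_number
    for link in depositors
    if link.customer == johnson
)

print(johnson_accounts)
```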
9. Write a brief note on data storage and querying.

Storage manager: a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system. The storage manager is responsible for the following tasks:
  Interaction with the file manager
  Efficient storing, retrieving, and updating of data
Issues:
  Storage access
  File organization
  Indexing and hashing
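To give a feel for the indexing and hashing issue, here is a toy sketch in Python. It is not how a real storage manager is built: the records live in an in-memory list, a dictionary plays the role of a hash index, and every name and row is hypothetical.

```python
from collections import defaultdict

# Stored records: (ID, name, dept_name). Hypothetical data.
records = [
    ("10101", "Alice", "Comp. Sci."),
    ("12121", "Bob", "Finance"),
    ("15151", "Carol", "Comp. Sci."),
    ("22222", "Dave", "Physics"),
]
DEPT_COLUMN = 2


def sequential_search(stored, column, key):
    """Examine every record: the cost grows with the size of the file."""
    return [record for record in stored if record[column] == key]


def build_hash_index(stored, column):
    """Map each value of the column to the positions of matching records."""
    index = defaultdict(list)
    for position, record in enumerate(stored):
        index[record[column]].append(position)
    return index


def indexed_search(stored, index, key):
    """Hash the key and jump straight to the matching positions."""
    return [stored[position] for position in index.get(key, [])]


dept_index = build_hash_index(records, DEPT_COLUMN)

scan_result = sequential_search(records, DEPT_COLUMN, "Comp. Sci.")
index_result = indexed_search(records, dept_index, "Comp. Sci.")

print(scan_result == index_result)
print(index_result)
```

Both searches return the same records. The index avoids reading every record on each lookup, at the price of extra storage and of keeping the index up to date when records change.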
Query Processing:

Alternative ways of evaluating a given query:
  Equivalent expressions.
  Different algorithms for each operation.
The cost difference between a good and a bad way of evaluating a query can be enormous.
The system needs to estimate the cost of operations.
  This depends critically on statistical information about relations, which the database must maintain.
  Statistics must also be estimated for intermediate results in order to compute the cost of complex expressions.

10. Write a brief note on database architecture.
The architecture of a database system is greatly influenced by the underlying computer system on which it runs, in particular by such aspects of computer architecture as networking, parallelism, and distribution:

Centralized database systems are those that run on a single computer system and do not interact with other computer systems. Such database systems span a range from single-user database systems running on personal computers to high-performance database systems running on high-end server systems. The CPUs have local cache memories that store local copies of parts of the memory, to speed up access to data. Each device controller is in charge of a specific type of device (for example, a disk drive, an audio device, or a video display). The CPUs and the device controllers can execute concurrently, competing for memory access. Cache memory reduces the contention for memory access, since it reduces the number of times that the CPU needs to access the shared memory.

Networking of computers allows some tasks to be executed on a server system, and some tasks to be executed on client systems. This division of work has led to client-server database systems.

Parallel processing within a computer system allows database-system activities to be sped up, allowing faster response to transactions, as well as more transactions per second. Queries can be processed in a way that exploits the parallelism offered by the underlying computer system. The need for parallel query processing has led to parallel database systems.

Distributing data across sites or departments in an organization allows those data to reside where they are generated or most needed, but still to be accessible from other sites and from other departments. Keeping multiple copies of the database across different sites also allows large organizations to continue their database operations even when one site is affected by a natural disaster, such as flood, fire, or earthquake. Distributed database systems handle geographically or administratively distributed data spread across multiple database systems.
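The idea behind parallel query processing can be sketched as a partitioned aggregate: split the data, compute partial results concurrently, then combine them. The sketch below uses Python's concurrent.futures and made-up balances. It shows only the structure of the computation; threads in standard CPython do not necessarily run such work simultaneously, so no speedup is claimed.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical account balances to be summed.
balances = list(range(1, 101))
WORKERS = 4


def partition(values, parts):
    """Split the values into `parts` roughly equal partitions."""
    return [values[start::parts] for start in range(parts)]


def partial_sum(chunk):
    """The work done independently on each partition."""
    return sum(chunk)


# Evaluate the aggregate on each partition, then combine the partial results.
with ThreadPoolExecutor(max_workers=WORKERS) as pool:
    partial_results = list(pool.map(partial_sum, partition(balances, WORKERS)))

total = sum(partial_results)

print(partial_results)
print(total)
```

In a distributed setting the partitions would typically live at different sites, with only the small partial results sent over the network to be combined.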