When looking for an individual record, it is convenient to identify the record with a key based on the record's content (e.g., the Ames record).
A key is an expression derived from one or more of the fields within a record that can be used to locate that record.
A canonical form is a standard form for a key that can be derived by applying well-defined rules.
A primary key is a key that uniquely identifies each record and should be unchanging.
Records can also be searched by a secondary key, which typically does not identify a record uniquely.
Record Access: Sequential Search
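As a minimal sketch of the canonical form of a key defined above, the function below reduces different spellings of a name to one standard key. The rule it applies (upper case, no surrounding blanks, last name first) and the name canonical_key are assumptions chosen for illustration; the notes do not prescribe a particular rule.

```python
def canonical_key(last_name: str, first_name: str) -> str:
    """Build a key in one standard form: upper case, no surrounding
    blanks, last name first, the two parts joined by a single space."""
    parts = [last_name.strip().upper(), first_name.strip().upper()]
    return " ".join(parts)


# Two spellings of the same name produce the same key.
key_a = canonical_key("Ames", "Mary")
key_b = canonical_key("  ames ", "MARY")
print(key_a == key_b)
```

Applying the same rule when a record is stored and when it is looked up is what lets a search for "ames" find the Ames record.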
Evaluating Performance of Sequential Search.
Sequential search is a method of searching a file by reading the file from the beginning and continuing until the desired record has been found.
Sequential access means reading the file from the beginning and continuing until you have read in everything that you need.
Record Access: Sequential Search
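The sequential search defined above might look like the following sketch, which also counts how many records were read as a rough measure of the work done. The file layout (one record per line, fields separated by "|", key in the first field) and the function name sequential_search are assumptions for illustration; an in-memory stream stands in for a real file.

```python
import io


def sequential_search(stream, wanted_key):
    """Read records from the start until the key matches.

    Returns (fields, records_read); fields is None when no record matches.
    """
    records_read = 0
    for line in stream:
        records_read += 1
        fields = line.rstrip("\n").split("|")
        if fields[0].strip().upper() == wanted_key:
            return fields, records_read
    return None, records_read


data = io.StringIO("AMES|Mary|123 Maple\nMASON|Alan|90 Eastgate\nZEE|Ann|1 Elm\n")
record, reads = sequential_search(data, "MASON")
print(record, reads)
```

A search for a key that is not in the file reads every record, which is the worst case for this method.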
Improving Sequential Search Performance with Record Blocking
A block is a collection of records stored as a physically contiguous unit on secondary storage.
Record Access: Sequential Search
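To make the effect of record blocking concrete, here is a hedged sketch in which each read call fetches a whole block of fixed-length records and the records inside the block are then scanned in memory. The record size, the blocking factor, and the helper names are assumptions; the in-memory stream only imitates a file, so the count of read calls stands in for the number of disk accesses.

```python
import io

RECORD_SIZE = 16        # bytes per record (assumed for illustration)
RECORDS_PER_BLOCK = 4   # blocking factor (assumed)


def make_record(key: bytes) -> bytes:
    """Pad 'key|data' with blanks to the fixed record size."""
    return (key + b"|data").ljust(RECORD_SIZE, b" ")


def search_blocked(stream, wanted_key: bytes):
    """Sequential search that fetches one block per read call.

    Returns (record or None, number of non-empty read calls).
    """
    block_size = RECORD_SIZE * RECORDS_PER_BLOCK
    read_calls = 0
    while True:
        block = stream.read(block_size)
        if not block:
            return None, read_calls
        read_calls += 1
        # Scan the records of this block in memory.
        for start in range(0, len(block), RECORD_SIZE):
            record = block[start:start + RECORD_SIZE]
            if record.split(b"|")[0] == wanted_key:
                return record, read_calls


keys = [b"K%02d" % i for i in range(10)]
stream = io.BytesIO(b"".join(make_record(k) for k in keys))
record, calls = search_blocked(stream, b"K09")
print(record, calls)
```

Finding the tenth record takes three block reads here, where reading one record at a time would take ten. The number of comparisons is unchanged; only the number of accesses to storage drops.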
When Sequential Searching is Good
ASCII files in which you are searching for some pattern (wc and grep);
Files with few records;
Files that hardly ever need to be searched (tape files); and
Files in which you want all records with a certain secondary key value, where a large number of matches is expected.
Unix Tools for Sequential Processing
The most common file structure in Unix is an ASCII file with the new-line character as the record delimiter and, when possible, white space as the field delimiter.
Sample Unix Tools
cat
wc (word count)
grep (global regular expression print)
Record Access: Direct Access
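The Unix text-file convention described above (a new-line ends each record, white space separates fields) can be sketched in Python. The functions wc and grep below are simplified stand-ins written for illustration, not the real tools, and they cover only a small part of what the real commands do.

```python
import io
import re


def wc(stream):
    """Return (lines, words, characters) for a text stream."""
    lines = words = chars = 0
    for line in stream:
        lines += 1
        words += len(line.split())   # fields separated by white space
        chars += len(line)
    return lines, words, chars


def grep(pattern, stream):
    """Return the lines (records) that contain a match for pattern."""
    regex = re.compile(pattern)
    return [line.rstrip("\n") for line in stream if regex.search(line)]


text = "Ames Mary Stillwater\nMason Alan Ada\n"
print(wc(io.StringIO(text)))
print(grep("^Mason", io.StringIO(text)))
```

Both functions read the stream once from start to finish, which is why tools of this kind are a natural fit for sequential processing.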
Direct access is jumping to the exact location of a record.
How do we know where the beginning of the required record is?
It may be in an index.
We may know the relative record number (RRN).
Record Access: Direct Access
The RRN is an index giving the position of a record relative to the beginning of its file.
Direct access to a fixed-length record is usually accomplished by using its relative record number (RRN), computing its byte offset, and then seeking to the first byte of the record.
RRNs are not useful with variable-length records: the byte offset cannot be computed from the RRN, so access is still sequential. They are useful with fixed-length records.
More Record Structures
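Direct access by RRN, as described above for fixed-length records, comes down to one multiplication and one seek. The sketch below assumes records are numbered from 0 and that the file holds nothing but records; RECORD_SIZE and read_by_rrn are names chosen for illustration.

```python
import io

RECORD_SIZE = 32  # bytes per fixed-length record (assumed)


def read_by_rrn(stream, rrn: int) -> bytes:
    """Seek straight to record number rrn (counting from 0) and read it."""
    byte_offset = rrn * RECORD_SIZE
    stream.seek(byte_offset)
    record = stream.read(RECORD_SIZE)
    if len(record) < RECORD_SIZE:
        raise IndexError(f"no record with RRN {rrn}")
    return record


names = [b"Ames", b"Mason", b"Zee"]
stream = io.BytesIO(b"".join(n.ljust(RECORD_SIZE, b" ") for n in names))
third = read_by_rrn(stream, 2)
print(third.rstrip())
```

No earlier record is read on the way to the requested one. With variable-length records the multiplication has nothing fixed to multiply by, which is why the RRN alone does not help there.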
Choosing a Record Structure and Record Length
Within a fixed-length record, there are two approaches:
Fixed-length fields in the record
Varying field boundaries within the fixed-length record
Header records are often used at the beginning of a file to hold general information about the file that assists in its future use.
File Access and File Organization
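The header record mentioned above can be sketched as a small fixed-size structure at the start of the file. The layout used here (a 4-byte tag, the record size, and the record count) is hypothetical, as are the tag value RECF and the function names; real formats choose their own header contents.

```python
import io
import struct

# Hypothetical header: 4-byte tag, record size (2 bytes), record count
# (4 bytes), all stored in big-endian order with no padding.
HEADER = struct.Struct(">4sHI")


def write_file(stream, records, record_size):
    """Write the header, then each record padded to record_size."""
    stream.write(HEADER.pack(b"RECF", record_size, len(records)))
    for rec in records:
        stream.write(rec.ljust(record_size, b" "))


def read_header(stream):
    """Return (record_size, record_count) from the start of the file."""
    stream.seek(0)
    tag, record_size, count = HEADER.unpack(stream.read(HEADER.size))
    if tag != b"RECF":
        raise ValueError("not a record file")
    return record_size, count


def read_record(stream, rrn):
    """Use the header to locate record rrn (counting from 0)."""
    record_size, count = read_header(stream)
    if not 0 <= rrn < count:
        raise IndexError(rrn)
    stream.seek(HEADER.size + rrn * record_size)
    return stream.read(record_size)


f = io.BytesIO()
write_file(f, [b"Ames", b"Mason"], record_size=16)
print(read_header(f), read_record(f, 1).rstrip())
```

Because the record size travels with the file, a program reading it does not need that value hard-coded, and the byte offset of a record simply includes the size of the header.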
File organization depends on what use you want to make of the file. Since using a file implies accessing it, file access and file organization are intimately linked.
Example: although fixed-length records make direct access easier, they are not a good solution if the documents vary widely in length. The application determines our choice of both access and organization.
File Access and File Organization
A file access method is an approach used to locate information in a file. The two main ways are sequential access and direct access.
A file organization method is the combination of conceptual and physical structures used to distinguish one record from another and one field from another.
Beyond Record Structure
Abstract Data Models for File Access
Headers and Self-Describing Files
Metadata
Color Raster Images
Mixing Object Types in One File
Representation-Independent File Access
Extensibility
Portability and Standardization
Portability is the characteristic of files that describes
how amenable they are to access on a variety of different machines.
Factors Affecting Portability
Differences among Operating Systems
Differences among Languages
Differences in Machine Architectures
Portability and Standardization
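One well-known difference in machine architectures is byte order: the same integer is laid out in memory differently on little-endian and big-endian machines. The sketch below uses Python's struct module to show both layouts and what happens when a file written in one order is read in the other. The value 1025 is arbitrary, and naming byte order as the example is a choice made here for illustration, not a claim about which difference the notes had in mind.

```python
import struct

value = 1025  # 0x00000401

little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first

print(little)  # b'\x01\x04\x00\x00'
print(big)     # b'\x00\x00\x04\x01'

# Reading big-endian bytes as if they were little-endian gives a
# different number, which is how a non-portable file gets misread.
misread = struct.unpack("<I", big)[0]
print(misread)  # 17039360

# Fixing one byte order in the file format avoids the problem:
# every reader unpacks with the same explicit order.
portable = struct.unpack(">I", big)[0]
print(portable)  # 1025
```

Writing an explicit order such as ">" into every pack and unpack call is one simple way to settle on a single binary encoding regardless of the machine.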
Guidelines in Achieving Portability
Agree on a Standard Physical Record Format and Stay with It
Agree on a Standard Binary Encoding for Data Elements
Number and Text Conversion
File Structure Conversion
File System Differences
Unix and Portability
The End