
SYBASE RFI (RECOVERY FAULT ISOLATION)

When the dataserver is restarted, Sybase ASE recovers every database by
rolling transactions back or forward, which brings the database to a
consistent state before it comes online.
During normal ASE operation, every change to a database is written first to
the syslogs table and then to the data pages in the data cache. Log pages are
written to disk when a transaction commits, and the checkpoint process
eventually flushes changed data pages to disk. Because a checkpoint writes
all changed pages to disk, changes can reach the log or data pages on disk
even when they belong to an incomplete or uncommitted transaction. If the
dataserver crashes after an uncommitted transaction is written to the log
but before the transaction completes, recovery at startup reads the log and
rolls those changes back, so that no uncommitted changes remain in the
database. Likewise, recovery rolls forward any changes recorded in the log
for committed transactions that had not yet been flushed to disk: it applies
them to the data pages and writes the pages to disk.
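The roll forward and rollback behaviour described above can be illustrated
with a toy model. The sketch below is Python, not ASE code: the LogRecord
type, the recover function and the page values are hypothetical names used
only for illustration, and it assumes that no two transactions change the
same page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    txn: str     # transaction that made the change
    page: int    # page number that was changed
    before: str  # page content before the change
    after: str   # page content after the change


def recover(disk, log, committed):
    """Return the page contents after a simplified startup recovery."""
    pages = dict(disk)
    # Roll forward: reapply logged changes of committed transactions.
    for rec in log:
        if rec.txn in committed:
            pages[rec.page] = rec.after
    # Roll back: undo changes of uncommitted transactions, newest first.
    for rec in reversed(log):
        if rec.txn not in committed:
            pages[rec.page] = rec.before
    return pages


# Page 1: committed change never reached disk.
# Page 2: uncommitted change was flushed to disk by a checkpoint.
disk = {1: "old", 2: "dirty"}
log = [
    LogRecord("T1", 1, "old", "new"),
    LogRecord("T2", 2, "orig", "dirty"),
]
print(recover(disk, log, committed={"T1"}))  # {1: 'new', 2: 'orig'}
```

Only T1 committed, so its change is rolled forward onto page 1, while the
flushed change of T2 on page 2 is rolled back.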
In prior versions of ASE, partial recovery of a database was not possible. If
recovery failed because of corruption, there was no way to recover the
uncorrupted portion of the database and bring it online. The only options
were to recover from backups or to “suicide” the log.
Recovering from backups has the drawback that the database cannot be
recovered up to the minute, since transaction log backups are typically taken
only every 5 to 15 minutes. The obvious drawback of suiciding the log is the
possibility of physical and logical data corruption. The corruption may not
surface until later, and its relation to the earlier log suicide is not
always obvious.

Sybase ASE now implements Recovery Fault Isolation (RFI), an online recovery
feature that provides for partial recovery of a database. RFI can isolate
corruption encountered during recovery to the corrupt pages. This makes it
possible to restore the database to a consistent state by isolating and
repairing corruption page by page (and hence object by object), without
going back to database backups or suiciding the log. RFI can be used only
when the corruption is in non-system objects. If system tables are corrupt,
the entire database has to be restored from backups.
RFI allows the DBA to select the granularity of recovery for each database.
The DBA can:
1) Mark the entire database suspect on any recovery failure (default
behavior).
2) Set a threshold for the number of pages that may be offline, and choose
whether the database is updateable or read-only while pages are offline.
3) Set up the database to be marked suspect on any recovery failure, so that
the corruption can be fixed before the database is opened to all users.
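How the granularity setting and the page threshold interact can be sketched
as a small decision function. This is an illustrative Python model, not ASE
code: the function name, the status strings and the page numbers are
hypothetical, and the threshold is passed in explicitly rather than assuming
any default value.

```python
def recovery_outcome(granularity, suspect_pages, threshold):
    """Model what recovery does with a database that has suspect pages.

    granularity is "database" or "page"; suspect_pages lists the page
    numbers that failed recovery; threshold is the number of offline
    pages at which page granularity escalates to the whole database.
    """
    if granularity not in ("database", "page"):
        raise ValueError("granularity must be 'database' or 'page'")
    if not suspect_pages:
        return ("online", [])
    if granularity == "database":
        # Default behavior: any recovery failure marks the database suspect.
        return ("database_suspect", [])
    if len(suspect_pages) >= threshold:
        # Too many offline pages: escalate to database granularity.
        return ("database_suspect", [])
    # Only the corrupt pages stay offline; the rest comes online.
    return ("online_with_offline_pages", sorted(suspect_pages))


print(recovery_outcome("page", [5678, 1234], threshold=20))
```

With two suspect pages and a threshold of 20, the model keeps the database
online and reports only pages 1234 and 5678 as offline.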

Page-level granularity allows the server to take the corrupt pages touched by
a transaction offline while bringing the other pages online. Because the
entire database is not recovered by replaying the log for roll forward and
rollback, data could be inconsistent; that is, the effects of some
transactions may be only partially available because some of their data is
offline. There is no way to determine which transactions involved offlined
pages except by manual examination.
It is possible to bring corrupt pages online. However, doing so without first
repairing the pages results in logical data inconsistency. When a database
is restored by repairing offline pages or by restoring the affected object
from a backup, the DBA, together with the application team, must determine
the extent to which the logical consistency of the database has been
compromised. If the extent of the corruption cannot be determined, it is
wiser to restore the database from backups and apply transaction logs. It is
also important to run dbcc tablealloc or dbcc indexalloc on any objects with
suspect pages.
How to proceed if online recovery fails
Consider the following options, listed from most to least desirable:
• Restoring from Backups and applying transaction logs
• Partial recovery using RFI
• Suiciding the Log

Restoring from Backups and applying transaction logs
In earlier versions of ASE this was the only course of action if recovery
failed, the database could not be repaired, and suiciding the log was not
desirable. It is still the preferred option for recovering the database
after a failure during online recovery if a) the entire database is marked
suspect because thresholds were exceeded, or b) system tables are corrupt.
It is also the preferred option whenever physical and logical consistency is
critical.
Partial recovery using RFI
Implementing RFI gives us an opportunity to recover the database partially
before opting to suicide the log.
1. Isolated pages are known and can be examined. The DBA can decide
whether to repair the faults or restore from backups. If the isolated
pages belong to an index, the index can be dropped and rebuilt. For
data pages, the data can possibly be recovered by other means, such as
restoring the backup to a development environment and copying the data
out with bcp. Depending on how the affected tables are used, the data
pages can also safely be left offline.
2. You can set thresholds to determine at what level page faults are
unacceptable, and at which the whole database should remain
unrecovered.
3. You can make the database available to users while conducting repairs.
The database can be configured to allow updates or to allow read-only
access.

RFI commands / steps


1. Check or alter the granularity using sp_setsuspect_granularity:
sp_setsuspect_granularity [dbname [, {"database" | "page"} [, "read_only"]]]

Using read_only mode is encouraged.
If a query attempts to access an offline page, the server raises error
messages 12716 and 12717.
2. Set the threshold for escalating page-level granularity to database
granularity using sp_setsuspect_threshold:
sp_setsuspect_threshold [dbname [, threshold]]

Once the number of offlined pages reaches this threshold value,
recovery marks the entire database suspect.
3. Print a list of the databases or pages that are suspect after
recovery using sp_listsuspect_db and sp_listsuspect_page:
sp_listsuspect_db
sp_listsuspect_page [dbname]

You can bring a suspect database or suspect pages online using
sp_forceonline_db or sp_forceonline_page:
sp_forceonline_db dbname, {"sa_on" | "sa_off" | "all_users"}
sp_forceonline_page dbname, pagenumber, {"sa_on" | "sa_off" | "all_users"}

sa_on and sa_off toggle the database or page online and offline.
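As a worked example of the steps above, the following Python sketch only
assembles the text of the procedure calls for one database, so that the
sequence can be reviewed before it is run in a SQL session. The helper
names, the database name salesdb, the threshold 30 and the page number 1234
are hypothetical; the procedure names and argument order follow the syntax
listed above.

```python
def rfi_setup_commands(dbname, threshold, read_only=True):
    """Commands that configure page-level granularity for one database."""
    granularity_args = f'{dbname}, "page"'
    if read_only:
        granularity_args += ', "read_only"'
    return [
        f"sp_setsuspect_granularity {granularity_args}",
        f"sp_setsuspect_threshold {dbname}, {threshold}",
    ]


def rfi_sa_online_commands(dbname, suspect_pages):
    """Commands that list suspect pages, then force each one online
    with the "sa_on" option."""
    commands = [f"sp_listsuspect_page {dbname}"]
    for page in suspect_pages:
        commands.append(f'sp_forceonline_page {dbname}, {page}, "sa_on"')
    return commands


# Hypothetical database, threshold and page number, for illustration only.
for command in rfi_setup_commands("salesdb", 30):
    print(command)
for command in rfi_sa_online_commands("salesdb", [1234]):
    print(command)
```

The page numbers passed to the second helper would come from the output of
sp_listsuspect_page after recovery; they are not known in advance.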
Suiciding the log
RFI eliminates the need to suicide the log.
