
IBSurgeon: emergency and data availability tools
for InterBase and Firebird databases

InterBase/Firebird DBA Tips&Tricks


by IBSurgeon, 2005-2009

IBSurgeon contacts:
www.ib-aid.com
support@ib-aid.com

Contents
How to get statistics from an InterBase/Firebird database the right way
Abstract
Right time, right place
If you do not experience periodic performance problems
If you experience periodic performance problems
What to do with these statistics
When the DBA can do nothing
IBAnalyst: More Tips and Tricks
1. How do I rebuild indices on PRIMARY KEY, FOREIGN KEY or UNIQUE constraints?
2. I used all IBAnalyst recommendations but this does not help to speed up queries
3. In some cases you will see "fragmented tables" right after restore
4. Record versions in a table that must not be updated
5. Blobs can cause table fragmentation
6. The relation between VerLen and RecLength
7. Why does IBAnalyst mark some indices as "bad"?
8. What if a "bad" index is created by a FOREIGN KEY constraint?
9. Why does the Data version percent row show only 12 megabytes of data, when I have a 140-megabyte database?
10. How to improve optimizer performance in case of frequent updates


How to get statistics from an InterBase/Firebird database the right way

Abstract
This document presents tips and tricks for gathering and analyzing statistics from
InterBase/Firebird databases, with or without IBAnalyst.

Right time, right place


It may sound strange, but simply taking statistics via gstat or the Services API is not
enough. Statistics must be taken at the right moment, so that they show how applications
affect the data and transactions in the database. The worst times to take statistics are:
• right after a restore
• after a backup (gbak -b db.gdb) made without the -g switch (such a backup performs
garbage collection as it reads the database)
• after a manual sweep (gfix -sweep)
At these moments the database contains little or no garbage, so the statistics will look
artificially good. It is also true that even during normal work there are periods when the
database looks fine, for example, when applications put less load on it than usual (at the
start of the working day, at lunch time, or at other times determined by the specific
business process).

How do you catch the moment when something goes wrong in the database?

Yes, your applications could be designed so perfectly that they always handle transactions
and data correctly, never producing sweep gaps, large numbers of active transactions,
long-running snapshots and so on. Usually this does not happen, if only because many
developers test their applications with just 2-3 simultaneous users. When such applications
are then deployed for 15 or more simultaneous users, the database can behave
unpredictably. Of course, multi-user mode may still work fine, because most multi-user
conflicts can be tested with 2-3 concurrently running applications. But when more
concurrent applications run, garbage collection problems (at least) can appear. And these
can be caught only if you take statistics at the right moments.

If you do not experience periodic performance problems


This can be the case when your applications are designed correctly, the database load is
low, or your hardware is modern and powerful enough to handle the current user count
and data volume well.

The most valuable information here is the transaction load and the accumulation of record
versions, and it can be seen only if you set up regular statistics collection.

InterBase and Firebird do not have an internal task scheduler, so you are free to use any
external one, such as the standard Task Scheduler on Windows or cron on Unix.
The best setup is to gather transaction statistics hourly. This can be done by running

gstat -h db.gdb >db_stat_<time>.txt

where


db.gdb is your database name,
db_stat_<time>.txt is the text file where the statistics will be saved, and
<time> is the date and time when the statistics were taken.
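For example, on Unix this can be scheduled with a crontab entry like the following sketch
(the gstat path, database path and output directory are assumptions, adjust them to your
installation; add -user/-password if your setup requires credentials; % must be escaped in
crontab):

# take header statistics every hour, with a timestamp in the file name
0 * * * * /opt/firebird/bin/gstat -h /data/db.gdb > /var/log/dbstat/db_stat_`date +\%Y\%m\%d_\%H\%M`.txt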

If you experience periodic performance problems


Such problems are usually caused by an automatic sweep run. First you need to determine
the time period between these performance hits. Next, divide this interval at least by 4
(or 8, 16 and so on). Modern information systems have many concurrent users, and most
performance problems with an untuned server and database happen 2 or 3 times per day.
For example, if the performance hits happen every 3 hours, you need to take

gstat -h db.gdb

statistics every 30-45 minutes, and

gstat -a -r db.gdb -user SYSDBA -pass masterkey

every 1-1.5 hours.


The best case is when you take the gstat -a -r statistics right before an expected
performance hit. They will show where the real garbage is and how many obsolete record
versions have accumulated.

What to do with these statistics


If your application uses transactions explicitly and uses them well, i.e. you know what
read_committed means and when to use it, your snapshot transactions last no longer than
needed, and transactions stay active for the shortest possible time, then you can tune the
sweep interval or switch sweep off, and afterwards only watch how many updates your
application(s) make and which tables should be updated less often or need special care.

What does this mean, you may ask? Here is an example from a real system, where
performance problems occurred every morning for 20-30 minutes. That was very painful
for the "morning" applications and could not be allowed to continue.

The database administrator was asked the right questions, and here is the picture:
the daily work was divided into phases. Analysts work in the morning, then data is inserted
and edited by regular operators, and at the end of the day special procedures run that
gather the data to be used by the analysts the next morning.

So the last work on the database at the end of the day consisted of a lot of updates, mostly
of the very tables the analysts queried in the morning. As a result, a lot of garbage versions
accumulated overnight and started to be collected by the application that ran in the
morning. The answer to the problem turned out to be simple: run gfix -sweep at the end
of the day.
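A sketch of the scheduled command (database name and credentials are examples, adjust
them to your system):

gfix -sweep -user SYSDBA -password masterkey db.gdb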

Sweep reads all tables in the database and tries to collect all garbage versions left by
committed and rolled-back transactions. After sweeping, the database becomes almost as
clean as it is after a restore.


And, "morning problem" has gone.

So, you need to interpret statistics together with a number of other factors:


1. how many concurrent users (on average) work during the day;
2. how long the working day is (8, 12, 16, 24 hours);
3. what kinds of applications run at different times of day, and how they affect the data
used by other applications running at the same time or later. In other words, you must
understand the business processes taking place during the whole day and the whole week.

When the DBA can do nothing


Sad to say, such situations happen. Again, an example:
A system was installed for ~15 users. Periodically performance became so bad that the
DBA had to restart the server. After a server restart everything worked fine for some time,
then performance degraded again. Statistics showed about 75,000 transactions per day on
average, and there were transactions staying active from the start of the day until the
moment performance went down.
Unfortunately, the applications were written with the BDE and without any explicit
transaction handling; all transaction management was automatic, performed by the BDE
itself. This caused some transactions to stay active for a long time, and garbage (record
versions) accumulated until the DBA restarted the server. After the restart the automatic
sweep ran, and the garbage was collected (eliminated).
All of this was caused by the applications, because they had been tested with only 2-3
concurrent users, and when there were ~15 of them the applications started to produce a
very high load. It should be noted that in that configuration 70% of the users were only
reading data, and the other 30% were inserting and updating some (!) data.
In this situation the only thing that could really improve performance was to redesign the
applications completely.


IBAnalyst: More Tips and Tricks


Some questions that are not answered in the Recommendations or in the Help:

1. How do I rebuild indices on PRIMARY KEY, FOREIGN KEY or UNIQUE constraints?


A: Indeed, you can't use ALTER INDEX xxx INACTIVE/ACTIVE on constraint indices. If you
see a deep or fragmented index on such a constraint, you can use a special trick (the one
used by gbak during restore):
RDB$INDICES has an RDB$INDEX_INACTIVE column that is NULL or 0 if the index is active
(after CREATE INDEX or ALTER INDEX ACTIVE). 1 means the index is inactive (after ALTER
INDEX INACTIVE). But there is also the value 3, which is used to mark inactive indices on
constraints. So, you can set RDB$INDEX_INACTIVE=3 for that index, COMMIT, then set the
value back to 0 and commit again, and the index will be rebuilt.
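A minimal SQL sketch of this trick; the table name MY_TABLE and the index name
RDB$PRIMARY1 are hypothetical, so look up the real constraint index name first:

/* find the name of the index that backs the constraint */
SELECT RDB$INDEX_NAME FROM RDB$RELATION_CONSTRAINTS
WHERE RDB$RELATION_NAME = 'MY_TABLE'
AND RDB$CONSTRAINT_TYPE = 'PRIMARY KEY';

/* 3 = inactive index on a constraint */
UPDATE RDB$INDICES SET RDB$INDEX_INACTIVE = 3
WHERE RDB$INDEX_NAME = 'RDB$PRIMARY1';
COMMIT;

/* back to 0 (active): the server rebuilds the index */
UPDATE RDB$INDICES SET RDB$INDEX_INACTIVE = 0
WHERE RDB$INDEX_NAME = 'RDB$PRIMARY1';
COMMIT;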

2. I used all IBAnalyst recommendations but this does not help to speed up queries.
A: This is a separate issue, where IBAnalyst can't help. There can be two causes of the
problem:
1. The indices have outdated statistics. You can refresh index statistics with the command
SET STATISTICS INDEX xxx, or in IBExpert (menu Database, Recompute selectivity of all
indices), or with the gidx tool (www.ibase.ru/download/gtools.zip).
2. The queries are heavy by themselves, or the optimizer can't optimize the query.

3. In some cases you will see "fragmented tables" right after restore.
Normally InterBase (without the -use_all_space parameter) reserves about 25% of the
space on data pages for future inserts, updates and deletes (to place record versions
there). But with any database page size (1, 2, 4 or 8 K) you will see ~50% fragmentation
for tables that have a small record size (about 12-20 bytes; for example, a table with 2
integer fields has an average record size of 12 bytes).
This is OK; consider it a magic server number (or behavior).
So, if you have such small-record tables, you can
a) ignore the "fragmented" warning for those tables, or
b) lower the "fragmented %" threshold, to 45% for example, in the IBAnalyst Options dialog.

4. Record versions in a table that must not be updated


If you see record versions in a table that must never be updated (for example, a table with
some event log), don't worry: these versions are generated by deletes. So you can tell how
many current records the table holds and how many records were deleted.
This is true only if MaxVer = 1. If it is > 1, then the table is being updated by some
application. If you are really sure the table must never be updated, it's better to create a
"before update" trigger that raises an exception, to find out which application performs
the updates.

5. Blobs can cause table fragmentation.


The engine stores blobs in 3 different ways:
1. if the blob contents fit on the data page (there is enough free space), the blob is stored
on that data page, near its record (or version);
2. if the blob contents do not fit on the data page, the blob is stored on a separate blob
page;
3. if in case 2 the blob does not fit on one page, a blob pointer page is created that points
to the appropriate blob pages.
Whether case 1 happens depends on the stored blob size and the database page size. For
example, if your page size is 4K and your blobs average ~5K, they are stored not on data
pages but on separate blob pages.
But if you back up that database and restore it with an 8K page size, the blobs will fit on
data pages and will be stored together with the records, causing high record fragmentation.
IBAnalyst marks such tables as Pale (in the Records column), and the hint shows the
estimated record count for the table (based on the data page count) and the real average
fill value (%).
If a query reads any fields except blobs from such a table, a natural scan, join or aggregate
will run very slowly.
Right now there is only one solution: create an additional table (linked 1:1 to the original
one) and move into it all blob columns whose average size is less than the page size, as in
the sketch below.
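A sketch of such a 1:1 split; the table and column names are hypothetical:

/* the original table keeps only the non-blob columns */
CREATE TABLE DOC (
  ID    INTEGER NOT NULL PRIMARY KEY,
  TITLE VARCHAR(100)
);

/* the blob moves to a companion table linked 1:1 by the same key */
CREATE TABLE DOC_BLOB (
  ID      INTEGER NOT NULL PRIMARY KEY,
  CONTENT BLOB SUB_TYPE 0,
  CONSTRAINT FK_DOC_BLOB FOREIGN KEY (ID) REFERENCES DOC (ID)
);

Queries that do not need the blob now read only DOC; the blob is fetched from DOC_BLOB
only when it is actually required.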
Firebird 2.0 will have an option that prevents storing blobs on data pages at all.
Do not try to cure blob fragmentation by backing up and restoring with a greater page
size! That would let blobs that could not fit on data pages with the current page size be
placed on data pages during the restore, so your tables with blobs would become even
more fragmented than before.
It is also not recommended to restore with a smaller page size, because that can decrease
performance for indices and non-blob tables.
Nor should you try to change blob fields into varchar fields: varchar fields are always
stored as part of the record, so the record can end up split into 2 or more fragments
(placed on 2 or more data pages) if it does not fit on one data page.
P.S. IBAnalyst can report such tables "by mistake", for example, when a table used to
contain blob fields with data, but they were later dropped from the table structure.
Unfortunately there is no configurable option for this warning, because it is calculated
exactly from the data reported by the server (the statistics).

6. The relation between VerLen and RecLength


a) VerLen >= 90% of RecLength: the versions you see in the Versions column are mostly
record deletes. The more records are deleted, the smaller RecLength becomes (down to
0 bytes). VerLen can also be greater than RecLength if you update your table with longer
string data than was stored in the original records.
b) VerLen <= 80% of RecLength: the versions are mostly record updates.
We can't distinguish these cases more precisely, because the statistics show the average
record and version sizes for the whole table, while the number of versions visible to
concurrent transactions may vary.

7. Why does IBAnalyst mark some indices as "bad"?


Indices with a selectivity value lower than 0.01 are marked as "bad" in IBAnalyst (see the
Index view help). There are several reasons to call a particular index bad:
1. The selectivity of the index is lower than 0.01. Theoretically the optimizer should not
use such an index, but it does if no other suitable indices exist (for a where, order by or
join clause, at least).
2. Such an index makes garbage collection very slow. This problem does not exist in
InterBase 7.1/7.5, and will be fixed in Firebird 2.0.
3. Such an index makes the restore process very slow, and it is created very slowly
(create/alter index active). This is because the chain of record numbers per index key is
long.
4. If such an index is used in a where clause, memory usage will depend on the value
being searched for (the bitmap size). Since the record chain can be long (lots of duplicate
keys), memory consumption can also be high.
5. If such an index is used in an "order by", and most of the duplicates are in the lower key
values (depending on the index sort order), there will be a lot of index page reads, which
will slow down the query.
That is why IBAnalyst cannot ignore the existence of such indices.
The worst case for an index is when its Uniques column = 1, i.e. all values of the indexed
column are the same. Such indices are listed under "Useless indices" on the Summary
page.
Of course, for your application such an index may be "good". For example, records may
have an "archive" flag in some column, and your application may search via the index on
that column only for current, not archived, data. So it is up to you to decide whether we
are right in calling such an index "bad" or not.
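A sketch of that "archive flag" case, with hypothetical names: the overall selectivity of the
index is terrible, but the application only ever searches for the rare value, so the index still
pays off:

CREATE INDEX IDX_ORDERS_ARCHIVED ON ORDERS (ARCHIVED);

/* say 99% of rows have ARCHIVED = 1; the few current rows are found quickly */
SELECT * FROM ORDERS WHERE ARCHIVED = 0;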

8. What if a "bad" index is created by a FOREIGN KEY constraint?


Well, the previous paragraph showed that it is better to drop "bad" indices (unless you use
them to search for keys that have fewer duplicates than the others). But if such an index is
created by a foreign key, you can drop it only by dropping the foreign key, and dropping
the foreign key disables the relational integrity check, which may be unacceptable.
You can replace the FK with triggers, but with some restrictions. The FK checks record
relations using the index, and the index "sees" all keys of all records, independently of
transaction states, while triggers work only in the client's transaction context. So, when
replacing an FK with triggers, you must be sure that
• records are never deleted from the master table, or are deleted only in "snapshot table
reserving" isolation mode;
• the column used by the PK in the master table is never modified; you can enforce this
with a before update trigger, as in the sketch below.
If you can maintain these conditions, you can drop that particular foreign key. Of course,
do not create an index on that column manually.
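A minimal sketch of such replacement triggers. The MASTER/DETAIL names, columns and
exceptions are hypothetical; the multi-action trigger syntax requires Firebird 1.5 or later
(on InterBase use two separate triggers); and even under the conditions above this is not
a full substitute for a real FK:

CREATE EXCEPTION E_NO_MASTER 'No master record for this detail row';
CREATE EXCEPTION E_PK_FROZEN 'Master primary key must not change';

SET TERM ^ ;
/* detail side: refuse rows that point to a missing master */
CREATE TRIGGER DETAIL_BIU FOR DETAIL
BEFORE INSERT OR UPDATE AS
BEGIN
  IF (NOT EXISTS(SELECT 1 FROM MASTER M WHERE M.ID = NEW.MASTER_ID)) THEN
    EXCEPTION E_NO_MASTER;
END ^

/* master side: forbid changing the primary key value */
CREATE TRIGGER MASTER_BU FOR MASTER
BEFORE UPDATE AS
BEGIN
  IF (NEW.ID <> OLD.ID) THEN
    EXCEPTION E_PK_FROZEN;
END ^
SET TERM ; ^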

9. Why does the Data version percent row show only 12 megabytes of data, when I have a
140-megabyte database?
1. IBAnalyst shows here the "pure" data volume, not counting other database structures
(indices, metadata...) or page fragmentation.
2. After a restore, InterBase and Firebird leave some free space (15-25%) on data pages
to make future updates/deletes faster.
3. There is a specific server behavior that leaves data pages ~50% fragmented if the
table's record size is small, about 11-22 bytes.

10. How to improve optimizer performance in case of frequent updates


Index statistics are stored in the RDB$INDICES.RDB$STATISTICS column and are updated
in 3 ways:
1. SET STATISTICS INDEX <index_name>;
2. ALTER INDEX <index_name> ACTIVE, or CREATE INDEX <index_name> ...;
3. the restore process (all indices are rebuilt, just as with ALTER INDEX <index_name>
ACTIVE).
The optimizer uses this statistical information to prepare queries. Based on the statistics
values, the optimizer decides whether an index is "good enough" or "not useful" for
retrieving records.
If the statistics have not been updated for a long time, the optimizer may produce a bad
plan, because the stored values no longer correspond to the actual state of affairs: the
table data may have changed significantly (for example, the number of records grew 5-10
times, or, vice versa, all records were deleted).
You can replace a bad automatic query plan with an explicit PLAN clause for a particular
query, but this is not a good approach, because the data can again change significantly
after the plan was devised.
The alternative (and right) way is to refresh the statistics periodically by applying the SET
STATISTICS statement to all indices. You can schedule an SQL script that refreshes the
statistics via ISQL, or use the ready-made gidx tool (Windows only).
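One common way to produce such a script is to generate the statements from the system
table; this is a sketch to run in ISQL, whose output you then execute as a script:

/* generates one SET STATISTICS statement per user index */
SELECT 'SET STATISTICS INDEX ' || RDB$INDEX_NAME || ';'
FROM RDB$INDICES
WHERE (RDB$SYSTEM_FLAG IS NULL OR RDB$SYSTEM_FLAG = 0)
  AND (RDB$INDEX_INACTIVE IS NULL OR RDB$INDEX_INACTIVE = 0);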
If you have tables that are periodically reloaded with different records, this approach will
not help. Consider an example:
• Table A is loaded with data 4-5 times per day.
• After the loaded data has been processed, all records in table A are deleted.
In this case there are 2 correct sets of statistics values for the indices on table A: for when
it is loaded with data, and for when it is empty. Statistics recomputed on the loaded table
will be useless when the table is empty, and vice versa.
To avoid this, recompute the statistics for the indices on table A only when the table is
filled with data, ideally right before the queries on that table are run.
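In that scenario the recomputation belongs in the load job itself, after the load commits
and before the heavy queries start; a sketch, with hypothetical index names:

/* end of the load job for table A */
SET STATISTICS INDEX IDX_A_DATE;
SET STATISTICS INDEX IDX_A_CUSTOMER;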
Since version 1.91, IBAnalyst shows the index statistics difference and allows you to
recompute it at any moment. First look at the table record information: is the record count
the usual average or not? If yes, you can safely recompute the index selectivity. If not, it
may be better not to touch the index statistics, because doing so could make the optimizer
produce even worse query plans.

