[DUG] Using Boolean (Char(1)) in Firebird

Mon Mar 31 12:30:46 NZDT 2014

[Reply]

Hi Steve and others,

Here's a selection of comments from my database of Interbase & Firebird community comments.

>From what I can ascertain, there is no hard and fast rule here, and you really need to test your data against different configurations - with indexes, and without - and see what works best.

cheers

Gary

====================================

http://www.firebirdfaq.org/faq166/

If you have a table with a lot of records and one of the fields stores values like 'Y' and 'N' or something similar (i.e. not a lot of different values) it does not help performance to create index on such field. Even worse, if you do create index and it does get used it will slow down the queries. Such index has low selectivity (number of records one index node identifies is huge - perfect selectivity is when one index node identifies one record). 
You might want to consider adding that field to some other (primary key for example) which is already indexed and often used in WHERE clause or JOINs. This is very effective with Firebird 1.x, while Firebird 2.x and higher are smarter and work around those duplicates. 

========================================================================

>Should I use Integer/Big Int & concatenate my Boolean fields, or should I
>convert them to char(1) values of "y"/"n" etc or smallint values of 0/1.

Use char(1).  The engine will pick a fast comparison since it knows its
comparing one byte to another.  If it makes you happy, make them charset
octets and store 0x1 and 0x0.

Regards,

Ann harrison

========================================================================

> Can someone please tell me if I should build an index on a field when > the the values for the field is always a boolean value (Yes or No).
> > I execute a lot of queries against this table based on whether the field > is 'No'. Everyday approximately 1000 new rows of data gets stored into > this table where the field will be 'No'. The table has millions of > records present where the value has been set to 'Yes', and they cannot > be removed.

Index on columns like this is incredibly dense, so it's often very
shallow (which is a good thing) and it's significantly smaller (also
good thing). This means that its processing overhead on selects is lower
than on average index. However, it has higher than average overhead for
maintenance, especially when keys are updated or deleted (problem is not
in update itself, but in subsequent garbage collection). In your case
when data are mostly permanent, the maintenance overhead could be
neglected. In your case, when there is a significant imbalance in key
distribution, you would get significant performance improvement for
queries that filter rows with less common key value, and you may suffer
performance hit for queries for complementary queries (at least until
data distribution statistics would be implemented in Firebird). However,
these indices have very poor general selectivity value, so Firebird
optimizer may decide to not use the index even if it would improve
performance, so you always need to check the query plan to verify
whether such index isn't useless.

To sum it up, there is no general rule that would recommend or not to
use such indices. The actual usefulness of such index depends on type of
queries you mostly use, and physical data storage characteristics that
affect optimizer decisions. The best approach to evaluate its usefulness
is to try it with your real data and queries (and don't forget to check
the execution plans).

best regards
Pavel Cisar
IBPhoenix

========================================================================

When you create a index, it add's time cost to updates and inserts. So if you don't need the index, droping then will improve inserts and updates.

To know more about index in Boolean you can take a look on the REPLYies of this message:
http://tech.groups.yahoo.com/group/firebird-support/message/106608

also you can google using this below (without the quotes)

"site:http://tech.groups.yahoo.com/group/firebird-support index on Boolean"

Theres a post of Ann H. that explain it wonderfully ^_^ (i just can't find it now o_o'' )

regards,

???

At 10:29 on 31/03/2014 you wrote 
>Hi Steve

>I read your original email again and note:
>
>My understanding is that this will never be indexed, even if you
>specifically add an index to it. So how do you handle it. There may be
>several boolean fields in a table definition.
>
> >
>As these tables can contain several hundred thousand records, this could
>potentially slow down any query to say total all records last 3 years where
>Active and Income - as the only index would then be on the date field, there
>is a possibility that this could potentially be a very slow query.
>
> >
>I've heard of others creating another table to create, say, non-Avtive
>record ID's, but this one table could have several booleans, therefore
>creating several new tables (combining then into a single table with the
>field name would cause the same problem).
>
> >
>I am trying to interpret for myself what your concern is. Indexing and using
>Booleans in tables, resulting in slow queries?
>
>You have a general concern without a specific example. You understanding
>about something "never indexed" is not clear to me. Maybe you have in mind
>that an optimiser will not use some indices because an index has a Boolean
>in it? >
> >
>One way to respond to your concern is to discuss index selectivity and
>optimisers. I won't do this as I don't know enough about the FB query
>optimiser. So I'll respond pragmatically. I am using Booleans in many tables
>and in many queries in software with biggest table 600,000 records and
>400MB+ DB file sizes. In FB 2.5.x. To the best of my knowledge I do not have
>slow query problems with Booleans.
>
> >
>If you have slow queries there are options to look at. We'd need to get
>specific. I would not let a concern for using Booleans worry me. >
>I hope I have understood you in a reasonable sense.
>
> >
>Kind regards,
>
>Russell
>
> >
> >
> >
>From: delphi-bounces at listserver.123.net.nz
>[mailto:delphi-bounces at listserver.123.net.nz] On Behalf Of Steve Peacocke
>Sent: Monday, 31 March 2014 9:15 a.m.
>To: NZ Borland Developers Group - Delphi List
>Subject: Re: [DUG] Using Boolean (Char(1)) in Firebird
>
> >
>Thanks Jolyon,
>
> >
>Silly me, I thought this was going to be simple.
>
> >
>Add Index, don't add index, it's a Boolean decision, I never expected NULL
>as an answer :)
>
>
>Steve Peacocke
>Mobile: +64 220 612-611
>
>Linkedin Professional Profile
><http://nz.linkedin.com/pub/steve-peacocke/1/a06/489> >
> >
>On Mon, Mar 31, 2014 at 8:56 AM, Jolyon Smith <jsmith at deltics.co.nz> wrote:
>
>I don't think you can adopt a general rule for all boolean type conditions
>in data.  In the two example fields you cite, for example, I can see that
>there is a potential difference in the nature of the booleans involved.
>
>ActiveRecord - looks like something that could change over time.  A record
>that was active may become inactive and I further speculate that there will
>over time be far more inactive records than active ones.
>
>AccountTransactionType - looks like something that is fixed.  The type of a
>transaction seems unlikely to change once that transaction has been
>recorded.  You might call this a "static" boolean, as opposed to the more
>"dynamic" nature of the previous example.
>
> >
>Of course, more specific domain knowledge may reveal these assumptions to be
>invalid, but you get the general idea.... the characteristics of a
>particular datum go beyond it's simple data type and those characteristics
>in turn determine the most appropriate implementation (which in turn will
>depend on whether the dominant context is OLTP or OLAP - i.e. efficiency of
>creating/modifying data vs efficiency of queries).
>
> >
> >
>In the case of "static" booleans for example, you might consider creating
>separate tables for records of different values in this field.  For
>convenience of querying all records you can of course project a view which
>unions the two (or more) tables involved, with a derived, virtual column
>containing the discriminating field value.  This also opens up the
>possibility that the most efficient indexes for rows of a certain type (i.e.
>now table) may well be different than those for the other.  i.e. the way you
>work with Income transactions might benefit from different indexes than
>Expense transactions.
>
> >
>On the other hand, the way you work with income and expense transactions may
>mean that you are better off having indexes operating over ALL transactions,
>regardless of Income/Expense type.
>
>See what I mean about "the best way" being dependent on far more than just
>the data type ?
>
> >
>And there's still more to it than that...
>
> >
>w.r.t index selectivity, I am not convinced that the 1 / # of distinct
>values metric is a particularly reliable measure.  It surely assumes an even
>distribution of distinct values across the data set ?
>
>
>i.e. if you have 100,000 records and they have a column where 50,000 rows
>have one value and 50,000 have another, then yes, the efficiency and thus
>the utility of any index on that value is going to be negligible (but then,
>no better than having no index isn't actually *worse*, is it ?  Although
>there will be some overhead introduced in maintaining the index, though I
>doubt this will itself be hugely significant).
>
>On the other hand, if only 1,000 of those 100,000 records have one value and
>the remaining 99,000 have another, AND if your application most often
>queries that table to select those in the smaller subset (the 1,000) then
>whilst an index may not be of any benefit for querying the 99,000, it surely
>will provide benefit for those queries that select the 1,000 (or from among
>them), a benefit which *might* be worth the overhead of maintaining that
>index even though it provides little/no benefit for the handful/minority of
>queries that work with the 99,000 records ?
>
>
>
>The bottom line is, there is no shortcut for properly understanding your
>data and the way your application(s) work(s) with that data for correctly
>tuning your database structure and metadata for optimal performance.
>
>
>:)
>
> >
> >
>On 30 March 2014 15:19, Steve Peacocke <steve at peacocke.net> wrote:
>
>Hi all,
>
> >
>I'm playing around with a Firebird database and wanted to know from you DB
>experts out there how you handle booleans in a table.
>
> >
>These could be as simple as
>
>  ActiveRecord (Y/N)
>
>  AccountTransactionType (I/E) - (Income or Expense)
>
> >
>That last I would normally think would be "Income (Y/N)" so that would be a
>boolean too.
>
> >
>My understanding is that this will never be indexed, even if you
>specifically add an index to it. So how do you handle it. There may be
>several boolean fields in a table definition.
>
> >
>As these tables c an contain several hundred thousand records, this could
>potentially slow down any query to say total all records last 3 years where
>Active and Income - as the only index would then be on the date field, there
>is a possibility that this could potentially be a very slow query.
>
> >
>I've heard of others creating another table to create, say, non-Avtive
>record ID's, but this one table could have several booleans, therefore
>creating several new tables (combining then into a single table with the
>field name would cause the same problem).
>
> >
>Any thoughts?
>
>
>
>Steve Peacocke
>Mobile: +64 220 612-611 <tel:%2B64%20220%20612-611> >
>Linkedin Professional Profile
><http://nz.linkedin.com/pub/steve-peacocke/1/a06/489> >
> >
>_______________________________________________
>NZ Borland Developers Group - Delphi mailing list
>Post: delphi at listserver.123.net.nz
>Admin: http://delphi.org.nz/mailman/listinfo/delphi
>Unsubscribe: send an email to delphi-request at listserver.123.net.nz with
>Subject: unsubscribe
>
> >
>
>_______________________________________________
>NZ Borland Developers Group - Delphi mailing list
>Post: delphi at listserver.123.net.nz
>Admin: http://delphi.org.nz/mailman/listinfo/delphi
>Unsubscribe: send an email to delphi-request at listserver.123.net.nz with
>Subject: unsubscribe
>
> >
>
>
>_______________________________________________
>NZ Borland Developers Group - Delphi mailing list
>Post: delphi at listserver.123.net.nz
>Admin: http://delphi.org.nz/mailman/listinfo/delphi
>Unsubscribe: send an email to delphi-request at listserver.123.net.nz with Subject: unsubscribe

Ref#: 41006

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://listserver.123.net.nz/pipermail/delphi/attachments/20140331/d9ef122b/attachment-0001.html