[DUG] Using Boolean (Char(1)) in Firebird
russell
russell at belding.co.nz
Mon Mar 31 10:28:32 NZDT 2014
Hi Steve
I read your original email again and note:
My understanding is that this will never be indexed, even if you
specifically add an index to it. So how do you handle it. There may be
several boolean fields in a table definition.
As these tables can contain several hundred thousand records, this could
potentially slow down any query to say total all records last 3 years where
Active and Income - as the only index would then be on the date field, there
is a possibility that this could potentially be a very slow query.
I've heard of others creating another table to create, say, non-Avtive
record ID's, but this one table could have several booleans, therefore
creating several new tables (combining then into a single table with the
field name would cause the same problem).
I am trying to interpret for myself what your concern is. Indexing and using
Booleans in tables, resulting in slow queries?
You have a general concern without a specific example. You understanding
about something "never indexed" is not clear to me. Maybe you have in mind
that an optimiser will not use some indices because an index has a Boolean
in it?
One way to respond to your concern is to discuss index selectivity and
optimisers. I won't do this as I don't know enough about the FB query
optimiser. So I'll respond pragmatically. I am using Booleans in many tables
and in many queries in software with biggest table 600,000 records and
400MB+ DB file sizes. In FB 2.5.x. To the best of my knowledge I do not have
slow query problems with Booleans.
If you have slow queries there are options to look at. We'd need to get
specific. I would not let a concern for using Booleans worry me.
I hope I have understood you in a reasonable sense.
Kind regards,
Russell
From: delphi-bounces at listserver.123.net.nz
[mailto:delphi-bounces at listserver.123.net.nz] On Behalf Of Steve Peacocke
Sent: Monday, 31 March 2014 9:15 a.m.
To: NZ Borland Developers Group - Delphi List
Subject: Re: [DUG] Using Boolean (Char(1)) in Firebird
Thanks Jolyon,
Silly me, I thought this was going to be simple.
Add Index, don't add index, it's a Boolean decision, I never expected NULL
as an answer :)
Steve Peacocke
Mobile: +64 220 612-611
Linkedin Professional Profile
<http://nz.linkedin.com/pub/steve-peacocke/1/a06/489>
On Mon, Mar 31, 2014 at 8:56 AM, Jolyon Smith <jsmith at deltics.co.nz> wrote:
I don't think you can adopt a general rule for all boolean type conditions
in data. In the two example fields you cite, for example, I can see that
there is a potential difference in the nature of the booleans involved.
ActiveRecord - looks like something that could change over time. A record
that was active may become inactive and I further speculate that there will
over time be far more inactive records than active ones.
AccountTransactionType - looks like something that is fixed. The type of a
transaction seems unlikely to change once that transaction has been
recorded. You might call this a "static" boolean, as opposed to the more
"dynamic" nature of the previous example.
Of course, more specific domain knowledge may reveal these assumptions to be
invalid, but you get the general idea.... the characteristics of a
particular datum go beyond it's simple data type and those characteristics
in turn determine the most appropriate implementation (which in turn will
depend on whether the dominant context is OLTP or OLAP - i.e. efficiency of
creating/modifying data vs efficiency of queries).
In the case of "static" booleans for example, you might consider creating
separate tables for records of different values in this field. For
convenience of querying all records you can of course project a view which
unions the two (or more) tables involved, with a derived, virtual column
containing the discriminating field value. This also opens up the
possibility that the most efficient indexes for rows of a certain type (i.e.
now table) may well be different than those for the other. i.e. the way you
work with Income transactions might benefit from different indexes than
Expense transactions.
On the other hand, the way you work with income and expense transactions may
mean that you are better off having indexes operating over ALL transactions,
regardless of Income/Expense type.
See what I mean about "the best way" being dependent on far more than just
the data type ?
And there's still more to it than that...
w.r.t index selectivity, I am not convinced that the 1 / # of distinct
values metric is a particularly reliable measure. It surely assumes an even
distribution of distinct values across the data set ?
i.e. if you have 100,000 records and they have a column where 50,000 rows
have one value and 50,000 have another, then yes, the efficiency and thus
the utility of any index on that value is going to be negligible (but then,
no better than having no index isn't actually *worse*, is it ? Although
there will be some overhead introduced in maintaining the index, though I
doubt this will itself be hugely significant).
On the other hand, if only 1,000 of those 100,000 records have one value and
the remaining 99,000 have another, AND if your application most often
queries that table to select those in the smaller subset (the 1,000) then
whilst an index may not be of any benefit for querying the 99,000, it surely
will provide benefit for those queries that select the 1,000 (or from among
them), a benefit which *might* be worth the overhead of maintaining that
index even though it provides little/no benefit for the handful/minority of
queries that work with the 99,000 records ?
The bottom line is, there is no shortcut for properly understanding your
data and the way your application(s) work(s) with that data for correctly
tuning your database structure and metadata for optimal performance.
:)
On 30 March 2014 15:19, Steve Peacocke <steve at peacocke.net> wrote:
Hi all,
I'm playing around with a Firebird database and wanted to know from you DB
experts out there how you handle booleans in a table.
These could be as simple as
ActiveRecord (Y/N)
AccountTransactionType (I/E) - (Income or Expense)
That last I would normally think would be "Income (Y/N)" so that would be a
boolean too.
My understanding is that this will never be indexed, even if you
specifically add an index to it. So how do you handle it. There may be
several boolean fields in a table definition.
As these tables c an contain several hundred thousand records, this could
potentially slow down any query to say total all records last 3 years where
Active and Income - as the only index would then be on the date field, there
is a possibility that this could potentially be a very slow query.
I've heard of others creating another table to create, say, non-Avtive
record ID's, but this one table could have several booleans, therefore
creating several new tables (combining then into a single table with the
field name would cause the same problem).
Any thoughts?
Steve Peacocke
Mobile: +64 220 612-611 <tel:%2B64%20220%20612-611>
Linkedin Professional Profile
<http://nz.linkedin.com/pub/steve-peacocke/1/a06/489>
_______________________________________________
NZ Borland Developers Group - Delphi mailing list
Post: delphi at listserver.123.net.nz
Admin: http://delphi.org.nz/mailman/listinfo/delphi
Unsubscribe: send an email to delphi-request at listserver.123.net.nz with
Subject: unsubscribe
_______________________________________________
NZ Borland Developers Group - Delphi mailing list
Post: delphi at listserver.123.net.nz
Admin: http://delphi.org.nz/mailman/listinfo/delphi
Unsubscribe: send an email to delphi-request at listserver.123.net.nz with
Subject: unsubscribe
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://listserver.123.net.nz/pipermail/delphi/attachments/20140331/8a5393ea/attachment.html
More information about the Delphi
mailing list