Thursday, February 6, 2014

Oracle

I have my share of irritations at Oracle's software. Oracle buys companies and software products, good or bad, rebrands them, then it makes them worse through the years, and still manages to sell that crap. In the end poor developers have to deal with enormous behemoths of software world, not really understanding what it is all about and how the hell this thing supposed to work. And even if developers understand all of that the crap from Oracle does not work as documented or expected anyway.

In my mind there was one exception from this: Oracle database. Probably because it is something the company started with.

Yes, Oracle database has its shares of strange features, non-standard implementations, etc. Yes, many argue that Oracle is the worst database out there. No, I am not going to join holly wars on what database is better.

Note that I wrote 'was one exception'. Recently my opinion is changed.

I am not a DBA. I do not have really deep knowledge of relational databases. I know enough to write SQLs if necessary. And if I can I try to off-load some work from my applications to the database. This keeps the processing close to the data being processed. As a result some of my SQLs end up being quite complicated but application code is much cleaner. Sometimes I run into a strange behavior that turns out to be a feature of the database product. It is not a big deal if it happens every now and them.

Unfortunately for the last week or two I had to write a lot of SQL statements. Some of them looked complicated, at least to me. I tested them against PostgreSQL and Oracle. I did not have any problem with PostgreSQL.

But Oracle... Well, it turned out my SQL was complicated to Oracle as well. Constructs that looked obvious just did not work with Oracle, resulting either in error messages, sometimes very strange, or in downright wrong results.

Did you know that Oracle treats an empty string as null? Now I do. If you have a not null column of say VARCHAR2 type you cannot insert an empty string into it. How clever it is?! Now what? Insert a string containing a single space??? Brr. Worst of all: there are people out there who defend Oracle on this issue.

Or how about not having a boolean type? You cannot have a column of boolean type in a table or in a query result. You have to go with 1/0 or 'Y'/'N', or whatever. Wow, real database shines through.

Another thing I ran into is limitations of updatable result sets. Sure, these limitations do not have anything to do with the database itself. They are part of a jdbc driver. But it is just playing with words. I do not care how great the database is if its jdbs driver is lousy.

Oracle documents support for updatable result sets. It is interesting to look at the difference between the versions: Oracle8i and Oracle9i. There are also more recent versions, for example, Oracle 11g, but they are quite similar to Oracle9i.

There is one interesting limitation that is mentioned in Oracle8i documentation and is not there in Oracle9i: "A query cannot use ORDER BY"

One might draw a conclusion that Oracle lifted this limitation long time ago. Ha-ha, gotcha! It is still there, even in a driver version 11.2.0.2.0.

But the funniest thing is not the documentation, but why the limitation is there in the first place. Turns out the driver parses SQL statements passed to it looking for quite a number of SQL fragments. One of such fragment is ORDER BY. And when an application uses methods like ResultSet.updateRow(), the driver gets the original statement, truncates it right before the ORDER BY and then appends the result after some UPDATE ... SET ... fragment it has generated. Now imagine what it does to a statement that has some analytic functions like ROW_NUMBER() OVER (ORDER BY ...). Bloody clowns!

Next Oracle "feature" hit me hard. I had quite a complicated INSERT from SELECT query that did some grouping reducing a 1M row table to a result set of about 70 rows. Worked without a hitch in PostgreSQL. But in Oracle I was getting one of those "unable to extend segment ... " errors. Looking at what and why I discovered that Oracle did not apply grouping at all: the result set contained 1M rows! WTF?!

The grouping was done "per date": the original data contained event timestamps and the query should have produced "so many events of a particular type per day". To convert a timestamp to a date I used cast(timestamp as DATE). "CAST converts one built-in datatype ... value into another built-in datatype ... value. " To hell it does. It probably just sets the type tag on a value without any conversion. So yes, if you run something like
select
    cast(systimestamp as date) d1,
    cast(systimestamp + interval '1' second as date) d2
  from dual;
you see the same two values in the output and think "yeap, cast() works". But if you run
select d1, d2 from (
    select
        cast(systimestamp as date) d1,
        cast(systimestamp + interval '1' second as date) d2
      from dual
) where d1 = d2;
you get no rows back! On the other hand this works as expected returning a single row:
select d1, d2 from (
    select
        trunc(systimestamp) d1,
        trunc(systimestamp + interval '1' second) d2
      from dual
) where d1 = d2;

Now my SQL produces the expected 70 rows both in PostgreSQL and Oracle. But I still wonder how many of such "small details" will hit me tomorrow?

Next one is a bit questionable: ORA-00904: XXX: invalid identifier. It is described quite in details for example here: "Ask Tom".

Why is it questionable? Well, Tom claims Oracle follows the standard here:
ANSI SQL has table references (correlation names) scoped to just one level deep.

This might very well be the case. The language used in all those standards is usually quite incomprehencible. It is really difficult to understand what the authors ment to say. And everyone who tries to follow standard usually understands things slightly differently. I found SQL92 text on the web. As I expected it is completely unclear if there is such a limitation in the standard. Actually I say stronger: if I did not know about Oracle's intrepretation I would not even think there is a limitation.

Now imagine Java had the same scoping rules:
void a() {
    int  c = 0;
    while (c < 10) {  // OK, visible
        ...
        if (c == 5) { // OK, visible
            int  k = c; // Oops, c is not visible here
            ...
        }
    }
}
But let's give Oralce the benefit of the doubt on this one.

Time for my favorite: error ORA-32036. Just go and read its description. And read again ... and again ... Wonderful piece of English prose, is it not? I guess if you cannot appreciate its beauty you cannot be a real DBA. Now goolge it. Turns out the error depends on how a jdbc connection is made. The error happens if the statement is executed in an XA connection. But everything works OK if it is a non-XA connection. And it is not only jdbc, it is the same in .NET world.

I could have added much more but it is getting late...

Only one thing bothers me: I can explain all the issues I mentioned above. Stupid decisions of not having boolean, empty string, bugs in parsing, [mis-]interpretation of the standard, etc. Yes, I can see how they could have happened. But that last one (ORA-32036)?! Damn, I am losing my imagination.