This blog will be about... Software Engineering... Object Oriented Analysis... Database Development, etc
I believe that imagination is stronger than knowledge -- myth is more potent than history -- dreams are more powerful than facts -- hope always triumphs over experience -- laughter is the cure for grief -- love is stronger than death.
Saturday, July 28, 2007
Why is validation so hard?
Monday, July 16, 2007
NoResultException is a really stupid idea!
But then, I met NoResultException... what a great way to screw a great idea!
Why not just return null??!!! getSingleResult should return 1 element, or null if it can not find stuff!
Saturday, July 14, 2007
Unit testing Relational Queries (SQL or HQL or JQL or LINQ...)
Most applications I have built have something to do with a database (I remember that while I was on college I used to think that was not exiting stuff, I used to dream about doing neural networks stuff, logic programming in Prolog, etc) but then I met WebObjects and its Object Relational Mapper (Enterprise Objects Framework) and I got really excited about object oriented data programming... but I always had a problem... to test my object oriented queries I had to "manually" translate them into SQL and test them against the database, and only after they give me what I thought were correct result I would write them using EOF Qualifiers...
Then I met Hibernate's HQL and I realized it is much more powerful than EOF Qualifiers, but I still had to translate it to SQL to test it, I know I can get the SQL that is generated from the HQL from the debug console, and paste it in my favorite SQL editor... but even then if I found a mistake, a lot of times it was easier to tweak it in SQL and the manually translate it HQL.
Currently, there are some extensions for Eclipse (Hibernate Tools) that make this more "direct" but, what if I don't like (or don't want, or can't) use Eclipse... it would be great if someone could, for example, make a plugin for SquirrelSQL, but until then... what options do I have?
Then I learned about unit testing... and the answer came to my mind immediately: I just had to write a unit test for each of my queries. That worked fine... in the beginning... until I started having queries that returned thousands (or millions) of objects, and it wasn't such a good idea to output them to the debug console... and I had another problem... how should I write the "asserts" of query?... and how can I do it so that it doesn't make my test so slow that it becomes unusable? (I can, of course, check the results just by viewing them, but my brain is not that good to say if those 10,000 row really match with the idea I had when I wrote that HQL)
So, I started to look "what do I do" to check if an SQL query is correct, lets say for example, that I write this:
select count(*) from Address,Employee where Address.Id= Employee.AddressId and Employee.Id = 3
(Translated to English: How many address the Employee with Id = 3 has?)
Now... how do I test that? well I could add an assert after getting the result in java (or c#) like this:
assert(count>0)
But the, what happens if someone deletes the row with Id = 3 from the table Employee? That means my test will fail... or what what if someone deletes all the addresses from employee? and what if I want to test that if there are no addresses for an employee, the answer should be zero...
That is a lot of work just to test if that simple query is right... and I think that work could be done automatically:
Take a look at the SQL, it could be decomposed into:
select count(*) from (select * from Address,Employee where Address.Id= Employee.AddressId and Employee.Id = 3) as EmployeeAddresses
And then we could say, lets automatically check for the case when the resulting set is empty, and for the case when the result is not empty, to check for a case when the result is empty, we need and Employee with Addresses, so we generate:
Select * from Employee where exists(select * from Address where Address.EmployeeId = Employee.Id)
And take the Id first employee we get... and that should give us a non empty set if used in the original sql sentence that we are trying to test... after that, we automatically generate:
Select * from Employee where not exists(select * from Address where Address.EmployeeId = Employee.Id)
And take the Id first employee we get... and that should give us an empty set if used in the original sql sentence that we are trying to test...
I call this queries "inverses" of the original one, it like when one is testing a multiplication, to see if 2 x 3 = 6, just do: 6/3 = 2 and 6/2 = 3, if 2, and 3 match the operands of the multiplication, you multiplication is right. The same thing goes for SQL, one just has to find the way to "invert" it, if I could automate this inversion, the automatically generated queries would help me by telling me things that might not be immediately obvious to me when I look at the original query, and that would help me check if my original query is right.... it would me some kind of "invariants" that would help me to better understand my querying... or maybe I could even write the invariants first, and then create a query and see if it matches my invariants...
Mmmm.... maybe using a select there is another way to "invert" a query to test if it is right, using the actual inverse operation of selecting... that is "inserting", could I derive from:
select count(*) from Address,Employee where Address.Id= Employee.AddressId and Employee.Id = 3
Something like (In pseudocode):
Insert Employee;
Store Employee.Id
Run select count(*) from Address,Employee where Address.Id= Employee.AddressId and Employee.Id = EmployeeId
Assert("The answer should be zero")
Insert Address related to Employee
Run select count(*) from Address,Employee where Address.Id= Employee.AddressId and Employee.Id = EmployeeId
Assert("The answer should be one")
This has the advantage that I don't need a database with data already on it, but it has the disadvantage that takes lot of time to write an unit test like this in java, because to insert an employee, it might be necessary to:
- Avoid breaking validation rules no related to this particular test, for example, an Employee must be related to a Department, but if the Department table is empty, then I should create a Department or I will not be able to insert an employee.
- Avoid conflicts with validation rules directly related to this particular test, for example, what if I have an Hibernate interceptor that won't let me insert an address without 1 or more Addresses
I think it can be done... the question is..
What is the algorithm to generate the inserts needed to satisfy an SQL select statement?
Thursday, July 05, 2007
The perfect infrastructure (framework?) for data systems
The perfect infrastructure (framework?) for data systems:
- Has an object relational query language (something like JQL or LINQ)
- Has a database with versioning (like Subversion) so you can always consult the database as it was in a particular moment in time transparently
- Supports transactions... and distributed transactions. (like Spring)
- Has a framework to exchange graphs of objects with a remote client, objects can be manipulated freely on the client, filtered and queried without hitting the database without need, and are transparently loaded in to de client without having the n+1 problem. (like an hybrid between Hibernate, Apple's EOF & Carrierwave)
- Supports "client only transactions" and nested client only transactions (like the Editing Context in WebObject's JavaClient applications) so that it is possible to make rollbacks without hitting the database, and it is even possible to make partial rollbacks... and have savepoint functionality, without going all the way to the database (unless you want to do so)
- Client objects, server objects and database elements are kept in perfect sync automatically, but it is possible to add logic to a particular tier of the system without too much hassle.
- Has a validation framework, that make it really easy to write efficient validation code following DRY, and that validates data on the client, on the application server and on the database.
- Validation code, combined with the versioning capabilities of the infrastructure allows to save information partially, as easily as writing part of a paper, validating only as you completely the information, with multiple integrity levels
- It is possible to disconnect the client from the server, and it will be able to save your changes until the connection is established again
- The applications built with this perfect infrastructure auto update automatically.
- With a very simple configuration tweak, it is possible to download the application "sliced" in pages, or as a complete bundle. This capability is so well integrated that the final user can choose the installation method, and the programmer doesn't even care about this feature.
- The developer only needs to specify the requirements semi-formally in a language (like Amalgam) and he will receive a running application, that adapts dynamically to specification (unless he chooses to "freeze" a particular feature of the application, in which case, the default procedural code for that feature is automatically generated, and the developer can customize as he wishes ... or decide to un-freeze it.
- Can be coded in any language compatible with a virtual machine that runs anywhere, or can be compiled to an specific platform.
- Allows for easy report design... by the developer, or the user.
- It is opensource (or sharedsource), so that in the extremely unlikely case of needing another feature, or finding a bug, it can be easily fixed by the developer
- It is freely (as in beer) downloadable from the Internet. (or has a reasonable price)
- It is fully documented, with lots of examples, going from very simple examples for beginners, to really complex real world applications with best practices for experts
- Includes the source code with unit-test with 100% coverage of the code
- Supports design by contract coding (from the database up to the client side).
You know what is the funny (or sad) part of all this? I have met frameworks that do 1, 2 or even 3 or more of this features... but none that does them all... will I ever see such a thing? is even possible to build it?
Tuesday, July 03, 2007
RIAs: Faulting &Uniquing (or Merging?) (Granite, Ajax)
I am thinking this leads to a pattern like this:
- I need to work with persons, so I fetch a list of them from a remote service.
- I choose to work with the person with id "3";
- present the contacts of person "3". (here is the tricky stuff, all the contacts that I load have a reference to person "3", what do I do about that? do I re fetch it, creating a different object and breaking uniquing, or do I look for a way to prevent that "same but different object" in my application? )
But... until granite has that... what could we do as a first step? it would be nice if we could "merge" a recently obtained object with one a fetched before... something like ADO.NET DataSet... (I can not believe I am writing that I miss the DataSet)
I have been thinking... a fully "AJAX" traditional JavaScript based application would have the same problems if it had a complex enough UI... but I haven't heard of anything like it, it seems that most AJAX application developers build applications so simple that they don't even care about having to write and re-write client side data manipulation code... (or... maybe those applications don't even enough client side behavior to need it?).
I guess that until Granite has his own "Data Management" the way to handle data will be... to imitate the practices of traditional AJAX applications?
(Mmmm, now with Google Gears... will we see how JavaScript based frameworks for automatic handling of DTOs and ORM start to appear everywhere? Parhaps this will revive the interest in something like SDO?)
Requirements Analysis: Negative Space
A while ago, I was part of a team working on a crucial project. We were confident, relying heavily on our detailed plans and clear-cut requi...
-
In Java, first we had servlets , and each one of them answered to request made to a particular url ( or an url matching a particular express...
-
Today I was thinking about the way I build UIs ( List then Edit ) and the way most people I know build UIs ( Edit then List ) and the relati...
-
That seems like a very simple question: "When will feature "X" be finished?", specially if do not have a lot of experie...