This blog will be about... Software Engineering... Object Oriented Analysis... Database Development, etc
I believe that imagination is stronger than knowledge -- myth is more potent than history -- dreams are more powerful than facts -- hope always triumphs over experience -- laughter is the cure for grief -- love is stronger than death.
Friday, August 24, 2007
Code depreciation
If code is written for performance (compromising maintainability), the value of that performance optimization (and maintenance degradation) will depreciate, because the likelihood of having either faster hardware or a different developer maintaining the code increases as time passes.
On the other hand, maintainable code increases in value for similar reasons:
If code is written for maintainability (postponing performance enhancements), the value of that maintainability (and performance degradation) will increase, because the likelihood of having either faster hardware or a different developer maintaining the code increases as time passes.
I added these definitions to C2... I wonder how (or if) they will evolve.
Wednesday, August 22, 2007
Data Transfer Object Injection
DTO injection could happen where there is a remote object service that allows a client system to send an object graph that is automatically converted by an object relational mapper into SQL statements.
Instead of sending a valid object graph, the attacker can send a different object graph, representing alterations to the database that go well beyond his security level. For example, the remote object service receives an object graph containing new users, or new permissions granted to existing users of the system.
To prevent this problem, it should be possible to specify, at the object relational mapping level, which entities can be saved by the current user. Many object relational mappers (or XML relational mappers) automatically write the changes represented by the object graph to the database, without caring whether the current application user has the privileges required to persist those objects. We cannot rely on RDBMS security, because most remote object services use the same database user for all the calls... and I think that connecting with a different database user for each application user would be bad for connection pooling (decreasing performance).
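As a minimal sketch of that idea (in Python, not tied to any real object relational mapper), the service could walk the incoming graph and reject it before anything is turned into SQL. The `Entity` shape, the role names and the `WRITABLE_KINDS` policy table are all hypothetical, made up for illustration:

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A node in the object graph sent by the client (hypothetical shape)."""
    kind: str                      # e.g. "Order", "User", "Permission"
    children: list["Entity"] = field(default_factory=list)


# Hypothetical policy: which entity kinds each application role may persist.
WRITABLE_KINDS = {
    "clerk": {"Order", "OrderLine"},
    "admin": {"Order", "OrderLine", "User", "Permission"},
}


def forbidden_kinds(root: Entity, role: str) -> set[str]:
    """Walk the whole graph and collect entity kinds the role may not save."""
    allowed = WRITABLE_KINDS.get(role, set())
    rejected: set[str] = set()
    seen: set[int] = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:       # graphs may contain cycles
            continue
        seen.add(id(node))
        if node.kind not in allowed:
            rejected.add(node.kind)
        stack.extend(node.children)
    return rejected


def save_graph(root: Entity, role: str) -> None:
    """Refuse the whole graph before the mapper turns any of it into SQL."""
    rejected = forbidden_kinds(root, role)
    if rejected:
        raise PermissionError(f"{role} may not persist: {sorted(rejected)}")
    # ... hand the graph to the object relational mapper here ...
```

With this in place, a "clerk" who sends an Order graph that also carries a Permission object gets a `PermissionError` instead of a silent insert, and the check does not depend on which database user the connection pool uses.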
I wonder if anyone else thinks this is a common security problem... Mmmm... I will add this to C2... I wonder how (or if) it will evolve.
Saturday, August 11, 2007
REST DataService... it was so... obvious?
When I built my first systems using .NET 1.0 (back in 2002), I was excited about the idea of using "XML SOAP WebServices" to connect my client side application with my remote business logic... but after I started developing, I realized I had to do a lot of stuff that just seemed repetitive and hard to use. Why did I have to create a WebMethod for each of the CRUD operations... and many more for the "R" in CRUD, one for each possible "select" variation? It was even more problematic because sometimes I just had to have a "dynamic querying UI" and couldn't find a good way to send the definition of a query in a good "XML" way...
Then I realized... why should I create a method for each variant? Why not just have a single web method:
DataSet executeQuery(string Query)
And I started changing all my code: anything that I wanted to obtain from the server could be obtained that way... (but I started wondering... is that the one and only true way to use data oriented web services? I remember reading somewhere that it wasn't such a good idea, that SOA wasn't invented for that... after all, that was just a thin XML wrapper over my ADO.NET data provider...)
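To make the "thin XML wrapper" concrete, here is a rough sketch in Python using the standard sqlite3 and xml.etree modules. It is not the original .NET code; the `execute_query` function, the `DataSet`/`Row` element names and the sample table are assumptions for illustration only:

```python
import sqlite3
import xml.etree.ElementTree as ET


def execute_query(conn: sqlite3.Connection, query: str) -> str:
    """Run any select statement and return the rows as an XML string.

    Assumes every column name is also a valid XML element name.
    """
    cursor = conn.execute(query)
    columns = [col[0] for col in cursor.description]
    root = ET.Element("DataSet")
    for row in cursor:
        row_el = ET.SubElement(root, "Row")
        for name, value in zip(columns, row):
            ET.SubElement(row_el, name).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data, just to exercise the wrapper.
conn = sqlite3.connect(":memory:")
conn.execute("create table Employee (Id integer primary key, Name text)")
conn.execute("insert into Employee (Id, Name) values (3, 'Ana')")
print(execute_query(conn, "select Id, Name from Employee"))
```

The whole "service" is one function: the query text goes in, XML rows come out, and nothing in between knows anything about the business.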
Fast forward a few years... a lot of people start talking about a doctoral dissertation written by Roy Fielding, and they reach the following conclusions: "SOAP is just too complex", "Having to create a different web method for each action makes the interface complex and not that good for interoperation", "one needs to know too much to understand a SOAP web service because methods are not standard", "WSDL is too complex", "SOAP is going against the resource naming philosophy in HTTP", etc, etc... And REST is the answer to all our problems...
Well, here I am taking a look at "Astoria", the experimental REST framework Microsoft is creating, and... it looks painfully similar to DataSet executeQuery(string Query), but with a difference: it is not using SQL, it is using a custom ad hoc querying mechanism that... does the same things SQL does? Perhaps it will implement better some relational features that are badly designed in SQL, but... what is the real advantage here? What is the real difference between this and a SOAP web service that receives SQL and returns XML representing rows in a database?
Is there really a difference? Or is it just that we (as an industry) needed to invent SOAP web services to realize all we needed was a thin XML wrapper around SQL?
Update: Astoria is now known as WCF Data Services
Saturday, July 28, 2007
Why is validation so hard?
Monday, July 16, 2007
NoResultException is a really stupid idea!
But then, I met NoResultException... what a great way to screw a great idea!
Why not just return null??!!! getSingleResult should return 1 element, or null if it cannot find anything!
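The behaviour I am asking for is easy to sketch. This is Python over sqlite3 rather than JPA, and `single_result_or_none` is a hypothetical helper name, so take it only as an illustration of "one element, or null":

```python
import sqlite3


def single_result_or_none(conn, query, params=()):
    """Return the only row of the query, or None when there is no row.

    More than one row is still treated as an error, since that usually
    means the query itself is wrong.
    """
    rows = conn.execute(query, params).fetchmany(2)
    if not rows:
        return None
    if len(rows) > 1:
        raise LookupError("query returned more than one row")
    return rows[0]
```

The caller then writes a plain "if result is None" check instead of wrapping every lookup in a try/catch for something that is not really exceptional.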
Saturday, July 14, 2007
Unit testing Relational Queries (SQL or HQL or JQL or LINQ...)
Most applications I have built have something to do with a database (I remember that while I was in college I used to think that was not exciting stuff, I used to dream about doing neural networks stuff, logic programming in Prolog, etc) but then I met WebObjects and its Object Relational Mapper (Enterprise Objects Framework) and I got really excited about object oriented data programming... but I always had a problem: to test my object oriented queries I had to "manually" translate them into SQL and test them against the database, and only after they gave me what I thought were correct results would I write them using EOF Qualifiers...
Then I met Hibernate's HQL and I realized it is much more powerful than EOF Qualifiers, but I still had to translate it to SQL to test it. I know I can get the SQL that is generated from the HQL from the debug console, and paste it in my favorite SQL editor... but even then, if I found a mistake, a lot of times it was easier to tweak it in SQL and then manually translate it back to HQL.
Currently, there are some extensions for Eclipse (Hibernate Tools) that make this more "direct" but, what if I don't like (or don't want to, or can't) use Eclipse? It would be great if someone could, for example, make a plugin for SquirrelSQL, but until then... what options do I have?
Then I learned about unit testing... and the answer came to my mind immediately: I just had to write a unit test for each of my queries. That worked fine... in the beginning... until I started having queries that returned thousands (or millions) of objects, and it wasn't such a good idea to output them to the debug console... and I had another problem: how should I write the "asserts" for a query? And how can I do it so that it doesn't make my test so slow that it becomes unusable? (I can, of course, check the results just by viewing them, but my brain is not good enough to say if those 10,000 rows really match the idea I had when I wrote that HQL)
So, I started to look at "what do I do" to check if an SQL query is correct. Let's say, for example, that I write this:
select count(*) from Address,Employee where Address.EmployeeId = Employee.Id and Employee.Id = 3
(Translated to English: How many addresses does the Employee with Id = 3 have?)
Now... how do I test that? Well, I could add an assert after getting the result in Java (or C#) like this:
assert(count>0)
But then, what happens if someone deletes the row with Id = 3 from the table Employee? That means my test will fail... or what if someone deletes all the addresses of that employee? And what if I want to test that, if there are no addresses for an employee, the answer should be zero...
That is a lot of work just to test if that simple query is right... and I think that work could be done automatically:
Take a look at the SQL, it could be decomposed into:
select count(*) from (select * from Address,Employee where Address.EmployeeId = Employee.Id and Employee.Id = 3) as EmployeeAddresses
And then we could say: let's automatically check both the case when the resulting set is empty and the case when it is not empty. To check the case when the result is not empty, we need an Employee with Addresses, so we generate:
Select * from Employee where exists(select * from Address where Address.EmployeeId = Employee.Id)
And take the Id of the first employee we get... that should give us a non-empty set if used in the original SQL statement that we are trying to test... After that, we automatically generate:
Select * from Employee where not exists(select * from Address where Address.EmployeeId = Employee.Id)
And take the Id of the first employee we get... that should give us an empty set if used in the original SQL statement that we are trying to test...
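Written by hand (nothing is generated automatically here), the two checks could look like the following Python sketch over an in-memory SQLite database. The schema assumes Address carries an EmployeeId foreign key, as in the exists queries above; the sample rows and the helper names `first_id` and `count_addresses` are made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Employee (Id integer primary key, Name text);
    create table Address (Id integer primary key, EmployeeId integer, Street text);
    insert into Employee (Id, Name) values (1, 'Ana'), (2, 'Luis');
    insert into Address (Id, EmployeeId, Street)
        values (10, 1, 'Main St'), (11, 1, 'Oak Ave');
""")

# The query we actually want to test, with the employee Id as a parameter.
QUERY_UNDER_TEST = """
    select count(*) from Address, Employee
    where Address.EmployeeId = Employee.Id and Employee.Id = ?
"""

# The two "inverse" queries: they pick employees for each test case.
WITH_ADDRESSES = """
    select Id from Employee where exists
    (select * from Address where Address.EmployeeId = Employee.Id)
"""
WITHOUT_ADDRESSES = """
    select Id from Employee where not exists
    (select * from Address where Address.EmployeeId = Employee.Id)
"""


def first_id(sql):
    """Id of the first employee returned by an inverse query, or None."""
    row = conn.execute(sql).fetchone()
    return None if row is None else row[0]


def count_addresses(employee_id):
    """Run the query under test for one employee."""
    return conn.execute(QUERY_UNDER_TEST, (employee_id,)).fetchone()[0]


employee_with = first_id(WITH_ADDRESSES)
employee_without = first_id(WITHOUT_ADDRESSES)
assert count_addresses(employee_with) > 0      # non-empty case
assert count_addresses(employee_without) == 0  # empty case
```

The asserts no longer depend on a hard-coded Id = 3 surviving in the table; they only need the database to contain at least one employee of each kind.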
I call these queries "inverses" of the original one. It is like testing a multiplication: to see if 2 x 3 = 6, just do 6/3 = 2 and 6/2 = 3; if 2 and 3 match the operands of the multiplication, your multiplication is right. The same thing goes for SQL: one just has to find the way to "invert" it. If I could automate this inversion, the automatically generated queries would help me by telling me things that might not be immediately obvious to me when I look at the original query, and that would help me check if my original query is right... They would be some kind of "invariants" that would help me to better understand my querying... or maybe I could even write the invariants first, and then create a query and see if it matches my invariants...
Mmmm... maybe, instead of using a select, there is another way to "invert" a query to test if it is right: using the actual inverse operation of selecting, that is, "inserting". Could I derive from:
select count(*) from Address,Employee where Address.EmployeeId = Employee.Id and Employee.Id = 3
Something like (In pseudocode):
Insert Employee
Store Employee.Id as EmployeeId
Run select count(*) from Address,Employee where Address.EmployeeId = Employee.Id and Employee.Id = EmployeeId
Assert("The answer should be zero")
Insert Address related to Employee
Run select count(*) from Address,Employee where Address.EmployeeId = Employee.Id and Employee.Id = EmployeeId
Assert("The answer should be one")
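For a database with no validation rules at all, the pseudocode translates almost line by line. The sketch below uses Python and an in-memory SQLite database; the two-table schema (Address with an EmployeeId foreign key) and the function names are assumptions for illustration:

```python
import sqlite3

COUNT_ADDRESSES = """
    select count(*) from Address, Employee
    where Address.EmployeeId = Employee.Id and Employee.Id = ?
"""


def make_empty_database():
    """Fresh in-memory database with the assumed two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table Employee (Id integer primary key, Name text);
        create table Address (
            Id integer primary key,
            EmployeeId integer references Employee(Id),
            Street text
        );
    """)
    return conn


def test_count_addresses():
    conn = make_empty_database()

    # Insert Employee and store its Id.
    employee_id = conn.execute(
        "insert into Employee (Name) values ('Ana')").lastrowid

    # No addresses yet: the answer should be zero.
    assert conn.execute(COUNT_ADDRESSES, (employee_id,)).fetchone()[0] == 0

    # Insert an Address related to the Employee.
    conn.execute(
        "insert into Address (EmployeeId, Street) values (?, 'Main St')",
        (employee_id,))

    # Now the answer should be one.
    assert conn.execute(COUNT_ADDRESSES, (employee_id,)).fetchone()[0] == 1
    conn.close()


test_count_addresses()
```

Here the inserts are trivial because the schema has only two tables and no mandatory relationships.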
This has the advantage that I don't need a database with data already in it, but it has the disadvantage that it takes a lot of time to write a unit test like this in Java, because to insert an employee, it might be necessary to:
- Avoid breaking validation rules not related to this particular test. For example, an Employee must be related to a Department, but if the Department table is empty, then I should create a Department or I will not be able to insert an Employee.
- Avoid conflicts with validation rules directly related to this particular test. For example, what if I have a Hibernate interceptor that won't let me insert an Employee without 1 or more Addresses?
I think it can be done... the question is..
What is the algorithm to generate the inserts needed to satisfy an SQL select statement?
Thursday, July 05, 2007
The perfect infrastructure (framework?) for data systems
The perfect infrastructure (framework?) for data systems:
- Has an object relational query language (something like JQL or LINQ)
- Has a database with versioning (like Subversion) so you can always consult the database as it was in a particular moment in time transparently
- Supports transactions... and distributed transactions. (like Spring)
- Has a framework to exchange graphs of objects with a remote client: objects can be manipulated freely on the client, filtered and queried without hitting the database unnecessarily, and are transparently loaded into the client without the n+1 problem. (like a hybrid between Hibernate, Apple's EOF & Carrierwave)
- Supports "client only transactions" and nested client only transactions (like the Editing Context in WebObjects' JavaClient applications) so that it is possible to make rollbacks without hitting the database, and it is even possible to make partial rollbacks... and have savepoint functionality, without going all the way to the database (unless you want to do so)
- Client objects, server objects and database elements are kept in perfect sync automatically, but it is possible to add logic to a particular tier of the system without too much hassle.
- Has a validation framework that makes it really easy to write efficient validation code following DRY, and that validates data on the client, on the application server and on the database.
- Validation code, combined with the versioning capabilities of the infrastructure, allows saving information partially, as easily as writing part of a paper, validating only as you complete the information, with multiple integrity levels
- It is possible to disconnect the client from the server, and the client will hold on to your changes until the connection is established again
- The applications built with this perfect infrastructure update themselves automatically.
- With a very simple configuration tweak, it is possible to download the application "sliced" in pages, or as a complete bundle. This capability is so well integrated that the final user can choose the installation method, and the programmer doesn't even care about this feature.
- The developer only needs to specify the requirements semi-formally in a language (like Amalgam) and he will receive a running application that adapts dynamically to the specification (unless he chooses to "freeze" a particular feature of the application, in which case the default procedural code for that feature is automatically generated, and the developer can customize it as he wishes... or decide to un-freeze it).
- Can be coded in any language compatible with a virtual machine that runs anywhere, or can be compiled to a specific platform.
- Allows for easy report design... by the developer, or the user.
- It is open source (or shared source), so that in the extremely unlikely case of needing another feature, or finding a bug, it can be easily fixed by the developer
- It is freely (as in beer) downloadable from the Internet. (or has a reasonable price)
- It is fully documented, with lots of examples, going from very simple examples for beginners, to really complex real world applications with best practices for experts
- Includes the source code, with unit tests covering 100% of the code
- Supports design by contract coding (from the database up to the client side).
You know what the funny (or sad) part of all this is? I have met frameworks that do 1, 2 or even 3 or more of these features... but none that does them all... Will I ever see such a thing? Is it even possible to build it?