Saturday, June 13, 2009

Greenspun's Tenth Rule of Programming and ADO.NET Data Services

This is a humorous observation once made by Philip Greenspun:

Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.

Or the database variant (Greencodd's Tenth Rule of Programming):

Every sufficiently complex application/language/tool will either have to use a database or reinvent one the hard way.

Sometimes I feel that:

Any sufficiently complicated procedural API contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of a query language.

Take web services, for example. They started as a procedural thing where people created their own custom verbs for each service. Then people realized that the semantics of HTTP constitute a coordination language that is sufficient for any computational communication pattern, and that GET / PUT / POST were enough for any program. But they again forgot about the query part, so after a while they reinvented a query language (like SQL), this time based on URL query strings (as in ADO.NET Data Services).
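To make the "query language based on URL query strings" idea concrete, here is a minimal sketch in Java that assembles such a URL. The service root `http://example.org/Sales.svc` and the `Customers` entity set are made-up names for illustration. `$filter`, `$orderby` and `$top` are query options ADO.NET Data Services understands, but treat the exact expressions as an assumption, not as a reference.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryUrlSketch {

    // Builds "<serviceRoot>/<entitySet>?name=value&name=value..." from the options.
    static String buildQueryUrl(String serviceRoot, String entitySet, Map<String, String> options) {
        StringBuilder url = new StringBuilder(serviceRoot).append('/').append(entitySet);
        char separator = '?';
        for (Map.Entry<String, String> option : options.entrySet()) {
            url.append(separator)
               .append(option.getKey())
               .append('=')
               .append(encode(option.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    // Percent-encodes a value; URLEncoder writes spaces as '+', so turn those into %20.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical service and entity set; the options play the role of WHERE / ORDER BY / LIMIT.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("$filter", "Country eq 'Mexico'");
        options.put("$orderby", "Name");
        options.put("$top", "10");
        System.out.println(buildQueryUrl("http://example.org/Sales.svc", "Customers", options));
    }
}
```

The three options map quite directly to a WHERE clause, an ORDER BY clause and a row limit, which is the point of the observation: the query language came back, only spelled as a URL.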

Sometimes I feel the whole IT industry is like a dog chasing its own tail… We invent COBOL to handle data procedurally, only to invent SQL to handle data declaratively, only to invent stored procedures and web services to handle it procedurally, and then invent REST to handle it declaratively. I wonder what will be the name of the procedural silver bullet that will replace REST…

Sunday, June 07, 2009

JEE Development Deployment: In the order wrong

So, you start a new web project in Eclipse 3.4.1 (or in NetBeans 6.5.1) and you run it. What happens is basically that the project gets copied into the directory the application server uses to “keep” the applications it runs, then the application server starts, looks into its directory for applications and starts your application, and you are able to see your project. If you are using Tomcat or JBoss, as a lot of people do, you will be able to see your web project on port 8080. Right? Wrong!

What really happens is that the application server starts first, and then the project gets copied into the directory the application server uses to “keep” the applications it runs. Then the IDE opens the web browser, which connects to port 8080 using a URL that includes the context of the application (its “name”, so to speak), and only then does the application server realize there is a new application with that name in its applications directory and start it.

Notice the difference:

What one typically believes happens:

  1. the project gets copied into the directory the application server uses to “keep” the applications it runs
  2. the IDE starts the application server
  3. the application server looks into its directory for applications and starts your application…
  4. now you will be able to see your web project on port 8080

What really happens:

  1. the IDE starts the application server
  2. the project gets copied into the directory the application server uses to “keep” the applications it runs
  3. then the IDE opens the web browser, which connects to port 8080 using a URL that includes the context of the application
  4. the application server then realizes there is a new application with that name in its applications directory and starts it

Now, why is it so important to understand in what order this happens?

Well, because during development this process is repeated again and again, until the application reaches a point where it can be delivered to the customer. The problem is that sometimes one of the iterations of this process ends with an application so dysfunctional that it is able to crash the application server (and that will not stop until we get a real virtual machine for Java).

But until we do get a better virtual machine, there is a more pressing problem: if we deploy an application that crashes the application server, the order in which the process is currently done makes it impossible to deploy the fixed version unless we delete the previous version completely. Why do I say that? Well, let's analyze it. Let's say we start with a fresh application server, where we have never installed our application:

  1. the application server starts
  2. the project gets copied into the directory the application server uses to “keep” the applications it runs
  3. then the IDE opens the web browser, which connects to port 8080 using a URL that includes the context of the application
  4. the application server then realizes there is a new application with that name in its applications directory and starts it

This first version of our application is far from finished, but it does not have any bug capable of crashing the application server, so we add some more functionality and ask the IDE to run our application again. And what does the IDE do?

  1. the application server starts
  2. since the application server already knows it has our application installed, it does not need a visit from the browser to start it, it starts it immediately
  3. the project gets copied into the directory the application server uses to “keep” the applications it runs; since the application is already there, only the changed files are copied
  4. if the changes are of a particular kind (alterations to web.xml, for example) the application server restarts the application so that it starts working with the new configuration
  5. then the IDE opens the web browser, which connects to port 8080 using a URL that includes the context of the application

This second version of our application actually has a nasty bug, so after step 4 the application server hangs, and we are forced to kill it using the services provided by the operating system to deal with misbehaving processes. Then we go back to our code, fix the problem, and ask the IDE to run our application again. And what does the IDE do?

  1. the IDE starts the application server
  2. since the application server already knows it has our (failed) application installed, it does not need a visit from the browser to start it, it starts it immediately
  3. The application server hangs.

And that is it: the project files are never copied into the directory the application server uses to “keep” the applications it runs. The critical problem here is that the IDE seems to wait for the application server to start correctly before copying the new files. It should copy them anyway. That, of course, would not prevent the server from hanging, but after we had killed the process, the next time we ran the application server, if we had fixed the bug, it would not hang any more. Even then the behavior would be counter-intuitive. Why the process is not like this is a mystery to me:

  1. the project gets copied into the directory the application server uses to “keep” the applications it runs; since the application is already there, only the changed files are copied
  2. the IDE starts the application server
  3. since the application server already knows it has our application installed, it does not need a visit from the browser to start it, it starts it immediately
  4. since we have fixed the bug, everything runs fine
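The difference between the two orderings can be sketched as a toy simulation. Everything here is hypothetical: the `Server` class, the build names and the rule that a build whose name ends in `-broken` hangs the server are stand-ins for illustration, not how Eclipse, NetBeans, Tomcat or JBoss are actually implemented.

```java
class DeployOrderSketch {

    /** Toy application server: it hangs whenever it loads a broken build. */
    static class Server {
        String deployedBuild;   // null until something has been deployed
        boolean running;

        /** Starts the server and loads whatever is deployed; false means it hung. */
        boolean start() {
            running = loads(deployedBuild);
            return running;
        }

        /** Copies a build into the deployment directory; a running server reloads it. */
        void copy(String build) {
            deployedBuild = build;
            if (running) {
                running = loads(build);
            }
        }

        private static boolean loads(String build) {
            return build == null || !build.endsWith("-broken");
        }
    }

    /** The observed order: start first, copy only if the start succeeded. */
    static boolean startThenCopy(Server server, String build) {
        server.running = false;          // each run begins with a fresh server process
        if (!server.start()) {
            return false;                // server hung, the new files are never copied
        }
        server.copy(build);
        return server.running;
    }

    /** The proposed order: copy first, then start. */
    static boolean copyThenStart(Server server, String build) {
        server.running = false;          // each run begins with a fresh server process
        server.copy(build);
        return server.start();
    }

    public static void main(String[] args) {
        String[] builds = {"v1", "v2-broken", "v3"};
        Server observed = new Server();
        Server proposed = new Server();
        for (String build : builds) {
            System.out.println(build
                    + ": start-then-copy up=" + startThenCopy(observed, build)
                    + ", copy-then-start up=" + copyThenStart(proposed, build));
        }
    }
}
```

With the start-then-copy order, the fixed build `v3` never reaches the server because `v2-broken` is still deployed when the server starts. With copy-then-start, `v3` replaces it before the server ever loads it.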

Do you know why it does not work like this? If you do, please… would you explain it to me?

Thursday, April 23, 2009

Client side caching: Typical omission in server side component models?

It seems like a simple problem, but it is not (to date, I have not been able to find a way to do this without complex JavaScript coding):

  1. You have chained comboboxes: Country and State.
  2. You select USA in the Country combobox, and its 50 States are loaded in the States combobox (roundtrip to the server to fetch them)
  3. You select Mexico in the Country combobox, and its 32 States are loaded in the States combobox (roundtrip to the server to fetch them)
  4. Now you select USA in the Country combobox again... How do I tell the server-side component framework that I do not want it to go to the server for the States, since it already fetched them the last time I selected USA? I want it to use that result as a cache and not go to the server again until I tell it to do so.

I provided this use case as just one illustration of a broader class of client-side caching scenarios. Support for this kind of scenario might not be needed for all use cases; it might not be needed, for example, for user self-registration... Sadly, I do not build that kind of application where I work now. The kind of application I have to build is the kind where the UI is used repeatedly: yes, I build those dreaded "enterprise level" applications used internally by a big organization.

This kind of optimization might seem silly for the typical web application where the user rarely uses the same form more than once, but for enterprise applications, this behavior can be the difference between an application that is perceived to be responsive and useful, and an application that is perceived to be cumbersome and useless.
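The behavior I am asking for is just a cache keyed by country. Here is a minimal sketch of it in plain Java, only to pin down the semantics; in a browser this logic would live in JavaScript, and none of the names below (`StateCache`, `statesFor`, `invalidate`) belong to any of the frameworks mentioned in this post. The lambda stands in for the roundtrip to the server.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class StateCacheSketch {

    /** Remembers the states fetched for each country until told to forget them. */
    static class StateCache {
        private final Map<String, List<String>> statesByCountry = new HashMap<>();
        private final Function<String, List<String>> serverLookup;

        StateCache(Function<String, List<String>> serverLookup) {
            this.serverLookup = serverLookup;
        }

        /** Goes to the "server" only when the country has not been fetched yet. */
        List<String> statesFor(String country) {
            return statesByCountry.computeIfAbsent(country, serverLookup);
        }

        /** Explicitly drops a cached entry so the next lookup hits the server again. */
        void invalidate(String country) {
            statesByCountry.remove(country);
        }
    }

    public static void main(String[] args) {
        int[] roundtrips = {0};
        // Fake server call with a shortened list of states, counting how often it is used.
        StateCache cache = new StateCache(country -> {
            roundtrips[0]++;
            return country.equals("USA")
                    ? Arrays.asList("Alabama", "Alaska")
                    : Arrays.asList("Aguascalientes", "Baja California");
        });

        cache.statesFor("USA");     // roundtrip
        cache.statesFor("Mexico");  // roundtrip
        cache.statesFor("USA");     // served from the cache
        System.out.println("roundtrips: " + roundtrips[0]); // prints "roundtrips: 2"
    }
}
```

The explicit `invalidate` call is the "until I tell it to do so" part: the application, not the framework, decides when the cached list is stale.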

This cannot be done with “user code” in AribaWeb. It cannot be done in JSF, and it cannot be done in ASP.NET. But it is extremely easy to do if you code in JavaScript and use ExtJS or Cappuccino.

I wonder... Is this problem really impossible to solve in a declarative way using server-side component models? Is this really the insurmountable frontier for server-based frameworks? Or could someone create a trick that made this work?

Saturday, April 18, 2009

Inversion of re-render (subscription based re-rendering): Why can it not be this way?

Anyone that has used RichFaces knows that to re-render something, one needs to refer to it by id.

Now, this is (in my opinion) a useful approach but also a very limited one, especially if componentization and code reuse are important goals during development.

Let's say I build a page (let's call it the root page) that has a subview with a modalPanel, which includes a facelet component, which has a subview with a modalPanel that includes another facelet component, and, when something is done there, I want another control, in the root page, to be re-rendered.

Now, I could of course pass along the id of the component I need to be re-rendered, but... what if I need another component to be re-rendered too? Do I pass along its id too? And what if the other component is also inside a (different) subview with a modalPanel that includes a different facelet component? Then all this id "passing" gets really messy, and creates dependencies between components that could otherwise be decoupled. To make things worse, using meaningful ids in JSF is not considered a good practice, because meaningful ids (especially if used in naming containers like the subviews) rapidly increase the size of the pages (because they are concatenated into the ids of all the contained controls), contributing to bandwidth waste.

Now, I have a proposal: what if re-rendering were to work in an "inverted" way? Instead of "A" saying that it will re-render "B", we say that "B" will be re-rendered when something (maybe "A") says so by broadcasting event "C".

This would mean that "A" no longer needs to know the name of "B", and "B" would not need to know the name of "A" either. "B" would only need to know that it should re-render itself if something, somewhere, broadcast the event "C".
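A minimal sketch of that idea in plain Java, as a small publish/subscribe registry. This is not a RichFaces or JSF API: `ReRenderBus`, `Renderable` and the event name `customerSaved` are all hypothetical names used only to show the shape of the mechanism.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReRenderBusSketch {

    /** Anything that knows how to re-render itself ("B" in the text). */
    interface Renderable {
        void reRender();
    }

    /** Maps event names ("C") to the components that asked to be re-rendered. */
    static class ReRenderBus {
        private final Map<String, List<Renderable>> subscribers = new HashMap<>();

        void subscribe(String event, Renderable component) {
            subscribers.computeIfAbsent(event, key -> new ArrayList<>()).add(component);
        }

        /** Called by "A": it names the event, never the components. */
        void broadcast(String event) {
            List<Renderable> targets =
                    subscribers.getOrDefault(event, Collections.<Renderable>emptyList());
            for (Renderable component : targets) {
                component.reRender();
            }
        }
    }

    public static void main(String[] args) {
        ReRenderBus bus = new ReRenderBus();
        List<String> rendered = new ArrayList<>();

        // Two unrelated components subscribe to the same event.
        bus.subscribe("customerSaved", () -> rendered.add("customerTable"));
        bus.subscribe("customerSaved", () -> rendered.add("statusBar"));

        // The nested dialog only broadcasts; it knows no ids.
        bus.broadcast("customerSaved");
        System.out.println(rendered); // prints [customerTable, statusBar]
    }
}
```

Adding a third component to re-render means one more `subscribe` call next to that component, with no change to whatever broadcasts the event.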

Am I making sense here? Is this doable? Or am I not seeing a major limitation in JSF technology that prevents this from being built? (I am no expert in JSF, so I really cannot say, but I do know that an event subscription based re-rendering engine would be a really nice tool to have.)

Monday, March 16, 2009

EntityManager.persist: What does/should it mean?

Let's say you are presented with the following JPA @Entity:

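    import javax.persistence.Column;
    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.GenerationType;
    import javax.persistence.Id;

    @Entity
    public class Customer {

        private Long id;
        private String name;

        // Primary key, generated by whatever strategy the provider picks.
        @Id
        @GeneratedValue(strategy = GenerationType.AUTO)
        public Long getId() {
            return id;
        }

        public void setId(Long id) {
            this.id = id;
        }

        // The name is mandatory: the column does not accept NULL.
        @Column(nullable = false)
        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }
    }

And then the following test: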

    // PERSISTENCE_UNIT_NAME is assumed to be a constant defined in the test class.
    @Test
    public void persistCustomerInTransaction() throws Exception {
        EntityManagerFactory factory = Persistence.createEntityManagerFactory(PERSISTENCE_UNIT_NAME);
        EntityManager em = factory.createEntityManager();

        em.getTransaction().begin();

        // The name is deliberately left null.
        Customer customer = new Customer();

        em.persist(customer);
        em.getTransaction().commit();

        em.close();
    }

And the following question: where does it crash?

  1. At the line em.persist(customer); because the name is configured as nullable=false, and we are trying to persist the Customer instance with a null value in the name property
  2. At the line em.getTransaction().commit(); because the name is configured as nullable=false, and we are trying to commit the transaction with a null value in the name property

Turns out… that the answer depends on the JPA provider you are using! If using Hibernate, it crashes at em.persist, but if using EclipseLink, it crashes at em.getTransaction().commit().

Now, this might seem irrelevant, but it is in fact very important. The EclipseLink behavior means that constraint validations are deferred until the point where the transaction is committed, and that gives the developer a lot more freedom to manipulate persistent objects: they can be in any (possibly invalid) state while they are being manipulated by the business logic of the application, and they only have to be “right” at the moment one needs to commit them to the database (not before). This is especially useful, for example, when building a wizard-like UI (or a plain simple CRUD UI with support for full object graphs). With EclipseLink, I can persist my objects as soon as I want, and if I want to run some validation logic at the end, all I have to do is ask EclipseLink to give me a list of all the objects that are going to be written to the database. With Hibernate, the entityManager does not help me manage new object instances; I have to manage them myself.
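To show what "deferred" buys you, here is a minimal sketch in plain Java. It deliberately does not use the JPA API: `UnitOfWork`, `pendingInserts` and this stripped-down `Customer` are hypothetical stand-ins that only mimic the commit-time behavior described above.

```java
import java.util.ArrayList;
import java.util.List;

class DeferredValidationSketch {

    static class Customer {
        String name; // must not be null by the time we commit
    }

    /** Hypothetical stand-in for a persistence context; not a JPA class. */
    static class UnitOfWork {
        private final List<Customer> pendingInserts = new ArrayList<>();

        /** Deferred style: just register the object, no constraint checks yet. */
        void persist(Customer customer) {
            pendingInserts.add(customer);
        }

        /** Lets application code inspect what is about to be written. */
        List<Customer> pendingInserts() {
            return new ArrayList<>(pendingInserts);
        }

        /** Constraints are only enforced here, at commit time. */
        void commit() {
            for (Customer customer : pendingInserts) {
                if (customer.name == null) {
                    throw new IllegalStateException("name must not be null");
                }
            }
            pendingInserts.clear();
        }
    }

    public static void main(String[] args) {
        UnitOfWork unitOfWork = new UnitOfWork();
        Customer customer = new Customer();

        unitOfWork.persist(customer);   // accepted although name is still null
        customer.name = "Acme";         // e.g. filled in on a later wizard step
        unitOfWork.commit();            // validation happens only now, and passes
        System.out.println("committed");
    }
}
```

An eager implementation would put the null check inside `persist`, which is exactly what forces the application to keep track of half-finished objects on its own.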



What I find really surprising is that, AFAIK, there is supposed to be some kind of test suite to ensure that all JPA implementations are compatible (something called the TCK?), and I think this kind of discrepancy should be detected by those tests... shouldn't it?

I think this support for deferred validation, plus an API to get what will be inserted, updated or deleted, will be very important to really integrate JPA with something like JSR 303, so that validation can really integrate with the lifecycle of a persistent POJO.

Saturday, March 14, 2009

JPA/Hibernate (subjective?) weaknesses

I have had to work with JPA/Hibernate for a few years now... and I feel it has some weaknesses I really do not like (when compared with the first ORM I ever used, NeXTSTEP EOF). I am thinking about "switching to something else", but first I would like to be sure that the ORM I switch to does not have these weaknesses too.
List of (subjective? perhaps I only perceive them because I was exposed to EOF first?) weaknesses in JPA/Hibernate:

  • No way to manage an object that "will be persisted". In JPA/Hibernate, if you call entityManager.persist(object) and the database does not support sequences (like MS SQL Server), an insert will be triggered, and if any of the non-nullable fields of the object is null, it will crash. If your object has a compound primary key, things get worse, because the entityManager cannot deal with it until the compound primary key is set, and if your compound primary key is formed by foreign keys pointing to objects that are new too, you will not be able to save everything with a single call to entityManager.persist on one of the objects; cascading will not help you. (I really miss something "magic" like the single-shot EditingContext.saveAllChanges() in EOF.)
  • No easy way to know if "an object is dirty" (whether it has been changed or deleted since it was read from the database; there is just no API for that). Since you cannot know which objects will be persisted, which objects have changed, and which objects will be deleted from the database, you cannot easily create a unified API for centralized polymorphic validation (that is, there is no easy way to create validateForSave or validateForDelete methods in your persistent entity classes).
  • No real equivalent for validateForXXX. JPA lifecycle callbacks are no match for validateForXXX, because you cannot query the database during the lifecycle callbacks, and if you throw an exception inside a lifecycle callback, the JPA/Hibernate entityManager enters an invalid state. After that you cannot continue to use your POJOs; you have to start over with a fresh entityManager... and without your modifications to the POJOs. Note that Hibernate's new validation framework does not offer a real solution for this problem... and AFAIK JSR 303 will not help with this either.
  • No support for some kind of temporary id. In JPA/Hibernate, the id for an object is "null" until you flush it to the database, so if you need to reference a particular object instance from the user interface... there is plainly no way to do it; you have to "save it first" to get a primary key.
  • No support for Nested Contexts... (I think they would be a perfect fit for conversational frameworks like Seam or Shale). One of the goals of the ObjectContext is to provide an isolated area where local object changes can be performed without affecting other similar areas or the underlying storage. Nested Context changes can be saved to the parent Context without saving them to the database. Such a child context is often called "nested". Nested contexts are useful in many situations, such as nested UI dialogs, complicated workflows, etc.
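To illustrate the last point, here is a minimal sketch of a nested context in plain Java. It is a toy model only: the `EditingContext` class below stores plain key/value pairs, its method names are hypothetical, and it is not the EOF, Cayenne or JPA API.

```java
import java.util.HashMap;
import java.util.Map;

class NestedContextSketch {

    /** Hypothetical editing context that tracks attribute values by key. */
    static class EditingContext {
        private final EditingContext parent;   // null for the root context
        private final Map<String, String> localChanges = new HashMap<>();

        EditingContext(EditingContext parent) {
            this.parent = parent;
        }

        EditingContext createChild() {
            return new EditingContext(this);
        }

        void set(String key, String value) {
            localChanges.put(key, value);
        }

        /** Local changes win; otherwise fall back to the parent chain. */
        String get(String key) {
            if (localChanges.containsKey(key)) {
                return localChanges.get(key);
            }
            return parent == null ? null : parent.get(key);
        }

        /** Pushes local changes one level up; nothing reaches the database here. */
        void saveToParent() {
            if (parent == null) {
                throw new IllegalStateException("the root context has no parent");
            }
            parent.localChanges.putAll(localChanges);
            localChanges.clear();
        }

        /** Throws away local changes, e.g. when a dialog is cancelled. */
        void discard() {
            localChanges.clear();
        }
    }

    public static void main(String[] args) {
        EditingContext root = new EditingContext(null);
        root.set("customer.name", "Acme");

        EditingContext dialog = root.createChild();     // e.g. a nested UI dialog
        dialog.set("customer.name", "Acme Corp");
        System.out.println(root.get("customer.name"));  // prints "Acme"

        dialog.saveToParent();                          // the user pressed OK
        System.out.println(root.get("customer.name"));  // prints "Acme Corp"
    }
}
```

Cancelling the dialog would simply call `discard()` on the child, leaving the parent (and the database) untouched.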

Those are my main disagreements with the way I have to work with JPA/Hibernate... Will switching to JPA/EclipseLink, JPA/OpenJPA or the still not fully JPA-compliant Apache Cayenne help me with those? I will be writing about my findings on this in my following posts.

Wednesday, February 25, 2009

AribaWeb: The framework of my dreams?

Could AribaWeb finally be the framework of my dreams? The one I have longed for ever since I stopped using WebObjects? The one to finally demonstrate to those Ruby guys that what makes Ruby so great is Rails, and that what made Java inferior (until AribaWeb) was its frameworks, not the language itself?

So far, it seems that, while not perfect (no client-side caching, some validation limitations inherited from Hibernate), AribaWeb gets many basic concepts right (like programmatic navigation, UI meta generation, runtime UI customization, MVC separation…)

I am simply amazed…

Now someone just needs to integrate AribaWeb with Cappuccino, and total world domination would be in their hands.