Thursday, June 06, 2013

Architecture Rituals and Patterns: On the Importance of tying up the cat - Part 1

The Zen master and his disciples began their evening meditation. The cat that lived in the monastery made so much noise that it distracted the monks from their practice, so the master ordered that the cat be tied up during the entire evening practice. When the master died years later, the cat continued to be tied up during the meditation session. And when, eventually, the cat died, another cat was brought to the monastery and was tied up during the practice sessions. Centuries later, scholars descendants of the Zen master wrote treatises on the profound spiritual meaning of tying up a cat for meditation practice.

A few years ago, I attended an event organized by Sun to promote Java here in Mexico (I think it was called Java Dev Days). At that time, I entered a presentation, I think it was by Sang Shin (the creator of JavaPassion), in which he explained how simple it was to use JRuby with Ruby on Rails to create a "hello world" page. It was something like this video: http://www.youtube.com/watch?v=sas3KCKwyHI. It was the first time I saw RoR, so on that side, it seemed educational... however, most of the audience... was falling asleep!

That caught my attention... why were they all so bored, from what I had been able to observe the previous days, many of them were enterprise programmers, with the kind of clients that I have had, who give you very little information to work with, demand a lot, and want everything for yesterday..., so I started to listen to the comments that they exchanged quietly (or not so quietly) between yawns:

    Is that simple? It took like 15 minutes to make a hello world screen!
    How boring, I could do that with JSP in 1 minute
    Where does this guy get that this is simple? Did you see the mega directory structure that those commands created? That's not simple at all!
    You can tell this presenter spends his time giving courses, but he's never worked in real life, I'd like to see him with his RoR having to get the job done at 3 in the morning

At least the 2 rows sitting in front of me were not remotely impressed with RoR. And why would they be if in JSP you can make a hello world simply by putting something like this:

<HTML>
<HEAD>
<TITLE>Hello World</TITLE>
</HEAD>
<BODY>
<H1><%= "Hello World!" %> </H1></BODY>
</HTML>

Certainly, arguing that the 15 minutes of the multi-step video are simpler than this is absurd.

Well, some will say, (as part of me thought) the point of the example with RoR is not to show how simple it is to make a silly "Hello World" but to show the potential of RoR's MVC architecture to do much more complex things...

Let me see if I understood... are you telling me that I'm going to complicate my life by making something easy much more difficult and that's going to result in a simplification of the difficult? More complexity today == Less complexity tomorrow? That violates YAGNI. (Yagni means: You aren't going to need it)

And it's not that you have to avoid violating YAGNI just to follow tradition, as in the case of the cat in the monastery. Yagni has a reason for being: in systems development, change is the only constant, and if you get ahead and build something before you have a clear idea of whether you really need it, when you get to the point where you are really going to use it, it is very likely that what you really need will be very different from what you had predicted. This doesn't mean that your code shouldn't be flexible, but you have to be very careful, because you can end up with an over-engineered solution, ready for situations that will never occur. If you use too much energy today, fighting with the ghosts of things that will never come, when the real enemies finally arrive, you won't have the strength to face them.

Now, one of the big problems we have in learning programming is that when we are learning, we have to build examples like a Hello World in RoR to understand how it works. But as instructors, we have to be very careful in saying "look, with this example I'm going to show you how simple this is...". No. You don't make a hello world in RoR because it's simple. Let's not confuse our students. We don't go to the gym to sweat because it's less tiring than staying at home watching TV. We go because by doing so we will improve our skills, we go to prepare ourselves for future situations. So, in programming, not only Yagni rules, we also have another principle: OAOO. (OAOO means, once and only once).

OAOO is the root cause of the apparent complexity of RoR, and the archenemy of copy&paste. OAOO has other names, it is also known as the DRY principle (Do not repeat yourself). It's not good to have copies of the same code scattered throughout the project, because if there is a bug in one of them, and you didn't follow OAOO, you'll have to find and fix them all, instead of solving the problem in one place.

OAOO and YAGNI are like Yin and Yang. They balance each other. Without OAOO, YAGNI would slowly strangle us. With OAOO, you always have a system that you can improve (a phrase by Ron Jeffries).

OAOO alone would also kill us, leading us to an eternal refactoring of code searching for the holy grail of the generic code with the fewest possible lines without ever stopping...

We need both. We need balance.

How many times have you faced code like this in your work:

DAO:
    public List<Product> getProductList() {
        logger.info("Getting products!");
        List<Product> products = getSimpleJdbcTemplate().query(
                "select id, description, price from products",
                new ProductMapper());
        return products;
    }

Service:
    public List<Product> getListOfProducts() {
        logger.info("Getting products!");
        List<Product> products = getProductsDao().getProductList()
        return products;
    }

Controller:
    public List<Product> getProductsToList() {
        logger.info("Getting products!");
        List<Product> products = getProductsServices().getListOfProducts()
        return products;
    }

What value does this add? Why can't we simply do it like this (no Service layer, no DAOs layer):

Controller:
    public List<Product> getProductsToList() {
        logger.info("Getting products!");
        List<Product> products = getSimpleJdbcTemplate().query(
                "select id, description, price from products",
                new ProductMapper());
        return products;
    }

Why is it that if you don't tie up the cat before meditating... I mean, why is it that the principles of multi-layer architecture dictate that it should be done this way... really? Is that really the answer? NO!

If it doesn't satisfy an immediate need and doesn't help OAOO (obviously the redundancy in this doesn't help OAOO), we shouldn't do it. We are burning precious time and energy on "busy work" that nobody will thank us for.

Heresy! Many have told me when I tell them to stop having so many layers just because. Layers are NECESSARY. You can't remove them.

Of course, you can, and I'm not the only one who thinks so. JBoss Seam was designed precisely with the idea of eliminating so much unnecessary layering. And that was YEARS ago. Why hasn't it dawned on most people that making layers for the sake of making layers is not correct? Why do we still tie up the cat and buy a new cat every time we start a new project? Tradition.

Now, traditions are not all bad, (those cultures with traditions that don't help their survival simply become extinct). Where does the idea come from that putting so many layers helps in any way?

I will talk about that in my next post in this series...

(originally published in spanish on Javamexico)

Estimation: Performance Assumptions

A very common request (which often becomes a demand) made by clients is a performance guarantee, things like:

I want all screens/pages to have a maximum response time of 3 seconds.
I want all queries to have a maximum response time of 3 seconds.
I want the system to run on my server (when we're lucky, they tell us the server's specifications).

What to do in such a case? When I had just graduated (more than a decade ago) my response used to be: "I need more data", "With the information I have, I can't guarantee that performance", "I don't even know what the algorithm for the indirect sales commission calculation process consists of, how am I going to guarantee that the screen where I see the result responds in 3 seconds???"

It almost goes without saying that my attitude was the wrong attitude...

My questions seemed logical to me, to give a guarantee of a page's response time of 3 seconds was like being told "you're going to transport a box of indeterminate weight, across an unknown distance, using an unknown mechanism and you have to guarantee that you can do it in 3 seconds".

Of course, my response was: If I don't know the weight, the distance, or the mechanism, I can't guarantee the time....

And the inevitable response from my bosses: Don't tell me why not, tell me "what would have to be true to make it happen"... was irritating to my ears.

But they were right, if you're asked for a guaranteed transport time, and you don't know the weight, the distance, or the mechanism to use, what is expected is not that you say "I lack that data", what is expected is that you say under what conditions you could do it in the specified time (if later the client cannot provide them, then it's not your failure).

Of course, this doesn't mean you shouldn't ask. It's always best to ask. But paradoxically, you should ask, but not worry too much whether or not you get an answer. There will be clients who are interested in you building the right thing. Those clients will try to answer your question by giving you the parameters you require. There will be other clients for whom the "right" thing doesn't matter, the only thing that matters to them is that it's done "correctly" (that it's well done, even if it's not useful to them). This second type of clients will often be irritated by your questions, they might even answer "you're the expert, I don't know about this, you tell me what you need to do it in the time I require"... There will be cases where your client is lucky, and the right thing and the thing done correctly will coincide... but always try to do the right thing, but if your recommendations are ignored, don't get irritated, and switch modes. Don't think that your client is "stupid" or "bad" for demanding the "what needs to be true". Ultimately, they are right, you are supposed to be the expert: tell them "what needs to be true", and let them decide whether or not they have the budget.

Thus, all the missing parameters must be assumed. Let's take the most extreme case, they ask for a response time of 3 seconds, and they give you no other clues... and there are various requirements in the system (such as "display commissions for indirect sales"). You don't know how many sales they have, but you know it's a multinational corporation, and they are certainly not "a few."

What to do? Let's go back to the example of transporting a package:

Weight: Unknown

Distance: Unknown

Mechanism: Unknown

Transport time: 1 hour.

What would you do?

Well, you could, for example, say: In the basket of my bicycle, fits 1 liter of water. And in one hour, on my bicycle, I cover let's say 40 kilometers, then:

Weight: 1 kilo

Dimensions: maximum 15 cm square (the volume that fits in the basket of the bicycle)

Distance: maximum 40 km

Mechanism: Bicycle

Transport time: 1 hour.

Now, the big mistake why assumptions like these are considered the mother of all evils is because the one who assumes them keeps them until the day the client wants to send a package, and it turns out that half a ton of bricks needs to be sent 100km away in a maximum of 2 hours... and there's no way in the universe to do that in the basket of your bike...

And then, comes the typical sermon: "don't assume!" "ask!" (Which becomes particularly annoying because you clearly remember asking the client and they never gave you a clear answer)

Was the error in the assumption? NO.

The error was that you did not make your assumption PUBLIC. You didn't verbally discuss it with the client, you sent it by email and tried to get them to sign a document accepting it. Now, of course, some might think: "I would have to list all the possibilities" — what if the package is 100kg, what if it's 10 square meters, what if I have to deliver it in 15 minutes? That's the wrong approach. The correct approach is to give one example, but very clearly defined.

Of course, some will say, there will be something you overlook: what if the user asks to send 1 kilo of uranium? It meets all the criteria, but the radioactivity will kill you! There's no way to think of everything.

Indeed, there isn't. But that's no excuse not to try to define things as precisely as possible. If you strive to define your assumptions as best you can, they will gradually become more complete, and after a few years, experience will allow you to include enough factors so that the possibility of being asked to transport something that violates your assumptions and causes you harm is very remote. If at the beginning of your career as a developer, 9 out of 10 assumptions lead you into problematic situations, and 10 years later, only 5 out of 10 do, that is a success.

Returning to the scenario that interests us as developers, if you are asked: “I want all queries to have a maximum response time of 3 seconds”

Establish assumptions:

Maximum number of records per query.
Maximum data weight of the query in Megabytes.
Network speed (Mbits/s).
Hardware characteristics of all servers and client workstations involved.
Type of operation (if it's in SQL, maximum number of records, presence of indexes, maximum number of records in the tables, maximum number of joins, version of the pseudo-RDBMS, etc., etc., etc.).

In civil engineering, if an engineer wants to know the strength of a steel bar 10 cm thick and 10 meters long, they can look it up in a catalog...

Isn’t it time we built that kind of knowledge for ourselves?

As much as I've looked for tables with the above data, where someone has tested certain "typical" hardware/software configurations and basic algorithms”, something like:

If you perform a Join between 2 tables with such and such columns in Sql Server 2008R2, on an i5 server, with 8GB of RAM and a disk of certain characteristics, on a network of... and its combinations (after all, we are programmers, we create algorithms, let's make one that generates X combinations).

Will it be a benchmark that works absolutely? Of course not, but that's not the goal, the goal is to be able to offer a reasonable assumption, and give your client the confidence that if the conditions you set are met, you can guarantee a result.

Then, if something else happens along the way... well, it's not your fault, you couldn't control it, and the number of clients who will understand that you didn't meet the goal because they (or the circumstances) violated your assumptions is (at least in my experience) much greater than those who would understand if you tell them from the beginning "why not" instead of "as if."

Remember, assumptions are not bad. If you ask a civil engineer to build your house, you don't expect to answer their questions about the strength of materials, that's not your job. The job of the consultant (whether a software developer or a civil engineer) is to find the conditions under which "it can be done," and in software, fortunately... or unfortunately, almost everything can be done, if you have the right assumptions.

Originally published in Javamexico

The eye sees only what the mind is prepared to comprehend

Thursday, June 06, 2013

Architecture Rituals and Patterns: On the Importance of tying up the cat - Part 1

Estimation: Performance Assumptions

Requirements Analysis: Negative Space

Links

Report Abuse