Saturday, February 21, 2009

What Is Encapsulation? Part II: A Library That Does Nothing Is Nothing

OO design is built on several layers of encapsulation. You combine statements and encapsulate them in a method. You then combine methods and encapsulate them in a class. The next layer is a component, then a library, and finally the system itself.

Each layer should encapsulate more and more details and expose only the absolute essentials to the next layer. A library should therefore encapsulate all (or nearly all) of its details behind a well-defined, well-documented, simple public API. A library that doesn’t encapsulate its details isn’t a library. At best it is simply shared code; at worst it is a disaster waiting to happen.

I have lately seen libraries that contain nothing but interfaces and data-only classes (essentially data structures). Usually these libraries come in pairs: one provides the public, well-defined API, and the other provides the implementation. This makes the two dependent on each other, so they should be thought of as one cohesive library.

There are good uses of this pattern. JDBC, for example, provides an interface library that is part of a specification, and multiple vendors provide their own implementations. Client code uses only the interfaces and classes defined by the spec, which keeps code that uses JDBC independent of any one vendor.

In general, though, the API and the implementation should form a single cohesive library. The code can then evolve as a single unit, and it is easier to change, deploy, and manage.

That isn’t to say libraries shouldn’t use interfaces. One could reasonably argue for exposing all functionality through interfaces; I think that is overkill. Exposing most functionality through interfaces is certainly a good idea, but put those interfaces in the same library as the implementation unless there is a specific technical reason why they can’t be deployed together. Even then, both should be thought of as a single library forced apart by circumstances.
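A minimal Java sketch of that arrangement follows. Every name in it (`ShippingLibrary`, `RateCalculator`, the pricing numbers) is hypothetical and chosen only for illustration. The interface is the public face, and the implementation sits next to it in the same unit but stays hidden from clients.

```java
// Hypothetical library: all names and prices are illustrative, not a real API.
final class ShippingLibrary {

    private ShippingLibrary() {} // not instantiable; acts as the library's front door

    /** The public API: the only type clients are meant to depend on. */
    public interface RateCalculator {
        /** Returns the shipping cost in cents for a parcel of the given weight. */
        long costInCents(int weightInGrams);
    }

    /** Factory method: clients get an implementation without naming its class. */
    public static RateCalculator newRateCalculator() {
        return new FlatPlusWeightCalculator(500, 2);
    }

    /** Implementation detail, invisible to clients and free to change between releases. */
    private static final class FlatPlusWeightCalculator implements RateCalculator {
        private final long baseCents;
        private final long centsPerHundredGrams;

        FlatPlusWeightCalculator(long baseCents, long centsPerHundredGrams) {
            this.baseCents = baseCents;
            this.centsPerHundredGrams = centsPerHundredGrams;
        }

        @Override
        public long costInCents(int weightInGrams) {
            if (weightInGrams <= 0) {
                throw new IllegalArgumentException("weight must be positive");
            }
            // Round the weight up to the next full 100 g before pricing it.
            long units = (weightInGrams + 99) / 100;
            return baseCents + units * centsPerHundredGrams;
        }
    }
}
```

In a real library the interface and the implementation would more likely be separate files in one jar, with the implementation class package-private. The nested classes here just keep the sketch in a single file.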

A while back I had an unpleasant experience with Charter Communications. They would schedule an appointment to install cable, but nobody would show up. This happened 5 times in a row before I gave up and moved to another provider.

The core of the problem was that the customer center was a completely separate department from the service center that did the installations. Some information was shared, but there were no real direct lines of communication. Everyone I talked to on the phone tried to be helpful, but they were limited in what they could do.

Here loose coupling worked against them. The service department implemented the requests gathered from the customers by the customer service department. There is a natural cohesion there, and decoupling them made it hard to provide adequate service to their customers.

In software terms, the customer service library contains the interface for making requests (such as an installation) and the data used by both libraries. The service library implements the interface and consumes the data to do the actual work.

Suppose the service library needs to be modified to provide additional information or better interaction with the customer for some edge cases. That would be difficult, since the customer service code lives in a different library with its own release cycle, its own evolution, and possibly an entirely different development team. One or both of two things are likely to happen. First, users will be allowed to bypass the customer service library and use the service library directly. Second, the service library just won’t provide that behavior, or won’t do an adequate job of it. Neither option is appealing.

Loose coupling is a good trait to have, but only when it is combined with high cohesion. Separating a cohesive unit into two parts will at best force both parts to evolve together, making the separation between them artificial. At worst it will allow the two to diverge, making it impossible to add new features without significant effort. It’s true of departments in a large company, and it’s true of software libraries.



Saturday, February 7, 2009

What Is Encapsulation? Part I: Private Fields Does Not Encapsulation Make

Ever since Hibernate came onto the scene a few years ago, I have been seeing more domain entities that contain nothing but getters and setters. This seems to go against the core principle of object-oriented programming: data and behavior should be encapsulated into a cohesive object that exposes a well-defined interface.

Making fields private does not make the class encapsulated. If a domain entity contains nothing but data, then it is closer to a data structure than to an object. Such a design has a definite procedural undertone to it.
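To make the contrast concrete, here is a small sketch with two hypothetical classes (`AccountData` and `Account` are made up for illustration and come from no real code base). Both keep their field private, but only the second keeps the rule "no overdrafts" together with the data it protects.

```java
// Anemic: the field is private, yet any caller can put the object into any state.
class AccountData {
    private long balanceInCents;

    public long getBalanceInCents() { return balanceInCents; }

    public void setBalanceInCents(long balanceInCents) {
        this.balanceInCents = balanceInCents; // no rule is enforced here
    }
}

// Encapsulated: behavior lives with the data, so the balance cannot go negative.
class Account {
    private long balanceInCents;

    public void deposit(long amountInCents) {
        if (amountInCents <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balanceInCents += amountInCents;
    }

    public void withdraw(long amountInCents) {
        if (amountInCents <= 0) {
            throw new IllegalArgumentException("withdrawal must be positive");
        }
        if (amountInCents > balanceInCents) {
            throw new IllegalStateException("insufficient funds");
        }
        balanceInCents -= amountInCents;
    }

    public long balanceInCents() { return balanceInCents; }
}
```

With `AccountData`, the overdraft check has to be repeated by every caller that touches the balance. With `Account`, it exists in exactly one place.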

Procedural programming may fit your problem better than OO, but more often this kind of design is a mistake, and therefore an anti-pattern. It has even been given a name: Anemic Domain Model. See Martin Fowler’s explanation: http://www.martinfowler.com/bliki/AnemicDomainModel.html.

There are other benefits than the one Fowler mentions. First, it is easier to share core business logic when it is encapsulated in a domain model. Take a company that sells custom T-shirts and hats. The core logic around an order is unlikely to change between the web application for customers and the in-house order-taking application. Encapsulating it behind an Order object makes it much easier to share between the applications.

That is the theory, at least. In practice the in-house system is likely to be on a different platform than the web application, making it difficult to share the domain model. This is the core problem SOA is trying to solve.

The second problem is that business rules often differ between applications. You may need stricter validation on an order placed over the web by a customer than you would on an order placed by a trained staff member.
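One way to picture both points is an `Order` that owns the shared logic while each application supplies its own validation rule. This is only a sketch under assumptions: `Order`, `OrderRule`, `OrderRules`, the limit of 50 items, and the prices are all invented for illustration.

```java
/** A validation rule supplied by each application; hypothetical name. */
interface OrderRule {
    boolean accepts(Order order);
}

/** Shared core logic: what an order contains and what it costs. */
class Order {
    private int totalQuantity;
    private long totalInCents;

    public void add(int quantity, long unitPriceInCents) {
        if (quantity <= 0 || unitPriceInCents < 0) {
            throw new IllegalArgumentException("invalid order line");
        }
        totalQuantity += quantity;
        totalInCents += quantity * unitPriceInCents;
    }

    public int totalQuantity() { return totalQuantity; }

    public long totalInCents() { return totalInCents; }

    /** Applies the caller's rule; the order itself is the same in both applications. */
    public boolean isAcceptedBy(OrderRule rule) {
        return rule.accepts(this);
    }
}

/** Example rules; the limit of 50 items is an assumed value, not a real policy. */
class OrderRules {
    /** Stricter rule for customers ordering over the web. */
    static final OrderRule WEB =
            order -> order.totalQuantity() > 0 && order.totalQuantity() <= 50;

    /** Looser rule for trained staff using the in-house application. */
    static final OrderRule IN_HOUSE =
            order -> order.totalQuantity() > 0;
}
```

A bulk order of 120 shirts would be rejected by `OrderRules.WEB` and accepted by `OrderRules.IN_HOUSE`, while the pricing logic in `Order` is written once.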

Patio Building As Software Design


I’m hiring a contractor to build a patio at my house. It’s going to have a concrete floor and two stone walls. The contractor is handling the details of how the patio is built. I don’t need to know where to get the materials, how to get them on site, how to build the walls, or how to pour the concrete.

The contractor will delegate to the suppliers who will ship the materials, the workers to lay the stones for the walls, and the concrete guys to pour the floor. The contractor will need to coordinate between them as well. The workers will need the correct materials before they can start, and the concrete can’t be poured before the walls are built.

The contractor hides the details of the patio-building process from me and coordinates the different domain experts so that together they successfully build my patio. He doesn’t need to know the details of how to make or ship the materials, how to build the walls, or how to pour the concrete. The experts in those respective steps hide those details from him.

If this were a software system, it might be modeled like this:



I make a request (build a patio) to the contractor (the service layer). The contractor delegates to several experts (domain entities) such as supplier, workers, and concrete guys. The contractor coordinates the work as the experts perform their respective steps.

Since the contractor delegates the responsibility for creating the materials, the supplier can change its manufacturing process (by upgrading equipment, for example) without the contractor needing to know. This allows the supplier to use the same or a similar (subclassed, if you will) manufacturing process to create materials for other contractors, who may be building something completely different (a new multi-story building, perhaps). The supplier can make any change necessary to better accommodate the needs of all of its customers.
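The patio model can be sketched in Java roughly as follows. The type names (`Contractor`, `Supplier`, `Masons`, `ConcreteCrew`) and method names are my own illustrative choices. The contractor knows the order of the steps but not how any step is carried out.

```java
import java.util.ArrayList;
import java.util.List;

/** Domain experts: each hides how it does its own job. All names are illustrative. */
interface Supplier {
    List<String> deliver();
}

interface Masons {
    String buildWalls(List<String> materials);
}

interface ConcreteCrew {
    String pourFloor();
}

/** The service layer: coordinates the experts without knowing their details. */
class Contractor {
    private final Supplier supplier;
    private final Masons masons;
    private final ConcreteCrew concreteCrew;

    Contractor(Supplier supplier, Masons masons, ConcreteCrew concreteCrew) {
        this.supplier = supplier;
        this.masons = masons;
        this.concreteCrew = concreteCrew;
    }

    /** Builds a patio and returns a log of the completed steps, in order. */
    List<String> buildPatio() {
        List<String> log = new ArrayList<>();
        List<String> materials = supplier.deliver(); // materials must arrive first
        log.add("delivered " + materials);
        log.add(masons.buildWalls(materials));       // walls go up before the concrete
        log.add(concreteCrew.pourFloor());
        return log;
    }
}
```

Because `Contractor` depends only on the `Supplier` interface, a supplier that changes its manufacturing process is just a different implementation passed to the constructor. Nothing in `Contractor` has to change.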

Multi-Tier Architecture


I’ve seen the anemic domain model anti-pattern recently in multi-tier architectures. A client and a remote server run in different contexts, and it is difficult to share any behavior in that scenario. This is the reason why DTOs (Data Transfer Objects) were invented. The Anemic Domain Model anti-pattern comes from applying the DTO pattern to Hibernate-persisted entities.

Usually the Hibernate objects are themselves passed to the client. This makes putting logic in them difficult, since most of it can’t run on the client. Under these circumstances, keeping most of the logic in the DAO (Data Access Object) layer becomes the path of least resistance. The DAO is a subset of the service layer, and this is exactly the kind of problem Fowler warns against. It forces a more procedural design on your architecture. Fowler and many other OO advocates strongly advise against multi-tier architectures for this very reason.
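For reference, here is a minimal sketch of what a DTO looks like next to an entity that keeps its behavior. `Customer`, `CustomerDto`, and the threshold of 1000 points are hypothetical, and no Hibernate mapping is shown. It only illustrates the pattern; whether it is practical in a given multi-tier setup is a separate question.

```java
import java.io.Serializable;

/** DTO: immutable data only, so it can be serialized and sent to a remote client. */
final class CustomerDto implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String name;
    private final boolean preferred;

    CustomerDto(String name, boolean preferred) {
        this.name = name;
        this.preferred = preferred;
    }

    String getName() { return name; }

    boolean isPreferred() { return preferred; }
}

/** Server-side entity: owns the rule that decides who counts as preferred. */
class Customer {
    private static final long PREFERRED_THRESHOLD = 1000; // assumed value, for illustration

    private final String name;
    private long loyaltyPoints;

    Customer(String name, long loyaltyPoints) {
        this.name = name;
        this.loyaltyPoints = loyaltyPoints;
    }

    void award(long points) {
        if (points <= 0) {
            throw new IllegalArgumentException("points must be positive");
        }
        loyaltyPoints += points;
    }

    boolean isPreferred() { return loyaltyPoints >= PREFERRED_THRESHOLD; }

    /** Copies out only what the client needs; the rule itself stays on the server. */
    CustomerDto toDto() {
        return new CustomerDto(name, isPreferred());
    }
}
```

The client receives the result of the rule (`isPreferred`) rather than the raw points, so the logic does not have to be duplicated on the other side of the wire.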

Don’t go overboard!


A perfectly encapsulated class would not allow any access to any data it contains, nor would it send the data to any other objects it collaborates with. This is nigh impossible in the real world, unless you have one object that does everything. I, however, wouldn’t recommend that kind of design.

OO is a theory. Applying that theory in the real world means compromising. It is important to understand the theory, the benefits it brings, and how those benefits are derived. When compromising on the principles of OO, it is important to know what you are giving up. Don’t stick to the dogma of any particular paradigm if the consequences outweigh the benefits.

Java is built to be an OO language. That doesn’t mean procedural concepts can’t or shouldn’t be applied, but the cost of applying them will be higher. OO should be the first choice, and the consequences of any deviation should be weighed carefully.

Conclusion


You may have noticed that I didn't give a solution to the anemic domain model anti-pattern in multi-tier environments. I'm more curious about what you think. Is there a good way of avoiding this problem? What is it?

Also leave your rating for this post: +1 Ted Neward (brilliant) or +1 Britney Spears (stupid)
