November 2005

Monthly Archive

Test-Driven Development on Legacy Code

Posted by on 29 Nov 2005 | Tagged as: Java

After working on many projects in which I was not the original developer, configurator, tester, or whipping boy, I have come to some realizations regarding legacy system maintenance. Since this is merely a blog entry, I am not inclined to map out legacy system maintenance techniques for the large set of systems out there in the wild. If you want to research this in gory detail, check out a book (which I have not read, but which was recommended to me by Paul Dupuy) called “Working Effectively with Legacy Code” by Michael C. Feathers.

When we begin a project that contains legacy code, tools, and deployment situations, a common path is followed to tackle the seemingly insurmountable project learning curve. This path consists of the following:

  • Build and Configure the Project – If you are lucky, there will be existing build scripts which actually work with little to no configuration needed. I have come across this situation once. In that project there was a build.xml for use with Ant and an example with an acceptable amount of documentation. Unfortunately, this has been the exception rather than the rule. Usually there are complicated build files, oddly distributed configuration files, undocumented build tools, and a lack of consistency due to the multiple directions the project has taken over its lifetime.
  • Deploy the Project – There is the possibility that the resulting executable from your build and configure phase is something that you launch directly or that runs inside a container without much difficulty. Again, this has not been my experience. If the project is of any complexity, there tend to be multiple modules which have their own runtime configurations, protocols, and security policies. If you are diligent and accepting, you may attempt to scan page upon page of documentation hunting for clues about the runtime configuration. You may even be fortunate enough to have a document called “Installation Guide”. After some investigation, I have found that most often these documents are out of sync with the current product. Without documentation generators, it is nearly impossible to keep your documentation up to date with your products. This is even more prevalent when your project has evolved over many years.
  • Implement Bug Fixes and/or Enhancements – The customer has now informed you that they need a quick modification to their production system. The fix is just a slight modification to one part of the user interface. Their expectations are running high and your jaw just hit the floor. We politely tell the customer that we will let them know how long it will take after some investigation. The fix seems quite easy: just add that field to this page, or allow for editing of this data after it has been created. You scan the code trying to find out where to make the modification. It's been a couple of days, and you have made a few somewhat educated attempts at pinpointing the bug, but it doesn't look like the current data access will allow for your small modification, either because the requested field cannot be joined into the result set or because modifying that field could cause corruption. And even if the modification appears to have worked, you may find out later through conversations with the customer that the data you read or modified is not the correct data. Maybe it was a historical event copied from the correct data, or it is just a slightly different set of data which looked quite similar to the data you used.
  • Fix Bugs Introduced by Bug Fix or Enhancement – So now you have made your bug fix or enhancement and even put some unit tests around the components involved. A build is created and deployed into the QA environment, and all of a sudden there are three new bugs. At this point I am usually thinking, “What could have happened? I thought my unit tests would break if there was a problem created by my changes.” We proceed to engage the code again and find out that another class or module expected the data to be in a specific state when accessing it. Over time, this part of the process makes most developers gun-shy if they weren't already from previous experiences.
  • Rinse and Repeat – The previous two procedures are repeated over and over, and the frustration tends to wear on teams. Even the modifications which appear to be simple lead to more problems. We attempt to make the situation better by refactoring and introducing better management tools, and it may even get “better” over an extended period of time.
  • Deploy to Production – If your first attempt at deploying a legacy system update into production goes without a hitch, it is almost too good to be true. I have actually had this happen before and was so surprised by it that I stayed around to figure out if I could find any problems deep in the bowels of the system. The reason I was so paranoid was that, leading up to the actual deployment, we had so many problems in the QA and staging environments that I was not prepared for success.
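
As an aside on the first bullet: the rare well-kept build.xml can be surprisingly small. As a hedged illustration (the project name, directory layout, and targets here are all assumptions, not taken from any project described above), a minimal documented Ant build might look like this:

```xml
<!-- Minimal Ant build sketch; project name, paths, and targets are illustrative -->
<project name="legacy-app" default="dist" basedir=".">
    <property name="src.dir" value="src"/>
    <property name="build.dir" value="build"/>
    <property name="dist.dir" value="dist"/>

    <target name="clean" description="Remove generated files">
        <delete dir="${build.dir}"/>
        <delete dir="${dist.dir}"/>
    </target>

    <target name="compile" description="Compile all Java sources">
        <mkdir dir="${build.dir}"/>
        <javac srcdir="${src.dir}" destdir="${build.dir}"/>
    </target>

    <target name="dist" depends="compile" description="Package the application">
        <mkdir dir="${dist.dir}"/>
        <jar destfile="${dist.dir}/legacy-app.jar" basedir="${build.dir}"/>
    </target>
</project>
```

The `description` attributes matter more than they look: `ant -projecthelp` turns them into the self-documentation that most legacy builds lack.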

With the recent rise of TDD (Test-Driven Development) as a way to manage complexity and quality within systems, it only seemed natural to use it as a way to manage legacy system maintenance issues. If you have tried such measures in your own legacy system maintenance efforts, you may already know that this is a daunting feat. There are many issues to overcome when writing test harnesses around legacy code, such as:

  • Extremely Long Methods/Functions – Extremely long methods/functions can cause even the smallest of modifications to take a relative eternity. In order to combat this we must use tools such as those found in Joshua Kerievsky's “Refactoring to Patterns”. One possible antidote for this, according to the book, is the “Compose Method” pattern. I believe that many of us, if not all, have used this even when it did not have a name. The basic premise is that if you break up an extremely long method into smaller chunks, it will be easier to modify and maintain. This seems like a simple pattern, but it can produce major improvements in overall maintainability.
  • Coupling Between Components of the System – Projects evolve over time with many different developers and teams implementing components using their own methods. All of these differing methods tend to complicate and corrupt the system component topology. In order to fix that urgent bug, corners are cut and left for somebody else to deal with later. Also, inexperienced developers, and sometimes even experienced developers, may implement components that introduce coupling through subtle antipatterns or code smells.
  • Brittle Boundaries Between Layers – The promise of SOA (Service-Oriented Architecture), in my opinion, is to promote loose coupling between services such as EJB session beans, SpringFramework services, web services, and grid services. This may be done through small, concise boundary definitions using Java interfaces, dependency injection, WSDL, and other configuration methods such as those found in JNDI and Jini. In the past, I have seen boundary definitions which I believe completely conflict with this promise, such as using primitives as return types from service methods. In one such case, the client needed to know too much about the contents of the returned string. If the string was null or empty, the client could proceed successfully. If not, the client was expected to throw an exception with the returned string in the message. In a refactoring effort on this offending code base, we created DTOs (Data Transfer Objects) and business exceptions, which allowed us to provide a boundary definition that could send metadata regarding the entire service request and all of the issues which needed to be resolved in order to be successful.
  • Out of Date Languages/Tools – One great thing about the software development industry today is that we are continually finding better ways to get our work done. Over time we introduce new ways to perform builds, interact with our SCM (Software Configuration Management) system, and integrate with a relational database. The problem is that we do not yet have all of the tools which will make us even more productive in the future. Therefore, when we deal with a legacy system, we are many times faced with upgrading parts of the system which are not immediately related to business value but will help the business maintain the system faster in the future. These may include build technologies such as Ant, project management tools such as Maven, and IDEs (Integrated Development Environments) such as Eclipse or NetBeans.
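
To make the “Compose Method” idea from the first bullet concrete, here is a minimal Java sketch. The class, methods, and business rules are hypothetical, invented purely for illustration; the point is that the top-level method reads as a summary of the policy once the long body is broken into intention-revealing helpers.

```java
// Compose Method sketch: a once-long method body broken into small,
// intention-revealing helpers. All names and rules are illustrative.
class OrderValidator {

    // The top-level method now reads as a summary of the policy.
    boolean isEligibleForDiscount(double total, int itemCount, boolean preferredCustomer) {
        return meetsMinimumTotal(total)
                && hasEnoughItems(itemCount)
                && preferredCustomer;
    }

    private boolean meetsMinimumTotal(double total) {
        return total >= 100.0; // assumed business rule
    }

    private boolean hasEnoughItems(int itemCount) {
        return itemCount >= 3; // assumed business rule
    }
}
```

Each helper is now small enough to surround with a unit test on its own, which is exactly what a legacy harness needs.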
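
The primitive-return boundary described in the “Brittle Boundaries” bullet can also be sketched after the refactoring. The service, DTO, and exception below are hypothetical stand-ins, assuming the convention from the story (an empty string meant success, anything else was an error message):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of replacing a primitive return (empty string = success,
// anything else = error text) with a DTO and a business exception.
// All names, fields, and rules are illustrative.
class ServiceException extends Exception {
    ServiceException(String message) { super(message); }
}

// DTO carrying metadata about the whole service request.
class OrderResult {
    private final String orderId;
    private final List<String> warnings = new ArrayList<String>();

    OrderResult(String orderId) { this.orderId = orderId; }
    String getOrderId() { return orderId; }
    List<String> getWarnings() { return warnings; }
}

class OrderService {
    // Before: String submit(...) returned "" on success.
    // After: success returns a DTO; failure raises a typed exception.
    OrderResult submit(String customerId) throws ServiceException {
        if (customerId == null || customerId.length() == 0) {
            throw new ServiceException("customer id is required");
        }
        return new OrderResult("ORD-1"); // id generation is assumed
    }
}
```

The client no longer parses strings for meaning: the compiler enforces the boundary, and the DTO has room to grow without breaking callers.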

Recently, I was informed about a set of tools which have changed my software development methods forever. Paul Dupuy, who also recommended the “Working Effectively with Legacy Code” book, introduced me to the “acceptance test” category of tools. These tools are not necessarily for the end of a project iteration, but are used to improve communication between the business customer and the team before implementing features or bug fixes. Here is a listing of the tools:

  • Fit – Created by Ward Cunningham “for enhancing collaboration in software development”
  • FitNesse – “The fully integrated standalone wiki, and acceptance testing framework”
  • Selenium – Created by ThoughtWorks to test web applications

These tools are essential for delivering features and understanding when the user story and/or feature is finished from the customer's perspective. We have found this to be an effective means for discussing the details of a user story with the customer. By developing acceptance tests with the customer which portray the intended results, we are able to definitively show when the story is finished. “Finished” is a bit misleading, but if you think of a user story as a description based on your current intentions or understanding, then a user story may finish, but a new user story could be introduced to replace or enhance the original story.
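
The heart of these tools is a table of examples agreed on with the customer. As a minimal sketch of that column-fixture idea, with none of the frameworks above involved, each row pairs an input with the result the customer expects; the discount rule and figures here are invented for illustration:

```java
// The idea behind a Fit-style column fixture, without the framework:
// each row holds an input agreed upon with the customer and the
// expected result. The discount rule and figures are illustrative.
class DiscountAcceptance {

    static double discountedTotal(double total) {
        // Assumed business rule: 10% off orders of 100.0 or more.
        return total >= 100.0 ? total * 0.9 : total;
    }

    // Rows: { input total, expected discounted total }
    static final double[][] TABLE = {
        { 50.0, 50.0 },
        { 100.0, 90.0 },
        { 200.0, 180.0 },
    };

    // Returns true only when every row passes, like a green fixture table.
    static boolean runTable() {
        for (double[] row : TABLE) {
            if (Math.abs(discountedTotal(row[0]) - row[1]) > 1e-9) {
                return false;
            }
        }
        return true;
    }
}
```

Fit and FitNesse let the customer own the table itself (in HTML or a wiki page) while the team supplies only the fixture code that connects rows to the system under test.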

Now, what does this have to do with legacy system maintenance? Earlier in this blog entry I asserted a typical path for dealing with legacy systems. With the new understanding about acceptance test frameworks I believe that we can now reconstruct this path to be more efficient. Here is the updated path:

  • Build, Configure, and Deploy the Project – We are still going to be faced with the learning curve of getting familiar with the legacy system project. I would like to hear any ideas on getting around this one. I think that the best antidote so far is applied during the earlier phases of the project, before we even receive it for legacy maintenance. Using tools such as those found in agile methodologies is a good start for any prolonged project maintenance cycle.
  • Update Build Scripts and Introduce Continuous Integration – In my opinion, continuous integration is an essential component in developing high quality software. This means that the faster you get your project decorated with automated builds and tests, the better quality output and success your project will have. The value of this initial effort towards high quality and efficiency should be promoted from the start of the legacy maintenance project. Taking a practical approach to continuous integration can make it more palatable to the business customer if they are not already familiar with its benefits. You may decide initially to just use automated builds with basic email notification to the team when problems occur. This alone can be helpful in determining when and where a bug was introduced. Once this is in place, you may be able to add in automated unit, integration, acceptance, and deployment tests.
  • Create Acceptance Tests with Customer for Bug Fix or Enhancement – As I asserted previously, surrounding legacy system components with unit tests where there were none before is a daunting task. I have found that most legacy system maintenance projects have a large learning curve before a newly introduced team can be productive. In many cases, we only have the running application and the users themselves as a resource for bug fixes and enhancements. The acceptance testing tools mentioned above are quite helpful in mapping out the current behavior and creating tests for the expected behavior once the bug or enhancement is completed. Notice that I ended the previous sentence with the word “completed”. What percentage of the time on your current project do you actually feel like you have completed a task before the customer even looks at the results? This leads into our next step…
  • Implement Bug Fix or Enhancement – Since the customer has already helped you identify what “completed” means, we can feel confident that the change we made is what they wanted. Just run the acceptance test suite against your bug fix or enhancement.
  • Deploy to Production using Scripts and Acceptance Tests – Over time, we may create deployment scripts and enhance the current procedures to prevent manual process variability from inhibiting successful deployment. These should be integrated into your continuous integration procedures and tested using acceptance tests which were created with the system administrator who is deploying your legacy system updates.

I am sure that I have not thought of all the permutations in legacy system maintenance, and am well aware of the added complexity that system integration issues add into the fray. I believe that acceptance testing frameworks can help in combating this complexity, along with effective software development methodologies, diligent continuous integration procedures, and the creation of supporting utilities such as test fixtures and performance monitoring tools.

Please comment on this blog entry if you want to add, modify, or question anything contained within. I am eager to hear other experiences and ideas on how we can make our legacy system maintenance better.

SOA is for Kids

Posted by on 29 Nov 2005 | Tagged as: Distributed Computing

Not that I wish to be apologetic regarding the title of this entry, but I feel it needs some explanation. The title is by no means intended to suggest that SOA (Service-Oriented Architecture) is simple or without merit. The intent of this entry is to assert the importance of SOA as a means to improve enterprise architectures and reduce their extended maintenance complexity. Given this assertion, I contend that SOA must be a critical part of computer science programs and corporate training strategies for software development. Also, SOA is extremely complementary to agile methodologies in the delivery of high quality and valuable software assets in small, calculated chunks.

The rise of layered architectures, such as those presented in J2EE blueprints and MVC (Model View Controller) based frameworks like Spring and Struts, has been integral to a design which contains a middle layer for business logic. This layer, also known as the “service layer”, provides valuable support and accelerates delivery of business products and services. A problem which has persisted with these architectures is their lack of complete service decoupling. Teams must combat coupled designs through slices of the layered architecture with pre-defined interface documentation, technical requirements documents, coding standards, and code reviews.

In my opinion, the frameworks mentioned above are not at fault. I believe that the problem of service coupling is due to how we have trained developers in application design over the past few years. A shift in software development design must be made, such as those made when structured, function prototype, and object-oriented design were introduced. I should mention a friend of mine, Scott Came, who wrote a white paper, “A View-Oriented Approach to SOA”, which attempts to define an appropriate method for SOA design. I agree with many of the points made in his white paper and would like to describe effective programming models which facilitate good SOA design.

The following image depicts the high level systems within an SOA implementation:

SOA Overview

Each component within the SOA has a role in the effective design:

  • Service Registry – This is a location where clients may find services they are interested in using. Examples of a service registry are UDDI, Jini's Reggie, and Spring XML configuration descriptors.
  • Service Proxy – This is the interface defined for communicating with your service. This may be WSDL compiled to Java or C#. The proxy may be as smart or dumb as your service requires. RMI uses downloadable serialized proxy objects to communicate with a remote service. In Spring, you use the AOP proxy object created at runtime from an XML configuration file.
  • Deployment Configuration – A deployment configuration allows a service to be configured and used dynamically at runtime. In Jini, you might configure your service to use start.jar with a net.jini.config.ConfigurationFile instance which defines the communication protocol to use between your service proxy and the service implementation. Another example is the J2EE application descriptors, such as those used in EJB configuration and JNDI name binding.
  • Message – The data sent to your service for processing such as XML for SOAP or REST web services, Java objects for RMI, or JMS messages into a queue.
  • Service Grid / Bus – In many distributed systems, containers are used to manage objects across a grid or bus. J2EE defines an EJB container for managing services and entities. Rio is a grid fabric built on top of Jini which manages JSBs (Jini Service Beans). ESBs (Enterprise Service Buses) usually have service containers, which have been standardized in the Java community through the JBI (Java Business Integration) specification.
  • Service Interface / Implementation – The service interface defines how clients may communicate messages with a service. This can be defined using WSDL, Java interfaces or POJOs (plain old Java objects), or IDL in CORBA. The implementation of this service is the message handler.
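
Two of the roles above, the service interface/implementation and the service proxy, can be sketched with nothing but the JDK. Here a java.lang.reflect.Proxy stands in for the generated or downloaded proxies mentioned in the list; the service and all its names are hypothetical:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Service interface: the boundary that clients compile against.
interface GreetingService {
    String greet(String name);
}

// Service implementation: the message handler behind the boundary.
class GreetingServiceImpl implements GreetingService {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

class ProxyDemo {
    // A JDK dynamic proxy standing in for the service proxy role: the
    // client sees only the interface, and the handler could add
    // remoting, logging, or transactions before delegating.
    static GreetingService proxyFor(final GreetingService target) {
        return (GreetingService) Proxy.newProxyInstance(
                GreetingService.class.getClassLoader(),
                new Class<?>[] { GreetingService.class },
                new InvocationHandler() {
                    public Object invoke(Object proxy, Method method, Object[] args)
                            throws Throwable {
                        // Cross-cutting concerns would go here.
                        return method.invoke(target, args);
                    }
                });
    }
}
```

This is essentially how Spring's runtime AOP proxies and RMI stubs present themselves to clients: the interface is the only contract, so the implementation behind it can move or change freely.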

In my experience, there is an initial barrier to entry for SOA development which seems obvious at first glance but derails many developers, including myself at one time or another. When creating a desktop or console application, most of the time we are working with in-memory models which can be passed around in graphs easily between components. SOA-based applications or systems revolve more around message passing between clients and services.

The difference in paradigm brings some different design philosophies, such as the amount of data and the frequency with which we communicate between components or services of the system. We might not think twice about making a call for each item added to our order if it is inside our desktop application. If we are adding these items to our order over the network, we may fall prey to “The Eight Fallacies of Distributed Computing”. A possible way to combat this is a local persistent cache and guaranteed delivery options such as those found in JMS.
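
The order example can be made concrete with a call counter. The stub below is hypothetical; it only demonstrates that the chatty style pays one (potentially remote) round trip per item, while the batched style sends the whole message at once:

```java
import java.util.List;

// Chatty vs. batched service calls. In a remote service each call pays
// network latency, so the number of calls matters. Names and the
// string-based item type are illustrative.
class OrderServiceStub {
    int remoteCalls = 0;

    // Chatty: one (potentially remote) call per item.
    void addItem(String item) {
        remoteCalls++;
    }

    // Batched: one call carries the whole message.
    void addItems(List<String> items) {
        remoteCalls++;
    }
}
```

Adding three items costs three round trips in the chatty style and one in the batched style; over a real network, that difference dominates the design.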

Many times, SOA-based applications or systems are a step away from overly evolved OO designs. Since a service tends to contain finer grained slices of functionality, developers can focus more of their efforts on handling the task at hand. For example, adding an item to an order may only entail finding the order by a unique identifier and then adding the specific item passed into the service via a message. In a service grid or bus there may be any number of services which can perform this action. And all of those services may just be copies of each other and therefore handle the message in the same manner.

To be continued…

Role of the Bit Flipper

Posted by on 29 Nov 2005 | Tagged as: Java

What is the role of the “bit flipper” nowadays? We all have friends who seem to be the best programmers you have ever seen. You may even be that person who everybody says is a great programmer. It's the programmer who can come up with elegant algorithms to solve any issue. Squeezes out the most performance possible from their system functions. Actually attempts to win encryption-breaking contests. With the recent updates in software development which have paved our way toward systems integration and further away from the actual bits, does a “bit flipper” stand a chance?

I have a theory that “bit flipper” types have an enormous chance to become quite valuable in these recent software development practices. I think the key is in automated testing and especially test-driven development. I believe this may be the last frontier for manipulating bits and solving interesting algorithms. In the world of systems integration we tend to work on business level objectives and logic. In the testing of these systems there is still a need for creative thinking and extremely performant harnesses.

What do other people think about the role of the “bit flipper”?