I did a study of the size and effort metrics and the performance of an application developed on Rails versus one developed on Struts. Here are the observations:
Please note that the performance figures are not an exact measure and need to be considered with these points in perspective. There might be scope for improving the performance of the application, as no performance tuning was performed on it. In addition, the following points need to be taken into account:
- Ruby VM: Ruby is currently an interpreted language, but a project (code-named Rite) plans to make it byte-compiled, which would improve the execution speed of Ruby-based programs.
- Framework method tuning: The Rails framework has been around for only a little more than a year. Work is underway to profile its various components, especially Active Record and the helper methods. This tuning effort will further improve the performance of applications using Rails.
- Choice of web server: WEBrick was used as the web server for this application. Apache and lighttpd are the suggested web servers for better performance.
-Ashish.
Last week saw techies from all around the country and outside flocking to the International Convention Center in Hyderabad. It was the Sun Tech Days symposium, a worldwide developer event organized by Sun Microsystems and touted as the mini Sun One conference! With 3000+ attendees, it almost seemed like all roads led to Novotel, Hitex. Although far from the heart of the city, the venue was aptly located very close to Hi-Tech City, thus gaining a lot of visibility from the various IT companies. Both the ICC and Sun representatives did a commendable job of managing the crowd, the time, the booths, the technical sessions, and the whole event as such. To top it all, there was good food and gifts galore! An event of this scale can of course not do without sponsors, and the list included some of the big names from the industry, like AMD, Oracle, VMware, Accenture, SAS, and NIIT.
The technical sessions were divided into multiple tracks aligned with the technologies - J2SE, J2EE, J2ME, and OpenSolaris. While on the first two days the attendees were free to choose between any of the sessions running in parallel across these tracks, on the third day they had to choose one of the NetBeans, J2ME, and OpenSolaris tracks. The sessions on ‘Java scripting’ and ‘JRuby’ in the J2SE track were quite interesting and, having worked on Ruby, I found them quite appealing. It was interesting to learn that the Java platform is no longer restricted to the Java language alone but supports various scripting languages like JRuby, JavaScript, Groovy, and Jython. The session on ‘JMX and concurrency’ talked about the new concurrency API in JDK 5.0 and its rich features that address the issues with the old Thread support. The JMX API, which was already a part of J2EE 1.4, has now been adopted into the core Java platform.
Among the more interesting sessions (and among the ones that I attended!) in the J2EE track, was the one on J2EE 5 and Glassfish. Apart from the technical talk on the new EJB 3.0, persistence API, JAX-WS and JAXB, the speaker encouraged the attendees to be a part of the Glassfish community and contribute to the development of the reference implementation.
The booth on Java DB, an Apache Derby based project, introduced the Java-based database. It is best suited for small and medium scale applications and can be bundled along with the application - an ideal solution for sharing database-backed applications. The session on SPOTs was very impressive. The Small Programmable Object Technology aims at using Java for programming devices with embedded chips… something that was in the domain of C all this while. The demo showing three robots trying to chase each other received a loud applause from the audience. VMware gave a presentation on their virtualization technology, which apparently is receiving a lot of attention and popularity. The technology evolved out of exploring solutions to the underutilization of hardware. The presentation introduced the idea of treating one physical system as multiple virtual machines and then focused on its impact on cost, network maintenance, and availability. Their VMotion technology, which allows runtime movement of virtual machines (and therefore the applications running on them) from one system to another without requiring the systems to be down, indeed has a lot of potential.
On the third day, as part of the OpenSolaris track, there was an interesting talk on OpenGrok. OpenGrok is a tool that indexes a given codebase, allowing developers to browse and search it. Although the intent behind its development was to enable quick and easy browsing of the proliferating open source code on the internet without having to set up a development environment, it can be used for any application codebase in our day-to-day projects. What's more, it also has support for dropping the ‘opengrok’ed code into a web server and browsing the code over the network!
I must say that it was a good learning experience and a forum to meet other developers and exchange ideas with them. And did I mention the discounts on technical books and journals! :)
The sessions can be downloaded from the Sun Tech Days page on the Sun website:
http://developers.sun.com/events/techdays/index.jsp
- Ashish
I hit upon closures first (well, at least I thought so, until I learned what they were and realized that I had used them before) when coding in Ruby. Closures are blocks of code that can be passed as arguments to other methods. In this sense they are similar to function pointers in C, but closures are a tad more powerful in that they also capture the variables in the lexical scope where they are defined and extend their lifetime. Refer to http://www.eclipsezone.com/eclipse/forums/t86911.html for a precise definition.
I recently completed a project involving Ruby on Rails. Having worked with Java for the last three years, I could not help comparing the two. I wondered whether any of the language and framework features could be implemented in Java and the numerous frameworks based on it. Blocks and closures are impressive language constructs that allow customization and extension of other language constructs (like the looping constructs) and APIs.
Gafter, Gosling, et al.'s proposal to add closures to Java seems to be a good bet. It could put an end to the addition of special-purpose statements like the enhanced 'for each' loop that was added in J2SE 5.0.
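To see the gap the proposal addresses, here is a minimal sketch of how a Ruby-style block has to be simulated in today's Java; the Block interface and its name are my own invention, not part of any proposal or API:

```java
import java.util.Arrays;
import java.util.List;

public class ClosureDemo {
    // A minimal "block" interface -- a hypothetical stand-in for
    // what a first-class closure type might offer.
    interface Block<T> {
        void call(T item);
    }

    // A Ruby-style 'each' that hands every element to the block.
    static <T> void each(List<T> list, Block<T> block) {
        for (T item : list) {
            block.call(item);
        }
    }

    static String join(List<String> items) {
        // Today's workaround: an anonymous inner class. It can only
        // read enclosing local variables declared final, and cannot
        // assign to them -- a restriction the closures proposal lifts.
        final StringBuilder out = new StringBuilder();
        each(items, new Block<String>() {
            public void call(String s) {
                out.append(s);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("a", "b", "c"))); // prints "abc"
    }
}
```

The boilerplate of the anonymous class is exactly what makes Ruby's one-line blocks so attractive by comparison.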
I am keeping an eye on the numerous forums and blogs abuzz with discussions on Java and Ruby. It's interesting how Ruby, and more importantly the Rails framework, is causing such a flutter.
Pointer:
http://www.javac.info/closures-v04.html
-Ashish.
I got hold of this stone when I attended a presentation organized at geeknight at ThoughtWorks. The presenters (ThoughtWorkers) were highly enthusiastic about a project they had just developed using the Rails framework. I had only vaguely heard about this new open source framework and had put it aside as another addition to the flurry of web frameworks already in use. The claim that it achieves ten times faster development appeared fantastic - a conclusion seemingly drawn from a one-off occurrence! It was only when I attended this presentation that I came to appreciate the beauty of this precious framework (read: stone :)) and how it achieves near-zero turnaround time.
So what is Ruby and what is Rails? And what then is Ruby on Rails!
Ruby is a dynamically typed, object-oriented programming language inspired by languages like Perl and Smalltalk. And Rails is an open source web framework developed in Ruby. Within a short period, it has gained popularity for its reduced development time and higher productivity. It is known to be one of the most well-thought-out frameworks in existence today.
Rails’ support for agile web development can be attributed to various features and its guiding principles - Don't Repeat Yourself (DRY) and Convention over Configuration. Database-backed enterprise applications developed today have quite a few things in common: a UI component, a controller component, a model component, and an ORM layer. Frameworks like Struts and Spring, while providing these components, leave the configuration part to the developers. The developers have to do the menial task of building the configurations and the bunch of XML files common to any application. This leads to a copy-paste mentality and repetitive code across applications. It is precisely this that Rails tries to eliminate.
Ruby by its very design makes it easy to create domain-specific languages and metaprograms, leaving developers to focus on just the business logic. Also, Rails has its own conventions for integrating the various components that form part of this full-stack framework (oh yes, it's a full-stack framework, so there is no need to manage the idiosyncrasies of different frameworks at different layers). Thus, no more XML configuration files! Talking of the full-stack support, Rails has various components that map to the M, V, and C. Active Record maps to the model: the programmer is only required to subclass the ActiveRecord::Base class, and the framework by itself determines the table and column details! The view is implemented by Embedded Ruby, with syntax close to JSP. The controller is taken care of by the Action Pack classes.
What's more, new technologies like Ajax and Web Services have been integrated with Rails, making it convenient to implement applications requiring their use. Among its other features, good programming practices built into the framework result in easily maintainable code.
All this might sound like I am painting a rosy picture but try it out now... get your hands dirty and you will see that they don't get all that dirty after all :-). Of course RoR can't be without its drawbacks; nevertheless its promising features were convincing enough for me to choose it as the topic of my final year MS dissertation!
Pointers-
http://www.rubyonrails.org/
http://wiki.rubyonrails.com/rails
-Ashish.
Sudoku has become the buzzword around. Young or old, everyone seems to have been enchanted by this apparently simple number game. Like most of us, I stumbled upon this game a few days back in the Times, and have become an avid fan ever since. The only game that had driven me nuts like this before was the Rubik's cube! Basically of Japanese origin (or is it US!), the aim of the game is pretty straightforward: you have to arrange the numbers 1 - 9 in a 9 x 9 square such that each number appears exactly once in every row, every column, and every 3 x 3 block. [Something similar to Euler's magic square but not quite the same.] It involves no luck, no math, just sheer logic, and therein lies the beauty of this game.
Solving it requires patience, and with practice one can crack it in a few minutes. Well, to be frank, it's not just a matter of practice, but of applying your mind and learning with every game - a neat test of your logical skills. Try it out right away - http://www.sudoku.com/ - and you will know what I mean!
For all those programmers out there, this could be a challenging coding problem. Simple as it might seem, it in fact belongs to one of the toughest classes of problems. By that I don't mean that it cannot be solved, but that it might not be solvable in polynomial time, so to speak. Those with a little background in algorithmics would have guessed that there is no known polynomial-time algorithm for it. The generalized version (on n² x n² grids) is actually NP-complete and thus belongs to the league of the knapsack and graph-coloring problems! A trivial approach would of course be the good old backtracking: starting with the first blank square, fill it up with a 1, move on to the next blank... continue in a similar fashion until you break one of the constraints. As soon as you do, backtrack and fill up the last square with the next possible number. Wait for a sufficiently long time and you will eventually have a solution for sure.
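The backtracking approach above can be sketched as follows; this is an illustrative solver of my own, not tuned for speed:

```java
public class SudokuBacktracker {
    // Solves the grid in place; blank squares are 0.
    // Returns true if a solution was found.
    static boolean solve(int[][] grid) {
        for (int r = 0; r < 9; r++) {
            for (int c = 0; c < 9; c++) {
                if (grid[r][c] == 0) {
                    for (int v = 1; v <= 9; v++) {
                        if (fits(grid, r, c, v)) {
                            grid[r][c] = v;
                            if (solve(grid)) return true;
                            grid[r][c] = 0; // undo and try the next value
                        }
                    }
                    return false; // nothing fits here: backtrack further
                }
            }
        }
        return true; // no blanks left: solved
    }

    // Checks the three constraints: row, column, and 3x3 block.
    static boolean fits(int[][] g, int r, int c, int v) {
        for (int i = 0; i < 9; i++) {
            if (g[r][i] == v || g[i][c] == v) return false;
        }
        int br = r - r % 3, bc = c - c % 3;
        for (int i = br; i < br + 3; i++)
            for (int j = bc; j < bc + 3; j++)
                if (g[i][j] == v) return false;
        return true;
    }
}
```

On easy grids this finishes quickly; in the worst case, though, it explores an exponential number of partial fillings, which is exactly the "wait for a sufficiently long time" caveat above.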
There are better approaches. Every blank square can be associated with a list of possible values. This list is finite and can be initialized to the universal set [1-9 here]. Now scan the matrix, apply the constraints to each row, column, and 3 x 3 block, and keep reducing the set of possible values for each blank square until you are left with just one value. This approach requires many scans of and updates to the lists before we have a solution. However, it can be made faster if we carefully observe the pain points. Those of you who have solved a few Sudokus will be able to relate this approach to how a human mind solves it. Note how you scan the matrix, and you will soon realize that if you scan for the maximally occurring number, the count of subsequent scans is lower. In other words, if you store the frequency of occurrence of the numbers currently on the board and scan the matrix for the number with the maximum frequency, not only are the updates fewer, but so are the required scans. I believe that switching to backtracking once the lists become sufficiently small would also be faster, as we would be saving the time spent scanning. Note also that significant time is spent searching for blank squares; this can easily be eliminated by storing references to the neighboring blank squares!
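The per-square list of possible values can be represented compactly as a bitmask, so that applying all three constraints to a square is a handful of OR operations; the names below are my own, not from any particular solver:

```java
public class Candidates {
    // Returns a bitmask of candidate values for blank square (r, c):
    // bit v set (1 <= v <= 9) means v is still possible there.
    // A result with exactly one bit set means the square is forced.
    static int candidates(int[][] g, int r, int c) {
        int used = 0;
        for (int i = 0; i < 9; i++) {
            used |= 1 << g[r][i];   // values already in the row
            used |= 1 << g[i][c];   // values already in the column
        }
        int br = r - r % 3, bc = c - c % 3;
        for (int i = br; i < br + 3; i++)
            for (int j = bc; j < bc + 3; j++)
                used |= 1 << g[i][j]; // values already in the 3x3 block
        // Blanks (0) set bit 0, which the mask below discards.
        return ~used & 0x3FE; // keep only bits 1..9
    }
}
```

Integer.bitCount on the result gives the size of the list, which is exactly what the "smallest list first" heuristic needs.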
It might interest you that this problem belongs to a class of problems called exact cover problems. Knuth gives an elegant algorithm (which he calls Algorithm X) and an equally elegant implementation approach called Dancing Links (so called because of the way the doubly linked lists change links as the solution proceeds). For details, here's Knuth's paper on the Dancing Links solution - http://xxx.lanl.gov/PS_cache/cs/pdf/0011/0011047.pdf
Its implementation in Java can be found in this Sudoku solver.
Keep Sudoking!
-Ashish.
The idea of implementing escape analysis in Mustang (J2SE 6) seems to be an impressive bet and finds a mention in this developerWorks article -
http://www-128.ibm.com/developerworks/java/library/j-jtp09275.html?ca=dgr-jw22JavaUrbanLegends
Escape analysis refers to analysis done on a program to determine which dynamically allocated objects and references "escape" (or are likely to escape) their method or thread scope. It has many applications with potential implications for program performance. Knowing that an object is bound by its method's scope opens up the possibility of allocating it in the method's stack frame rather than on the heap, as is typically done for dynamically allocated objects. Dynamic memory allocation (and deallocation) comes at a price: fragmentation and poor data locality, which are costly to overcome in both time and space. From the Java perspective, the garbage collector takes care of reclaiming dead object space, but even the most recent generational copying collectors can "stop the world" for significant amounts of time. Stack allocation is not only free of these idiosyncrasies, but deallocation also happens implicitly on method return. Moreover, better data locality means fewer cache misses.
Knowledge of local objects also gives the compiler/optimizer a chance to get rid of the object altogether in some cases. The link above has a code snippet justifying the same.
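As an illustrative stand-in (my own snippet, not the article's), here is the kind of method where the allocation could be optimized away: the Point below never escapes, so a JIT performing escape analysis could stack-allocate it, or break it into its two int fields (scalar replacement) and skip the allocation entirely:

```java
public class EscapeDemo {
    static class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // No reference to 'p' is stored in a field, returned, or passed
    // out: the object is provably local to this method, so no caller
    // or other thread can ever observe it.
    static int manhattan(int x, int y) {
        Point p = new Point(x, y);
        return Math.abs(p.x) + Math.abs(p.y);
    }
}
```

The method's behavior is identical either way; the analysis only licenses the JVM to drop the heap allocation and the GC work that would follow it.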
If an object does not escape its thread, the overhead associated with its synchronization can be eliminated. This is significant because forgetting to synchronize objects accessed by multiple threads can cause serious issues like race conditions and spurious, unexpected results. Compilers and optimizers often reorder instructions for efficient execution. While this is acceptable in a typical sequential execution, it can cause unexpected results in a multithreaded, multiprocessor environment. To add to this, the Java memory model does little to ensure data atomicity, visibility, and ordering in such cases. And it is not to blame... with new environments coming up every other day, it is next to impossible to ensure correct execution in every one. Needless to say, thread-local objects are desirable, and anything that can be done to identify them in a multithreaded program is worth the effort.
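A classic illustration of this case, assuming only the standard library: every StringBuffer method is synchronized, but the buffer below never escapes the method, so a JVM that proves thread-locality may elide the locking altogether.

```java
public class LockElisionDemo {
    // The StringBuffer is created, used, and discarded within one
    // method invocation; no other thread can ever reach it, so the
    // monitor acquire/release on each append() is pure overhead
    // that escape analysis can justify removing.
    static String greet(String name) {
        StringBuffer sb = new StringBuffer();
        sb.append("Hello, ");
        sb.append(name);
        return sb.toString();
    }
}
```

(This is also why the unsynchronized StringBuilder was introduced in J2SE 5.0: it hands the programmer the same win without requiring the JVM to prove anything.)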
Okay, so escape analysis seems to be desirable. How do we go about implementing it? The short answer could be - Do it however you like! :)
Basically, what this requires is an understanding of the relationships between the various dynamically allocated objects and references... something similar to what a compiler does in dataflow analysis. We need to track each object right from its inception, to how it is accessed in a method, to how it gets passed to other methods, to whether it will be accessed by other threads, until it is reclaimed by the GC. There are ample articles to peruse, varying in detail, on escape analysis in general, its application in Java, and various implementation strategies. Here's a pointer -
http://citeseer.ist.psu.edu/choi99escape.html
Whether or not escape analysis is implemented in Mustang is still being debated and you will find many forums dedicated to this topic. But then nothing stops us from delving into the code and contributing to the build!
Happy Coding :)
- Ashish.
My first encounter with stale connections was when we first launched an interactive J2EE application on WebSphere Application Server 3.5.6 on a Sun Solaris box. The application frequently received a StaleConnectionException (com.ibm.ejs.cm.portability.StaleConnectionException), an exception that became a potential showstopper for us. In spite of frantic googling and research, and changes varying from implementing retry logic in the application, to changing the connection pool settings, to changing the zparm values on the DB2 server, the sticky issue continued to haunt us. Frustrated with the results, we finally had to do away with the connection pooling support of WAS and obtain connections directly from the DriverManager instead. As we migrated the application from WAS 3.5.6 to WAS 5.1.1, we were challenged by the same issue again, but this time we finally seem to have found a solution. What follows is an account of the steps we took to combat this "sticky" exception.
Well, before I forget: the package structure for the exception has changed since WAS 4, and it is now qualified as com.ibm.websphere.ce.cm.StaleConnectionException.
The first thing we did was google StaleConnectionException. Sure enough, there is plenty of documentation on the net on the possible causes of this exception, any of which could cause the managed connection objects in the connection pool to become "stale". Any statement executed on such a stale connection causes the driver implementation to throw an SQLException with some SQL state and error code. The WAS implementation maps a set of these error codes to a subclass of SQLException - StaleConnectionException - and throws this exception to the requesting application. What it also does is purge all the unusable connection objects from the pool for the datasource in question. So ideally, the application should receive a good connection object the next time it requests one. This prompted us to implement retry logic something like this:
int retryCount = 0;
do {
    try {
        // get a connection object from the datasource
        // execute the statement(s)
        ...
        break;
    } catch (StaleConnectionException sce) {
        retryCount++;
    }
} while (retryCount < MAX_RETRIES);
Unfortunately, this did not help. Thinking that the pool implementation might be taking some time to refresh the connection pool, we tried putting a delay between the tries... but to no avail.
Meanwhile, we were in touch with contacts from other projects. We got all the data access code reviewed to confirm that we were closing all the connection objects and closing all the cursors and statements explicitly. Thus there was no chance of spurious connection objects becoming stale.
We also tried changing the driver implementation from the type 2 db2java driver to the type 2/4 UDB driver, but the exception continued to recur.
Sometime in the middle of all this, we also tinkered with a few connection pool parameters, but without success. If only we had been a little more careful :( because we ultimately found the solution in one of those parameters! Well, before that... by now we were almost certain that something was going awry in the WAS connection pool implementation, but influenced by the statement
Recovering from stale connections is a joint effort between the application
server run time and the application developer.
in the IBM InfoCenter for WAS 5.1 (http://www-306.ibm.com/software/webservers/appserv/was/library/library51.html), we implemented a solution to redirect the users to an error page with a friendly message asking them to retry the operation. Just when we had given up hope and were thinking this was the best we could do, we stumbled upon a link suggesting that applications running behind a firewall might be affected by the connection timeout setting at the firewall. This could in turn surface as a StaleConnectionException for the application. It also suggested that the "unused connection timeout" - one of the connection pool parameters - be set to a value smaller than the firewall setting. It was then that we realized that the connections might be getting closed lower in the network stack, with the WAS runtime at the application layer being unaware of it. Please note that the pool implementation does not check the validity of a managed connection object before returning it to the requesting application. We set the unused connection timeout value to 300 seconds and sure enough... we have not seen the exception since this change. Note that the unused connection timeout value should still be greater than the reap time (default 180 sec.) for the setting to make sense.
One more suggestion, or rather a best practice, to add to this "list" is to localize the data access as much as possible. While the DAO pattern helps to a certain extent, what it does not suggest is localizing the code that uses the connection object. Have a single place where you open the database connection, make the backend call, and close the connection (avoid replicating the same code in every data access method, for instance).
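As a sketch of this localization practice (using modern Java and hypothetical names of my own, not any particular framework's API), the acquire/use/release sequence can live in exactly one method:

```java
import java.util.concurrent.Callable;
import java.util.function.Function;

public class ResourceTemplate {
    // The single place that acquires, uses, and releases a resource
    // (a database connection, say). Every data access method passes
    // in only the work to perform, so connection handling -- and any
    // retry or timeout policy -- needs fixing in exactly one spot.
    static <R extends AutoCloseable, T> T withResource(
            Callable<R> acquire, Function<R, T> work) throws Exception {
        R resource = acquire.call();
        try {
            return work.apply(resource);
        } finally {
            resource.close(); // always released, even on failure
        }
    }
}
```

With a helper like this, a leaked connection or a missing close() simply cannot be scattered across dozens of DAO methods; there is one finally block to audit.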
The causes of stale connections are many. Although we found our solution in a connection pool parameter setting, it's equally likely that some other application would find theirs in connection objects being left open in the code. What I am driving at is that if you are a victim of StaleConnectionException, you might want to try all the above options... and this might not even be an exhaustive list!