Challenge the convention: JEE software development estimation

Software estimation is one of the key, and most mystifying, subject areas in software engineering. There is an ongoing debate between two groups over whether software development is an engineering discipline or a liberal science. The group that claims it is a science often points, as proof, to how consistently inaccurate software development estimates have been over time.

There are several well-known software estimation techniques and models, such as Lines of Code (LOC), Function Point Analysis (FPA), COCOMO and some others. I've studied a few of these estimation processes but found them overly complex for the value they offer or, to some extent, irrelevant to the projects I have been involved in so far in my professional career. Most of my prior software development projects were J2EE-based financial and CRM solutions. Those estimation models may well be useful for many projects, but I didn't find them very useful or cost-effective in mine. I'm in no way claiming that those models or processes are not useful.

In my projects I've been using some very straightforward and simple techniques to estimate the development effort (not the entire software project, though), and those estimates were accepted by the project manager with very minor tweaks. The benefits I get from using my simpler model are:

The ballpark figure asked for by the project manager (PM) can be provided with a 10-15% error margin
If the PM asks for the basis of your estimate, you have something to defend your number
When your project manager pushes you to include additional feature(s) in the middle of development, you can bargain confidently, backed by your detailed estimate
People trust documents more than your verbal explanation (though this varies from person to person and depends on your image in the team)

The drivers behind my estimation model are experience and historical data. The factor values I use in the model are mostly based on my experience and are also supported by the project's historical data. So the model can't be mathematically or statistically proven, but it has a justification. The factors I used in the model are:

Complexity
Familiarity
Comfort level
Implementation spread (number of places in the system where the feature would have an impact)
New feature
Modification of existing feature
Upgrade of existing feature
Change distribution
Implementation items (e.g. Business Logic class, Data Access class, Utility class, Configuration file, Database table, Resource file, User Interface class/files etc.)
Requirement stability (e.g. clarity)
Unit/Integration/Functional test case development
Buffer zone

Using past experience in the project, I've assigned a value to each of the factors mentioned above and finally add up the numbers to get the estimated time to complete the feature implementation.

Here is an example of an estimation calculation using the above technique:

Initially it was a little simpler, as below:

C: Type of change (New=1/Upgrade=2/Modify=3)
D: Estimated complexity (Low=1/Medium=2/High=3)
E: Level of comfort (1 - Did similar before/2 - Didn't do but know/3 - First time doing)
F: Num of new view file
G: Num of view files to modify
H: Num of New Business methods
I: Num of Business methods to modify
J: Num of new Dao methods
K: Num of Dao methods to modify
L: Num of new tables
M: Num of tables to modify
N: Change is for (e.g. how many products)

Estimation: (C/1.5+F*3+G*1+H*3+I*1.5+J*2+K*1.5+L*3+M*1) *N * D/2
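
To make the arithmetic concrete, here is a minimal Python sketch of this first formula. It assumes the weighted sum, starting from C/1.5, is grouped before being multiplied by N and D/2; that grouping is one reading of the formula, not something the post confirms. The function name and the sample inputs are hypothetical, and the comfort level E is left out because the formula itself does not use it.

```python
def initial_estimate(c, d, f, g, h, i, j, k, l, m, n):
    """First-version estimate; parameter letters match the list above.

    Assumption: the whole weighted sum is grouped before applying N and D/2.
    """
    weighted_items = (
        c / 1.5            # type of change
        + f * 3 + g * 1    # new / modified view files
        + h * 3 + i * 1.5  # new / modified business methods
        + j * 2 + k * 1.5  # new / modified DAO methods
        + l * 3 + m * 1    # new / modified tables
    )
    return weighted_items * n * d / 2


# Hypothetical change: modify (C=3), medium complexity (D=2), two products (N=2),
# touching 2 view files, 2 business methods, 1 DAO method and 1 table.
print(initial_estimate(c=3, d=2, f=0, g=2, h=0, i=2, j=0, k=1, l=0, m=1, n=2))  # 19.0
```

Keeping the weights in one expression makes it easy to adjust them as historical data from a project accumulates.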

But later I found that the estimates coming out of this formula had a wide error margin and also didn't cover some of the finer aspects of the estimation items. So I refined it as below; the refined version worked for me with almost no issues for 6 releases (each release spanning 2-3 months of development and implementation) over one and a half years.

J: Type of Change (New=3/Upgrade=2/Modify=2/NoChange=0)
K: Estimated Complexity (Low=1/Medium=2/High=3)
L: Comfort Level (1 - Done similar before/2 - Have conceptual idea/3 - No idea; can be a fractional value)
M: Num of new view file
N: Num of view files to modify
O: Num of New Business methods
P: Num of Business methods to modify
Q: Num of new Dao methods
R: Num of Dao methods to modify
S: Num of new utility methods
T: Num of utility methods to modify
U: Num of configuration files to modify
V: Num of new tables
W: Num of tables to modify
X: Num of places change would happen
Y: No. of Unit Test Cases
Z: Integration Testing? (1/0)
AB: Estimation without testing =(J/2*(M*3+N*1+O*2+P*1.5+Q*1.5+R*0.75+S*0.5+T*0.25+U*0.1+V*2+W*1)*X*K/3*(L))+Y*1
AC: Estimation with testing (in hr) =AB*1.5
AD: Estimation with testing (in day) [ceiling] =CEILING(AC/8,1)
AE: Estimation without testing (in day) =AB/8
AF: Estimation without testing (in day) [ceiling] =CEILING(AE,1)
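
The refined model can be sketched in Python as follows. This is an illustration under stated assumptions rather than the original spreadsheet: the class and function names are hypothetical, the term for new business methods is taken as O*2, AB is treated as hours (since AC is labelled in hours), and `math.ceil` stands in for the spreadsheet's CEILING(x,1), which matches for non-negative values. The integration-testing flag Z is omitted because none of the formulas above use it.

```python
import math
from dataclasses import dataclass


@dataclass
class FeatureInputs:
    change_type: float                  # J: New=3 / Upgrade=2 / Modify=2 / NoChange=0
    complexity: float                   # K: Low=1 / Medium=2 / High=3
    comfort: float                      # L: 1 to 3, fractions allowed
    new_views: int = 0                  # M
    modified_views: int = 0             # N
    new_business_methods: int = 0       # O
    modified_business_methods: int = 0  # P
    new_dao_methods: int = 0            # Q
    modified_dao_methods: int = 0       # R
    new_utility_methods: int = 0        # S
    modified_utility_methods: int = 0   # T
    modified_config_files: int = 0      # U
    new_tables: int = 0                 # V
    modified_tables: int = 0            # W
    places_changed: int = 1             # X
    unit_test_cases: int = 0            # Y


def hours_without_testing(f):
    """AB: weighted item count scaled by the J, X, K and L factors, plus Y."""
    items = (
        f.new_views * 3 + f.modified_views * 1
        + f.new_business_methods * 2 + f.modified_business_methods * 1.5
        + f.new_dao_methods * 1.5 + f.modified_dao_methods * 0.75
        + f.new_utility_methods * 0.5 + f.modified_utility_methods * 0.25
        + f.modified_config_files * 0.1
        + f.new_tables * 2 + f.modified_tables * 1
    )
    scaled = f.change_type / 2 * items * f.places_changed * f.complexity / 3 * f.comfort
    return scaled + f.unit_test_cases * 1


def summarize(f, hours_per_day=8):
    """Return the AB..AF values for one feature."""
    ab = hours_without_testing(f)
    ac = ab * 1.5
    ae = ab / hours_per_day
    return {
        "AB": ab,                             # hours, without testing
        "AC": ac,                             # hours, with testing
        "AD": math.ceil(ac / hours_per_day),  # days, with testing (ceiling)
        "AE": ae,                             # days, without testing
        "AF": math.ceil(ae),                  # days, without testing (ceiling)
    }


# Hypothetical new feature of medium complexity, done something similar before.
example = FeatureInputs(change_type=3, complexity=2, comfort=1,
                        new_views=1, new_business_methods=2,
                        new_dao_methods=2, new_tables=1,
                        places_changed=1, unit_test_cases=4)
print(summarize(example))
# {'AB': 16.0, 'AC': 24.0, 'AD': 3, 'AE': 2.0, 'AF': 2}
```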

If a single-point estimate needs to be communicated, I use the well-known three-point formula below to get a realistic figure:

Estimated Day =(AC+2*(AC+AE)+AE)/6.0
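
A small Python sketch of this calculation follows. One assumption is needed to make the units line up: AC is listed in hours and AE in days, so the sketch converts AC to days by dividing by 8 (the same 8-hour day used for AD) before combining them. The function name and sample values are hypothetical.

```python
def single_point_days(with_testing_days, without_testing_days):
    """Weighted figure: both estimates once, their sum twice, divided by 6."""
    return (with_testing_days
            + 2 * (with_testing_days + without_testing_days)
            + without_testing_days) / 6.0


ac_hours = 24.0  # AC: estimate with testing, in hours (sample value)
ae_days = 2.0    # AE: estimate without testing, in days (sample value)

# Assumption: convert AC to days first so both inputs share a unit.
print(single_point_days(ac_hours / 8, ae_days))  # 2.5
```

Because the middle term is built from the same two inputs, the result works out to the plain average of the two estimates.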

The above model would certainly be different for anything other than Java-based web applications built for enterprises. In that case the factor values and criteria would need to be tweaked.

Resources

http://www.stellman-greene.com/aspm/images/ch03.pdf

Saturday, June 7, 2008