Monday, October 19, 2009

Computing confidence level of a software project or iteration

Usually, once a project, iteration, or release wraps up, the team sits down for a review meeting (it goes by many names: Lessons Learned meeting, Retrospective meeting, etc.) and tries to identify the mistakes the team made (as a natural human tendency we don't like to hear that we've made any mistakes, which is why people call it "lessons learned") and the remedies that would keep the same mistakes from recurring in the next cycle (though the lesson of history is that nobody learns from history). My objective in this post is not to dig into those issues but to answer a simple question asked at the beginning or end of the meeting (usually by the manager): "How did we do in this last release?"

There are many ways of answering this question, such as:
- "We've done a great job. The number of defects has gone down greatly ..."
- "It was good. The developers wrote comparatively more code in this release (citing LOC, if available)."
- "The quality of the business requirements was better; the developers didn't have many complaints about requirement issues in this release."
- And so on.

But shouldn't we have a straightforward way of evaluating success or failure, or of comparing one release or project with another, that is unambiguous and, above all, measurable? After all, we develop software systems that behave predictably (unless it's an AI or robotics project) and can be described with numbers.

I would rather expect the answer in a format like the following:
- "The project was great! The Project Confidence Level (PCL) was 4.5 in this release", or
- "The release went fairly well. The PCL was 2.3; we have room to improve in the field ...", or
- "The PCL was 1.9. It didn't meet our expected goal. Let's talk about what we did wrong ..."
With this unambiguous number we can even set a target for the next release that is easy to communicate to the project team members.

So now, let's talk about what a Project Confidence Level (PCL) could be. The PCL associates a number with a development release or project, using the historical facts gathered for that project or similar projects. There are various kinds of facts in a software development project that can be used to compute the PCL; how many you use depends on what information is available. The more facts you include in your formula, the more effective your PCL calculation will be. Below is a sample of such facts.

Average LOC/Feature
Average LOC/Developer per Development Day
Average LOC/Defect
Average LOC/Critical Defect
Average LOC/Medium Defect
Average LOC/Simple Defect
Average Defect/Feature
Average Critical Defect/Feature
Average Medium Defect/Feature
Average Simple Defect/Feature
Average Development Day/Feature
Average Project Day/Feature
Average Work Days/Release
Average LOC/Release
Average LOC/Developer per Release
Average Feature/Release
Average Defect/Release

Each of the above facts is assigned a weight reflecting its impact on the development effort, and the weights add up to 100. Each fact's measured value is normalized against its historical average, so a normalized value of 1 means "on par with history". The baseline PCL of a project is therefore 1 when the following formula is used:
PCL = {summation of (each normalized fact * weight)} divided by 100.

After each release, the above facts are measured for that release, normalized, and run through the formula to determine the PCL. A PCL lower than 1 is unacceptable, a PCL of 1 or more is expected, and the higher the PCL, the greater the team's achievement.
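As a sketch of how this could work in practice (the RELEASE_FACTS table below is a hypothetical schema, not part of any existing system), the PCL reduces to a single weighted-average query:

```sql
-- Hypothetical schema: one row per fact per release. FACT_VALUE is the
-- release's measurement normalized against the historical average, so a
-- value of 1 means "on par with history". Weights for a release sum to 100.
CREATE TABLE RELEASE_FACTS (
  RELEASE_ID INTEGER       NOT NULL,
  FACT_NAME  VARCHAR2(100) NOT NULL,
  FACT_VALUE NUMBER(10,4)  NOT NULL,
  WEIGHT     NUMBER(5,2)   NOT NULL
);

-- PCL for one release: summation of (normalized fact * weight) / 100.
SELECT SUM(FACT_VALUE * WEIGHT) / 100 AS PCL
  FROM RELEASE_FACTS
 WHERE RELEASE_ID = 1;
```

A result above 1 then means the release beat its historical averages on the weighted facts.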

Thursday, October 15, 2009

Oracle error: ORA-00600: internal error code, arguments: [15264], [], [], [], [], [], [], []

I was playing with partitioned tables in Oracle 10g. At one point I issued a DROP TABLE statement to drop a partitioned table, but in the middle of the run I changed my mind and stopped the statement's execution. When I tried to re-run the statement, I got stuck with the Oracle error:

ORA-00600: internal error code, arguments: [15264], [], [], [], [], [], [], []

After googling the error, I found suggestions to purge the recycle bin and bounce the database server to get rid of the error:

SQL> purge recyclebin;
SQL> shutdown immediate;
SQL> startup

In my case, I tried to drop the table again after a couple of days, without any database restart, and was able to drop it successfully. So it seems Oracle managed to fix the problem internally over time, even without a restart. Still, the immediate remedy is a restart, since you usually won't have the luxury of waiting days to drop a table.

Wednesday, October 14, 2009

Table Partitioning in Oracle

What is table partitioning?

Table partitioning is a feature of the Oracle database system that stores and manages a table's records in physically partitioned spaces in storage, while remaining transparent to everyone accessing the table.

There are different strategies for partitioning a table:
- Range partitioning: partitions the table's records by a range, for example of numbers or dates
- List partitioning: used when records should be organized by discrete values, e.g. State, City, Country
- Hash partitioning: a hash function distributes the records across the partitions

A partition can have subpartitions as well. In Oracle 10g, the valid subpartition types within a partition are:
- Range partition -> List subpartition
- Range partition -> Hash subpartition

The benefits of using partitions are:
- Access to the table's records can be limited to a single partition, so searches are much faster
- Partitions can be independently backed up and restored without hampering the table's availability
- Partitions can contain their own indexes (local indexes)
- A partition's data can be moved to a table with a single command (exchange partition)

1. Create a partitioned table

CREATE TABLE PARTITIONED_TABLE
(
  ID         INTEGER NOT NULL,
  AS_OF_DATE DATE    NOT NULL,
  BALANCE    NUMBER(25,7),
  DEBIT      NUMBER(20,5),
  CREDIT     NUMBER(20,5)
)
PARTITION BY RANGE (AS_OF_DATE)
(
  PARTITION YEAR_JAN_2006 VALUES LESS THAN (TO_DATE('01-FEB-2006', 'DD-MON-YYYY')),
  PARTITION YEAR_FEB_2006 VALUES LESS THAN (TO_DATE('01-MAR-2006', 'DD-MON-YYYY')),
  PARTITION YEAR_MAR_2006 VALUES LESS THAN (TO_DATE('01-APR-2006', 'DD-MON-YYYY'))
);


2. Populate data into the partitioned table, if available

insert into partitioned_table (id, as_of_date, balance, credit, debit)
(select id, as_of_date, balance, credit, debit from regular_table);

insert into partitioned_table (id, as_of_date, balance, credit, debit)
values (1, to_date('30-JAN-2006', 'DD-MON-YYYY'), 100.0, 20.0, 100.0);

commit;


3. Analyze the table to gather statistics data

analyze table partitioned_table compute statistics;

analyze table partitioned_table partition (YEAR_JAN_2006) compute statistics;

4. Check the statistics of the partitioned table

select t.table_name, t.partition_name, t.num_rows, t.high_value
from user_tab_partitions t
where t.table_name = 'PARTITIONED_TABLE';

5. Exchange partition's data into a new table

ALTER TABLE PARTITIONED_TABLE
EXCHANGE PARTITION YEAR_JAN_2006 WITH TABLE JAN_2006_TABLE;
-- JAN_2006_TABLE is a placeholder name: it must be an existing
-- non-partitioned table with the same column structure as PARTITIONED_TABLE

6. Split a partition into two

ALTER TABLE PARTITIONED_TABLE
SPLIT PARTITION YEAR_JAN_2006 AT (TO_DATE('01-JAN-2006', 'DD-MON-YYYY'))
INTO (
  PARTITION YEAR_DEC_2005,
  PARTITION YEAR_JAN_2006
);

Note: Be careful about the order of partition names when splitting a partition to create a new partition holding the lower end of the range. If you want the original partition to keep, say, one month of records and the new partition to hold all records below that date, list the new partition name first inside INTO (...) and the original partition name second. In the example statement above, I wanted YEAR_JAN_2006 to keep the January 2006 records, while YEAR_DEC_2005 holds the records from December 2005 and earlier.

7. Add a new partition

alter table PARTITIONED_TABLE
add partition YEAR_APR_2010 values less than (TO_DATE('01-APR-2010', 'DD-MON-YYYY'));

8. Renaming a partition

alter table PARTITIONED_TABLE rename partition YEAR_MAR_2006 to YEAR_FEB_2006;

9. Drop a partition

alter table partitioned_table
drop partition YEAR_JAN_2006_1;

Resources:
http://www.dbazine.com/oracle/or-articles/nanda6 - Methods to migrate from regular table to partitioned table

Friday, October 9, 2009

Data Dictionary: a thought

A Data Dictionary (DD) is usually created by the development lead or the database analyst during or after the creation of the data model. The most popular format for a DD is a spreadsheet, or in rare cases a text document. Some people also use the database's own data dictionary infrastructure (Oracle, for instance, lets you enter definitions for tables and columns as comments stored in a system table). The second most important thing after creating the definitions is the change log. Most of the time I've seen it ignored or given little importance in a project, but to me it is the single most important thing the data model designer should emphasize apart from the definitions themselves. However invincible we may think the initial data model is to future application changes, in practice it is not.

I've seen and used a separate spreadsheet to maintain the change log, kept out of band and not integrated with the data modeling activities. So the question is: how can we tie the change log to the data model itself? Here is a proposed solution:

We can have a data model for the Data Dictionary itself. Below is a pseudo-DDL script:

TABLE DD_TABLE (
TABLE_NAME -- may be prefixed with schema to uniquely identify
TABLE_DEFINITION
TABLE_STATUS -- Active, Inactive, Decommissioned, Future_Use etc.
CREATE_COMMENT
CREATED_BY
CREATED_DATE
LAST_UPDATED_BY
LAST_UPDATED_DATE

)

TABLE DD_TABLE_HISTORY (
TABLE_NAME (FK from DD_TABLE)
OLD_VALUE
NEW_VALUE -- redundant information for reporting purpose
CHANGE_COMMENT
CHANGED_BY
CHANGED_DATE

)


TABLE DD_COLUMN (
TABLE_NAME (FK from DD_TABLE)
COLUMN_NAME
COLUMN_DEFINITION
COLUMN_STATUS -- Active, Inactive, Decommissioned, Future_Use etc.
CREATED_BY
CREATED_DATE
LAST_UPDATED_BY
LAST_UPDATED_DATE

)

TABLE DD_COLUMN_HISTORY (
TABLE_NAME (FK from DD_TABLE)
COLUMN_NAME (FK from DD_COLUMN)
OLD_VALUE
NEW_VALUE -- redundant information for reporting purpose
CHANGE_COMMENT
CHANGED_BY
CHANGED_DATE

)

Triggers can be used to enforce a default log entry into the history tables unless the data model designer supplies one explicitly. What do you think about it?
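As a minimal sketch of that idea (assuming the DD_TABLE / DD_TABLE_HISTORY structures above are implemented in Oracle; the trigger name and comment text are illustrative), such a trigger could look like:

```sql
-- Sketch: automatically write a history row whenever a table's
-- definition changes, so the change log cannot be skipped.
CREATE OR REPLACE TRIGGER TRG_DD_TABLE_HISTORY
AFTER UPDATE OF TABLE_DEFINITION ON DD_TABLE
FOR EACH ROW
BEGIN
  INSERT INTO DD_TABLE_HISTORY
    (TABLE_NAME, OLD_VALUE, NEW_VALUE, CHANGE_COMMENT, CHANGED_BY, CHANGED_DATE)
  VALUES
    (:NEW.TABLE_NAME, :OLD.TABLE_DEFINITION, :NEW.TABLE_DEFINITION,
     'Auto-logged: no comment supplied', USER, SYSDATE);
END;
/
```

A similar trigger on DD_COLUMN would cover column-level changes.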