Short Circuit caused power Down=>SAN Down=>All databases servers using this storage Crashed=>Oracle 9i DB started fine=>but…OOPS! SAP Application not working properly
Once a User complained about some modules having issues :
There was a Database Alert Log recorded ORA-600 internal error with different first actual code argument. A search into Meta link showed that DB had problems in executing DMLs on some tables, as some of the indices on these tables had logical corruption. So it was thought, when DB crashed, DMLS on these tables were in progress,causing update of index entries. so BAD row mismatch occurred as key-row information was not consistent. So queries and DMLs which would involve these indexes were failing.
Restart of the database, automatic Crash Recovery of Database[without intervention of DBA] was successful, there was no corruption shown in logs nor in any datafile block. So when DB started, SAN was made up again. After restoring power supply, it was opened successfully. But it had a logical corruption hidden inside bad leaf blocks of indices. Automatic crash recovery was not able to fix index logical corruption. This is obvious behavior. Some won’t expect such issue and blame RDBMS some won't. I won't.
Solution:
1. Identify such indices which are in bad shape and fix problematic indexes
a) drop and recreate such indexes OR
b) rebuild them online so that it read key values/rowed from table (normal rebuild reads existing index entries so corruption does not go away)
2. Restore from backup and recover
The latter was not feasible as DB had size 1TB and prod database was on RAID 5. It would have taken more than 2 days to restore it!!!
Another case:
I seen some people killing their application had corrupt indexes. It means when they killed they had some running DMLs on tables.
Once a User complained about some modules having issues :
There was a Database Alert Log recorded ORA-600 internal error with different first actual code argument. A search into Meta link showed that DB had problems in executing DMLs on some tables, as some of the indices on these tables had logical corruption. So it was thought, when DB crashed, DMLS on these tables were in progress,causing update of index entries. so BAD row mismatch occurred as key-row information was not consistent. So queries and DMLs which would involve these indexes were failing.
Restart of the database, automatic Crash Recovery of Database[without intervention of DBA] was successful, there was no corruption shown in logs nor in any datafile block. So when DB started, SAN was made up again. After restoring power supply, it was opened successfully. But it had a logical corruption hidden inside bad leaf blocks of indices. Automatic crash recovery was not able to fix index logical corruption. This is obvious behavior. Some won’t expect such issue and blame RDBMS some won't. I won't.
Solution:
1. Identify such indices which are in bad shape and fix problematic indexes
a) drop and recreate such indexes OR
b) rebuild them online so that it read key values/rowed from table (normal rebuild reads existing index entries so corruption does not go away)
2. Restore from backup and recover
The latter was not feasible as DB had size 1TB and prod database was on RAID 5. It would have taken more than 2 days to restore it!!!
Another case:
I seen some people killing their application had corrupt indexes. It means when they killed they had some running DMLs on tables.