Originally posted on the “Databases at CERN” blog
Most of you surely know that the ability to restore data after a failure is a primary skill for every DBA. You should always be able to restore and recover the data you’re responsible for. This is an axiom. To be sure that you can do it, you should test it on a regular basis. You can of course use some Oracle features, such as the BACKUP ... VALIDATE or RESTORE ... VALIDATE commands, but if you want to be as certain as possible, the only way is to run a real restore and recovery. Doing this periodically for a large number of databases is extremely tough, because of both the resources needed and the DBA time involved. That’s why one of our DBAs, Ruben Gaspar Aparicio, created a recovery system which is heavily used at CERN. The good news is that it is available as open source on SourceForge (Recovery Platform). Since its release we’ve introduced many modifications to it, but the published source code could still be a good starting point for developing your own solution. We use it to validate our backup strategy by running a real restore and recovery every week or two for most of our databases.
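For comparison, the lighter validate-only checks mentioned above might look roughly like this in RMAN. This is a minimal sketch, not the configuration used at CERN: it assumes an RMAN session already connected to the target database, and the choice to validate the whole database and to add CHECK LOGICAL is an illustrative assumption.

```
# Assumption: run inside an RMAN session connected to the target database.

# Read all datafiles and archived logs and check them for physical and
# logical corruption; no backup pieces are produced.
BACKUP VALIDATE CHECK LOGICAL DATABASE ARCHIVELOG ALL;

# Read the backups that a restore of the database would need and confirm
# they are usable; no datafiles are actually restored.
RESTORE DATABASE VALIDATE;
```

Neither command performs a recovery, so neither proves that a restored database can actually be brought to a consistent state and opened. That is the gap a real restore and recovery closes.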