Wednesday, August 27, 2014

Revoke privileges from PUBLIC may cause ORA-600

Oracle Enterprise Manager (EM) reports on many security policies when they are violated. One of them are about execute privileges on standard PL/SQL packages granted to PUBLIC. One a day this summer with nothing better to do I went on to revoke these privileges. The wages for my efforts was several  ORA-600 errors:
ORA-00600: internal error code, arguments: [qmxqtmChkXQAtomMapSQL:2], [], [], [], [], [], [], [], [], [], [], []
00600. 00000 -  "internal error code, arguments: [%s], [%s], [%s], [%s], [%s], [%s], [%s], [%s]"
*Cause:    This is the generic internal error number for Oracle program
           exceptions.     This indicates that a process has encountered an
           exceptional condition.
*Action:   Report as a bug - the first argument is the internal error number
A quick research on My Oracle Support indicated that some standard packages depends on these privileges granted to PUBLIC. The solution is to grant back to PUBLIC if you are in a hurry, or grant the privilege to the owners of the individual packages (like XDB). The errors continued after the granting was done, but stopped after I bounced the database instance.

Just because EM reports something and the solution seems obvious, it is not alway wise to apply remedy immediately, especially in production. It surprised me a little that EM suggests a security policy that Oracle's own packages do not adhere to.

Sunday, August 24, 2014

Correlation between restore time and used space in a datafile

Recently during restore of a large database (over 10 TB) we monitored the alert log looking for messages like  «Full restore complete of datafile 42 to datafile copy … Elapsed time: 0:39:32»

We saw that the time to restore a 32GB file varied a lot, from around 20 minutes to around 75 minutes. This made me wonder why. We noticed that some of the files belonging to the UNDO tablespace  took usually little time, and then I found a couple of more data files that took little time to restore. Then I thought that maybe the UNDO tablespace at the time of the backup had many unused blocks (because there weren’t many active transactions at the time). So I checked the other two files that did not take long time to restore and found out they had lots of free space in them too. 

So I decided to take the file id and the restore time from all the messages in the alert log and load it into Oracle, in the source database (this all happened when we were making a clone for a RAT session). Then I converted the elapsed time to seconds, found  file size from DBA_DATA_FILES and free space for each file from DBA_FREE_SPACE. Calculating the correlation between restore time and (file size - free space) resulted in 0.96, but for the correlation between restore time and file size, resulted in 0.38. (I used the CORR function.)

By the way the backup was stored on a file system created on a logical volume from a group whose   LUNs where stored on an EMC VMAX 20, but the data files were restored to ASM on LUNs in an EMC XtremIO. 

 As pointed out on Twitter:

it could be caused by the RMAN compression of unused blocks. Next step would be a test with creating a database with several files with different degree of free space, and doing two RMAN backups, one with and another without compression and compare restore time.

This may sounds like a lot of fuzz about nothing, but what we had in mind was the deduplication that the XtremIO does; wondering if the deduplication of empty blocks could  take less time than blocks with real data in it.