Important Change for Grid 12.2 Upgrade and MGMTDB

I just did a Grid Infrastructure upgrade to Oracle 12.2 with the latest PSU (Aug 2017).

I did my preparation with the help of Doc ID 2111010.1:

"12.2 Grid Infrastructure and Database Upgrade steps for Exadata Database Machine running and later on Oracle Linux"

The post-upgrade steps stated that the Management DB should be deconfigured:

  • Post-upgrade Steps

      • Deconfigure MGMTDB


This deconfiguration step is now obsolete; the MGMTDB is handled as part of the upgrade.

Here is the updated text from the Doc ID:

September 28, 2017
  • MGMTDB will now be part of the upgrade; the flags to remove and deconfigure it have been removed.



To me it looks like the revision of the document is not yet finished.

So if you plan a Grid upgrade in the near future, read the document very carefully and, if needed, open an SR.
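After the upgrade it is worth verifying that the MGMTDB really came along with it. A minimal sketch; the call is guarded because srvctl only exists on a cluster node with the GI environment set:

```shell
# Sketch: verify the MGMTDB after the 12.2 GI upgrade.
# Guarded so srvctl is only called where the GI environment is available.
if command -v srvctl >/dev/null 2>&1; then
  mgmt_status=$(srvctl status mgmtdb)
else
  mgmt_status="srvctl not in PATH - run this on a cluster node with the GI environment set"
fi
echo "$mgmt_status"
```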

Exadata GI Upgrade to + PSU Jul 2016


Recently I did an upgrade to Grid Infrastructure on a few Exadata clusters.

Here is my summary of the installation.

Before you start, please read MOS note 1681467.1. This note is very helpful and describes the whole procedure in an Exadata environment.

It is not only the upgrade itself: in the same "session" I also installed the GI PSU Jul 2016 (23273686) and a one-off patch, because there is a known bug in the SCAN listener area (see below).

At the end of the article you will find some "real news" from GI (see below).

So let's start.

First of all, keep in mind that the clusterware must be "up and running".

Step 1: Oracle environment


 export SRVM_USE_RACTRANS=true 
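Before launching the installer it is worth double-checking that the variable really made it into the current shell's environment; a small sketch:

```shell
# Export the variable in the shell that will start the GI installer
export SRVM_USE_RACTRANS=true

# Fail fast if it did not make it into the environment
if [ "${SRVM_USE_RACTRANS:-}" = "true" ]; then
  env_check="SRVM_USE_RACTRANS is set"
else
  env_check="SRVM_USE_RACTRANS is MISSING - set it before starting the installer"
fi
echo "$env_check"
```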



Step 2: GI Software Installation

The next slides show the GI installation procedure.
When the setup reaches this point you need to do the following, but please don't close the installer window.

Step 3: Install the latest opatch tool

Download the opatch tool, patch 6880880 (ideally before you start).

On an Exadata the installation can be done in one step via dcli:

dcli -l oracle -g dbs_group unzip -oq -d /u01/app/ -d /u01/patchdepot
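The rollout can be sketched as below. Note that `PATCH_ZIP` and `GRID_HOME` are hypothetical placeholders, not the paths from this cluster; adjust them to your environment before running anything:

```shell
# Sketch of rolling out the opatch zip to all nodes with dcli.
# PATCH_ZIP and GRID_HOME are hypothetical placeholders - adjust them.
PATCH_ZIP=/u01/patchdepot/p6880880.zip   # hypothetical zip name
GRID_HOME=/u01/app/grid                  # hypothetical Grid home
dcli_cmd="dcli -l oracle -g dbs_group unzip -oq -d $GRID_HOME $PATCH_ZIP"
echo "$dcli_cmd"
# Once the paths are verified, run it from the first compute node:
# eval "$dcli_cmd"
```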

Step 4: Install GI PSU Jul 2016 (23273686)

Node 1 srvdb01

[root@srvdb01]# /u01/app/ napply -oh /u01/app/ -local /u01/patchdepot/23273686

Node 2 srvdb02

[root@srvdb02]# /u01/app/ napply -oh /u01/app/ -local /u01/patchdepot/23273686

Because of a known bug you should directly install the one-off patch 20734332; Doc ID 2166451.1 has the details:

(SCAN listener or local listener fails to start after applying patch 23273629 – Oracle Grid Infrastructure Patch Set Update (Jul 2016))
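After applying the PSU plus the one-off, it makes sense to confirm the SCAN listeners actually came back, since that is exactly what the bug breaks. A guarded sketch:

```shell
# Sketch: confirm the SCAN listeners are up after patching
# (the bug in Doc ID 2166451.1 leaves them down).
if command -v srvctl >/dev/null 2>&1; then
  scan_status=$(srvctl status scan_listener)
else
  scan_status="srvctl not in PATH - run this on a cluster node with the GI environment set"
fi
echo "$scan_status"
```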

Step 5: Run the upgrade script on each node

After you finish the PSU Jul 2016 & one-off patch installation, the script must be started on each node.

Node 1 srvdb01

[root@srvdb01 grid]# /u01/app/

Node 2 srvdb02

[root@srvdb02 grid]# /u01/app/

The script runs for around 15 minutes, so stay calm.

It finishes with the following messages; here as an example from the last node (srvdb02):


Successfully accumulated necessary OCR keys.

Creating OCR keys for user 'root', privgrp 'root'..

Operation successful. 14:53:13 CLSRSC-474: Initiating upgrade of resource types

14:54:33 CLSRSC-482: Running command: 'upgrade model  -s -d -p first'

14:54:33 CLSRSC-475: Upgrade of resource types successfully initiated.

14:54:35 CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster … succeeded


Step 6: Final tasks

Finally some configuration tools run and finish the GI upgrade, including the PSU Jul 2016 and the one-off patch.

And now the "real news":

The most notable change concerns the GIMR (Grid Infrastructure Management Repository).

Previously, installing the GIMR database (MGMTDB) was optional.

Now it is mandatory, and the MGMTDB database is automatically created as part of the upgrade installation process of 12.1.0.2 Grid Infrastructure. If you start an installation from scratch, the GIMR database is configured directly.

Some interesting GI & MGMTDB commands:

[oracle@srvdb01 ~]$ crsctl query crs activeversion

Oracle Clusterware active version on the cluster is []

[oracle@srvdb01 ~]$ crsctl query crs releaseversion

Oracle High Availability Services release version on the local node is []

[oracle@srvdb01 ~]$ crsctl query crs activeversion -f

Oracle Clusterware active version on the cluster is []. The cluster upgrade state is [NORMAL]. The cluster active patch level is [3351897854].


[oracle@srvdb01 ~]$ srvctl status mgmtdb -verbose
Database is enabled
Instance -MGMTDB is running on node srvdb01. Instance status: Open.

[oracle@srvdb01 ~]$ srvctl config mgmtdb
Database unique name: _mgmtdb
Database name:
Oracle home: <CRS home>
Oracle user: oracle
Spfile: +DBFS_DG/_MGMTDB/PARAMETERFILE/spfile.268.926345767
Password file:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Type: Management
PDB name: srv_cl12
PDB service: srv_cl12
Cluster name: srv-cl12
Database instance: -MGMTDB
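Output like the `srvctl config mgmtdb` listing above can also be post-processed from a script, e.g. to extract the PDB name. A sketch; a sample of the output above is embedded for illustration:

```shell
# Pull the PDB name out of 'srvctl config mgmtdb' output.
# The sample below reuses lines from the listing above.
config='Database unique name: _mgmtdb
PDB name: srv_cl12
Cluster name: srv-cl12'
pdb_name=$(printf '%s\n' "$config" | awk -F': ' '/^PDB name/ {print $2}')
echo "MGMTDB PDB: $pdb_name"
```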

Reinstall tfactl after GI Upgrade


Recently I finished a Grid upgrade from to + PSU Jul 2016. So far so good, but during a check I saw that the old tfactl tool from the previous software release was still up and running.

That could not be right, so I started an uninstall and a fresh setup for the new release.

Which steps have to be done?

Check the current tfactl installation:

 /u01/app/grid/tfa/bin/tfactl print config

Start the uninstall on both nodes:

[root@db03 bin]# ./tfactl uninstall
TFA will be Uninstalled on Node db03: 

Removing TFA from db03 only
Please remove TFA locally on any other configured nodes

Notifying Other Nodes about TFA Uninstall...
Sleeping for 10 seconds...

Stopping TFA Support Tools...
Stopping TFA in db03...
Shutting down TFA
oracle-tfa stop/waiting
. . . . . 
Killing TFA running with pid 159597
. . . 
Successfully shutdown TFA..

Deleting TFA support files on db03:
Removing /u01/app/oracle/tfa/db03/database...
Removing /u01/app/oracle/tfa/db03/log...
Removing /u01/app/oracle/tfa/db03/output...
Removing /u01/app/oracle/tfa/db03...
Removing /u01/app/oracle/tfa...
Removing /etc/rc.d/rc0.d/K17init.tfa
Removing /etc/rc.d/rc1.d/K17init.tfa
Removing /etc/rc.d/rc2.d/K17init.tfa
Removing /etc/rc.d/rc4.d/K17init.tfa
Removing /etc/rc.d/rc6.d/K17init.tfa
Removing /etc/init.d/init.tfa...
Removing /u01/app/
Removing /u01/app/
Removing /u01/app/

Do the same on the other node.

The new tfactl setup:

[root@db03 install]# ./tfa_setup -silent -crshome /u01/app/
TFA Installation Log will be written to File : /tmp/tfa_install_63022_2016_10_18-14_30_43.log
Starting TFA installation

Using JAVA_HOME : /u01/app/
Running Auto Setup for TFA as user root...
Installing TFA now...

TFA Will be Installed on db03...
TFA will scan the following Directories
| db03 |
| Trace Directory | Resource |
| /u01/app/ | CRS |
| /u01/app/ | CFGTOOLS |
| /u01/app/ | CRS |
| /u01/app/ | CRS |
| /u01/app/ | CRS |
| /u01/app/ | CRS |
| /u01/app/ | CRS |
| /u01/app/ | CRS |
| /u01/app/ | INSTALL |
| /u01/app/ | CRS |
| /u01/app/ | CRS |
| /u01/app/ | DBWLM |
| /u01/app/ | CRS |
| /u01/app/ | CRS |
| /u01/app/ | ASM |
| /u01/app/ | CRS |
| /u01/app/ | CRS |
| /u01/app/oraInventory/ContentsXML | INSTALL |
| /u01/app/oraInventory/logs | INSTALL |
| /u01/app/oracle/crsdata/db03/acfs | ACFS |
| /u01/app/oracle/crsdata/db03/core | CRS |
| /u01/app/oracle/crsdata/db03/crsconfig | CRS |
| /u01/app/oracle/crsdata/db03/crsdiag | CRS |
| /u01/app/oracle/crsdata/db03/cvu | CRS |
| /u01/app/oracle/crsdata/db03/evm | CRS |
| /u01/app/oracle/crsdata/db03/output | CRS |
| /u01/app/oracle/crsdata/db03/trace | CRS |

Installing TFA on db03:
HOST: db03 TFA_HOME: /u01/app/
| Host | Status of TFA | PID | Port | Version | Build ID |
| db03 | RUNNING | 63460 | 5000 | | 12127020160304140533 |

Running Inventory in All Nodes...
Enabling Access for Non-root Users on db03...

Adding default users to TFA Access list...
Summary of TFA Installation:
| db03 |
| Parameter | Value |
| Install location | /u01/app/ |
| Repository location | /u01/app/oracle/tfa/repository |
| Repository usage | 0 MB out of 10240 MB |

Installing oratop extension..
TFA is successfully installed...
And also the same on the other node.

Last but not least, check the new setup on both nodes.

Check the status and configuration
tfactl print status 
tfactl print config

That's it. :-)
It is very easy and done in a few minutes.
tfactl is a helpful tool, and not only for Oracle Support.
Take a few minutes and go through the following
My Oracle Support note: 1513912.1
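One typical use beyond the status checks above is collecting logs from all nodes into one archive with `tfactl diagcollect`. A guarded sketch; tfactl only exists on nodes where TFA is installed, and you must be a user on the TFA access list:

```shell
# Sketch: collect the last hour of logs from all nodes via TFA.
if command -v tfactl >/dev/null 2>&1; then
  tfa_out=$(tfactl diagcollect -since 1h)
else
  tfa_out="tfactl not in PATH - run this on a node where TFA is installed"
fi
echo "$tfa_out"
```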


ASM "corrupted metadata block" check via amdu / kfed (Part 1)


Last week we had a crash of our Exadata ASM instance. We were not amused about this, but we restarted the instance and kept working as usual.

About the environment: "GRID software is release 12.1 but the diskgroups are compatible".

To be safe we started a check on the DATA diskgroup.


The check ran online, but it took nearly 25 hours.
In the meantime we saw lots of errors in the ASM alert.log:

Tue Jun 21 15:47:15 2016
NOTE: disk DATA_CD_10_srv1CD13, used AU total mismatch: DD={514269, 0} AT={514270, 0}
Tue Jun 21 15:47:15 2016
GMON querying group 1 at 567 for pid 52, osid 138892
GMON checking disk 143 for group 1 at 568 for pid 52, osid 138892

A MOS note said this should not be a problem, but is this correct?

The analysis is done via an amdu dump of the diskgroup once the "CHECK NOREPAIR" run has finished:

amdu -diskstring 'o/*/*' -dump 'DATA'

Make sure you start the dump in a directory with enough space, because the amdu tool creates a lot of 2 GB files, depending on the size of the diskgroup. One small file, the report.txt, is also created during the dump.

The report.txt contains information about the system, OS and version, all scanned disks, and also a list of the scanned disks that have "corrupted metadata blocks".

Here is an example:

---------------------------- SCANNING DISK N0002 -----------------------------

Disk N0002: ''

AMDU-00209: Corrupt block found: Disk N0002 AU [454272] block [0] type [0]

AMDU-00201: Disk N0002: ''

AMDU-00217: Message 217 not found;  product=RDBMS; facility=AMDU; arguments: [0] [1024] [blk_kfbl]

           Allocated AU's: 507621

                Free AU's: 57627

       AU's read for dump: 194

       Block images saved: 12457

        Map lines written: 194

          Heartbeats seen: 0

  Corrupt metadata blocks: 1

        Corrupt AT blocks: 0
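A report.txt of this size is easiest to scan non-interactively, e.g. by counting the AMDU-00209 corruption messages. A sketch; sample lines from the report above are embedded for illustration:

```shell
# Count corrupt-block messages (AMDU-00209) in an amdu report.txt.
# The sample text reuses lines from the report shown above.
report='AMDU-00209: Corrupt block found: Disk N0002 AU [454272] block [0] type [0]
  Corrupt metadata blocks: 1
        Corrupt AT blocks: 0'
hits=$(printf '%s\n' "$report" | grep -c 'AMDU-00209')
echo "$hits corrupt block message(s) found"
```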

The next question was: "How can we check whether this metadata block is corrupted?"

The answer: you need the kfed tool and the "Allocated AU's: 507621" value from the report.txt file.

[oracle0@srv1db1]$ kfed read aun=507621 aus=4194304 blkn=0 dev=o/

kfbh.endian:                         58 ; 0x000: 0x3a

kfbh.hard:                          162 ; 0x001: 0xa2

kfbh.type:                            0 ; 0x002: KFBTYP_INVALID

kfbh.datfmt:                          0 ; 0x003: 0x00

kfbh.block.blk:              1477423104 ; 0x004: blk=1477423104

kfbh.block.obj:              3200986444 ; 0x008: disk=732492

kfbh.check:                    67174540 ; 0x00c: 0x0401008c

kfbh.fcn.base:                    51826 ; 0x010: 0x0000ca72

kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000

kfbh.spare1:                          0 ; 0x018: 0x00000000

kfbh.spare2:                          0 ; 0x01c: 0x00000000

1EFB9400000 0000A23A 580FB000 BECB2D4C 0401008C  [:......XL-......]

1EFB9400010 0000CA72 00000000 00000000 00000000  [r...............]

1EFB9400020 00000000 00000000 00000000 00000000  [................]

  Repeat 253 times

As you can see in the example, "kfbh.type = KFBTYP_INVALID", which means the metadata block is corrupt.
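That check can be scripted as well: grep the kfed output for the invalid block type. A sketch with the sample kfed line embedded:

```shell
# Decide from kfed output whether the dumped block header is valid.
# The sample line is taken from the kfed read output above.
kfed_out='kfbh.type:                            0 ; 0x002: KFBTYP_INVALID'
if printf '%s\n' "$kfed_out" | grep -q 'KFBTYP_INVALID'; then
  verdict="metadata block is corrupt"
else
  verdict="metadata block header looks valid"
fi
echo "$verdict"
```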

So how can I fix this?

In our situation we have a diskgroup which is compatible, so we would have to start a "CHECK ALL REPAIR".

But this could be very dangerous:

If the "CHECK ALL REPAIR" finds a corruption and tries to repair it, the diskgroup will be dismounted.

This means all databases which are up and running will crash.

And keep in mind that a "CHECK ALL REPAIR" will also run for about 25 hours.

Is there another solution?
Yes, but it also requires a dismount of the diskgroup.

Then run the amdu tool "offline" again and check the report.txt file for corrupted metadata blocks.

More details will be discussed in Part 2 about ASM kfed and amdu.

So stay tuned.



Exadata ASM Disk overview via kfod


That's the question:

How can I easily create an overview of all configured ASM disks from the OS command line?

Exadata ASM Disks

ASM disks in an Exadata machine are part of the storage cells and are presented to the compute nodes via the proprietary iDB protocol.

The ASM instance is running on the compute node.

Each storage cell has 12 hard disks plus flash disks. During the Exadata setup the grid disks were created on the hard disks.
Grid disks are not visible to the operating system, only to ASM, the database instance, and related utilities, via the iDB protocol.

To get an overview of the grid disks via the command line, use the kfod tool.
Here is the output from kfod discovering a Full Rack:

Log in and set your GRID environment:
[ora1120@s1s2db2 asm_grash]$ $ORACLE_HOME/bin/kfod disks=all
 Disk Size Path User Group 
 1: 2260992 Mb o/ 
 2: 2260992 Mb o/ 
 3: 2260992 Mb o/ 
 4: 2260992 Mb o/ 
 5: 2260992 Mb o/ 
 6: 2260992 Mb o/ 
 7: 2260992 Mb o/ 
 8: 2260992 Mb o/ 
 9: 2260992 Mb o/ 
 10: 2260992 Mb o/ 
 11: 2260992 Mb o/ 
 12: 2260992 Mb o/ 
 13: 34608 Mb o/ 
 14: 34608 Mb o/ 
 15: 34608 Mb o/ 
 16: 34608 Mb o/ 
 17: 34608 Mb o/ 
 18: 34608 Mb o/ 
 19: 34608 Mb o/ 
 20: 34608 Mb o/ 
 21: 34608 Mb o/ 
 22: 34608 Mb o/ 
 23: 565360 Mb o/ 
 24: 565360 Mb o/ 
 25: 565360 Mb o/ 
 26: 565360 Mb o/ 
 27: 565360 Mb o/ 
 28: 565360 Mb o/ 
 29: 565360 Mb o/ 
 30: 565360 Mb o/ 
 31: 565360 Mb o/ 
 32: 565360 Mb o/ 
 33: 565360 Mb o/ 
 34: 565360 Mb o/ 
 35: 2260992 Mb o/ 
 36: 2260992 Mb o/ 
 37: 2260992 Mb o/ 
 38: 2260992 Mb o/ 
 39: 2260992 Mb o/ 
 40: 2260992 Mb o/ 
 41: 2260992 Mb o/ 
 42: 2260992 Mb o/ 
 43: 2260992 Mb o/ 
 44: 2260992 Mb o/ 
 45: 2260992 Mb o/ 
 46: 2260992 Mb o/ 
 47: 34608 Mb o/ 
 48: 34608 Mb o/ 
 49: 34608 Mb o/ 
 50: 34608 Mb o/ 
 51: 34608 Mb o/ 
 52: 34608 Mb o/ 
 53: 34608 Mb o/ 
 54: 34608 Mb o/ 
 55: 34608 Mb o/ 
 56: 34608 Mb o/ 
 57: 565360 Mb o/ 
 58: 565360 Mb o/ 
 59: 565360 Mb o/ 
 60: 565360 Mb o/ 
 61: 565360 Mb o/ 
 62: 565360 Mb o/ 
 63: 565360 Mb o/ 
 64: 565360 Mb o/ 
 65: 565360 Mb o/ 
 66: 565360 Mb o/ 
 67: 565360 Mb o/ 
 68: 565360 Mb o/ 
 69: 2260992 Mb o/ 
 70: 2260992 Mb o/ 
 71: 2260992 Mb o/ 
 72: 2260992 Mb o/ 
 73: 2260992 Mb o/ 
 74: 2260992 Mb o/ 
 75: 2260992 Mb o/ 
 76: 2260992 Mb o/ 
 77: 2260992 Mb o/ 
 78: 2260992 Mb o/ 
 79: 2260992 Mb o/ 
 80: 2260992 Mb o/ 
 81: 34608 Mb o/ 
 82: 34608 Mb o/ 
 83: 34608 Mb o/ 
 84: 34608 Mb o/ 
 85: 34608 Mb o/ 
 86: 34608 Mb o/ 
 87: 34608 Mb o/ 
 88: 34608 Mb o/ 
 89: 34608 Mb o/ 
 90: 34608 Mb o/ 
 91: 565360 Mb o/ 
 92: 565360 Mb o/ 
 93: 565360 Mb o/ 
 94: 565360 Mb o/ 
 95: 565360 Mb o/ 
 96: 565360 Mb o/ 
 97: 565360 Mb o/ 
 98: 565360 Mb o/ 
 99: 565360 Mb o/ 
 448: 2260992 Mb o/ 
 449: 2260992 Mb o/ 
 450: 2260992 Mb o/ 
 451: 2260992 Mb o/ 
 452: 2260992 Mb o/ 
 453: 2260992 Mb o/ 
 454: 2260992 Mb o/ 
 455: 34608 Mb o/ 
 456: 34608 Mb o/ 
 457: 34608 Mb o/ 
 458: 34608 Mb o/ 
 459: 34608 Mb o/ 
 460: 34608 Mb o/ 
 461: 34608 Mb o/ 
 462: 34608 Mb o/ 
 463: 34608 Mb o/ 
 464: 34608 Mb o/ 
 465: 565360 Mb o/ 
 466: 565360 Mb o/ 
 467: 565360 Mb o/ 
 468: 565360 Mb o/ 
 469: 565360 Mb o/ 
 470: 565360 Mb o/ 
 471: 565360 Mb o/ 
 472: 565360 Mb o/ 
 473: 565360 Mb o/ 
 474: 565360 Mb o/ 
 475: 565360 Mb o/ 
 476: 565360 Mb o/ 
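A listing that long is easier to digest when grouped by disk size. A small awk sketch over lines in the same format; the sample lines below mimic the output above, with the `o/...` paths as shortened placeholders:

```shell
# Group kfod output by disk size (field 2 of each line).
# Sample lines in the same format as the listing above; paths shortened.
kfod_out=' 1: 2260992 Mb o/...
 13: 34608 Mb o/...
 23: 565360 Mb o/...
 24: 565360 Mb o/...'
size_summary=$(printf '%s\n' "$kfod_out" | awk '{count[$2]++} END {for (s in count) print count[s], "disks of", s, "Mb"}')
echo "$size_summary"
```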




X5-2 Jan2016 GI – Bug 22135419 – 12C GRID HOME PERMISSIONS NOT RESET




In the last days I did an upgrade on an Exadata X5-2 machine, including the GRID software.

During the GI patching (patch 22243551) there was no error message.

But after a short while we got a lot of errors in the database alert.logs:

ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operation not permitted
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 1002 (dba), current egid = 1001 (oinstall)

The error message means that the "group id" (gid) was not set correctly.
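The mismatch can be read straight out of the ORA-27303 line. A sketch using the alert.log line shown above:

```shell
# Extract startup vs. current effective group id from the ORA-27303 message.
ora_line='ORA-27303: additional information: startup egid = 1002 (dba), current egid = 1001 (oinstall)'
startup_egid=$(printf '%s\n' "$ora_line" | sed 's/.*startup egid = \([0-9]*\).*/\1/')
current_egid=$(printf '%s\n' "$ora_line" | sed 's/.*current egid = \([0-9]*\).*/\1/')
if [ "$startup_egid" != "$current_egid" ]; then
  gid_check="egid mismatch: started with $startup_egid, now $current_egid"
else
  gid_check="egid consistent"
fi
echo "$gid_check"
```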

Using asmcmd was not possible:

$ asmcmd

/app/oragrid/product/: line 22: /export/home/oragrid/%ORACLE_HOME%/bin/kfod.bin: No such file or directory
/app/oragrid/product/: line 22: exec: /export/home/oragrid/%ORACLE_HOME%/bin/kfod.bin: cannot execute: No such file or directory
Use of uninitialized value $clus_mode in scalar chomp at /app/oragrid/product/ line 5015.
Use of uninitialized value $clus_mode in string eq at /app/oragrid/product/ line 5043.
Use of uninitialized value $clus_mode in string eq at /app/oragrid/product/ line 5092.
Use of uninitialized value $clus_mode in string eq at /app/oragrid/product/ line 5092.
Use of uninitialized value $clus_mode in string eq at /app/oragrid/product/ line 5092.
Use of uninitialized value $clus_mode in string eq at /app/oragrid/product/ line 5139.

This is the known Bug 22135419 – 12C GRID HOME PERMISSIONS NOT RESET.

It can be fixed by unlocking & relocking the GRID_HOME:

# /u01/app/ -unlock
# /u01/app/ -patch

When updating each node (here an X5-2 with 4 nodes) the Clusterware is restarted.

No problem, since you can do it node by node:

Oracle Clusterware active version on the cluster is []. The cluster upgrade state is [NORMAL]. The cluster active patch level is [942923749].

Oracle Clusterware active version on the cluster is []. The cluster upgrade state is [NORMAL]. The cluster active patch level is [942923749].

Oracle Clusterware active version on the cluster is []. The cluster upgrade state is [NORMAL]. The cluster active patch level is [942923749].

Oracle Clusterware active version on the cluster is []. The cluster upgrade state is [NORMAL]. The cluster active patch level is [942923749].

Afterwards the cluster upgrade state is [NORMAL].

The bug is very tricky, because during the patching you see no error.