tfactl summary as HTML (really tricky)

I read an interesting article by Michael Schulze of Opitz Consulting in the current „Red Stack Magazin“ (April 2020) about tools for the daily Exadata maintenance, but it is not as easy as described.

For some time now, AHF (Autonomous Health Framework) has been the tool for checks of any kind on the Exadata.

I have installed and used version 19.3 here, which is an older one. In a few weeks I will update the whole environment to version 20.1.

So I tried to create the „summary HTML“ report, but it did not work as described in Michael's article.

Why?

 



tfactl summary -overview -html


 

Sometimes you need to start the command twice, and you NEED to enter a „q“ after you see the „tfactl_summary>“ prompt. This is important, otherwise the HTML report will not be created.

 

Example:

tfactl summary -overview -html

WARNING – TFA Software is older than 180 days. Please consider upgrading TFA to the latest version.

  Executing Summary in Parallel on Following Nodes:

    Node : exa01                             

    Node : exa02                             

    Node : exa03                             

    Node : exa04                             

LOGFILE LOCATION : /opt/oracle.ahf/data/repository/suptools/exa01/summary/root/20200502174852/log/summary_command_20200502174852_exa01_177655.log

  Component Specific Summary collection :

    – Collecting CRS details … Done.   

    – Collecting ASM details … Done.   

    – Collecting ACFS details … Done.

    – Collecting DATABASE details … Done.

    – Collecting EXADATA details … Done.

    – Collecting PATCH details … Done.   

    – Collecting LISTENER details … Done.

    – Collecting NETWORK details … Done.

    – Collecting OS details … Done.      

    – Collecting TFA details … Done.     

    – Collecting SUMMARY details … Done.

  Remote Summary Data Collection : In-Progress – Please wait …

  – Data Collection From Node – exa02 .. Done.           

  – Data Collection From Node – exa03 .. Done.           

  – Data Collection From Node – exa04 .. Done.           

  Prepare Clusterwide Summary Overview … Done

      cluster_status_summary                   

                                                                                                                           

  DETAILS                                                                                             STATUS    COMPONENT 

+---------------------------------------------------------------------------------------------------+---------+-----------+

  .-----------------------------------------------.                                                   PROBLEM   CRS        

  | CRS_SERVER_STATUS   : ONLINE                  |                                                                        

  | CRS_STATE           : ONLINE                  |                                                                        

  | CRS_INTEGRITY_CHECK : FAIL                    |                                                                        

  | CRS_RESOURCE_STATUS : OFFLINE Resources Found |                                                                        

  '-----------------------------------------------'                                                                        

  .-------------------------------------------------------.                                           PROBLEM   ASM        

  | ASM_DISK_SIZE_STATUS : WARNING - Available Size < 20% |                                                                

  | ASM_BLOCK_STATUS     : PASS                           |                                                                

  | ASM_CHAIN_STATUS     : PASS                           |                                                                

  | ASM_INCIDENTS        : FAIL                           |                                                                

  | ASM_PROBLEMS         : FAIL                           |                                                                

  '-------------------------------------------------------'                                                                

  .-----------------------.                                                                           OFFLINE   ACFS       

  | ACFS_STATUS : OFFLINE |                                                                                                

  '-----------------------'                                                                                                

  .-----------------------------------------------------------------------------------------------.   PROBLEM   DATABASE   

  | ORACLE_HOME_DETAILS                                                        | ORACLE_HOME_NAME |                        

  +----------------------------------------------------------------------------+------------------+                        

  | .------------------------------------------------------------------------. | OraDB12Home1     |                        

  | | DB_CHAINS | DB_BLOCKS | INCIDENTS | PROBLEMS | DATABASE_NAME | STATUS  | |                  |                        

  | +-----------+-----------+-----------+----------+---------------+---------+ |                  |                        

  | | PROBLEM   | PASS      | PROBLEM   | PROBLEM  | i10           | PROBLEM | |                  |                        

  | | PROBLEM   | PROBLEM   | PROBLEM   | PROBLEM  | p10           | PROBLEM | |                  |                        

  | | PASS      | PASS      | PROBLEM   | PROBLEM  | i20           | PROBLEM | |                  |                        

  | '-----------+-----------+-----------+----------+---------------+---------' |                  |                        

  '----------------------------------------------------------------------------+------------------'                        

  .--------------------------------.                                                                  PROBLEM   EXADATA    

  | SWITCH_SSH_STATUS : CONFIGURED |                                                                                       

  | CELL_SSH_STATUS   : CONFIGURED |                                                                                       

  | ENVIRONMENT_TEST  : PASS       |                                                                                       

  | LINKUP            : PASS       |                                                                                       

  | LUN_STATUS        : NORMAL     |                                                                                       

  | RS_STATUS         : RUNNING    |                                                                                       

  | CELLSRV_STATUS    : RUNNING    |                                                                                       

  | MS_STATUS         : RUNNING    |                                                                                       

  '--------------------------------'                                                                                       

  .----------------------------------------------.                                                    OK        PATCH      

  | CRS_PATCH_CONSISTENCY_ACROSS_NODES      : OK |                                                                         

  | DATABASE_PATCH_CONSISTENCY_ACROSS_NODES : OK |                                                                         

  '----------------------------------------------'                                                                         

  .-----------------------.                                                                           OK        LISTENER

  | LISTNER_STATUS   : OK |

  '-----------------------'

  .---------------------------.                                                                       OK        NETWORK

  | CLUSTER_NETWORK_STATUS :  |

  '---------------------------'

  .-----------------------.                                                                           OK        OS

  | MEM_USAGE_STATUS : OK |

  '-----------------------'

  .----------------------.                                                                            OK        TFA

  | TFA_STATUS : RUNNING |

  '----------------------'

  .------------------------------------.                                                              OK        SUMMARY

  | SUMMARY_EXECUTION_TIME : 0H:2M:31S |

  '------------------------------------'

+---------------------------------------------------------------------------------------------------+---------+-----------+

        ### Entering in to SUMMARY Command-Line Interface ###

tfactl_summary>list

  Components : Select Component - select [component_number|component_name]

        1 => overview

        2 => crs_overview

        3 => asm_overview

        4 => acfs_overview

        5 => database_overview

        6 => exadata_overview

        7 => patch_overview

        8 => listener_overview

        9 => network_overview

        10 => os_overview

        11 => tfa_overview

        12 => summary_overview

tfactl_summary>q

        ### Exited From SUMMARY Command-Line Interface ###

--------------------------------------------------------------------

REPOSITORY  : /opt/oracle.ahf/data/repository/suptools/exa01/summary/root/20200502174852/exa01

HTML REPORT : <REPOSITORY>/report/Consolidated_Summary_Report_20200502174852.html

--------------------------------------------------------------------

 

So enter „q“ and the HTML report is created. Then open it in a browser and you see the report, which is really helpful.
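If you want to script the report creation, a minimal sketch could look like the following. It assumes that the interactive summary CLI also accepts the „q“ from stdin, which I have not verified on every AHF release:

echo "q" | tfactl summary -overview -html   # feed the "q" so the HTML report gets written

# pick up the newest consolidated report afterwards
ls -t /opt/oracle.ahf/data/repository/suptools/*/summary/root/*/*/report/Consolidated_Summary_Report_*.html | head -1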

 

 

Have fun :-) and if possible use AHF 20.1 directly, or do an update like me …

Exadata wrong Kernel version after dbnode upgrade

While doing a lot of Exadata upgrades I ran into a problem on one of my DB nodes.

The patchmgr run completed without an error and the cluster started up.

I did a few checks, including whether all nodes have the same kernel version, and node01 had an older kernel version than the other nodes:

 

node01: Linux node01 4.1.12-124.23.4.el7uek.x86_64 #2 SMP x86_64 x86_64 x86_64

node02: Linux node02 4.14.35-1902.9.2.el7uek.x86_64 #2 SMP x86_64 x86_64 x86_64

node03: Linux node03 4.14.35-1902.9.2.el7uek.x86_64 #2 SMP x86_64 x86_64 x86_64

node04: Linux node04 4.14.35-1902.9.2.el7uek.x86_64 #2 SMP x86_64 x86_64 x86_64
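By the way, instead of logging in to every node, you can do this comparison clusterwide. A small sketch using dcli, assuming the usual /root/dbs_group host list on the Exadata compute nodes:

dcli -g /root/dbs_group -l root uname -r   # kernel release, prefixed with the node name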

 

It looks like the kernel was not activated during the upgrade of node01.

I checked all logfiles but could not find an error.

So I checked the installed kernels:

 

rpm -qa | grep -i kernel

kernel-transition-3.10.0-0.0.0.2.el7.x86_64

kernel-ueknano-4.14.35-1902.9.2.el7uek.x86_64

 

Okay, the new kernel is installed but does not seem to be active.

On Oracle Linux 7 you need to check the „grub.cfg“ file:

/boot/efi/EFI/redhat/grub.cfg

-> it was showing the older kernel
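A quick way to confirm which kernel the bootloader will actually use on OL7 is grubby; a sketch (the output line is what I would expect on the broken node, not a capture):

grubby --default-kernel   # e.g. /boot/vmlinuz-4.1.12-124.23.4.el7uek.x86_64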

I changed the following line to the new kernel version,

from „/initrdefi /initramfs-4.1.12-124.23.4.el7uek.x86_64.img“

to „/initrdefi /initramfs-4.14.35-1902.9.2.el7uek.x86_64.img“

 

Then I regenerated the configuration and rebooted the server:

grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg

Generating grub configuration file …

Found linux image: /boot/vmlinuz-4.14.35-1902.9.2.el7uek.x86_64

Found initrd image: /boot/initramfs-4.14.35-1902.9.2.el7uek.x86_64.img

Found linux image: /boot/vmlinuz-4.1.12-124.23.4.el7uek.x86_64

Found initrd image: /boot/initramfs-4.1.12-124.23.4.el7uek.x86_64.img

 

After the reboot, node01 came up with the correct kernel:

Linux node01 4.14.35-1902.9.2.el7uek.x86_64 #2 SMP x86_64 x86_64 x86_64 GNU/Linux

 

Attention

„Please make such a change only if you are absolutely sure and a fallback scenario is in place, for example boot via diag.iso; otherwise please open a Service Request.“

Autonomous Health Framework (AHF) available


Oracle released the new Autonomous Health Framework a week ago. It is very interesting to go through the list of new and changed features.

Yes, after a very long time we have one tool which brings everything together:

  • one single interface
  • all diagnostic tools in one bundle, which makes everything easier
  • automatic proactive compliance checks help to fix problems
  • diagnostics collected at the time the failure occurs ensure you get everything for the resolution

To get familiar with it, go to Oracle Support and download the AHF framework. You will find everything in Doc ID 2550798.1.

Download, install and have fun :-)

In the meantime I did an installation on an Exadata X7 4-node RAC and it works perfectly. The old data of tfactl and exachk is moved to the new directories and everything starts very smoothly. Easy and smooth setup. Now we have the tools in one location on the compute nodes.
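The installation itself is short; a sketch, assuming the zip name of the Linux build you download from MOS (the exact file name depends on the release):

unzip AHF-LINUX_v19.3.0.zip
./ahf_setup   # run as root; the installer migrates the existing TFA/exachk data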

 

 

impdp with parameter „cluster and parallel“ for a RAC Cluster

To use the „parallel“ parameter during an import (impdp) on an Oracle RAC cluster, you need to prepare your environment.

The „parallel“ parameter works correctly when you do the following:

– the mount point where the export dump resides must be available on ALL cluster members (see the dcli sketch after the service commands below)

– create a service on the database for the impdp job

srvctl add service -s impdp_service -d xdb1 -pdb xpdb1 -preferred xdb11,xdb12 -available xdb13

srvctl start service -s impdp_service -d xdb1

– Check that the service is running

srvctl status service -s impdp_service -d xdb1
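For the first requirement, here is a quick clusterwide check that the dump location is mounted everywhere; a sketch using dcli as on Exadata, where /u01/dump is a hypothetical path, so use whatever your directory object points to:

dcli -g /root/dbs_group -l root 'df -h /u01/dump'   # /u01/dump is a placeholder for your dump mount point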

Now you are ready to use the impdp „parallel“ parameter.

Here is an example with „cluster=y parallel=6“:

impdp system@xpdb1 directory=dump dumpfile=full_%u.dmp schemas=DB1 cluster=y parallel=6 service_name=impdp_service status=180 logfile=imp_xpdb1.log METRICS=Y logtime=all

impdp log parameters which are really helpful for analysis are:

METRICS=Y

logtime=all

Extract from the logfile:

You can see detailed information about the worker processes, for example W-1 = Worker 1:

W-1 Completed by worker 1 757 TABLE objects in 38 seconds
W-1 Completed by worker 2 764 TABLE objects in 37 seconds
W-1 Completed by worker 3 765 TABLE objects in 48 seconds
W-1 Completed by worker 4 765 TABLE objects in 53 seconds
W-1 Completed by worker 5 766 TABLE objects in 34 seconds
W-1 Completed by worker 6 765 TABLE objects in 44 seconds

W-5 Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE_DATA

Worker 5 is processing TABLE_DATA
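If you want to verify during the run that the workers are really spread across the instances, here is a small sketch; the LIKE pattern matches the „(DWnn)“ worker tag that Data Pump workers carry in gv$session.program, and the connect string is the one from the example above:

sqlplus -s system@xpdb1 <<'EOF'
-- count the Data Pump worker sessions per RAC instance
SELECT inst_id, COUNT(*) AS dp_workers
FROM   gv$session
WHERE  program LIKE '%(DW%'
GROUP  BY inst_id;
EOF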

With this detailed information, analyzing the impdp process gets much easier; try it next time.

Depending on your hardware you can also use other integer values for the „parallel“ parameter, but a large number will not help in every situation.

Have fun with impdp on your RAC cluster …

12.2 Grid Patching lesson learned

What happened?

During the last month I manually updated the TFA software.

I do this update because the TFA release installed via the patchset is an older version: Oracle bundles the TFA release which is available at the time they create the patchset.
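To see which TFA release is currently running on the nodes before and after such a manual update, tfactl can show it (a sketch):

tfactl print status   # prints host, status, PID, port and the TFA version per node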

Last weekend I started patching the GI software 12.2 to RU Oct 2018 on a 4-node Exadata cluster.

As a best practice I do the installation manually and not via opatchauto.

The first activity is:

/u01/app/12.2.0.1/grid/crs/install/rootcrs.sh -prepatch

This ended with the following error message:

2019/03/09 13:36:12 CLSRSC-46: Error: '/u01/app/12.2.0.1/grid/suptools/tfa/release/tfa_home/jlib/jdev-rt.jar' does not exist
2019/03/09 13:36:12 CLSRSC-152: Could not set ownership on '/u01/app/12.2.0.1/grid/suptools/tfa/release/tfa_home/jlib/jdev-rt.jar'
Died at /u01/app/12.2.0.1/grid/crs/install/crsutils.pm line 7573.
The command '/u01/app/12.2.0.1/grid/perl/bin/perl -I/u01/app/12.2.0.1/grid/perl/lib -I/u01/app/12.2.0.1/grid/crs/install /u01/app/12.2.0.1/grid/crs/install/rootcrs.pl -prepatch' execution failed

Doc ID 2409411.1 describes how to fix this by modifying two files. It should be fixed in Grid Release 18.

$GRID_HOME/crs/sbs/crsconfig_fileperms.sbs
$GRID_HOME/crs/utl/<node>/crsconfig_fileperms

Remove the following two entries:
unix %ORA_CRS_HOME%/suptools/tfa/release/tfa_home/jlib/jdev-rt.jar %HAS_USER% %ORA_DBA_GROUP% 0644
unix %ORA_CRS_HOME%/suptools/tfa/release/tfa_home/jlib/jewt4.jar %HAS_USER% %ORA_DBA_GROUP% 0644

I made the changes, but it did not fix the problem, so I could not go on with the patching. To me it looked like a problem with the file permissions.

So, more research on MOS, and I found the important Doc ID 1931142.1:

„How to check and fix file permissions on Grid Infrastructure environment“

Yes, this was the solution :-)

cd /u01/app/12.2.0.1/grid/crs/install/

./rootcrs.sh -init

Using configuration parameter file: /u01/app/12.2.0.1/grid/crs/install/crsconfig_params

As an add-on from the note: after the „-init“ you can check the complete GI installation with the following cluvfy command.

cluvfy comp software -n all -verbose

Verifying Software home: /u01/app/12.2.0.1/grid …2894 files verified
Verifying Software home: /u01/app/12.2.0.1/grid …PASSED

Verification of software was successful.

CVU operation performed: software
Date: Mar 11, 2019 10:10:11 AM
CVU home: /u01/app/12.2.0.1/grid/
User: oracle

This is very helpful. Finally I started the GI patching without any problems.

Lesson learned

„It is a good idea to check from time to time the status of the Software via cluvfy.“


exachk 18.4 does not run against 12.2.0.1 databases

For all of you who use exachk and have updated to exachk version 18.4: there is a known bug with Oracle Database 12.2.0.1.

When running exachk as the root user, the program tries to connect to the database, and this will not work.

You see the following message:

„OS authentication is not enabled so please enter sysdba privileged user name for <User name>:- sys

Enter password for sys@<User name>:-

SELECT 1 FROM DUAL

*

ERROR at line 1:

ORA-01012: not logged on

Process ID: 0

Session ID: 0 Serial number: 0“

 

The workaround is to use the „-shell“ option when calling exachk:

# ./exachk -a -o v -shell


„Engineered Systems“ working group meeting in Nürnberg

As every year, the „Engineered Systems“ working group meets on the evening before the conference.

Date: Monday, 19.11.2018, at 17:00

Location: Nürnberg, ConventionCenter Ost, Messezentrum, 90471 Nürnberg

The room is on the mezzanine level.

We have exciting topics on the agenda, and the Oracle Exadata and ODA product management will also be on site.

As always, it will be interesting, with lots of news all around the Engineered Systems.

And of course there will be plenty of time for networking, too.

See you in Nürnberg :-)