12.2 Grid Patching lesson learned

What happened?

Last month I manually updated the TFA software.

I do this update because the TFA release installed via the patch set is an older version. This happens because Oracle Support bundles whichever TFA release is available at the time they create the patch set.

Last weekend I started patching the GI software 12.2 to RU Oct 2018 on a 4-node Exadata cluster.

As a best practice I do the installation manually and not via opatchauto.

The first step is:

/u01/app/12.2.0.1/grid/crs/install/rootcrs.sh -prepatch

This ends with the following error message:

2019/03/09 13:36:12 CLSRSC-46: Error: '/u01/app/12.2.0.1/grid/suptools/tfa/release/tfa_home/jlib/jdev-rt.jar' does not exist
2019/03/09 13:36:12 CLSRSC-152: Could not set ownership on '/u01/app/12.2.0.1/grid/suptools/tfa/release/tfa_home/jlib/jdev-rt.jar'
Died at /u01/app/12.2.0.1/grid/crs/install/crsutils.pm line 7573.
The command '/u01/app/12.2.0.1/grid/perl/bin/perl -I/u01/app/12.2.0.1/grid/perl/lib -I/u01/app/12.2.0.1/grid/crs/install /u01/app/12.2.0.1/grid/crs/install/rootcrs.pl -prepatch' execution failed

MOS Doc ID 2409411.1 describes how to fix this by modifying two files. It should be fixed in Grid release 18.

$GRID_HOME/crs/sbs/crsconfig_fileperms.sbs
$GRID_HOME/crs/utl/<node>/crsconfig_fileperms

Remove the following two entries from both files:
unix %ORA_CRS_HOME%/suptools/tfa/release/tfa_home/jlib/jdev-rt.jar %HAS_USER% %ORA_DBA_GROUP% 0644
unix %ORA_CRS_HOME%/suptools/tfa/release/tfa_home/jlib/jewt4.jar %HAS_USER% %ORA_DBA_GROUP% 0644
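
The removal can also be scripted so that a backup is kept. This is only a minimal sketch: the function name strip_tfa_jar_entries and the .bak suffix are my own choices, not something from the MOS note. It matches on the path suffix, assuming that the per-node file contains the same entries with the placeholders already expanded.

```shell
# Sketch only: drop the two TFA jar entries from a crsconfig_fileperms-style file.
# strip_tfa_jar_entries and the .bak backup suffix are illustrative names.
strip_tfa_jar_entries() {
    local file=$1
    # Keep a copy of the original next to the file before touching it.
    cp -p "$file" "$file.bak" || return 1
    # Write back every line that does not reference one of the two jars.
    grep -v -e '/suptools/tfa/release/tfa_home/jlib/jdev-rt\.jar' \
            -e '/suptools/tfa/release/tfa_home/jlib/jewt4\.jar' \
            "$file.bak" > "$file"
    # grep returns 1 when no line is left over; only a status above 1 is a real error.
    [ $? -le 1 ]
}
```

It would be called once per file, for example strip_tfa_jar_entries $GRID_HOME/crs/sbs/crsconfig_fileperms.sbs, and the .bak copy makes it easy to go back.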

I made the changes, but they did not fix the problem, so I could not continue with the patching. To me it looked like a problem with the file permissions.

So I did more research on MOS and found this important Doc ID 1931142.1:

"How to check and fix file permissions on Grid Infrastructure environment"

Yes, this was the solution :-)

cd /u01/app/12.2.0.1/grid/crs/install/

./rootcrs.sh -init

Using configuration parameter file: /u01/app/12.2.0.1/grid/crs/install/crsconfig_params

As an extra, the note describes how to verify the complete GI installation after the "-init" with the following cluvfy command.

cluvfy comp software -n all -verbose

Verifying Software home: /u01/app/12.2.0.1/grid …2894 files verified
Verifying Software home: /u01/app/12.2.0.1/grid …PASSED

Verification of software was successful.

CVU operation performed: software
Date: Mar 11, 2019 10:10:11 AM
CVU home: /u01/app/12.2.0.1/grid/
User: oracle

This is very helpful. Finally I started the GI patching without any problems.
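
To run the cluvfy check unattended, its output can be tested for the success message. A minimal sketch, assuming the wording shown in the output above; other cluvfy versions may phrase it differently, and the function name is made up for this example.

```shell
# Sketch: exit 0 only if cluvfy output read from stdin reports success.
# The matched sentence is taken from the output shown above; other
# cluvfy versions may word it differently (assumption).
cluvfy_software_ok() {
    grep -q 'Verification of software was successful'
}

# Hypothetical usage on a cluster node, e.g. from cron:
#   cluvfy comp software -n all -verbose | cluvfy_software_ok \
#       || echo "GI software check failed on $(hostname)"
```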

Lesson learned

"It is a good idea to check the status of the software with cluvfy from time to time."

Oracle Database 19.2 for Exadata is available

Oracle 19.2 is available for Exadata on-premises.

Yes, and the good news is:

We are currently setting up an X7 with Oracle Linux 7, which is a prerequisite.

I will prepare an article in the near future.

exachk 18.4 does not run against 12.2.0.1 databases

For all of you who use exachk and have updated to exachk version 18.4: there is a known bug with Oracle 12.2.0.1 databases.

When exachk runs as the root user, the program tries to connect to the database, and this does not work.

You will see the following message:

OS authentication is not enabled so please enter sysdba privileged user name for <User name>:- sys
Enter password for sys@<User name>:-
SELECT 1 FROM DUAL
*
ERROR at line 1:
ORA-01012: not logged on
Process ID: 0
Session ID: 0 Serial number: 0

The workaround is to use the "-shell" option when calling exachk:

# ./exachk -a -o v -shell

 

 

 

 

 

 

"Engineered Systems" working group meeting in Nürnberg

As every year, the "Engineered Systems" working group meets on the evening before the conference.

Date: Monday, 19.11.2018, at 17:00

Location: Nürnberg, ConventionCenter Ost Messezentrum, 90471 Nürnberg

The room is on the mezzanine floor.

We have exciting topics on the agenda, and the Oracle Exadata and ODA product management will be on site as well.

As always, it will be interesting, with plenty of news about the Engineered Systems.

Of course there will be enough time for networking too.

See you in Nürnberg :-)

exachk 18.2.0_20180518 released

Oracle released a new version of exachk. The version is shown as 18.2.0_2018052018; so far, so good. The next sentence in the note is: "What's new in exachk 12.2.0.1.4 may be found in the 'What's New in 12.2.0.1.4' section of the User's Guide available here."

The new release model from Oracle is really confusing. From my point of view it makes no sense to download a version called 18.2.0 and then find that the documentation refers to version 12.2.0.1.4.

Exadata troubleshooting made easy with GetExaWatcherResults.sh

If you have, for example, a performance problem or a crash on your Exadata machine and you need an overview of the machine, use the very powerful tool "GetExaWatcherResults.sh", which is based on the data collected by ExaWatcher.

GetExaWatcherResults.sh gives you details about:

  • CPU utilization and details
  • IO Summary
  • and more

GetExaWatcherResults.sh generates very good graphs, and that is what I will show now.

How do you get these graphs?


Log in as "root"

cd /opt/oracle.ExaWatcher

./GetExaWatcherResults.sh --from 02/23/2018_08:00:00 --to 02/23/2018_13:00:00 --resultdir /tmp/exawatcher_230218
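
The --from and --to values use the form MM/DD/YYYY_hh:mm:ss, which is easy to mistype. A small helper can build them from a more familiar date string. This is a sketch that assumes GNU date (as found on Linux); exawatcher_ts is a made-up name.

```shell
# Sketch: format a date string as MM/DD/YYYY_hh:mm:ss, the form used by
# the --from/--to options in the call above.
# Assumes GNU date; exawatcher_ts is an illustrative helper name.
exawatcher_ts() {
    date -d "$1" '+%m/%d/%Y_%H:%M:%S'
}

# Hypothetical usage:
#   from=$(exawatcher_ts '2018-02-23 08:00:00')
#   to=$(exawatcher_ts '2018-02-23 13:00:00')
```

The two variables can then be passed to --from and --to instead of typing the timestamps by hand.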


Next step



cd /tmp/exawatcher_230218

Decompress the archive with bunzip2 and untar it:

tar xvf ExaWatcher_exa31_2018-02-23_08_00_00_5h00m00s.tar

 

Change to the newly created directory and you will find the following files:


index.html
exa31.net_cpu.html
exa31.net.html
exa31.net_iodetail.html
exa31.net_iosummary.html
exa31.net_menu.html
exa31.net_mp.html

 

Start a browser on the server or copy the files to your notebook and start analyzing.


Open the "index.html" file to see the details.


It is easy to answer questions like

  • What was the CPU usage on Friday between 09:00 and 14:00?
  • When do we have a CPU peak during business hours?

No problem with "GetExaWatcherResults.sh", and if needed you can go deeper with other tools, for example an AWR report on the database itself or an ASH report.

Try it. It's very helpful :-)

Secure Erase of an Exadata System

What has to be done when the lifecycle of an Exadata system comes to an end?

You need to do a secure erase of the DB and storage nodes.

By the way, you can also secure erase the switches, PDUs, etc., but this is not described in this article.

Documentation

https://docs.oracle.com/cd/E80920_01/DBMSQ/exadata-secure-erase.htm#DBMSQ-GUID-DE6BBDF7-6BCB-412B-AD1E-E0FFEEFCC3AA

and My Oracle Support Doc ID 2180963.1

Steps to do

I use the bootable USB stick method.

Download the boot image via patch 25470974.

You need a complete list of all DB and storage node names, including the ILOM IP address of each server.

Tip: Before you start, reset the password on each server, for example to "welcome1".

Prepare USB Stick

I use my MacBook and run "dd" to copy the image to the USB stick:

dd if=image_diagnostics_12.2.1.1.0_LINUX.X64_170126.2-1.x86_64.usb of=/dev/disk2

Start an ILOM web console login for the first server.

In parallel, start an ILOM terminal console session to restart the first server.

While the reset is running, the web console comes up with the BIOS splash screen. Press


<CTRL+P>

It takes a while until you see the boot menu.

Select the USB stick and press Enter to start.

The boot screen appears.

Now it is time to check the output on the ILOM terminal console. After a while you will see a "login:" prompt.


Log in as "root" with the password "sos1exadata" or "sos1Exadata".

If the password doesn't work, contact Oracle Support for help.

Oracle Essential Support Tools in the Exadata environment

My DOAG talk on the topic

"Oracle Support: how do I proceed and which tools do I use in the Exadata environment"

Exadata_Oracle_Support_V1.1

Note:

In the meantime Oracle has released a new version of the Trace File Analyzer, which you should use. Simply search for it via the Doc ID.

TFA Collector – TFA with Database Support Tools Bundle (Doc ID 1513912.1)

One more remark for everyone who has just applied PSUs etc.: unfortunately Oracle Support does not ship the TFA version 12.2.1.3.0 mentioned above with the latest PSU. The Trace File Analyzer has to be installed separately.

Exadata Flash Cache enabled for Write Back

During tests for the migration of a major customer application, we saw in our AWR reports that most of the jobs are very write intensive. At that point we wanted to test what happens when we change the Flash Cache mode from Write Through to Write Back.

What are the main benefits of Write Back mode?

  • It improves write-intensive operations because writing to flash cache is faster than writing to hard disks
  • On Exadata X3 and newer machines, write performance can improve by up to 20x in IOPS
  • The Write Back Flash Cache accelerates reads and writes for all workloads

First of all I took a look in Metalink and found Doc ID 1500257.1 with more details.

What are the requirements?

Since April 2017 it is the default if the following conditions are fulfilled:

  • Grid and RDBMS Home
    • 11.2.0.4.1 or higher
    • 12.1.0.2 or higher
    • 12.2.0.2 or higher

and

  • DATA diskgroup has HIGH redundancy

When should I use Write Back?

  • It makes sense if your application is write intensive
  • You see significant "free buffer waits"
  • AWR reports show high I/O times that point to write bottlenecks

What are the steps to enable Write Back?

We did it "offline", which means stopping the whole Grid and RDBMS stack, but you can also make the change in a rolling manner.

  • Current Flash Cache mode

dcli -g cell_group -l root "cellcli -e list cell attributes flashcachemode"
cel04: WriteThrough
cel05: WriteThrough
cel06: WriteThrough
cel07: WriteThrough

  • Stop the whole Cluster

crsctl stop cluster -all -f

  • Check State of Flash Cache

name: cel04_FLASHCACHE
status: normal

name: cel05_FLASHCACHE
status: normal

name: cel06_FLASHCACHE
status: normal

name: cel07_FLASHCACHE
status: normal

  • Steps for the Change

These steps have to be done on every cell server; cel04 is shown here as an example.

Drop the flash cache on that cell

CellCLI> drop flashcache;
Flash cache cel04_FLASHCACHE successfully dropped.

Shut down Cell service

CellCLI> alter cell shutdown services cellsrv;
Stopping CELLSRV services... The SHUTDOWN of CELLSRV services was successful.

Change Cell Flash Cache mode to Write Back

CellCLI> alter cell flashCacheMode=writeback;
Cell cel04 successfully altered 

Restart the Cell Service

CellCLI> alter cell startup services cellsrv;
Starting CELLSRV services...
The STARTUP of CELLSRV services was successful.

Recreate the Flash Cache

CellCLI> create flashcache all;
Flash cache cel04_FLASHCACHE successfully created
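
Because the order of these five steps matters, it can help to keep them in one place. The following sketch only prints the sequence as non-interactive cellcli -e calls and executes nothing; writeback_steps is a made-up name, and the sequence applies to the offline method described here, with the cluster already stopped.

```shell
# Sketch: print the per-cell command sequence from the steps above, in order.
# Nothing is executed here; running the commands on each cell (one cell at
# a time) is deliberately left to the operator.
writeback_steps() {
    cat <<'EOF'
cellcli -e drop flashcache
cellcli -e alter cell shutdown services cellsrv
cellcli -e "alter cell flashCacheMode=writeback"
cellcli -e alter cell startup services cellsrv
cellcli -e create flashcache all
EOF
}
```

Printing instead of executing keeps the sketch safe to try out and makes it easy to review the sequence before touching a cell.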

Finally check the state on all cell servers

dcli -g cell_group -l root "cellcli -e list cell attributes flashcachemode"
cel04: WriteBack
cel05: WriteBack
cel06: WriteBack
cel07: WriteBack
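
With more than a handful of cells it is easy to overlook one that is still in the old mode. A minimal sketch of a check over the dcli output, assuming the "cell: mode" line format shown above; all_cells_in_mode is a made-up name.

```shell
# Sketch: read "cell: mode" lines (as printed by the dcli call above) from
# stdin and succeed only if at least one cell is listed and every cell
# reports the wanted mode.
all_cells_in_mode() {
    local want=$1
    awk -v want="$want" '
        NF >= 2 { seen++; if ($2 != want) bad++ }
        END { exit (seen > 0 && bad == 0) ? 0 : 1 }
    '
}

# Hypothetical usage:
#   dcli -g cell_group -l root "cellcli -e list cell attributes flashcachemode" \
#       | all_cells_in_mode WriteBack && echo "all cells are in WriteBack mode"
```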

So the first step is done and the tests can go on.

In a few weeks I will report on the real improvements, so stay tuned.