Troubleshooting an Exadata Environment on the Command Line

This article describes how to troubleshoot an Exadata system effectively from the command line and shows some tools you can use.

What are the major components of an Exadata system? (There are other components, such as KVM switches and PDUs, but they rarely cause problems.)

  • Compute Nodes
  • Storage Cells
  • InfiniBand Switches

Troubleshooting Exadata Compute Nodes

A Compute Node has essentially three components:

  • Clusterware, ASM
  • Oracle RDBMS
  • Config Files related to the Storage Cells

Clusterware, ASM and the RDBMS software behave exactly as they do on any RAC node; there is no difference. Because the Compute Nodes are directly connected to the Storage Cells, you also need to check the cell config files.

Troubleshooting the Compute Nodes

| Component | Type | Description |
|---|---|---|
| OS | Syslogs | /var/log/messages |
| OS & Clusterware | ExaWatcher | Logs under ..oracle.oswatcher/osw/archive/ |
| Oracle RDBMS | Exachk script | Comprehensive, very detailed report |
| ASM / Storage Cells | KFOD / V$ views | ASM tools used by Support: KFOD, KFED, AMDU (Doc ID 1485597.1); Storage Cell config files |
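For the OS syslog, a small filter is often a quick first pass. The following is only a minimal sketch: the function name `scan_syslog` and the keyword list are assumptions made for illustration and are not part of any Oracle tooling.

```shell
# scan_syslog [FILE]: print the syslog lines that mention common trouble
# keywords. FILE defaults to /var/log/messages.
# The keyword list is an assumption; adjust it to your environment.
scan_syslog() {
  grep -iE 'error|fail|panic|timeout' "${1:-/var/log/messages}"
}
```

Called without arguments, `scan_syslog` reads /var/log/messages; pass another file as the first argument to check a different log.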

Troubleshooting Exadata Storage Cells

First, an overview from the documentation:

[Figure: Storage Cell architecture (cell_architect)]

What are the components of the Storage Cell?

  • Restart Server
  • Management Server
  • cellsrv

Troubleshooting the Storage Cell

| Component | Type | Description |
|---|---|---|
| OS | Syslogs | /var/log/messages |
| Cell Server | adrci | You can use the adrci tool the same way you use it for the RDBMS software |
| | cellcli | list alerthistory |
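When the alert history is long, a per-severity count gives a quick first impression. The sketch below assumes that the severity appears as a plain word (critical, warning) somewhere on each output line; the function name `summarize_alerts` is made up for this example, and matching on keywords instead of column positions is a deliberate simplification.

```shell
# summarize_alerts: read "cellcli -e list alerthistory" output on stdin and
# count the non-empty lines per severity keyword (case-insensitive).
# Lines that mention neither "critical" nor "warning" are counted as "other".
summarize_alerts() {
  awk '
    BEGIN { crit = 0; warn = 0; other = 0 }
    { line = tolower($0) }
    line ~ /critical/ { crit++; next }
    line ~ /warning/  { warn++; next }
    NF > 0            { other++ }
    END { printf "critical=%d warning=%d other=%d\n", crit, warn, other }
  '
}
```

On a Storage Cell this could be used as `cellcli -e list alerthistory | summarize_alerts`.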

 

Troubleshooting the InfiniBand Switches

| Component | Type | Description |
|---|---|---|
| OS | ExaWatcher | The ExaWatcher log files |
| Topology check | ibdiagtools/verify-topology | Checks all cables, links and speeds. For more information, use the "-h" option |
| | ibdiagtools/infinicheck | Validates the configuration and performance |

 

Conclusion

Troubleshooting an Exadata system is very similar to troubleshooting an Oracle RAC environment with ASM and Clusterware. The major additional components are the Storage Cells and the InfiniBand Switches.

If you want to troubleshoot an Exadata environment on the command line, here are some additional tips:

  • Run exachk every month, and before and after every upgrade. Exachk is a very powerful tool and a must for daily operations.
    • Keep in mind that newer versions have the option
    • "-diff <old_report> <new_report>"
      • which makes it easy to find the differences between two states of the system.
  • Automate your work with small shell scripts. Here is an example that reads the alerthistory of the Storage Cells:

# Print the critical alerthistory entries of cell1 to cell7
for i in 1 2 3 4 5 6 7
do
  ssh root@cell${i} "cellcli -e list alerthistory | grep -i critical"
done
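A slightly more defensive variant of the loop above is sketched here. It takes the cell host names as arguments, prefixes every hit with the cell it came from, and reports cells that could not be queried. The function name `check_cells` is hypothetical, and the sketch assumes passwordless SSH access as root, as in the original loop.

```shell
# check_cells HOST...: print the critical alerthistory entries of each cell,
# prefixing every line with the name of the cell it came from.
check_cells() {
  for cell in "$@"
  do
    # -n keeps ssh from reading stdin; a failing cell does not stop the loop.
    if out=$(ssh -n "root@${cell}" "cellcli -e list alerthistory")
    then
      printf '%s\n' "$out" | grep -i critical | sed "s/^/${cell}: /"
    else
      echo "${cell}: could not query alerthistory" >&2
    fi
  done
}
```

A call such as `check_cells cell1 cell2 cell3` then checks the listed cells in one pass. Filtering on the local side keeps the remote command simple and makes it easy to tell an unreachable cell from one without critical alerts.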

With this kind of simple script you can check your environment quickly and effectively.
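The exachk comparison from the first tip can be scripted in the same spirit. The sketch below only builds the command line for the two newest reports in a directory and does not run exachk itself. The function names and the `exachk_*.html` file name pattern are assumptions; check how your exachk version names its reports before relying on it.

```shell
# latest_two_reports DIR: print the two newest exachk_*.html files in DIR,
# older one first. Parsing ls output is fine for a sketch, but it breaks on
# file names that contain whitespace.
latest_two_reports() {
  ls -t "$1"/exachk_*.html 2>/dev/null | head -n 2 | sed '1!G;h;$!d'
}

# exachk_diff_cmd DIR: print the exachk -diff command for those two reports.
exachk_diff_cmd() {
  set -- $(latest_two_reports "$1")
  if [ "$#" -ne 2 ]; then
    echo "need at least two exachk reports" >&2
    return 1
  fi
  echo "exachk -diff $1 $2"
}
```

Printing the command instead of executing it lets you review the chosen reports first.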

Stay tuned: in a later post I will show the benefits of an exachk report.


Patching in the Oracle Database Environment

Anyone who deals with patching Oracle databases has been waiting a long time for a My Oracle Support note like this one.

MOS Note:1962125.1 – Overview of Database Patch Delivery Methods

It describes the different types of patches, such as Patch Sets, PSUs, SPUs, interim patches, the Quarterly Full Stack Bundle Patch, Bundle Patches and Proactive Patches. It also gives recommendations for applying and testing patches.

The MOS note also includes a short FAQ section and closes with the topic of proactive patching.

This MOS note is a must for every DBA!