The oldest posts, are written in Italian. If you are interested and you want read the post in English, please use Google Translator. You can find it on the right side. If the translation is wrong, please email me: I'll try to translate for you.

martedì, agosto 02, 2011

A seguito di un crash di entrambe le istanze di un RAC, ho aperto una Service Request su metalink. Più sotto c'è la risposta.

Ecco le cose interessanti:

  1. "KSV master wait" event, indicates the process on the RDBMS side is waiting for a reply from a process on the ASM side.
  2. "ASM file metadata operation" is a wait envet generated by a database remote process (in this case, the database ABC is the clinet)
  3. SIGLOST: SIG is a common prefix for signal names. LOST refers to a lost file lock.

Support note
According to the systemstate dump, from the instance, the majority of the background processes are found dead and last wait for service name: SYS$BACKGROUND , is for 'KSV master wait' . Event 'KSV master wait' indicates the process on the RDBMS side is waiting for a reply from a process on the ASM side.

In both ASM instances, ORA-07445 is being dumped , right before notifying the death of the corresponding instance:

Sun Jul 24 14:45:03 2011
NOTE: ASM client ABCp1:ABCp died unexpectedly.

According to the trace file, it was generated by a remote process waiting for 'ASM file metadata operation'. The call stack of the ORA-07445 is generic and per trace file, "Process received SIGLOST. Therefore the callstack does not point to the location of the problem." However, the callstack contains a hint, and that is the tcpip function, which could point to a problem at the network level.

Given the data above, please check with your system administrator if there's any network issue that could have interrupted the communication between the database and the ASM instances, and let me know the results.

== ODM Data Collection ==

Dump file /sbRdbms/app/oracle/diag/asm/+asm/+ASM1/incident/incdir_17865/+ASM1_ora_5029906_i17865.trc
Oracle Database 11g Enterprise Edition Release - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /sbRdbms/app/oracle/product/ASM
System name: AIX
Node name: abcdbs01
Release: 1
Version: 6
Machine: 00CD96944C00
Instance name: +ASM1
Redo thread mounted by this instance: 0
Oracle process number: 33
Unix process pid: 5029906, image: oracle@abcdbs01 (TNS V1-V3)

*** 2011-07-24 14:45:00.080
*** SESSION ID:(83.3) 2011-07-24 14:45:00.080
*** CLIENT ID:() 2011-07-24 14:45:00.080
*** SERVICE NAME:() 2011-07-24 14:45:00.080
*** MODULE NAME:(oracle@abcdbs01 (TNS V1-V3)) 2011-07-24 14:45:00.080
*** ACTION NAME:() 2011-07-24 14:45:00.080

Dump continued from file: /sbRdbms/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_5029906.trc
ORA-07445: exception encountered: core dump [_ptrgl()] [SIGSEGV] [ADDR:0x7F8000007FC00000] [PC:0x10000DE98] [Address not mapped to object] []

========= Dump for incident 17865 (ORA 7445 [_ptrgl()]) ========
----- Beginning of Customized Incident Dump(s) -----
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x7F8000007FC00000] [PC:0x10000DE98, _ptrgl()]
iar: 0x000000010000de98 lr: 0x0000000102274170
msr: 0xa00000000000d032 cr: 0x0000000024202220
r00: 0x0000000000000028 r01: 0x0fffffffffff9250 r02: 0x0000000110352700
r03: 0x0fffffffffffaf50 r04: 0x0000000000000140 r05: 0x0000000000002683
r06: 0x0000000000000001 r07: 0x0000000000000005 r08: 0x00000000ffff9450
r09: 0x0fffffffffff9348 r10: 0x0700000010f9d400 r11: 0x7f8000007fc00000
r12: 0x0000000044202220 r13: 0x00000001103a8e18 r14: 0x0000000000000000
r15: 0x0fffffffffffac48 r16: 0x00000000000001d5 r17: 0x0000000110282a98
r18: 0x0fffffffffffb104 r19: 0x00000001100caa28 r20: 0x000000010762e2b0
r21: 0x0000000000000000 r22: 0x00000000000000b8 r23: 0x0000000000000000
r24: 0x000000000000008f r25: 0x0000000107d164d8 r26: 0x0000000000000000
r27: 0x00000001100cab80 r28: 0x0fffffffffff9a20 r29: 0x0700000000003640
r30: 0x000000010859bee8 r31: 0x0fffffffffffaf50

*** 2011-07-24 14:45:00.080
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.

----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
opiodr ttcpip opitsk opiino opiodr opidrv sou2o opimai_real main start


Argument/Register addr=0x1078833c2.
Dump of memory from 0x107883382 to 0x1078834c2
107883380 7820206C 723A2030 78253031 366C780A [x lr: 0x%016lx.]
107883390 0049424D 6D73723A 20307825 3031366C [.IBMmsr: 0x%016l]
1078833A0 78202063 723A2030 78253031 36780A00 [x cr: 0x%016x..]
1078833B0 72253032 643A2030 78253031 366C7825 [r%02d: 0x%016lx%]
1078833C0 63000A00 00000000 00000000 4E4F5449 [c...........NOTI]
1078833D0 43453A20 50726F63 65737320 72656365 [CE: Process rece]
1078833E0 69766564 20736967 6E616C20 5349474C [ived signal SIGL]
1078833F0 4F53542E 20546865 7265666F 72652C20 [OST. Therefore, ]
107883400 74686520 63616C6C 0A202020 20202020 [the call. ]
107883410 20737461 636B2064 6F657320 6E6F7420 [ stack does not ]
107883420 706F696E 7420746F 20746865 206C6F63 [point to the loc]
107883430 6174696F 6E206F66 20746865 2070726F [ation of the pro]
107883440 626C656D 2E0A0A00 0A457863 65707469 [blem.....Excepti]
107883450 6F6E205B 74797065 3A202573 2C202573 [on [type: %s, %s]
107883460 5D205B41 4444523A 3078256C 585D0049 [] [ADDR:0x%lX].I]
107883470 205B5043 3A307825 6C582C20 0049424D [ [PC:0x%lX, .IBM]
107883480 7B656D70 74797D00 25735D00 205B6578 [{empty}.%s]. [ex]
107883490 63657074 696F6E20 69737375 65642062 [ception issued b]
1078834A0 79207069 643A2025 642C2075 69643A20 [y pid: %d, uid: ]
1078834B0 25645D00 50433A30 78256C58 0049424D [%d].PC:0x%lX.IBM]
1078834C0 41444452 [ADDR]

SO: 0x700000010f9b9e0, type: 4, owner: 0x700000010e692d0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x700000010e692d0, name=session, file=ksu.h LINE:10719 ID:, pg=0
(session) sid: 83 ser: 3 trans: 0x0, creator: 0x700000010e692d0
flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x8) -/-
DID: , short-term DID:
txn branch: 0x0
oct: 0, prv: 0, sql: 0x0, psql: 0x0, user: 0/SYS
ksuxds FALSE at location: 0
service name: SYS$USERS
client details:
O/S info: user: oracle, term: , ospid: 1716314
machine: abcdbs01 program: oracle@abcdbs01 (TNS V1-V3)
application name: oracle@abcdbs01 (TNS V1-V3), hash value=335489673
Current Wait Stack:
0: waiting for 'ASM file metadata operation'
msgop=28, locn=1, =0
wait_id=5596 seq_num=9859 snap_id=1
wait times: snap=2.076467 sec, exc=2.076467 sec, total=2.076467 sec
wait times: max=infinite
wait counts: calls=0 os=0
in_wait=1 iflags=0x5a0
Wait State:
auto_close=0 flags=0x22 boundary=0x0/-1
Session Wait History:
0: waited for 'SQL*Net message from client'
driver id=62657100, #bytes=1, =0
wait_id=5595 seq_num=9858 snap_id=1
wait times: snap=4.945453 sec, exc=4.945453 sec, total=4.945453 sec
wait times: max=infinite
wait counts: calls=0 os=0
occurred after 0.000004 sec of elapsed time
1: waited for 'SQL*Net message to client'
driver id=62657100, #bytes=1, =0
wait_id=5594 seq_num=9857 snap_id=1

Nessun commento: