From: system PRIVILEGED account (rootstage1.cxo.cpqcorp.net)
Date: Mon Feb 11 2002 - 00:30:04 CST

    *******************************************************************************
    * *
    * This is an update to an existing patch... *
    * *
    * Online links can be found at *
    * http://ftp.support.compaq.com/patches/public/vms/vax/v6.2/vaxshad10_062.README
    *******************************************************************************

    TITLE: OpenVMS VAXSHAD10_062 V6.2 SHADOWING ECO Summary
     
    New Kit Date: 11-FEB-2002
    Modification Date: Not Applicable
    Modification Type: Updated Kit Supersedes VAXSHAD09_062

    NOTE: An OpenVMS saveset or PCSI installation file is stored
           on the Internet in a self-expanding compressed file.
     
           For OpenVMS savesets, the name of the compressed saveset
           file will be kit_name.a-dcx_vaxexe for OpenVMS VAX or
           kit_name.a-dcx_axpexe for OpenVMS Alpha. Once the OpenVMS
           saveset is copied to your system, expand the compressed
           saveset by typing RUN kitname.dcx_vaxexe or kitname.dcx_alpexe.
     
           For PCSI files, once the PCSI file is copied to your system,
           rename the PCSI file to kitname-dcx_axpexe.pcsi, then it can
           be expanded by typing RUN kitname-dcx_axpexe.pcsi. The resultant
           file will be the PCSI installation file which can be used to install
           the ECO.
     

     
    Copyright (c) Compaq Computer Corporation 1998, 1999, 2002. All rights reserved.

    OP/SYS: DIGITAL OpenVMS VAX

    COMPONENT: SHADOWING
                  SHDRIVER.EXE
                  SHADOW_SERVER.EXE

    SOURCE: Compaq Computer Corporation

    ECO INFORMATION:

         ECO Kit Name: VAXSHAD10_062
         ECO Kits Superseded by This ECO Kit: VAXSHAD09_062
         ECO Kit Approximate Size: 1944 Blocks
         Kit Applies To: OpenVMS VAX V6.2
         System/Cluster Reboot Necessary: Yes
         Rolling Re-boot Supported: Yes

         Installation Rating: 2 - To be installed on all systems running
                                   the listed version of OpenVMS and
                                   using the following feature:

                                    SHADOWING

         Kit Dependencies:

           The following remedial kit(s), or later, MUST be installed BEFORE
           installation of this, or any required kit:

             VAXCLUSIO01_062

           In order to receive all the corrections listed in this
           kit, the following remedial kits, or later, should also be installed:

             None
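
    The "or later" rule for prerequisite kits can be checked
    mechanically. The following is a minimal Python sketch, assuming
    kit names always follow the pattern seen in this document (a
    product prefix of letters, a two-digit revision, an underscore,
    and the OpenVMS version digits). The helper names parse_kit_name
    and satisfies are hypothetical and are not part of VMSINSTAL.

```python
import re
from typing import NamedTuple


class KitName(NamedTuple):
    product: str   # e.g. "VAXSHAD"
    revision: int  # e.g. 10
    version: str   # e.g. "062" for OpenVMS V6.2


# Assumed pattern: letters, a two-digit revision, "_", then version digits.
_KIT_RE = re.compile(r"^([A-Z]+)(\d{2})_(\d+)$")


def parse_kit_name(name: str) -> KitName:
    """Split a kit name such as 'VAXSHAD10_062' into its parts."""
    match = _KIT_RE.match(name.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized kit name: {name!r}")
    product, revision, version = match.groups()
    return KitName(product, int(revision), version)


def satisfies(installed: str, required: str) -> bool:
    """True if `installed` is the required kit or a later revision of it."""
    have, need = parse_kit_name(installed), parse_kit_name(required)
    return (have.product == need.product
            and have.version == need.version
            and have.revision >= need.revision)


print(satisfies("VAXCLUSIO01_062", "VAXCLUSIO01_062"))  # True
print(satisfies("VAXSHAD09_062", "VAXSHAD10_062"))      # False
```

    Kit names whose product prefix itself contains digits would need
    a looser pattern than the one assumed here.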

    ECO KIT SUMMARY:

    An ECO kit exists for SHADOWING on OpenVMS VAX V6.2. This kit
    addresses the following problems:

    PROBLEMS ADDRESSED IN VAXSHAD10_062 KIT:

      o A host based raidset can hang when one member of the shadowset
          encounters an Operation Incomplete error.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o The system can crash with a SHADDETINCON bugcheck.

          Crashdump Summary Information:
          ------------------------------
          Bugcheck Type: SHADDETINCON, SHADOWING detects
                             inconsistent state
          Current Process: CTM$_00060006
          Current Image: $1$DGA5014:[CTM$TMROOT.][CTM_HAMMER]
                             CTM_HAMMER_ALPHA_32.EXE;1
          Failing PC: FFFFFFFF.804A1CD4 SYS$SHDRIVER+93CD4
          Failing PS: 14000000.00000804
          Module: SYS$SHDRIVER (Link Date/Time:
                             15-DEC-2000 15:08:57.95)
          Offset: 00093CD4

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o A mini copy operation aborts with a %SYSTEM-F-IVADDR error
          message.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o Cycles are being consumed by the issuing of TQEs (Time Queue
          Element) that serve no purpose.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o The system can crash with a SHADDETINCON bugcheck in
          SYS$SHDRIVER + 000762A0. This occurs when the master member
          identified in the IN_SET lock value block is not a member of
          the set on the Watcher node.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o A mini copy /POLICY=MINICOPY operation can occasionally fail
          if the shadow set member is not online at the time the SCB
          read occurs.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o Multiple systems can hang on cluster shutdown.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o A system disk MVTIMEOUT is not managed correctly.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o A system crash occurs with SHADDETINCON in EXPEL_DEVICE when
          membership event status cannot be determined in
          end_mbr_change_vp.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o If SYSGEN system check is enabled, the first MOUNT of a system
          disk will crash the system.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o Mount verification messages can occur with no apparent cause.
          There is no way to identify what is causing these messages.

              Images Affected: [SYS$LDR]SHDRIVER.EXE
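
    In the crashdump summary earlier in this list, the Failing PC,
    the module offset, and the SYS$SHDRIVER+93CD4 notation are tied
    together by simple subtraction. The following Python sketch shows
    the arithmetic; the function names are hypothetical, and the
    "load address" it prints is only what the quoted fields imply,
    not a value taken from the dump itself.

```python
def parse_vms_hex(text: str) -> int:
    """Parse a hex value as printed in the dump, e.g. 'FFFFFFFF.804A1CD4'."""
    return int(text.replace(".", "").replace(" ", ""), 16)


def module_base(failing_pc: str, offset: str) -> int:
    """Return the load address implied by a failing PC and module offset."""
    return parse_vms_hex(failing_pc) - parse_vms_hex(offset)


base = module_base("FFFFFFFF.804A1CD4", "00093CD4")
print(f"{base:016X}")  # FFFFFFFF8040E000
```

    Comparing the implied base across several dumps from the same
    boot is one quick way to confirm that they refer to the same
    loaded image.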

    Problems Addressed in VAXSHAD09_062:

      o A SHADOWSET goes into MOUNTVERIFYTIMEOUT and cannot be
          remounted. The process attempting the mount hangs.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o The system can crash with a SHADDETINCON bugcheck.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o Disabling a FibreChannel cascade connection results in a
          system crash.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o Disabling a FibreChannel cascade connection results in a
          corruption of a shadowset member.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o For users who had set bit 16 in SHADOW_SYS_DISK to keep read
          requests away from remote members, reads would occasionally
          still go to the remote members.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o A CPUSPINWAIT bug check can occur if the read of the SCB of a
          shadow set member cannot pass the checksum test.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o An assisted copy operation (DCD) will not always be initiated
          properly. During an assisted copy operation, if the source
          member was dismounted or otherwise removed from the shadow
          set, the connection to the controller would not clean up
          correctly.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o A full copy operation that is interrupted by a mini merge may
          not be completed correctly.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o When a copy operation that had interrupted a merge operation
          is terminating, if there are no members marked for merge the
          system can crash with a SHADDETINCON bugcheck.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o A SHOW DEVICES command shows zero % merged status, even though
          the shadow set status does not indicate that a merge is
          required.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o Use of bit 16 in SHADOW_SYS_DISK to bias reads toward the
          local source shadow set member does not always work.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o The system can crash with a SHADBOOTFAIL bugcheck. The
          following is a crash dump extract:

              ** Bugcheck code = 000008CC: SHADBOOTFAIL, SHADOWING failed to
                 boot from system disk shadow set
              ** Current Process = NULL
              ** Current PSB ID = 00000001
              ** Image Name =

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o The system crashes with an INVEXCEPTN bugcheck at SYS$SHDRIVER+6D0DC.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o The system crashes in XQP when an IO gets a SS$_DATACHECK
          during a shadow set copy operation.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o SHADDETINCON crashes in the SHD_LOCK SHLK$MERGE_SIGNAL
          routine.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o SDA displays incomplete information for DSA devices.

              Images Affected: [SYSEXE]SDA.EXE

      o The system crashes with a SHADBOOTFAIL bugcheck when booting.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o The output of $ SHOW DEVICE /SERVED may be inconsistent if
          SET DEVICE /SERVED is issued at the same time. A system
          crash could occur if queues are updated while SHOW
          DEVICE /SERVED code is traversing the queues at elevated IPL
          and the update causes an access violation or pagefault.

              Images Affected: [SYSEXE]SHOW.EXE

      o Enable shadowing of devices that report the same value for
          MAXBLOCK, but different values for CYLINDER, TRACK, and/or
          SECTOR.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o Match master hangs until MVTIMEOUT expires.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o The master member of a multi member shadowset, whose members
          are all valid, SRC MBRS, can have a generation number that is
          older than the other 2 members:

                  master mbr ($1$DGA2015:)
                  SDA> eval/time 009E8F02.FBFF8657
                  21-APR-2000 10:15:30.08

                  mbr1 ($1$DGA2016):
                  SDA> eval/time 009E8F02.FDF405D2
                  21-APR-2000 10:15:33.36

                  mbr2 ($1$DGA3013):
                  SDA> eval/time 009E8F02.FDF405D2
                  21-APR-2000 10:15:33.36

          If an attempt is made to mount this set on a client node, the
          client node can hang the set in 'mounted alloc' state and
          OPCOM will report path switches occurring for mbrs $1$DGA2016
          & $1$DGA3013.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o The system crashes with INVEXCEPTN when a write completes to a
          multi-member shadow set. An attempt to return to the WLE
          (write log entry) fails because one member of the shadow set
          has been removed and there is no Write Log Table.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o A system can crash with a SHADDETINCON bugcheck after removing
          or adding a shadow set member.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o When a full copy operation is pending or in progress, the
          removal of the master shadow set member may cause data
          corruption. The conditions under which this can occur are:

              1. A full merge is pending or in process on a two member
                  shadow set.

              2. A third member is added to the shadow set.

              3. There is a difference between the two SRC members

              4. The master member is then removed from the shadow set

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o Under certain circumstances, when a path to a device is lost
          during a write operation, the SCB (system control block) can
          contain a stale master member index value. This will cause
          the system to crash with a SHADDETINCON bugcheck.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o A crash in SHLK$MERGE_SIGNAL can occur on a cluster node.
          This can happen when a lock value block in MRGVAL becomes
          invalid due to another cluster node, that holds MRGVAL lock,
          either crashing or being shut down.

              Images Affected: [SYS$LDR]SHDRIVER.EXE

      o Increase the merge factor for shadowing from 1,000 to 10,000.
          This change also displays the merge factor only during an
          actual merge operation.

              Images Affected: [SYSEXE]SHADOW_SERVER.EXE
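
    The generation numbers quoted earlier in this list are OpenVMS
    system time quadwords, which SDA's eval/time renders as dates.
    OpenVMS system time counts 100-nanosecond intervals from
    17-NOV-1858, so the conversion can be reproduced outside SDA. The
    following Python sketch does so; the function names are
    hypothetical, and truncating (rather than rounding) to
    hundredths of a second is an assumption that happens to match
    the values shown.

```python
from datetime import datetime, timedelta

# OpenVMS system time counts 100-nanosecond ticks from 17-NOV-1858.
VMS_EPOCH = datetime(1858, 11, 17)
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def vms_time_to_datetime(quadword: str) -> datetime:
    """Convert a hex quadword such as '009E8F02.FBFF8657' to a datetime."""
    ticks = int(quadword.replace(".", ""), 16)
    # datetime resolves microseconds, so drop the last decimal digit.
    return VMS_EPOCH + timedelta(microseconds=ticks // 10)


def format_vms(moment: datetime) -> str:
    """Render in DD-MMM-YYYY HH:MM:SS.CC form, truncating to centiseconds."""
    centiseconds = moment.microsecond // 10000
    return (f"{moment.day:2d}-{MONTHS[moment.month - 1]}-{moment.year} "
            f"{moment.strftime('%H:%M:%S')}.{centiseconds:02d}")


master = vms_time_to_datetime("009E8F02.FBFF8657")
member = vms_time_to_datetime("009E8F02.FDF405D2")
print(format_vms(master))  # 21-APR-2000 10:15:30.08
print(format_vms(member))  # 21-APR-2000 10:15:33.36
print(master < member)     # True: the master's generation number is older
```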

    PROBLEMS ADDRESSED IN VAXSHAD08_062 KIT:

      o Functionality was added to enable customers to shadow devices
         that report an identical number of "Total Blocks".

         In the past, Sectors per track, Tracks per cylinder, and Total
         cylinders also had to be identical, but this is no longer
         required.

         For example:

           $ SHOW DEVICES/FULL $84$DKC200:

         Disk $84$DKC200: (CSG84), device type RZ74, is online,
         mounted, file-oriented device, shareable, served to a cluster
         via MSCP Server, error logging enabled.

           Error count       1           Operations completed 28293
           Owner process     ""          Owner UIC            [SYSTEM]
           Owner process ID  00000000    Dev Prot             S:RWPL,O:RWPL,G:R,W
           Reference count   137         Default buffer size  512
           Total blocks      6976375     Sectors per track    91
           Total cylinders   3067        Tracks per cylinder  25

           $ SHOW DEVICES/FULL $84$MDA1200:

         Disk $84$MDA1200: (CSG84), device type RAM Disk, is online,
         allocated, deallocate on dismount, mounted, file-oriented
         device, shareable, served to cluster via MSCP Server.

           Error count       0           Operations completed 420
           Owner process     "username"  Owner UIC            [SYSTEM]
           Owner process ID  4260041B    Dev Prot             S:RWPL,O:RWPL,G:R,W
           Reference count   2           Default buffer size  512
           Total blocks      6976375     Sectors per track    64
           Total cylinders   3407        Tracks per cylinder  32
           Allocation class  84

         These two devices can be members of the same shadow set.

         Device                 Device          Error  Volume      Free    Trans  Mnt
         Name                   Status          Count  Label       Blocks  Count  Cnt
         DSA8400:               Mounted         0      CSG84_V71   56308   319    1
         $84$DKC200:   (CSG84)  ShadowSetMember 0      (member of DSA8400:)
         $84$MDA1200:  (CSG84)  ShadowCopying   0      (copy trgt DSA8400: 2% copied)
         USERNAME_CSG84 ...

      o Faster I/O subsystems, for example the HSZ50 and the HSZ70,
         were taking longer to perform full merges than some older and
         slower subsystems.

         Changes were made to allow the System Manager to adjust
         thresholds. Two new logical names can be defined to vary the
         merge multiplication factor used for a virtual unit, on a per
         node basis.

         The logicals used must be defined in the system table and
         therefore should be defined on each node in the cluster. The
         valid range for a threshold is 100 to 1000. Any value outside
         of this range causes a factor to default to 200. This value
         of 200 is displayed at the start of a shadow set merge, in the
         '%SHADOW_SERVER-I-SSRVINIMRG' message, following the word
         'Factor'.

         CAUTION:
         Increasing the values excessively may cause application
         performance problems when merges are occurring. When setting
         values, System Managers must balance the site specific
         application needs with their merge requirements.

         Since the two logical names are evaluated every one thousand
         I/Os, the factor can be adjusted while a merge is in
         progress.

         The first logical name is:

                                SHAD$MERGE_DELAY_FACTOR_DSAnnnn
                                                           ^^^^
                                                           ||||
                                                           vvvv
         This logical name is virtual unit specific, with 'nnnn'
         representing the virtual unit number. This delay factor will
         be applied to the virtual unit only. If any important disks
         need to be merged with minimal disruption, values as high
         as 1,000% (threshold = 10 times best time) may be defined. By
         the same token, if a particular disk's merge operation is
         interfering with application I/O, the merge can be made to
         delay more frequently by reducing the value to as low as:

              100 (threshold = 1 times the best time)

         If the above logical is not defined, then the following
         logical is evaluated:

              SHAD$MERGE_DELAY_FACTOR

         Like the virtual unit specific logical, this value will adjust
         the threshold, but only for shadow sets that do not have a
         virtual unit specific logical defined.

      o Additional tracing code was added to help diagnose why mini
         merge operations were converted to full merge.

      o If full merge operations are interrupted with a copy
         operation, then write logging is enabled, which wastes cluster
         write logging resources.

      o If a VMScluster that has more than 96 nodes crashes, then
         write logging is never used to recover the virtual unit. The
         result is unnecessary full merge operations.

      o If a shadow set exists on multiple nodes in a cluster and one
         cluster member adds a device which cannot be accessed by
         other nodes in the cluster, then those nodes will crash with
         an INVEXCEPTN in the SHDriver within SHSB$MATCH_MASTER_SCB.

         When calling SHSB$AVAILABLE_SHADOW_SET, the call to log an
         error packet resulted in an overwritten register (R0) and then
         a system crash occurred.

         An example of a crash footprint is:

          Crash Time: 28-OCT-1998 12:47:46.03
          Bugcheck Type: INVEXCEPTN, Exception while above ASTDEL
          Node: ATOZ (Clustered)
          CPU Type: AlphaServer 8400 Model EV56/440
          VMS Version: V6.2-1H3
          Current Process: ATOZ_1
          Current Image: DSA1111:<GBASE.>[RUN]GEM.EXE
          Failing PC: FFFFFFF8026E454
          Failing PS: 34000000 00000804
          Module: SYS$SHDRIVER
          Offset: 0003E454
          Boot Time: 25-OCT-1998 18:51:50.00

      o A Virtual Unit can hang and then no further use of the virtual
         unit is possible. If the System Dump Analyzer (SDA) is used
         to examine the virtual unit, then a negative value will be
         found in UCB$W_RWAITCNT.

      o Repeating mini merges or full merges can occur immediately
         after the successful completion of a previous mini merge or
         full merge on a virtual unit.

      o During a system shutdown, two possible scenarios could occur:

         1. Other nodes that have the system disk virtual unit MOUNTed
             may suspend use of that virtual unit, until the node
             running shutdown is stopped.

         2. When a system disk that is disabled for write logging is
             mounted on several nodes in a cluster, a non-system disk
             volume access to that virtual unit in the cluster may
             suspend, until the node running shutdown is stopped.

      o During a system reboot, the rebooting node may intermittently
         hang if write logging is concurrently enabled on the system
         disk and on other nodes in the cluster.

      o Since a virtual unit can be aborted for several reasons,
         additional tracing is needed to differentiate why the virtual
         units abort.
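
    The lookup order for the two merge delay logical names described
    in the VAXSHAD08_062 list above (unit-specific name first, then
    the general name, with out-of-range values falling back to 200)
    can be sketched in Python as follows. This is an illustration
    only: the function name is hypothetical, the dictionary stands in
    for the system logical name table, and using the unit number
    without zero padding in the logical name is an assumption. The
    upper bound is a parameter because a later kit changed the merge
    factor limit.

```python
from typing import Mapping

DEFAULT_FACTOR = 200  # value used when a setting is out of range


def effective_merge_factor(logicals: Mapping[str, str], unit: int,
                           min_factor: int = 100,
                           max_factor: int = 1000) -> int:
    """Return the merge delay factor that would apply to virtual unit DSA<unit>."""
    unit_name = f"SHAD$MERGE_DELAY_FACTOR_DSA{unit}"
    # The unit-specific logical wins; the general one is consulted
    # only when the unit-specific one is not defined at all.
    raw = logicals.get(unit_name, logicals.get("SHAD$MERGE_DELAY_FACTOR"))
    if raw is None:
        # Assumption: with neither logical defined, the default applies.
        return DEFAULT_FACTOR
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_FACTOR
    return value if min_factor <= value <= max_factor else DEFAULT_FACTOR


table = {
    "SHAD$MERGE_DELAY_FACTOR": "500",
    "SHAD$MERGE_DELAY_FACTOR_DSA8400": "1000",
}
print(effective_merge_factor(table, 8400))  # 1000 (unit-specific value)
print(effective_merge_factor(table, 12))    # 500  (general value)
```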

    PROBLEMS ADDRESSED IN VAXSHAD07_062 KIT:

      o When shutting down a node in a VMScluster, the system that is
         being used to perform the shutdown will crash.

      o Shadowsets intermittently hang.

      o A change has been made in the shadowing code to enhance
         performance on systems that make reasonable use of VIOC cache
         when a drive is in merge state.

      o A new informational message has been added that will result in
         a Mount verify message if the IO$_DIAGNOSE function is executed
         by the SHDRIVER.

      o Additional code changes to improve the error log reporting for
         Volume Shadowing.

      o The Volume Shadowing code in OpenVMS V7.1 and V6.2, with the
         CLUSIO kit installed, included a new algorithm that did not
         always guarantee that read requests would be serviced by a
         locally connected disk in preference to a disk that was MSCP
         served by the host. Prior to V7.1 (and V6.2 with the CLUSIO
         kit installed) if there were local and MSCP served disks to
         choose from, the request was queued to the local disk unless
         the queue depth exceeded twenty.

         Some customers who shadowed over FDDI reported that the new
         algorithm was not preferable, and therefore requested the
         ability to choose the previous behavior.

         The ability to prefer that read requests be performed by local
         shadow set members, over those served by an OpenVMS system has
         been added to this version of the driver. To select that mode
         of operation another bit(16) in SHADOW_SYS_DISK has been used.

           $ MC SYSGEN
           SYSGEN> SHOW SHADOW_SYS_DISK
           Parameter Name      Current   Default   Min.      Max.
           --------------      -------   -------   -------   -------
           SHADOW_SYS_DISK     1         0         0         -1
           SYSGEN> SET SHADOW_SYS_DISK %X10001
           SYSGEN> WRITE CURRENT
           SYSGEN> WRITE ACTIVE
           SYSGEN> EXIT
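
    The value %X10001 in the SYSGEN example is the existing setting
    of 1 with bit 16 added. A small Python sketch of that bit
    arithmetic follows; the function and constant names are
    hypothetical and exist only for illustration.

```python
PREFER_LOCAL_READS = 1 << 16  # bit 16 of SHADOW_SYS_DISK, as described above


def with_local_read_preference(current: int) -> int:
    """Return the parameter value with bit 16 set, other bits unchanged."""
    return current | PREFER_LOCAL_READS


def sysgen_hex(value: int) -> str:
    """Format a value the way it is typed at the SYSGEN prompt."""
    return f"%X{value:X}"


print(sysgen_hex(with_local_read_preference(1)))  # %X10001
```

    Using a bitwise OR keeps whatever other bits are already set in
    the current value, which matters on systems where
    SHADOW_SYS_DISK is not simply 1.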

    PROBLEMS ADDRESSED IN VAXSHAD06_062 KIT:

      o A potential system crash with SHADDETINCON bugcheck at
         SHDRIVER+12124 during boot from a multi-member shadow set.
         This occurs if the booting member is not the first in the
         member array, and the other member is not yet visible.

      o SHADDETINCON bugchecks occur on multiple nodes in a cluster
         during a merge operation.

         System crash information
         ------------------------
         Time of system crash: 13-APR-1997 13:21:05.59
         Version of system: OpenVMS (TM) VAX Version V6.2
         System Version Major ID/Minor ID: 1/0
         VAXcluster node: CYV7KE, a VAX 7000-760
         Crash CPU ID/Primary CPU ID: 00/00
         Bitmask of CPUs active/available: 0000003F/0000003F
         CPU 00 reason for Bugcheck: SHADDETINCON, SHADOWING detects
          inconsistent state
         Process currently executing on this CPU: None
         Current IPL: 8 (decimal)
         CPU database address: C9212000
         MPB address: B29B09C0
         CPU 00 Processor stack

         General registers:

          R0 = 00000000 R1 = B67D258C R2 = B67D2180 R3 = B6544600
          R4 = B35992C0 R5 = B624A340 R6 = B65447C8 R7 = 00000000
          R8 = B67D2180 R9 = B6544730 R10 = 00000000 R11 = B6544600
          AP = B65446B8 FP = 7FE2534C SP = C9213DAC PC = B82E42B3
          PSL = 04080000

         Processor registers:

          P0BR = C9946800 SBR = 1EF80400 ASTLVL = 00000004
          P0LR = 0000018B SLR = 003FFF00 SISR = 00000010
          P1BR = C9216400 PCBB = 7F7B0020 ICCS = 00000000
          P1LR = 001FF116 SCBB = 1EF5F000 SID = 17000201

          LDEV = 00018002 LBER = 00000000 LCNR = 00000001
          LCON0 = DF0007ED LCON1 = 00000000 TODR = 44D09B64
          LBECR0 = 0040003A LBECR1 = 00008060 LMODE = 000332A4

          LMERR = 00000000 BIU_STAT = F00E1070 BIU_ADDR = 00000298
          MMESTS = 10004005 TBSTS = 800001D0 PCSTS = FFFFF800
          ISP = C9213DAC
          KSP = 7FFE7800
          ESP = 7FFE9800
          SSP = 7FFED800
          USP = 7FE2534C

      o System crashes in SHADDETINCON SYS$SHDRIVER+3D3C0.

          Bugcheck Type: SHADDETINCON, SHADOWING detects inconsistent state
          Node: RBADC2 (Clustered)
          CPU Type: AlphaServer 2100 4/233
          VMS Version: V6.2-1H2
          Current Process: NULL
          Current Image: <not available>
          Failing PC: FFFFFFFF 8025B3C0
          Failing PS: 08000000 00000804
          Module: SYS$SHDRIVER
          Offset: 0003D3C0
          Boot Time: 15-APR-1997 08:39:31.00
          System Uptime: 5 22:23
          Crash/Primary CPU: 00/00
          Saved Processes: 22
          Pagesize: 8 KByte (8192 bytes)
          Physical Memory: 256 MByte (32768 PFNs)
          Dumpfile Pagelets: 184518 blocks
          Dump Flags: olddump,writecomp,errlogcomp,dump_style
          EXE$GL_FLAGS: poolpging,init,bugdump
          Stack Pointers:
          KSP = FFFFFFFF 8A731D88  ESP = FFFFFFFF 8A733000  SSP = FFFFFFFF 8A72D000
          USP = FFFFFFFF 8A72D000
          General Registers:
          R0  = 00000000 00000001  R1  = FFFFFFFF 8162F7E0  R2  = FFFFFFFF 8162F7C0
          R3  = FFFFFFFF 8186EBC0  R4  = 00000000 00000003  R5  = FFFFFFFF 8162F890
          R6  = FFFFFFFF 8186EE80  R7  = 00000000 00000000  R8  = FFFFFFFF 8162F7C0
          R9  = FFFFFFFF 8186EDE8  R10 = 00000000 00000000  R11 = FFFFFFFF 8186EBC0
          R12 = FFFFFFFF 8186ED38  R13 = FFFFFFFF 8710A270  R14 = FFFFFFFF 87084200
          R15 = 00000000 003C60E0  R16 = 00000000 000008B4  R17 = 00000000 00000501
          R18 = 00000000 00000000  R19 = FFFFFFFF 87084200  R20 = 00000000 00000000
          R21 = FFFFFFFF 8162F808  R22 = FFFFFFFF 8710FB20  R23 = 00000000 00000000
          R24 = 00000000 00000001  AI  = 00000000 00000001  RA  = FFFFFFFF 80288928
          PV  = FFFFFFFF 8710A698  R28 = 00000000 00000000  FP  = FFFFFFFF 8A731DE0
          PC  = FFFFFFFF 8025B3C4  PS  = 08000000 00000804
          System Registers:
          Page Table Base Register (PTBR)             00000000 00007FF8
          Processor Base Register (PRBR)              FFFFFFFF 8110A000
          Privileged Context Block Base (PCBB)        00000000 0110A080
          System Control Block Base (SCBB)            00000000 000001B3
          Software Interrupt Summary Register (SISR)  00000000 00000000
          Address Space Number (ASN)                  00000000 00000000
          AST Summary / AST Enable (ASTSR_ASTEN)      00000000 00000000
          Floating-Point Enable (FEN)                 00000000 00000000
          Interrupt Priority Level (IPL)              00000000 00000008
          Machine Check Error Summary (MCES)          00000000 00000000
          Virtual Page Table Base Register (VPTB)     00000002 00000000
          Failing Instruction:
          SYS$SHDRIVER_NPRO+393C0: BUGCHK
          Instruction Stream (last 20 instructions):
          SYS$SHDRIVER_NPRO+39370: RET R31,(R28)
          SYS$SHDRIVER_NPRO+39374: LDQ_U R31,(SP)
          SYS$SHDRIVER_NPRO+39378: SUBQ SP,#X10,SP
          SYS$SHDRIVER_NPRO+3937C: STQ R16,#X0008(SP)
          SYS$SHDRIVER_NPRO+39380: STQ R17,(SP)
          SYS$SHDRIVER_NPRO+39384: LDQ R17,#XF8E0(R13)
          SYS$SHDRIVER_NPRO+39388: BIS R17,#X04,R17
          SYS$SHDRIVER_NPRO+3938C: BIS R31,R17,R16
          SYS$SHDRIVER_NPRO+39390: LDQ R17,(SP)
          SYS$SHDRIVER_NPRO+39394: ADDQ SP,#X08,SP
          SYS$SHDRIVER_NPRO+39398: BUGCHK
          SYS$SHDRIVER_NPRO+3939C: HALT
          SYS$SHDRIVER_NPRO+393A0: SUBQ SP,#X10,SP
          SYS$SHDRIVER_NPRO+393A4: STQ R16,#X0008(SP)
          SYS$SHDRIVER_NPRO+393A8: STQ R17,(SP)
          SYS$SHDRIVER_NPRO+393AC: LDQ R17,#XF8E0(R13)
          SYS$SHDRIVER_NPRO+393B0: BIS R17,#X04,R17
          SYS$SHDRIVER_NPRO+393B4: BIS R31,R17,R16
          SYS$SHDRIVER_NPRO+393B8: LDQ R17,(SP)
          SYS$SHDRIVER_NPRO+393BC: ADDQ SP,#X08,SP
          SYS$SHDRIVER_NPRO+393C0: BUGCHK
          SYS$SHDRIVER_NPRO+393C4: HALT
          SYS$SHDRIVER_NPRO+393C8: BIS R31,R31,R31
          SYS$SHDRIVER_NPRO+393CC: BIS R31,R31,R31
          SYS$SHDRIVER_NPRO+393D0: SUBQ SP,#X50,SP

      o The Volume Shadowing software which was shipped in OpenVMS
         Alpha and VAX V7.1 and the CLUSIO remedial kits, requires
         additional non-paged pool to improve synchronization.
         Customers should take this into account when they are tuning
         their systems, and be aware that Volume Shadowing is now more
         sensitive to resource problems with the possibility that
         systems may crash if non-paged pool is exhausted.

         Shadowing uses approximately 800 bytes of additional non-paged
         pool per concurrent IO to the virtual unit. This remedial kit
         includes code which avoids system crashes if a system
         exhausts non-paged pool.

         Please be aware that there are still cases under which
         non-paged pool exhaustion will result in a SHADDETINCON
         bugcheck. This modification reduces the probability of such
         crashes but does not completely eliminate them.

      o During internal testing, a system crash occurred which
         indicated that IOs were left outstanding in DUDRIVER after
         a virtual unit had been removed.

      o There was a missing index on a check for member valid in the
         BBR_READ_RECOVERY routine.

      o There was an "infinite" loop condition at SHCP$START_QUED, and
         the code has been modified so that the persistent thread will
         be "killed" if the VU it has spawned fails.

      o This remedial kit includes additional error logging capabilities
         to collect additional information when a virtual unit is made
         available.

         The new LOG_IT macro code has the following input parameters:

          o R0 - value of P4

          o R1 - value of P5

          o R2 - address of LW in SHAD containing P6

          o R3 - VU UCB

          o R5 - SHAD IRP address with:

           - CDRP$L_BCNT = P1
           - CDRP$L_MEDIA = P2
           - CDRP$L_PID = P3

         The implementation makes use of the following cells in the
         errorlog record.

          o EMB$W_SP_BOFF - set to %xBADE as TAG

          o EMB$W_SP_FUNC - reason code

          o EMB$L_SP_BCNT - LW for information

          o EMB$L_SP_MEDIA - LW for information

          o EMB$L_SP_RQPID - LW for information

          o EMB$Q_SP_IOSB - 2 LW for information

          o EMB$L_SP_CMDREF - LW for Information

      o A process may intermittently hang during dismount of a
         shadow-set while waiting for completion of the QIOW in
          the DO_IO routine.

      o KRNLSTAKNV halt during MOUNT/CLUSTER DSAx:
              Bugcheck Type: CPUSANITY, CPU sanity timer expired
              Node: AI84 (Clustered)
              CPU Type: AlphaServer 8400 Model EV56/440
              VMS Version: V6.2-1H3
              Current Process: PM2SKZ
              Current Image: DSA40:[ZENT410.][EXE]BUS.EXE
              Failing PC: FFFFFFFF 8001F8D0
              Failing PS: 18000000 00001604
              Module: SYSTEM_PRIMITIVES_MIN
              Offset: 0000B8D0
              Boot Time: 26-JUN-1997 08:34:37.00
              System Uptime: 1 00:46:34.07
              Crash/Primary CPU: 01/00
              Saved Processes: 26
              Pagesize: 8 KByte (8192 bytes)
              Physical Memory: 2048 MByte (262144 PFNs)
              Dumpfile Pagelets: 999974 blocks
              Dump Flags: writecomp,errlogcomp,dump_style
              EXE$GL_FLAGS: poolpging,init,bugdump,pgflfrag
              Stack Pointers:
              KSP = 00000000 7FF91C98  ESP = 00000000 7FF96000  SSP = 00000000 7FF9C100
              USP = 00000000 7EDE4030
              General Registers:
              R0  = 00000000 00000000  R1  = FFFFFFFF 814EA180  R2  = FFFFFFFF 81410000
              R3  = FFFFFFFF 9DE268F8  R4  = 00000000 0000012C  R5  = 00000000 7FF91D40
              R6  = 00000000 7FF445A0  R7  = 08000000 00000200  R8  = FFFFFFFF F7710250
              R9  = 00000000 00000030  R10 = 00000000 00000031  R11 = 00000000 00000001
              R12 = 00000000 00008001  R13 = FFFFFFFF 9DE268F8  R14 = FFFFFFFF 9DE25640
              R15 = FFFFFFFF 9DE04200  R16 = 00000000 00000774  R17 = 00000000 7FF91C38
              R18 = FFFFFFFF 9DE32CE0  R19 = FFFFFFFF 9DE04200  R20 = 00000000 00000000
              R21 = 00000000 272007F0  R22 = FFFFFFFF 9DE04200  R23 = 00000000 00000000
              R24 = FFFFFFFF 9DE04AC0  AI  = 00000000 00000000  RA  = FFFFFFFF 00000000
              PV  = FFFFFFFF FFFFFFFF  R28 = FFFFFFFF 8001F83C  FP  = 00000000 7FF91E10
              PC  = FFFFFFFF 8001F8D4  PS  = 18000000 00001604
              Failing Instruction:
              EXE$HWCLKINT_C+00510: BUGCHK

      o The system crashes when a second node attempts to boot a system disk
         shadow set with two members. The following SHADDETINCON bugcheck at
         SHDRIVER+12124 or SYS$SHDRIVER_NPRO+449B4 occurs:
          SHADDETINCON, SHADOWING detects inconsistent state

      o The mount of a shadow set fails. The failure report says that
         the set is already mounted or that there is a duplicate unit
         number.
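
    For tuning purposes, the non-paged pool note in the list above
    (approximately 800 bytes of additional pool per concurrent IO to
    the virtual unit) can be turned into a rough estimate. The Python
    sketch below is only that arithmetic; the function name and the
    sample IO counts are hypothetical, and the result is an
    approximation, not a sizing rule from the kit.

```python
BYTES_PER_CONCURRENT_IO = 800  # approximate figure quoted in the kit notes


def extra_nonpaged_pool(concurrent_ios: int) -> int:
    """Estimate the additional non-paged pool, in bytes, for a given IO load."""
    if concurrent_ios < 0:
        raise ValueError("concurrent_ios must not be negative")
    return concurrent_ios * BYTES_PER_CONCURRENT_IO


for ios in (50, 500, 5000):
    print(f"{ios:>5} concurrent IOs -> about {extra_nonpaged_pool(ios):,} bytes")
```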

    INSTALLATION NOTES:

    This kit requires a system reboot. Compaq strongly recommends that
    a reboot be performed immediately after kit installation to avoid
    system instability.

    If you have other nodes in your OpenVMS cluster, they must also be
    rebooted in order to make use of the new image(s). If it is not
    possible or convenient to reboot the entire cluster at this time, a
    rolling re-boot may be performed.

    Install this kit with the VMSINSTAL utility by logging into the
    SYSTEM account, and typing the following at the DCL prompt:

    SYS$UPDATE:VMSINSTAL VAXSHAD10_062 [location of the saveset]

    The saveset location may be a tape drive, CD, or a disk directory
    that contains the kit saveset.

    ---