From: system PRIVILEGED account (root nfsserver.support.compaq.com)
Date: Thu Jun 28 2001 - 02:31:08 CDT
*******************************************************************************
* *
* This is an update to an existing patch... *
* *
* Online links can be found at *
* http://ftp.support.compaq.com/patches/public/vms/axp/v7.2-1/dec-axpvms-vms721_sys-v1000--4.README
*******************************************************************************
TITLE: OpenVMS VMS721_SYS-V1000 Alpha V7.2-1 System ECO Summary
New Kit Date : 28-JUN-2001
Modification Date: Not Applicable
Modification Type: Updated Kit Supersedes VMS721_SYS-V0900
NOTE: An OpenVMS saveset or PCSI installation file is stored
on the Internet in a self-expanding compressed file.
For OpenVMS savesets, the name of the compressed saveset
file will be kit_name.a-dcx_vaxexe for OpenVMS VAX or
kit_name.a-dcx_axpexe for OpenVMS Alpha. Once the OpenVMS
saveset is copied to your system, expand the compressed
saveset by typing RUN kit_name.a-dcx_vaxexe or kit_name.a-dcx_axpexe.
For PCSI files, once the PCSI file is copied to your system,
rename the PCSI file to kitname-dcx_axpexe.pcsi, then it can
be expanded by typing RUN kitname-dcx_axpexe.pcsi. The resultant
file will be the PCSI installation file which can be used to install
the ECO.
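The naming rules in the note above can be sketched as two small helper
functions. This is only an illustration of the stated conventions: the
function names are hypothetical, and the lowercase kit names used in the
example calls are assumptions, not names confirmed by this summary.

```python
def compressed_saveset_name(kit_name: str, platform: str) -> str:
    """Name of the self-expanding saveset file for 'vax' or 'alpha'."""
    suffixes = {"vax": "dcx_vaxexe", "alpha": "dcx_axpexe"}
    return f"{kit_name}.a-{suffixes[platform]}"


def renamed_pcsi_name(kit_name: str) -> str:
    """Name the downloaded PCSI file is renamed to before it is RUN."""
    return f"{kit_name}-dcx_axpexe.pcsi"


if __name__ == "__main__":
    # Example kit names below are assumed for illustration only.
    print(compressed_saveset_name("vms721_sys-v1000", "alpha"))
    print(renamed_pcsi_name("dec-axpvms-vms721_sys-v1000--4"))
```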
Copyright (c) Compaq Computer Corporation 2000, 2001. All rights reserved.
OP/SYS: OpenVMS Alpha
COMPONENTS: [SYS$LDR]EXCEPTION.EXE
[SYS$LDR]EXCEPTION_MON.EXE
[SYS$LDR]LOCKING.EXE
[SYS$LDR]IO_ROUTINES.EXE
[SYS$LDR]IO_ROUTINES_MON.EXE
[SYS$LDR]MULTIPATH.EXE
[SYS$LDR]MULTIPATH_MON.EXE
[SYS$LDR]LOGICAL_NAMES.EXE
[SYS$LDR]PROCESS_MANAGEMENT.EXE
[SYS$LDR]PROCESS_MANAGEMENT_MON.EXE
[SYS$LDR]SECURITY.EXE
[SYS$LDR]SECURITY_MON.EXE
SOURCE: Compaq Computer Corporation
ECO INFORMATION:
ECO Kit Name: VMS721_SYS-V1000
DEC-AXPVMS-VMS721_SYS-V1000--4.PCSI
ECO Kits Superseded by This ECO Kit: VMS721_SYS-V0900
VMS721_SYS-V0800
VMS721_SYS-V0700
VMS721_SYS-V0600
VMS721_SYS-V0500
VMS721_SYS-V0400
VMS721_SYS-V0300
VMS721_SYS-V0200
VMS721_SYS-V0100
ECO Kit Approximate Size: 18,576 Blocks
Kit Applies To: OpenVMS Alpha V7.2-1
System/Cluster Reboot Necessary: Yes
Rolling Re-boot Supported: Yes
Installation Rating: INSTALL_1 - To be installed on all systems running
the listed version(s) of OpenVMS.
Kit Dependencies:
The following remedial kit(s) must be installed BEFORE
installation of this kit:
VMS721_PCSI-V0100
VMS721_UPDATE-V0200
In order to receive all the corrections listed in this
kit, the following remedial kits should also be installed:
None
ECO KIT SUMMARY:
An ECO kit exists for System components on OpenVMS Alpha V7.2-1.
This kit addresses the following problems:
PROBLEMS ADDRESSED IN VMS721_SYS-V1000 KIT
o The use of HSM on a Multipath device results in a system crash
at HSDRIVER+02AF8. This kit enables the use of Hierarchical
Storage Manager (HSM) on Multipath devices.
Images Affected: [SYS$LDR]IODEF.STB
[SYS$LDR]MULTIPATH.EXE
[SYS$LDR]MULTIPATH.STB
[SYS$LDR]MULTIPATH_MON.EXE
[SYS$LDR]MULTIPATH_MON.STB
o A system crashes with a DOUBLDEALO bugcheck at
EXE$DEALLOCATE_C+00108 in $BRKTHRU when trying to deallocate
P1 pool that has already been deallocated. See crash dump
summary information below:
Crash Dump Summary
------------------
Bugcheck Type: DOUBLDEALO, Double deallocation of memory
block
Current Process: CANDOUGF_1
Current Image: $1$DUA0:[SYS1.SYSCOMMON.][SYSEXE]MAIL.EXE
Failing PC: FFFFFFFF.80048A28 EXE$DEALLOCATE_C+00108
Failing PS: 20000000.00000200
Module: SYSTEM_PRIMITIVES_MIN
(Link Date/Time: 13-SEP-2000 06:34:17.45)
Offset: 00012A28
Images Affected: [SYS$LDR]IO_ROUTINES.EXE
[SYS$LDR]IO_ROUTINES.STB
[SYS$LDR]IO_ROUTINES_MON.EXE
[SYS$LDR]IO_ROUTINES_MON.STB
[SYS$LDR]SYS$CLUSTER.EXE
o If host-based volume shadowing is running when I/O transfers
to SCSI disks stop, and the disk is a member of a shadow set,
then a SHADDETINCON crash occurs.
Images Affected: [SYS$LDR]IO_ROUTINES.EXE
[SYS$LDR]IO_ROUTINES.STB
o Under a heavy I/O load, where the process's DIOCNT is at zero,
certain circumstances exist that allow DIOCNT to go from 0 to
a negative value. Once this has occurred, it may be possible
for an application to hang a process or system with an
unmanageable RWAST condition. Or, it may be possible for an
application to absorb all of nonpaged pool with IRPs. This
can cause a system to crash with an INSF_NONPAGED,
'Insufficient nonpaged pool' bugcheck in SYS$SHDRIVER. See
the crash dump summary below:
Crashdump Summary Information:
------------------------------
Bugcheck Type: INSF_NONPAGED, Insufficient nonpaged pool
Current Process: NMD_CQS4
Current Image: $2$DUA100:[MARS.V32.][EXE]CI_CQS.EXE;3
Failing PC: FFFFFFFF.92F8A204 SYS$SHDRIVER+70204
Failing PS: 30000000.00000804
Module: SYS$SHDRIVER
(Link Date/Time: 24-JAN-2000 9:28:00.85)
Offset: 00070204
Images Affected: [SYS$LDR]IO_ROUTINES.EXE
[SYS$LDR]IO_ROUTINES.STB
[SYS$LDR]IO_ROUTINES_MON.EXE
[SYS$LDR]IO_ROUTINES_MON.STB
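The DIOCNT problem above belongs to a general class of bug: a quota
counter that is only tested for an exact zero stops limiting anything
once it has slipped below zero. The toy model below illustrates that
class of bug only. It is not the OpenVMS code; the admission tests, the
ToyProcess type, and all names in it are assumptions made for the
sketch.

```python
from dataclasses import dataclass


@dataclass
class ToyProcess:
    diocnt: int           # remaining direct I/O quota (toy model)
    queued_irps: int = 0  # requests admitted so far


def issue_io_exact_zero(proc: ToyProcess) -> bool:
    # Hypothetical flawed test: only an exact zero blocks the request.
    if proc.diocnt == 0:
        return False
    proc.diocnt -= 1
    proc.queued_irps += 1
    return True


def issue_io_guarded(proc: ToyProcess) -> bool:
    # Treats zero and any negative count as "quota exhausted".
    if proc.diocnt <= 0:
        return False
    proc.diocnt -= 1
    proc.queued_irps += 1
    return True


def flood(issue, proc: ToyProcess, attempts: int) -> int:
    """Try to issue `attempts` requests; return how many were admitted."""
    return sum(1 for _ in range(attempts) if issue(proc))
```

Starting from a count of -1, flood(issue_io_exact_zero, ...) admits
every request, which mirrors the unbounded IRP growth described above,
while flood(issue_io_guarded, ...) admits none.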
o A system can crash during boot with a SHADDETINCON bugcheck at
SYS$SHDRIVER+7580C in module SHD_THREADS, routine
SHTD$ENQ_LOCK_BLOCK. See crash dump summary below:
Crashdump Summary Information:
------------------------------
Bugcheck Type: SHADDETINCON, SHADOWING detects inconsistent state
Current Process: NULL
Current Image: <not available>
Failing PC: FFFFFFFF.8047F80C SYS$SHDRIVER+7580C
Failing PS: 04000000.00000804
Module: SYS$SHDRIVER
(Link Date/Time: 24-OCT-2000 15:06:45.56)
Offset: 0007580C
Images Affected: [SYS$LDR]MULTIPATH.EXE
[SYS$LDR]MULTIPATH.STB
[SYS$LDR]MULTIPATH_MON.EXE
[SYS$LDR]MULTIPATH_MON.STB
[SYS$LDR]IODEF.STB
75-45-2244, 75-52-238.
o Pool corruption, caused by double deallocation of an IRP, can
occur when the current path to a $1$GGAn: or $a$GKAn: device
is a secondary path and a polling I/O arrives while there is
an active IRP on that path.
Images Affected: [SYS$LDR]MULTIPATH.EXE
[SYS$LDR]MULTIPATH.STB
[SYS$LDR]MULTIPATH_MON.EXE
[SYS$LDR]MULTIPATH_MON.STB
[SYS$LDR]IODEF.STB
o When mounting FibreChannel shadowsets, a system can crash with
an INVEXCEPTN, 'Exception while above ASTDEL' bugcheck at
MULTIPATH_MON+3B2C, routine MPDEV$MAP_STATUS_SHDSET. See the
crash dump summary below:
Crashdump Summary Information:
------------------------------
Bugcheck Type: INVEXCEPTN, Exception while above ASTDEL
Current Process: NULL
Current Image: <not available>
Failing PC: FFFFFFFF.8038FB2C
MPDEV$MAP_STATUS_SHDSET_C+0005C
Failing PS: 10000000.00000804
Module: MULTIPATH_MON
(Link Date/Time: 9-FEB-2001 23:41:29.09)
Offset: 00003B2C
Images Affected: [SYS$LDR]MULTIPATH.EXE
[SYS$LDR]MULTIPATH.STB
o The system can crash with an SSRVEXCEPT bugcheck at
LOCKING+019CC on an LDQ_U R11,(R26) instruction while
processing a resource domain. See crash dump summary below:
Crashdump Summary Information:
------------------------------
Bugcheck Type: SSRVEXCEPT, Unexpected system service
exception
Current Process: BATCH_3084
Current Image: DSA509:[SSEXE.DP]SPCUSERMAINT.EXE;44
Failing PC: FFFFFFFF.801619CC LOCKING+019CC
Failing PS: 38000000.00000203
Module: LOCKING
Link Date/Time: 29-MAR-2000 00:58:34.67
Offset: 000019CC
Images Affected: [SYS$LDR]LOCKING.EXE
[SYS$LDR]LOCKING.STB
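When comparing a crash against the summaries in this document, it can
help to recover the load address of the failing module from the
"Failing PC" and "Offset" fields. The sketch below assumes that Offset
is measured from the module's load address, as the LOCKING+019CC and
Offset 000019CC pair above suggests. The helper names are hypothetical.

```python
def parse_va(text: str) -> int:
    """Parse a 64-bit address written as 'FFFFFFFF.801619CC'."""
    high, low = text.split(".")
    return (int(high, 16) << 32) | int(low, 16)


def format_va(value: int) -> str:
    """Format a 64-bit address back into the HHHHHHHH.LLLLLLLL form."""
    return f"{value >> 32:08X}.{value & 0xFFFFFFFF:08X}"


def module_base(failing_pc: str, offset: str) -> str:
    """Assumed module load address: failing PC minus offset in module."""
    return format_va(parse_va(failing_pc) - int(offset, 16))


if __name__ == "__main__":
    # Values taken from the LOCKING crash dump summary above.
    print(module_base("FFFFFFFF.801619CC", "000019CC"))
```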
o Processes hang in Record Management Services (RMS) during $GET
and no apparent locking conflict can be detected. There are
cases where the status does not get updated after granting a
byte range lock.
Images Affected: [SYS$LDR]LOCKING.EXE
[SYS$LDR]LOCKING.STB
o With multithreaded processes, systems have been seen to hang.
Specifically, an AST critical to the file system causes file
system I/O operations to hang for all processes. It would be
appropriate to suspect this problem is occurring on a system
with otherwise unexplained system or application hangs, if the
system has multiple CPUs and any multi-kernel-threaded
applications. This includes essentially any Java
applications.
Images Affected: [SYS$LDR]PROCESS_MANAGEMENT.EXE
[SYS$LDR]PROCESS_MANAGEMENT.STB
[SYS$LDR]PROCESS_MANAGEMENT_MON.EXE
[SYS$LDR]PROCESS_MANAGEMENT_MON.STB
o An ACCVIO occurs in MESSAGE_ROUTINES near "ALTNUMTIM." There
are several ways in which this problem may occur:
- A process ACCVIOs near the symbol EXE$ALTNUMTIM.
- ORA-482 crashes on Oracle 8.1.6 OPS.
- An Oracle process exits with the status code 0C, which is
an ACCVIO. Since Oracle processes run as detached
processes, this status code would appear in the accounting
log.
Images Affected: [SYS$LDR]MESSAGE_ROUTINES.EXE
[SYS$LDR]MESSAGE_ROUTINES.STB
o The GETTIMEOFDAY() CRTL function returns an error in ORACLE
Parallel Server Version 8.1.6. This can occur in several
ways:
- The GETTIMEOFDAY() function returns status code 103DFE0.
- The VMS user mode system services return unexpected
status codes, including C signal status code, SS$_BREAK,
SS$_IMGDMP or SS$_DEBUG.
- An ORA-7211 Oracle 8.1.6 OPS crash could occur.
- A severe performance slowdown on Oracle 8.1.6 or 8.1.7 OPS
could occur.
- Oracle 8.1.6 or 8.1.7 OPS might hang.
Images Affected: [SYS$LDR]EXCEPTION.EXE
[SYS$LDR]EXCEPTION.STB
[SYS$LDR]EXCEPTION_MON.EXE
[SYS$LDR]EXCEPTION_MON.STB
o A process exits with an error status, but without a process
dump. The image dump flag is set for the process. There are
several ways this can be seen:
1. Processes don't dump when they exit while executing an
exception handler. The process must have the image dump
flag set.
2. Oracle8 OPS background processes do not dump when they hit
an error. The processes must have the image dump flag
set.
Images Affected: [SYS$LDR]EXCEPTION.EXE
[SYS$LDR]EXCEPTION.STB
[SYS$LDR]EXCEPTION_MON.EXE
[SYS$LDR]EXCEPTION_MON.STB
o The kill() and sys$sigprc() functions return the error
SS$_SUSPENDED when the process is neither suspended nor
waiting on a resource. The process is simply waiting on a
mutex or is in the transient RWSCS state. This can lead to
ORA-482 crashes on Oracle 8.1.6 or 8.1.7 OPS.
Images Affected: [SYS$LDR]PROCESS_MANAGEMENT.EXE
[SYS$LDR]PROCESS_MANAGEMENT.STB
[SYS$LDR]PROCESS_MANAGEMENT_MON.EXE
[SYS$LDR]PROCESS_MANAGEMENT_MON.STB
o The user stack can become corrupt through repeated issuances
of the DEBUG command. The following sequence of commands
illustrates the problem:
$LINK/DEBUG FOO (where FOO simply executes an
infinite loop)
$RUN FOO
DBG> go
DBG> CTRL/Y
$DEBUG
DBG> go
DBG> CTRL/Y
$DEBUG
%DEBUG-I-TRUNC64, address 0000000200000000 being truncated in
DBGKREGISTERS\DBG$GET_PD_FROM_FP
%DEBUG-I-TRUNC64, address 0000000200000000 being truncated in
DBGKREGISTERS\DBG$GET_PD_FROM_FP
DBG> g
%DEBUG-I-BADSTACKPATCH1, Corrupt stack detected...attempting
patch STQ R27,(SP)
%DEBUG-W-NORESUME, unable to resume execution, stack or PC
corrupted in %PROCESS_NUMBER 1
DBG>
The user stack is now corrupt.
Images Affected: [SYS$LDR]EXCEPTION.EXE
[SYS$LDR]EXCEPTION.STB
[SYS$LDR]EXCEPTION_MON.EXE
[SYS$LDR]EXCEPTION_MON.STB
o An Oracle Parallel Server can crash with an ORA-482 error.
The PC is near a routine's prologue code and the FP has been
set. R27 is stored on the stack after the FP is set and the
process exits with an unhandled exception.
Images Affected: [SYS$LDR]EXCEPTION.EXE
[SYS$LDR]EXCEPTION.STB
[SYS$LDR]EXCEPTION_MON.EXE
[SYS$LDR]EXCEPTION_MON.STB
o A system can crash with an INCONSTATE bugcheck at
SYS$VCC+0822C.
Crashdump Summary Information:
------------------------------
Bugcheck Type: INCONSTATE, Inconsistent I/O data base
Current Process: NULL
Current Image: <not available>
Failing PC: FFFFFFFF.801E822C SYS$VCC+0822C
Failing PS: 08000000.00000804
Module: SYS$VCC
(Link Date/Time: 18-MAY-2000 00:48:08.10)
Offset: 0000822C
This is not one of the conditions that the CVCB_CHKLK macro
recognizes as temporary, so it bugchecks rather than retries
the lock conversion. The restriction that fails in this case
is the CVCB lock, which is already in PR mode.
Images Affected: [SYS$LDR]SYS$VCC.EXE
[SYS$LDR]SYS$VCC.STB
[SYS$LDR]SYS$VCC_MON.EXE
[SYS$LDR]SYS$VCC_MON.STB
o Added the Adaptec Line of SCSI adapters.
Images Affected: [SYS.OBJ]IO_ROUTINES.EXE
[SYS.OBJ]IO_ROUTINES.STB
[SYS.OBJ]IO_ROUTINES_MON.EXE
[SYS.OBJ]IO_ROUTINES_MON.STB
o The system hangs with the AUDIT_SERVER process in RWMBX when
Object_Server tries to write to its own mailbox.
Images Affected: [SYS$LDR]SECURITY.EXE
[SYS$LDR]SECURITY.STB
[SYS$LDR]SECURITY_MON.EXE
[SYS$LDR]SECURITY_MON.STB
[SYS$LDR]IO_ROUTINES.EXE
[SYS$LDR]IO_ROUTINES.STB
[SYS$LDR]IO_ROUTINES_MON.EXE
[SYS$LDR]IO_ROUTINES_MON.STB
o Shadow sets with FibreChannel members fail to mount with
MULTIPATH, resulting in MOUNTVERIFY error messages and member
removals.
Images Affected: [SYS$LDR]MULTIPATH.EXE
o The following fields have been added for the virtual unit and
shadow set members in the output of an SDA SHO DEVICE DSA
command:
o Site Value
o Timeout Value
Each member will now display Read Cost, Site, SM Timeout. For
example:
OLD OUTPUT:
-----------
$ analyze/SYS
SDA> sho dev dsa64
DSA64 Generic_DK UCB: 817CD500
.
.
.
I/O data structures
-------------------
----- Shadow Descriptor Block (SHAD) 817DA080 -----
Virtual Unit status: 0001 normal
Members 3 Act user IRPs 0 VU UCB 817CD500
Devices 3 SCB LBN 010F4627 Master FL 817DA3E4
Fcpy Targets 0 Generation Num 2C34A8FF Restart FL 817DA3EC
Mcpy Targets 0 009FCC39
Last Read Index 0 Virtual Unit Id 00000000
Master Index 1 12610040
----- SHAD Device summary for DSA64 -----
Device $1$DGA110
Index 0 Status 000000A0 src,valid
UCB 816156C0 VCB 817DABC0 Unit Id. 10E1006E 00000001
Device $1$DGA210
Index 1 Status 000000A0 src,valid
UCB 81615BC0 VCB 817DB540 Unit Id. 10E100D2 00000001
Device $64$DKA301
Index 2 Status 000000A0 src,valid
UCB 81624540 VCB 8166E880 Unit Id. 1161012D 00000040
NEW OUTPUT
-----------
$ analyze/SYS
SDA> sho dev dsa64
DSA64 Generic_DK UCB: 81540000
.
.
.
I/O data structures
-------------------
--- Shadowing Descriptor Block (SHAD) 816EC440 ---
Virtual Unit SCB Status: 0001 normal
Total Devices 3 VU_UCB 81540000
Source Members 3 SCB LBN 010F4627
Act Copy Target 0 Generation 009FCC39
Act Merge Target 0 Number 2C34A8FF
Last Read Index 1 VU Site Value 00000000
Master Mbr Index 1 VU Timeout Value 3600
Device $1$DGA110
Index 0 Status 000000A0 src,valid
Read Cost 0000002A Site 00000000 SM Timeout 120
UCB 8153D440 VCB 8161F880
Device $1$DGA210 ... Master Member
Index 1 Status 000000A0 src,valid
Read Cost 0000002A Site 00000000 SM Timeout 120
UCB 81543800 VCB 8187CA40
Device $64$DKA301
Index 2 Status 000000A0 src,valid
Read Cost 0000002A Site 00000000 SM Timeout 120
UCB 81503700 VCB 8179D800
Images Affected: [SYSLIB]SDA$SHARE.EXE
PROBLEMS ADDRESSED IN VMS721_SYS-V0900 KIT
o Revert back to the original OpenVMS soft affinity algorithm
because the new algorithm will not function properly on the
SCC hardware.
Images Affected: [SYS$LDR]PROCESS_MANAGEMENT.EXE
[SYS$LDR]PROCESS_MANAGEMENT_MON.EXE
[SYS$LDR]SYS$BASE_IMAGE.EXE
[SYSEXE]SYSBOOT.EXE
[SYSEXE]SYSGEN.EXE
[SYSEXE]SYSMAN.EXE
[SYSEXE]SMISERVER.EXE
o A system could crash with a CPUSPINWAIT bugcheck waiting for
the POOL spinlock. This occurs when the primary CPU, which
holds the Pool spinlock, is executing the routine
EXE$TRIM_LISTS and a second CPU attempts to get the Pool
spinlock; the second CPU causes the crash when it times out.
Further, SHOW MEM/POOL/FULL will indicate that there are
thousands of blocks in the variable list.
This crash will happen only if either the SYSGEN parameter
NPAG_GENTLE or NPAG_AGGRESSIVE is set to a number smaller than
100 (the default case).
Images Affected: [SYS$LDR]SYSTEM_PRIMITIVES.EXE
[SYS$LDR]SYSTEM_PRIMITIVES_MON.EXE
o The effect of a SET SECURITY/OBJECT=DEVICE command can be
propagated to the wrong device(s) in a cluster for MK, DK, and
DG devices.
Images Affected: [SYSLIB]IOGEN$SHARE.EXE
[SYS$LDR]SYS$BASE_IMAGE.EXE
[SYS$LDR]IO_ROUTINES.EXE
[SYS$LDR]IO_ROUTINES_MON.EXE
o The system crashes with a DELCONPFN bugcheck in
MMG$DEL_CONTENTS_PFN8. The crash occurs during process
rundown. See highlights of the crash summary below:
Crashdump Summary Information:
------------------------------
Bugcheck Type: DELCONPFN, Fatal error in delete contents of
PFN
Current Process:
Current Image: <not available>
Failing PC: FFFFFFFF.80067764
MMG$DEL_CONTENTS_PFN_C+00204
Failing PS: 38000000.00000800
Module: SYSTEM_PRIMITIVES_MIN (Link Date/Time:
28-MAY-1999 23:29:32.00)
Offset: 00031764
Images Affected: [SYS$LDR]SYS$VM.EXE
o A system can crash with a KRNLSTAKNV bugcheck during heavy
I/O activity, such as BACKUP. Forcing the stack out shows
interaction between SYS$DKDRIVER and IO_ROUTINES filling up
the KPB stack, usually with some interrupt topping off the
stack.
Images Affected: [SYS$LDR]IO_ROUTINES.EXE
[SYS$LDR]IO_ROUTINES_MON.EXE
o The AST code for image rundown incorrectly deletes sKASTs when
searching the AST queues for AST addresses in the P0 and P2
space. This causes the system to hang in a LEF state because a
request sKAST is dismissed without execution.
Images Affected: [SYS$LDR]IMAGE_MANAGEMENT.EXE
o Disks with page or swap files installed do not get dismounted
during a system shutdown.
Images Affected: [SYSEXE]OPCCRASH.EXE
[SYS$LDR]IO_ROUTINES.EXE
[SYS$LDR]IO_ROUTINES_MON.EXE
[SYS$LDR]SYS$VM.EXE
o On Single CPU systems, the CPU runs two processes in the
current state, which results in an INCON_SCHED, 'Inconsistent
scheduling' bugcheck. See highlights from the dump summary
below:
Crashdump Summary Information:
Bugcheck Type: INVEXCEPTN, Exception while above ASTDEL
Current Process: BROKER
Current Image: DSA360:[BROKER_U.AXP.][P]BROKER_EDITOR_U.EXE;85
Failing PC: FFFFFFFF.800C3B98 SCH$QEND_C+00038
Failing PS: 10000000.00000704
Module: PROCESS_MANAGEMENT (Link Date/Time:
29-DEC-1999 04:09:20.9
Offset: 00007B98
Images Affected: [SYS$LDR]SYS$VM.EXE
o An INVEXCEPTN bugcheck occurs at SCH$QEND_C+38 while getting
the address of the process alignment fault reporting
information, CTL$GL_REPORT_USER_FAULTS. See highlights from
the Crashdump summary information below:
Crashdump Summary Information:
Bugcheck Type: INVEXCEPTN, Exception while above ASTDEL
Current Process: BROKER
Current Image: DSA360:[BROKER_U.AXP.][P]BROKER_EDITOR_U.EXE;85
Failing PC: FFFFFFFF.800C3B98 SCH$QEND_C+00038
Failing PS: 10000000.00000704
Module: PROCESS_MANAGEMENT (Link Date/Time: 29-DEC-1999
Images Affected: [SYS$LDR]SYS$VM.EXE
o A multi-threaded process hangs with all threads suspended
except one that is spinning in a loop using CPU time.
Images Affected: [SYSLIB]SYS$SSISHR.EXE
Problems Addressed in VMS721_SYS-V0800:
o In the VMS721_SYS-V0700 kit, the SDA$SHARE image was placed in
the [SYSEXE] directory. It should be placed in the [SYSLIB]
directory.
Images Affected: [SYSLIB]SDA$SHARE.EXE
Problems Addressed in VMS721_SYS-V0700:
o The system can crash with an INCONSTATE bugcheck in CACHE$MOUNT.
This occurs when a process, usually RAID$SERVER, is attempting to
mount a disk, usually a member of a Raid set. It appears as if
the volume is being mounted twice and the INCONSTATE bugcheck
occurs.
Images Affected: [SYS$LDR]SYS$VCC.EXE
o An INCONSTATE bugcheck can occur during a RAID unbind operation.
Images Affected: [SYS$LDR]SYS$VCC.EXE
o A BLKASTCNT crash occurred with Pathworks enqueuing many
locks, all with blocking ASTs (asynchronous system traps).
The crash occurred when one of the locks was dequeued.
Images Affected: [SYS$LDR]LOCKING.EXE
o The code to verify that a page can be deleted from a process'
virtual address space was too restrictive. If a page had an
elevated reference count, and the process had direct I/O
outstanding, VMS would not allow deletion of the page. This
could occur in the following instances:
- If the reference count is only 1 (2 for an active global
page).
- If the page is a buffer object page.
- If the page does not belong to a buffer object for the
process.
- If the page is owned by a privileged mode (exec or kernel)
and can only be accessed from privileged mode.
RMS is now using "system buffer objects" for files with global
buffers. This causes RMS to hang when it attempts to close
such a file if the process has direct I/O outstanding. Some
processes, such as MULTINET, can have direct I/Os that remain
outstanding for a long time.
This ECO enables the deletion of a page from a process'
virtual address space even though the page reference count is
elevated.
Images Affected:
- [SYS$LDR]SYS$VM.EXE
- [SYS$LDR]SYS$VM.STB
o Restructure $GETJPI inter-process PSB references.
+ A "$ SHOW SYSTEM/FULL" command displays the UIC of all
processes (except interactive processes) as [0,0]. $GETJPI
shows the correct UICs, except for the swapper.
+ A system can crash with a SSRVEXCEPT bugcheck in module
PROCESS_MANAGEMENT_MON (SYSTEM_CHECK=1) at offset 0000F4B4.
Images Affected:
- [SYS$LDR]PROCESS_MANAGEMENT.EXE
- [SYS$LDR]PROCESS_MANAGEMENT_MON.EXE
Problems Addressed in VMS721_SYS-V0600:
o A system can crash with a KRNLSTAKNV, 'Kernel stack not
valid', error.
Images Affected:
- [SYS$LDR]PROCESS_MANAGEMENT.EXE
- [SYS$LDR]PROCESS_MANAGEMENT_MON.EXE
o Unnecessary and unwanted path switches can occur on multipath
devices. Under certain circumstances, if a user executes a
manual path switch of one member of a shadow set, the
requested path switch takes place. However, the other
member(s) of the shadow set switch paths as well. Further, if
the user attempts to switch the other member(s) back, the
other members will switch, but the originally switched member
will then switch back to the unwanted path.
Another symptom of this problem is that a transient error
condition on a multipath device can cause a path switch, even
though the current path is still valid.
This problem can occur if a multipath disk device is
simultaneously online, i.e., connected, on more than one path.
This configuration is created:
+ If two Fibre Channel cables are attached to the two host
FibreChannel ports on an HSG80 controller.
+ If two or more FibreChannel host bus adapters on the same
OpenVMS host system connect to the same fabric, i.e., the
same FibreChannel switch or into a set of cascaded
switches.
+ If two parallel SCSI buses are connected to the two host
ports on an HSZ80 controller.
Images Affected: [SYS$LDR]MULTIPATH_MON.EXE
o A cross process $GETJPI request for security profile (persona)
information, which includes network privileges and rights, can
lead to a SSRVEXCEPT system crash. See crashdump summary
below:
========================CRASH DUMP INFORMATION=========================
XENON1>
**** OpenVMS (TM) Alpha Operating System X6ZG-FT1 - BUGCHECK ****
** Bugcheck code = 000003C4: SSRVEXCEPT, Unexpected system service
exception
** Crash CPU: 00 Primary CPU: 00 Active CPUs: 00000003
** Current Process = NETACP
** Current PSB ID = 00000001
** Image Name = $2$DKA100:[SYS0.SYSCOMMON.][SYSEXE]NETACP.EXE;1
**** Starting compressed selective memory dump at 20-MAR-2000 15:39...
......................................................................
.....................
...Complete ****
halted CPU 0
halt code = 5
HALT instruction executed
PC = ffffffff800acae4
Crashdump Summary Information:
------------------------------
Crash Time: 20-MAR-2000 15:39:49.18
Bugcheck Type: SSRVEXCEPT, Unexpected system service exception
Node: XENON1 (Cluster)
CPU Type: COMPAQ AlphaServer DS20E 500 MHz
VMS Version: X6ZG-FT1
Current Process: NETACP
Current Image: $2$DKA100:[SYS0.SYSCOMMON.][SYSEXE]NETACP.EXE;1
Failing PC: FFFFFFFF.8011657C EXE_STD$CHECK_IMAGE_NAME_C+0033C
Failing PS: 00000000.00000000
Module: PROCESS_MANAGEMENT_MON (Link Date/Time: 13-MAR-2000
13:54:01.61)
Offset: 0001057C
Boot Time: 17-MAR-2000 12:28:35.00
System Uptime: 3 03:11:14.18
Crash/Primary CPU: 00/00
System/CPU Type: 2208
Saved Processes: 29
Pagesize: 8 KByte (8192 bytes)
Physical Memory: 512 MByte (65536 PFNs, contiguous memory)
Dumpfile Pagelets: 105419 blocks
Dump Flags: olddump,writecomp,errlogcomp,dump_style
Dump Type: compressed,selective,shared_mem
EXE$GL_FLAGS: poolpging,init,bugdump
Paging Files: 1 Pagefile and 0 Swapfiles installed
Images Affected:
- [SYS$LDR]PROCESS_MANAGEMENT.EXE
- [SYS$LDR]PROCESS_MANAGEMENT_MON.EXE
o A REPLY/USER= and/or SUBMIT/NOTIFY command crashes the system
with a SSRVEXCEPT in the cluster system process (CSP). See
the crashdump summary below:
Crashdump Summary Information:
------------------------------
Crash Time: 17-APR-2000 11:15:38.40
Bugcheck Type: SSRVEXCEPT, Unexpected system service exception
Node: COBRA3 (Cluster)
CPU Type: DEC 4000 Model 620
VMS Version: X706-FT1
Current Process: CLUSTER_SERVER
Current Image: DSA2:[SYS0.SYSCOMMON.][SYSEXE]CSP.EXE;2
Failing PC: FFFFFFFF.80137920 CHECKITEM_C+00AC0
Failing PS: 0C000000.00000200
Module: PROCESS_MANAGEMENT_MON (Link Date/Time: 9-APR-2000
04:37:48.31)
Offset: 00037920
********************************************
Images Affected:
- [SYS$LDR]IO_ROUTINES.EXE
- [SYS$LDR]IO_ROUTINES_MON.EXE
o A non-privileged user can access jobs in a batch queue,
regardless of the queue protections. See the comparative
examples below:
$ show queue/full/all unhf_sys$batch ! from privileged account
Batch queue UNHF_SYS$BATCH, idle, on UNHF::
/BASE_PRIORITY=3 /CPUMAXIMUM=00:30:00 /JOB_LIMIT=3 /OWNER=[SYSTEM]
/PROTECTION=(S:M,O:D,G,W:RS) /WSEXTENT=32768 /WSQUOTA=16384
(IDENTIFIER=[SIS_DEVEL,BANNER_SCT],ACCESS=READ+SUBMIT+MANAGE)
Entry Jobname Username Status
----- ------- -------- ------
2719 DUMMY B_JOHNSTONE Holding
Submitted 1-APR-2000 09:16:27.00 /KEEP
/LOG=$1$DUA233:[B_JOHNSTONE].LOG; /NOTIFY /NOPRINT /PRIORITY=100
/RESTART=UNHF_SYS$BATCH
File: _$1$DUA321:[B_JOHNSTONE.COM]DUMMY.COM;10
3182 DUMMY B_JOHNSTONE Holding
Submitted 1-APR-2000 14:15:28.41 /KEEP
/LOG=$1$DUA233:[B_JOHNSTONE].LOG; /NOPRINT /PRIORITY=100
/RESTART=UNHF_SYS$BATCH
File: _$1$DUA321:[B_JOHNSTONE.COM]DUMMY.COM;10
$ show queue/full/all unhf_sys$batch ! from non-privileged account
Batch queue UNHF_SYS$BATCH, idle, on UNHF::
/BASE_PRIORITY=3 /CPUMAXIMUM=00:30:00 /JOB_LIMIT=3 /OWNER=[SYSTEM]
/PROTECTION=(S:M,O:D,G,W:RS) /WSEXTENT=32768 /WSQUOTA=16384
(IDENTIFIER=[SIS_DEVEL,BANNER_SCT],ACCESS=READ+SUBMIT+MANAGE)
Entry Jobname Username Status
----- ------- -------- ------
2719 no privilege Holding
3182 DUMMY B_JOHNSTONE Holding
Submitted 1-APR-2000 14:15:28.41 /KEEP
/LOG=$1$DUA233:[B_JOHNSTONE].LOG; /NOPRINT /PRIORITY=100
/RESTART=UNHF_SYS$BATCH
File: _$1$DUA321:[B_JOHNSTONE.COM]DUMMY.COM;10
$
In this example, the user can see entry 3182, as well as security
information, but cannot see entry 2719.
This also generates the following security alarm:
%%%%%%%%%%% OPCOM 1-APR-2000 15:38:36.66 %%%%%%%%%%% (from node
UNHA at 1-APR-2000 15:38:36.67)
Message from user AUDIT$SERVER on UNHA
Security alarm (SECURITY) on UNHA, system id: 1028
Auditable event: Object access
Event time: 1-APR-2000 15:38:36.65
PID: 2040C464
Source PID: 21012416
Username: R_KENNEY$
Process owner: [R_KENNEY$]
Object class name: QUEUE
Object name: UNHF_SYS$BATCH
Object owner: [0,0]
Object protection: SYSTEM:M, OWNER:D, GROUP:, WORLD:RS
Access requested: READ
Status: %SYSTEM-F-NOPRIV, insufficient privilege or
object protection violation
Images Affected:
- [SYS$LDR]SECURITY.EXE
- [SYS$LDR]SECURITY_MON.EXE
o A batch process can abort with SS$_IVCHNLSEC during image
activation. The batch process aborts with the following
error:
%RDB-E-UNAVAILABLE, Oracle Rdb is not available on your system
-RDB-I-TEXT, Error activating image DSA0:[SYS1.SYSCOMMON.][SYSLIB]
RDMPRV.EXE, Invalid channel for create and map section
Images Affected: [SYS$LDR]SYS$VM.EXE
o A problem with the $TRNLNM code path for INTERLOCKED
translations can cause the service to exit without releasing
the logical name mutex. If the $TRNLNM request or any
subsequent kernel mode system service request made by that
process exits with an error status, the system will crash with
a MTXCNTNZ bugcheck.
If no kernel mode system service request made by that process
exits with an error status, the system will eventually hang,
with some processes in MUTEX wait trying to acquire the
logical name mutex. If some of those processes have already
acquired other mutexes, such as the I/O database mutex and
GSD mutex, there may be other processes in MUTEX wait trying
to acquire those mutexes.
The $TRNLNM bug is exercised by a fairly unusual combination
of circumstances and is more likely to be seen on an SMP system.
Images Affected: [SYS$LDR]LOGICAL_NAMES.EXE
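The failure mode described above, a code path that returns while still
holding a mutex, can be illustrated in a few lines. The sketch below is
not the $TRNLNM implementation: the lock object, the table, and both
function names are stand-ins invented for the example. It contrasts an
early exit that leaks the lock with a form that releases it on every
exit path.

```python
import threading

# Stand-in for the logical name mutex (an assumption for this sketch).
logical_name_mutex = threading.Lock()


def translate_leaky(table: dict, name: str) -> str:
    """Error path exits without releasing the mutex."""
    logical_name_mutex.acquire()
    if name not in table:
        raise KeyError(name)  # early exit: the mutex is still held
    value = table[name]
    logical_name_mutex.release()
    return value


def translate_safe(table: dict, name: str) -> str:
    """The 'with' block releases the mutex on every exit path."""
    with logical_name_mutex:
        return table[name]  # a KeyError here still releases the mutex
```

After a failed translate_leaky call, any later caller that tries to
acquire the lock waits indefinitely, which is analogous to the
processes described above as stuck in MUTEX wait.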
o If a dump is being written to a DOSD disk and a recursive
bugcheck occurs during the processing of a recursive bugcheck,
the dump_dev variable is changed after being verified. The
change redirected the dump write from the correct DOSD disk to
the system disk.
(OUTPUT FROM CONSOLE)
**** MASTER MEMBER UNIT NUMBER OF SYSTEM DISK SHADOW SET IS 502
**** SEARCHING DEVICES LISTED IN DUMP_DEV FOR A VALID DUMP FILE
CHECKING ENTRY #01 IN DUMP_DEV...
%%%% MSCP 1 2 0 13 219 EF00 6601095, DUD219.13.0.2.1
...DUMP_DEV ENTRY #01 IS A VALID DUMP DEVICE
%%%% MSCP 0 2 0 10 502 EF00 6601095 DU 4200100AF138 HSL10B, DUA502.10.0.2.0
**** ACCESSING SYSTEM DISK VIA ORIGINAL BOOT PATH
Note the change in the mass storage control protocol (MSCP)
unit.
Images Affected:
- [SYSEXE]APB.EXE
- [SYSEXE]DEBUG_APB.EXE
- [SYS$LDR]EXCEPTION.EXE
- [SYS$LDR]EXCEPTION_MON.EXE
o Make system buffer objects available for Record Management
Services (RMS) locking and make MAXBOBxxx parameters obsolete.
Images Affected:
- [SYS$LDR]SYS$VM.EXE
- [SYS$LDR]SYS$BASE_IMAGE.EXE
o Executing POSIX on OpenVMS V7.2 or later will crash the
system.
Images Affected:
- [SYS$LDR]PROCESS_MANAGEMENT.EXE
- [SYS$LDR]PROCESS_MANAGEMENT_MON.EXE
o The following remote signal handling problems are fixed:
+ A process exits with a supervisor mode ACCVIO at PC =
EXE$REFLECT_C+A30.
+ A program can exit unexpectedly.
+ An ACCVIO occurs at EXE$POWERAST_C+F8 when a program is
calling kill() or $sigprc() to signal another process.
Images Affected:
- [SYS$LDR]EXCEPTION.EXE
- [SYS$LDR]EXCEPTION_MON.EXE
- [SYS$LDR]PROCESS_MANAGEMENT.EXE
- [SYS$LDR]PROCESS_MANAGEMENT_MON.EXE
- [SYS$LDR]IMAGE_MANAGEMENT.EXE
o The incorrect value is calculated in the IEEE handler's e_mult
and e_divt routines. See examples of the failing test results
below:
******** Test fp_mul_s ********
d1 = 2.350989e-38 (0xffffff)
d2 = 5.000000e-01 (0x3f000000)
intermediate result = 1.175494e-38 (0x800000)
result = -1.175494e-38 (0x80800000)
expected result = -0.0 or 0.0
d1 = 2.350989e-38 (0xffffff)
d2 = -5.000000e-01 (0xbf000000)
intermediate result = -1.175494e-38 (0x80800000)
result = 1.175494e-38 (0x800000)
expected result = 0.0
******** Test fp_div_s ********
d1 = 2.350989e-38 (0xffffff)
d2 = 2.000000e+00 (0x40000000)
intermediate result = 1.175494e-38 (0x800000)
result = -1.175494e-38 (0x80800000)
expected result = -0.0 or 0.0
Images Affected: [SYS$LDR]EXCEPTION.EXE
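The operand and "intermediate result" bit patterns in the listing above
can be checked with ordinary IEEE single-precision arithmetic. The
sketch below uses Python's struct module and assumes the default
round-to-nearest-even mode. It reproduces only the operands and the
intermediate values; it does not model the OpenVMS handler or the final
result the test expects. The helper names are hypothetical.

```python
import struct


def f32_from_bits(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def f32_bits(value: float) -> int:
    """Round a Python float to single precision; return its bit pattern."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


d1 = f32_from_bits(0x00FFFFFF)  # the d1 operand from the listing

# The exact product lies halfway between the largest denormal
# (0x007FFFFF) and the smallest normal value (0x00800000), so
# round-to-nearest-even picks the smallest normal value.
print(hex(f32_bits(d1 * 0.5)))   # 0x800000
print(hex(f32_bits(d1 * -0.5)))  # 0x80800000
print(hex(f32_bits(d1 / 2.0)))   # 0x800000
```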
o Batch jobs take longer to process and hang in the local event
flag (LEF) state. AUTOGEN recommends using a higher value for
LNM%HASHTBL, which has previously been constrained by a
maximum value of 8192.
Images Affected:
- [SYSEXE]SYSBOOT.EXE
- [SYSEXE]SYSGEN.EXE
- [SYSEXE]SYSMAN.EXE
Problems Addressed in VMS721_SYS-V0500:
o Provides support for the LP9000 adapter, the next generation
of the Emulex FibreChannel adapter.
Images Affected: [SYSEXE]SYS$CONFIG.DAT
o If a subprocess exists with a different security profile, a
call to $DELPRC could stall, leaving the process in RWAST
state.
Images Affected:
- [SYS$LDR]PROCESS_MANAGEMENT.EXE
- [SYS$LDR]PROCESS_MANAGEMENT_MON.EXE
- [SYS$LDR]PROCESS_MANAGEMENT.STB
- [SYS$LDR]PROCESS_MANAGEMENT_MON.STB
o After the installation of the VMS72_SYS-V0200 remedial kit,
the SET RIGHTS/DISABLE/PROCESS command will not remove the
specified right from the process rightslist.
Images Affected:
- [SYS$LDR]SECURITY.EXE
- [SYS$LDR]SECURITY_MON.EXE
- [SYS$LDR]SECURITY.STB
- [SYS$LDR]SECURITY_MON.STB
o The system would crash with an ACCVIO in the exception
handling code because a register was not being restored
properly.
Images Affected:
- [SYS$LDR]EXCEPTION.EXE
- [SYS$LDR]EXCEPTION_MON.EXE
- [SYS$LDR]EXCEPTION.STB
- [SYS$LDR]EXCEPTION_MON.STB
o After upgrading from V7.1 or V7.1-2 to V7.2 or V7.2-1, some
processes may periodically hang in HIBernation state when
they should have been awakened.
Images Affected:
- [SYS$LDR]SECURITY.EXE
- [SYS$LDR]SECURITY.STB
o Attempting to read from EXEC mode results in an ACCVIO and the
deletion of the user's process.
Images Affected:
- [SYS$LDR]LOCKING.EXE
- [SYS$LDR]LOCKING.STB
o A file sent to a spooled device could cause the system to
crash with a system service exception (SSRVEXCPT). This
problem occurs only when SYSTEM_CHECK is set to 1, which
causes the IO_ROUTINES_MON.EXE image to be used. The crash
does not occur if the device is not spooled.
Images Affected:
- [SYS$LDR]IO_ROUTINES.EXE
- [SYS$LDR]IO_ROUTINES_MON.EXE
- [SYS$LDR]IO_ROUTINES.STB
- [SYS$LDR]IO_ROUTINES_MON.STB
o A below normal value rounds up to an in-range value in the
ADDT, SUBT, and DIVT routines.
Images Affected:
- [SYS$LDR]EXCEPTION.EXE
- [SYS$LDR]EXCEPTION_MON.EXE
- [SYS$LDR]EXCEPTION.STB
- [SYS$LDR]EXCEPTION_MON.STB
o Preserve the sign if the result of an ADDT, SUBT, or DIVT
routine is a denormal value rather than a zero value; for a
zero value, the sign is always set to positive.
Images Affected:
- [SYS$LDR]EXCEPTION.EXE
- [SYS$LDR]EXCEPTION_MON.EXE
- [SYS$LDR]EXCEPTION.STB
- [SYS$LDR]EXCEPTION_MON.STB
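The two T_floating items above concern results that fall below the normal
range. As a rough illustration, the Python sketch below uses the host's
IEEE 754 double-precision arithmetic (assumed to be in its default
round-to-nearest mode with denormals enabled) to show one quotient whose
exact value is below the normal range but which rounds up to the smallest
normal value, and one negative denormal whose sign bit is set. It does not
exercise the OpenVMS routines themselves, and the helper names are
hypothetical.

```python
import math
import struct
import sys


def bits_d(x):
    """Return the 64-bit IEEE double pattern of x as an integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def from_bits_d(bits):
    """Build a double from a 64-bit integer pattern."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


min_normal = sys.float_info.min                  # 2**-1022
just_below_2x = from_bits_d(0x001FFFFFFFFFFFFF)  # largest value with exponent field 1

# Round-up case: the exact quotient is 2**-1022 - 2**-1075, which is below
# the normal range, yet round-to-nearest-even delivers the smallest normal.
rounded = just_below_2x / 2.0
print(hex(bits_d(rounded)), rounded == min_normal)

# Sign case: a negative denormal keeps its sign bit (bit 63).
neg_denormal = -min_normal / 4.0
print(hex(bits_d(neg_denormal)), math.copysign(1.0, neg_denormal))
```

Checking the top bit of the pattern, rather than comparing the value with
zero, is what distinguishes a negative denormal or -0.0 from a positive one.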
o I/O monitoring tools, such as MONITOR DISK/ITEM=QUEUE_LENGTH,
would report erroneous and increasing values for some multipath devices,
even when there was no active I/O on the device.
Images Affected:
- [SYS$LDR]IO_ROUTINES.EXE
- [SYS$LDR]IO_ROUTINES_MON.EXE
- [SYS$LDR]MULTIPATH.EXE
- [SYS$LDR]MULTIPATH_MON.EXE
- [SYS$LDR]IO_ROUTINES.STB
- [SYS$LDR]IO_ROUTINES_MON.STB
- [SYS$LDR]MULTIPATH.STB
- [SYS$LDR]MULTIPATH_MON.STB
o The system crashes with an invalid address in R0 at
NSA$AUDIT_EVENT_C+00008. The dump stack will have several
entries from logical name support (LNMSUB).
Images Affected:
- [SYS$LDR]LOGICAL_NAMES.EXE
- [SYS$LDR]LOGICAL_NAMES.STB
o Convert $ENQ requests may be queued in the wrong order on the
conversion queue.
Images Affected:
- [SYS$LDR]LOCKING.EXE
- [SYS$LDR]LOCKING.STB
o Documentation states that a process joins the system and
default group resource domains when created. At present, the
process only joins the system domain. The first $ENQ will
result in the process joining the default group domain.
Images Affected:
- [SYS$LDR]LOCKING.EXE
- [SYS$LDR]LOCKING.STB
o One of the inputs to the SCH$CHANGE_CUR_PRIORITY routine is
the CPU db address. This is the address of the CPU that
executes changing the priority of a particular process. On
occasion, the input is from a different CPU and the
SCH$CHANGE_CUR_PRIORITY routine reads the input incorrectly.
Images Affected:
- [SYS$LDR]PROCESS_MANAGEMENT.EXE
- [SYS$LDR]PROCESS_MANAGEMENT_MON.EXE
- [SYS$LDR]PROCESS_MANAGEMENT.STB
- [SYS$LDR]PROCESS_MANAGEMENT_MON.STB
Problems Addressed in VMS721_SYS-V0400:
o VMS721_SYS-V0300 kit did not include all images
The VMS721_SYS-V0300 kit did not include all the images necessary to
correct the problems.
Images Affected:
- [SYS$LDR]PROCESS_MANAGEMENT_MON.EXE
Problems Addressed in VMS721_SYS-V0300:
o Redefining logical name tables may lead to a system crash.
Redefining a logical name table, such as LNM$TEMPORARY_MAILBOX, to a
process-private logical name table may lead to a system crash if the
process also creates a mailbox with a logical name. The crash would
typically occur when the CLUSTER_SERVER process was the current process.
Images Affected:
- [SYS$LDR]LOGICAL_NAMES.EXE
Problems Addressed in VMS721_SYS-V0200:
o Third party access checks are failing.
Third party access checks are failing after all rights are removed
and new rights are added using Grant/Revoke_id services.
Images Affected:
- [SYS$LDR]SECURITY.EXE
- [SYS$LDR]SECURITY.STB
- [SYS$LDR]SECURITY_MON.EXE
- [SYS$LDR]SECURITY_MON.STB
o Identifiers are being ignored on user accounts.
As identifiers are granted to a user, access to a queue that had
previously worked stops working. This is shown below:
$ submit/user=USER1/nolog SYS$SYSDEVICE:[USER1]test/que=USER1$test
%SUBMIT-F-CREJOB, error creating job
-JBC-E-NOPRIV, insufficient privilege or queue protection violation
$
$ uaf grant/id ID1 USER1
%UAF-I-GRANTMSG, identifier ID1 granted to USER1
$
$ submit/user=USER1/nolog SYS$SYSDEVICE:[USER1]test/que=USER1$test
Job TEST (queue USER1$TEST, entry 7) started on USER1$TEST
$
$ uaf grant/id ID2 USER1
%UAF-I-GRANTMSG, identifier ID2 granted to USER1
$
$ submit/user=USER1/nolog SYS$SYSDEVICE:[USER1]test/que=USER1$test
Job TEST (queue USER1$TEST, entry 8) started on USER1$TEST
$
$ uaf grant/id ID3 USER1
%UAF-I-GRANTMSG, identifier ID3 granted to USER1
$
$ submit/user=USER1/nolog SYS$SYSDEVICE:[USER1]test/que=USER1$test
Job TEST (queue USER1$TEST, entry 9) started on USER1$TEST
$
$ uaf grant/id ID4 USER1
%UAF-I-GRANTMSG, identifier ID4 granted to USER1
$
$ submit/user=USER1/nolog SYS$SYSDEVICE:[USER1]test/que=USER1$test
%SUBMIT-F-CREJOB, error creating job
-JBC-E-NOPRIV, insufficient privilege or queue protection violation
Images Affected:
- [SYS$LDR]SECURITY.EXE
Problems Addressed in VMS721_SYS-V0100:
o Prevent System Crash
This fix prevents a system crash on OpenVMS V7.2-1.
Images Affected:
- [SYS$LDR]PROCESS_MANAGEMENT.EXE
- [SYS$LDR]PROCESS_MANAGEMENT_MON.EXE
INSTALLATION NOTES:
This kit requires a system reboot. Compaq strongly recommends that
a reboot be performed immediately after kit installation to avoid
system instability.
If there are other nodes in the OpenVMS cluster, they must also be
rebooted in order to make use of the new image(s). If it is not
possible or convenient to reboot the entire cluster at this time, a
rolling reboot may be performed.
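The rolling approach mentioned above amounts to rebooting one node at a
time and waiting for it to rejoin before moving on. The Python sketch
below shows only that control flow. Everything in it is hypothetical: the
reboot and is_member callables stand in for whatever site-specific
procedure reboots a node and checks cluster membership, and neither is
part of this kit or of OpenVMS.

```python
import time
from typing import Callable, Iterable, List


def rolling_reboot(nodes: Iterable[str],
                   reboot: Callable[[str], None],
                   is_member: Callable[[str], bool],
                   max_polls: int = 60,
                   poll_seconds: float = 10.0,
                   wait: Callable[[float], None] = time.sleep) -> List[str]:
    """Reboot nodes one at a time; return the nodes that rejoined, in order."""
    done = []
    for node in nodes:
        reboot(node)                      # site-specific, hypothetical
        for _ in range(max_polls):
            if is_member(node):           # node has rejoined the cluster
                break
            wait(poll_seconds)
        else:
            # Stop here so the remaining nodes keep running the old images.
            raise RuntimeError(f"{node} did not rejoin; stopping rolling reboot")
        done.append(node)
    return done


if __name__ == "__main__":
    # In-memory stand-ins so the sketch can be run without a cluster.
    polls = {}

    def fake_reboot(node):
        polls[node] = 0

    def fake_is_member(node):
        polls[node] += 1
        return polls[node] >= 2           # "rejoins" on the second poll

    print(rolling_reboot(["NODEA", "NODEB"], fake_reboot, fake_is_member,
                         wait=lambda seconds: None))
```

Stopping at the first node that fails to rejoin keeps the rest of the
cluster available while the problem is investigated.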
INSTALLATION INSTRUCTIONS:
Install this kit with the POLYCENTER Software Installation utility
by logging into the SYSTEM account, and typing the following at the
DCL prompt:
PRODUCT INSTALL VMS721_SYS /SOURCE=[location of Kit]
The kit location may be a tape drive, CD, or a disk directory that
contains the kit.
Additional help on installing PCSI kits can be found by typing
HELP PRODUCT INSTALL at the system prompt.
All trademarks are the property of their respective owners.