From: AIX Service Mail Server (aixserv@austin.ibm.com)
Date: Tue May 08 2001 - 02:18:10 CDT
APAR: IY10868 COMPID: 576554801 REL: 110
ABSTRACT: MSGIGYDS0220-S UNABLE TO LOCATE THE DB2 PRODUCT
PROBLEM DESCRIPTION:
MSGIGYDS0220-S Unable to locate the DB2 product.
The cob2 command forces the LIBPATH to its own location
thus overriding whatever the user specified.
CMVC defect number is 20304
LOCAL FIX:
$ ln -s /usr/lpp/db2_05_00/lib/libdb2.a .
PROBLEM SUMMARY:
When multiple releases of DB2 are installed,
the DB2 shared library libdb2.a cannot be found in /usr/lib.
COB2 was coded to set LIBPATH to point to /usr/lib. There was
no way to cause COB2 to include other paths, so the path to the
libdb2.a to be used to compile a program that includes EXEC SQL
statements could not be specified to the compiler.
PROBLEM CONCLUSION:
COB2 will be modified to append the paths
needed by the compiler to the existing value of LIBPATH rather
than replacing the value of LIBPATH.
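The difference between replacing and appending can be sketched as follows. This is only an illustration of the idea, not the actual cob2 change; the helper name append_paths is hypothetical.

```cpp
#include <string>
#include <vector>

// Append compiler-required directories to an existing LIBPATH-style value
// instead of replacing it, so directories the user set are searched first.
// (Hypothetical helper; not taken from the cob2 source.)
std::string append_paths(const std::string& existing,
                         const std::vector<std::string>& required) {
    std::string result = existing;
    for (const std::string& dir : required) {
        if (dir.empty()) continue;          // ignore empty entries
        if (!result.empty()) result += ':'; // colon-separated list
        result += dir;
    }
    return result;
}
```

A caller could read the current value with getenv("LIBPATH"), pass it through this helper, and store the result with setenv() before invoking the compiler.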
------
APAR: IY12785 COMPID: 569692600 REL: 110
ABSTRACT: X.25/ARTIC960HX PERFORMANCE PROBLEM WITH BOS.MP/UP 4.3.3.16
PROBLEM DESCRIPTION:
Only on the ARTIC960HX PCI Adapter,
'mkdev -l sx25a0' intermittently returns error "Method
error (/usr/lib/methods/cfgsx25):", "0514-048 Error downloading
microcode or software.", "cfgsx25: tw_get(PH_OK_ACK) failed" or
"cfgsx25: tw_get(DL_OK_ACK) failed".
After the port is made and gets connected, X.25 data transfer
runs very slowly over the ARTIC960HX PCI Adapter.
LOCAL FIX:
Use a modified bos.mp/up module
PROBLEM CONCLUSION:
There are two drivers, ddriciop and twd, which work with the
ARTIC960Hx (ricio) card. Each installed its own interrupt
handler for the card. twd needs to be first in the chain.
Due to a change made in interrupt handling in bos.(mp/up),
we needed to have only one interrupt handler (from ddriciop)
installed. Now twd's interrupt handler is registered with
ddriciop.
------
APAR: IY15383 COMPID: 5765D5100 REL: 311
ABSTRACT: SWITCH WENT TO DOWN
PROBLEM DESCRIPTION:
Switch went to down.
2510-898 unable to access SDR to get the list of auto-join nodes
rc= -1.
2510-195 The fault service daemon got a SIGTERM signal.
PROBLEM SUMMARY:
Closed the window so the child process will exit on SIGTERM
without resetting the adapter.
PROBLEM CONCLUSION:
There is a small window, while the primary forks a child
process and an SDR test is run, in which a SIGTERM to the child
results in the child calling the standard SIGTERM handler
and resetting the adapter.
------
APAR: IY16163 COMPID: 5765E5400 REL: 440
ABSTRACT: CLUSTER.ES.SERVER.RTE 4.4.0.5 FAILS IF LINK EXISTS
PROBLEM DESCRIPTION:
Install of cluster.es.server.rte 4.4.0.5 fails with error:
ln: 0653-421 /etc/objrepos/HACMPdisktype exists.
PROBLEM CONCLUSION:
Check if /etc/objrepos/HACMPdisktype exists before creating
link.
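The check-before-link idea can be sketched like this. It is a minimal illustration under the assumption that an existing entry should simply be left alone; the installer itself is a script, and the function name here is hypothetical.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Create `link` pointing at `target` only if nothing exists at `link` yet.
// Returns true if an entry is present at `link` afterwards.
bool ensure_symlink(const fs::path& target, const fs::path& link) {
    std::error_code ec;
    // symlink_status does not follow links, so a dangling link still counts
    // as "exists" and is not recreated.
    if (fs::exists(fs::symlink_status(link, ec))) return true;
    fs::create_symlink(target, link, ec);
    return !ec;
}
```

Calling it a second time for the same link is a no-op rather than an error, which is the behavior the install step needs.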
------
APAR: IY16416 COMPID: 5765D5100 REL: 311
ABSTRACT: PSSP_SCRIPT NEEDS TO INSTALL DEVICES.CHRP.BASE.RTE IS
PROBLEM DESCRIPTION:
The PSSP 3.x pssp_script fails to install (migrate to)
PSSP 2.4 on MCA nodes because devices.chrp.base.rte
gets installed only on PCI nodes. But this fileset is
a prereq of PSSP 2.4, so it is needed on MCA nodes too.
The part of pssp_script to install devices.chrp.base.rte
currently is (I have to wrap lines to fit into SSF)
# Defect 46958: remove -c (commit) flag for AIX filesets
oslvl=$($oslevel)
if $oslvl = $os415 && $oslvl = $os414 ; then
if -z $($lslpp -qh devices.chrp.base.rte 2>/dev/null)
&& $platform = "chrp" ; then #-
$installp -abgXd/mnt devices.chrp.base.rte
==> only on oslevel >=4.2.0.0 and on platform==chrp
devices.chrp.base.rte will be installed
This needs to be changed to
if $oslvl = $os415 && $oslvl = $os414 ; then
if -z $($lslpp -qh devices.chrp.base.rte 2>/dev/null)
&& $platform = "chrp" || "$code_version" = "PSSP-2.4"
then #-
$installp -abgXd/mnt devices.chrp.base.rte
==> devices.chrp.base.rte will be installed if
oslevel>=4.2.0.0 and (platform==chrp or
code_version==PSSP-2.4)
PROBLEM SUMMARY:
pssp_script only installs the fileset devices.chrp.base.rte
on chrp nodes. However the ssp.basic fileset in PSSP 2.4
requires devices.chrp.base.rte, regardless of the type of
node. pssp_script needs to be modified to install the
fileset devices.chrp.base.rte when either the node
platform is chrp, or the PSSP level of the node is 2.4.
PROBLEM CONCLUSION:
pssp_script has been modified to install the
fileset devices.chrp.base.rte when either the node
platform is chrp, or the PSSP level of the node is 2.4.
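The intended condition can be written out as a small predicate. This sketch assumes the grouping given in the "==>" note above (oslevel check, then platform or PSSP level) and additionally assumes the "not already installed" test applies in both cases; the function and parameter names are hypothetical.

```cpp
#include <string>

// Decide whether devices.chrp.base.rte should be installed on a node.
// oslevel_at_least_4_2:      result of the oslevel comparison in pssp_script
// fileset_already_installed: whether lslpp already reports the fileset
bool should_install_chrp_base(bool oslevel_at_least_4_2,
                              bool fileset_already_installed,
                              const std::string& platform,
                              const std::string& code_version) {
    return oslevel_at_least_4_2 &&
           !fileset_already_installed &&
           (platform == "chrp" || code_version == "PSSP-2.4");
}
```

The parentheses matter: without them, "&&" binds tighter than "||" and the PSSP-2.4 case would bypass the other checks.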
------
APAR: IY16816 COMPID: 5765E5400 REL: 440
ABSTRACT: NIM_TOK CORE DUMP AND FAIL_STANDBY EVENT AFTER ADDING 64
PROBLEM DESCRIPTION:
When adding 64 aliases to the service adapter, the
associated nim may core dump and generate a fail_standby
event.
PROBLEM CONCLUSION:
Increase the buffer set aside for the nmAdapter list and
skip logic that creates a socket for each alias.
------
APAR: IY16985 COMPID: 5765D6100 REL: 210
ABSTRACT: LLCANCEL POE INTERACTIVE RUNNING JOB W/EXTERNAL SCHEDULER
PROBLEM DESCRIPTION:
For interactive POE jobs running under an external scheduler,
the llcancel command cannot stop the POE job.
LOCAL FIX:
For now, a running interactive POE job under an external
scheduler can only be killed with Ctrl-C, not with
llcancel.
PROBLEM SUMMARY:
The LoadLeveler command llcancel could not
cancel a running POE interactive job when an
external scheduler was set.
PROBLEM CONCLUSION:
The LoadLeveler command llcancel can now
cancel a running POE interactive job when an
external scheduler is set.
------
APAR: IY17133 COMPID: 5765D5100 REL: 311
ABSTRACT: NODES NEED BOOTED TWICE TO UPDATE TUNING.CUST
PROBLEM DESCRIPTION:
Because boot procedure always runs tuning.cust locally, *then*
checks for customize and ftp's tuning.cust from CWS, the node
must be rebooted a second time for changes in tuning.cust to
actually be applied to the node.
LOCAL FIX:
Reboot nodes a second time after tuning.cust has been ftp'd
through customize reboot.
PROBLEM SUMMARY:
During a node's customization, tuning.cust is ftp'd from
the node's Boot/Install Server. However, tuning.cust will
not be executed until the next reboot of the node.
pssp_script should be modified so that during a node's
customization, tuning.cust will be executed.
PROBLEM CONCLUSION:
pssp_script has been modified so that during a node's
customization, tuning.cust will be executed.
------
APAR: IY17237 COMPID: 5765D5100 REL: 311
ABSTRACT: SPADAPTR ERROR WITH '-S YES' AND LARGE NODE NUMBER
PROBLEM DESCRIPTION:
Using spadaptrs with the '-s yes' option and a large number of
nodes can cause incorrect ip addresses to be calculated,
resulting in error message 0022-047.
PROBLEM SUMMARY:
A customer was using spadaptrs to enter data for a large
number of css adapters using the switch node numbers.
When a node with a high node number, but a low switch
number was encountered, the third octet of the IP address
was calculated incorrectly.
During the processing of all the nodes, the third octet
of the IP address had been incremented because the fourth
octet had exceeded 255. When the node with a high node
number was being processed it used the incremented third
octet number instead of the original value. Since this
calculated IP address was not a valid IP address, an error
message was issued stating that the IP address could not
be resolved and spadaptrs terminated.
PROBLEM CONCLUSION:
spadaptrs was modified to correct the generation of IP
addresses for css adapters, when the switch node numbers
are used. Certain values were not being reset, which
caused the third octet of the generated IP address to be
incorrect, which could cause spadaptrs to fail.
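The class of bug described here, a carry into the third octet that persists from one node to the next, can be avoided by deriving every address from the unchanged starting address. The sketch below assumes a simple "start address plus offset" scheme, which may differ from what spadaptrs really does; all names are hypothetical.

```cpp
#include <array>

using Ipv4 = std::array<int, 4>;

// Derive an adapter address from a fixed starting address plus a
// non-negative offset (for example, a switch node number).
// Each call works on a fresh copy of `start`, so a carry produced for one
// node never leaks into the third octet of the next node.
Ipv4 address_for_offset(const Ipv4& start, int offset) {
    Ipv4 addr = start;                 // no state is kept between nodes
    int fourth = start[3] + offset;
    addr[3] = fourth % 256;
    addr[2] = start[2] + fourth / 256; // carry overflow into the third octet
    return addr;
}
```

A complete implementation would also reject results whose third octet exceeds 255; that check is omitted here for brevity.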
------
APAR: IY17321 COMPID: 5765D5101 REL: 111
ABSTRACT: ORACLE CAN NOT CONNECT HAGSD WITH MANY CLIENTS
PROBLEM DESCRIPTION:
Oracle can not connect hagsd with many clients.
PROBLEM SUMMARY:
The Group Services daemon currently limits the maximum
number of pending connections to 5. Therefore, if many
connection requests (e.g., over 500) are made to Group
Services within a short time, some of the requests may
be refused with "ECONNREFUSED".
Extending the maximum pending queue length (the
backlog size) to the maximum configured queue length
(i.e., no -o somaxconn) resolves the unexpected
connection refusals.
PROBLEM CONCLUSION:
This fix enables the Group Services subsystem
to handle many concurrent connection requests,
even if the requests are made within a very short
period.
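In socket terms, the backlog is the second argument to listen(). The sketch below shows a listener that asks for SOMAXCONN instead of a hard-coded 5. It uses a TCP loopback socket purely for illustration; the socket type hagsd actually uses is not stated here, and the function name is hypothetical.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Open a TCP listening socket on the loopback interface, requesting the
// largest backlog the system headers advertise rather than a fixed 5.
// Returns the descriptor, or -1 on failure. Port 0 lets the system choose.
int open_listener(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) < 0 ||
        listen(fd, SOMAXCONN) < 0) {   // backlog: pending-connection queue
        close(fd);
        return -1;
    }
    return fd;
}
```

The kernel typically caps the requested backlog at its configured limit, which is why the summary above refers to the somaxconn network option.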
------
APAR: IY17369 COMPID: 5765D5101 REL: 111
ABSTRACT: HAGS BROADCAST METHOD CAN CAUSE STORMS ON INSTALLTIONS WITH
PROBLEM DESCRIPTION:
When HAGS needs a broadcast, it first tries to send the msg
out to everybody (burst broadcast). It then rebroadcasts to the
undelivered nodes every 3 seconds. This may
cause a large overhead if the number of nodes is large. This has
been seen to cause hags voting issues.
PROBLEM SUMMARY:
Whenever Group Services needs a broadcast,
it first tries to send the messages to
all nodes and then retries the broadcast
to the nodes that have not responded
every 3 seconds.
This behavior may increase the load
on the IP stack, particularly if the
number of nodes is large, and thus
it may increase the message drop rate
and cause more retries, which may delay
the completion of Group Services protocols.
PROBLEM CONCLUSION:
On a big system (with more than 64 nodes), this
fix lessens the overhead of broadcasting
Group Services messages by spreading out
the message sends.
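"Spreading out the message sends" can be pictured as a pacing schedule: instead of addressing every node at once, nodes are handled in batches separated by a short gap. The batch size, the gap, and the function name below are assumptions made for illustration; the actual pacing used by Group Services is not described here.

```cpp
#include <cstddef>
#include <vector>

// Return the delay (in milliseconds) before sending to each node, so that
// only `batch` nodes are addressed at once and batches are `gap_ms` apart.
std::vector<int> paced_send_offsets(std::size_t nodes, std::size_t batch,
                                    int gap_ms) {
    std::vector<int> offsets;
    offsets.reserve(nodes);
    if (batch == 0) batch = 1;   // guard against division by zero
    for (std::size_t i = 0; i < nodes; ++i)
        offsets.push_back(static_cast<int>(i / batch) * gap_ms);
    return offsets;
}
```

With 5 nodes, a batch size of 2, and a 10 ms gap, the first two sends go out immediately, the next two after 10 ms, and the last after 20 ms.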
------
APAR: IY17580 COMPID: 5765D5100 REL: 311
ABSTRACT: INSUFFICIENT STACK FOR KICKPIPES() IN MPCI, CAUSES A PROBLEM
PROBLEM DESCRIPTION:
PSSP 3.1.1 introduced the local variables 'shoveq' and 'frq' in kickpipes()
in MPCI. They need a stack frame of 8192 bytes, but the stack frame is
only 4096 bytes. This causes Informix to go down.
PROBLEM SUMMARY:
Informix running on an SP system can fail if
a query with 2000 ORs is run. Informix may detect a
corruption of the header of its stack block pool, and quit.
PROBLEM CONCLUSION:
MPCI, which is used by Informix, added a
couple of large stack variables for shared memory support.
This causes a problem for Informix, because Informix both uses
MPCI and manages its own threading and stacks. Informix's
current management does not account for the 8K of
additional stack space needed by the MPCI routines that Informix
calls. MPCI changed the declaration of these new large
variables so that they are now located in the heap instead
of the stack. The new MPCI implementation solves the problem
that Informix had working with our MPCI environment and is
probably the better way for MPCI to handle these large
variables.
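Moving large locals from the stack to the heap looks roughly like the sketch below. The two buffer names echo the variables mentioned in the description, but the 4096-byte size of each buffer and the function itself are assumptions, not MPCI code.

```cpp
#include <cstddef>
#include <memory>

constexpr std::size_t kQueueBytes = 4096;  // assumed size of each buffer

// Before the change, two buffers like these would be plain local arrays and
// cost the caller about 8K of stack. Allocating them on the heap keeps the
// stack frame small, which matters when the caller manages its own stacks.
std::size_t heap_backed_work() {
    // make_unique<T[]> value-initializes, so both buffers start zeroed.
    auto shove_q = std::make_unique<unsigned char[]>(kQueueBytes);
    auto free_q  = std::make_unique<unsigned char[]>(kQueueBytes);

    shove_q[0] = 1;                // stand-in for the real queue handling
    free_q[kQueueBytes - 1] = 1;

    return 2 * kQueueBytes;        // bytes no longer charged to the stack
}
```

The buffers are released automatically when the function returns, so the caller sees no difference other than the reduced stack usage.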
------
APAR: IY17883 COMPID: 5765D5100 REL: 311
ABSTRACT: VSD.RESERVE UNNECESSARILY OBTAINING DCE CREDENTIALS
PROBLEM DESCRIPTION:
vsd.reserve unnecessarily obtaining DCE credentials
PROBLEM SUMMARY:
The vsd.reserve executable, which is used to reserve a
volume_group, is unnecessarily trying to obtain DCE
credentials when it determines there is a mismatch in
the timestamps for the last time the volume was varied on.
The error can be seen in the console log:
checkvg_timestamps 174 : /usr/lpp/ssp/bin/dsrvtgt: not found
rvsd(recov) 03/30/01 11:46:42 vsd.reserve:
checkvg_timestamps: Failed to obtain
DCE credentials; rc=127. Continuing.
PROBLEM CONCLUSION:
The code to obtain the DCE credentials has been removed.
------
APAR: IY17976 COMPID: 5765E5400 REL: 440
ABSTRACT: AFTER NODEXNODE STOP OF NODE SHOWS HACMPRD NOT CLSOMETIMES THIS
PROBLEM DESCRIPTION:
Administration Guide SC23-4279-01, page 5-11 includes a Note
that reads:
Make sure that the concurrent access volume group is in a
quiescent state (no I/O operations in progress) before
executing these reconfiguration commands.
This Note appears to imply that in order to reach a quiescent
state all applications running in a cluster node should be
stopped.
PROBLEM SUMMARY:
After a successful node-by-node migration, bringing a node
down gracefully through smitty shows the string "hacmprd" and not
"clstrmgr". "hacmprd" is the old recovery driver name.
PROBLEM CONCLUSION:
Change the name in the default message.
------
APAR: IY18037 COMPID: 5765E5400 REL: 440
ABSTRACT: TWO NODES HAVE SAME RESOURCE GROUP AFTER DARE WHILE ONE NODE WAS
PROBLEM DESCRIPTION:
The customer had a cascading mutual takeover configuration with
inactive takeover set to false. One node was taken down with
takeover and powered off for a maintenance problem. While that
node was still down, the customer had a problem on the other
node which required it to be rebooted. Thus when starting
HACMP back up on that node, the resource group normally owned
by the other node was not taken. The customer then ran a DARE
to move that resource group sticky (required) to the node that
was up to get it active again. Though verification errors had
to be ignored in order to get this to happen, and warnings were
given to sync the config to the powered off node before bringing
it into the cluster, we also stated that the node would not be
allowed into the cluster until this sync was done. However,
when that node was started back into the cluster, there was
nothing that detected the out of sync condition, and so let
the node join the cluster resulting in it taking the resources
without the other node releasing them.
PROBLEM CONCLUSION:
Add a check in the rc.cluster script to compare resource ODMs
with active nodes resource ODMs.
------
APAR: IY18046 COMPID: 5765E5400 REL: 440
ABSTRACT: AFTER N-1 NODEXNODE W/REBOOT: LOCK MANAGER RETURNS CLM_NOLOCKMG
PROBLEM DESCRIPTION:
In a 4-node cluster at HAS 440 set up for node-by-node
migration, the first 3 nodes that are "migrated" return
CLM_NOLOCKMGR when requesting a lock. However, after the last
node is migrated node by node, the lock manager responds
correctly on all nodes.
PROBLEM CONCLUSION:
Replace /usr/lib/libclm.a and /usr/lib/libclm_r.a with
/usr/es/lib/libclm.a and /usr/es/lib/libclm_r.a
------
APAR: IY18049 COMPID: 5765E5400 REL: 440
ABSTRACT: HAES: FAILURE CYCLE = 32 FOR ATM: TOO LONG
PROBLEM DESCRIPTION:
The default failure cycle in the HACMPnim class is set to 32,
resulting in a long wait for death detection.
PROBLEM CONCLUSION:
The default setting of the ATM failure cycle will be changed
from 32 to 8.
------
APAR: IY18059 COMPID: 5765E5400 REL: 440
ABSTRACT: ON FALLOVER, STANDBY ADAPTER MARKED DOWN IS SOMETIMES SELECTED
PROBLEM DESCRIPTION:
A node has two standbys. A service address fails (unplugged).
Swap_adapter completes successfully and the standby is marked
down. Fallover occurs. It fails even though there is a
second standby marked up, because the standby which is down
is selected for the service address of the failed node.
PROBLEM CONCLUSION:
Modify clstrmgr so that it exports DOWN for standby adapters
which are down instead of doing nothing thus making it
compatible with HAS.
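Once the DOWN state is exported, choosing a standby reduces to skipping adapters that are not up. The sketch below shows that selection in isolation; the Standby structure, the adapter names, and the function are hypothetical and do not come from clstrmgr or the event scripts.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical view of a standby adapter as seen by the selection logic.
struct Standby {
    std::string name;
    bool up;
};

// Pick the first standby that is marked up; a standby that is marked down
// is never selected. Returns no value if every standby is down.
std::optional<std::string> pick_standby(const std::vector<Standby>& standbys) {
    for (const Standby& s : standbys)
        if (s.up) return s.name;
    return std::nullopt;
}
```

In the scenario above, the first standby is down after the swap, so the second one is returned and the fallover can proceed.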
------
APAR: IY18077 COMPID: 5765D5100 REL: 311
ABSTRACT: VSDVGTS -A TOO SLOW WITH MANY HDISKS AND VPATHS
PROBLEM DESCRIPTION:
vsdvgts -a too slow with many hdisks and vpaths
PROBLEM SUMMARY:
Several scalability performance problems have been
observed when a node has a large number of disks
and/or volume groups being managed by RVSD.
PROBLEM CONCLUSION:
The following changes will be made in RVSD to address
the performance problems observed when a node has
a large number of disks and/or volume groups.
- The vsdvgts command has been changed to not use
the lspv command to determine volume group membership.
- The RVSD recovery scripts will limit the number of
varyonvg/varyoffvg operations that can occur in parallel in order
to reduce ODM lock contention.
------
APAR: IY18128 COMPID: 5697F6400 REL: 640
ABSTRACT: FIXES TO MINOR DEFECTS IN MESSAGECENTER 6.4
PROBLEM DESCRIPTION:
Fixes to minor defects in messagecenter 6.4
------
APAR: IY18282 COMPID: 5765E5400 REL: 440
ABSTRACT: NXN MIGRATION IS BROKEN FOR CLUSTERS WITH SERIAL NETWORKS
PROBLEM DESCRIPTION:
HACMP/ES and RSCT daemons are not started
after installing HACMP/ES 4.4.0 + pmrs (ptf set 4)
for node-by-node migration. The problem is that
cllsif fails in clstart because the HACMPadapters
ODM has been corrupted.
PROBLEM CONCLUSION:
Modify cluster.es.server.rte.post_u so that it correctly
parses and changes the HACMPadapters file.
------
APAR: IY18289 COMPID: 5765C3403 REL: 430
ABSTRACT: PRINTING PROBLEM IN WIN95
PROBLEM DESCRIPTION:
Printing Problem in Win95
PROBLEM SUMMARY:
Printing Problem in Win95. When a file is printed to a network
printer, only first 18 characters of the filename appear in
the Document name on
PROBLEM CONCLUSION:
Job number will be sent instead of filename, which will be
helpful in differentiating different print jobs.
------
APAR: IY18398 COMPID: 5765E2600 REL: 502
ABSTRACT: OUTPUT OF COUT/CERR FROM DESTRUCTOR OF STATIC OBJECT DOES NOT AP
PROBLEM DESCRIPTION:
When using the C & C++ Compilers 3.6.6 compiler with the
5.0.2.0 level of the C++ Runtime Library the output of
cout and cerr statements in the destructors of static objects
does not appear.
For example in the following program:
#include <stream.h>
class bogus
{
public:
    bogus() { cout << "Initialize\n"; }
    ~bogus() { cout << "Clean up\n"; }
};
bogus y;
int main()
{
    cout << "Hello, world\n";
    return 0; // DELETE
}
The expected output is:
Initialize
Hello, world
Clean up
but if the program is built with 3.6.6 and the 5.0.2.0 runtime
the output "Clean up" from the destructor of the static object
does not appear.
PROBLEM CONCLUSION:
Fixed in VisualAge C++ 5.0.2.1 runtime PTFs
------
APAR: IY18538 COMPID: 5765B8100 REL: 220
ABSTRACT: SUPPRESS CA_TDM_CONNECT ERROR IF RC=CA_HANGUP
PROBLEM DESCRIPTION:
Suppress CA_TDM_Connect error if RC=CA_HANGUP
PROBLEM CONCLUSION:
Error was suppressed if RC=CA_HANGUP
------
APAR: IY18595 COMPID: 5765E6110 REL: 220
ABSTRACT: REQUIRED UPDATES FOR RSCT VERSION 2.2
PROBLEM DESCRIPTION:
Required updates for RSCT Version 2.2
PROBLEM SUMMARY:
These updates must be applied if you are using
WebSM or have the PSSP or HACMP/ES products installed.
PROBLEM CONCLUSION:
These updates must be applied if you are
using WebSM or have the PSSP or HACMP/ES products installed.
------
APAR: IY18632 COMPID: 5765C3403 REL: 430
ABSTRACT: SET SO_KEEPALIVE ON ON CLIENT CONNECTION SOCKETS.
PROBLEM DESCRIPTION:
If a client system crashes and reconnects to the server, the old
client's connection and session are not deleted and removed.
PROBLEM CONCLUSION:
Set the socket option SO_KEEPALIVE on the connection socket, so
that recv() does not block after tcp_keepidle has expired.
------