From: AIX Service Mail Server (aixserv austin.ibm.com)
Date: Tue Jun 11 2002 - 02:40:08 CDT
APAR: IY16092 COMPID: 5765B7300 REL: 510
ABSTRACT: AVOID MQSERIES IPC KEY CONFLICTS WITH OTHER PRODUCTS
PROBLEM DESCRIPTION:
MQSeries allocates Sys V IPC shared memory and semaphore sets
to handle user workload. One of these sets had a key which
clashed with another product running on the system. When this
clash occurred, MQSeries failed and had to be stopped and
restarted.
LOCAL FIX:
If the inode number contributing to the IPC key clash can be
identified and if it can be allocated to a non-MQSeries file,
this problem can be manually circumvented.
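To illustrate how an inode number can contribute to a key clash, the sketch below models one common ftok()-style key derivation. The bit layout (8-bit project id, low 8 bits of the device number, low 16 bits of the inode number) is an assumption for illustration only; it is not claimed to be the layout used by AIX or by MQSeries. The names model_ftok_key and keys_clash are hypothetical.

```cpp
#include <cstdint>

// Hypothetical model of an ftok()-style key. ASSUMED layout:
//   bits 24..31: project id
//   bits 16..23: low 8 bits of the device number
//   bits  0..15: low 16 bits of the inode number
// Real platforms may combine these fields differently.
std::uint32_t model_ftok_key(std::uint32_t inode, std::uint32_t dev,
                             std::uint8_t proj_id)
{
    return (static_cast<std::uint32_t>(proj_id) << 24)
         | ((dev & 0xFFu) << 16)
         | (inode & 0xFFFFu);
}

// Two distinct files produce the same key whenever their truncated
// inode and device fields coincide for the same project id.
bool keys_clash(std::uint32_t inode_a, std::uint32_t dev_a,
                std::uint32_t inode_b, std::uint32_t dev_b,
                std::uint8_t proj_id)
{
    return model_ftok_key(inode_a, dev_a, proj_id)
        == model_ftok_key(inode_b, dev_b, proj_id);
}
```

Under this model, inodes 0x10042 and 0x20042 on the same device yield the same key, which is why reassigning the clashing inode to an unrelated file can circumvent the problem.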
PROBLEM CONCLUSION:
This problem has been fixed and the fix will be shipped in the
following PTFs:
A) MQSeries for V5.1 CSD09
OS/2 U200157
Windows NT U200158
AIX U478874
HP-UX (V10) U478907
HP-UX (V11) U478908
Sun Solaris U478910
B) MQSeries for V5.2 CSD05
Windows NT U200169
AIX U481481
HP-UX (V10) U481510
HP-UX (V11) U481511
Sun Solaris U481514
Linux U481513
C) MQSeries for V5.2.1 CSD03
Windows NT/2000 U200170
------
APAR: IY23070 COMPID: 5765D5100 REL: 320
ABSTRACT: WRONG PATH FOR LUNREST.LST
PROBLEM DESCRIPTION:
Wrong path for lunreset.lst.
PROBLEM SUMMARY:
vsd_pscsilunreset, which is used to break a reserve on one
LUN on a SCSI ID on a parallel SCSI adapter and is used
primarily by HACMP, is looking in the wrong place for the
lunreset.lst file.
This file is used so that RVSD tries to do a LUN-level
reset as opposed to a target-mode reset for EMC Symmetrix
disks.
There is potential for data corruption on EMC Symmetrix
disks.
PROBLEM CONCLUSION:
The file path was corrected.
TEMPORARY FIX:
Copy /usr/lpp/csd/bin/lunreset.lst to /usr/lpp/csd/vsdfiles
------
APAR: IY23748 COMPID: 5765E2600 REL: 502
ABSTRACT: INTERNAL COMPILER ERROR WHEN USING NESTED CONDITIONAL EXPRESSION
PROBLEM DESCRIPTION:
A conditional expression in a constructor initializer list
returned a pointer to member function. The compiler did not
handle this case and crashed.
PROBLEM CONCLUSION:
USERS AFFECTED:
Users of conditional expressions involving member function
pointers.
RECOMMENDATION:
Avoid using a conditional expression to assign a pointer
to member function.
PROBLEM SUMMARY:
The compiler will crash when dealing with a conditional
expression that returns a pointer to a member function.
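The pattern and the recommended avoidance can be sketched as follows. This is an illustrative reconstruction, not the customer's testcase; the names Handler, DispatchConditional and DispatchIfElse are hypothetical.

```cpp
// Hypothetical class with two member functions to choose between.
struct Handler {
    int fast() { return 1; }
    int slow() { return 2; }
};

typedef int (Handler::*HandlerFn)();

// Pattern described in the APAR: a conditional expression that yields
// a pointer to member function inside a constructor initializer list.
class DispatchConditional {
public:
    explicit DispatchConditional(bool use_fast)
        : fn_(use_fast ? &Handler::fast : &Handler::slow) {}
    int call(Handler& h) const { return (h.*fn_)(); }
private:
    HandlerFn fn_;
};

// Avoidance per the recommendation: select the pointer with if/else
// in the constructor body instead of a conditional expression.
class DispatchIfElse {
public:
    explicit DispatchIfElse(bool use_fast) : fn_(&Handler::slow)
    {
        if (use_fast) {
            fn_ = &Handler::fast;
        }
    }
    int call(Handler& h) const { return (h.*fn_)(); }
private:
    HandlerFn fn_;
};
```

Both classes behave identically on a conforming compiler; the second form simply avoids the construct named in the recommendation.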
------
APAR: IY25627 COMPID: 5697E3000 REL: 220
ABSTRACT: WRONG ENTRY IN DICTIONARY
PROBLEM DESCRIPTION:
Wrong word entry for "Taiguu-Kaizen".
LOCAL FIX:
Update Wnn6 dictionaries.
PROBLEM SUMMARY:
This APAR includes fixes of Wnn6 problems.
------
APAR: IY26232 COMPID: 5765D5100 REL: 311
ABSTRACT: IP_RESET(IP_INIT) FAILURE ON TB3MX AND TB3PCI SP SWITCH
PROBLEM DESCRIPTION:
Machines can drop off the SP switch on an unfence or Estart due
to an ip_reset(IP_INIT) failure. The only approach to
recovering from an ip_reset(IP_INIT) error is to reboot any
machine which logs this error. For TB3MX and TB3PCI switch
adapters, this APAR addresses the IX87954 design change APAR
for ip_reset(IP_INIT) failures.
PROBLEM SUMMARY:
When a new set of switch routes must be downloaded to a
node on the SP switch, it's possible for an ip_reset(IP_INIT)
error to occur. This error can only be addressed by rebooting
the affected node.
PROBLEM CONCLUSION:
The SP Switch IP driver and microcode have been changed
to prevent ip_reset(IP_INIT) errors from occurring.
------
APAR: IY26294 COMPID: 5765E2600 REL: 500
ABSTRACT: STRUCT IS COPIED WITH -O IT FAILS.
PROBLEM DESCRIPTION:
PROBLEM:
The exact same test is run twice: once with the code that copies
the structure in an optimized source file, and once with the same
source (but with a different function name) in an unoptimized file
with the debug flag turned on. The main source file uses the -g
option, but not -O.
The unoptimized code executes as expected. The optimized code
compiles and executes, but does not copy the structure
successfully.
The customer found a workaround that may be useful in the defect
analysis. The workaround involves declaring a temporary pointer to
the structure (in the optimized code).
failing code: *p1 = *(p1->pNext);
workaround: POPT_BUG tmpPtr = p1->pNext; *p1 = *tmpPtr;
TESTCASE:
*********************** makefile **********************************
struct_cp : struct_cp.o opt_mod.o dbg_mod.o
	xlC_r -o struct_cp struct_cp.o opt_mod.o dbg_mod.o -g
struct_cp.o : struct_cp.cpp opt_dbg.h makefile
	xlC_r -c -g struct_cp.cpp
opt_mod.o : opt_mod.cpp opt_dbg.h makefile
	xlC_r -c -O opt_mod.cpp
dbg_mod.o : dbg_mod.cpp opt_dbg.h makefile
	xlC_r -c -g dbg_mod.cpp
*********************** makefile **********************************
*********************** opt_dbg.h *********************************
typedef struct opt_bug_tag {
LOCAL FIX:
*pVal = *pVal->pNext; // Fails
tmpPtr = pVal->pNext; *pVal = *tmpPtr; // Works.
PROBLEM CONCLUSION:
This fails under optimization because the compiler has generated
an assignment operator and that assignment operator is being
inlined. When it is inlined, the expression for the parameter is
substituted without any temporary, so it ends up clobbering the
pointer to the source.
P-Pak does not appear to generate an assignment operator for
this testcase, and it's not clear that we should.
If we define an assignment operator, then P-Pak doesn't inline it
while we do. We either need to avoid the inline or use a
temporary to hold the parameter value.
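The failing statement and the workaround can be sketched side by side. The original opt_dbg.h is truncated in this report, so the Node structure below is a hypothetical stand-in, and the function names are likewise illustrative.

```cpp
// Hypothetical stand-in for the truncated structure in opt_dbg.h.
struct Node {
    int value;
    Node* pNext;
};

// Form reported as failing under -O: the source object is reached
// through the same pointer that is being overwritten.
void copy_next_direct(Node* p1)
{
    *p1 = *(p1->pNext);
}

// Workaround form: capture the source pointer in a temporary first,
// so overwriting *p1 cannot disturb the address being read from.
void copy_next_via_temp(Node* p1)
{
    Node* tmpPtr = p1->pNext;
    *p1 = *tmpPtr;
}
```

On a conforming compiler the two functions are equivalent; the temporary only matters when the inlined assignment clobbers the source pointer as described above.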
------
APAR: IY26987 COMPID: 5765D5100 REL: 340
ABSTRACT: GENERIC FIXES FOR PSSP 3.4
PROBLEM DESCRIPTION:
generic fixes for pssp 3.4
------
APAR: IY26988 COMPID: 5765E6900 REL: 310
ABSTRACT: GENERIC FIXES FOR LOADL 3.1
PROBLEM DESCRIPTION:
generic fixes for LoadL 3.1
------
APAR: IY26989 COMPID: 5765D9300 REL: 320
ABSTRACT: GENERIC FIXES FOR PE 3.2
PROBLEM DESCRIPTION:
generic fixes for pe 3.2
------
APAR: IY26990 COMPID: 5765B9500 REL: 150
ABSTRACT: GENERIC FIXES FOR GPFS 1.5
PROBLEM DESCRIPTION:
generic fixes for gpfs 1.5
------
APAR: IY26991 COMPID: 5765B9501 REL: 340
ABSTRACT: GENERIC FIXES FOR GPFS 3.4
PROBLEM DESCRIPTION:
generic fixes for gpfs 3.4
------
APAR: IY27632 COMPID: 5765E2600 REL: 502
ABSTRACT: WRONG ERROR EMITTED BY COMPILER WHEN USING A POINTER TO MEMBER F
PROBLEM DESCRIPTION:
For the testcase:
template <class T> class Derived : public T {};
class A {};
Derived<A>* get();
typedef void (A::*PtrToMemberFunctionOfA)(void);
void foo(PtrToMemberFunctionOfA f)
{
(get()->*f)();
}
compiled with
xlC -c testcase.cpp
The following false error message is emitted:
"testcase.cpp", line 11.9: 1540-0210 (S) "A" is not a base class
of "Derived<A>".
LOCAL FIX:
Modify the statement:
(get()->*f)();
so that the return value from get() is cast to A*
(((A*)get())->*f)();
PROBLEM CONCLUSION:
Fixed May 2002 PTF
------
APAR: IY27682 COMPID: 5765E2600 REL: 502
ABSTRACT: -QINLINE OPTION CAUSES SEGMENTATION FAULT AT RUNTIME.
PROBLEM DESCRIPTION:
The attached testcase produces a segmentation fault at runtime
when compiled with the option "-qinline". If it is compiled
without the option, it seems to be OK.
If it is compiled using VisualAge C++ 5.0.2.0, then it seems to
be OK, with or without the option. It appears to be a regression.
It fails with 5.0.2.1, 5.0.2.2, and the daily latest driver on DFS.
TESTCASE:
************************ testcase.C *****************************
#include <iostream.h>
typedef long temp[3][3];
typedef long temp_slice[3];
class tempBox;
class tempBox_var;
typedef tempBox* tempBox_ptr;
temp_slice* _temp_alloc()
{
return new temp;
}
void _temp_free(temp_slice* _data)
{
if (_data)
delete[] _data;
}
class temp_var
{
private:
temp_slice *_ptr;
public:
temp_var() : _ptr((temp_slice *)NULL) {}
~temp_var() { _temp_free(_ptr); }
temp_var& operator=(temp_slice* _slice)
{
_ptr = _slice;
return *this;
}
temp_slice& operator[](long _index)
{
return _ptr[_index];
}
};
class tempBox_var
{
private:
tempBox_ptr _ptr;
public:
tempBox_var() : _ptr((tempBox_ptr)NULL) {}
~tempBox_var() { delete _ptr; }
tempBox_var(tempBox_ptr p) : _ptr(p) {}
tempBox_var& operator=(tempBox_ptr);
operator tempBox*() const { return _ptr; }
};
class tempBox
{
public:
tempBox() { _temp = _temp_alloc(); }
~tempBox() {}
temp_slice& operator[](long _index)
{
return _temp[_index];
}
private:
temp_var _temp;
};
tempBox_var& tempBox_var::operator=(tempBox_ptr _p)
{
_ptr = _p;
return *this;
}
int main(int argc, char* const* argv)
{
cout << "***** Start of test *****" << endl;
tempBox_var testBox = new tempBox();
cout << "Setting values..." << endl;
(*testBox)[0][0] = 23L;
(*testBox)[0][1] = 3L;
(*testBox)[0][2] = 55L;
(*testBox)[1][0] = 556L;
(*testBox)[1][1] = 22L;
(*testBox)[1][2] = 6L;
(*testBox)[2][0] = 99L;
(*testBox)[2][1] = 33L;
(*testBox)[2][2] = 2L;
cout << "Values are set, reading back..." << endl;
for (int k = 0 ; k < 3 ; k++)
for (int l = 0 ; l < 3 ; l++)
cout<<"(*testBox)["<<k<<"]["<<l<<"] : "<<(*testBox)[k][l]<<endl;
cout << "***** End of test *****" << endl;
}
************************ testcase.C *****************************
COMPILE COMMAND:
xlC -qinline testcase.C
EXPECTED RESULTS:
$ ./a.out
***** Start of test *****
Setting values...
Values are set, reading back...
(*testBox)[0][0] : 23
(*testBox)[0][1] : 3
(*testBox)[0][2] : 55
(*testBox)[1][0] : 556
(*testBox)[1][1] : 22
(*testBox)[1][2] : 6
(*testBox)[2][0] : 99
(*testBox)[2][1] : 33
(*testBox)[2][2] : 2
***** End of test *****
$
ACTUAL RESULTS:
$ ./a.out
***** Start of test *****
Setting values...
Segmentation fault
$
------
APAR: IY27710 COMPID: 5765B7300 REL: 520
ABSTRACT: SHARED MEMORY SEGMENT MAY BE DELETED BY OTHER QUEUE MANAGER
PROBLEM DESCRIPTION:
IY16092 attempted to fix an MQ failure that occurs when AIX
ftok() generates a duplicate key. However, MQ still has a hole
that may cause it to delete a shared memory segment that is
owned by another queue manager. As a result, the queue manager
produces an FDC for XC212018 in component xstConectSegmentViaID,
which received a non-zero return code from shmat. This failure
actually occurred under MQ 5.1 CSD06 or CSD07.
LOCAL FIX:
Define all queue managers on the same disk to prevent ftok from
generating duplicate keys.
PROBLEM CONCLUSION:
This problem has been fixed and the fix will be shipped in the
following PTFs:
A) MQSeries for V5.1 CSD08
AIX U474841
HP-UX (V10) U474877
HP-UX (V11) U474879
Sun Solaris U474878
B) MQSeries for V5.2 CSD05
AIX U481481
HP-UX (V10) U481510
HP-UX (V11) U481511
Linux U481513
Sun Solaris U481514
------
APAR: IY28200 COMPID: 5765E2600 REL: 502
ABSTRACT: ICE WITH -QMAXERR=1:W
PROBLEM DESCRIPTION:
testcase:
int f() {
}
void main() {
}
compile:
$ xlC -v t.C -qmaxerr=1:w -c
The ICE goes away when the -qmaxerr option is removed.
This does not occur on any level before 011221 (4Q2001 PTF).
This is a regression.
------
APAR: IY28235 COMPID: 5765E2600 REL: 502
ABSTRACT: INVALID "INTEGRAL CONSTANT EXPRESSION" ERROR MESSAGE WHEN
PROBLEM DESCRIPTION:
Problem: The customer's testcase produces an error stating that
the "expression must be an integral constant expression",
however the array declaration *is* an integral constant
expression. The testcase should compile.
Error message:
"test.cpp", line 11.20: 1540-0016 (S) The expression must be an
integral constant expression
Testcase (compile using xlC test.cpp) :
int main()
{
// Test 1 compiles
typedef unsigned short test1;
const test1 sizeArray1 = 10;
int array1[sizeArray1];
// Test 2 fails to compile
typedef const test1 test2;
test2 sizeArray2 = 10;
int array2[sizeArray2];
// Test 3 compiles
typedef test1 test3;
const test3 sizeArray3 = 10;
int array3[sizeArray3];
// Test4 compiles
typedef const test1 test4;
const test4 sizeArray4 = 10;
int array4[sizeArray4];
}
Using Test 2 as an example, I expect that sizeArray2 is a
constant integer. However, the compiler produces the error
message. It seems that the type 'test2' ends up being of type
"test1" instead of "const test1".
Here are some sections taken from the standard that state that
C++ requires expressions to evaluate to an integral constant:
5.3.4 New (paragraphs 1 and 6)
5.19 Constant expressions (paragraph 1)
8.3.4 Arrays (paragraph 1)
LOCAL FIX:
use one of the methods described in testcase (test 1, 3, or 4)
PROBLEM CONCLUSION:
Added a fix to respect normalized types.
------
APAR: IY28715 COMPID: 5765D5101 REL: 121
ABSTRACT: NETMON NEEDS TO FLUSH ARP CACHE WITH GB ETHER ADAPTERS
PROBLEM DESCRIPTION:
netmon needs to flush arp cache with gb ether adapters
PROBLEM SUMMARY:
In an HACMP/ES environment, there have been situations
where unexpected adapter events occurred, usually right
after the cluster was started or after a takeover
operation. The problem has been seen more often with
Gigabit ethernet adapters, but may occur with some other
kinds of adapters as well.
The problem is caused by ethernet switches sometimes
delaying responses to ARP requests. This leads RSCT
Topology Services to mark an adapter as down when
actually there exists only a short term outage. The
flushing of the ARP cache by HACMP (to account for
changes in the adapters' IP addresses) is triggering the
situation where ARP requests need to be sent by AIX.
PROBLEM CONCLUSION:
A fix has been introduced into RSCT Topology Services.
With the fix, the subsystem's adapter detection logic
will attempt to flush an incomplete ARP cache entry for
a destination of an "echo request" packet. Flushing the
ARP cache entry forces the operating system to send a new
ARP request the next time a packet is sent to the same
destination.
With the fix, false adapter events caused by delays in
obtaining responses to ARP requests should no longer occur.
------
APAR: IY28936 COMPID: 5765D5101 REL: 121
ABSTRACT: HAES MULTIPLE UNEXPECTED ADAPTER SWAPS
PROBLEM DESCRIPTION:
In an haes cluster where only 1 node remains, unexpected adapter
swaps can occur. These swaps can usually be prevented by using
the netmon.cf file. This APAR is being opened to handle the
unexpected adapter swaps and minimize the need for the netmon.cf
file.
PROBLEM SUMMARY:
The problem usually occurs in a 2-node HACMP/ES cluster
that adopts a cascading takeover policy. When one of
the nodes fails or is shut down, the resources on that
node are taken over by the remaining node, but at that
point the remaining node starts producing "network_down"
events due to apparent failures in the service adapter.
With the second node being down, the first node can only
count on the network traffic generated by either other
local adapters or other nodes not in the cluster. Without
such incoming traffic, RSCT Topology Services (which is
used in HACMP/ES for adapter/network liveness determination)
will indicate that the local adapter is down.
Though Topology Services will explicitly generate traffic
from other adapters in the cluster into the adapter under
test, the subsystem currently does not use adapters
that are configured with cascading takeover addresses. If
the node only has one service and one standby adapter, and
the standby adapter takes over the second node's address,
Topology Services is left with no local adapter to
exercise the service adapter. This results in false
adapter down events, unless external (client) traffic can
be produced into the service adapter.
As long as IP traffic keeps flowing into the service
adapter, the adapter will still be marked as up. On the
other hand, without such IP traffic (for example, the
cluster is idle), the adapter may be flagged as down.
Therefore, to work around this problem, addresses of
adapters external to the cluster can be added to the
/usr/sbin/cluster/netmon.cf file.
PROBLEM CONCLUSION:
A fix was introduced into the RSCT Topology Services
subsystem. With the fix, the use of the
/usr/sbin/cluster/netmon.cf file to help create traffic
into the service adapter should no longer be needed for a
2-node HACMP/ES cluster using a cascading takeover policy.
Even after a cascading takeover occurs, the service
adapter on the remaining node will still be considered
up as desired. To achieve this goal, Topology Services
will use other local adapters to exercise traffic into
the service adapter when needed.
------
APAR: IY29151 COMPID: 5765D5100 REL: 340
ABSTRACT: RPOOL LEAK
PROBLEM DESCRIPTION:
Under certain traffic loads, the IP/CSS protocol can lose track
of rpool. This can lead to a condition where the node cannot
receive large IP/CSS packets.
LOCAL FIX:
reboot the node
PROBLEM SUMMARY:
Under certain traffic loads, the IP/CSS protocol
can lose track of rpool. This can lead to a
condition where the node can not
receive large IP/CSS packets.
PROBLEM CONCLUSION:
RPool Leak has been fixed.
------
APAR: IY29221 COMPID: 5765E3200 REL: 502
ABSTRACT: INTERNAL COMPILER ERROR.
PROBLEM DESCRIPTION:
V502:Internal Compiler Error:
xlC -c -v t.C:
$ xlC -c -v target.C
exec: /.../torolab.ibm.com/fs/projects/vabld/run
/tuscany/502/aix/daily/020305/exe/xlCentry
-D_AIX,-D_AIX32,-D_AIX41,-D_AIX43,-D_IBMR2,
-D_POWER,-qansialias,-otarget.o,target.C,
/tmp/x,/tmp/xlcW1l6ywyb,/dev/null,target.lst,
/dev/null,/tmp/xlcW2l6ywyc,NULL)
"target.C", line 50.59: 1540-0216 (W)
An expression of type "std::_Ptrit<CascadeAtom,
long,m *,CascadeAtom &,CascadeAtom *,CascadeAtom &>"
cannot be converted to "CascadeAtom *".
exec: /.../torolab.ibm.com/fs/projects/vabld/run
/tuscany/502/aix/daily/020305/exe/xlCcode(ansialias,
/tmp/xlcW0l5ywya,/tmp/xlcW1l6ywyb,target.o,target.lst,
/tmp/xlcW2l6ywyc,NULL)
/.../torolab.ibm.com/fs/projects/vabld/run/tuscany/
502/aix/daily/020305/bin/.orig/xlC: 1501-230
Internal compiler error; please contact your Service
Representative
unlink: /tmp/xlcW0l5ywya
unlink: /tmp/xlcW1l6ywyb
unlink: /tmp/xlcW2l6ywyc
$
Fails with dev, 5020, and 502 daily latest.
testcase:
#include <vector>
namespace utmar{
template <class T>
class vector {
public:
T x, y, z; // vector values
vector() { x = y = z = 0.0; }
vector(const T &a, const T &b, const T &c)
{ x = a; y = b; z = c;}
};
}
typedef utmar::vector<double> dvect ;
class Atom {
public:
Atom();
double uppertable;
double upperlownrg;
int species;
};
class CascadeAtom {
public:
int species;
dvect sourcepos;
double uppertable;
double upperlownrg;
double ximin;
int primary;
int rare[2];
int karma;
dvect pos;
dvect dir;
dvect refpos;
int reflist;
double energy;
double weight;
CascadeAtom(int karma, int atom_type, int species, dvect pos,
int reflist, double energy, double
ximin, double weight, const int rare[2],
int prim=0,
double upper=200000.0,
double upperlow=0.0, dvect sourcepos=dvect(0,0,0));
};
class CascadeAtomHeap{
std::vector<CascadeAtom *>
casc_atom_ptr_vec;
std::vector<CascadeAtom>
casc_atom_vec;
public:
CascadeAtomHeap();
CascadeAtom *
AppendAtom(int karma, int atom_type,
int species, dvect pos, dvect dir,
dvect refpos,int reflist, double energy,
double ximin, double weight,
const int rare[2], int prim=0,
double upper=200000.0, double upperlow=0.0,
dvect sourcepos=dvect(0,0,0)){
CascadeAtom ca(karma, atom_type,
species, pos, dir, refpos,
reflist, energy, ximin,
weight, rare, prim,
upper, upperlow, sourcepos);
return (CascadeAtom *)
casc_atom_vec.insert(casc_atom_vec.end(), ca);
}
};
class monte_carlo {
public:
int target(const CascadeAtom * current_atom_ptr,
int idp, int nsim, double ep);
int lat[1];
int targ_site_type[1];
Atom *atom;
CascadeAtomHeap * casc_atom_heap;
};
int monte_carlo::target(const CascadeAtom *
current_atom_ptr, int , int nsim,
double scaled_energy)
{
int i=0;
double targ_en=0, xim=0;
dvect pos, dir, refpos;
// THIS LINE CAUSES THE PROBLEM
casc_atom_heap->AppendAtom(1050, lat[i],
atom[lat[i]].species,
pos, dir, refpos, targ_site_type[i],
targ_en,
xim, current_atom_ptr->weight,
current_atom_ptr->rare, 0,
atom[lat[i]].uppertable,
atom[lat[i]].upperlownrg,pos);
return 0;
}
LOCAL FIX:
A workaround is to receive the return value of the function.
PROBLEM SUMMARY:
small testcase:
struct S
{
int i;
};
S foo()
{
S s = {1};
return s;
}
S* bar()
{
return (S*) foo();
}
------
APAR: IY29299 COMPID: 5765E2600 REL: 502
ABSTRACT: INTERNAL COMPILER ERROR
PROBLEM DESCRIPTION:
- Internal Compiler Error in xlCcode when compiling the
testcase.
TESTCASE:
****************************** t.C ***********************
#pragma weak Register_
extern "C" int Register_(void* fobject)
{
return 0;
}
****************************** t.C ***********************
COMPILE SCRIPT:
xlC -c t.C
EXPECTED RESULTS:
successful compile with object file being generated
ACTUAL RESULTS:
results of xlC -c -v t.C :
~~~~~~~~~~~~~~~~~~~~~~~~~
jromano
dcelogin:/:/projects/vacsup/jromano/pmr/32187> make
xlC -c t.C -v
exec:
/.../torolab.ibm.com/fs/projects/vabld/run/tuscany/502/aix/daily
/020319/ex
e/xlCentry(xlCentry,-D_AIX,-D_AIX32,-D_AIX41,-D_AIX43,-D_IBMR2,-
D_POWER,-qansial
ias,-ot.o,t.C,/tmp/xlcW0dFjPUa,/tmp/xlcW1dFjPUb,/dev/null,t.lst,
/dev/null,/tmp/x
lcW2dFjPUc,NULL)
exec:
/.../torolab.ibm.com/fs/projects/vabld/run/tuscany/502/aix/daily
/020319/ex
e/xlCcode(xlCcode,-qansialias,/tmp/xlcW0dFjPUa,/tmp/xlcW1dFjPUb,
t.o,t.lst,/tmp/x
lcW2dFjPUc,NULL)
/.../torolab.ibm.com/fs/projects/vabld/run/tuscany/502/aix/daily
/020319/bin/.ori
g/xlC: 1501-230 Internal compiler error; please contact your
Service Representat
ive
unlink: /tmp/xlcW0dFjPUa
unlink: /tmp/xlcW1dFjPUb
unlink: /tmp/xlcW2dFjPUc
make: The error code from the last command is 40.
Stop.
jromano
dcelogin:/:/projects/vacsup/jromano/pmr/32187>
results of xlC -c -v t.C -qdebug=except :
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
jromano
dcelogin:/:/projects/vacsup/jromano/pmr/32187> make
xlC -c t.C -v -qdebug=except
exec:
/.../torolab.ibm.com/fs/projects/vabld/run/tuscany/502/aix/daily
/020319/ex
e/xlCentry(xlCentry,-D_AIX,-D_AIX32,-D_AIX41,-D_AIX43,-D_IBMR2,-
D_POWER,-qansial
ias,-qdebug=except,-ot.o,t.C,/tmp/xlcW0wcjQqa,/tmp/xlcW1wcjQqb,/
dev/null,t.lst,/
dev/null,/tmp/xlcW2wcjQqc,NULL)
exec:
/.../torolab.ibm.com/fs/projects/vabld/run/tuscany/502/aix/daily
/020319/ex
e/xlCcode(xlCcode,-qansialias,-qdebug=except,/tmp/xlcW0wcjQqa,/t
mp/xlcW1wcjQqb,t
.o,t.lst,/tmp/xlcW2wcjQqc,NULL)
EXCEPTION terminates program: no handler for exception.
Exception of type: TRAP at 1043054C. Regs:
000000A9 2FF220C0 301B4200 00000000 00000009 A0000000 00000000
00000000
00000053 00000059 00000000 00000000 28222480 00000008 2FF222FC
2FF22320
DEADBEEF A00011D0 3001F2F8 30015668 3016CE50 00000002 00010938
00000008
300347CC 3EDC8980 1042BE6C 10467220 00000001 00000000 A00011D0
00000000
resume: cr: lr: ctr: xer: fpscr: msr:
mq:
10430550 28222480 1042FC84 1042BE80 00000007 00002000 0000D032
00000000
Trap condition: 0 = 0 arithmetic compare
Trap instruction: TI 4,R03,0
Traceback:
Line ? Disp 00000064
Dsa: 2FF220C0
Line ? Disp 00003E00
Dsa: 2FF22100
Line ? Disp 00000130
Dsa: 2FF221A0
Line ? Disp 000000DC
Dsa: 2FF22230
--- End of call chain ---
unlink: /tmp/xlcW0wcjQqa
unlink: /tmp/xlcW1wcjQqb
unlink: /tmp/xlcW2wcjQqc
make: The error code from the last command is 217.
Stop.
jromano
dcelogin:/:/projects/vacsup/jromano/pmr/32187>
------
APAR: IY29384 COMPID: 5765E2600 REL: 500
ABSTRACT: LINK ERROR: MUNCH: ERROR READING INPUT FILE /XXX.O
PROBLEM DESCRIPTION:
The customer uses files called aligns to deal with the issue of
turning Fortran common blocks into shared memory. These aligns
amount to linker directive files, named like align.o, that are
linked against. A compile and link may look like:
cc -o send.tsk -bmaxdata:0x50000000 /align_1.o /align_2.o
send.ibm.o -L/source/lib -ldbutil -lpeutil
align.1 is a text file beginning with #! and then consisting of
lines of the format "symbol address". The customer then pegs
their common blocks to fixed locations and puts shared
memory in their place.
This works using the C compiler or the FORTRAN compiler.
However, using the C++ compiler to link creates the error:
munch: Error reading input file /align_1.o
PROBLEM CONCLUSION:
A new request for munch to accept .o format
------
APAR: IY29397 COMPID: 5765E2600 REL: 502
ABSTRACT: SIGNAL 11 IN BE USING -G
PROBLEM DESCRIPTION:
There is a signal 11 in xlCcode when compiling with -g.
Summary of analysis:
-attached dbx to xlCcode to see where it failed (see below)
-the core file is empty
-reproduced on an AIX 51 64-bit test machine
-I tried the latest compilers using the complete compile line:
xlC_r -q64 -qtbtable=full -qhalt=e -+ -g -qnoinline -c
testcase.i
5.0.2.0 fails immediately
5.0.2.1 fails immediately
build 011221 (4Q2001 PTF) fails after 20 minutes (why does
this compiler take 20 minutes to fail when previous takes 10
seconds?)
build 020305 fails after 20 minutes
-I removed options, and found: xlC -g -c testcase.i is the
smallest compile command to reproduce the signal 11
-if I remove debug option (-g), 020305 and 011221 (4Q2001)
compiler passes *but* 5.0.2.0 and 5.0.2.1 (3Q2001) fails with
signal 11
-compiling with xlC -g -c testcase.i causes the signal 11 and
the following 2 warnings:
"testcase.i", line 51829.31: 1540-0840 (W) The integer
literal "0x100000000" is out of range.
"testcase.i", line 51847.31: 1540-0840 (W) The integer
literal "0x100000000" is out of range.
LOCAL FIX:
compile without -g.
PROBLEM CONCLUSION:
The wcode generated by the frontend was bad.
The compiler developers corrected it.
TEMPORARY FIX:
compile without debug information
------
APAR: IY29421 COMPID: 5765E2600 REL: 502
ABSTRACT: ICE IN C++ FE
PROBLEM DESCRIPTION:
There is an Internal Compiler Error using the latest compiler
build. Here is the testcase:
...
namespace ncbi { using namespace std; }
namespace std {
template<class _St>
class fpos { static _St _Stz; };
template<class _E>
struct char_traits {};
template<class _T>
class allocator;
class ios_base;
};
namespace std {
class _String_base {};
template<class _Ty, class _A>
class _String_val : public _String_base {};
template<class _E,
class _Tr = char_traits<_E>,
class _Ax = allocator<_E> >
class basic_string : public _String_val<_E, _Ax> {};
typedef basic_string<char, char_traits<char>, allocator<char> >
string;
};
namespace std {
class CObject
{
virtual ~CObject(void);
};
}
namespace ncbi {
class CNCBINode : public CObject {};
}
namespace ncbi {
class CNCBINode;
struct BaseTagMapper {};
template<class C>
struct TagMapper : public BaseTagMapper
{
TagMapper(CNCBINode* (C::*method)(void));
virtual CNCBINode* MapTag(CNCBINode* _this, const string&
name) const;
CNCBINode* (C::*m_Method)(void);
};
template<class C>
CNCBINode* TagMapper<C>::MapTag(CNCBINode* _this, const string&)
const
{
return (dynamic_cast<C*>(_this)->*m_Method)();
}
}
namespace ncbi {
class CHTMLBasicPage: public CNCBINode {};
class CHTMLPage : public CHTMLBasicPage {};
template struct TagMapper<CHTMLPage>;
}
...
Compile using xlC t.cpp
PROBLEM SUMMARY:
same as submitter's text
PROBLEM CONCLUSION:
the fix is for the object model to create
a runtime function with return type pointer to target class
instead of void*.
TEMPORARY FIX:
Move the out of line member function definition
inline.
------
APAR: IY29448 COMPID: 5765D5100 REL: 340
ABSTRACT: SPSWITCH2 DEVICE DRIVER MISSING CM-UNREGISTER FUNCTIONALITY.
PROBLEM DESCRIPTION:
SPSwitch2 device driver missing CM-unregister functionality on
window close. This can lead to a reference to freed kernel
memory and a node crash.
PROBLEM SUMMARY:
The device driver doesn't deregister with the pseudo device
driver for CM updates. Missing cadd_CM_deregistration.
PROBLEM CONCLUSION:
Add cadd_CM_deregistration function. It is called when the
window is closed or in state change handler when thread is
to be terminated.
------
APAR: IY29529 COMPID: 5765E6900 REL: 310
ABSTRACT: STARTD DRAIN EMAIL WHEN DRAIN_ON_SWITCH_TABLE_ERROR NOT TRUE
PROBLEM DESCRIPTION:
When DRAIN_ON_SWITCH_TABLE_ERROR = false or is not set,
ACTION_ON_SWITCH_TABLE_ERROR still takes place, but the admin
should not receive email stating that the node in question is
being drained. The message stating that there was a switch table
error is valid and should be the only message in the email to
the admin.
PROBLEM SUMMARY:
In LoadLeveler 3.1,
when DRAIN_ON_SWITCH_TABLE_ERROR is false
but ACTION_ON_SWITCH_TABLE_ERROR is set,
the email sent to the admin should not
contain the statement that the node is going
to be drained.
PROBLEM CONCLUSION:
In LoadLeveler 3.1,
when DRAIN_ON_SWITCH_TABLE_ERROR is false
and ACTION_ON_SWITCH_TABLE_ERROR is set,
the email sent to the admin will not contain
the statement that the node is going to be drained.
------
APAR: IY29560 COMPID: 5765D5100 REL: 340
ABSTRACT: PSSP SUPPORT FOR P690/P670 USING THE SP SWITCH
PROBLEM DESCRIPTION:
pssp support for p690/p670 using the sp switch
------
APAR: IY29872 COMPID: 5765E8500 REL: 200
ABSTRACT: ARP ENTRY NOT CREATED FOR X25 ADAPTER
PROBLEM DESCRIPTION:
The command "arp hostname" will fail with the latest
bos.net.tcp.client installed.
PROBLEM SUMMARY:
Arp table structure changes caused this failure.
------
APAR: IY29930 COMPID: 5765E8200 REL: 230
ABSTRACT: FIXES TO GEOMIRROR DEVICE CONFIGURATION SCALING PROBLEMS
PROBLEM DESCRIPTION:
For HAGEO configurations with a large number of GeoMirror
devices the HACMP failover processing performance can be
very slow. The worst performance occurs when the nodes
at the remote site are powered off, and in these cases
the user sees many occurrences of the message:
Unable to contact address: <Remote site IP address>
Some users have tried to work around this problem by
configuring many GeoMirror devices in parallel, and this
uncovered another problem, which results in the error
message:
<Command>: Failed getting minor number for <Device Name>
This secondary problem exists in both HAGEO and GeoRM.
PROBLEM CONCLUSION:
For HAGEO users the GeoMirror device configuration commands
have been changed so that they no longer try to contact
IP addresses that HACMP determines to be unreachable. This
significantly improves performance when remote site nodes
are powered off.
For both HAGEO and GeoRM users a device configuration locking
error has been fixed.
------
APAR: IY29951 COMPID: 5765E8500 REL: 200
ABSTRACT: UNDO DEFECT 358122.
PROBLEM DESCRIPTION:
Defect 3588122 was created to back out certain features
of the microcode that were not included in version 200.
Now, this defect will put these features back.
PROBLEM SUMMARY:
Defect 3588122 was created to back out certain features
of the microcode that were not included in version 200.
Now, this defect will put these features back.
PROBLEM CONCLUSION:
Defect will put back previous backout.
------
APAR: IY29957 COMPID: 5765D5100 REL: 340
ABSTRACT: "TRY_TBIC_DEV" NOT INITIALIZED (BREAKS WRAPPED PORT CHECKING)
PROBLEM DESCRIPTION:
(In phase 1, and possibly other Worm paths), the field
"try_tbic_dev" is not initialized when the Worm is checking
a presumed wrapped port for a miswire (i.e., a node, versus a
wrap plug.) Due to this, the correct E/S response from the chip
is not recognized by ResponseWaitAndReceive(). There is also a
logging problem (in this same code block) because the field
"att_dev_id" is not initialized to the current device;
specifically, the "route to" and "route from" messages to the
print file call out the wrong target device ID.
LOCAL FIX:
None. The effect of both of these problems is minimal.
One is simply a logging problem. The other (wrap plug checking)
is not a concern (assuming the port is really wrapped, as
specified in the topology file), because the port will be
marked as wrapped when there is no response from the target
chip.
PROBLEM SUMMARY:
In phase 1 processing, during an Estart, the Worm for the
SP Switch2 can log incorrect messages. In an attempt to
identify all possible miswires, the Worm will try to query
a port that it believes to be wrapped. The response that
the Worm usually gets back is not handled properly.
PROBLEM CONCLUSION:
The Worm will now correctly handle an Error/Status packet
that comes back as a result of doing a query on a port that
is presumed to be wrapped.
------
APAR: IY29958 COMPID: 5765E8500 REL: 200
ABSTRACT: X.25 NPI VIRTUAL CIRCUITS CAN BE LOST,WATERMARK TUNING REQUIRED
PROBLEM DESCRIPTION:
By trying to stress the NPI driver by sending a high amount
of data where NPI cannot keep up with the amount of transmits
queued to it, Virtual Circuits can be lost.
LOCAL FIX:
Decrease the amount of stress put on the NPI driver.
PROBLEM SUMMARY:
Changes were made to STREAMS watermark values to prevent the
STREAM from getting clogged up.
PROBLEM CONCLUSION:
Changes were made to STREAMS watermark values to prevent the
STREAM from getting clogged up.
TEMPORARY FIX:
Decrease the amount of data being sent at one time with
X.25.
------
APAR: IY29959 COMPID: 5765E8500 REL: 200
ABSTRACT: X25FRAME:DEFAULT PHYS LAYER CONNECTION MODE WRONG ON SOME LINES
PROBLEM DESCRIPTION:
Default physical layer connection mode in fr_lprof_t is
incorrect for some lines.
PROBLEM CONCLUSION:
Correct the default physical layer connection mode in
fr_lprof_t for all lines.
------
APAR: IY30051 COMPID: 5765D5100 REL: 340
ABSTRACT: SYSLOGD NOT RESTARTED BY CLEANUP.LOGS.NODES
PROBLEM DESCRIPTION:
The syslogm routine in logmgt.cmds, called by cleanup.logs.nodes
(via psyslclr) stops syslogd after trimming a log. If trimming
another log results in an error, the routine exits without
restarting syslogd.
PROBLEM SUMMARY:
When psyslclr is invoked to trim multiple logs and it
successfully trims the first log, but does not have enough
space to trim subsequent logs, syslogd is stopped
but not restarted.
PROBLEM CONCLUSION:
Modified code in logmgt.cmds so that if syslogd is stopped,
it is always restarted.
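The "always restarted" rule in the conclusion above can be illustrated with a short sketch. This is not the logmgt.cmds code; trim_logs and the callables passed to it are hypothetical stand-ins, and the sketch only shows the general stop / work / always-restart pattern.

```python
def trim_logs(logs, trim, stop_daemon, start_daemon):
    """Trim each log with the daemon stopped; always restart it afterwards."""
    stop_daemon()
    try:
        for log in logs:
            trim(log)          # may raise, e.g. when space runs out
    finally:
        start_daemon()         # runs on success and on failure alike


# Usage with stand-in callables that only record what happened.
events = []


def fake_trim(log):
    if log == "second.log":
        raise OSError("not enough space to trim " + log)
    events.append("trim " + log)


try:
    trim_logs(["first.log", "second.log"], fake_trim,
              lambda: events.append("stop"), lambda: events.append("start"))
except OSError:
    events.append("error")
```

Even though trimming the second log fails, the restart step is recorded before the error reaches the caller.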
------
APAR: IY30107 COMPID: 5765E8200 REL: 230
ABSTRACT: GMDSIZING REPORTING NEGATIVE NUMBERS
PROBLEM DESCRIPTION:
The customer is running the gmdsizing command for long intervals
and the report produced is showing negative numbers.
PROBLEM SUMMARY:
gmdsizing may report negative values for disk activity
during an interval if the iostat setting for sys0 is
set to true during the interval.
PROBLEM CONCLUSION:
gmdsizing was modified to handle changes to the iostat
setting for sys0. If data for an interval can not be
determined, this will be indicated by a new message
instead of having invalid data displayed.
------
APAR: IY30227 COMPID: 5765B9500 REL: 150
ABSTRACT: MMCRLV FAILS YET RETURNS ERROR CODE OF 0
PROBLEM DESCRIPTION:
mmcrlv fails yet returns error code of 0
PROBLEM SUMMARY:
mmcrlv and mmcrvsd updated to ensure a nonzero
return code on a failure related to an hdisk already being part of a VG.
PROBLEM CONCLUSION:
mmcrlv and mmcrvsd: fix bug in which
conditional call to unlockSDR() clobbered rc
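A minimal sketch of how a cleanup call can clobber a return code, assuming a simplified flow. The functions below are hypothetical models, not the mmcrlv source; unlock_sdr merely stands in for the unlockSDR() call named above.

```python
def unlock_sdr():
    """Stand-in for the cleanup step; returns 0 when the unlock succeeds."""
    return 0


def create_volume_buggy(disk_in_use, locked=True):
    rc = 1 if disk_in_use else 0   # 1 = creation failed
    if locked:
        rc = unlock_sdr()          # bug: overwrites the failure code with 0
    return rc


def create_volume_fixed(disk_in_use, locked=True):
    rc = 1 if disk_in_use else 0
    if locked:
        unlock_rc = unlock_sdr()   # keep the cleanup status separate
        if rc == 0:
            rc = unlock_rc         # report a cleanup failure only if nothing failed earlier
    return rc
```

With disk_in_use=True the buggy variant returns 0 and the fixed variant returns 1.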
------
APAR: IY30280 COMPID: 5765D5100 REL: 311
ABSTRACT: RC.SP SETS THE WRONG BOOTLIST IF TOTAL BOOTDISKS NOT
PROBLEM DESCRIPTION:
rc.sp sets the wrong bootlist if total bootdisks
not equivalent to total install disks
PROBLEM SUMMARY:
On the reboot of a node, the bootlist was being reset to
include all of the physical volumes listed for the selected
volume group of the node. Even the physical volumes that
did not contain boot logical volumes were included in
the bootlist. If there was a high number of physical
volumes it could cause a subsequent reboot to fail.
PROBLEM CONCLUSION:
spboot, which is called by /etc/rc.sp, was modified to only
set the bootlist to physical volumes that contain boot
logical volumes.
------
APAR: IY30342 COMPID: 5697E3000 REL: 230
ABSTRACT: JKIT.WNN7.BASE 2.3.0.1 WITH NEW DPKEYLIST
PROBLEM DESCRIPTION:
Shipping new dpkeylist.
LOCAL FIX:
Change for jkit.Wnn7.base 2.3.0.1.
------
APAR: IY30343 COMPID: 5765D5100 REL: 340
ABSTRACT: PSSP SUPPORT FOR THE P690/P670 USING THE SP SWITCH2 IN A
PROBLEM DESCRIPTION:
pssp support for p690/p670 using the sp switch2 in a
single-plane environment
PROBLEM SUMMARY:
pssp support for the p690/p670 using the
sp switch2 in a single-plane environment
------
APAR: IY30345 COMPID: 5765D5100 REL: 340
ABSTRACT: PSSP SUPPORT FOR P690/P670 IN A SWITCHLESS ENVIRONMENT
PROBLEM DESCRIPTION:
pssp support for p690/p670 in a switchless environment
PROBLEM SUMMARY:
PSSP support for switchless environments
------
APAR: IY30351 COMPID: 5697E3000 REL: 230
ABSTRACT: BUGFIX FOR JKIT.WNN7.BASE 2.3.0.2
PROBLEM DESCRIPTION:
There are AIX defects for Wnn7.
LOCAL FIX:
Change to jkit.Wnn7.base 2.3.0.2 with bug fixes.
------
APAR: IY30354 COMPID: 5765D5100 REL: 311
ABSTRACT: PMAN ARRAY LIMIT MEANS THAT WHEN AN EVENT HAPPENS, A MESSAGE MAY
PROBLEM DESCRIPTION:
Pman internal array default of 16 adapters per node may not be
enough and can overwrite the pman definitions, causing the
nodes not to see any pman definitions!
LOCAL FIX:
pmand uses an internal array to read the SDR Adapter info. into.
This array is hard coded to 16 members (for each node).
If you have more than 16, it writes beyond the end of the
array, stepping on the PMAN_Subscription variable.
The array size has been set to 32 in a new version of pmand.
Until this new pmand is used, try to reduce the amount of SDR info.
PROBLEM SUMMARY:
The code uses the PMAN_Subscription variable to remember if
the SDR file is the new PMAN_Subscription file or the old
pmandConfig file.
Because the variable got stepped on when the array
overflowed, the code was incorrectly looking for a
pmandConfig file. The result is it does not find
any events, because it is looking in the wrong
place for them.
PROBLEM CONCLUSION:
Increased the size of the array (number of adapters per
node) from 16 to 64. This will prevent the array from
being overrun and the PMAN_Subscription variable from
getting stepped on.
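The overrun described above can be modelled without touching real memory. In this sketch a flat Python list stands in for adjacent storage: 16 array slots followed by the flag. The function names and the layout are assumptions for illustration only; pmand itself is not shown.

```python
SLOTS = 16                      # per-node adapter slots, as in the APAR text


def load_adapters_unchecked(memory, adapters):
    """Copy adapters into memory with no bounds check (models the defect)."""
    for i, adapter in enumerate(adapters):
        memory[i] = adapter     # index 16 lands on the neighbouring flag


def load_adapters_checked(memory, adapters, slots=SLOTS):
    """Refuse input that does not fit instead of writing past the array."""
    if len(adapters) > slots:
        raise ValueError("too many adapters: %d > %d" % (len(adapters), slots))
    memory[:len(adapters)] = adapters


# One flat list models adjacent memory: 16 array slots, then the flag.
memory = [0] * SLOTS + ["PMAN_Subscription"]
load_adapters_unchecked(memory, list(range(1, 18)))   # 17 adapters
flag_after_overflow = memory[SLOTS]                   # the flag has been replaced
```

Enlarging the array, as the fix does, moves the limit; a bounds check like the one in load_adapters_checked is a complementary safeguard and is not claimed to be part of the shipped change.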
------
APAR: IY30358 COMPID: 5765D5101 REL: 121
ABSTRACT: NETMONADAPTERHEALTH - LOOK FOR NETMON.CF IN EITHER /USR/SBIN/CLU
PROBLEM DESCRIPTION:
Have netmonAdapterHealth look for the netmon.cf file in either the
/usr/sbin/cluster OR /usr/es/sbin/cluster directory.
Customers have run into a lot of problems because the file may be
in the HACMP classic directory (/usr/sbin/cluster) rather than the
HACMP/ES directory (/usr/es/sbin/cluster).
This APAR will check in both places, since customers can legitimately
migrate from HACMP Classic to HACMP/ES but forget to
move the netmon.cf file to the correct ES directory.
LOCAL FIX:
The problem can be circumvented by creating a
/usr/sbin/cluster/netmon.cf file.
PROBLEM SUMMARY:
Customers in a two-node HACMP/ES environment
were experiencing problems with topsvcs
false node downs because they were placing
the netmon.cf file in the wrong directory.
hats_nim of topsvcs was changed to check
for the existence of the netmon.cf file first
in the /usr/es/sbin/cluster directory and, if not
there, then in the /usr/sbin/cluster directory.
PROBLEM CONCLUSION:
The nim function of topsvcs now checks
to see if there is a netmon.cf file
first in /usr/es/sbin/cluster directory
and if not then checks in /usr/sbin/cluster
directory.
nmDiag.nim.topsvcs.<adapter>.<clustername>
reflects any errors in locating the file.
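The search order in the conclusion above amounts to a first-match lookup over an ordered list of directories. A minimal sketch, assuming nothing about the real hats_nim code; find_netmon_cf and its parameters are hypothetical names.

```python
import os
import posixpath

# Search order taken from the APAR text: ES directory first, then classic.
SEARCH_DIRS = ["/usr/es/sbin/cluster", "/usr/sbin/cluster"]


def find_netmon_cf(dirs=SEARCH_DIRS, name="netmon.cf", exists=os.path.isfile):
    """Return the first existing path, or None when no directory has the file."""
    for directory in dirs:
        candidate = posixpath.join(directory, name)
        if exists(candidate):
            return candidate
    return None
```

The exists parameter is there only so the lookup can be exercised without the real files being present.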
------
APAR: IY30371 COMPID: 5765B8100 REL: 220
ABSTRACT: ISDN_CALL_TRANSFER CS FAILS TO TRANSFER CALLS. ERRORID 20501
PROBLEM DESCRIPTION:
The ISDN_Call_Transfer custom server occasionally gets into a
bad state and will no longer transfer calls. Each attempt to
transfer a call will result in an errorid 20501 being generated,
with the following error text:
ISDN_Call_Transfer: MakeCallStatus is not valid at this time.
PROBLEM SUMMARY:
ISDN_CALL_TRANSFER CS FAILS TO TRANSFER CALLS.
ERRORID 20501
PROBLEM CONCLUSION:
Layer 4 was fixed so that the TERMINATE_CNF
is always sent freeing up the CHP properly when the transfer is
complete; or when it fails due to one of the parties involved
in the transfer hanging up.
------
APAR: IY30375 COMPID: 5765B9501 REL: 340
ABSTRACT: HANG: MMAP FLUSH BUFFER, VMM IOWAIT
PROBLEM DESCRIPTION:
hang: mmap flush buffer, VMM iowait
When findOrCreate has to go to the daemon to determine the gnode
type the gnode lock must be dropped or a deadlock can occur
involving the gnode lock and hash table lock.
LOCAL FIX:
The transition flag will stop the gnode from being used. Other
threads that find transition set now must wait until the flag
is cleared, either by the gnode being deleted, or by the gnType
finally being set.
PROBLEM SUMMARY:
A deadlock condition could occur during an mmap buffer flush.
PROBLEM CONCLUSION:
When findOrCreate has to go to the daemon to determine the
gnode type the gnode lock must be dropped or a deadlock can
occur involving the gnode lock and hash table lock.
------
APAR: IY30378 COMPID: 5765E5400 REL: 440
ABSTRACT: HAS,HAES: APPLYING SNAPSHOT FAILS WITH LIBODM ERROR ON CUAT
PROBLEM DESCRIPTION:
Attempt to apply a cluster snapshot containing filesystems to
export, fails during verification with libodm error and
msg "Can't odmget CuAt".
PROBLEM CONCLUSION:
set the odmdir to /etc/objrepos in clver_table.c when
checking major numbers of vg's associated with exported
filesystems.
------
APAR: IY30598 COMPID: 5765B9501 REL: 340
ABSTRACT: MEMORY LEAK WHILE USING DMAPI
PROBLEM DESCRIPTION:
Memory leak while using DMAPI.
PROBLEM SUMMARY:
Fixed memory leak in DMAPI.
PROBLEM CONCLUSION:
sfsdmgetdirattrs: not freeing inode
buffer.
------
APAR: IY30599 COMPID: 5765B9501 REL: 340
ABSTRACT: MMDELDISK -C STOPS ON EMEDIA ERROR
PROBLEM DESCRIPTION:
mmdeldisk -c stops on emedia error
PROBLEM SUMMARY:
Fixed mmdeldisk stopping on EMEDIA error
PROBLEM CONCLUSION:
Check for both EIO and EMEDIA errors on reads only on
copyReplicas when deciding to 'break' disk addresses
pointing to bad stripes.
------
APAR: IY30602 COMPID: 5765B9501 REL: 340
ABSTRACT: ASSERT --IBDP1->INDDIRTY && IBDP2->INDDIRTY, METADATA.C, LINE 98
PROBLEM DESCRIPTION:
assert --ibdp1->inddirty && ibdp2->inddirty, metadata.c line 98
PROBLEM SUMMARY:
Fixed Assert condition: ibdP1->indDirty && ibdP2->indDirty
PROBLEM CONCLUSION:
In doubleUpdateDiskAddr, don't assert that the indirect
blocks are dirty. The update might have already been done by
another node which failed after logging the indirect block
changes. Also, don't deallocate the old addresses unless
they changed since the deallocation might have also happened
before the node crashed.
------
APAR: IY30605 COMPID: 5765B9501 REL: 340
ABSTRACT: PANIC--FETCH-VFS-KX.C
PROBLEM DESCRIPTION:
panic--fetch-vfs-kx.c
PROBLEM SUMMARY:
Fixed panic condition in fetch-vfs-kx.C::bdP->whichBufList
!= w
PROBLEM CONCLUSION:
Prefetch list mutex does not need to be dropped across the
call to cacheObjRele, since the hold count cannot go to
zero.
------
APAR: IY30607 COMPID: 5765D5101 REL: 121
ABSTRACT: DISABLE CLVER-C IN PHOENIX.SNAP AND OTHER MINOR BUG FIXES
PROBLEM DESCRIPTION:
disable clver -c in phoenix.snap and other minor bug fixes
PROBLEM SUMMARY:
clver -c takes too long to run and doesn't give any useful
information. There were also some bugs collecting the
odmget_CuAt file and lsvg file and a problem with the
timeouts in the forked phoenix.snap processes.
PROBLEM CONCLUSION:
The clver -c data is no longer collected. The odmget_CuAt
and lsvg files are now collected properly. The timeout
problem on the forked processes has been fixed by using
process id numbers instead of command names.
------
APAR: IY30652 COMPID: 5765D5100 REL: 340
ABSTRACT: PMAN ARRAY LIMIT MEANS THAT WHEN AN EVENT HAPPENS, A MESSAGE MAY
PROBLEM DESCRIPTION:
Pman internal array default of 16 adapters per node may not be
enough and can overwrite the pman definitions, causing the
nodes not to see any pman definitions!
LOCAL FIX:
pmand uses an internal array to read the SDR Adapter info. into.
This array is hard coded to 16 members (for each node).
If you have more than 16, it writes beyond the end of the
array, stepping on the PMAN_Subscription variable.
The array size has been set to 32 in a new version of pmand.
Until this new pmand is used, try to reduce the amount of SDR info.
PROBLEM SUMMARY:
The code uses the PMAN_Subscription variable to remember if
the SDR file is the new PMAN_Subscription file or the old
pmandConfig file.
Because the variable got stepped on when the array
overflowed, the code was incorrectly looking for a
pmandConfig file. The result is it does not find
any events, because it is looking in the wrong
place for them.
PROBLEM CONCLUSION:
Increased the size of the array (number of adapters per
node) from 16 to 64. This will prevent the array from
being overrun and the PMAN_Subscription variable from
getting stepped on.
------
APAR: IY30694 COMPID: 5765D5100 REL: 340
ABSTRACT: CSS.SNAP.LOG FILE CAN BE OVERWRITTEN
PROBLEM DESCRIPTION:
css.snap.log file can be overwritten
PROBLEM SUMMARY:
If the contents of the css log directories in
/var/adm/SPlogs/css occupy more than 30% of /var, the
css.snap utility will try to free space by deleting old
css.snap files. If there are no files with names ending
in "....css.snap.tar.Z", the css.snap.log file will be
overwritten.
PROBLEM CONCLUSION:
The output of the "ls" command to list the css.snap tar
files is appended to the end of the css.snap.log file.
------
APAR: IY30719 COMPID: 5765B9500 REL: 150
ABSTRACT: GPFS MINOR NUMBERS NOT SYNCHRONIZED BETWEEN HACMP NODES CAN
PROBLEM DESCRIPTION:
GPFS is not picky about the minor numbers it assigns to its
filesystem entries in /dev. Basically it just starts at 100 and
increments until it finds a free number.
The problem that occurs on HACMP clusters when clients are NFS
mounting the GPFS filesystems, is that NFS receives a filehandle
based, in part, on the minor number of the filesystem.
If different clients are accessing the same filesystem from two
different gpfs nodes using differing device minor numbers (and
thus different filehandles), when a failover occurs, the node
now handling all the clients will not recognize the other node's
client requests.
LOCAL FIX:
Manually synchronize the device minor numbers when the file-
systems are created, and monitor them periodically in case one
gets deleted (which will result in gpfs recreating it in the
original manner).
PROBLEM SUMMARY:
Fixed /dev minor number assignment, needed for NFS failover of
GPFS server nodes.
PROBLEM CONCLUSION:
Start assigning permanent minor numbers to
all new file systems. The minor numbers will be in the
range 150-maxMinor number (65535 or 255).
TEMPORARY FIX:
Manually synchronize the device minor numbers
when the file-systems are created, and monitor them
periodically in case one gets deleted (which will result in
gpfs recreating it in the original manner).
------
APAR: IY30720 COMPID: 5765E6900 REL: 310
ABSTRACT: LLSUMMARY -R THROUGHPUT/MAXQUEUED AND REAL TIME MULTIPLE OF
PROBLEM DESCRIPTION:
llsummary -r throughput produces Queue and Real times wrong.
Value is a multiple of nodes used when parallel.
PROBLEM SUMMARY:
The throughput reports, produced by the LoadLeveler
llsummary command, can produce higher than appropriate
Queue Time and Real Time numbers for parallel jobs that ran
on multiple nodes.
PROBLEM CONCLUSION:
The llsummary command had been adding the Queue Time and
Real Time numbers from each node that was used to execute
a parallel job. That made the resulting numbers too high
by a factor equal to the number of nodes used for the job.
The command was changed to first determine if the job was a
serial job or a parallel job, and to do the correct
calculations, after that.
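The arithmetic behind the inflated numbers can be sketched as follows. The job record and both functions are hypothetical; the real llsummary distinguishes serial from parallel jobs before calculating, which this sketch reduces to "count the job's time once".

```python
def real_time_buggy(job):
    """Adds the job's real time once per node (models the defect)."""
    return sum(job["real_time"] for _ in job["nodes"])


def real_time_fixed(job):
    """Counts the job's real time once, whether serial or parallel."""
    return job["real_time"]


# Hypothetical parallel job: 4 nodes, 600 seconds of real time.
job = {"nodes": ["n1", "n2", "n3", "n4"], "real_time": 600}
```

For this job the buggy total is 2400 seconds, four times the fixed value of 600.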
------
APAR: IY30732 COMPID: 5765E5400 REL: 440
ABSTRACT: APPLICATION SERVER NOT RESTARTED AFTER MONITORED PROCESS FAILS
PROBLEM DESCRIPTION:
Application monitoring is configured for process monitoring.
The resource group starts correctly and clappmond and the
monitored process are running. If I kill the monitored process
the application server is not restarted and I have the following
in /tmp/clstrmgr.debug.
RationalizeEvent: Mark event TE_SERVER_RESTART (200a5dd8)
inactive, resource group 33 has state 4 on node 3
This problem was introduced by APAR IY29008.
PROBLEM SUMMARY:
After an application monitor fails, it is not restarted, even
though there may be retries left.
PROBLEM CONCLUSION:
Correct the internal lookup of the monitor id such that the
recovery actions are handled properly.
------
APAR: IY30736 COMPID: 5765E5400 REL: 440
ABSTRACT: CLSTRMGR EXITS BECAUSE OF SEGMENTATION FAULT IN MEMSET
PROBLEM DESCRIPTION:
clstrmgr exits because cc_rpcProg does malloc without free
each time commands clfindres, cllssvcs or cllsstbys are run.
PROBLEM SUMMARY:
The clstrmgr will exit and the system will halt if the
commands clfindres, cllssvcs or cllsstbys are run many times
without restarting the clstrmgr.
PROBLEM CONCLUSION:
The memory allocated by the clstrmgr increases each time the
commands clfindres, cllssvcs or cllsstbys are run. The
clstrmgr will be changed to free the memory allocated after
these commands are run.
------
APAR: IY30738 COMPID: 5765E6900 REL: 310
ABSTRACT: MACHPRIO DOES WRONG CALCULATION OVER TIME
PROBLEM DESCRIPTION:
MACHPRIO very often is based on a computation around
LoadAvg. LoadLeveler adjusts the LoadAvg with the
value of NEGOTIATOR_LOADAVG_INCREMENT when a job is
started on that node.
Unfortunately, it can happen that this add-on stays longer and,
if multiple jobs are started on the node in question,
accumulates to really strange values (values of up to
-1240.00 were seen).
If the machine is idle for a certain amount of time,
the MACHPRIO value recovers on its own.
LOCAL FIX:
Recycling the Negotiator or Startd on the problem node
recovers immediately.
Alternatively, a sequence of "llctl resume" commands to that node
can eventually recover it, too.
PROBLEM SUMMARY:
If Loadavg is used in your calculation for MACHPRIO in the
LoadL_config file, the MACHPRIO value can sometimes get
values well out of the range of what it should be. The
problem can happen if a parallel job starts more than two
tasks on the same node. Newer hardware, with increased
numbers of CPUs, is most susceptible to this problem.
That is based on the assumption that the MAX_STARTERS value
is set equal to the number of CPUs on the machine.
PROBLEM CONCLUSION:
The Negotiator internally adjusts a machine's loadavg when
it starts a new job on that machine. As part of that
adjustment, the Negotiator could sometimes keep adjusting
the adjusted value instead of adjusting the real load value
that it received from the machine. The code that
determined which value to adjust was modified to correct
the problem.
------
APAR: IY30745 COMPID: 5648C9802 REL: 430
ABSTRACT: SDK 1.3.0 PTF 10 : CA130-20020504
PROBLEM DESCRIPTION:
New PTF information to be added later
PROBLEM SUMMARY:
Fixes since PTF 9b (ca130-20020208) :
(Note: The descriptions here have been truncated.)
+----------+--------+-------+---------------------------------+
|20020226 |39013 | |space in JScrollPane not repainte|
+----------+--------+-------+---------------------------------+
|20020226 |39358 | |SIGSEGV with JIT on |
+----------+--------+-------+---------------------------------+
|20020226 |39468 | |JNI Compatibility section of Read|
+----------+--------+-------+---------------------------------+
|20020226 |39473 | |Cannot navigate the print dialog |
+----------+--------+-------+---------------------------------+
|20020226 |39752 | |Path problem. |
+----------+--------+-------+---------------------------------+
|20020226 |39891 |PQ56430|Problem with java.net.URLEncoder/|
+----------+--------+-------+---------------------------------+
|20020226 |40083 | |awt.h is broken with DEBUG flag |
+----------+--------+-------+---------------------------------+
|20020226 |40218 | |SEGV in libawt.so with TL6.5 |
+----------+--------+-------+---------------------------------+
|20020226 |40637 | |'January' translation in pt BR in|
+----------+--------+-------+---------------------------------+
|20020226 |40644 |PQ57103|JToolBar Repositioning Defect |
+----------+--------+-------+---------------------------------+
|20020226 |40792 |PQ57343|Java App - Printing doesn't work |
+----------+--------+-------+---------------------------------+
|20020226 |40871 |PQ57496|ZipFile throws "extra header info|
+----------+--------+-------+---------------------------------+
|20020226 |40897 |PQ57740|JOptionPane - Cut off text with T|
+----------+--------+-------+---------------------------------+
|20020226 |40988 |PQ57656|javax.naming.CommunicationExcepti|
+----------+--------+-------+---------------------------------+
|20020226 |41005 | |ObjectOutputStream leaks objects |
+----------+--------+-------+---------------------------------+
|20020226 |41034 |PQ57734|javac language problem on Double |
+----------+--------+-------+---------------------------------+
|20020227 |41377 | |Use of strerror in panicHandler c|
+----------+--------+-------+---------------------------------+
|20020227 |41448 | |ObjectOutputStream leaks objects |
+----------+--------+-------+---------------------------------+
|20020227 |41458 | |locationsOfLine() throws exceptio|
+----------+--------+-------+---------------------------------+
|20020301 |41519 | |ObjectOutputStream leaks objects |
+----------+--------+-------+---------------------------------+
|20020302 |41165 | |JIT:Core dump |
+----------+--------+-------+---------------------------------+
|20020306 |41377 | |Use of strerror in panicHandler c|
+----------+--------+-------+---------------------------------+
|20020306 |41483 |PQ58473|Problem with JDK and WAS and WSAD|
+----------+--------+-------+---------------------------------+
|20020314 |40945 |PQ58912|URL(URL, String) incorrect relati|
+----------+--------+-------+---------------------------------+
|20020315 |41325 |IY29106|TextFields are displayed overlapp|
+----------+--------+-------+---------------------------------+
|20020315 |41846 | |130 ORB.init() to use currentcont|
+----------+--------+-------+---------------------------------+
|20020323 |41506 | |JToolBar Repositioning Defect |
+----------+--------+-------+---------------------------------+
|20020323 |41612 |IY28988|IM status left after workspace sw|
+----------+--------+-------+---------------------------------+
|20020326 |41732 | |JVMPI GetCallTrace seg fault with|
+----------+--------+-------+---------------------------------+
|20020326 |41847 | |Cannot backtab from JComboBox |
+----------+--------+-------+---------------------------------+
|20020329 |42303 | |Abstract: |
+----------+--------+-------+---------------------------------+
|20020330 |42097 | |Alert "zlib security hole" |
+----------+--------+-------+---------------------------------+
|20020330 |42233 | |currency symbol not being updated|
+----------+--------+-------+---------------------------------+
|20020402 |42016 | |Wnn IM Status not appear on plugi|
+----------+--------+-------+---------------------------------+
|20020403 |42133 | |JFileChooser:Once file choosen di|
+----------+--------+-------+---------------------------------+
|20020423 |42739 | |Setting LIBPATH - error in 131 RE|
+----------+--------+-------+---------------------------------+
|20020425 |42307 | |Memory prematurely released durin|
+----------+--------+-------+---------------------------------+
|20020426 |42443 |IY30278|Stack overflow with HTML document|
+----------+--------+-------+---------------------------------+
|20020426 |43157 | |Application Crash with JIT compil|
+----------+--------+-------+---------------------------------+
|20020501 |43486 | |Thread Pooling for RMI Connection|
+----------+--------+-------+---------------------------------+
PROBLEM CONCLUSION:
All of the above defects have been fixed
------
APAR: IY30747 COMPID: 5765E6110 REL: 220
ABSTRACT: CT: 64BIT KEYFILE PROBLEM
PROBLEM DESCRIPTION:
ct: 64-bit keyfile problem
PROBLEM SUMMARY:
32-bit and 64-bit RSCT based applications record security
keyfiles of differing sizes. A keyfile created by a 32-bit
RSCT application cannot be used by a 64-bit RSCT
application, and vice versa.
PROBLEM CONCLUSION:
Memory offset calculations within the code have been
repaired to remove the problem.
TEMPORARY FIX:
Execute 32-bit versions of the RSCT daemons and libraries.
32-bit RMC client applications will execute correctly.
Restrict 64-bit RMC applications from executing until the
repair is applied.
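Why 32-bit and 64-bit processes can write records of different sizes is easy to show with a made-up layout. The record below (a 32-bit version field followed by a C "long" length) is purely hypothetical and is not the RSCT keyfile format; the fixed-width variant is one common way to make such data portable, not necessarily the repair IBM shipped.

```python
import struct

# Hypothetical record: a 32-bit version field followed by a C "long" length.
RECORD_LONG32 = "<il"   # standard sizes: int = 4 bytes, long = 4 bytes
RECORD_LONG64 = "<iq"   # same fields with the long widened to 8 bytes
RECORD_FIXED = "<iI"    # explicit 32-bit length: identical in every process

size32 = struct.calcsize(RECORD_LONG32)   # 8 bytes
size64 = struct.calcsize(RECORD_LONG64)   # 12 bytes


def write_key_header(version, length):
    """Pack the header with fixed-width fields."""
    return struct.pack(RECORD_FIXED, version, length)


def read_key_header(data):
    """Unpack a header written by write_key_header."""
    return struct.unpack(RECORD_FIXED, data[:struct.calcsize(RECORD_FIXED)])
```

A reader that expects the 12-byte layout cannot correctly parse the 8-byte one, which mirrors the incompatibility described in the summary.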
------
APAR: IY30748 COMPID: 5765E6110 REL: 220
ABSTRACT: CT:CTSEC:MSS'S SEC_(UN)MARSHAL_TYPED_KEY ROUTINES DO NOT
PROBLEM DESCRIPTION:
ct:ctsec: mss's sec_(un)marshal_typed_key routines do not
interoperate between 64bit and 32bit applications
PROBLEM SUMMARY:
The problem occurs when a 64bit/32bit process marshals a
typed key and a 32bit/64bit process unmarshals the typed
key. The unmarshaling fails. The cause is data
alignment differences between 64bit and 32bit processes.
PROBLEM CONCLUSION:
The fix changes the way the length of the typed key is
calculated.
------
APAR: IY30760 COMPID: 5765E6110 REL: 220
ABSTRACT: CT:MCASSEMBLE GETS SIGBUS IF NO SPACE IN FILE SYSTEM
PROBLEM DESCRIPTION:
ct:mcassemble gets sigbus if no space in file system
PROBLEM SUMMARY:
The mcassemble program, which is invoked during RSCT
installation, creates an output file which it subsequently
writes to through memory-mapped file techniques. It does
not, however, guarantee that all blocks necessary for the
output are allocated before proceeding to write. This can
lead to mcassemble receiving a SIGBUS exception if/when the
filesystem containing the output file fills up before
mcassemble is done writing to the output file.
PROBLEM CONCLUSION:
The mcassemble program has been modified to write to all
blocks of the output file immediately after its creation,
thus guaranteeing that all the file blocks are allocated in
advance so that mcassemble can complete without failing due
to file blocks being unavailable for allocation during its
normal processing. If mcassemble cannot write to all
blocks of the output file immediately after file creation,
it gracefully exits with an appropriate return code (rather
than being terminated by the kernel with a SIGBUS signal)
which indicates that the filesystem that would contain the
output file needs to be resized before installation can occur.
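The preallocate-then-map approach described in the conclusion can be sketched as below. This is an illustration of the general technique, not the mcassemble source; the function names and the chunk size are assumptions.

```python
import mmap
import os


def create_preallocated(path, size, chunk=64 * 1024):
    """Create path and write zeros to every byte so blocks exist before mmap.

    Raises OSError (for example ENOSPC) if the file system cannot hold the
    file, which the caller can turn into a clean exit code.
    """
    zeros = b"\0" * chunk
    with open(path, "wb") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(zeros[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())     # push the blocks to disk before mapping


def write_via_mmap(path, payload):
    """Write payload at the start of an already allocated file through mmap."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:
            m[:len(payload)] = payload
            m.flush()
```

A full file system then surfaces as an ordinary write error during create_preallocated rather than as a signal during the mapped writes.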
------
APAR: IY30853 COMPID: 5765E7200 REL: 310
ABSTRACT: CORE DUMP AT POPEN
PROBLEM DESCRIPTION:
possible core dump when cifsServer is quite large
PROBLEM CONCLUSION:
moved print queue processing to cifsUserProc
------
APAR: IY30860 COMPID: 5765E7200 REL: 310
ABSTRACT: ADD USER SOMETIMES RETURNS ERROR WHEN CIFS_REGISTRY=1
PROBLEM DESCRIPTION:
When the cifs_registry flag is set, adding users to the
DCE registry sometimes returns an error even though the user was
successfully added to the DCE registry.
PROBLEM CONCLUSION:
Added additional error handling code.
------
APAR: IY30885 COMPID: 5765E7200 REL: 310
ABSTRACT: MULTIUSERLOGON AND EXCEL IS BROKEN.
PROBLEM DESCRIPTION:
If multiuserlogin is on, a user cannot open an Excel file.
Sometimes it also causes a core dump.
PROBLEM CONCLUSION:
Accept the IPC session setup request from clients.
------
APAR: IY30886 COMPID: 5765C3403 REL: 430
ABSTRACT: JDK 1.1.8 PTF 13 : A118-20020509
PROBLEM DESCRIPTION:
Fixes since PTF 12 (a118-20010804) :
(Note: The descriptions here have been truncated.)
+--------+------+-------+-------------------------------------+
|20010814|13164 | |NSmtpClient.to() strips angle bracket|
+--------+------+-------+-------------------------------------+
|20010830|13181 | |NTime problem with java |
+--------+------+-------+-------------------------------------+
|20010917|13193 | |TJVM freezes |
+--------+------+-------+-------------------------------------+
|20010927|13043 | |NJIT: java backtrace not reported for|
+--------+------+-------+-------------------------------------+
|20011018|13212 | |NAdd a mechanism for checking and rep|
+--------+------+-------+-------------------------------------+
|20011203|13218 | |Nverbosegc message for gc_do3003first|
+--------+------+-------+-------------------------------------+
PROBLEM CONCLUSION:
All of the above defects have been fixed
------
APAR: IY30893 COMPID: 5765D5100 REL: 340
ABSTRACT: MEMORY EXHAUSTED W/ NON-CONTIGUOUS USER DEFINED DATA TYPES
PROBLEM DESCRIPTION:
Memory leak sending MPI non-contiguous user defined types with
MPI_GATHER. After a few thousand iterations, the job will end
with ERROR: 0032-171 Communication subsystem error: Memory is
exhausted. in MPI_Gather, task 0. For 32bit us it runs out at
1,043,000 cycles, for 32 bit ip it runs out at 38,000 cycles and
for 64-bit PRPQ over US it runs out at 521,000 cycles.
PROBLEM SUMMARY:
Memory leak sending MPI non-contiguous user defined types
with MPI_GATHER. After a few thousand iterations, the job
will end with ERROR: 0032-171 Communication subsystem error:
Memory is exhausted. The fix has cleaned up memory
allocations that are no longer in use.
PROBLEM CONCLUSION:
The fix is effective. Unused memory is now being freed.
------
APAR: IY30904 COMPID: 5765D5100 REL: 340
ABSTRACT: PROBLEMS WITH NOSUID FLAG IN GPFS FILESYSTEMS
PROBLEM DESCRIPTION:
problems with nosuid flag in gpfs filesystems
PROBLEM SUMMARY:
security issue
PROBLEM CONCLUSION:
security issue resolved
------
APAR: IY30922 COMPID: 5765D5100 REL: 311
ABSTRACT: SETUP_SERVER SHOULD IGNORE PPP CONNECTIONS
PROBLEM DESCRIPTION:
If a pp0 adapter is present, setup_server fails.
setup_server : host: 0827-803 Cannot find address 0.0.0.0.
setup_CWS: 0016-338 Kerberos setup was bypassed for network
interfaces that could not be resolved
setup_server ends with rc = 0, but the node you are installing
does not receive a Kerberos ticket.
Circumventing this problem by detaching pp0 means that
svcagent cannot be active and running during the setup_server
action.
LOCAL FIX:
A good workaround is to add an entry to /etc/hosts like:
zero 0.0.0.0 # dummy ppp entry to prevent setup_server problems
PROBLEM SUMMARY:
When the Point-to-Point Protocol (PPP) is being used on
a Control Workstation, setup_CWS will terminate processing
with the messages:
host: 0827-803 Cannot find address 0.0.0.0.
setup_CWS: 0016-338 Kerberos setup was bypassed for
network interfaces that could not be resolved.
Since the Point-to-Point Protocol is being displayed in
the netstat -in data, setup_CWS tries to determine the
IP addresses for these interfaces and fails. The data
from the Point-to-Point Protocol should be ignored
by setup_CWS.
PROBLEM CONCLUSION:
setup_CWS has been modified to skip lines of data from
netstat -in which refer to the Point-to-Point Protocol.
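Skipping PPP rows while parsing interface data might look like the sketch below. The sample lines are invented for illustration and are not real netstat -in output, and identifying PPP rows by the interface name prefix "pp" (as in the pp0 adapter mentioned above) is an assumption about how setup_CWS recognizes them.

```python
def resolvable_interfaces(netstat_lines, skip_prefixes=("pp",)):
    """Return (name, address) pairs, skipping the header and PPP rows."""
    result = []
    for line in netstat_lines:
        fields = line.split()
        if len(fields) < 4 or fields[0] == "Name":
            continue                      # header or short line
        name, address = fields[0], fields[3]
        if name.startswith(skip_prefixes):
            continue                      # PPP interface: ignore
        if address.count(".") != 3:
            continue                      # not a dotted IPv4 address
        result.append((name, address))
    return result


# Invented sample data in the general shape of an interface table.
SAMPLE = [
    "Name Mtu   Network   Address      Ipkts Ierrs Opkts Oerrs Coll",
    "en0  1500  192.0.2   192.0.2.10   100   0     90    0     0",
    "pp0  1500  0         0.0.0.0      0     0     0     0     0",
    "lo0  16896 127       127.0.0.1    5     0     5     0     0",
]
```

With the default prefixes the pp0 row and its 0.0.0.0 address never reach the name lookup step.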
------
APAR: IY30927 COMPID: 5765C3403 REL: 430
ABSTRACT: POLL FAILS TO WAIT FOR THE SPECIFIED TIMEOUT PERIOD IN 64-BIT
PROBLEM DESCRIPTION:
When poll is called for a timer operation in 64-bit
applications, it does not wait for the specified time.
PROBLEM SUMMARY:
When poll is called from a 64-bit app for a timer function,
it does not wait for the timeout period.
PROBLEM CONCLUSION:
The poll code has been changed to handle the data when called
from a 64-bit application.
------
APAR: IY30932 COMPID: 5765D5100 REL: 340
ABSTRACT: CSS.SNAP MAY NOT CONTAIN CADD_DUMP.OUT
PROBLEM DESCRIPTION:
The css.snap script may not collect the cadd_dump.out file,
which can be helpful in debugging certain SP switch 2 problems.
PROBLEM SUMMARY:
In some cases in which the css.snap script is run
to collect data for an SP Switch 2 problem, the
trace from the device driver (cadd_dump.out) will
not be collected.
PROBLEM CONCLUSION:
The css.snap script has been changed to ensure, for the
SP Switch 2, the device driver trace (cadd_dump.out) is
always collected when the css.snap command is run.
------
APAR: IY30961 COMPID: 5765E8500 REL: 200
ABSTRACT: CRASH IN THE FRAME LAYER FR_RNR_NET FUNCTION WHEN RNR FRAMES
PROBLEM DESCRIPTION:
The customer is receiving RNR frames constantly.
The fr_rnr_net function calls fr_rej_net,
which frees the message and returns.
fr_rnr_net then uses the same conditions as
fr_rej_net to free the message again,
causing a crash.
PROBLEM SUMMARY:
When RNRs are being handled with the poll final bit set,
the machine will crash in the frame layer function fr_net_rnr,
which calls freemsg.
PROBLEM CONCLUSION:
A flag was implemented that records whether the freeing path in the
fr_net_rej function, called by fr_net_rnr, was taken.
If it was, fr_net_rnr does not free the
message, because it was already freed in fr_net_rej.
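
The flag approach can be illustrated with a small model. This is only a
sketch in Python, not the frame layer code: the Message class and the
handle_rej, handle_rnr and handle_rnr_unguarded functions are hypothetical
stand-ins for the message buffer and for fr_net_rej and fr_net_rnr.

```python
class Message:
    """Toy message buffer that detects a second free."""

    def __init__(self):
        self.freed = False

    def free(self):
        if self.freed:
            raise RuntimeError("message freed twice")
        self.freed = True


def handle_rej(msg, poll_final_set):
    """Stand-in for the REJ path; returns True if it freed the message."""
    if poll_final_set:
        msg.free()
        return True
    return False


def handle_rnr_unguarded(msg, poll_final_set):
    """Models the defect: frees again after the REJ path already did."""
    handle_rej(msg, poll_final_set)
    msg.free()


def handle_rnr(msg, poll_final_set):
    """Models the fix: a flag records whether the REJ path freed it."""
    already_freed = handle_rej(msg, poll_final_set)
    if not already_freed:
        msg.free()
```

With the poll final bit set, handle_rnr_unguarded raises on the second free,
while handle_rnr frees the message exactly once on either path.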
------
APAR: IY31012 COMPID: 5765D5100 REL: 320
ABSTRACT: RVSD SUPPORT FOR FAST T500
PROBLEM DESCRIPTION:
rvsd support for fastt500
PROBLEM SUMMARY:
RVSD device recovery sometimes needs to break
disk reservations that are held by a server that
has failed. In the past, this involved issuing
low-level SCSI commands or opening the devices
with the openx() system call with an argument
of SC_FORCED_OPEN (which sends a SCSI Target
Mode reset to the bus). Recently AIX has
started to support openx() with an argument
of SC_FORCED_OPEN_LUN which will target just
that specific LUN. RVSD device recovery
should take advantage of SC_FORCED_OPEN_LUN
where appropriate.
PROBLEM CONCLUSION:
RVSD device recovery will support
SC_FORCED_OPEN_LUN where appropriate.
------
APAR: IY31025 COMPID: 5765D5100 REL: 340
ABSTRACT: RVSD SUPPORT FOR FAST T500
PROBLEM DESCRIPTION:
rvsd support for fastt500
PROBLEM SUMMARY:
RVSD device recovery sometimes needs to break
disk reservations that are held by a server that
has failed. In the past, this involved issuing
low-level SCSI commands or opening the devices
with the openx() system call with an argument
of SC_FORCED_OPEN (which sends a SCSI Target
Mode reset to the bus). Recently AIX has
started to support openx() with an argument
of SC_FORCED_OPEN_LUN which will target just
that specific LUN. RVSD device recovery
should take advantage of SC_FORCED_OPEN_LUN
where appropriate.
PROBLEM CONCLUSION:
RVSD device recovery will support
SC_FORCED_OPEN_LUN where appropriate.
------
APAR: IY31030 COMPID: 5765D5100 REL: 340
ABSTRACT: IP_RESET(IP_INIT) ERRORS ON SP SWITCH
PROBLEM DESCRIPTION:
ip_reset(IP_INIT) errors on sp switch
PROBLEM SUMMARY:
When a new set of switch routes must be
downloaded to a node on the SP switch, it's possible for an
ip_reset (IP_INIT) error to occur. This error can only be
addressed by rebooting the affected node.
PROBLEM CONCLUSION:
The SP Switch IP driver and microcode have
been changed to prevent ip_reset (IP_INIT) errors from
occurring.
------
APAR: IY31041 COMPID: 5765D5100 REL: 340
ABSTRACT: BAD DMA WRITE FOR KLAPI 0-COPY MSG
PROBLEM DESCRIPTION:
This problem is caused by cleaning up a HAL DMA handle while a
post of the message is still possible.
PROBLEM SUMMARY:
There is a timing window in which a DMA buffer may remain posted
after a message is marked as complete to the user. This
leaves the possibility of data corruption or, in the case
of a Regatta, a system check stop.
PROBLEM CONCLUSION:
All outstanding DMA buffers are canceled before a
buffer is marked as complete to the user.
------
APAR: IY31069 COMPID: 5765E5400 REL: 440
ABSTRACT: CLM_NOLOCKMGR AFTER STARTING CLUSTER SERVICES
PROBLEM DESCRIPTION:
node_up_complete script fails with the following error.
Bad return code from clmlock: CLM_NOLOCKMGR
PROBLEM CONCLUSION:
Add a test in start_server.sh for the lock manager run state.
------
APAR: IY31071 COMPID: 5765E5400 REL: 440
ABSTRACT: CLLSFS CORE DUMPS WHEN THERE ARE MANY FILESYSTEMS
PROBLEM DESCRIPTION:
cllsfs command generates core dump when there are a large
number of filesystems configured on the system; the core is
also generated if the command is called by smitty cl_fs or
cl_vg.
PROBLEM CONCLUSION:
Correct error in memory allocation routine.
------
APAR: IY31072 COMPID: 5765E5400 REL: 440
ABSTRACT: HAS: LOCK MANAGER FAILS AFTER NODE FAILURE
PROBLEM DESCRIPTION:
Applications that use the lock manager will hang on surviving
nodes when a node fails.
PROBLEM CONCLUSION:
Don't cancel reconfig processing.
------
APAR: IY31073 COMPID: 5765E5400 REL: 440
ABSTRACT: APPLY SNAPSHOT WITH HACMPLOGS CLUSTER.MMDD WILL FAIL.
PROBLEM DESCRIPTION:
If a snapshot was created before APAR IY28996 was applied,
the snapshot cannot be applied to a system that has
APAR IY28996 applied. The following is the error message.
ERROR: the log file cluster.mmdd is unable to be redirected
through this utility.
PROBLEM CONCLUSION:
The routine that applies the snapshot will update the
name of the history log to cluster.mmddyyyy.
------
APAR: IY31075 COMPID: 5765E8200 REL: 230
ABSTRACT: HAGEO: LOGREDO NOT BEING RUN ON GMD-BASED FILESYSTEMS
PROBLEM DESCRIPTION:
Logredo is not run on gmd_based JFS filesystems after upgrading
HACMP.
PROBLEM CONCLUSION:
Change Geo_mount_fs to run logredo once for each gmd-based
jfslog that is detected.
------
APAR: IY31076 COMPID: 5765E8200 REL: 230
ABSTRACT: HAGEO: DEADLOCK IN GMDPIN
PROBLEM DESCRIPTION:
A deadlock in the gmd driver causes the system to eventually
halt from a DMS timeout.
PROBLEM CONCLUSION:
Remove memory profiling code that used unsafe locking.
------
APAR: IY31115 COMPID: 5765D5100 REL: 340
ABSTRACT: SPGETDESC SUPPORT OF WINTERHAWK2 450MHZ
PROBLEM DESCRIPTION:
The WinterHawk2 450MHz needs to be added to the spgetdesc
command.
PROBLEM SUMMARY:
/usr/lpp/ssp/bin/spgetdesc did not describe the
processor speed of the Winterhawk II nodes.
For Winterhawk II nodes, both thin and wide,
currently the description that is returned is:
spgetdesc: Node 5 (c183n05.ppd.pok.ibm.com)
is a 375_MHz_POWER3_SMP_Thin
The description has been updated to return:
375/450_MHz_POWER3_SMP_Thin - for thin nodes
375/450_MHz_POWER3_SMP_Wide - for wide nodes
PROBLEM CONCLUSION:
In the spgetdesc script, the definition table for
a Winterhawk II node was changed to read
375/450_MHz_POWER3_SMP_Thin
OR
375/450_MHz_POWER3_SMP_Wide
------
APAR: IY31121 COMPID: 5765D5100 REL: 340
ABSTRACT: RAS: IMPROVE CSS.SNAP DATA COLLECTION
PROBLEM DESCRIPTION:
The css.snap command needs to improve the data collected on some
'soft' snaps.
PROBLEM SUMMARY:
A 'soft' css.snap command needs to collect a switch adapter
microcode dump under certain conditions.
PROBLEM CONCLUSION:
The css.snap script has been changed to improve RAS.
The changes allow for a soft css.snap to collect a switch
adapter microcode dump in certain cases.
------
APAR: IY31151 COMPID: 5765D5100 REL: 311
ABSTRACT: NODES CAN DROP OFF SP SWITCH 2 WHEN VFS MOUNT POINT BECOMES
PROBLEM DESCRIPTION:
Nodes can drop off SP Switch 2 when a VFS mount point becomes
unavailable and handleNodeReadStatusPacket() takes too long
due to popen() hanging on computing sum of topology file.
PROBLEM SUMMARY:
The sdrd and the switch processes such as the fault service
daemon can hang when a system() call is made and VFS mount
points are unavailable (e.g., an NFS server is not servicing
a mount for the client that is seeing a hang).
PROBLEM CONCLUSION:
The script, rc.switch, has been changed to ensure that
processes it forks (such as the fault service daemon) are
not susceptible to hanging on system calls when VFS mount
points are unavailable. The SDR daemon, sdrd, has also been
changed to avoid these hangs.
------
APAR: IY31179 COMPID: 5724C3505 REL: 310
ABSTRACT: REBRAND DIRECTTALK TO WEBSPHERE VOICE RESPONSE
PROBLEM DESCRIPTION:
Rebranding of DirectTalk to Websphere Voice Response
PROBLEM SUMMARY:
rebrand dirTalk to Websphere Voice Response
PROBLEM CONCLUSION:
product rebranded
------
APAR: IY31187 COMPID: 5724C3505 REL: 310
ABSTRACT: REBRAND DIRECTTALK TO WEBSPHERE VOICE RESPONSE
PROBLEM DESCRIPTION:
Rebranding of DirectTalk to Websphere Voice Response
PROBLEM SUMMARY:
rebrand dirTalk to Websphere Voice Response
PROBLEM CONCLUSION:
product rebranded
------
APAR: IY31192 COMPID: 5724C3505 REL: 310
ABSTRACT: REBRAND DIRECTTALK TO WEBSPHERE VOICE RESPONSE
PROBLEM DESCRIPTION:
Rebranding of DirectTalk to Websphere Voice Response
PROBLEM SUMMARY:
rebrand dirTalk to Websphere Voice Response
PROBLEM CONCLUSION:
product rebranded
------
APAR: IY31212 COMPID: 5724C3505 REL: 310
ABSTRACT: REBRAND DIRECTTALK TO WEBSPHERE VOICE RESPONSE
PROBLEM DESCRIPTION:
Rebranding of DirectTalk to Websphere Voice Response
PROBLEM SUMMARY:
rebrand dirTalk to Websphere Voice Response
PROBLEM CONCLUSION:
product rebranded
------
APAR: IY31213 COMPID: 5724C3505 REL: 310
ABSTRACT: REBRAND DIRECTTALK TO WEBSPHERE VOICE RESPONSE
PROBLEM DESCRIPTION:
Rebranding of DirectTalk to Websphere Voice Response
PROBLEM SUMMARY:
rebrand dirTalk to Websphere Voice Response
PROBLEM CONCLUSION:
product rebranded
------
APAR: IY31214 COMPID: 5724C3505 REL: 310
ABSTRACT: REBRAND DIRECTTALK TO WEBSPHERE VOICE RESPONSE
PROBLEM DESCRIPTION:
Rebranding of DirectTalk to Websphere Voice Response
PROBLEM SUMMARY:
rebrand dirTalk to Websphere Voice Response
PROBLEM CONCLUSION:
product rebranded
------
APAR: IY31215 COMPID: 5724C3505 REL: 310
ABSTRACT: REBRAND DIRECTTALK TO WEBSPHERE VOICE RESPONSE
PROBLEM DESCRIPTION:
Rebranding of DirectTalk to Websphere Voice Response
PROBLEM SUMMARY:
rebrand dirTalk to Websphere Voice Response
PROBLEM CONCLUSION:
product rebranded
------
APAR: IY31216 COMPID: 5724C3505 REL: 310
ABSTRACT: REBRAND DIRECTTALK TO WEBSPHERE VOICE RESPONSE
PROBLEM DESCRIPTION:
Rebranding of DirectTalk to Websphere Voice Response
PROBLEM SUMMARY:
rebrand dirTalk to Websphere Voice Response
PROBLEM CONCLUSION:
product rebranded
------
APAR: IY31217 COMPID: 5724C3505 REL: 310
ABSTRACT: REBRAND DIRECTTALK TO WEBSPHERE VOICE RESPONSE
PROBLEM DESCRIPTION:
Rebranding of DirectTalk to Websphere Voice Response
PROBLEM SUMMARY:
rebrand dirTalk to Websphere Voice Response
PROBLEM CONCLUSION:
product rebranded
------
APAR: IY31218 COMPID: 5724C3505 REL: 310
ABSTRACT: REBRAND DIRECTTALK TO WEBSPHERE VOICE RESPONSE
PROBLEM DESCRIPTION:
Rebranding of DirectTalk to Websphere Voice Response
PROBLEM SUMMARY:
rebrand dirTalk to Websphere Voice Response
PROBLEM CONCLUSION:
product rebranded
------
APAR: IY31219 COMPID: 5724C3505 REL: 310
ABSTRACT: REBRAND DIRECTTALK TO WEBSPHERE VOICE RESPONSE
PROBLEM DESCRIPTION:
Rebranding of DirectTalk to Websphere Voice Response
PROBLEM SUMMARY:
rebrand dirTalk to Websphere Voice Response
PROBLEM CONCLUSION:
product rebranded
------
APAR: IY31220 COMPID: 5724C3505 REL: 310
ABSTRACT: REBRAND DIRECTTALK TO WEBSPHERE VOICE RESPONSE
PROBLEM DESCRIPTION:
Rebranding of DirectTalk to Websphere Voice Response
PROBLEM SUMMARY:
rebrand dirTalk to Websphere Voice Response
PROBLEM CONCLUSION:
product rebranded
------
APAR: IY31221 COMPID: 5724C3505 REL: 310
ABSTRACT: REBRAND DIRECTTALK TO WEBSPHERE VOICE RESPONSE
PROBLEM DESCRIPTION:
Rebranding of DirectTalk to Websphere Voice Response
PROBLEM SUMMARY:
rebrand dirTalk to Websphere Voice Response
PROBLEM CONCLUSION:
product rebranded
------
APAR: IY31222 COMPID: 5724C3505 REL: 310
ABSTRACT: REBRAND DIRECTTALK TO WEBSPHERE VOICE RESPONSE
PROBLEM DESCRIPTION:
Rebranding of DirectTalk to Websphere Voice Response
PROBLEM SUMMARY:
rebrand dirTalk to Websphere Voice Response
PROBLEM CONCLUSION:
product rebranded
------
APAR: IY31225 COMPID: 5724C3505 REL: 310
ABSTRACT: REBRAND DIRECTTALK TO WEBSPHERE VOICE RESPONSE
PROBLEM DESCRIPTION:
Rebranding of DirectTalk to Websphere Voice Response
PROBLEM SUMMARY:
rebrand dirTalk to Websphere Voice Response
PROBLEM CONCLUSION:
product rebranded
------
APAR: IY31250 COMPID: 5765B9501 REL: 340
ABSTRACT: MMFS: FCNTL LOCK LOOPING ON A NODE
PROBLEM DESCRIPTION:
mmfs hanging in fcntl lock on one node while trying to revoke
from another node that had already relinquished that token, but
had forgotten to tell the token manager.
PROBLEM SUMMARY:
fixed multi-node fcntl token locking condition.
PROBLEM CONCLUSION:
always relinquish down to nl in revoke
handler when byte range tokens are unknown.
------
APAR: IY31258 COMPID: 5765B8100 REL: 220
ABSTRACT: INTERMITTENT ERROR 17038 ON FXS/LOOP START PROTOCOL
PROBLEM DESCRIPTION:
INTERMITTENT ERROR 17038 ON FXS/LOOP START PROTOCOL caused
by a very short ring which immediately goes to idle again.
PROBLEM SUMMARY:
INTERMITTENT ERROR 17038 ON FXS/LOOP START
PROTOCOL
PROBLEM CONCLUSION:
Corrected reset to idle startus routines
------
APAR: IY31300 COMPID: 5765B8100 REL: 220
ABSTRACT: ISDN_CALL_TRANSFER CS FAILS TO TRANSFER CALLS. 20501 TRACE
PROBLEM DESCRIPTION:
The ISDN_Call_Transfer custom server occasionally gets into a
bad state and will no longer transfer calls. Each attempt to
transfer a call will result in an errorid 20501 being generated,
with the following error text:
ISDN_Call_Transfer: MakeCallStatus is not valid at this time.
PROBLEM SUMMARY:
ISDN_CALL_TRANSFER CS FAILS TO TRANSFER CALLS.
ERRORID 20501
PROBLEM CONCLUSION:
Layer 4 was fixed so that the TERMINATE_CNF
is always sent freeing up the CHP properly when the transfer is
complete; or when it fails due to one of the parties involved
in the transfer hanging up.
------
APAR: IY31316 COMPID: 5765E6110 REL: 220
ABSTRACT: NEED TO TRACE SELECT() ERRORS
PROBLEM DESCRIPTION:
need to trace select() errors
PROBLEM SUMMARY:
When the Resource Managers exit with -1 due to an error
on a select() call, the errno returned by select() is
not recorded. This causes difficulty determining the
cause of the error. The Resource Managers should capture
the errno returned by select().
PROBLEM CONCLUSION:
The Resource Managers have been modified to capture
the errno returned by select().
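
A minimal sketch of capturing the errno from a failed select() call, so the
cause of an exit can be determined afterwards. This is not the Resource
Manager code; it uses Python's select module, and the wait_readable name and
its return convention are hypothetical.

```python
import errno
import os
import select


def wait_readable(fds, timeout):
    """Return (readable_fds, captured_errno); errno is None on success."""
    try:
        readable, _, _ = select.select(fds, [], [], timeout)
        return readable, None
    except OSError as exc:
        # Record the errno instead of exiting with a bare -1.
        print("select() failed: errno=%d (%s)"
              % (exc.errno, os.strerror(exc.errno)))
        return [], exc.errno


if __name__ == "__main__":
    r, w = os.pipe()
    os.close(r)
    os.close(w)
    # Selecting on a closed descriptor fails; the errno is captured.
    _, err = wait_readable([r], 0)
    print("captured:", errno.errorcode.get(err))
```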
------
APAR: IY31327 COMPID: 5765E6110 REL: 220
ABSTRACT: PROGRAMMING ERROR WITH CLEANING UP OPENED SOCKET DURING
PROBLEM DESCRIPTION:
programming error dealing with cleaning up opened socket
during ctcasd daemon shut down.
PROBLEM SUMMARY:
The problem was a programming error in a section of the code
that deals with cleaning up the opened socket when the
ctcasd daemon is shutting down
PROBLEM CONCLUSION:
The fix corrects the manner in which the shutdown
routine is called while ctcasd is shutting down.
------
APAR: IY31328 COMPID: 5765E6110 REL: 220
ABSTRACT: 384 WAY D/S:HAGSGLSM FAILED WHEN COULD NOT OPEN NEW
PROBLEM DESCRIPTION:
384 way D/S:hagsglsm failed when could not open new log file
PROBLEM SUMMARY:
The current hags daemon will die when the log
directory is full so that a new log file cannot be
opened. Currently it will keep all possible
hags daemon and hagsglsm daemon log files if
the size doesn't exceed the limit. So in a system
with many nodes the log directory will become
full very quickly.
PROBLEM CONCLUSION:
The solution is as follows:
1. Whether or not the directory is full, keep only
3 different kinds of incarnation log files for both
the hags daemon and the hagsglsm daemon.
2. If the daemon cannot open a new log file, it
will not die. It will record that opening a new
log file failed and try again until a maximum
number of attempts is reached.
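
The second point can be sketched as follows. This is an illustration only,
not the hags daemon code: the LogOpener class, its max_attempts default and
the injectable opener parameter are all hypothetical.

```python
class LogOpener:
    """Keeps running when a log file cannot be opened, with bounded retries."""

    def __init__(self, max_attempts=3, opener=open):
        self.max_attempts = max_attempts
        self.opener = opener        # injectable so failures can be simulated
        self.failed_attempts = 0
        self.handle = None

    def try_open(self, path):
        """Try to open the log; return True once a handle is available."""
        if self.handle is not None:
            return True
        if self.failed_attempts >= self.max_attempts:
            return False            # give up retrying, but do not exit
        try:
            self.handle = self.opener(path, "a")
            return True
        except OSError:
            # Mark the failure and let the caller try again later.
            self.failed_attempts += 1
            return False
```

The caller would invoke try_open each time it wants to write a log record and
carry on without logging while it returns False.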
------
APAR: IY31329 COMPID: 5765E6110 REL: 220
ABSTRACT: CT:LX RMCAPI SOMETIMES DOES NOT CLOSE SESSION FILE
PROBLEM DESCRIPTION:
ct:lx rmcapi sometimes does not close session file
descriptor
PROBLEM SUMMARY:
When the RMC daemon closes a connection with a client in a
narrow window, the RMCAPI fails to close its file
descriptor, possibly causing the process to run out of file
descriptors.
PROBLEM CONCLUSION:
The RMCAPI has been corrected to close its file descriptor
when the RMC daemon breaks its connection.
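
The general pattern of releasing the local descriptor when the peer breaks
the connection can be sketched as below. This is not the RMCAPI
implementation; it uses plain Python sockets, and read_or_close is a
hypothetical helper name.

```python
import socket


def read_or_close(sock):
    """Read from the peer; if the peer has closed, close our side too."""
    data = sock.recv(4096)
    if data == b"":
        # Empty read means the peer closed the connection. Close our
        # descriptor so the process does not leak it.
        sock.close()
        return None
    return data


if __name__ == "__main__":
    client, daemon = socket.socketpair()
    daemon.close()                     # the "daemon" breaks the connection
    print(read_or_close(client))       # None: the client side was closed
    print(client.fileno())             # -1 once the socket is closed
```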
------
APAR: IY31383 COMPID: 5765B9501 REL: 340
ABSTRACT: FCNTL LOCKS NOT CLEANED UP ON MMFS DEATH
PROBLEM DESCRIPTION:
fcntl locks not cleaned up on mmfs death
PROBLEM SUMMARY:
Fixed GPFS recovery condition
PROBLEM CONCLUSION:
kxRecLockReset should process all filesystems even if they
have been marked unmounted previously during shutdown.
------
APAR: IY31389 COMPID: 5765B9501 REL: 340
ABSTRACT: ASSERT !"NEW_DELETE_DEBUG", NEWDEBUG.C, LINE 176
PROBLEM DESCRIPTION:
assert !"new_delete_debug", newdebug.c, line 176
PROBLEM SUMMARY:
Fixed failure in mmrestripefs
PROBLEM CONCLUSION:
Realloc code needs to verify the configuration is correct
before updating the disk effort counters.
------
APAR: IY31462 COMPID: 5765E6110 REL: 110
ABSTRACT: CT: 64BIT KEYFILE PROBLEM
PROBLEM DESCRIPTION:
ct: 64-bit keyfile problem
PROBLEM SUMMARY:
32-bit and 64-bit RSCT based applications record security
keyfiles of differing sizes. A keyfile created by a 32-bit
RSCT application cannot be used by a 64-bit RSCT
application, and vice versa.
PROBLEM CONCLUSION:
Memory offset calculations within the code have been
repaired to remove the problem.
TEMPORARY FIX:
Execute 32-bit versions of the RSCT daemons and libraries.
32-bit RMC client applications will execute correctly.
Restrict 64-bit RMC applications from executing until the
repair is applied.
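
The APAR does not say which fields caused the size difference. One common way
such a mismatch arises is a record layout that depends on native type sizes
and alignment; the sketch below illustrates that general issue only. The
record fields (a version and a key identifier) and all names are hypothetical
and are not the RSCT keyfile format.

```python
import struct

# Native layout: an int followed by a C long. Its size depends on whether
# the process is 32-bit or 64-bit and on the platform's alignment rules.
NATIVE_FMT = "@il"

# Fixed layout: 4-byte int followed by 8-byte int, little-endian, no
# padding. The size is the same for every process that reads or writes it.
PORTABLE_FMT = "<iq"


def pack_key(version, key_id):
    """Serialize a (hypothetical) key record with the fixed layout."""
    return struct.pack(PORTABLE_FMT, version, key_id)


def unpack_key(blob):
    """Read back a record written by pack_key."""
    return struct.unpack(PORTABLE_FMT, blob)


if __name__ == "__main__":
    print("native record size:", struct.calcsize(NATIVE_FMT))
    print("portable record size:", struct.calcsize(PORTABLE_FMT))
```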
------
APAR: IY31611 COMPID: 5765D5100 REL: 340
ABSTRACT: LATEST PSSP 3.4.0 FIXES AS OF MAY 2002
PROBLEM DESCRIPTION:
This is the latest PSSP ptf as of May 2002.
Order this apar to get all of the ptfs as of May 2002.
PROBLEM SUMMARY:
This is a packaging apar for PSSP 3.4.0 fixes
as of May 2002.
PROBLEM CONCLUSION:
This is a packaging apar for PSSP 3.4.0
fixes as of May 2002.
------
APAR: IY31742 COMPID: 5765E2600 REL: 502
ABSTRACT: U483370 TOC FILE INVALID
PROBLEM DESCRIPTION:
TOC file shipped with PTF U483370 is invalid
LOCAL FIX:
Copy PTF to local directory and create new TOC file.
PROBLEM SUMMARY:
The u483370.toc file contains an error.
PROBLEM CONCLUSION:
This will be fixed in an upcoming PTF.
TEMPORARY FIX:
Generate a new .toc file using inutoc.
------
APAR: IY31743 COMPID: 5765D5100 REL: 311
ABSTRACT: LATEST PSSP 3.1.1 FIXES AS OF JUNE 2002.
PROBLEM DESCRIPTION:
This is the latest PSSP ptf as of June 2002.
Order this apar to get all of the ptfs as of June 2002.
PROBLEM SUMMARY:
This is the latest PSSP ptf as of June 2002.
------
APAR: IY31763 COMPID: 5765B8100 REL: 220
ABSTRACT: CORRECT ERROR 20501 ON EARLY HANGUP OF ORIGINATING CALLER
PROBLEM DESCRIPTION:
Prevent error 20501 on early hangup of originating caller.
This error is caused by the originating caller hanging up just
as the trombone connects resulting in error on line 1022
of IBM_Trombone_CS_function.c with State = 7 and event = 4.
PROBLEM SUMMARY:
Prevent error 20501 on early hangup of
originating caller.
This error is caused by the originating caller hanging up just
as the trombone connects resulting in error on line 1022
of IBM_Trombone_CS_function.c with State = 7 and event = 4.
PROBLEM CONCLUSION:
Changed code to not issue an
error for MakeCallStatus State DISCONNECTED.
------
APAR: IY31798 COMPID: 5765D5100 REL: 320
ABSTRACT: LATEST PSSP 3.2.0 FIXES AS OF JUNE 2002
PROBLEM DESCRIPTION:
This is the latest PSSP ptf as of June 2002.
Order this apar to get all of the ptfs as of June 2002.
PROBLEM SUMMARY:
This is a packaging apar for PSSP 3.2.0 fixes
as of June 2002
------