From: AIX Service Mail Server (aixservaustin.ibm.com)
Date: Tue May 07 2002 - 02:37:08 CDT

    APAR: IY12345 COMPID: 5698PDD00 REL: 360
    ABSTRACT: MAKING SERVER SIDE SCRIPT-FILTERING CONFIGURABLE

    PROBLEM DESCRIPTION:
    making server side script-filtering configurable

    PROBLEM CONCLUSION:
    Server side script-filtering is now configurable for
    junctions with scripting support (-j option in junctioncp).
    To disable filtering within scripting support (filtering
    is only one of the scripting support functions), the
    user can add this line to the wand stanza in iv.conf:
     wand
    script-filtering = no

    ------

    APAR: IY25153 COMPID: 5622DJX00 REL: 211
    ABSTRACT: PTF USED FOR DATAJOINER 211 FOR AIX

    PROBLEM DESCRIPTION:
    PTF used for DataJoiner 211 for AIX

    ------

    APAR: IY28176 COMPID: 5765D9300 REL: 320
    ABSTRACT: ROUTINE MPI_WTIME RETURNS INCORRECT RESULT ON -Q64 -QREALSIZE=8

    PROBLEM DESCRIPTION:
    The routine MPI_Wtime produces incorrect results when the
    program is compiled with the options -q64 and -qrealsize=8.

    LOCAL FIX:
    omit -q64 and -qrealsize=8

    PROBLEM SUMMARY:
    The problem was solved by changing the mpif.h file to
    reflect the correct return value type for mpi_wtime and
    mpi_wtick functions. Also the mpi.mod files were generated
    from these new header files.

    PROBLEM CONCLUSION:
    The problem arose because internally the return value of
    mpi_wtime was declared as double precision when it should
    have been real*8. The same was true of mpi_wtick. Because
    of this, when the -qrealsize=8 option was supplied, a
    mismatch occurred between the two representations of
    reals and incorrect results were returned.

    ------

    APAR: IY28351 COMPID: 5765D5100 REL: 340
    ABSTRACT: SP SWITCH 2 DAEMON HAS INVALID DEPENDENCY ON SWITCH NODE NUM 0.

    PROBLEM DESCRIPTION:
    The SP Switch 2 fault-service daemon has a dependency on the
    existence of switch node number 0. (Switch node number 0 doesn't
    have to be initialized on the switch, but it must exist in the
    SDR.) This is an invalid dependency because a customer can
    remove the node that is assigned switch node number 0.

    LOCAL FIX:
    Add the node that is assigned switch node number 0 back to
    the system.

    PROBLEM SUMMARY:
    On an SP_Switch2, if a customer deletes a node that has
    been assigned switch node number 0, Estart will fail.
    You will see the following message in the flt file on
    the primary node:
    CSswitchInit: CSworm_bfs_phase1() failed with rc=-51

    PROBLEM CONCLUSION:
    During phase 1 switch initialization, an assumption was
    made that there always was a node that was assigned
    switch node number 0 (zero). This is not a valid
    assumption. Phase 1 initialization has been changed to
    use the primary node's switch node number, instead of
    the value 0, as the starting point for exploring the
    network.

    ------

    APAR: IY28505 COMPID: 5765D9300 REL: 320
    ABSTRACT: FAULTY MPI_CART_SHIFT ROUTINE

    PROBLEM DESCRIPTION:
    A Fortran MPI program, when run on 4 processors
    poe ./topol -procs 4 -hostfile ../psi.test/host.lis
    produces the output
      myid= 0 myidm,p= 3 1 MPI_PROC_NULL= -3
      myid= 1 myidm,p= 0 2 MPI_PROC_NULL= -3
      myid= 2 myidm,p= 1 3 MPI_PROC_NULL= -3
      myid= 3 myidm,p= 2 -3 MPI_PROC_NULL= -3
    The correct output should be:
      myid= 0 myidm,p= 3, 1 MPI_PROC_NULL= -3
      myid= 1 myidm,p= 0, 2 MPI_PROC_NULL= -3
      myid= 2 myidm,p= 1, 3 MPI_PROC_NULL= -3
      myid= 3 myidm,p= 2, 0 MPI_PROC_NULL= -3

    PROBLEM SUMMARY:
    Fixed the problem by changing the code in the mpi_cart_sub
    routine so that the information on whether a dimension is
    periodic is preserved in the new topology from the old
    communicator. With this change mpi_cart_shift produces the
    expected results.

    PROBLEM CONCLUSION:
    The problem in this case was that the mpi_cart_sub call was
    not saving the information about whether the original
    dimensions were periodic. As a result, a subsequent call to
    the mpi_cart_shift routine did not produce the desired
    results. When the new dimension was shifted, some tasks ended
    up being pushed out because the dimension was mistakenly
    marked as non-periodic in the mpi_cart_sub routine.
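
    The effect of the periodic flag on a shift can be sketched in a
    few lines of Python. This is only an illustrative model of a
    one-dimensional shift, not the library's code; the function name
    cart_shift_1d is hypothetical, and MPI_PROC_NULL = -3 is simply
    the value printed in the report above.

```python
MPI_PROC_NULL = -3  # value printed in the report above; implementation-specific


def cart_shift_1d(rank, size, disp, periodic):
    """Return (source, dest) for a shift of `disp` along one dimension."""
    def neighbor(r):
        if periodic:
            return r % size  # wrap around the ring
        # Non-periodic: ranks that fall off either end have no neighbor.
        return r if 0 <= r < size else MPI_PROC_NULL

    return neighbor(rank - disp), neighbor(rank + disp)


if __name__ == "__main__":
    for rank in range(4):
        print(rank, cart_shift_1d(rank, 4, 1, periodic=True))
```

    With periodic=True, task 3 gets task 0 as its upper neighbor;
    with periodic=False it gets MPI_PROC_NULL, which is the symptom
    seen when the periodicity information was lost.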

    ------

    APAR: IY28533 COMPID: 5765D5100 REL: 340
    ABSTRACT: NODE NUMBERS NOT BEING IDENTIFIED PROPERLY DURING SETUP_CWS

    PROBLEM DESCRIPTION:
    In PSSP 3.4, if a customer adds a node whose node number is part
    of an already existing node number (such as 17 and 117),
    setup_CWS may fail to properly create a new krb-srvtab.
    A similar problem exists with mknimres, which is being handled
    in IY27621.

    LOCAL FIX:
    Manually create the krb-srvtab file.

    PROBLEM SUMMARY:
    The mknimres part of this was addressed under IY27621.
    We will only address the setup_CWS bug here.
    Under the following conditions setup_CWS will fail to
    generate a new-srvtab file for node A:
    node A's node number is a substring of node B's
    (e.g. node 1 and node 11)
    In the SDR Adapter Class, the entry for node B's en0
    appears BEFORE node A's. This would normally only happen
    if node B is added first.
    The code is checking the list to see if the node number
    is already in there. But if a node is already in the list
    whose number is a superstring of the node it is looking for,
    it will think it has a match when it shouldn't, and the node
    won't get added to the list. The result is a new-srvtab file
    won't get generated for the node.

    PROBLEM CONCLUSION:
    The code has been modified to search the list for the
    node number surrounded by spaces. This will ensure
    an exact match only. The list is built in such a
    manner that there will always be spaces around each
    node number.
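
    The difference between the substring test and the space-delimited
    test can be sketched as follows. This is a minimal Python
    illustration, not the setup_CWS code itself; the function names
    are hypothetical.

```python
def build_list(nodes):
    """Build a list string with a space on both sides of every node number."""
    return " " + " ".join(str(n) for n in nodes) + " "


def contains_substring(node_list, node):
    """Old behaviour: plain substring search, so 1 'matches' 11 or 117."""
    return str(node) in node_list


def contains_exact(node_list, node):
    """Fixed behaviour: search for the number surrounded by spaces."""
    return f" {node} " in node_list


if __name__ == "__main__":
    node_list = build_list([11, 117])
    print(contains_substring(node_list, 1))  # false positive
    print(contains_exact(node_list, 1))      # node 1 is correctly not found
```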

    ------

    APAR: IY28535 COMPID: 5765D5100 REL: 340
    ABSTRACT: MKNIMRES DOES NOT HANDLE UMASK=077

    PROBLEM DESCRIPTION:
    If umask is 077, then mknimres creates pssplpp directory that
    cannot be exported.

    LOCAL FIX:
    change umask or change perms on directory manually after failure

    PROBLEM SUMMARY:
    When a node is a BIS server for another node,
    and a customer's umask setting is 077, the
    /spdata/sys1/install/pssplpp/PSSP-3.4 directory
    is created incorrectly on the BIS node.
    The /spdata directory should be created with
    permissions of 755, however, with the umask
    setting of 077 it is created with permissions
    of 2740. This subsequently makes customization
    fail because the file
    /spdata/sys1/install/pssplpp/PSSP-3.4/pssp.installp
    cannot be seen or executed.
    /usr/lpp/ssp/install/bin/makedir was changed to
    add a umask setting of 022, which enables mknimres
    to create the /spdata/sys1/install/pssplpp/PSSP-3.4
    directory with permission of 755 as intended.
    After completion of makedir, the customer's umask
    setting will be returned to its original setting.

    PROBLEM CONCLUSION:
    /usr/lpp/ssp/install/bin/makedir was changed to
    add umask(022). Since makedir is only called by
    mknimres, this will correct all occurrences
    of creating a directory with incorrect permissions
    due to customer's umask setting differing from
    what we expect. Subsequently, when the makedir script
    ends, the umask setting will be returned to the
    customer's original setting.
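
    The set-then-restore pattern described above can be sketched in
    Python. This is an illustration only, not the makedir script;
    the helper name makedir_with_mode is hypothetical.

```python
import os


def makedir_with_mode(path, mode=0o755):
    """Create `path` under a known umask, then restore the caller's umask."""
    old_umask = os.umask(0o022)  # os.umask returns the previous value
    try:
        os.makedirs(path, mode=mode)
    finally:
        os.umask(old_umask)  # put the caller's setting back
```

    Because the previous umask is restored in the finally clause,
    the caller's setting (for example 077) is unchanged after the
    call, even if directory creation fails.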

    ------

    APAR: IY28578 COMPID: 5765D5100 REL: 340
    ABSTRACT: THE PERSPECTIVE TABLE VIEW HAS A PROBLEM WITH SWITCH_RESPONDS

    PROBLEM DESCRIPTION:
    The table view has a problem with switch_responds on
    switchless systems. The column where switch_responds
    WOULD be displayed, and any columns after it (to its
    right), are blank.

    LOCAL FIX:
    Modified .sphardwaremonitorTableView and put it on
    /usr/lpp/ssp/perspectives/profiles/en_US. (current $LANG=en_US)
    It swaps the position of switch_responds and Environment LED,
    making switch_responds last so there is nothing to the right
    of it for it to mess up.

    PROBLEM SUMMARY:
    The code was not properly handling a non-existent attribute,
    for example "switch responds" in a switchless system.
    This was causing the attribute's column and all columns
    to its right to not appear in the table.
    Normally a non-existent attribute would not be present in
    any profile. However, "switch responds" is a member of the
    default table profile.

    PROBLEM CONCLUSION:
    The code has been modified to check for and skip
    non-existent attributes when it creates the list
    of columns for the table.
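
    A minimal sketch of the check-and-skip approach, in Python. The
    function name and the attribute names in the usage lines are
    hypothetical; the real Perspectives code is not shown in this
    report.

```python
def build_columns(profile_attrs, existing_attrs):
    """Keep profile attributes in order, skipping ones the system lacks."""
    existing = set(existing_attrs)
    return [attr for attr in profile_attrs if attr in existing]


if __name__ == "__main__":
    profile = ["host_responds", "switch_responds", "env_led"]
    switchless = ["host_responds", "env_led"]
    print(build_columns(profile, switchless))
```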

    ------

    APAR: IY28615 COMPID: 5765D5100 REL: 340
    ABSTRACT: CSSADM CORE DUMP

    PROBLEM DESCRIPTION:
    In case of a clock problem on the switch the cssadm
    recovery action may cause cssadm to dump core.
    The srcmstr then starts a new cssadm which starts
    over with the recovery (basically running Eclock+Estart).
    The cssadm dumps core after the Eclock and before the
    Estart. As each new cssadm does an Eclock the switch
    never gets up again.
    The only work around is to stop cssadm, run Eclock -d
    followed by an Estart.

    PROBLEM SUMMARY:
    Core dump in the Switch Admin Daemon (cssadm)
    when global switch recovery is turned on and Eclock
    is run.
    The core dump is due to an overflow of a buffer that
    is used to generate shell commands; the core dump may
    occur when using long hostnames. Also, on some cssadm
    messages, there is a mismatch between the format
    specification (%s) and the corresponding parameter
    (which is defined as int). The result is that the data
    (node number, for example) is not displayed.

    PROBLEM CONCLUSION:
    The buffer for the generated commands was increased
    from 120 to 256 bytes. The message text for information
    messages 168 and 169 was changed to display the node's
    hostname so as to match the format specification (%s).

    ------

    APAR: IY28755 COMPID: 5765D5100 REL: 340
    ABSTRACT: CSS_TYPE NOT UPDATED PROPERLY IN SDR ADAPTER CLASS WHEN

    PROBLEM DESCRIPTION:
    When migrating directly from PSSP 311 to PSSP 34, the css_type
    attribute will not be updated in the Adapter class of the SDR.

    LOCAL FIX:
    1) Manually update the Adapter class with the correct css_type
       using SDRChangeAttrValues.
    2) Run SDR_config.
    3) Customize all nodes to update their css ODM entries.

    PROBLEM SUMMARY:
    The old css_type list was taken out of the code for PSSP 3.4
    and replaced with two new lists - one for SP-Switch and one
    for the SP2-Switch. But the code that fills in the css_type
    on migrations to PSSP 3.2 (and all later releases) was still
    looking for the old list. This code is also run on new
    installs.

    PROBLEM CONCLUSION:
    The code in SDR_init was modified to use the new tables to
    determine the css_type. SDR_init will run
    automatically on installation of this APAR.

    ------

    APAR: IY28837 COMPID: 5765D5100 REL: 340
    ABSTRACT: PSSP_SCRIPT VERIFY_QUORUM FUNCTION BREAKS WHEN LVSG IS EXECUTED

    PROBLEM DESCRIPTION:
    According to the PSSP admin guide, when the SDR is initialized, a
    single volume group object is created with the name of "rootvg"
    for each node. However, users are allowed to create additional
    volume group objects to represent alternate volume groups for
    a node. In this case, when pssp_script runs, the function
    verify_quorum fails with error code "516-306: Unable to find
    volume group <vg name> in the Device Configuration Database".
    The reason is that the vg name is known only to the SDR and
    not to the operating system. For example, the customer ran
    spbootins -c rootvg51 ... to set a node to install. The
    /tftpboot/<node_name>.config_info file gets built based on the
    volume group object specified. When verify_quorum does an lsvg
    rootvg51, the volume group does not really exist on the machine,
    even though the label is valid. The pssp_script fails as a
    result with a return code of 1.

    LOCAL FIX:
    Offered the following as a possible circumvention:
    >modify the node's config_info file and specify the name of
    a volume group that's known to the OS:
    -edit /tftpboot/<node_name>.config_info file
    -instead of rootvg51, change it to rootvg.
    -run /tmp/pssp_script on the node again.

    PROBLEM SUMMARY:
    ***********************************************************
    * USERS AFFECTED: *
    * *
    * Users at the following levels or higher, *
    * ssp.basic 3.2.0.15 *
    * ssp.basic 3.3.0.2 *
    * ssp.basic 3.4.0.1 *
    * who are installing or migrating a node for which the *
    * selected volume group is named something other than *
    * rootvg. *
    * *
    ***********************************************************
    * PROBLEM DESCRIPTION: *
    * *
    * When installing or migrating a node when the selected *
    * volume group is not named rootvg, pssp_script will fail *
    * with a message that the lsvg command failed. *
    * *
    ***********************************************************
    * RECOMMENDATION: *
    * *
    * Install the appropriate APAR for your release of PSSP, *
    * when available. *
    * *
    * APAR IY29411, currently targeted for *
    * ssp.basic 3.2.0.19 on PTF Set 19. *
    * *
    * APAR IY28837, currently targeted for *
    * ssp.basic 3.3.0.6 on PTF Set 8. *
    * *
    * APAR IY29410, currently targeted for *
    * ssp.basic 3.4.0.7 on PTF Set 8. *
    * *
    * Until the APAR for your release is available, prior to *
    * issuing nodecond to begin the installation of the node, *
    * edit the file /tftpboot/node_hostname.config_info and *
    * change the name of the selected volume group to rootvg. *
    * *
    ***********************************************************

    ------

    APAR: IY28873 COMPID: 5765D5100 REL: 340
    ABSTRACT: CFGCOL & CFGCOR DON'T UNREGISTER W/ CSS PDD IN SOME EXIT PATHS

    PROBLEM DESCRIPTION:
    cfgcol and cfgcor have exit paths where they do not de-register
    with the CSS pseudo device driver (pdd) for Communication Matrix
    (CM) updates. This leaves a stale entry in the pdd data struct
    that maintains the list of clients registered for CM updates.
    The results are indeterminate, but one possible result is a node
    crash.

    PROBLEM SUMMARY:
    The Colony and Corsair configuration programs (cfgcol
    and cfgcor) have error exit paths where they do not
    unregister the adapter with the CSS pseudo device driver
    (pdd) table. This situation could arise, for example,
    if post diagnostics fail. This may leave a stale entry
    in the pdd data structure that maintains the list of
    clients registered for Communication Matrix (CM) updates.
    The results are indeterminate, but one possible result
    is a node crash.

    PROBLEM CONCLUSION:
    If there are errors in the configuration program
    (cfgcol or cfgcor) of the Colony or Corsair adapter
    after the device is registered with the CSS pseudo device
    driver, the adapter will be unregistered before the
    configuration program exits.

    ------

    APAR: IY28877 COMPID: 5765E6900 REL: 310
    ABSTRACT: SYNTAX ERR IN CONFIG CAUSES NEGOTIATOR/ALL DEAMONS DOWN

    PROBLEM DESCRIPTION:
    A syntax error in a user-defined keyword causes it to run into
    the Start expression, corrupting it. The corrupted data is sent
    to the Negotiator, causing it (and all daemons) to crash.

    PROBLEM SUMMARY:
    A missing closing parenthesis in an expression on one node
    can bring down the negotiator.

    PROBLEM CONCLUSION:
    There are two routines that evaluate expressions. One of
    them was fixed in the LoadL 2.2 GA code, and the other one
    needs the same fix applied to it: report a clean error on
    a missing closing parenthesis.
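
    The idea of failing cleanly on an unbalanced expression can be
    sketched in Python. This is not LoadLeveler's evaluator; the
    function name is hypothetical, and the sketch deliberately
    ignores parentheses inside quoted strings.

```python
def check_parentheses(expr):
    """Raise ValueError if the parentheses in `expr` are unbalanced."""
    depth = 0
    for pos, ch in enumerate(expr):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError(f"unexpected ')' at position {pos}")
    if depth > 0:
        raise ValueError(f"{depth} closing parenthesis(es) missing")
```

    Rejecting the expression where it is read keeps a malformed
    keyword from running into the expression that follows it.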

    ------

    APAR: IY29067 COMPID: 5765D5100 REL: 340
    ABSTRACT: ECLOCK -R DOES NOT CHANGE CLOCK SETTINGS

    PROBLEM DESCRIPTION:
    On PSSP 320 Eclock -r does not change clock settings on the
    SP Switch.

    LOCAL FIX:
    Use Eclock -f <Clock topology file> or the -d option to clock
    your system.

    PROBLEM SUMMARY:
    Eclock -r now makes the correct determination on the switch
    type.

    PROBLEM CONCLUSION:
    When specifying the -r option on Eclock there was no check
    to set the switch type before testing it. Eclock -r was not
    running the full Eclock process.

    ------

    APAR: IY29068 COMPID: 5765D5100 REL: 340
    ABSTRACT: DIAG -C -D CSS0 BRINGS UP PROBLEM DETERMINATION SCREEN

    PROBLEM DESCRIPTION:
    Executing diag -c -d css0 should not take the user to the ELA
    screens; it appears the css0 diagnostic method does not use the
    -c flag anymore.
    I think we want to check DA_CONSOLE_TRUE before calling ela_run.
    damode bits when running diags -c:
    DA_CONSOLE_FALSE 0x00080000
    damode bits when running diags without the -c flag:
    DA_CONSOLE_TRUE 0x00040000

    PROBLEM SUMMARY:
    When running diags -c -d css0, the diag method was calling
    diagrpt without first checking for a console causing the
    user to be prompted.

    PROBLEM CONCLUSION:
    The diag method first checks for a console before running
    diagrpt.
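
    The console check amounts to testing one bit of the damode word.
    A small Python sketch, using the two bit values quoted in the
    problem description; the function name is hypothetical and the
    real diagnostic method is not shown in this report.

```python
DA_CONSOLE_FALSE = 0x00080000  # set when diag is run with -c (no console)
DA_CONSOLE_TRUE = 0x00040000   # set when diag is run without -c


def console_available(damode):
    """Return True only if the console bit is set in the damode word."""
    return bool(damode & DA_CONSOLE_TRUE)


if __name__ == "__main__":
    # Interactive reporting would be attempted only when this is True.
    print(console_available(DA_CONSOLE_FALSE))
    print(console_available(DA_CONSOLE_TRUE))
```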

    ------

    APAR: IY29072 COMPID: 5765E7200 REL: 310
    ABSTRACT: SMIT: CANNOT CHANGE OR REMOVE SHARES WITH LARGE # OF SHARES

    PROBLEM DESCRIPTION:
    Problem occurs on SMITTY during "Change Share" or "Delete
    Share". If there are a large number of shares, smitty will
    display
            '1800-051 There are no items of this type.' instead of
    showing the list of shares (to select from).

    PROBLEM SUMMARY:
    Problem occurs on SMIT during "Change Share" or "Delete Share".
    If there are a large number of shares, smitty will display
    '1800-051 There are no items of this type.'
    instead of showing the list of shares (to select from).
    (Problem is due to "net share /infolevel:99" showing different
     output for more than 300 shares.)

    PROBLEM CONCLUSION:
    Fix "net share /infolevel:99" to show same output even when
    number of shares exceeds 300.

    ------

    APAR: IY29155 COMPID: 5765B9500 REL: 150
    ABSTRACT: MMCHFS DOES NOT CHECK FOR EXISTANCE OF SPECIAL DEVICE FILE FOR

    PROBLEM DESCRIPTION:
    When using mmchfs to change nodeset for filesystem there is no
    check to see if the device special file for the gpfs filesystem
    already exists on the nodes in the target nodeset. This becomes
    a problem when the customer has a jfs filesystem that has a
    device special file with the same name. The mmchfs command
    completes successfully; however, if the mmchfs command is used
    to bring the gpfs filesystem back to the original nodeset the
    special device file that belongs to the jfs filesystem is
    removed.

    PROBLEM SUMMARY:
    When using mmchfs to change nodeset for
    filesystem there is no check to see if the device special file
    for the gpfs filesystem already exists on the nodes in the
    target nodeset. This becomes a problem when the customer has a
    jfs filesystem that has a device special file with the same
    name. The mmchfs command completes successfully; however, if
    the mmchfs command is used to bring the gpfs filesystem back
    to the original nodeset, the special device file that belongs
    to the jfs filesystem is removed.

    PROBLEM CONCLUSION:
    Do not remove /dev entries if they were not
    created by GPFS.

    ------

    APAR: IY29158 COMPID: 5765B9500 REL: 150
    ABSTRACT: MMCHFS FAILS WITH _MOUNT_CHECK_ONLY ERROR

    PROBLEM DESCRIPTION:
    mmchfs fails with _MOUNT_CHECK_ONLY_error

    PROBLEM SUMMARY:
    Cannot reassign a file system from a nodeset
    where all nodes are unavailable.

    PROBLEM CONCLUSION:
    When moving a file system there is no need
    for the daemon to be running anywhere in the source nodeset.

    ------

    APAR: IY29208 COMPID: 5765E6900 REL: 310
    ABSTRACT: LOADL_STARTER HANGING WITH DEFUNCT CHILD PROCESS

    PROBLEM DESCRIPTION:
    LoadL_starter hanging with defunct child process.
    This hangs the whole job. Sending a SIGKILL to
    the LoadL_starter puts the job into the VACATED
    state and it will be restarted. Sometimes the
    job may run then.

    PROBLEM SUMMARY:
    A child defunct process would be
    produced if the SA_RESETHAND flag is set
    and LoadLeveler had inherited that environment.

    PROBLEM CONCLUSION:
    LoadLeveler would always disable the SA_RESETHAND
    flag so no defunct child process would
    be produced.

    ------

    APAR: IY29210 COMPID: 5765D5100 REL: 340
    ABSTRACT: LONG DEFAULT MSG TO FFDC_STACK_LOG MACRO WILL CORRUPT STACK

    PROBLEM DESCRIPTION:
    The FFDC_STACK_LOG() macro doesn't accommodate a long default msg
    string (argument). A default msg that is too long will cause the
    stack to be corrupted. The results are indeterminate.

    PROBLEM SUMMARY:
    Messages longer than 80 chars sent from the fault service
    daemon are corrupting the FFDC stack.

    PROBLEM CONCLUSION:
    The function that creates the stack record for the fault
    service daemon needs to have the extraneous "Msg not
    found: " text removed allowing 15 additional characters
    to be in the message. Then msgs 2510-606, 2510-203,
    2510-816, 2510-817, and 2510-820 need to be shortened
    so that they do not exceed the new limit of 98 characters.

    ------

    APAR: IY29241 COMPID: 5765D5100 REL: 340
    ABSTRACT: SETUP_CWS IS TRYING TO UPDATE AFS SRVTABS WHEN IT SHOULD NOT

    PROBLEM DESCRIPTION:
    For afs authentication, the srvtab files are only generated when
    the principals are initially created. However as of IY12653
    (63733 in PTF 2) the code is trying to regenerate them every
    time a node is set to customize, install or migrate.

    PROBLEM SUMMARY:
    A change was made to the code in this release such that the
    srvtab files are regenerated every time a node's bootp
    response is set to customize, install, or migrate.
    However, when afs authentication is being used the srvtab
    file(s) can only be generated when the principals are
    originally created. The result is setup_server failing
    with:
    afs_add_principal:0016-301 Cannot access /tmp/addprin.27490
    setup_CWS: 0016-052 The add_principal command could not
    add service principals to the Kerberos V4 database.
    setup_server: 0016-279 Problem of internally called
    command: /usr/lpp/ssp/bin/setup_CWS; rc= 2.
    setup_server: Processing incomplete (rc= 2).

    PROBLEM CONCLUSION:
    The code has been modified to not regenerate the
    srvtab files when afs authentication is being used.

    ------

    APAR: IY29372 COMPID: 5765D5100 REL: 340
    ABSTRACT: SMITTY SPCHUSER SHOWS TOO MANY (ALL) SECONDARY GROUPS

    PROBLEM DESCRIPTION:
    Customer installed PSSP 3.4 (PTF set 5) and AIX 5.1 (ML 01) on
    their control workstation. They have observed that the 'smitty
    spchuser' command now displays ALL of the groups in the "Secondary
    GROUPS" line. The /etc/group file is correct and the 'lsuser -a
    groups <userid>' and 'smitty chuser' commands show the correct
    outputs. The problem is only observed for the 'smitty spchuser'
    command for changing SP users.
    I have reproduced these observations. On our test system using
    AIX 4.33 ML 9, PSSP 3.2 (ssp.basic 3.2.0.14), I created the user
    "hughey" with primary group "staff" and secondary group "nobody"
    After this, I tried smitty spchuser and observed:
      PRIMARY group [1]
      Secondary GROUPS [staff,nobody]
    I repeated the same procedure on a test system with AIX 5.1.0.15
    and PSSP 3.4.0.3 and observed the following incorrect output:
      PRIMARY group [1]
      Secondary GROUPS [system,staff,bin,sys,adm,..ALL GROUPS HERE]

    LOCAL FIX:
    None. This is a display feature of 'smitty spchuser' which is
    not working correctly. Command 'smitty chuser' circumvents.

    PROBLEM SUMMARY:
    When splsuser is invoked for a userid, only those groups
    which the userid is a member of should be displayed.
    Currently all groups that have members are being displayed.
    The cause of this is the change from perl 4 to perl 5 and
    the way that strings are handled.

    PROBLEM CONCLUSION:
    group.pkg which is used by splsuser and spchuser has been
    modified to check for certain strings not being null as
    opposed to not being defined to handle differences between
    perl 4 and perl 5.

    ------

    APAR: IY29398 COMPID: 5765B9501 REL: 340
    ABSTRACT: KERNEL PANIC DUE TO ASSERT IN SHHASHV.C LINE 1127:LM_HAVE != NL

    PROBLEM DESCRIPTION:
    If a file was changing from datashipping to non-datashipping
    state in the middle of a read/write request, the file was left
    in an unlocked state when returning from dsRdWr, and kSFSRead
    was asserting when trying to upgrade the lock.
    In this case gpfsperf with the -ds option is just cleaning up
    the datashipping state from all the nodes when the next job
    starts reading the same file. This read hits a timing window
    where it gets into the retry-after-DS-just-turned-off code,
    which forgot to relock the file.
    The result of all this is a kernel panic which causes the node
    to crash with this type of traceback:
       mmfs:DoPanic__FPcT1iN23T1+0000EC
       mmfs:logAssertFailed+0000F0
       mmfs:change_lock_vfs__5LkObjFP8CacheObjQ2_5LkObj12LockMode
            EnumT2i+000C
       mmfs:upgradeFileLock__FP15KernelOperationP8OpenFile7
            FileUIDPQ2_5LkOb

    PROBLEM SUMMARY:
    GPFS self check logic detected an error in ShHashV.C, line 1127
    while running jobs using the MPI-IO library.

    PROBLEM CONCLUSION:
    If a file was changing from datashipping to non-datashipping
    state in the middle of a read/write request, the file was
    left in an unlocked state when returning from dsRdWr, and
    kSFSRead was asserting when trying to upgrade the lock. The
    fix is to relock the file before returning EAGAIN.

    ------

    APAR: IY29417 COMPID: 5765B9501 REL: 340
    ABSTRACT: FILESYSTEM HUNG AND LONG WAITERS

    PROBLEM DESCRIPTION:
    filesystem hung and long waiters

    PROBLEM SUMMARY:
    GPFS Deadlock on P690 with AIX 5.1.

    PROBLEM CONCLUSION:
    If a connection is dropped while an incomplete message was
    in progress from the node, the hasData flag must be cleared
    in receiveMsg, or it will loop forever.

    ------

    APAR: IY29442 COMPID: 5765D5100 REL: 340
    ABSTRACT: PARENT/CHILD PROCESSES TIMING PROBLEM FROM SPMON QUERY

    PROBLEM DESCRIPTION:
    Parent/Child processes timing problem from spmon query function

    PROBLEM SUMMARY:
    When spmon does a query it forks a process. But the parent
    does not wait for the child to finish and exits.
    If the parent is faster than the child then it will return
    the user to the command line (print the prompt) before the
    query's output is written. This gives the impression that
    the shell prompt was never returned and that the command is
    hung.

    PROBLEM CONCLUSION:
    The code has been modified so that on a query the parent
    will now wait for the child process to finish before it
    exits and returns the user to the command prompt.
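
    The fork-and-wait pattern described above can be sketched in
    Python on a POSIX system. This is an illustration, not the spmon
    source; run_query and its query_fn argument are hypothetical
    names.

```python
import os
import sys


def run_query(query_fn):
    """Run query_fn in a child process and wait for it before returning."""
    sys.stdout.flush()  # avoid duplicating buffered output in the child
    pid = os.fork()
    if pid == 0:
        # Child: do the work, then leave without running parent cleanup.
        code = 1
        try:
            query_fn()
            sys.stdout.flush()
            code = 0
        finally:
            os._exit(code)
    # Parent: block until the child has finished writing its output.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

    Without the waitpid call the parent could return, and the shell
    could print its prompt, before the child's output appears.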

    ------

    APAR: IY29444 COMPID: 5765B9501 REL: 340
    ABSTRACT: SIGNAL 11 IN FORCEDONERECORDS ON RO MOUNTED FILESYSTEM

    PROBLEM DESCRIPTION:
    Signal 11 in forceDoneRecords on RO mounted filesystem on token
    revoke. token_revoke should not call forceDoneRecords if the
    filesystem logfile pointer is null, which is the case when it is
    mounted RO.

    PROBLEM SUMMARY:
    Signal 11 in forceDoneRecords on RO mounted filesystem on
    token revoke.

    PROBLEM CONCLUSION:
    token_revoke should not call forceDoneRecords if the
    filesystem logfile pointer is null, which is the case when
    it is mounted RO.

    ------

    APAR: IY29456 COMPID: 5765D5100 REL: 340
    ABSTRACT: SETUP_SERVER FAILS IF MULTIPLE OTHER_ADDRS SPECIFIED FOR AN ADAP

    PROBLEM DESCRIPTION:
    setup_server dies when multiple IP addresses are put in
    the other_addrs field in the Adapter SDR class.
    The error message is: 0016-338 Kerberos V4 setup
    was bypassed for network interfaces that could
    not be resolved.
    The other_addrs field is used in a HACMP
    environment.

    LOCAL FIX:
    Change setup_CWS as follows:
    if ($other_addrs ne "\"\"" && $other_addrs ne "") {
        $netaddr=$other_addrs;
        unless (&check_new_name) {
            &exit_setup_CWS($return_code); # exit if errors
        }
    } }
    to:
    if ($other_addrs ne "\"\"" && $other_addrs ne "") {
        foreach $netaddr (split(/,/,$other_addrs)) {
            unless (&check_new_name) {
                &exit_setup_CWS($return_code); # exit if errors
            }
        }
    } }

    PROBLEM SUMMARY:
    setup_CWS was unable to process more than one IP address in
    the other_addrs attribute of the Adapter class. It was
    not parsing the addresses, so it would attempt to perform
    host name resolution on the entire string which would fail.

    PROBLEM CONCLUSION:
    Modified setup_CWS to parse the values in the other_addrs
    attribute of the Adapter class prior to performing host
    name resolution.
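
    The parsing step can be sketched in Python. This mirrors the
    comma split and the empty-value tests shown in the local fix, but
    it is not the setup_CWS code; the function name is hypothetical
    and the addresses in the usage line are examples only.

```python
def parse_other_addrs(other_addrs):
    """Split the other_addrs attribute into individual addresses."""
    # The attribute may be empty or hold a literal pair of double quotes.
    if other_addrs in ("", '""'):
        return []
    return [addr.strip() for addr in other_addrs.split(",") if addr.strip()]


if __name__ == "__main__":
    # Each address would then be resolved on its own, not as one string.
    for addr in parse_other_addrs("192.0.2.10,192.0.2.11"):
        print(addr)
```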

    ------

    APAR: IY29472 COMPID: 5765B9501 REL: 340
    ABSTRACT: GPFS WITH QUOTA ON, IN_DOUBT GROWS TOO QUICKLY ON LARGE SYSTEMS

    PROBLEM DESCRIPTION:
    On large systems where GPFS quota subsystem is turned on,
    the per user in_doubt value grows unchecked.
    The reclaiming of unused in_doubt does not bring down its value
    to an acceptable level.

    LOCAL FIX:
    Running mmcheckquota on a GPFS file system that is offline will
    reclaim all of the in_doubt. Running mmcheckquota on a mounted
    GPFS file system will reduce the in_doubt size but will not
    fully reclaim it.

    PROBLEM SUMMARY:
    The in-doubt value for GPFS file systems using quotas does
    not get reclaimed after a period of inactivity.

    PROBLEM CONCLUSION:
    Provide a way to automatically reclaim unused client
    shares after a period of allocation/deallocation inactivity
    of the user on a client node, so that the overall inDoubt
    for a user does not grow "unboundedly".

    ------

    APAR: IY29570 COMPID: 5765E7200 REL: 310
    ABSTRACT: FC COREDUMP WHEN FAILED TO SETUP NEW SESSION

    PROBLEM DESCRIPTION:
    FC can core dump when it fails to set up a new session for
    a new user.

    PROBLEM CONCLUSION:
    revalide the failed session.

    ------

    APAR: IY29571 COMPID: 5765E7200 REL: 310
    ABSTRACT: NETBIOS DATAGRAM SERVICE IS NOT WORKING

    PROBLEM DESCRIPTION:
    currently the NetBIOS DataGram service is not supported

    PROBLEM CONCLUSION:
    implement the NetBIOS DataGram service

    ------

    APAR: IY29620 COMPID: 5765E7200 REL: 310
    ABSTRACT: MAKE POSSIBLE TO CHANGE PERMISSION OF AIX FAST CONNECT FILE

    PROBLEM DESCRIPTION:
    A user on a PC client cannot change the permissions of a
    file owned by a different user, even with write
    permission on the parent directory.

    PROBLEM CONCLUSION:
    Check for write access on the parent directory and based
    on that allow or disallow to change the file permissions.

    ------

    APAR: IY29622 COMPID: 5765E6900 REL: 310
    ABSTRACT: LL CHANGES TO SUPPORT AIX 5.1.D TECHNICAL LARGE PAGE

    PROBLEM DESCRIPTION:
    ll changes to support AIX 5.1.D Technical Large Page

    ------

    APAR: IY29623 COMPID: 5765E7200 REL: 310
    ABSTRACT: POWERPOINT FILE TIMESTAMP CHANGE WITH WINDOWS 2000

    PROBLEM DESCRIPTION:
    Modification times are not preserved and they change to the
    current time

    PROBLEM CONCLUSION:
    Move the code that saves the timestamp from the FileInstance
    class to the FileEntry class.

    ------

    APAR: IY29683 COMPID: 5765B9501 REL: 340
    ABSTRACT: RECOVERY FAILED AFTER NODE FAILURE DURING RESTRIPE

    PROBLEM DESCRIPTION:
    recovery failed after node failure during restripe

    PROBLEM SUMMARY:
    Recovery of the GPFS file system failed after a node failure
    while running the mmrestripe command.

    PROBLEM CONCLUSION:
    When moving indirect blocks for restripe, a done record must
    be spooled before the disk address is changed. Otherwise, log
    recovery might apply the updates after the indirect block
    has been reused.
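
    Why the ordering matters can be shown with a generic log replay
    sketch. The record layout (tuples of kind, address, value) and
    the replay function are hypothetical and do not reflect the
    actual GPFS log format.

```python
def replay(log, disk):
    """Apply logged updates to disk, skipping those covered by a done record.

    log  -- list of (kind, address, value) tuples in write order (hypothetical)
    disk -- {address: contents}
    """
    # Find the position of the last done record for each address.
    last_done = {}
    for position, (kind, address, _value) in enumerate(log):
        if kind == "done":
            last_done[address] = position
    # Replay only updates logged after the last done record for their address.
    for position, (kind, address, value) in enumerate(log):
        if kind == "update" and position > last_done.get(address, -1):
            disk[address] = value
    return disk


# Block 7 held an indirect block, was moved by restripe, then reused for file data.
with_done = [("update", 7, "indirect block"), ("done", 7, None)]
without_done = [("update", 7, "indirect block")]
print(replay(with_done, {7: "file data"}))
print(replay(without_done, {7: "file data"}))
```

    With the done record in the log, the stale update is skipped and
    the reused block keeps its file data. Without it, replay
    overwrites the reused block with the old indirect block contents.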

    ------

    APAR: IY29694 COMPID: 5765D9300 REL: 320
    ABSTRACT: MPI TASK DUMPS CORE IN COL_READPKT

    PROBLEM DESCRIPTION:
    Signal handling (non-threaded) allows an mpi write
    to be re-entered by the same task causing tasks of
    a POE job to hang or dump core. The stack trace
    of a core dumping task is:
    Segmentation fault in col_readpkt at 0xd0505d88
    0xd0505d88 (col_readpkt+0x12fc) cbea0008 lfd fr31,0x8(r10)
    (dbx) t
    col_readpkt() at 0xd0505d88
    kickpipes() at 0xd04d9870
    mpci_recv(??, ??, ??, ??, ??, ??, ??, ??) at 0xd04f6a70
    barrier_shft_b(??) at 0xd05920e8
    _mpi_barrier(??, ??, ??) at 0xd0591e4c
    MPI__Barrier(??) at 0xd059105c
    mpi__barrier(??, ??) at 0xd011c3ec
    gather_field() at 0x101d3d48
    pp_output_slice() at 0x101d31f4
    pp_output() at 0x101c9e24
    pp_makegribs() at 0x101b97b4
    pporg() at 0x101ab6a8
    progorg() at 0x100c502c
    gmeorg() at 0x1000079c

    PROBLEM SUMMARY:
    When running a non-threaded user space program (MPI signal
    handling library), a thread was re-entering writepkt, driven
    by a signal, causing the program to dump core with
    various errors.

    PROBLEM CONCLUSION:
    When running a non-threaded user space program (signal
    handling MPI library), the thread that is writing will no
    longer re-enter the write routine.
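
    The APAR does not say how re-entry is prevented. One common
    approach, shown here purely as an assumption, is a guard flag
    that makes a re-entrant call queue its packet instead of running
    the write routine again. The PacketWriter class and its on_send
    hook are hypothetical.

```python
class PacketWriter:
    """Write routine that refuses to be re-entered by the same thread."""

    def __init__(self):
        self._writing = False  # True while a write is in progress
        self.deferred = []     # packets queued by re-entrant calls
        self.sent = []         # stands in for the real transport
        self.on_send = None    # hook used to simulate a signal mid-write

    def _send(self, pkt):
        self.sent.append(pkt)
        if self.on_send is not None:
            self.on_send(pkt)

    def write(self, pkt):
        if self._writing:
            # Called again (for example from a signal handler) while a
            # write is active: queue the packet instead of re-entering.
            self.deferred.append(pkt)
            return False
        self._writing = True
        try:
            self._send(pkt)
            # Drain anything queued by re-entrant calls during the send.
            while self.deferred:
                self._send(self.deferred.pop(0))
        finally:
            self._writing = False
        return True
```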

    ------

    APAR: IY29838 COMPID: 5765B9501 REL: 340
    ABSTRACT: MMCHECKQUOTA PRODUCES NEGATIVE NUMBERS

    PROBLEM DESCRIPTION:
    mmcheckquota sometimes produces negative numbers when GPFS is
    under heavy load.

    PROBLEM SUMMARY:
    mmcheckquota sometimes produces negative numbers for disk
    usage.

    PROBLEM CONCLUSION:
    Do not update the server's shadow entries in the ComputeShare
    and Relinquish routines, since the quota usage and quota share
    accounting in this case is done through regular quota
    entries.

    ------

    APAR: IY29867 COMPID: 5765D5100 REL: 340
    ABSTRACT: SETUP_SERVER RETURNS 1 WHEN SPNET_ENX EXISTS ALREADY

    PROBLEM DESCRIPTION:
    setup_server returns 1 when spnet_enx exists already

    PROBLEM SUMMARY:
    On a CWS, after changing the IP address of an external
    ethernet adapter that is not part of the SP LAN,
    setup_server exits with a return code of 1. Before failing
    though, setup_server displays a message letting the user
    know there was a problem defining the associated NIM
    spnet_enX resource (where enX is the en number of the
    external ethernet with the changed IP address). The
    following is an error message put out by mknimint ... it
    is these errors which cause setup_server to truly fail:
    0042-001 nim: processing error encountered on "master":
    0042-032 m_mknet: object name must be unique and
                     "spnet_enX" already exists
    mknimint: 0016-286 The "nim -o define" command had a
             problem defining spnet_enX on <master_hostname>
             with a return code value of 1.

    PROBLEM CONCLUSION:
    The new code within /usr/lpp/ssp/bin/mknimint does a check
    to ensure that the interface associated with the network
    being changed is not referenced by any client other than the
    master. If the master is the only machine which contains the
    changed network then the new code will blank out the nim
    'if' install interface and cabletype in order to remove the
    spnet_enX network. Once the network associated with the
    external ethernet adapter (not part of the SP LAN) is
    removed, the code will drop down into the 'regular' routine
    of defining/redefining the network to nim.

    ------

    APAR: IY29914 COMPID: 5765B9500 REL: 150
    ABSTRACT: MMSTARTUP -W FAILS WHEN THERE IS A SPACE AFTER THE NODE

    PROBLEM DESCRIPTION:
    mmstartup -w fails when there is a space after the node

    PROBLEM SUMMARY:
    GPFS does not handle a blank after the node name in
    the node file associated with mmstartup -w.

    PROBLEM CONCLUSION:
    Remove leading and trailing white space
    around hostnames.
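
    A minimal sketch of this kind of parsing. It assumes a node file
    with one hostname per line; the function name is hypothetical
    and this is not the GPFS code.

```python
def parse_node_file(text):
    """Return the hostnames in a node file, ignoring surrounding white space."""
    nodes = []
    for line in text.splitlines():
        name = line.strip()  # drops leading and trailing blanks and tabs
        if name:             # skip empty lines
            nodes.append(name)
    return nodes


sample = "node01 \n\tnode02\n\nnode03\t \n"
print(parse_node_file(sample))
```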

    ------

    APAR: IY29929 COMPID: 5765D5100 REL: 340
    ABSTRACT: 64BIT:CSS0 DIAGS FAILS WITH TB3PCI

    PROBLEM DESCRIPTION:
    64bit:css0 diags fails with tb3pci

    PROBLEM SUMMARY:
    css0 diags may fail with TB3PCI on an LPAR node.
    The problem was caused when a previously undefined constant
    was given a definition in system header files. A conditional
    compilation was then reversed.

    PROBLEM CONCLUSION:
    The solution was to rename the constant so that it is again
    undefined.

    ------

    APAR: IY30003 COMPID: 5765B9501 REL: 340
    ABSTRACT: EIO ERROR WHILE CREATING FILES

    PROBLEM DESCRIPTION:
    Since flushFile with FLUSH_FILESIZE_ONLY does not call
    flushIndirects, it cannot skip sending the dirty inode to the
    metanode.

    PROBLEM SUMMARY:
    EIO errors being returned on file creation

    PROBLEM CONCLUSION:
    Since flushFile with FLUSH_FILESIZE_ONLY does not call
    flushIndirects, it cannot skip sending the dirty inode to
    the metanode

    ------

    APAR: IY30060 COMPID: 5765D5100 REL: 340
    ABSTRACT: SWITCH CLOCK NEEDS TO BLOCK SIGNALS DURING INITIALIZATION

    PROBLEM DESCRIPTION:
    switch clock needs to block signals during initialization

    PROBLEM SUMMARY:
    The switch clock API listener thread was not blocking
    signals, and was taking delivery of signals intended for
    other threads in the MPI job.

    PROBLEM CONCLUSION:
    Block signals in the switch clock API listener thread.
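
    A minimal sketch of the technique, using Python's
    signal.pthread_sigmask (Unix only; signal.valid_signals needs
    Python 3.8 or later) rather than the actual switch clock library
    code. The function names are hypothetical.

```python
import signal
import threading


def listener(result):
    # Block every blockable signal in this thread only, so the kernel
    # delivers process-directed signals to some other thread.
    signal.pthread_sigmask(signal.SIG_BLOCK, signal.valid_signals())
    # Blocking an empty set changes nothing and returns the current mask.
    result["mask"] = signal.pthread_sigmask(signal.SIG_BLOCK, set())
    # ... the real listener loop would run here ...


def start_listener():
    """Run the listener thread once and return the signal mask it installed."""
    result = {}
    thread = threading.Thread(target=listener, args=(result,))
    thread.start()
    thread.join()
    return result["mask"]
```

    The mask is set as the first action in the thread so that the
    window in which it could still receive a signal is as short as
    possible.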

    ------

    APAR: IY30156 COMPID: 5765B9501 REL: 340
    ABSTRACT: FSSTRUCT 111 DIRECTORY ERROR

    PROBLEM DESCRIPTION:
    fsstruct 111 directory error

    PROBLEM SUMMARY:
    A cached data block of a deleted directory could cause an
    FSSTRUCT error on a newly created file.

    PROBLEM CONCLUSION:
    Before using the new file data block as a directory block,
    compare the generation number in gnode with the generation
    number of the new file.

    ------

    APAR: IY30205 COMPID: 5765B8100 REL: 230
    ABSTRACT: VXML RECORD TAG NOINPUT BLOCK EXECUTES EVEN AFTER DTMF ENTRY

    PROBLEM DESCRIPTION:
    When using a catch block to catch noinput within a record block,
    the catch block will execute even if the caller ends the
    recording with a DTMF. The catch block should only execute if
    the recording is not stopped by a DTMF.

    PROBLEM SUMMARY:
    When using a catch block to catch noinput
    within a record block, the catch block will execute even
    if the caller ends the recording with a DTMF. The catch block
    should only execute if the recording is not stopped by a DTMF.

    PROBLEM CONCLUSION:
    The record logic in DTInChannel was adding a DTMF terminator
    key after the input DTMF key, or fake keys indicating timeout
    or max length. Due to other changes in the DTMF processing
    logic, this terminator had become unnecessary and was thus
    treated erroneously as input to a non-existent field, raising
    a noinput event.

    ------

    APAR: IY30367 COMPID: 5765E4600 REL: 120
    ABSTRACT: HANG OF VTT PROCESSES ON AIX 4.3.3

    PROBLEM DESCRIPTION:
    ViaVoice child hangs with error id 20503

    PROBLEM SUMMARY:
    Hang of VTT processes on AIX 4.3.3 due to a
    problem with the logger process. The logger process hangs when
    the /var/vtt/log/current.log file exceeds the size of 1MB.
    Subsequently all other ViaVoice telephony processes hang when
    trying to write logging messages.

    PROBLEM CONCLUSION:
    Fixed a deadlock in the logger process
    apparent with AIX 4.3.3.

    ------

    APAR: IY30372 COMPID: 5765B8100 REL: 230
    ABSTRACT: EXCEPTION FROM VXML WHEN USING SUBMIT TAG.

    PROBLEM DESCRIPTION:
    The VXMLBrowser generates an Exception under certain conditions
    with the use of submit.

    PROBLEM SUMMARY:
    EXCEPTION FROM VXML WHEN USING SUBMIT TAG

    ------

    APAR: IY30373 COMPID: 5765E5300 REL: 120
    ABSTRACT: CORE DUMP OF VTT PROCESSES ON PM EXIT

    PROBLEM DESCRIPTION:
    The ViaVoice process tsmp will occasionally core when shutting
    down ViaVoice.

    PROBLEM SUMMARY:
    Core dump of VTT processes on pm exit. Symptom
    is that a ViaVoice server system is busied out with the
    'tsmcon -b all' command. After waiting for all active calls
    to be completed and shutting down the system with the command
    'pm exit', ViaVoice telephony processes intermittently abend.

    PROBLEM CONCLUSION:
    Fixed the faulty exit procedures.

    ------

    APAR: IY30647 COMPID: 5765D5100 REL: 340
    ABSTRACT: LATEST PSSP 3.4.0 FIXES AS OF APRIL 2002

    PROBLEM DESCRIPTION:
    This is the latest PSSP ptf as of April 2002.
    Order this apar to get all of the ptfs as of April 2002.

    PROBLEM SUMMARY:
    This is a packaging apar for PSSP 3.4.0 fixes
    as of April 2002.

    PROBLEM CONCLUSION:
    This is a packaging apar for PSSP 3.4.0
    fixes as of April 2002.

    ------

    APAR: IY30713 COMPID: 5765B8100 REL: 230
    ABSTRACT: INCORRECT PAUSES IN <SAYAS> TAG

    PROBLEM DESCRIPTION:
    Incorrect pause in <sayas> tag. Telephone number 04 123 07 456
    is said as 0, pause, 4, pause, one two three, etc.

    PROBLEM SUMMARY:
    Incorrect pause in sayas tag
    For example 04 123 5678
    said as 0, pause, 4, pause, 123, pause, etc.

    ------