From: AIX Service Mail Server (aixservaustin.ibm.com)
Date: Tue Mar 27 2001 - 02:16:11 CST

    APAR: IY11446 COMPID: 5765D2800 REL: 430
    ABSTRACT: NIM_TOK CORE DUMP AND FAIL_STANDBY EVENT AFTER ADDING 64

    PROBLEM DESCRIPTION:
    When adding 64 aliases to the service adapter, the
    associated nim may core dump and generate a fail_standby
    event.

    PROBLEM CONCLUSION:
    Increase the buffer set aside for the nmAdapter list and
    skip logic that creates a socket for each alias.

    ------

    APAR: IY14628 COMPID: 5765D2800 REL: 430
    ABSTRACT: CLDARE -T FAILS ON CLUSTER WITH ENHANCED SECURITY

    PROBLEM DESCRIPTION:
    cldare -t fails with error "godm: Failed to get tickets"
    on a cluster with enhanced security when there is no principal
    for the host.

    PROBLEM SUMMARY:
    After cl_setup_kerberos completes without error and the cluster
    security is set to enhanced, cluster sync fails with the
    following errors:
    get_tickets: krbtgt kerberos authentication failed.
    2504-008 Kerberos principal unknown
    godm: Failed to get tickets

    PROBLEM CONCLUSION:
    The get_tickets function should not use host name for
    principal. Change gethostname to use cluster interface name.

    ------

    APAR: IY14900 COMPID: 5765D2800 REL: 430
    ABSTRACT: ERROR APPLICATION SERVER PARAMETERS CANNOT CONTAIN SPACES OCCURS

    PROBLEM DESCRIPTION:
    Cluster Planning Worksheets Application Server has entries for
    scripts. If parameters follow the script name, the error message
    "Application Server parameters cannot contain spaces" is sent.

    PROBLEM CONCLUSION:
    Change error checking to allow spaces.
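
    The worksheet code itself is not shown in this APAR. As a minimal
    sketch of the idea only, the following Python splits a script
    entry into the script path and its parameters instead of rejecting
    any entry that contains a space. The function name and the sample
    path are hypothetical.

```python
import shlex


def split_script_entry(entry):
    """Split a script entry into (script_path, [parameters]).

    Hypothetical helper: whitespace separates the script name from its
    parameters, and quoted parameters are kept together.
    """
    tokens = shlex.split(entry)
    if not tokens:
        raise ValueError("script entry is empty")
    return tokens[0], tokens[1:]


# Illustrative entry only; the path and flags are made up.
script, params = split_script_entry("/usr/local/bin/start_app -p 8080")
print(script, params)
```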

    ------

    APAR: IY15704 COMPID: 5765D2800 REL: 430
    ABSTRACT: C-SPOC LVM FUNCTIONS FAIL WHEN GHOST DISKS EXIST - HACMP,HAES

    PROBLEM DESCRIPTION:
    The customer had booted one of his machines while the other
    one had a shared vg varied on, which created ghost disks on the
    booted machine. Then when the customer tried to increase the
    size of an lv through C-SPOC, the node with the ghost disks
    could not find the related vg because of the ghost disks and
    so the function failed on that node.

    PROBLEM CONCLUSION:
    An appropriate test for the existence of ghost disks and
    a call to cl_disk_available to clean them up was added into
    both clupdatevg and climportvg and the set -u instruction
    was removed from cl_disk_available.

    ------

    APAR: IY15707 COMPID: 5765D2800 REL: 430
    ABSTRACT: TWO NODES HAVE SAME RESOURCE GROUP AFTER DARE WHILE ONE NODE WAS

    PROBLEM DESCRIPTION:
    The customer had a cascading mutual takeover configuration with
    inactive takeover set to false. One node was taken down with
    takeover and powered off for a maintenance problem. While that
    node was still down, the customer had a problem on the other
    node which required it to be rebooted. Thus when starting
    HACMP back up on that node, the resource group normally owned
    by the other node was not taken. The customer then ran a dare
    to move that resource group sticky (required) to the node that
    was up to get it active again. Though verification errors had
    to be ignored in order to get this to happen, and warnings were
    given to sync the config to the powered off node before bringing
    it into the cluster, we also stated that the node would not be
    allowed into the cluster until this sync was done. However,
    when that node was started back into the cluster, there was
    nothing that detected the out of sync condition, and so let
    the node join the cluster resulting in it taking the resources
    without the other node releasing them.

    PROBLEM CONCLUSION:
    Add a check in the rc.cluster script to compare resource ODMs
    with active nodes resource ODMs.

    ------

    APAR: IY15840 COMPID: 5765D2800 REL: 430
    ABSTRACT: NETWORK ROUTE(S) LOST WHEN HAES IS STARTED - HAES

    PROBLEM DESCRIPTION:
    After starting HAES on a node, the customer noticed that the
    standby adapter could no longer be reached, and then found that
    the network route for that network no longer existed.

    LOCAL FIX:
    Turn off pmtu discovery, or an e-fix is available.

    PROBLEM SUMMARY:
    After starting HAES on a node, the customer noticed that the
    standby adapter could no longer be reached, and then found that
    the network route for that network no longer existed.

    PROBLEM CONCLUSION:
    Corrected the remove routes logic in clstart to determine
    host or network route and always pass back a zero return
    code. Changed logic to always call rc.net after removal of
    routes.

    ------

    APAR: IY15923 COMPID: 5765D2800 REL: 430
    ABSTRACT: VERBOSE_LOGGING IS NOT EFFECTIVE FOR CLSTART - HACMP,HAES

    PROBLEM DESCRIPTION:
    There are tests for VERBOSE_LOGGING within clstart to allow
    for logging execution statements mainly for debugging, but
    these are not effective in the normal case of being called
    from rc.cluster since the output on the call to clstart is
    redirected to /dev/console.

    PROBLEM CONCLUSION:
    The code in rc.cluster was modified to call clstart without
    redirection of the output if VERBOSE_LOGGING = high.

    ------

    APAR: IY16029 COMPID: 5765D2800 REL: 430
    ABSTRACT: CLUSTER VERIFY OR SYNCH FAILS WITH MSG INDICATING CLUSTER.LOG FI

    PROBLEM DESCRIPTION:
    When the customer tried to run cluster verification or synch
    it would fail with the only message indicated:
    cllog: The cluster.log log file has already been redirected via
    modification of the /etc/syslog.conf file on node <nodename>.
    If you wish to redirect this log again, please change the
    cluster.log entries of the /etc/syslog.conf file. Inspection
    of the /etc/syslog.conf file on the cluster nodes, however, did
    not show any apparent problem.

    LOCAL FIX:
    Remove tabs between fields of /etc/syslog.conf file entries.

    PROBLEM CONCLUSION:
    The entries in the syslog.conf file for cluster.log had
    tab(s) instead of space(s) between fields, and the code that
    was parsing these entries was using cut with space as
    field separator, so did not separate the fields. The code
    was changed to use awk rather than cut in cllog.sh,
    clsnapshot.sh, and cld_logfiles.sh.
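
    The difference between the two parsing approaches can be
    illustrated with a short Python sketch. It only mimics the field
    splitting behaviour of cut with a space delimiter and of awk with
    its default separator; it is not the code from cllog.sh. The
    sample syslog.conf entry is hypothetical.

```python
def fields_like_cut(line):
    # Mimics `cut -d' '`: only a single space separates fields,
    # so a tab is treated as ordinary data.
    return line.split(" ")


def fields_like_awk(line):
    # Mimics awk's default splitting: any run of spaces or tabs
    # separates fields.
    return line.split()


# Hypothetical entry with a tab between selector and log file.
entry = "user.notice\t/tmp/cluster.log"

print(len(fields_like_cut(entry)))  # 1: the tab did not split the line
print(len(fields_like_awk(entry)))  # 2: selector and file are separated
```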

    ------

    APAR: IY16436 COMPID: 5765D2800 REL: 430
    ABSTRACT: APPLY OF ACTIVE.0.ODM SNAPSHOT FAILS WITH HACMPLOGS CORRUPTION

    PROBLEM DESCRIPTION:
    The customer had installed HACMP 4.3.1 at the latest PTF level,
    configured the cluster, and taken a snapshot of the
    configuration. Later, changing something in a resource group and
    then applying the previous snapshot while the cluster was up
    resulted in the loss of the HACMPlogs and HACMPsna ODM classes
    in the active directory. A subsequent attempt to apply the
    active.0.odm saved when the dare was performed resulted in
    errors during verify that the HACMPlogs class was corrupted.

    PROBLEM CONCLUSION:
    Code that had previously been deleted from cldare.sh by
    accident was merged back in. This code also included the
    handling of migrations from HAS to HAES.

    ------

    APAR: IY16489 COMPID: 5765E6400 REL: 220
    ABSTRACT: GEO: GEO_VERIFY REPORTS INCORRECT KRPC PRIORITY FOR INTERFACES

    PROBLEM DESCRIPTION:
    Geo_verify reports an incorrect value for the KRPC Priority
    associated with each interface.

    PROBLEM CONCLUSION:
    Correct printf so that the number of qualifiers in the string
    from the message catalog matches the number of qualifiers in
    the C string and the number of arguments provided.
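
    The kind of mismatch described above can be checked mechanically.
    The following Python sketch counts printf-style conversions in a
    catalog string and in the built-in default string and compares
    both with the number of arguments. It is an illustration only:
    the function names and sample strings are hypothetical, and the
    pattern does not handle "*" widths or positional forms such as
    %1$s.

```python
import re

# One printf conversion such as %d, %5s, %-10.3f or %ld.
# "%%" is matched too so that it can be skipped as a literal percent.
_CONVERSION = re.compile(
    r"%(?:%|[-+ #0]*\d*(?:\.\d+)?(?:hh|h|ll|l|L|z|j|t)?[diouxXeEfgGcsp])"
)


def count_conversions(fmt):
    """Count conversions that consume an argument."""
    return sum(1 for m in _CONVERSION.finditer(fmt) if m.group() != "%%")


def format_is_consistent(catalog_fmt, default_fmt, args):
    """True if both strings expect exactly len(args) arguments."""
    return count_conversions(catalog_fmt) == count_conversions(default_fmt) == len(args)


# Hypothetical strings: the catalog entry lacks the %d the code supplies.
print(format_is_consistent("Interface %s\n", "Interface %s priority %d\n", ("en0", 3)))
```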

    ------

    APAR: IY16505 COMPID: 5765D2800 REL: 430
    ABSTRACT: INTERFACE ROUTE LOST AFTER SWAP_ADAPTER EVENT - HACMP,HAES

    PROBLEM DESCRIPTION:
    Evidently, after upgrading to a certain level of AIX 4.3.3,
    after a swap_adapter event occurred there was no route on the
    interface to which the service address had been swapped. This
    route should normally be created as a result of the ifconfig
    of the interface to the up state by the HACMP script, since
    the previous routes were deleted.

    PROBLEM CONCLUSION:
    Though the HACMP script has worked across several different
    versions and levels of AIX and HACMP by deleting the route and
    bringing the interfaces down with ifconfig, at some level of
    change within AIX 4.3.3 this is no longer consistent. As
    recommended by AIX TCP, the cl_swap_IP_address script was
    changed to detach both interfaces before bringing either of
    them back up with ifconfig.

    ------

    APAR: IY16564 COMPID: 5765D2800 REL: 430
    ABSTRACT: HACMP: VOLUME GROUPS MISSING AFTER FAILED TAKEOVER

    PROBLEM DESCRIPTION:
    When a node takes over the disk resources of another node, and
    those physical disks are not accessible, then the volume group
    associated with that disk is exported, leaving it undefined on
    that node.

    PROBLEM CONCLUSION:
    Change the code so that prior to doing the export, a test is
    made to see if there is at least one accessible disk in the
    volume group. If there is not, do not export the volume group
    since import will surely fail. Additionally, change the logic
    in the replay file to not remove the replay file unless the
    import is successful.
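
    The ordering of checks described above can be sketched in Python.
    This is an illustration of the control flow only, not the HACMP
    script: every callable passed in (is_accessible, export_vg,
    import_vg, remove_replay_file) is a hypothetical stand-in for the
    real disk test and LVM operations.

```python
def has_accessible_disk(disks, is_accessible):
    """True if at least one disk in the volume group can be reached."""
    return any(is_accessible(disk) for disk in disks)


def reimport_volume_group(vg, disks, is_accessible,
                          export_vg, import_vg, remove_replay_file):
    """Export and re-import a volume group only when import can succeed.

    Returns True when the volume group was re-imported.
    """
    if not has_accessible_disk(disks, is_accessible):
        # No reachable disk: import would fail, so keep the current
        # definition instead of exporting it away.
        return False
    export_vg(vg)
    if not import_vg(vg):
        # Import failed: keep the replay file for a later attempt.
        return False
    remove_replay_file(vg)
    return True
```

    With no accessible disk, none of the callables that change state
    are invoked, so the volume group stays defined on the node.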

    ------

    APAR: IY16565 COMPID: 5765D2800 REL: 430
    ABSTRACT: ENHANCEMENT: HAES CSPOC MIRRORCONCURRENTVG HANGS/FAILS

    PROBLEM DESCRIPTION:
    C-SPOC operations on volume groups managed by the group
    services based concurrent mode support do not succeed,
    because disks are not recognized as being supportable
    in concurrent mode.

    PROBLEM CONCLUSION:
    Enhance the logic that C-SPOC uses to determine if a disk is
    concurrent capable to allow any disk in group services based
    concurrent mode.

    ------

    APAR: IY16667 COMPID: 5765D2800 REL: 430
    ABSTRACT: PATH FOR THE GREP CMD IS INCORRECT IN THE SCRIPT, SSA_CONFIGURE

    PROBLEM DESCRIPTION:
    Path for the grep cmd is incorrect in ssa_configure script.
    The check_err() routine has a line bin/grep without the
    leading /.

    PROBLEM CONCLUSION:
    Add missing leading /.

    ------

    APAR: IY16755 COMPID: 5765E6400 REL: 220
    ABSTRACT: SMIT GEORM UTILITIES MENU HAS INCORRECT ENTRY

    PROBLEM DESCRIPTION:
    geoRM has a catalog entry missing and thus smit panel
    georm_utils is displayed incorrectly.

    PROBLEM CONCLUSION:
    Add the necessary entry to the gmd_smit.msg catalog.

    ------

    APAR: IY16815 COMPID: 5765E6400 REL: 220
    ABSTRACT: HAGEO: IMPORTING TOPOLOGY INCLUDES NONGEO NETWORKS

    PROBLEM DESCRIPTION:
    HAGEO "geo_import_hacmp" imports nonGEO networks, although
    the adapters onto those networks are not imported.

    PROBLEM CONCLUSION:
    Update geo_import_hacmp such that it checks the type of the
    network before importing that network.

    ------

    APAR: IY16832 COMPID: 5765D2800 REL: 430
    ABSTRACT: SMIT HELP FOR HAES CLUSTER NETWORK MODULE PARAMETERS FIELD

    PROBLEM DESCRIPTION:
    No help text for the Parameters field on the Cluster Network
    Module smit panel.

    PROBLEM CONCLUSION:
    Add help text for Parameters field:
      "This field specifies the parameters passed to the network
       interface module (NIM) executable. For the rs232 NIM, this
       field specifies the baud rate. Allowable values are
       9600 (the default), 19200 and 38400."
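
    The allowable values in the help text can be expressed as a small
    validation routine. The Python below is a sketch under stated
    assumptions: the function name is hypothetical, and treating an
    empty field as a request for the default is an assumption, not
    documented behaviour of the rs232 NIM.

```python
ALLOWED_BAUD_RATES = (9600, 19200, 38400)
DEFAULT_BAUD_RATE = 9600


def parse_baud_rate(parameters):
    """Return the baud rate named in the Parameters field.

    Assumption: an empty field selects the default rate.
    """
    text = parameters.strip()
    if not text:
        return DEFAULT_BAUD_RATE
    rate = int(text)  # raises ValueError for non-numeric input
    if rate not in ALLOWED_BAUD_RATES:
        raise ValueError("unsupported baud rate: %d" % rate)
    return rate


print(parse_baud_rate("19200"))
```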

    ------

    APAR: IY16851 COMPID: 5765D2800 REL: 430
    ABSTRACT: ADDING/CONFIGURING AEN FAILS WITH SDD/DPO INSTALLED

    PROBLEM DESCRIPTION:
    Configuring (adding) Automatic Error Notification fails with
    the following messages echoed to the SMIT screen:
    sili: Operation failed.
    tech: Operation failed.
    dsh: 5025-509 tech rsh had exit code 1

    PROBLEM CONCLUSION:
    The behavior of AEN with SDD has been documented in the
    relevant chapters of the HACMP Administration Guide and the
    HACMP/ES Guide. Information was reviewed by Venky. Information
    will also be added to the Release Notes for version 4.4.1.

    ------

    APAR: IY16858 COMPID: 5765E6400 REL: 220
    ABSTRACT: GEORM: SMIT SHOWS WRONG NODES WHEN CHANGING SITE

    PROBLEM DESCRIPTION:
    The "Change/Show Site" SMIT screen may display the wrong
    nodes for a given site.

    PROBLEM CONCLUSION:
    Correct an error in the SMIT screen definition.

    ------

    APAR: IY17161 COMPID: 576550500 REL: 210
    ABSTRACT: MESSAGING IS NOT FUNCTIONING CORRECTLY. IT IS TRUNCATING THE

    PROBLEM DESCRIPTION:
    PSF messaging truncates the email address and leaves a . (dot)
    at the end of the address. This results in an undeliverable
    message.

    PROBLEM SUMMARY:
    Messaging is truncating the email address and
    adding a "." as "LPSAPISIS."

    PROBLEM CONCLUSION:
    Code was changed to correct this problem.

    ------

    APAR: IY17538 COMPID: 5765B8100 REL: 220
    ABSTRACT: PROBLEM IN ERROR MESSAGE HANDLING

    PROBLEM DESCRIPTION:
    Problem in error message handling due to excess message length

    PROBLEM SUMMARY:
    Problem handling error message due to excess
    length of error message

    PROBLEM CONCLUSION:
    Corrected the error message length.

    ------