PMDF V6.0-23 Patches & Enhancements

What PMDF channel counters are for

PMDF channel counters are meant to indicate the trend and health of your mail system.

Accurate accounting is not what the channel counters are designed for. The lack of accuracy in PMDF's channel counters is an inherent aspect of their design; it is not a bug.

Specifically, PMDF's channel counters adhere to what Marshall Rose calls the fundamental axiom of management, which is that management must itself not interfere with proper system and network operation by consuming anything but the tiniest amount of resource.

PMDF's channel counters are implemented using the lightest-weight mechanisms we have available: a shared memory section on each system that is periodically synchronized to a disk database. Channel counters do not "try harder": if an attempt to map the section fails, no information is recorded; if one of the locks in the section cannot be obtained almost immediately, no information is recorded; and when a system is shut down, the information contained in the in-memory section is lost forever.
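The "do not try harder" behavior can be illustrated with a minimal Python sketch. This is not PMDF's implementation; the class name, the `dropped` tally, and the 10 ms timeout are assumptions made purely for illustration. It shows an update that is simply discarded when the lock cannot be obtained almost immediately.

```python
import threading


class BestEffortCounter:
    """Hypothetical stand-in for one counter in a shared section."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0
        # Illustration only: lets the demo show that an update was lost.
        self.dropped = 0

    def record(self, amount, timeout=0.01):
        # Wait only briefly for the lock; give up rather than delay mail processing.
        if not self._lock.acquire(timeout=timeout):
            self.dropped += 1
            return False
        try:
            self.value += amount
            return True
        finally:
            self._lock.release()


counter = BestEffortCounter()
counter.record(1)          # lock is free, so the update is recorded
counter._lock.acquire()    # simulate another process holding the lock
counter.record(1)          # lock is busy, so the update is discarded
counter._lock.release()
print(counter.value, counter.dropped)
```

The design trades accuracy for a bounded cost: a caller never waits longer than the timeout, so counting cannot stall message delivery.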

What PMDF channel counters are not for

PMDF channel counters are not intended to provide an accurate accounting of message traffic.

Components of PMDF Channel Counters

  • PMDF channel counter process (VMS only). The resident worker process which is started by the PMDF_STARTUP.COM command. Its process name is usually "PMDF counters", changing to "PMDF count exit" when it is exiting or restarting.

  • In-memory database. The in-memory channel counters cache is stored in a system permanent page file section named PMDF_COUNTERS. This page file section is node-specific. It is created and initialized by the PMDF channel counter process if it does not already exist. If it already exists, then the channel counters contained within will be your "current" in-memory channel counters.

  • On-disk database (VMS only). This is a cluster-wide database on disk. This database is created automatically by the PMDF channel counter process. When created, it will have all counters zeroed except for the count of message files stored in the queue directory for the channel when that directory was scanned. Note that on an active system, the number of files in a directory at any instant does not reflect the number of real, stored messages. For example, some files may be in the process of being deleted, other files may be added after the scan is done, etc.

  • Synchronization (VMS only) of the in-memory database with the on-disk database. This happens when you use the PMDF COUNTERS/SYNCHRONIZE command:
    1. the data recorded on disk is locked and read,
    2. that value is added to the in-memory value,
    3. the new value is written back out to disk, and
    4. the in-memory value is zeroed.
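The four synchronization steps above can be sketched in Python as follows. This is a simplified model, not PMDF code: the dictionaries, the counter names, and the `synchronize` function are hypothetical, and the locking of the on-disk database is omitted.

```python
def synchronize(disk, memory):
    """Fold the in-memory values into the on-disk totals, then zero them."""
    for name, in_memory_value in memory.items():
        on_disk_value = disk.get(name, 0)           # 1. read the value recorded on disk
        new_value = on_disk_value + in_memory_value  # 2. add it to the in-memory value
        disk[name] = new_value                       # 3. write the new value back out
        memory[name] = 0                             # 4. zero the in-memory value
    return disk


# Hypothetical starting values for one channel.
disk = {"received": 100, "delivered": 90}
memory = {"received": 7, "delivered": 12}

synchronize(disk, memory)
print(disk)
print(memory)
```

Because the in-memory values are zeroed after each synchronization, they act as deltas accumulated since the previous one, which is why anything not yet synchronized is lost when a node goes down.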

  • COUNTERS commands. There are two sets of commands. See the manual for more complete descriptions.

    From DCL                   PMDF QM/MAINT subcommand  On UNIX              Function
    -------------------------  ------------------------  -------------------  --------
    PMDF COUNTERS/SYNCHRONIZE  COUNTERS SYNCHRONIZE      Not applicable       Tells the PMDF channel counters process to synchronize.
    PMDF COUNTERS/SHOW         COUNTERS SHOW             pmdf counters -show  Shows you the on-disk channel counter values. For the QM subcommand, the counters are implicitly synchronized first.

    When Should You Worry?

    • If a channel's stored counter keeps decreasing over a long period of time.
    • If a channel's stored counter keeps increasing over a long period of time.
    • If a channel name does not make sense.

    When You Shouldn't Worry

    • Negative numbers.
    • The stored number does not match the real number of messages in the channel.
    • The number seems to be off by a few (relative to the total number of messages processed by your system).

    Example interpretation

    An example of the output from the QM/MAINT COUNTERS SHOW command is shown below:

    Channel                       Messages   Recipients    Blocks
    ----------------------------- --------   ---------- ---------
    directory
         Received                    6519        9038       69545
         Stored                        -4          -4        -149
         Delivered                   6523        9042       69694
         Submitted                   6811        9019       71123
    

    Name       Example corresponding mail.log entry  Description
    ---------  ------------------------------------  -----------
    Received   xyz directory E                       Messages coming from any channel (e.g., "xyz") to the channel named "directory". That is, messages enqueued to the "directory" channel by any other channel.
    Stored     (Not applicable)                      Messages stored in the channel queue to be delivered. A negative number just means that the zero point used for the counters does not correspond to zero messages stored on disk.
    Delivered  directory D                           Messages that have been processed by the channel "directory" and either delivered or returned.
    Submitted  directory xyz E                       Messages sent from the channel "directory" to any other channel; e.g., enqueued to the channel "xyz".
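As a quick arithmetic check on the sample output above, the Stored values in this particular sample equal Received minus Delivered in all three columns. The short Python sketch below verifies that. Treat the relation as an observation about these sample numbers, not as a documented formula; the dictionary layout is hypothetical.

```python
# Values copied from the sample "directory" channel output.
sample = {
    "Messages":   {"received": 6519,  "delivered": 6523,  "stored": -4},
    "Recipients": {"received": 9038,  "delivered": 9042,  "stored": -4},
    "Blocks":     {"received": 69545, "delivered": 69694, "stored": -149},
}

# Assumption inferred from the sample: stored == received - delivered.
differences = {
    column: row["received"] - row["delivered"] for column, row in sample.items()
}

for column, difference in differences.items():
    print(column, difference, difference == sample[column]["stored"])
```

Seen this way, a negative Stored value only says that more dequeues than enqueues have been recorded since the counters' zero point.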

    Note that the directory channel shows more submitted messages than delivered messages. This is usually the case, SUBMITTED >= DELIVERED, since each message the channel dequeues (DELIVERED) will result in at least one new message enqueued (SUBMITTED), and possibly more than one. For example, if a message has two recipients reached via different channels, then two enqueues will be required. Or, if the message bounces, a copy will go back to the sender and another copy may be sent to the postmaster. Usually that will be two submissions (unless both are reached through the same channel).

    Moreover, when you shut down a node, the data which has accumulated in the in-memory cache since the last PMDF COUNTERS/SYNCH command is lost. If PMDF processing is spread across a cluster, an enqueue may have been processed and accumulated in the in-memory cache on node A, and the associated dequeue processed and accumulated on node B. If node A goes down before its in-memory cache is synchronized to the cluster-wide on-disk database, the record of that enqueue is lost. If the node B data then does get synchronized to the disk, the recorded enqueues and dequeues for that channel will no longer balance.
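The node A / node B scenario can be walked through with a small Python simulation. This is a toy model under stated assumptions: the dictionaries and counter names are hypothetical, and the `stored` figure is computed as received minus delivered, a relation assumed for illustration and not taken from the PMDF documentation.

```python
def new_counters():
    return {"received": 0, "delivered": 0}


disk = new_counters()     # cluster-wide on-disk database
node_a = new_counters()   # in-memory cache on node A
node_b = new_counters()   # in-memory cache on node B

node_a["received"] += 1   # the enqueue is processed on node A
node_b["delivered"] += 1  # the matching dequeue is processed on node B

# Node A is shut down before it synchronizes: its cache is simply gone.
node_a = None

# Node B synchronizes normally.
for name in disk:
    disk[name] += node_b[name]
    node_b[name] = 0

# Assumed relation for illustration: stored = received - delivered.
stored = disk["received"] - disk["delivered"]
print(disk, stored)
```

The on-disk database ends up with a dequeue that has no matching enqueue, which is one way a channel's stored count can drift below zero without anything being wrong with mail delivery.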

    Now, you can force the detached channel counter processes to periodically flush their data. Also, having your shutdown procedures do a PMDF CACHE/SYNCH is marginally helpful, but only marginally, for two reasons. First, it still leaves a window during which a message may come in (say over DECnet or TCP/IP) and get its enqueue recorded to the in-memory cache after the final synchronization. Second, synchronizations do not always succeed. For instance, if another process has the on-disk database locked, a detached channel counter process may not be able to make its updates. In such a case, the process gives up and assumes that the synchronization will succeed later. Of course, that later synchronization will never occur if the system is about to be brought down.

    To have periodic synchronizations done, define the following system logical cluster-wide:

       $ DEFINE/SYSTEM PMDF_COUNTER_INTERVAL "dd hh:mm:ss"
    

    For instance, to update the on-disk database once every 10 minutes, you would use:

       $ DEFINE/SYSTEM PMDF_COUNTER_INTERVAL "00 00:10:00.00"
    

    That logical must be defined before PMDF_STARTUP.COM runs so that it is seen by the detached channel counter processes.
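When choosing a value for the logical, it can help to confirm how long an interval string actually is. The Python sketch below parses strings shaped like the two examples above ("dd hh:mm:ss" with optional hundredths) into a duration. The regular expression and the `parse_interval` helper are hypothetical; they model only the format shown here and are not a full implementation of VMS delta-time syntax.

```python
import re
from datetime import timedelta

# Matches "dd hh:mm:ss" with an optional ".cc" hundredths field.
INTERVAL_PATTERN = re.compile(r"^(\d+) (\d{1,2}):(\d{2}):(\d{2})(?:\.(\d{1,2}))?$")


def parse_interval(text):
    """Convert an interval string such as "00 00:10:00.00" to a timedelta."""
    match = INTERVAL_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized interval: {text!r}")
    days, hours, minutes, seconds = (int(part) for part in match.group(1, 2, 3, 4))
    # Pad a single-digit fraction so ".5" means 50 hundredths.
    hundredths = int((match.group(5) or "0").ljust(2, "0"))
    return timedelta(
        days=days,
        hours=hours,
        minutes=minutes,
        seconds=seconds,
        milliseconds=hundredths * 10,
    )


ten_minutes = parse_interval("00 00:10:00.00")
print(ten_minutes.total_seconds())
```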
