PMDF FAQ: Troubleshooting

 

Why do I see "Recorded error -- Zero length SMTP status line" errors?

PMDF expected a valid SMTP status line from the remote system but received an empty line instead.

This error can be found in your mail.log_current file as well as in debug files and TCPDUMP files.

All SMTP status lines are required to begin with three digits, followed by a space or dash, then an optional status message. Having an SMTP status line containing no characters is a protocol violation. The usual suspect is an extra CR or LF just before a status. PMDF treats this as a temporary error and attempts to send the mail again.
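The format rule above can be illustrated with a small check. This is only a sketch of the rule as stated in this FAQ (three digits, a space or dash, optional text); the function name and return labels are hypothetical and this is not PMDF's own parsing code.

```python
import re

# Three digits, then a space (last line of a reply) or a dash (more lines
# follow), then an optional status message.
STATUS_LINE = re.compile(r"^(\d{3})([ -])(.*)$")

def classify_status_line(line: str) -> str:
    """Return 'empty', 'malformed', 'continuation', or 'final' for one reply line."""
    if line == "":
        return "empty"  # the zero-length case behind the logged error
    match = STATUS_LINE.match(line)
    if match is None:
        return "malformed"
    return "continuation" if match.group(2) == "-" else "final"
```

For example, classify_status_line("250 OK") gives 'final', while an empty string gives 'empty', which corresponds to the condition PMDF records as a temporary error.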

This is a problem on the remote server's end (usually caused by firewall software or some Exchange servers); however, PMDF can work around the broken behavior. Specify the smtp_crlf or smtp_lf channel keyword in place of the default smtp_crorlf (smtp is a synonym for smtp_crorlf), and PMDF no longer treats bare CRs as line terminators. With smtp_crlf, a single CR or a single LF is treated as a "normal" character. Process Software recommends smtp_crlf, since RFC 821 (section 4.1.1) says that lines should be terminated by a single CRLF sequence.

Some SMTP servers use LF-only terminators; bare CR terminators are quite rare. You need to choose between supporting the agents that use bare CRs and LFs as line terminators and supporting the ones that use them as regular characters. In general, you cannot support both.

Unfortunately, current RFCs do not address this. RFC 821 says that lines should be terminated by a single CRLF sequence but says nothing about how to interpret bare CRs and LFs. Some clients break the rules and send LF or CR alone instead of CRLF. PMDF deals with this by treating bare CRs and LFs as line terminators (as noted above, this behavior is configurable). The problem is that other agents expect a bare CR or LF NOT to be interpreted as a terminator.
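To see why the choice of terminator handling matters, the following sketch splits the same byte stream under two interpretations. The function and the mode names "crorlf" and "crlf" are hypothetical labels that mirror the channel keywords; this illustrates the idea only and is not PMDF's implementation.

```python
import re

def split_lines(data: bytes, mode: str) -> list[bytes]:
    """Split a received byte stream into lines.

    mode "crorlf": CRLF, a bare CR, or a bare LF each end a line.
    mode "crlf":   only CRLF ends a line; bare CR and LF stay in the text.
    """
    if mode == "crorlf":
        pattern = rb"\r\n|\r|\n"
    elif mode == "crlf":
        pattern = rb"\r\n"
    else:
        raise ValueError(f"unknown mode: {mode}")
    lines = re.split(pattern, data)
    # A stream that ends with a terminator leaves one trailing empty piece.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines

# A made-up server reply with a stray CR before the CRLF of the first line.
reply = b"220 ready\r\r\n250 OK\r\n"
lenient = split_lines(reply, "crorlf")  # an empty line appears between replies
strict = split_lines(reply, "crlf")     # the stray CR is ordinary text
```

Under the lenient interpretation the stray CR produces an empty line between the two replies, which is the zero-length status line. Under the strict interpretation the stray CR simply becomes part of the first line's text.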

Newer versions of PMDF have code that ignores such things in status responses, so one way to avoid the problem is to upgrade.


Sometimes my email will back up with several hundred jobs that need to be processed, but my system resources are free. What can I do to prevent this?

One of the ways of improving throughput is by adding additional queues. By default, all mail gets queued to one queue and is processed from there. On VMS, the queue is MAIL$BATCH, which is a generic queue that points to PMDF_1, PMDF_2, PMDF_3, and PMDF_4.

Linux is slightly different: it has no concept of a "queue", "batch jobs", or a "job controller" as VMS does. To keep the code and configuration consistent across operating systems, PMDF includes a job controller for its own exclusive use. Its configuration is in the file /pmdf/table/job_controller.cnf. The default configuration shows:

[QUEUE=DEFAULT]
job_limit=4
capacity=200

DEFAULT is the queue name, job_limit is the number of processes that can run simultaneously, and capacity is the maximum number of jobs that can be held in the queue waiting to be processed. You can increase these numbers. After changing them, run pmdf restart for the change to take effect.

In both cases (VMS and Linux), only 4 mail-processing jobs can run simultaneously by default. If you use a conversion channel that does virus checking, conversion jobs can occupy these queues for some time, preventing other jobs from running.
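If you want to check the current limits from a script, a minimal reader for the layout shown above might look like the following. It assumes only what this FAQ shows ([QUEUE=name] headers followed by integer key=value lines); the real file may contain other sections or options, which this sketch skips. The function name is hypothetical.

```python
def parse_job_controller(text: str) -> dict[str, dict[str, int]]:
    """Collect integer settings from each [QUEUE=name] section."""
    queues: dict[str, dict[str, int]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            kind, _, name = line[1:-1].partition("=")
            # Only QUEUE sections are handled; anything else is skipped.
            current = queues.setdefault(name, {}) if kind.upper() == "QUEUE" else None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            if value.strip().isdigit():  # ignore non-integer values
                current[key.strip().lower()] = int(value.strip())
    return queues

sample = "[QUEUE=DEFAULT]\njob_limit=4\ncapacity=200\n"
limits = parse_job_controller(sample)
```

With the default configuration, limits["DEFAULT"]["job_limit"] is 4 and limits["DEFAULT"]["capacity"] is 200.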

To increase delivery performance, you can set up queues specific to any or all of the channels. For instance, you can set up a queue specifically for the conversion channel so that it does not take up slots needed by, say, the top_local or msgstore channels. This enables other mail to be delivered in a timely fashion. Although the conversion channel is a simple example, you can be very creative in queue management depending on your particular situation, including configuring multiple channels to run in a single queue or letting each channel process in its own queue.
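The effect of slow jobs holding all the slots can be seen in a toy simulation. The workload below (four 60-second conversion jobs queued ahead of three 1-second deliveries) is invented for illustration, and the model is a simple first-come, first-served queue rather than the actual VMS or PMDF job controller scheduling.

```python
import heapq

def finish_times(durations: list[float], job_limit: int) -> list[float]:
    """Simulate a FIFO queue in which at most job_limit jobs run at once.

    durations[i] is how long job i runs; all jobs are queued at time 0.
    Returns the completion time of each job, in submission order.
    """
    slots = [0.0] * job_limit  # time at which each slot becomes free
    heapq.heapify(slots)
    done = []
    for duration in durations:
        start = heapq.heappop(slots)  # earliest free slot
        end = start + duration
        done.append(end)
        heapq.heappush(slots, end)
    return done

# One shared queue: the deliveries wait behind the conversion jobs.
shared = finish_times([60, 60, 60, 60, 1, 1, 1], job_limit=4)
# Deliveries in their own queue, with conversions moved elsewhere.
separate = finish_times([1, 1, 1], job_limit=4)
```

In the shared case the three deliveries finish at 61 seconds; with a queue of their own they finish at 1 second. The totals depend entirely on the made-up durations, but the blocking pattern is the point.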

Let's assume you want to set up queues specifically for the conversion channel.

On VMS, you would edit SYS$STARTUP:PMDF_INIT_QUEUES.COM and add lines like:

$ initialize/queue/device=server/noenable_generic-
/processor=pmdf_process_smb/on=NODEA::-
/protection=(s:rwe,o:rwd,g:r,w:r) PMDF-CONV_1

$ initialize/queue/device=server/noenable_generic-
/processor=pmdf_process_smb/on=NODEA::-
/protection=(s:rwe,o:rwd,g:r,w:r) PMDF-CONV_2

and so on, then:

$ initialize/queue/device=server/generic=(PMDF-CONV_1,PMDF-CONV_2) -
CONV-MAIL$BATCH

In SYS$STARTUP:PMDF_START_QUEUES.COM, add:

$ start/queue PMDF-CONV_1
$ start/queue PMDF-CONV_2
$ start/queue CONV-MAIL$BATCH

You may wish to make similar updates to PMDF_DELETE_QUEUES.COM and PMDF_STOP_QUEUES.COM.

On UNIX/Linux, add the following lines to job_controller.cnf:

[QUEUE=CONV-MAIL$BATCH]
job_limit=4
capacity=200

Once the queues are added on VMS, make sure you run both of these command procedures before making the appropriate changes to the channel block definitions in your PMDF.CNF. For our conversion channel:

conversion queue conv-mail$batch
CONVERSION-DAEMON

When you are done, run the following commands. On VMS:

$ pmdf cnbuild (if you run an installed configuration)
$ install replace pmdf_config_data (if you run an installed configuration)
$ pmdf restart dispatcher (installed or non-installed configuration)

On Linux:

# pmdf cnbuild
# pmdf restart

Why do we get the error "response to dot-stuffed message expected"?

Background: SMTP [RFC 821] specifies that when the body of an SMTP message is transferred, any line that begins with a . (dot) must be prefixed with another dot before being sent. This is commonly referred to as "dot-stuffing". It is necessary because the end of the body is signaled by a single dot on a line by itself. So in the message

> Error reading SMTP packet; response to dot-stuffed message expected

the "dot-stuffed message" portion may be understood more simply as "message body". The error means that the remote side failed to respond within ten minutes after PMDF sent the last of the message.

The error text indicates that PMDF connected successfully, the addresses were accepted, and the entire message body was sent. The problem is that the remote SMTP server aborted, was very slow to respond, or the network connection itself was dropped. In any case, PMDF did not receive a response within the default timeout period.
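The dot-stuffing described in the background above can be sketched in a few lines. The function names are hypothetical and the example works on already-split lines rather than on a live connection; it is meant only to show the transformation, not how PMDF implements it.

```python
def dot_stuff(body_lines: list[str]) -> list[str]:
    """Prefix an extra dot to every body line that starts with a dot."""
    return ["." + line if line.startswith(".") else line for line in body_lines]

def dot_unstuff(wire_lines: list[str]) -> list[str]:
    """Reverse dot_stuff on the receiving side, stopping at the lone dot."""
    body = []
    for line in wire_lines:
        if line == ".":
            break  # a single dot on its own line ends the message body
        body.append(line[1:] if line.startswith(".") else line)
    return body

body = ["Hello", ".", "..hidden", "Bye"]
wire = dot_stuff(body) + ["."]  # the sender appends the terminating dot
```

Here the body line consisting of a single dot is sent as two dots, so the receiver does not mistake it for the end of the message. The status reply that PMDF waits for comes after the final lone dot.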

As is typical with TCP channel/SMTP protocol problems, enabling debugging for the channel and generating a debug log reflecting the error often greatly clarifies what is happening. Most TCP channel or SMTP protocol error messages become clearer when seen in the context of exactly when during the SMTP dialogue they occurred.

Recommendations: If you have a consistent problem sending to a particular system, first rule out a network problem. If the remote administrators insist there is nothing wrong with their SMTP server, but it is overloaded and hence very slow at accepting e-mail, you could try setting up a separate channel for sending to this system and give that channel a more generous timeout value. This would not be advisable for the general TCP/IP channel, since waiting longer is often futile and wastes additional time before moving on to another message.

To enable debugging for the outbound tcp_ channel, put the master_debug keyword on the channel and look for the resulting tcp_*_master.log file.

See Section 23.1.2 of the PMDF System Manager’s Guide, especially the STATUS_DATA_RECEIVE_TIME option, for more information.

Also note that the STATUS_DATA_RECV_PER_ADDR_TIME, STATUS_DATA_RECV_PER_BLOCK_TIME, and STATUS_DATA_RECV_PER_ADDR_PER_BLOCK_TIME options may be adjusted to scale the timeout with the number of addresses in the message and the size of the message, if those were contributing factors.