Transferring text files

A text file transfer converts a file from one code page to another. It also converts end-of-line sequences, such as CRLF (carriage return-line feed), between systems. This topic summarizes the text file transfer behavior of IBM® MQ Managed File Transfer.

Unless you specify otherwise, conversion is from the default code page of the file's source system to the default code page of its destination system. Additionally, text file transfer performs new line conversion, which means that new line characters for the destination file are those native to its destination platform. You can override the use of the default code pages on a system by specifying the code page to use for reading the source file and writing the destination file. You can also specify the end-of-line character sequence to use for the destination file. For more information, see the topics fteCreateTransfer (create new file transfer) and Using transfer definition files.

Text file transfers perform simple code point substitutions between code pages. Text file transfers do not perform complex transfers or translations of data, for example, conversions between visual and logical forms of bidi data or text shaping.
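The default conversion can be approximated in a few lines. The following is a minimal sketch in Python, not Managed File Transfer code: the function name `convert_text` and its parameters are hypothetical, and the Python codec names `cp1252` and `utf-8` stand in for whatever code pages the source and destination systems use.

```python
def convert_text(data: bytes, source_encoding: str, dest_encoding: str,
                 dest_eol: str = "\n") -> bytes:
    """Re-encode text and rewrite line endings (illustrative only)."""
    # Decode with the source code page; strict mode raises on malformed bytes.
    text = data.decode(source_encoding, errors="strict")
    # Treat CRLF and a lone LF as line terminators, then emit the destination EOL.
    lines = text.replace("\r\n", "\n").split("\n")
    converted = dest_eol.join(lines)
    # Encode with the destination code page; strict mode raises on unmappable characters.
    return converted.encode(dest_encoding, errors="strict")


# A cp1252 source with CRLF line endings, written as UTF-8 with LF line endings.
source = b"caf\xe9\r\nbar\r\n"
result = convert_text(source, "cp1252", "utf-8", dest_eol="\n")
```

In this sketch the byte `0xE9` in the source becomes the two-byte UTF-8 sequence for the same character, which is the kind of simple code point substitution described above. Nothing in it reorders or reshapes the text.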

Table 1. Text file transfer behavior for all platforms
| Area | Default behavior | Can you change this behavior? |
| --- | --- | --- |
| Source file encoding | Source platform encoding | Yes (see note) |
| Source file end of line character sequence | Convert a single LF or CRLF sequence to the destination end of line character sequence | No |
| Destination file encoding | Destination platform encoding | Yes (see note) |
| Destination file end of line character sequence | Destination platform EOL | Yes |
| Text replacement character sequence for unmappable or malformed characters in the source or destination | Blank, meaning the transfer fails if unmappable characters or malformed characters are present. You can use the textReplacementCharacterSequence property to specify the replacement text, which is described in The agent.properties file. | Yes |

Note: When you specify source file encoding and the source is a data set, the encoding must be an EBCDIC code page, otherwise the transfer fails. Similarly, if the destination is a data set, the destination encoding must be an EBCDIC code page.
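The effect of the text replacement character sequence can be illustrated with a small sketch. This is an assumption-laden model in Python, not the agent's implementation: the function `encode_with_replacement` is hypothetical, it covers only unmappable characters on the destination side, and it assumes a stateless single-byte destination codec such as `ascii`.

```python
def encode_with_replacement(text: str, dest_encoding: str,
                            replacement: str = "") -> bytes:
    """Encode text, substituting `replacement` for unmappable characters."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(dest_encoding)
        except UnicodeEncodeError:
            if not replacement:
                # Blank replacement sequence: the transfer fails,
                # modeled here by re-raising the encoding error.
                raise
            out += replacement.encode(dest_encoding)
    return bytes(out)


# U+00EF has no ASCII mapping, so it is replaced by "?".
replaced = encode_with_replacement("na\u00efve", "ascii", replacement="?")
```

With the default blank replacement, the same call raises an error instead of returning data, which mirrors the default behavior in Table 1.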

z/OS data sets

When data set records are accessed in text mode, each record represents a single line. New line characters do not exist in the record, but for ASA format data sets an ASA control code character is set that represents a new line (or other control character). When a line of text with a terminating new line character is written to a record, the new line character is either removed automatically or, for ASA format data sets, represented by setting the appropriate ASA control code. When a record is read, a new line character is automatically appended to the returned data. For ASA format data sets, this character can be multiple new lines or a form feed, depending on the ASA control code of the record.

Additionally, for fixed-format data sets, when a record is read the new line is appended after the last character in the record that is not a space character. This makes fixed-format data sets suitable for storing text.

Table 2. Additional text file transfer behavior specific to z/OS
| Area | Default behavior | Can you change this behavior? |
| --- | --- | --- |
| Maximum line length | Destination data set LRECL or BLKSIZE setting, as appropriate | No |
| Wrap over length lines | Wrap. The line is split over multiple records and blocks as required. | No |
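The wrapping behavior in Table 2 amounts to cutting a long line into record-sized pieces. The following Python sketch shows the idea only: `wrap_line` and `max_length` are hypothetical names, and `max_length` stands in for the LRECL or BLKSIZE value of the destination data set.

```python
from typing import List


def wrap_line(line: str, max_length: int) -> List[str]:
    """Split one line of text into records of at most max_length characters."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if not line:
        # An empty line still occupies one record.
        return [""]
    return [line[i:i + max_length] for i in range(0, len(line), max_length)]


# A 10-character line written to records of length 4 spans three records.
records = wrap_line("ABCDEFGHIJ", 4)
```

Because each record is later read back as its own line, a wrapped line does not round-trip as a single line.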

When the IBM MQ Managed File Transfer agent is run, the environment variable _EDC_ZERO_RECLEN is always set to "Y". This setting makes IBM MQ Managed File Transfer text transfer behavior the same as FTP for variable and fixed block data sets. However, for undefined format data sets, IBM MQ Managed File Transfer converts single-space lines to empty lines and preserves empty lines, whereas FTP converts empty lines to single-space lines and preserves single-space lines. Table 3 describes the IBM MQ Managed File Transfer behavior and how FTP behavior differs.

The format of the data set also determines how each line of text is written to a record. For non-ASA format data sets, newline and carriage-return characters are not written to the record. For ASA format data sets, the first byte of each record is an ASA control code that represents an end of line, a form feed, or another control, as appropriate. Because ASA control codes are at the start of each record, if the source text file does not start with a new line character sequence, a blank (' ') ASA control character (which equates to a newline) is inserted. This means that if the ASA data set is then transferred to a file, a blank line is present at the start of the file.
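The leading blank line can be seen in a simplified model of ASA records. The sketch below is Python written for illustration, not product code. The mapping of control codes to characters (blank for one new line, `0` for two, `-` for three, `1` for a form feed) follows the conventional ASA carriage control codes and is an assumption here, as are the function names. Other control codes and end-of-file handling are ignored.

```python
# Assumed mapping from ASA control code to the characters emitted before the text.
ASA_PREFIX = {" ": "\n", "0": "\n\n", "-": "\n\n\n", "1": "\f"}


def text_to_asa_records(text):
    """Write each source line as a record with a blank ASA control code."""
    # The source lines do not start with a new line sequence,
    # so every record receives the blank (' ') control code.
    return [" " + line for line in text.splitlines()]


def asa_records_to_text(records):
    """Read ASA records back, expanding the control code in the first byte."""
    parts = []
    for record in records:
        control, body = record[0], record[1:]
        parts.append(ASA_PREFIX[control] + body)
    return "".join(parts)


records = text_to_asa_records("A\nB\n")
round_trip = asa_records_to_text(records)
```

In this model `round_trip` begins with a new line character that the source text did not have, which corresponds to the blank line at the start of the transferred file.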

Table 3. The IBM MQ Managed File Transfer behavior for data sets
| Data set format | Original text line in file | Data set record | Read of data set record | FTP read behavior |
| --- | --- | --- | --- | --- |
| Fixed block | Empty line | Space filled record | Empty line | Same as MQMFT |
| Fixed block | Single space | Space filled record | Empty line | Same as MQMFT |
| Variable block | Empty line | Empty record | Empty line | Same as MQMFT |
| Variable block | Single space | Single space record | Single space | Same as MQMFT |
| Undefined | Empty line | Single space record | Empty line | Single space |
| Undefined | Single space | Single space record | Empty line | Single space |
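The rows of Table 3 can be restated as a small model, which makes the one difference from FTP easier to see. This Python sketch only encodes the table: the format labels `FB`, `VB`, and `U`, the function names, and the `lrecl` default are hypothetical, and the behavior for undefined format records other than a single space is not covered by the table and is left unchanged here.

```python
def write_record(fmt: str, line: str, lrecl: int = 8) -> str:
    """Model how one line of text is stored as a data set record."""
    if fmt == "FB":
        return line.ljust(lrecl)        # fixed block: pad with spaces
    if fmt == "VB":
        return line                     # variable block: stored as is
    if fmt == "U":
        return line if line else " "    # undefined: empty line becomes one space
    raise ValueError(f"unknown format: {fmt}")


def read_record(fmt: str, record: str) -> str:
    """Model how Managed File Transfer reads a record back as a line."""
    if fmt == "FB":
        return record.rstrip(" ")       # new line follows the last non-space character
    if fmt == "U" and record == " ":
        return ""                       # single space record reads as an empty line
    return record


def ftp_read_record(fmt: str, record: str) -> str:
    """Model the FTP read, which differs only for undefined format."""
    if fmt == "U":
        return record                   # single space record stays a single space
    return read_record(fmt, record)


# Undefined format, empty source line: MQMFT reads "", FTP reads " ".
stored = write_record("U", "")
mft_line = read_record("U", stored)
ftp_line = ftp_read_record("U", stored)
```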