Binhex 4.0 Converter

BinHex is an encoding system that converts binary data to text; it was used on the Macintosh to send binary files through email. Converting binary data into ASCII characters makes it easy to transfer files from one platform to another, since almost all computers can handle ASCII text files. BinHex was originally the idea of Tim Mann. BinHex 4.0 encoded files usually carry the extension .hqx.

  • FUNCTIONS
  • OO INTERFACE
  • SUBMODULES
  • UNDER THE HOOD

Convert::BinHex - extract data from Macintosh BinHex files

ALPHA WARNING: this code is currently in its Alpha release. Things may change drastically until the interface is hammered out: if you have suggestions or objections, please speak up now!

Simple functions:

Hex to bin, low-level interface. Conversion is actually done via an object ('Convert::BinHex::Hex2Bin') which keeps internal conversion state:

Hex to bin, OO interface. The following operations must be done in the order shown!

Bin to hex, low-level interface. Conversion is actually done via an object ('Convert::BinHex::Bin2Hex') which keeps internal conversion state:

Bin to hex, file interface. Yes, you can convert to BinHex as well as from it!

PLANNED!!!! Bin to hex, 'CAP' interface. Thanks to Ken Lunde for suggesting this.

BinHex is a format used by Macintosh for transporting Mac files safely through electronic mail, as short-lined, 7-bit, semi-compressed data streams. This module provides a means of converting those data streams back into binary data.

(Some text taken from RFC-1741.) Files on the Macintosh consist of two parts, called forks:

Data fork

The actual data included in the file. The Data fork is typically the only meaningful part of a Macintosh file on a non-Macintosh computer system. For example, if a Macintosh user wants to send a file of data to a user on an IBM-PC, she would only send the Data fork.

Resource fork

Contains a collection of arbitrary attribute/value pairs, including program segments, icon bitmaps, and parametric values.

Additional information regarding Macintosh files is stored by the Finder in a hidden file, called the 'Desktop Database'.

Because of the complications in storing different parts of a Macintosh file in a non-Macintosh filesystem that only handles consecutive data in one part, it is common to convert the Macintosh file into some other format before transferring it over the network. The BinHex format squashes that data into transmittable ASCII as follows:

  1. The file is output as a byte stream consisting of some basic header information (filename, type, creator), then the data fork, then the resource fork.

  2. The byte stream is compressed by looking for series of duplicated bytes and representing them using a special binary escape sequence (of course, any occurrences of the escape character must also be escaped).

  3. The compressed stream is encoded via the '6/8 hemiola' common to base64 and uuencode: each group of three 8-bit bytes (24 bits) is chopped into four 6-bit numbers, which are used as indexes into an ASCII 'alphabet'. (I assume that leftover bytes are zero-padded; documentation is thin.)
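Steps 2 and 3 can be sketched in a few lines of Python. This is an illustration of the format, not this module's code: the 64-character alphabet and the 0x90 run-length marker are taken from RFC 1741, the function names are hypothetical, and the choice to compress only runs of three or more bytes is an assumption (real encoders may pick a different threshold).

```python
# Hypothetical helper names; alphabet and marker byte as given in RFC 1741.
ALPHABET = "!\"#$%&'()*+,-012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr"
MARK = 0x90  # run-length escape ("mark") byte


def rle90_compress(data: bytes) -> bytes:
    """Step 2: replace runs of 3..255 equal bytes with <byte> 0x90 <count>."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        run = 1
        while i + run < len(data) and data[i + run] == byte and run < 255:
            run += 1
        out.append(byte)
        if byte == MARK:
            out.append(0x00)          # a literal 0x90 is escaped as 0x90 0x00
        if run >= 3:
            out += bytes([MARK, run])  # count is the total run length
            i += run
        else:
            i += 1                     # short runs are emitted byte by byte
    return bytes(out)


def encode_6bit(data: bytes) -> str:
    """Step 3: map each 3-byte group to four 6-bit indexes into ALPHABET."""
    chars = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        padded = group + b"\x00" * (3 - len(group))  # assumed zero padding
        n = int.from_bytes(padded, "big")
        quad = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # 1 leftover byte needs 2 characters, 2 leftover bytes need 3.
        chars.extend(quad[:len(group) + 1])
    return "".join(chars)
```

A complete encoder would additionally wrap the result in ':' delimiters and break it into short lines; both are omitted here.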

CRC computation

macbinary_crc DATA, SEED

Compute the MacBinary-II-style CRC for the given DATA, with the CRC seeded to SEED. Normally, you start with a SEED of 0, and you pump in the previous CRC as the SEED if you're handling a lot of data one chunk at a time. That is:

Note: Extracted from the mcvert utility (Doug Moore, April '87), using a 'magic array' algorithm by Jim Van Verth for efficiency. Converted to Perl5 by Eryq. Untested.

binhex_crc DATA, SEED

Compute the HQX-style CRC for the given DATA, with the CRC seeded to SEED. Normally, you start with a SEED of 0, and you pump in the previous CRC as the SEED if you're handling a lot of data one chunk at a time. That is:

Note: Extracted from the mcvert utility (Doug Moore, April '87), using a 'magic array' algorithm by Jim Van Verth for efficiency. Converted to Perl5 by Eryq.
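The chunk-at-a-time seeding described above can be sketched in Python. This is a bit-by-bit 16-bit CRC using the CCITT polynomial 0x1021 with a zero start value, which is what Python's binascii.crc_hqx computes; that it matches this module's binhex_crc byte for byte is an assumption, and the function names are hypothetical. The table-driven 'magic array' approach mentioned in the note gives the same result faster.

```python
import binascii


def crc16_ccitt(data: bytes, seed: int = 0) -> int:
    """Bitwise CRC-16 with polynomial 0x1021, continuing from `seed`."""
    crc = seed
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def crc_of_chunks(chunks) -> int:
    """Feed the previous CRC back in as the seed for each new chunk."""
    crc = 0
    for chunk in chunks:
        crc = crc16_ccitt(chunk, crc)
    return crc


def matches_stdlib(data: bytes) -> bool:
    """Cross-check the sketch against the standard library routine."""
    return crc16_ccitt(data) == binascii.crc_hqx(data, 0)
```

Because the running CRC is the only state, splitting the input at any point and chaining the seed yields the same value as one pass over the whole input.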

Conversion

bin2hex

Class method, constructor. Return a converter object. Just creates a new instance of 'Convert::BinHex::Bin2Hex'; see that class for details.

hex2bin

Class method, constructor. Return a converter object. Just creates a new instance of 'Convert::BinHex::Hex2Bin'; see that class for details.

Construction

new PARAMHASH

Class method, constructor. Return a handle on a BinHex'able entity. In general, the data and resource forks for such an entity are stored in native (binary) format.

Parameters in the PARAMHASH are the same as header-oriented method names, and may be used to set attributes:

open PARAMHASH

Class method, constructor. Return a handle on a new BinHex'ed stream, for parsing. Params are:

Data

Input a HEX stream from the given data. This can be a scalar, or a reference to an array of scalars.

Expr

Input a HEX stream from any open()able expression. It will be opened and binmode'd, and the filehandle will be closed either on a close() or when the object is destructed.

FH

Input a HEX stream from the given filehandle.

NoComment

If true, the parser should not attempt to skip a leading '(This file...)' comment. That means that the first nonwhite characters encountered must be the binhex'ed data.

Get/set header information

creator [VALUE]

Instance method. Get/set the creator of the file. This is a four-character string (though I don't know if it's guaranteed to be printable ASCII!) that serves as part of the Macintosh's version of a MIME 'content-type'.

For example, a document created by 'Canvas' might have creator 'CNVS'.

data [PARAMHASH]

Instance method. Get/set the data fork. Any arguments are passed into the new() method of 'Convert::BinHex::Fork'.

filename [VALUE]

Instance method. Get/set the name of the file.

flags [VALUE]

Instance method. Return the flags, as an integer. Use bitmasking to get the values you need.

header_as_string

Return a stringified version of the header that you might use for logging/debugging purposes. It looks like this:

As some of you might have guessed, this is RFC-822-style, and may be easily plunked down into the middle of a mail header, or split into lines, etc.

requires [VALUE]

Instance method. Get/set the software version required to convert this file, as extracted from the comment that preceded the actual binhex'ed data; e.g.:

    (This file must be converted with BinHex 4.0)

In this case, after parsing in the comment, the code:

would get back '4.0'.

resource [PARAMHASH]

Instance method. Get/set the resource fork. Any arguments are passed into the new() method of 'Convert::BinHex::Fork'.

type [VALUE]

Instance method. Get/set the type of the file. This is a four-character string (though I don't know if it's guaranteed to be printable ASCII!) that serves as part of the Macintosh's version of a MIME 'content-type'.

For example, a GIF89a file might have type 'GF89'.

version [VALUE]

Instance method. Get/set the version, as an integer.

Decode, high-level

read_comment

Instance method. Skip past the opening comment in the file, which is of the form:

    (This file must be converted with BinHex 4.0)

As per RFC-1741, this comment must immediately precede the BinHex data, and any text before it will be ignored.

You don't need to invoke this method yourself; read_header() will do it for you. After the call, the version number in the comment is accessible via the requires() method.
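Locating the comment and pulling out the version number can be sketched in Python with a regular expression. The function name and the exact pattern are hypothetical; the sketch assumes the comment wording given in RFC 1741 and says nothing about how this module actually scans for it.

```python
import re

# Matches the leading comment and captures the version number inside it.
COMMENT_RE = re.compile(
    r"\(This file must be converted with BinHex\s+(\d+(?:\.\d+)*)\)"
)


def find_binhex_comment(text: str):
    """Return (version, offset just past the comment), or None if absent.

    Any text before the comment is ignored, as the format allows.
    """
    match = COMMENT_RE.search(text)
    if match is None:
        return None
    return match.group(1), match.end()
```

The returned offset marks where a decoder would start looking for the opening ':' of the encoded data.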

read_header

Instance method. Read in the BinHex file header. You must do this first!
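For orientation, here is a Python sketch of what parsing the header involves once the stream has been decoded and decompressed. It assumes the layout given in RFC 1741 (length-prefixed filename, a version byte, four-byte type and creator, two-byte flags, four-byte data and resource fork lengths, two-byte header CRC, all big-endian); the function name and dictionary keys are hypothetical, and CRC verification is left out.

```python
import struct


def parse_header(buf: bytes) -> dict:
    """Parse a decoded BinHex header from the start of `buf`."""
    name_len = buf[0]
    filename = buf[1:1 + name_len]
    pos = 1 + name_len
    version = buf[pos]
    pos += 1
    # type, creator, flags, data length, resource length, header CRC
    ftype, creator, flags, data_len, rsrc_len, crc = struct.unpack_from(
        ">4s4sHIIH", buf, pos
    )
    pos += struct.calcsize(">4s4sHIIH")
    return {
        "filename": filename,      # raw bytes; the text encoding is not decided here
        "version": version,
        "type": ftype,
        "creator": creator,
        "flags": flags,
        "data_length": data_len,
        "resource_length": rsrc_len,
        "crc": crc,
        "header_size": pos,        # the data fork starts at this offset
    }
```

The fork lengths are what make it necessary to read the header first: without them there is no way to tell where the data fork ends and the resource fork begins.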

read_data [NBYTES]

Instance method. Read information from the data fork. Use it in an array context to slurp all the data into an array of scalars:

Or use it in a scalar context to get the data piecemeal:

The NBYTES to read defaults to 2048.

read_resource [NBYTES]

Instance method. Read in all/some of the resource fork. See read_data() for usage.

Encode, high-level

encode OUT

Encode the object as a BinHex stream to the given output handle OUT. OUT can be a filehandle, or any blessed object that responds to a print() message.

The leading comment is output, using the requires() attribute.

Convert::BinHex::Bin2Hex

A BINary-to-HEX converter. This kind of conversion requires a certain amount of state information; it cannot be done by just calling a simple function repeatedly. Use it like this:

On each iteration, next() (and done()) may return either a decent-sized non-empty string (indicating that more converted data is ready for you) or an empty string (indicating that the converter is waiting to amass more input in its private buffers before handing you more stuff to output).

Note that done() always converts and hands you whatever is left.

This may have been a good approach. It may not. Someday, the converter may also allow you to give it an object that responds to read(), or a FileHandle, and it will do all the nasty buffer-filling on its own, serving you stuff line by line:

Someday, maybe. Feel free to voice your opinions.

Convert::BinHex::Hex2Bin

A HEX-to-BINary converter. This kind of conversion requires a certain amount of state information; it cannot be done by just calling a simple function repeatedly. Use it like this:

On each iteration, next() (and done()) may return either a decent-sized non-empty string (indicating that more converted data is ready for you) or an empty string (indicating that the converter is waiting to amass more input in its private buffers before handing you more stuff to output).

Note that done() always converts and hands you whatever is left.

Note that this converter does not find the initial 'BinHex version' comment. You have to skip that yourself. It only handles data between the opening and closing ':'.
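The next()/done() protocol can be illustrated with a small Python class that handles only the 6-bit decoding layer. It is a sketch, not a port of Convert::BinHex::Hex2Bin: the class name is hypothetical, the alphabet is the one from RFC 1741, run-length decompression is not included, and the caller is assumed to have stripped the comment and the ':' delimiters already.

```python
ALPHABET = "!\"#$%&'()*+,-012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr"
DECODE = {ch: index for index, ch in enumerate(ALPHABET)}


class SixBitDecoder:
    """Turns encoded text into bytes, buffering incomplete 4-character groups."""

    def __init__(self):
        self._pending = ""  # characters carried over from earlier calls

    def next(self, text: str) -> bytes:
        # Drop line breaks and other whitespace, then keep only whole groups.
        chars = self._pending + "".join(text.split())
        usable = len(chars) - len(chars) % 4
        self._pending = chars[usable:]
        return self._decode(chars[:usable])

    def done(self) -> bytes:
        # Convert whatever is left, even if it is a short final group.
        chars, self._pending = self._pending, ""
        return self._decode(chars)

    @staticmethod
    def _decode(chars: str) -> bytes:
        out = bytearray()
        for i in range(0, len(chars), 4):
            group = chars[i:i + 4]
            n = 0
            for ch in group:
                n = (n << 6) | DECODE[ch]
            n <<= 6 * (4 - len(group))  # left-align a short final group
            out += n.to_bytes(3, "big")[: len(group) * 6 // 8]
        return bytes(out)
```

Note how next() may legitimately return an empty result when it has been handed fewer than four characters so far; the input lines do not need to contain a multiple of four characters.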

Convert::BinHex::Fork

A fork in a Macintosh file.

Design issues

BinHex needs a stateful parser

Unlike its cousins base64 and uuencode, BinHex format is not amenable to being parsed line-by-line. There appears to be no guarantee that lines contain 4n encoded characters... and even if there is one, the BinHex compression algorithm interferes: even when you can decode one line at a time, you can't necessarily decompress a line at a time.

For example: a decoded line ending with the byte \x90 (the escape or 'mark' character) is ambiguous: depending on the next decoded byte, it could mean a literal \x90 (if the next byte is a \x00), or it could mean n-1 more repetitions of the previous character (if the next byte is some nonzero n).

For this reason, a BinHex parser has to be somewhat stateful: you cannot have code like this:

unless something is happening 'behind the scenes' to keep track of what was last done. The dangerous thing, however, is that this approach will seem to work, if you only test it on BinHex files which do not use compression and which have 4n HEX characters on each line.

Since we have to be stateful anyway, we use the parser object to keep our state.
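The ambiguity of a trailing mark byte is easy to see in a Python sketch of a run-length decoder that keeps its state between calls. The class and method names are hypothetical and the code is an illustration of the technique, not this module's implementation.

```python
class Rle90Decoder:
    """Expands 0x90 run-length escapes, remembering state across chunks."""

    MARK = 0x90

    def __init__(self):
        self._prev = None        # last byte written to the output
        self._saw_mark = False   # True if the last input byte was the mark

    def feed(self, chunk: bytes) -> bytes:
        out = bytearray()
        for byte in chunk:
            if self._saw_mark:
                self._saw_mark = False
                if byte == 0:
                    # 0x90 0x00 is a literal 0x90
                    out.append(self.MARK)
                    self._prev = self.MARK
                else:
                    if self._prev is None:
                        raise ValueError("repeat count with nothing to repeat")
                    # the count includes the copy that was already output
                    out.extend(bytes([self._prev]) * (byte - 1))
            elif byte == self.MARK:
                # Cannot decide yet: the meaning depends on the next byte,
                # which may only arrive with the next chunk.
                self._saw_mark = True
            else:
                out.append(byte)
                self._prev = byte
        return bytes(out)
```

Feeding b"a\x90" and then b"\x05b" yields the same output as feeding b"a\x90\x05b" in one call, which is exactly what a stateless line-by-line loop cannot guarantee.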

We need to be able to handle large input files

Solutions that demand reading everything into core don't cut it in my book. The first MPEG file that comes along can louse up your whole day. So, there are no size limitations in this module: the data is read on-demand, and filehandles are always an option.

Boy, is this slow!

A lot of the byte-level manipulation that has to go on, particularly the CRC computing (which involves intensive bit-shifting and masking), slows this module down significantly. What is needed perhaps is an optional extension library where the slow pieces can be done more quickly... a Convert::BinHex::CRC, if you will. Volunteers, anyone?

Even considering that, however, it's slower than I'd like. I'm sure many improvements can be made in the HEX-to-BIN end of things. No doubt I'll attempt some as time goes on...

How it works

Since BinHex is a layered format, consisting of...

...there is a layered parsing algorithm to reverse the process. Basically, it works in a similar fashion to stdio's fread():

The conversion-and-decompression algorithms need their own internal buffers and state (since the next input chunk may not contain all the data needed for a complete conversion/decompression operation). These are maintained in the object, so parsing two different input streams simultaneously is possible.

Only handles Hqx7 files, as per RFC-1741.

Remember that Macintosh text files use '\r' as end-of-line: this means that if you want a textual file to look normal on a non-Mac system, you probably want to do this to the data:
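Expressed in Python rather than Perl, and purely as a sketch with a hypothetical function name, the end-of-line conversion amounts to a single byte substitution on the extracted data fork:

```python
def mac_to_unix_newlines(data: bytes) -> bytes:
    """Replace classic Mac carriage returns with Unix line feeds.

    Only appropriate for text files; binary data must be left untouched.
    """
    return data.replace(b"\r", b"\n")
```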

Maintained by Stephen Nelson <stephenenelson@mac.com>

Written by Eryq, http://www.enteract.com/~eryq / eryq@enteract.com

Support for native-Mac conversion, plus invaluable contributions in Alpha Testing, plus a few patches, plus the baseline binhex/debinhex programs, were provided by Paul J. Schinder (NASA/GSFC).

Ken Lunde (Adobe) suggested incorporating the CAP file representation.

Copyright (c) 1997 by Eryq. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

This software comes with NO WARRANTY of any kind. See the COPYING file in the distribution for details.
