PowerTCP Mail for .NET

MIME Overview

MIME (Multipurpose Internet Mail Extensions) is necessary because the text-only format described by RFC 822 is extremely limiting. MIME allows an email message to contain data other than text while remaining backward-compatible with mail servers that do not support MIME. Implementing MIME required some changes to the original message structure specified in RFC 822. Additional MIME-specific header fields were added to describe the MIME content. The structure of the message body was changed to allow multiple types of files. Finally, encoding rules were specified for attachments that are not ASCII. Because MIME was implemented this way, mail transfer agents could remain the same; only the sending and receiving clients needed to change to handle MIME attachments.

The message header fields that were added to the RFC 822 mail structure are:

MIME-Version
Content-Type
Content-Transfer-Encoding
Content-ID
Content-Description

Of these headers, the two most pertinent to this discussion are the Content-Transfer-Encoding header field and the Content-Type header field.

Content-Type

Content-Type specifies the nature of the message body (the type of file). RFC 1521 defined seven types (which are still under revision):

text
multipart
message
image
audio
video
application

The purpose of this field is to tell the receiving client what the part contains so that it can handle the part appropriately. Each of these types has a subtype, which defines the specific format for that type of data. To see this work, examine these fields for an ASCII text message sent with a MIME-compliant user agent.

< RFC 822 header fields omitted >

Content-Transfer-Encoding: Quoted-Printable

Content-Type: text/plain; charset=ISO-8859-1

MIME-Version: 1.0

< message content (ASCII text omitted) >

Here one can see that MIME version 1.0 and Quoted-Printable encoding (more on this later) were used. The Content-Type type is "text" and the subtype is "plain", meaning that the content is simply made up of ASCII without any other formatting applied.

As another example, let's examine the MIME header fields for a message containing only a single image.

< RFC 822 header fields omitted >

MIME-Version: 1.0

Content-Type: image/gif; name="Dart.GIF"

Content-Transfer-Encoding: base64

Content-Disposition: attachment; filename="Dart.GIF"

< encoded message omitted >

Again the MIME-Version is 1.0, but the encoding used was Base64. The Content-Type is "image" and the subtype is "gif", telling the receiving user agent that the file "Dart.GIF" is indeed a "gif" type of image.
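As a rough illustration of how a receiving client might read these fields, the following sketch uses Python's standard `email` package (not the PowerTCP API) to parse the header fields from the image example. The variable names are chosen here for illustration only, and the body of the message is left empty.

```python
from email import message_from_string

# The MIME header fields from the image example, followed by the blank
# line that ends a header block. The encoded body is omitted.
raw = (
    "MIME-Version: 1.0\n"
    'Content-Type: image/gif; name="Dart.GIF"\n'
    "Content-Transfer-Encoding: base64\n"
    'Content-Disposition: attachment; filename="Dart.GIF"\n'
    "\n"
)

msg = message_from_string(raw)

main_type = msg.get_content_maintype()       # the type, "image"
sub_type = msg.get_content_subtype()         # the subtype, "gif"
encoding = msg["Content-Transfer-Encoding"]  # how the body was encoded
filename = msg.get_filename()                # taken from Content-Disposition

print(main_type, sub_type, encoding, filename)
```

A client would typically use the type and subtype to decide how to display or store the part, and the encoding to decide how to decode it.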

Next, examine the header field Content-Transfer-Encoding.


Content-Transfer-Encoding

As mentioned earlier, the Content-Transfer-Encoding header field identifies the MIME encoding that was used, so the receiving client can decode the content using the same scheme. Why do MIME messages need to be encoded? Most file formats, in their natural state, contain 8-bit data. Unfortunately, the SMTP protocol restricts messages to 7 bits because it was originally designed to transfer only 7-bit ASCII text. Encoding, as it applies to email delivery, means taking 8-bit data and using an encoding algorithm to produce a 7-bit representation that is compatible with all SMTP agents.

How does this happen? RFC 2045 defines the use of two algorithms, Quoted-Printable and Base64. There is no definable relationship between these schemes and the Content-Type header field (in other words, the same file type is not always encoded with the same scheme).

Quoted-Printable

The Quoted-Printable encoding system works in the following way. An 8-bit character has 256 possible values (2^8). Of these, the first 128 (2^7, values 0-127) are the 7-bit ASCII characters, and the printable ones among them are not a problem for transfer. We need a way to deal with the values 128-255 (the characters that require all 8 bits). This is done by quoting the character's hexadecimal value after an "=" sign. For example, a character with the value 200 would be encoded as "=C8". Because "=" serves as the escape character, a literal "=" is itself encoded as "=3D".

Quoted-Printable encoded data is assumed to be line oriented. Since the ASCII characters are kept in their original form, a file encoded with Quoted-Printable is often highly readable. With this encoding system, lines are transmitted in lengths of no more than 76 characters; a longer line is split with a soft line break, marked by an "=" at the end of the line.

To see this algorithm illustrated, examine this data encoded with Quoted-Printable encoding.

<!doctype HTML public "-//W3C//DTD HTML 4.0 Frameset//EN">

<html>

<!--(=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D)-->

<!--(Document created with RoboEditor. )=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D-->

<!--(=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D)-->

<head>

<title>Tutorial</title>

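This document is apparently a portion of an HTML page, as the tags made up of ASCII characters have been preserved. There are quite a few encoded characters as well, clearly identifiable by the "=" preceding them. Here each "=3D" is an encoded "=" character, and the lone "=" at the end of a line marks a soft line break.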
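The Quoted-Printable rules described in this section can be tried out with a short sketch. It assumes Python and its standard `quopri` module rather than the PowerTCP components, and the sample inputs are made up for illustration.

```python
import quopri

# Byte value 200 is outside the 7-bit range, so it is quoted as "=C8".
high_byte = quopri.encodestring(bytes([200]))

# "=" is the escape character, so a literal "=" is quoted as "=3D".
equals_sign = quopri.encodestring(b"=")

# Printable ASCII text passes through unchanged.
ascii_text = quopri.encodestring(b"<title>Tutorial</title>")

# A 100-character line is split with soft line breaks so that no
# encoded line is longer than 76 characters.
wrapped = quopri.encodestring(b"a" * 100)
longest_line = max(len(line) for line in wrapped.splitlines())

# Decoding reverses the process.
decoded = quopri.decodestring(b"<!--(=3D=3D=3D)-->")

print(high_byte, equals_sign, ascii_text, longest_line, decoded)
```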


Base64

Base64 encoding is entirely different from Quoted-Printable encoding. Base64 makes no attempt to keep the encoded data human-readable. Also, this form of encoding results in encoded data that is about 33% larger than the original data. The scheme works in the following way. The data is broken into groups of 24 bits. In the data's native form, each group represents three 8-bit bytes. However, because we are restricted to 7 bits, each group is instead encoded as four 6-bit segments, and each 6-bit value is represented by one of 64 printable ASCII characters. Four characters are sent for every three bytes, which accounts for the increase in size.
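The 24-bit grouping can be sketched in a few lines of Python. This is a minimal illustration that assumes the input is exactly three bytes long (padding for shorter groups is not handled). The helper name `encode_group` is hypothetical, and the result is compared against the standard `base64` module.

```python
import base64

# The 64 printable characters used by Base64, indexed by 6-bit value.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly three 8-bit bytes as four Base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch only handles complete 3-byte groups")
    # Join the three bytes into one 24-bit integer.
    group = int.from_bytes(three_bytes, "big")
    # Split the 24 bits into four 6-bit values, most significant first.
    sextets = [(group >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[s] for s in sextets)


manual = encode_group(b"Man")
library = base64.b64encode(b"Man").decode("ascii")

# 300 bytes of input become 400 characters of output (about 33% larger).
encoded_length = len(base64.b64encode(bytes(300)))

print(manual, library, encoded_length)
```

When the encoded data is sent as a message body, it is also broken into lines, as the sample data in this section shows; the sketch above leaves that step out.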

To see this algorithm illustrated, examine this data encoded with Base64 encoding.

pcEATSAJBAAA8BK/AAAAAAAAEAAAAAAABAAAnRsAAA4AYmpiauI94j0AAAAAAAAAAAAAAAAAAAAA

AAAJBBYAIjgAAIBXAACAVwAAnRcAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD//w8AAAAA

AAAAAAD//w8AAAAAAAAAAAD//w8AAAAAAAAAAAAAAAAAAAAAAGwAAAAAACABAAAAAAAAIAEAACAB

AAAAAAAAIAEAAAAAAAAgAQAAAAAAACABAAAAAAAAIAEAABQAAAAAAAAAAAAAADQBAAAAAAAAeAkA

AAAAAAB4CQAAAAAAAHgJAAAAAAAAeAkAAAwAAACECQAAJAAAADQBAAAAAAAAA1UAAGgBAAC0CQAA

AAAAALQJAAAAAAAAtAkAAAAAAAC0CQAAAAAAALQJAAAAAAAAtAkAAAAAAAC0CQAAAAAAALQJAAAA

AAAAglQAAAIAAACEVAAAAAAAAIRUAAAAAAAAhFQAAAAAAACEVAAAAAAAAIRUAAAAAAAAhFQAACQA

AABrVgAAIAIAAItYAAB8AAAAqFQAABUAAAAAAAAAAAAAAAAAAAAAAAAAIAEAAAAAAAC0CQAAAAAA

Because the structure of the data is completely changed, the result is unreadable. Now we will address what happens when more than one part exists, by looking at the Multipart type of the Content-Type header field.


MIME With Multiple Parts

MIME allows the use of multiple parts. It may be necessary at this point to distinguish between a part and an attachment. An attachment is always a part, but a part is not always an attachment. For example, if a person sends a text email and attaches a file, the message contains two parts: the text and the attached file.

When a file is attached to a message, the Content-Type in the main header becomes multipart, indicating that there are more parts than the message text. These parts are listed after the main header, separated by an automatically generated divider. Each of these parts may be encoded with a different encoding scheme. To see this illustrated, let's examine a message made up of text, a GIF file, and a Word document.

At the beginning of the message will be the header.

Return-Path: <test@test.com>

Received: from YOURCOMPUTER ([192.168.0.00]) by yourserver.com

          (Post.Office MTA v3.5 release 215 ID# 0-54045U100L2S100V35)

          with ESMTP id com for <test@email.com>;

          Fri, 2 Feb 2001 11:09:42 -0500

To: test@email.com

From: test@test.com

Subject: multipart email

MIME-Version: 1.0

Content-Type: multipart/mixed; Boundary="--PTCP_00011cb405020407d1"

Message-ID: <00011cb505020507d1@[192.168.0.71]>

Date: Fri, 02 Feb 2001 11:10:35 -0500

Of course, all the fields from the original RFC 822 specification are present, but for this example, let's focus on the Content-Type. The type is described as multipart/mixed, which tells us to expect several independent parts after the main header. The Boundary tells us that these parts will be separated by the character string "--PTCP_00011cb405020407d1". Each delimiter line consists of two hyphens followed by the Boundary value, which is why it appears as "----PTCP_00011cb405020407d1" in the remainder of the message.
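The relationship between the Boundary parameter and the delimiter lines can be sketched as follows. This assumes Python's standard `email` package rather than the PowerTCP API, and it parses only the two MIME header fields from the example header, with no body.

```python
from email import message_from_string

# Only the MIME header fields of the example message; the parts are omitted.
raw = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; Boundary="--PTCP_00011cb405020407d1"\n'
    "\n"
)

msg = message_from_string(raw)

content_type = msg.get_content_type()  # "multipart/mixed"
boundary = msg.get_boundary()          # the Boundary value, without quotes

# Each part is introduced by "--" followed by the boundary value.
delimiter = "--" + boundary
# The last part is closed by the same line with "--" appended.
closing_delimiter = delimiter + "--"

print(content_type, delimiter, closing_delimiter)
```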

----PTCP_00011cb405020407d1

Content-Type: text/plain; charset=ISO-8859-1

Content-Transfer-Encoding: Quoted-Printable

This message is a multipart MIME test.

 

----PTCP_00011cb405020407d1

Content-Transfer-Encoding: base64

Content-Disposition: attachment; filename="Dart.GIF"

Content-Type: application/octet-stream; name="Dart.GIF"

R0lGODlhbwAmAPf/AP///xAQECEhISkpKTExMTk5OUJCQkpKSlJSUlpaWmNjY2tra3Nzc3t7e4SE

hIyMjJSUlJycnKWlpa2trbW1tb29vcbGxs7OztbW1t7e3ufn5+/v7/f39+fe3tbGxrWcnGMxMVIA

< other Base64 encoding omitted >

 

----PTCP_00011cb405020407d1

Content-Transfer-Encoding: base64

Content-Disposition: attachment; filename="dart site report.doc"

Content-Type: application/msword; name="dart site report.doc"

0M8R4KGxGuEAAAAAAAAAAAAAAAAAAAAAPgADAP7/CQAGAAAAAAAAAAAAAAAKAAAAhwQAAAAAAAAA

EAAAiQQAAAEAAAD+////AAAAAH0EAAB+BAAAfwQAAIAEAACBBAAAggQAAIMEAACEBAAAhQQAAIYE

< other Base64 coding omitted >

----PTCP_00011cb405020407d1--

It is easy to see the three MIME parts, each delimited by the aforementioned Boundary string. The first part is our plain text message. It was encoded with Quoted-Printable, but is made up of ASCII characters and is therefore intact: "This message is a multipart MIME test."

The second part is identified as being the image Dart.GIF. Not surprisingly it was encoded with the Base64 scheme, resulting in a human-unreadable jumble.

The third part is identified as a Microsoft Word document called "dart site report.doc". Again, Base64 was used to encode the part. The message ends with a final Boundary string followed by two additional hyphens, which marks the end of the last part.
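A message with the same three-part shape can be built and inspected with a short sketch. It assumes Python's standard `email` package, not the PowerTCP API. The attachment contents are placeholder bytes rather than real files, and the GIF is labeled image/gif here, whereas the example message above labels it application/octet-stream.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "test@test.com"
msg["To"] = "test@email.com"
msg["Subject"] = "multipart email"

# The text part, explicitly encoded with Quoted-Printable as in the example.
msg.set_content("This message is a multipart MIME test.\n", cte="quoted-printable")

# Placeholder bytes stand in for the real files (an assumption of this sketch).
# Adding the first attachment turns the message into multipart/mixed.
msg.add_attachment(b"placeholder image data", maintype="image",
                   subtype="gif", filename="Dart.GIF")
msg.add_attachment(b"placeholder document data", maintype="application",
                   subtype="msword", filename="dart site report.doc")

top_level_type = msg.get_content_type()

# Visit every part below the multipart container and record its
# content type, transfer encoding, and filename (None for the text part).
summary = [
    (part.get_content_type(),
     str(part["Content-Transfer-Encoding"]),
     part.get_filename())
    for part in msg.walk()
    if not part.is_multipart()
]

print(top_level_type)
for entry in summary:
    print(entry)
```

Printing `msg.as_string()` shows the generated boundary and the delimiter lines between the parts; the boundary value differs from run to run because it is generated automatically.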

 

In This Section

Email Beginnings
Discusses the origin of Internet email and the advent of SMTP.
SMTP Protocol Overview
Discusses some of the inner workings of the Simple Mail Transfer Protocol.
POP Protocol Overview
Discusses some of the inner workings of the Post Office Protocol.
IMAP Protocol Overview
Discusses some of the inner workings of the Internet Message Access Protocol.
Basic Message Structure
Discusses the elements that comprise a basic email message.
MIME Overview
Discusses the need for and the implementation of Multipurpose Internet Mail Extensions.
S/MIME Overview
Provides an overview of S/MIME.

 

 


Documentation version 3.1.

© 2009 Dart Communications.  All rights reserved.