Concept

Encoding

Imagine that you want to move data between two contexts. The data represent one thing in the original context, and the same thing or another thing in the final context. But that data must move through one or more transfer contexts as well.

What will that data represent to each of the transfer contexts? I ran into this question when I was sending data from one context to another and received an error message to the effect that some of the data was unacceptable to the transfer context. But I needed that data transferred.

Two choices are obvious. One is to change the transfer method. The second is to modify the data before and after the transfer, before to make it acceptable in all (or most) transfer contexts, and after to return the data to their original form.

Fortunately this problem was solved, in more than one way, at the dawn of computing. The technique is called encoding and decoding the data. There are many ways to do it, and different methods are preferred depending upon your particular concerns.

Base64 Encoding

I chose to use Base64 Encoding because it is simple and old, tried and true. It fit my use case well.

How Base64 Encoding Works

(a simplified description)

Every three bytes of input (24 bits) become four bytes of output, each carrying 6 of those bits.
Each 6-bit value (0 to 63) is used as an index into a table of 64 bland, printable characters.
The table is  ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
  0 maps to A, 1 maps to B, 26 maps to a, 55 maps to 3, etc.

imagine "Abc" as bits, written two at a time with the most significant bit first, then regrouped six at a time
     01 00 00 01   01 10 00 10   01 10 00 11
     01 00 00  01 01 10  00 10 01   10 00 11
     Q         W         J          j

so "Abc" maps to "QWJj"
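
The regrouping above can be sketched in a few lines of C. This is only an illustration of the bit arithmetic, not the code I actually used; the names b64_alphabet and encode_triplet are made up for this example, and it handles exactly three input bytes (no padding).

```c
/* The 64-character table from the text above. */
static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper: encode exactly three input bytes into four
   output characters plus a terminating NUL. out must hold 5 bytes. */
void encode_triplet(const unsigned char in[3], char out[5])
{
    /* Pack the three bytes into one 24-bit value, first byte highest. */
    unsigned long bits = ((unsigned long)in[0] << 16)
                       | ((unsigned long)in[1] << 8)
                       |  (unsigned long)in[2];

    /* Peel off four 6-bit groups, highest group first. */
    out[0] = b64_alphabet[(bits >> 18) & 0x3F];
    out[1] = b64_alphabet[(bits >> 12) & 0x3F];
    out[2] = b64_alphabet[(bits >> 6) & 0x3F];
    out[3] = b64_alphabet[bits & 0x3F];
    out[4] = '\0';
}
```

Calling encode_triplet with the three bytes of "Abc" fills out with "QWJj", matching the hand calculation.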

Sample Code

I wrote this in C to fit my particular use case. Details of Base64 encoding can vary.

/* base64encode */

#include <stdio.h>

static const unsigned char base64_table[66] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=";

void base64encode( FILE * in , FILE * out ) {

	int inchar;
	int count ;
	unsigned long accum;
	unsigned j,k,l,m ;

	accum = 0;
	count = 0;
	for( ;; ) {
		/* read first, then test for EOF: a zero byte is valid data and must not end the loop */
		inchar = fgetc( in );
		if( inchar == EOF ) {
			/* done */
			if( count == 0 ) return;
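			/* fill the partial group with zero bytes up to 24 bits */
			if( count == 1 ) accum = accum * 256 * 256;
			if( count == 2 ) accum = accum * 256;
			m = 64;	/* index 64 in the table is '=', the padding character */
			accum = accum / 64;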
			if( count == 1 ) {
				l = 64;
			} else {
				l = accum % 64 ;
			}
			accum = accum /64;
			k = accum % 64;
			accum = accum /64;
			j = accum % 64;
			fputc( base64_table[j] , out );
			fputc( base64_table[k] , out );
			fputc( base64_table[l] , out );
			fputc( base64_table[m] , out );
			return;
		}
		count += 1;
		accum = accum * 256 + inchar;
		if( count == 3 ) {
			m = accum % 64;
			accum = accum /64;
			l = accum % 64;
			accum = accum /64;
			k = accum % 64;
			accum = accum /64;
			j = accum % 64;
			fputc( base64_table[j] , out );
			fputc( base64_table[k] , out );
			fputc( base64_table[l] , out );
			fputc( base64_table[m] , out );
			count = 0;
			accum = 0;
		}
	}
}

int main(void ){

	base64encode( stdin, stdout );
	return 0;
}
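
For the return trip (getting the data back to its original form), a matching decoder might look something like the sketch below. It is not part of what I wrote for my use case. The names base64_alphabet, base64_value and base64decode are hypothetical, and two behaviours are assumptions of mine: it silently skips any byte that is not in the table (such as newlines), and it stops at the first '='.

```c
#include <stdio.h>
#include <string.h>

static const char base64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Map one character to its 6-bit value, or -1 if it is not in the table. */
static int base64_value(int c)
{
    const char *p;

    if (c == '\0') return -1;   /* strchr would match the terminator */
    p = strchr(base64_alphabet, c);
    return p ? (int)(p - base64_alphabet) : -1;
}

/* Returns 0 on success, -1 if the input stops after a single
   leftover character (6 bits cannot hold a whole byte). */
int base64decode(FILE *in, FILE *out)
{
    unsigned long accum = 0;
    int count = 0;              /* characters collected in this group */
    int c, v;

    while ((c = fgetc(in)) != EOF) {
        if (c == '=') break;    /* assumption: padding ends the data */
        v = base64_value(c);
        if (v < 0) continue;    /* assumption: skip non-table bytes */
        accum = accum * 64 + (unsigned long)v;
        count += 1;
        if (count == 4) {       /* 24 bits collected: emit three bytes */
            fputc((int)((accum >> 16) & 0xFF), out);
            fputc((int)((accum >> 8) & 0xFF), out);
            fputc((int)(accum & 0xFF), out);
            accum = 0;
            count = 0;
        }
    }
    if (count == 1) return -1;
    if (count == 2) {           /* 12 bits: one byte, 4 spare bits */
        fputc((int)((accum >> 4) & 0xFF), out);
    } else if (count == 3) {    /* 18 bits: two bytes, 2 spare bits */
        fputc((int)((accum >> 10) & 0xFF), out);
        fputc((int)((accum >> 2) & 0xFF), out);
    }
    return 0;
}
```

It can be driven the same way as the encoder, for example by calling base64decode( stdin, stdout ) from main.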

Notes

  1. I wrote it to be easy for me to understand.
  2. I am not concerned about performance at this point.
  3. I appended an '=' to the translation table to help keep the code simple for the case where the number of bytes received is not a multiple of 3.
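
As a small aside on note 3, the amount of output and padding follows directly from the input length. The two helpers below are a sketch with made-up names (base64_encoded_length and base64_padding_count), assuming the padded form used in the sample code, where every group of up to three input bytes produces exactly four output characters.

```c
#include <stddef.h>

/* Number of output characters for n input bytes, padding included. */
size_t base64_encoded_length(size_t n)
{
    return ((n + 2) / 3) * 4;   /* round n up to whole groups of three */
}

/* Number of '=' characters at the end of the output: 0, 1 or 2. */
size_t base64_padding_count(size_t n)
{
    return (3 - n % 3) % 3;
}
```

So one leftover byte gives two '=' characters and two leftover bytes give one, which is what the count == 1 and count == 2 branches in the sample code produce.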