What Is Base64 and How It Works : Base64 Encoding/Decoding Guide for Beginners

Base64 is a way to represent and transfer binary data over media that allow only printable characters. In other words, Base64 is an encoding and decoding technique that converts binary data to ASCII text and back again. It is commonly used when binary data must be stored in or transferred over media designed to handle text, so that the data arrives intact without being modified in transit. Typical examples are email messages using Multipurpose Internet Mail Extensions (MIME) and data embedded in Extensible Markup Language (XML).

Base64 Encoding Table :

The Base64 alphabet is a set of 64 printable ASCII characters, each of which stands for one value from 0 to 63. These characters are used to encode binary data as text.

The alphabet is made up of :

  • A to Z characters (values 0 to 25)  -  26 characters
  • a to z characters (values 26 to 51)  -  26 characters
  • 0 to 9 (values 52 to 61)  -  10 characters
  • + (plus character, value 62)  -  1 character
  • / (forward-slash character, value 63)  -  1 character
  • = (equals sign)  -  not one of the 64 values; used only for padding, as explained later

The letters and digits account for only 62 characters, so '+' and '/' supply the remaining two. The '=' sign is not part of the 64-character alphabet; it is used only for padding, which is explained below.
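As a small illustration of the alphabet described above, the sketch below builds the 64 characters in value order and looks one up by its 6-bit value. The names BASE64_ALPHABET and base64Char are made up for this example and are not part of any library.

```javascript
// Build the alphabet in value order: A-Z, a-z, 0-9, then '+' and '/'.
const UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const BASE64_ALPHABET = UPPER + UPPER.toLowerCase() + "0123456789" + "+/";

// Hypothetical helper: return the character for a 6-bit value (0 to 63).
function base64Char(value) {
  if (!Number.isInteger(value) || value < 0 || value > 63) {
    throw new RangeError("value must be an integer from 0 to 63");
  }
  return BASE64_ALPHABET[value];
}

console.log(BASE64_ALPHABET.length); // 64
console.log(base64Char(0), base64Char(26), base64Char(52), base64Char(62), base64Char(63));
// A a 0 + /
```

Note that '=' does not appear in the string, since it never stands for a value.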


The Encoding Process :

  • 1. The data is read from left to right.
  • 2. Three consecutive 8-bit bytes from the input are joined to make one 24-bit group.
  • 3. The 24-bit group is divided into 4 groups of 6 bits each. Six bits are used because a 6-bit value ranges from 0 to 2^6 - 1 = 63, which is exactly enough to select one of the 64 characters in the alphabet.
  • 4. Each of these 4 groups of 6 bits is then converted to a character using the above-mentioned Base64 encoding table.
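To make the encoding process clearer, here is how the word 'Sec' is encoded :

 Character :  S         e         c
 ASCII     :  83        101       99
 8-bit     :  01010011  01100101  01100011

 24-bit    :  010100110110010101100011
 6-bit     :  010100  110110  010101  100011
 Value     :  20      54      21      35
 Base64    :  U       2       V       j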

Therefore, the Base64 equivalent for Sec becomes U2Vj.
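The four steps of the encoding process can be sketched in JavaScript for a single group of exactly three characters. This is a minimal illustration, not how btoa() is implemented; the names ALPHABET and encodeTriplet are invented for the example, and the input is assumed to contain only characters with codes from 0 to 255.

```javascript
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: encode exactly three characters into four Base64 characters.
function encodeTriplet(text) {
  if (text.length !== 3) {
    throw new Error("expected exactly three characters");
  }
  const codes = [0, 1, 2].map((i) => text.charCodeAt(i));
  if (codes.some((code) => code > 0xff)) {
    throw new RangeError("each character must fit in 8 bits");
  }
  // Step 2: join three 8-bit values into one 24-bit group.
  const group = (codes[0] << 16) | (codes[1] << 8) | codes[2];
  // Steps 3 and 4: take four 6-bit values, left to right, and look each one up.
  let out = "";
  for (let shift = 18; shift >= 0; shift -= 6) {
    out += ALPHABET[(group >> shift) & 0x3f];
  }
  return out;
}

console.log(encodeTriplet("Sec")); // U2Vj
```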

Padding in Base64 :

However, a problem arises when the input does not divide evenly into 24-bit groups. Consider the word Cloud: it contains only one complete 24-bit group (Clo), and the remaining characters 'ud' make up only 16 bits, so the last 8-bit character of the second group is missing. In that case the leftover bits are filled out with zeros to complete the last 6-bit group, and one '=' is appended to the output for each missing character: '=' when one character is missing, and '==' when two are missing.

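For example, here is how the word 'Cloud' is encoded into Base64 :

 First group (Clo) :
 8-bit     :  01000011  01101100  01101111
 6-bit     :  010000  110110  110001  101111
 Value     :  16      54      49      47
 Base64    :  Q       2       x       v

 Second group (ud, one character missing) :
 8-bit     :  01110101  01100100  (missing)
 6-bit     :  011101  010110  010000  (padding)
 Value     :  29      22      16
 Base64    :  d       W       Q       =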

Therefore, the Base64 equivalent for Cloud becomes Q2xvdWQ=. Similarly, if two characters are missing from the last group, two '=' characters are appended to the Base64 encoded string. For example, Clou becomes Q2xvdQ==.
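The padding rule can be sketched as a small encoder that handles input of any length. This is an illustrative sketch only; base64Encode and asciiBytes are hypothetical names, and the helper assumes plain ASCII text so that each character is one byte.

```javascript
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: encode an array of byte values (0 to 255) with '=' padding.
function base64Encode(bytes) {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    // Number of real bytes in this group: 3, or 1 or 2 for the last group.
    const present = Math.min(3, bytes.length - i);
    // Missing bytes contribute zero bits to the 24-bit group.
    const group =
      (bytes[i] << 16) | ((bytes[i + 1] || 0) << 8) | (bytes[i + 2] || 0);
    out += ALPHABET[(group >> 18) & 0x3f];
    out += ALPHABET[(group >> 12) & 0x3f];
    // One '=' replaces each output character that would hold no real data.
    out += present > 1 ? ALPHABET[(group >> 6) & 0x3f] : "=";
    out += present > 2 ? ALPHABET[group & 0x3f] : "=";
  }
  return out;
}

// Assumption: the text is plain ASCII, so one character is one byte.
function asciiBytes(text) {
  return Array.from(text, (ch) => ch.charCodeAt(0));
}

console.log(base64Encode(asciiBytes("Cloud"))); // Q2xvdWQ=
console.log(base64Encode(asciiBytes("Clou")));  // Q2xvdQ==
```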

Base64 Encoding/Decoding Functions :

In JavaScript :

For Base64 encoding :  btoa()
 var str = 'sec-art.net';
 var encoded_string = btoa(str);
 console.log(encoded_string); 	// output is 'c2VjLWFydC5uZXQ='
For Base64 decoding : atob()
 var encoded_string = "c2VjLWFydC5uZXQ=";
 var decoded_string = atob(encoded_string);
 console.log(decoded_string); 	// output is 'sec-art.net'
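Decoding simply reverses the encoding steps: every four Base64 characters are turned back into a 24-bit group, which is then split into up to three bytes. The sketch below shows the idea and is not the actual implementation of atob(). The name base64Decode is made up for this example, and validation is deliberately loose (it assumes any '=' appears only at the end of the string).

```javascript
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: decode a padded Base64 string into an array of byte values.
function base64Decode(encoded) {
  if (encoded.length % 4 !== 0) {
    throw new Error("length must be a multiple of 4");
  }
  const bytes = [];
  for (let i = 0; i < encoded.length; i += 4) {
    const chunk = encoded.slice(i, i + 4);
    // Each '=' at the end of the chunk stands for one missing byte.
    const missing = chunk.endsWith("==") ? 2 : chunk.endsWith("=") ? 1 : 0;
    // Rebuild the 24-bit group from four 6-bit values ('=' counts as zero).
    let group = 0;
    for (const ch of chunk) {
      const value = ch === "=" ? 0 : ALPHABET.indexOf(ch);
      if (value < 0) {
        throw new Error("invalid Base64 character: " + ch);
      }
      group = (group << 6) | value;
    }
    // Split the group back into bytes, skipping the missing ones.
    bytes.push((group >> 16) & 0xff);
    if (missing < 2) bytes.push((group >> 8) & 0xff);
    if (missing < 1) bytes.push(group & 0xff);
  }
  return bytes;
}

console.log(String.fromCharCode(...base64Decode("c2VjLWFydC5uZXQ="))); // sec-art.net
```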
In PHP :

For Base64 encoding : base64_encode()
 <?php
 $str = 'sec-art.net';
 echo base64_encode($str); 	// output is 'c2VjLWFydC5uZXQ='
 ?>
For Base64 decoding : base64_decode()
 <?php
 $str = 'c2VjLWFydC5uZXQ=';
 echo base64_decode($str); 	// output is 'sec-art.net'
 ?>


Conclusion :

Base64 encoding is a way of turning binary data into text so that it can be transmitted more easily in things like e-mail and HTML form data. The resulting text contains nothing but letters, digits and the symbols "+", "/" and "=". It is a convenient way to store and transmit binary data over media that are designed for textual data.