Encoding and decoding base 64 with C++

This is a simple library to Base64 encode and decode data with C++.

Interface version 2.0

This is the proposed interface for version 2.0 of this library (as of 2020-04-29).

base64_encode

base64_encode() comes in two overloaded versions:
std::string base64_encode(std::string const& s, bool url = false);
std::string base64_encode(unsigned char const*, size_t len, bool url = false);
Both of these functions encode data as Base 64 and return the encoded string as a std::string.
The parameter url determines whether the encoded string can be used in URLs: if url is set to true, the encoded string will contain -, _ and . instead of +, / and =. (See Wikipedia for more information).
The parameter len is needed for the second version because the length of the data cannot be derived from an unsigned char const* alone: binary data may contain null bytes, so a terminating null cannot be relied upon.
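To make the two parameters concrete, here is a minimal sketch that does not use the library itself. The helper names to_url_variant and c_string_length are hypothetical. The first rewrites an already encoded standard string with the substitution found in the library's source (-, _ and . in place of +, / and =); the second shows why a byte buffer with an embedded null needs an explicit length.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not part of the library): rewrite a standard
// Base64 string into the URL variant by swapping the three characters
// that differ between the two alphabets.
std::string to_url_variant(std::string encoded) {
    for (char& c : encoded) {
        if      (c == '+') c = '-';
        else if (c == '/') c = '_';
        else if (c == '=') c = '.';
    }
    return encoded;
}

// Hypothetical helper: the length a C-string function would report.
// It stops at the first null byte, so binary data gets cut short.
std::size_t c_string_length(unsigned char const* data) {
    return std::strlen(reinterpret_cast<char const*>(data));
}
```

For example, the two bytes 0xfb 0xff encode to +/8= in the standard alphabet, and to_url_variant turns that into -_8. (including the trailing dot). For the four-byte buffer {'a', 0, 'b', 'c'}, c_string_length reports 1, which is why the pointer overload takes len explicitly.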

base64_encode_pem / base64_encode_mime

std::string base64_encode_pem (std::string const& s);
std::string base64_encode_mime(std::string const& s);
These two functions also encode data as Base 64. Additionally, they insert a line break after every 64th (pem) or 76th (mime) encoded character.
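The effect of the line length can be worked out by hand. The following is a rough sketch, not the library's code: encoded_length uses the same formula that appears in base64.cpp, while break_lines is a hypothetical helper that places a line break after every width characters and none after the last line.

```cpp
#include <cstddef>
#include <string>

// Number of Base64 characters (including padding) for in_len input bytes.
std::size_t encoded_length(std::size_t in_len) {
    return (in_len + 2) / 3 * 4;
}

// Hypothetical helper: copy s and put '\n' after every `width` characters,
// but not after the final line.
std::string break_lines(std::string const& s, std::size_t width) {
    if (width == 0) return s;             // nothing sensible to do
    std::string out;
    for (std::size_t i = 0; i < s.size(); i += width) {
        out.append(s, i, width);          // appends at most `width` characters
        if (i + width < s.size()) out.push_back('\n');
    }
    return out;
}
```

With these numbers, 100 input bytes become 136 encoded characters. Broken at 64 characters (pem) that gives three lines of 64, 64 and 8 characters; broken at 76 characters (mime) it gives two lines of 76 and 60 characters.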

base64_decode

base64_decode() decodes an encoded string.
std::string base64_decode(std::string const& s, bool remove_linebreaks = false);
The parameter remove_linebreaks needs to be set to true if the encoded string is expected to contain line breaks, for example because it was produced by base64_encode_pem() or base64_encode_mime().
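The listing of base64.cpp on this page removes only '\n' when remove_linebreaks is true. If the input might use CRLF line endings (an assumption about your data, for example text received by e-mail), one option is to strip both characters yourself and then call base64_decode() with remove_linebreaks left at false. The helper name strip_linebreaks is hypothetical; this is a sketch, not part of the library.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: return a copy of s without '\n' and '\r'.
// Uses the erase-remove idiom with a predicate, so every kept
// character is moved at most once.
std::string strip_linebreaks(std::string s) {
    s.erase(std::remove_if(s.begin(), s.end(),
                           [](char c) { return c == '\n' || c == '\r'; }),
            s.end());
    return s;
}
```

The result can then be passed on unchanged, since it no longer contains any line break characters.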

Source files

The source code is hosted on GitHub. The repository consists of the following files:
base64.cpp and base64.h: The two files that are required to encode and decode data to and from Base64.
test.cpp: A program that uses base64.cpp and verifies that the implemented functionality is correct.
Makefile: The Makefile that compiles base64.cpp and test.cpp and executes the tests.
test-google.cpp: A test file that can be used with a Google test suite (#include <gtest/gtest.h>).
measure-time.cpp: Also uses base64.cpp, to encode and decode a lorem ipsum text and to measure the time it takes to do so.
wsjcpp.yml: This file is apparently used for or as a source file manager.
compile-and-run-test: A shell script that uses the GNU C++ compiler to compile test.cpp and run it.
base64.cpp contains two simple C++ functions to encode and decode strings to and from Base64: base64_encode and base64_decode.
test.cpp can be used to test the functionality.

Contributions

I am thankful for the following contributions to this library.
minastaros for improving the code and adding test cases. (Version 1.01.00)
Evgenii Sopov provided wsjcpp.yml.
Khiem Doan added the correct header file: <cctype> rather than <iostream>, which is sufficient for isalnum().
2020-04-29: it turns out, this header file is not needed anymore.
ipodipad changed the code so that it eliminates the annoying CppCheck static analysis warning cppcheck:variableScope.
wbfloofsky provided a pull request that encouraged me to write directly into pre-allocated std::string buffers rather than first writing into a separate buffer and then copying this buffer (push_back()) to the string. This change led to Version 1.03.
wbfloofsky also added the functionality to measure the time it takes to encode and decode a string in test.cpp. This inspired me to add measure-time.cpp which does that, and only that.
Francisco Ruiz added test-google.cpp, which is more or less the same as test.cpp, but apparently can be used in a Google testing framework.
Francisco Ruiz and huangxinV587 both suggested that I use a lookup function to find the position of an encoded character in base64_chars in order to improve performance. I have merged their code into one version and given the lookup function the name pos_of_char().
JomaCorpFX provided the main functionality for base64_encode_pem(), base64_encode_mime() and the url parameter in base64_encode().
kosniaz spotted a bug with url-encoded data. This bug is now fixed (2020-05-09) and resulted in release candidate 2.rc.01.
Yannic Bonenberger provided an interface (function overloads) for std::string_view (rather than const std::string&). This interface requires at least C++17. This interface is available as of release candidate 2.rc.02 (2020-05-13).
Yannic Bonenberger also notified me of a concurrency issue if the library was used in a multi-threaded environment. This issue is fixed with 2.rc.03 (2020-05-13).
Celemony (via mynameisjohn) provided the changes to fix implicit cast warnings (size_t etc.), resulting in 2.rc.04 (2020-05-31).
Pablo Martin-Gomez changed the throw "…" (which must be caught by catch (const char* …)) to throw std::runtime_error("Input is not valid base64-encoded data.") (which can be caught by catch (std::exception …)), resulting in 2.rc.05 (2020-10-23).
Pablo Martin-Gomez also exchanged the cumbersome while loop
size_t pos=0;
while ((pos = copy.find("\n", pos)) != std::string::npos) {
    copy.erase(pos, 1);
}
with the much more elegant erase-remove idiom, which moves each element to be kept only once instead of at every erase() call.
#include <algorithm>
…
copy.erase(std::remove(copy.begin(), copy.end(), '\n'), copy.end());
Pablo Martin-Gomez also improved the code by returning early from the function decode() if encoded_string is empty, resulting in 2.rc.07 (2020-10-23).
Peter Jansson removed an unnecessary check for an empty string.
Pablo Martin-Gomez, xanather and Joma notified me of a buffer overrun problem that occurs when trying to decode unpadded data (which RFC 2045 apparently explicitly allows).
Yibo Cai noticed a possible out-of-range input buffer access which I hope is fixed with 2.rc.09 / 38c6315.
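The practical benefit of the switch to std::runtime_error mentioned in the list above is that calling code can catch the error through the common base class std::exception. The sketch below illustrates only that mechanism: reject_if_invalid and error_message_for are hypothetical names, and the character check is a simplified stand-in, not the library's decoding logic.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a decoder step that rejects bad input.
// It accepts the characters of both alphabets and throws otherwise.
void reject_if_invalid(unsigned char chr) {
    bool ok = (chr >= 'A' && chr <= 'Z') || (chr >= 'a' && chr <= 'z') ||
              (chr >= '0' && chr <= '9') ||
              chr == '+' || chr == '/' || chr == '-' || chr == '_';
    if (!ok) throw std::runtime_error("Input is not valid base64-encoded data.");
}

// Returns the exception message, or an empty string if chr was accepted.
std::string error_message_for(unsigned char chr) {
    try {
        reject_if_invalid(chr);
    } catch (std::exception const& e) {   // also catches std::runtime_error
        return e.what();
    }
    return std::string();
}
```

A handler written as catch (const char*) would not catch this exception, so code written against the older throw "…" behaviour needs to be adapted.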

base64.cpp

/*
   base64.cpp and base64.h

   base64 encoding and decoding with C++.
   More information at
     https://renenyffenegger.ch/notes/development/Base64/Encoding-and-decoding-base-64-with-cpp

   Version: 2.rc.09 (release candidate)

   Copyright (C) 2004-2017, 2020-2022 René Nyffenegger

   This source code is provided 'as-is', without any express or implied
   warranty. In no event will the author be held liable for any damages
   arising from the use of this software.

   Permission is granted to anyone to use this software for any purpose,
   including commercial applications, and to alter it and redistribute it
   freely, subject to the following restrictions:

   1. The origin of this source code must not be misrepresented; you must not
      claim that you wrote the original source code. If you use this source code
      in a product, an acknowledgment in the product documentation would be
      appreciated but is not required.

   2. Altered source versions must be plainly marked as such, and must not be
      misrepresented as being the original source code.

   3. This notice may not be removed or altered from any source distribution.

   René Nyffenegger rene.nyffenegger@adp-gmbh.ch

*/

#include "base64.h"

#include <algorithm>
#include <stdexcept>

 //
 // Depending on the url parameter in base64_encode, one of
 // two sets of base64 characters needs to be chosen.
 // They differ in their last two characters.
 //
static const char* base64_chars[2] = {
             "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
             "abcdefghijklmnopqrstuvwxyz"
             "0123456789"
             "+/",

             "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
             "abcdefghijklmnopqrstuvwxyz"
             "0123456789"
             "-_"};

static unsigned int pos_of_char(const unsigned char chr) {
 //
 // Return the position of chr within base64_chars.
 //

    if      (chr >= 'A' && chr <= 'Z') return chr - 'A';
    else if (chr >= 'a' && chr <= 'z') return chr - 'a' + ('Z' - 'A')               + 1;
    else if (chr >= '0' && chr <= '9') return chr - '0' + ('Z' - 'A') + ('z' - 'a') + 2;
    else if (chr == '+' || chr == '-') return 62; // Be liberal with input and accept both url ('-') and non-url ('+') base 64 characters
    else if (chr == '/' || chr == '_') return 63; // Ditto for '/' and '_'
    else
 //
 // 2020-10-23: Throw std::exception rather than const char*
 //(Pablo Martin-Gomez, https://github.com/Bouska)
 //
    throw std::runtime_error("Input is not valid base64-encoded data.");
}

static std::string insert_linebreaks(std::string str, size_t distance) {
 //
 // Provided by https://github.com/JomaCorpFX, adapted by me.
 //
    if (!str.length()) {
        return "";
    }

    size_t pos = distance;

    while (pos < str.size()) {
        str.insert(pos, "\n");
        pos += distance + 1;
    }

    return str;
}

template <typename String, unsigned int line_length>
static std::string encode_with_line_breaks(String s) {
  return insert_linebreaks(base64_encode(s, false), line_length);
}

template <typename String>
static std::string encode_pem(String s) {
  return encode_with_line_breaks<String, 64>(s);
}

template <typename String>
static std::string encode_mime(String s) {
  return encode_with_line_breaks<String, 76>(s);
}

template <typename String>
static std::string encode(String s, bool url) {
  return base64_encode(reinterpret_cast<const unsigned char*>(s.data()), s.length(), url);
}

std::string base64_encode(unsigned char const* bytes_to_encode, size_t in_len, bool url) {

    size_t len_encoded = (in_len +2) / 3 * 4;

    unsigned char trailing_char = url ? '.' : '=';

 //
 // Choose set of base64 characters. They differ
 // for the last two positions, depending on the url
 // parameter.
 // A bool (as is the parameter url) is guaranteed
 // to evaluate to either 0 or 1 in C++ therefore,
 // the correct character set is chosen by subscripting
 // base64_chars with url.
 //
    const char* base64_chars_ = base64_chars[url];

    std::string ret;
    ret.reserve(len_encoded);

    size_t pos = 0;

    while (pos < in_len) {
        ret.push_back(base64_chars_[(bytes_to_encode[pos + 0] & 0xfc) >> 2]);

        if (pos+1 < in_len) {
           ret.push_back(base64_chars_[((bytes_to_encode[pos + 0] & 0x03) << 4) + ((bytes_to_encode[pos + 1] & 0xf0) >> 4)]);

           if (pos+2 < in_len) {
              ret.push_back(base64_chars_[((bytes_to_encode[pos + 1] & 0x0f) << 2) + ((bytes_to_encode[pos + 2] & 0xc0) >> 6)]);
              ret.push_back(base64_chars_[  bytes_to_encode[pos + 2] & 0x3f]);
           }
           else {
              ret.push_back(base64_chars_[(bytes_to_encode[pos + 1] & 0x0f) << 2]);
              ret.push_back(trailing_char);
           }
        }
        else {

            ret.push_back(base64_chars_[(bytes_to_encode[pos + 0] & 0x03) << 4]);
            ret.push_back(trailing_char);
            ret.push_back(trailing_char);
        }

        pos += 3;
    }


    return ret;
}

template <typename String>
static std::string decode(String const& encoded_string, bool remove_linebreaks) {
 //
 // decode(…) is templated so that it can be used with String = const std::string&
 // or std::string_view (requires at least C++17)
 //

    if (encoded_string.empty()) return std::string();

    if (remove_linebreaks) {

       std::string copy(encoded_string);

       copy.erase(std::remove(copy.begin(), copy.end(), '\n'), copy.end());

       return base64_decode(copy, false);
    }

    size_t length_of_string = encoded_string.length();
    size_t pos = 0;

 //
 // The approximate length (bytes) of the decoded string might be one or
 // two bytes smaller, depending on the amount of trailing equal signs
 // in the encoded string. This approximation is needed to reserve
 // enough space in the string to be returned.
 //
    size_t approx_length_of_decoded_string = length_of_string / 4 * 3;
    std::string ret;
    ret.reserve(approx_length_of_decoded_string);

    while (pos < length_of_string) {
    //
    // Iterate over encoded input string in chunks. The size of all
    // chunks except the last one is 4 bytes.
    //
    // The last chunk might be padded with equal signs or dots
    // in order to make it 4 bytes in size as well, but this
    // is not required as per RFC 2045.
    //
    // All chunks except the last one produce three output bytes.
    //
    // The last chunk produces at least one and up to three bytes.
    //

       size_t pos_of_char_1 = pos_of_char(encoded_string.at(pos+1) );

    //
    // Emit the first output byte that is produced in each chunk:
    //
       ret.push_back(static_cast<std::string::value_type>( ( (pos_of_char(encoded_string.at(pos+0)) ) << 2 ) + ( (pos_of_char_1 & 0x30 ) >> 4)));

       if ( ( pos + 2 < length_of_string  )       &&  // Check for data that is not padded with equal signs (which is allowed by RFC 2045)
              encoded_string.at(pos+2) != '='     &&
              encoded_string.at(pos+2) != '.'         // accept URL-safe base 64 strings, too, so check for '.' also.
          )
       {
       //
       // Emit a chunk's second byte (which might not be produced in the last chunk).
       //
          unsigned int pos_of_char_2 = pos_of_char(encoded_string.at(pos+2) );
          ret.push_back(static_cast<std::string::value_type>( (( pos_of_char_1 & 0x0f) << 4) + (( pos_of_char_2 & 0x3c) >> 2)));

          if ( ( pos + 3 < length_of_string )     &&
                 encoded_string.at(pos+3) != '='  &&
                 encoded_string.at(pos+3) != '.'
             )
          {
          //
          // Emit a chunk's third byte (which might not be produced in the last chunk).
          //
             ret.push_back(static_cast<std::string::value_type>( ( (pos_of_char_2 & 0x03 ) << 6 ) + pos_of_char(encoded_string.at(pos+3))   ));
          }
       }

       pos += 4;
    }

    return ret;
}

std::string base64_decode(std::string const& s, bool remove_linebreaks) {
   return decode(s, remove_linebreaks);
}

std::string base64_encode(std::string const& s, bool url) {
   return encode(s, url);
}

std::string base64_encode_pem (std::string const& s) {
   return encode_pem(s);
}

std::string base64_encode_mime(std::string const& s) {
   return encode_mime(s);
}

#if __cplusplus >= 201703L
//
// Interface with std::string_view rather than const std::string&
// Requires C++17
// Provided by Yannic Bonenberger (https://github.com/Yannic)
//

std::string base64_encode(std::string_view s, bool url) {
   return encode(s, url);
}

std::string base64_encode_pem(std::string_view s) {
   return encode_pem(s);
}

std::string base64_encode_mime(std::string_view s) {
   return encode_mime(s);
}

std::string base64_decode(std::string_view s, bool remove_linebreaks) {
   return decode(s, remove_linebreaks);
}

#endif  // __cplusplus >= 201703L
Github repository cpp-base64, path: /base64.cpp
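The masks and shifts in base64_encode() and decode() above are easier to follow on a single group of bytes. The following sketch isolates that arithmetic: split_triplet and join_quartet are hypothetical names, they work on 6-bit values (0..63) rather than on characters, and they do not handle padding.

```cpp
#include <array>
#include <cstdint>

// Split three 8-bit bytes into four 6-bit values (each 0..63).
std::array<std::uint8_t, 4> split_triplet(std::uint8_t b0, std::uint8_t b1, std::uint8_t b2) {
    return {{
        static_cast<std::uint8_t>( (b0 & 0xfc) >> 2),
        static_cast<std::uint8_t>(((b0 & 0x03) << 4) + ((b1 & 0xf0) >> 4)),
        static_cast<std::uint8_t>(((b1 & 0x0f) << 2) + ((b2 & 0xc0) >> 6)),
        static_cast<std::uint8_t>(  b2 & 0x3f)
    }};
}

// Inverse operation: join four 6-bit values back into three bytes.
std::array<std::uint8_t, 3> join_quartet(std::uint8_t v0, std::uint8_t v1,
                                         std::uint8_t v2, std::uint8_t v3) {
    return {{
        static_cast<std::uint8_t>( (v0 << 2)          + ((v1 & 0x30) >> 4)),
        static_cast<std::uint8_t>(((v1 & 0x0f) << 4)  + ((v2 & 0x3c) >> 2)),
        static_cast<std::uint8_t>(((v2 & 0x03) << 6)  +   v3)
    }};
}
```

For the bytes of "Man" (0x4d, 0x61, 0x6e), split_triplet yields 19, 22, 5 and 46. Looked up in the standard alphabet these are T, W, F and u, so "Man" encodes to "TWFu"; join_quartet applied to the same four values returns the original three bytes.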

base64.h

//
//  base64 encoding and decoding with C++.
//  Version: 2.rc.09 (release candidate)
//

#ifndef BASE64_H_C0CE2A47_D10E_42C9_A27C_C883944E704A
#define BASE64_H_C0CE2A47_D10E_42C9_A27C_C883944E704A

#include <string>

#if __cplusplus >= 201703L
#include <string_view>
#endif  // __cplusplus >= 201703L

std::string base64_encode     (std::string const& s, bool url = false);
std::string base64_encode_pem (std::string const& s);
std::string base64_encode_mime(std::string const& s);

std::string base64_decode(std::string const& s, bool remove_linebreaks = false);
std::string base64_encode(unsigned char const*, size_t len, bool url = false);

#if __cplusplus >= 201703L
//
// Interface with std::string_view rather than const std::string&
// Requires C++17
// Provided by Yannic Bonenberger (https://github.com/Yannic)
//
std::string base64_encode     (std::string_view s, bool url = false);
std::string base64_encode_pem (std::string_view s);
std::string base64_encode_mime(std::string_view s);

std::string base64_decode(std::string_view s, bool remove_linebreaks = false);
#endif  // __cplusplus >= 201703L

#endif /* BASE64_H_C0CE2A47_D10E_42C9_A27C_C883944E704A */
Github repository cpp-base64, path: /base64.h

Usages

This base 64 class is used for the