17. Serialization
Programming Project 2021/22

17.1. Introduction

When data is saved to a file or transmitted over a network, it must first be converted into a sequence of bytes or characters. This conversion is called serialization.

We want to do so in a way that allows us to rebuild the data later, when the file is read or the transmission is received.
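The round trip described above can be sketched in a few lines. This is only an illustration, assuming Python and its standard `json` module; the `record` dictionary is a made-up example, not data from the course.

```python
import json

# Hypothetical record, used only for illustration.
record = {"name": "Ada", "year": 2021, "grades": [1.0, 1.3, 2.0]}

# Serialize: turn the in-memory object into a string
# that could be written to a file or sent over a network.
text = json.dumps(record)

# Deserialize: rebuild an equal object from that string.
restored = json.loads(text)

assert restored == record
```

The important property is the last line: whatever format is chosen, reading the data back should yield something equal to what was written.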

There are two main groups of serialization formats: character-based (text) formats and binary formats.

You can find a more complete list at https://en.wikipedia.org/wiki/Comparison_of_data-serialization_formats

As is often the case, there is no universally best option here. There is, however, the right tool for each job!

Character-based formats

There are good reasons to use character-based serialization formats:

  • you can actually read these files,
  • you can edit them without any additional tools, and
  • these formats are simpler to implement and work with.
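As a small illustration of the first two points, the sketch below serializes a made-up settings dictionary to indented JSON and then changes a value with an ordinary string replacement, much as one would in a text editor. The names `config`, `text` and `edited` are hypothetical, and Python's standard `json` module is assumed.

```python
import json

# Hypothetical settings, used only for illustration.
config = {"host": "localhost", "port": 8080}

# The serialized form is plain text that a person can read directly.
text = json.dumps(config, indent=2)
print(text)

# Editing needs no special tool: a plain text substitution is enough.
edited = text.replace("8080", "9090")

assert json.loads(edited)["port"] == 9090
```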

Their main disadvantages are that:

  • they are usually a lot larger than their binary equivalents, and
  • manipulating data in them may be inefficient in comparison to binary alternatives.

That is why some database systems (e.g. MongoDB) accept data in JSON but store it in BSON, a binary format.
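The size difference can be made concrete with a rough sketch. It assumes Python's standard `json` and `struct` modules; `struct` is used here only as a simple stand-in for a binary format (it is not BSON), and the list `values` is made-up sample data.

```python
import json
import struct

# Hypothetical sample data: 1000 seven-digit integers.
values = [1_000_000 + i for i in range(1000)]

# Character-based: JSON text, encoded as UTF-8 bytes.
as_text = json.dumps(values).encode("utf-8")

# Binary: each value packed as a 4-byte little-endian signed integer.
as_binary = struct.pack(f"<{len(values)}i", *values)

print(len(as_text), len(as_binary))  # prints "9000 4000"

# Both representations rebuild the same list.
assert json.loads(as_text) == values
assert list(struct.unpack(f"<{len(values)}i", as_binary)) == values
```

In this particular case the text form is more than twice as large, because every digit and separator costs a byte while the binary form uses a fixed four bytes per number. The exact ratio depends on the data and on the formats being compared.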