
Collections Bundling

Certain collections, such as fonts, consist of a very large number of files
(e.g., more than 600 for FontAwesome4 and several thousand for FontAwesome7) and are not modified by users.

Storing every item as a separate file bloats the repository and increases synchronization time.

Proposal

Introduce the concept of a bundle within collections.
A bundle is a JSON file stored alongside the other JSON files in a collection,
with the difference that it may contain multiple items instead of exactly one.

A bundle is identified by the presence of:

  • A $type key with the value "bundle".
  • An items key containing the elements.

Each element within items must strictly follow the JSON format of a regular item in the given collection.
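
As an illustration, the two identifying keys can be checked as in the sketch below. It assumes that items is a JSON array; the helper name is_bundle and the per-item fields (uuid, name) are hypothetical placeholders, not the real collection schema.

```python
import json


def is_bundle(doc: dict) -> bool:
    """Return True if a decoded collection file looks like a bundle."""
    # Assumption: "items" is a JSON array of regular collection items.
    return doc.get("$type") == "bundle" and isinstance(doc.get("items"), list)


# Hypothetical bundle holding two items of the same collection.
bundle_text = """
{
    "$type": "bundle",
    "items": [
        {"uuid": "11111111-1111-1111-1111-111111111111", "name": "glyph-a"},
        {"uuid": "22222222-2222-2222-2222-222222222222", "name": "glyph-b"}
    ]
}
"""

bundle = json.loads(bundle_text)
regular = json.loads('{"uuid": "33333333-3333-3333-3333-333333333333", "name": "glyph-c"}')

print(is_bundle(bundle), is_bundle(regular))
```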

Hash Calculation

For regular items, the hash is computed from the entire JSON file.
For items within a bundle, a slightly different algorithm is applied:

  1. Decode the element into a Python dict.
  2. Serialize it using orjson.dumps.
  3. Compute the hash from the resulting byte string.

This algorithm is robust against formatting changes, including added/removed whitespace.
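
The three steps above can be sketched as follows. The proposal does not name the hash function, so SHA-256 is an assumption here, as is the helper name bundle_item_hash. The fallback to the standard json module exists only so the sketch runs where orjson is not installed; its byte output is not guaranteed to match orjson's.

```python
import hashlib
import json

try:
    import orjson

    def _serialize(item: dict) -> bytes:
        # orjson.dumps returns compact UTF-8 bytes.
        return orjson.dumps(item)

except ImportError:

    def _serialize(item: dict) -> bytes:
        # Fallback for environments without orjson (illustration only).
        return json.dumps(item, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


def bundle_item_hash(item: dict) -> str:
    """Hash a decoded bundle element via its re-serialized form."""
    # SHA-256 is an assumption; the proposal does not specify the digest.
    return hashlib.sha256(_serialize(item)).hexdigest()


compact = '{"name":"glyph-a","code":61440}'
pretty = """
{
    "name": "glyph-a",
    "code": 61440
}
"""
reordered = '{"code":61440,"name":"glyph-a"}'

same = bundle_item_hash(json.loads(compact)) == bundle_item_hash(json.loads(pretty))
order_sensitive = bundle_item_hash(json.loads(compact)) != bundle_item_hash(json.loads(reordered))
```

Whitespace differences disappear because the hash is taken over the re-serialized dict. Key order is a different matter: orjson.dumps keeps the dict's insertion order unless orjson.OPT_SORT_KEYS is passed, so in this sketch reordering keys still changes the hash.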

Implementation Details

Only the iter_items_from_file function needs to be modified
so that it yields bundle elements along with their corresponding hashes.
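
A minimal sketch of what the modified function could look like is given below. The real signature of iter_items_from_file is not shown in this proposal, so the path argument, the (item, hash) tuple shape, and the use of SHA-256 are all assumptions.

```python
import hashlib
import json
from pathlib import Path
from typing import Iterator, Tuple

try:
    import orjson

    def _serialize(item: dict) -> bytes:
        return orjson.dumps(item)

except ImportError:

    def _serialize(item: dict) -> bytes:
        # Fallback so the sketch runs without orjson (illustration only).
        return json.dumps(item, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


def iter_items_from_file(path: Path) -> Iterator[Tuple[dict, str]]:
    """Yield (item, hash) pairs from a regular item file or a bundle."""
    raw = path.read_bytes()
    doc = json.loads(raw)
    if isinstance(doc, dict) and doc.get("$type") == "bundle":
        # Bundle: hash each element from its re-serialized form.
        for item in doc.get("items", []):
            yield item, hashlib.sha256(_serialize(item)).hexdigest()
    else:
        # Regular item: hash the entire file contents.
        yield doc, hashlib.sha256(raw).hexdigest()
```

With this shape, callers that already consume one item per file would not need to know whether an item came from a bundle.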

The noc collection command should be extended with a bundle subcommand:

  • bundle add --bundle=<path> <path1> .. <pathN>
    Add files to a bundle or replace existing entries.

  • bundle remove --bundle=<path> <UUID1> .. <UUIDN>
    Remove entries from a bundle.
    Remove entries from a bundle.
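
The core of both subcommands could look roughly like the sketch below. It covers only the in-memory logic, not the command-line wiring, and it assumes each item carries its identifier in a uuid field; that field name and the function names bundle_add and bundle_remove are hypothetical.

```python
def bundle_add(bundle: dict, new_items: list) -> dict:
    """Add items to a bundle, replacing entries that share a uuid."""
    # Assumption: every item is identified by a "uuid" field.
    by_uuid = {item["uuid"]: item for item in bundle.get("items", [])}
    for item in new_items:
        by_uuid[item["uuid"]] = item  # replaces in place or appends
    return {"$type": "bundle", "items": list(by_uuid.values())}


def bundle_remove(bundle: dict, uuids: list) -> dict:
    """Remove the entries with the given uuids from a bundle."""
    drop = set(uuids)
    kept = [item for item in bundle.get("items", []) if item["uuid"] not in drop]
    return {"$type": "bundle", "items": kept}


bundle = {"$type": "bundle", "items": [{"uuid": "a", "name": "old"}]}
bundle = bundle_add(bundle, [{"uuid": "a", "name": "new"}, {"uuid": "b", "name": "second"}])
after_add = [item["name"] for item in bundle["items"]]
bundle = bundle_remove(bundle, ["b"])
after_remove = [item["uuid"] for item in bundle["items"]]
```

Keying on the identifier makes bundle add idempotent: adding the same file twice leaves a single entry.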