Collections Bundling
Certain collections, such as fonts, consist of a very large number of files
(e.g., >600 for FontAwesome4, several thousand for FontAwesome7) and are not modified by users.
An excessively large number of files overloads the repository and increases synchronization time.
Proposal
Introduce the concept of a bundle within collections.
A bundle is a JSON file stored alongside other JSON files in a collection,
with the distinction that it may contain multiple items.
A bundle is identified by the presence of:
- A $type key with the value "bundle".
- An items key containing the elements.
Each element within items must strictly follow the JSON format of a regular item in the given collection.
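The two identifying keys can be checked with a few lines of Python. This is a minimal sketch, not the project's code: the helper name is_bundle, the assumption that items holds a JSON array, and the item fields in the sample data are all made up for illustration.

```python
import json
from typing import Any


def is_bundle(data: Any) -> bool:
    """Return True if decoded JSON carries both bundle markers."""
    return (
        isinstance(data, dict)
        and data.get("$type") == "bundle"
        # Assumption: "items" is a JSON array of regular collection items.
        and isinstance(data.get("items"), list)
    )


# Hypothetical bundle with two made-up items; real items follow the
# format of the collection they belong to.
bundle_text = """
{
  "$type": "bundle",
  "items": [
    {"name": "icon-a", "uuid": "00000000-0000-0000-0000-000000000001"},
    {"name": "icon-b", "uuid": "00000000-0000-0000-0000-000000000002"}
  ]
}
"""
regular_text = '{"name": "icon-c", "uuid": "00000000-0000-0000-0000-000000000003"}'

bundle_flag = is_bundle(json.loads(bundle_text))    # bundle markers present
regular_flag = is_bundle(json.loads(regular_text))  # plain item, no markers
```

A file that lacks either key is treated as a regular single-item file in this sketch.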
Hash Calculation
For regular items, the hash is computed from the entire JSON file.
For items within a bundle, a slightly different algorithm is applied:
- Decode the element into a Python dict.
- Serialize it using orjson.dumps.
- Compute the hash from the resulting byte string.
This algorithm is robust against formatting changes, including added/removed whitespace.
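The three steps can be sketched as follows. The text does not name the hash function, so SHA-256 is an assumption here, as are the helper names canonical_bytes and element_hash. The standard-library fallback exists only so the sketch runs where orjson is not installed; it is not part of the proposal.

```python
import hashlib
import json
from typing import Any, Dict

try:
    import orjson  # third-party serializer named in the text
except ImportError:  # stand-in so the sketch still runs without orjson
    orjson = None


def canonical_bytes(element: Dict[str, Any]) -> bytes:
    """Serialize a decoded element to a compact byte string."""
    if orjson is not None:
        return orjson.dumps(element)
    # Fallback only: compact JSON from the standard library.
    return json.dumps(
        element, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def element_hash(element: Dict[str, Any]) -> str:
    """Hash an element from its re-serialized form (SHA-256 is assumed)."""
    return hashlib.sha256(canonical_bytes(element)).hexdigest()


# The same element written with different whitespace in the source text.
compact = json.loads('{"name":"icon-a","size":16}')
spaced = json.loads('{\n  "name": "icon-a",\n  "size": 16\n}')
same_hash = element_hash(compact) == element_hash(spaced)

# Key order survives decoding and serialization, so reordering keys
# produces a different byte string and therefore a different hash.
reordered = json.loads('{"size":16,"name":"icon-a"}')
order_matters = element_hash(compact) != element_hash(reordered)
```

Whitespace differences disappear because the hash is taken over the re-serialized bytes rather than the source text. Key order is still significant with default settings; if that is undesirable, passing option=orjson.OPT_SORT_KEYS to orjson.dumps is one possible adjustment.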
Implementation Details
Only the iter_items_from_file function needs to be modified so that it yields bundle elements along with their corresponding hashes.
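A rough sketch of the modified iteration logic is shown here. The real signature of iter_items_from_file is not given in this document, so the function below is a hypothetical stand-in that works on raw file bytes and receives the per-element hash function as an argument. SHA-256 for regular files and the demo_hash helper are assumptions as well.

```python
import hashlib
import json
from typing import Any, Callable, Dict, Iterator, Tuple

Item = Dict[str, Any]


def iter_items_sketch(
    raw: bytes, element_hash: Callable[[Item], str]
) -> Iterator[Tuple[Item, str]]:
    """Yield (item, hash) pairs from the raw bytes of one collection file."""
    data = json.loads(raw)
    if isinstance(data, dict) and data.get("$type") == "bundle":
        # Bundle: each element is hashed from its re-serialized form.
        for element in data.get("items", []):
            yield element, element_hash(element)
    else:
        # Regular item: the hash covers the entire file content.
        yield data, hashlib.sha256(raw).hexdigest()


def demo_hash(element: Item) -> str:
    """Illustrative element hash; stands in for the bundle algorithm."""
    payload = json.dumps(element, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


bundle_raw = b'{"$type": "bundle", "items": [{"name": "a"}, {"name": "b"}]}'
regular_raw = b'{"name": "c"}'

bundle_items = list(iter_items_sketch(bundle_raw, demo_hash))
regular_items = list(iter_items_sketch(regular_raw, demo_hash))
```

Callers see the same (item, hash) shape for both kinds of file, which is why no other function would need to change under this reading of the proposal.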
The noc collection command should be extended with a bundle subcommand:
- bundle add --bundle=<path> <path1>..<pathN>
  Add files to a bundle or replace existing entries.
- bundle remove --bundle=<path> UUID1 .. UUIDN
  Remove entries from a bundle.
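The core of both subcommands could look like the sketch below, which operates on an already decoded bundle and leaves out argument parsing and file I/O. It assumes every item carries a uuid field, as the UUID arguments of bundle remove suggest; the function names bundle_add and bundle_remove are hypothetical.

```python
from typing import Any, Dict, Iterable

Item = Dict[str, Any]


def bundle_add(bundle: Dict[str, Any], new_items: Iterable[Item]) -> None:
    """Add items to the bundle, replacing entries that share a uuid."""
    by_uuid = {item["uuid"]: item for item in bundle.setdefault("items", [])}
    for item in new_items:
        # An existing uuid keeps its position; a new uuid is appended.
        by_uuid[item["uuid"]] = item
    bundle["items"] = list(by_uuid.values())


def bundle_remove(bundle: Dict[str, Any], uuids: Iterable[str]) -> None:
    """Remove the entries whose uuid is listed."""
    drop = set(uuids)
    bundle["items"] = [
        item for item in bundle.get("items", []) if item["uuid"] not in drop
    ]


# Made-up data: short strings stand in for real UUIDs.
bundle = {"$type": "bundle", "items": [{"uuid": "u1", "name": "old"}]}

bundle_add(
    bundle,
    [{"uuid": "u1", "name": "new"}, {"uuid": "u2", "name": "second"}],
)
names_after_add = [item["name"] for item in bundle["items"]]

bundle_remove(bundle, ["u1"])
names_after_remove = [item["name"] for item in bundle["items"]]
```

Keying on uuid makes bundle add idempotent: adding the same file twice leaves a single entry.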