Razor Design
The cornerstone of razor is the package set on-disk data structure / file format. It is how razor represents the packages installed on the system, and it's what razor downloads from upstream servers to find out what's available. It's a simple binary file format, somewhat inspired by the ELF binary format. It has a sorted list of all packages and a sorted list of all unique properties (requires/provides/conflicts/obsoletes). Each package has a list of the properties associated with it (as a list of indices into the list of all properties), and each property has a list of the packages that it belongs to (as a list of indices into the package list). Strings are stored in a string pool and referred to by their byte index in the pool.
Much of this is still changing, but as of June 13th, this write-up from Dan Winship is reasonably accurate. The repo file format / razor_set data structure starts with a header containing some number of sections, terminated by a section with type 0:
    struct razor_set_header {
        uint32_t magic;
        uint32_t version;
        struct razor_set_section sections[0];
    };

    struct razor_set_section {
        uint32_t type;
        uint32_t offset;
        uint32_t size;
    };
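As a rough illustration of the type-0 terminator, a reader could locate a section by walking the array until it reaches that terminator. This is only a sketch: find_section() is a hypothetical helper rather than part of razor's API, the structs are repeated so the example is self-contained, and a C99 flexible array member stands in for the sections[0] form shown above.

```c
#include <stddef.h>
#include <stdint.h>

struct razor_set_section {
    uint32_t type;
    uint32_t offset;
    uint32_t size;
};

struct razor_set_header {
    uint32_t magic;
    uint32_t version;
    struct razor_set_section sections[];  /* flexible array member */
};

/* Hypothetical helper (not razor API): return the first section with the
 * given type, or NULL if the terminating type-0 section is reached first. */
static const struct razor_set_section *
find_section(const struct razor_set_header *header, uint32_t type)
{
    const struct razor_set_section *s;

    for (s = header->sections; s->type != 0; s++) {
        if (s->type == type)
            return s;
    }
    return NULL;
}
```

Because the walk stops at the terminator, the header does not need to record how many sections it contains.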
razor_set_open() mmaps the repo file and creates a struct razor_set:
    struct razor_set {
        struct array string_pool;
        struct array packages;
        struct array properties;
        struct array files;
        struct array package_pool;
        struct array property_pool;
        struct array file_pool;
        struct razor_set_header *header;
    };
by finding the sections with those IDs and creating "struct array"s pointing to the right places in the mmap()ed data. (This is the only processing needed when reading in the file; everything else is used exactly as-is.)
The sections
- RAZOR_STRING_POOL: Stores one copy of each string that appears in the repo. (At the moment, this is: package names, package versions, property names, property versions, and (basenames of) filenames.) The strings are arbitrarily-sized, 0-terminated, and not in any particular order (although the empty string always ends up being at offset 0).
- RAZOR_PACKAGES: Array of struct razor_package; one for each package in the set, sorted by name.
- RAZOR_PROPERTIES: Array of struct razor_property; one for each unique property in the set, sorted by type, then name, then relation type (e.g., "<" or ">="), then version. (Properties with no version have relation type RAZOR_VERSION_EQUAL, and version "".)
- RAZOR_FILES: Array of struct razor_entry; one for each file owned by any package in the set. The current sort order (which is subject to change) is breadth-first, sorted by basename. For example: /, /bin, /dev, /etc, /bin/false, /bin/true, /dev/null, /etc/passwd.
- RAZOR_PACKAGE_POOL: Array of struct list, with each list item containing the index of a struct razor_package in the packages section. See the discussion of lists below.
- RAZOR_PROPERTY_POOL: Array of struct list, with each list item containing the index of a struct razor_property in the properties section. See the discussion of lists below.
- RAZOR_FILE_POOL: Array of struct list, with each list item containing the index of a struct razor_entry in the files section. See the discussion of lists below.
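To make the string pool description concrete, here is a minimal sketch of resolving a string from its byte offset. pool_string() is a hypothetical helper and the sample pool contents are invented for illustration; the only properties taken from the text above are that strings are 0-terminated, addressed by byte offset, and that the empty string sits at offset 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not razor API): resolve a byte offset into the
 * string pool. Returns NULL if the offset lies outside the pool. */
static const char *
pool_string(const char *pool, size_t pool_size, uint32_t offset)
{
    if (offset >= pool_size)
        return NULL;
    /* Each string is 0-terminated, so the pointer can be used directly
     * as a C string without copying. */
    return pool + offset;
}

/* Invented sample pool: "" at offset 0, "bash" at offset 1,
 * "4.2" at offset 6. */
static const char sample_pool[] = { '\0', 'b', 'a', 's', 'h', '\0',
                                    '4', '.', '2', '\0' };
```

With this sample data, pool_string(sample_pool, sizeof sample_pool, 1) points at "bash", and offset 0 yields the empty string, which is convenient for properties that have no version.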
Data types
Note that the exact layout of bits involves some historical accidents. (Particularly the fact that the "name" field in most structs loses its high bits to a flags field.)
    struct list_head {
        uint list_ptr : 24;
        uint flags : 8;
    };

    struct list {
        uint data : 24;
        uint flags : 8;
    };
Used to store lists of package, property, or file IDs. "struct list_head" stores the head of the list, which points to one or more "struct list"s in the appropriate "pool" section. ("struct list" should probably be called "struct list_item".)
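One consequence of the 24-bit/8-bit split is that a list item can only address indices up to 2^24 - 1. The sketch below shows that limit. It assumes "uint" means unsigned int and that the compiler packs both bit-fields into a single 32-bit word (typical for GCC and Clang, but implementation-defined). make_list_item() and LIST_INDEX_MAX are hypothetical names, and the meaning of individual flag bits is not covered here.

```c
/* "uint" in the structs above is assumed to be unsigned int. */
struct list_head {
    unsigned int list_ptr : 24;
    unsigned int flags : 8;
};

struct list {
    unsigned int data : 24;
    unsigned int flags : 8;
};

/* Largest index that fits in a 24-bit field: 16777215. */
#define LIST_INDEX_MAX ((1u << 24) - 1)

/* Hypothetical helper (not razor API): build a list item, masking both
 * values to the width of their bit-fields. */
static struct list
make_list_item(unsigned int index, unsigned int flags)
{
    struct list item;

    item.data = index & LIST_INDEX_MAX;
    item.flags = flags & 0xffu;
    return item;
}
```

An index larger than LIST_INDEX_MAX would be silently truncated by the mask, so code that builds a set would need to check the bound before storing it.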