Layout in Memory
sf.net Project Page
Email the Author
Layout in Memory
and System Organization
- Can you describe Seed in terms of system organization and memory layout?
Sure. A running Seed instance's RAM first contains a
system_header, which describes various characteristics of the system as a whole. Following that is a
volume_header_table, which contains a
volume_header for every volume accessable to the system. Every volume has an
object_header_table, which contains an
object_header for every object in the volume. Object headers describe the properties and contents of the object in question, in a way analogous to the way that inodes describe files in some traditional filesystems.
- Fair enough, but you have yet to explain how exactly volumes contain objects.
Volumes are comprised of an
volume_header (which is copied into the
volume_header_table described by the
system_header at boot time), a
primary_object_header_table, a (considerably smaller)
Object headers normally reside in a volume's
primary_object_header_table, at an offset determined by the hash of the object's name (modula the size of the
primary_object_header_table). If a hash collision has occured, lookup is attempted in
secondary_object_header_table (modula size). Object footers (the actual data or value of objects) reside in a volume's
object_footer_space. As object headers contain references to object footers, once the header is located, the footer can be too.
- So the object header table is essentially a hashtable, with (hashes of) object names as keys, and object headers as values?
Yup... From there, the object headers reference (and describe) the object footers.
- The way hash collisions are handled - with a smaller secondary table - seems unusual.
As far as I know, nobody else does it that way. It seems pretty clean, though, especially when compared to the traditional linked-list-of-collisions method. Also, lookups would run in either O(1) (primary table) or O(2) (secondary table) time. If collisions start occuring in the secondary table, the time has clearly come to rehash and enlarge it... In theory, a third table could be created to handle collisions in the secondary one, but that seems unnecessary.
- But what about object footers? How does space for them get allocated and deallocated? Surely you don't resort to traditional measures like reference-counting garbage collection...
Indeed not! ;) When volumes are formatted (at boot-time, for RAM), empty objects are created, named, and given footers of sizes in the pattern of 256B, 1KB, 16KB, 64KB, 256KB, 1MB, etc. As an example, let's use an object named
/loc/empty/vol_0x00BADF00/_1KB/space_0xDEADBEEF, whose empty footer occupies 1KB of memory, and resides at 0xDEADBEEF (an address) in vol_0x00BADF00's (a volume's) address space.
When a new object (1KB in size, say, and named
/home/sayke/foobar) is to be allocated in a volume, its header is hashed into its volume's
object_header_table, and associated with the footer previously claimed by the empty object (
/loc/empty/vol_0x00BADF00/_1KB/space_0xDEADBEEF). The old header for the empty space is then zeroed out... And so space for a new object is allocated.
- So you're using pool allocation, kinda like Unix's malloc(), only in a really weird way.
Yes. To avoid wasting large amounts of memory in the presence of 1.1KB objects, support for block aggregation (combining a 1KB block with a 256B block, say) would have to be implemented at some point... Also, because the vast majority of objects in a fully-functional Seed system would be very small, volume nesting (creating objects that act as volumes in their own right, so as to abstract away a bunch of those tiny, usually-locally-referenced objects) seems like a useful optimization... Although, right now, I'm not worrying about implementing such things. I'd like to keep it simple and get everything else working properly first.