Dynamic Memory Allocation
#lecture note based on 15-213 Introduction to Computer Systems
H2 Allocation and freeing
~~ malloc ~ ~ ~
There are just things we don’t know the size of. We use dynamic memory allocation to acquiring virtual memory at runtime.
Types of allocators
- explicit - need to call to allocation
- like
malloc
,free
in C
- like
- implicit allocator - something else does the allocation and freeing
new
and garbage collector in Java
H3 Malloc, Calloc, Free
malloc
:- (usually it allocates larger size than requested)
- success
- return pointer to memory block according to 16 byte alignment since it’s the maximum alignment requirement (viz. it rounds up)
- fail
- returns
NULL
and setserrno
- returns
calloc
:- malloc but sets stuff to
0
- malloc but sets stuff to
free
- must come from malloc, calloc, realloc
H3 Heap visualisation convention
H3 Constraints
- Applications
malloc
andfree
can be in arbitrary sequencefree
must be tomalloc
’d block
- Explicit allocators
- Can’t control number and size of allocated blocks
- Can’t move around already allocated stuff
- Performance
- Maximise throughput - speed
- Maximise allocated / actual space
- Minimise overhead
- aggregate payload
- current heap size
- => heap space not used for program data
H2 Fragmentation
Poor memory allocation leading to low utilisation
- internal fragmentation - payload smaller than block size caused by
free
doesn’t specify size, it usually takes extra space before payload to record the size- padding for alignment - malloc 1 byte => 16/32 bytes
- other policy decisions - minimum size requirement, etc.
- external fragmentation - enough aggregate heap memory, but no single block is enough (viz. free space enough but not together)
- depends on future requests
- solution: grow the heap
H2 Implementation
H6 Problems to address
- How to know how much to free given a pointer
- How to keep track of what block is free
- Which free block to use when allocating
- What to do with free space still in block after putting in something smaller
H6 Possible methods
- Implicit list: use length to link blocks (and tag each block as free/not free)
- Explicit list among free blocks (pointer to next free block in header)
- Segregated free list - different list for different size
- Block sorted by size - balanced tree, etc.
H6 Design decisions to make
- Placement policy
- Splitting policy
- Coalescing policy
- Insertion policy (for explicit list)
H3 Method 1: Implicit list
- Keep track of size of block in bytes before payload, called header or header field (align payload, not header)
- Put inside header:
- size on higher bits
- allocation status, and other flags, in lower bits
H4 Block splitting and coalescing
If not perfect fit, split the block or take entire block, maybe former saves more space.
When freeing, leave blocks split or merge aka coalesce them. If latter, see if next surrounding block also free, if so increase size of first block (no need to zero out header in middle).
H4 Boundary tag
Use last 8 byte to indicate size, effectively letting us travel both ways.
Note footer needs to have run time offset - putting it after payload[0]
doesn’t work.
Dummy footer and header at two sides of heap - prevents walking off heap / coalescing beyond heap
Problem: boundary tags take lots of space, so lots of internal fragmentation
H4 Status of previous block, rather than footer
Recall we have 4 spare bits, why not use another to indicate if previous allocated?
So we only need header for allocated block (free block can still have footer to indicate size)
H3 Method 2: explicit free list
Implicit list is $O(n)$ search like list traversal.
Idea: add pointer among free blocks (allocated block as before)
Note the pointers can point in any order
H4 Insertion policy
Where to put newly freed blocks in the list?
- Unordered, we can put hem in front or at back of list
- Address ordered: maybe slower but has other benefits?
H3 Method 3: segregated free lists
Idea: partition the free blocks according to some criteria, so that search doesn’t have to go through the whole set.
E.g. set of list of size 16 blocks, set of size 32-48 blocks, set of the rest…
Usually, higher throughput and better memory utilisation