mirror of
https://github.com/microsoft/mimalloc.git
synced 2025-07-06 19:38:41 +03:00
merge from dev
commit
3726cf94ba
106 changed files with 6623 additions and 5540 deletions
696
doc/doxyfile
File diff suppressed because it is too large
@ -25,12 +25,15 @@ without code changes, for example, on Unix you can use it as:
```

Notable aspects of the design include:

- __small and consistent__: the library is about 8k LOC using simple and
consistent data structures. This makes it very suitable
to integrate and adapt in other projects. For runtime systems it
provides hooks for a monotonic _heartbeat_ and deferred freeing (for
bounded worst-case times with reference counting).
Partly due to its simplicity, mimalloc has been ported to many systems (Windows, macOS,
Linux, WASM, various BSD's, Haiku, MUSL, etc) and has excellent support for dynamic overriding.
At the same time, it is an industrial strength allocator that runs (very) large scale
distributed services on thousands of machines with excellent worst case latencies.
- __free list sharding__: instead of one big free list (per size class) we have
many smaller lists per "mimalloc page" which reduces fragmentation and
increases locality --
@ -45,23 +48,23 @@ Notable aspects of the design include:
and the chance of contending on a single location will be low -- this is quite
similar to randomized algorithms like skip lists where adding
a random oracle removes the need for a more complex algorithm.
- __eager page reset__: when a "page" becomes empty (with increased chance
due to free list sharding) the memory is marked to the OS as unused ("reset" or "purged")
- __eager page purging__: when a "page" becomes empty (with increased chance
due to free list sharding) the memory is marked to the OS as unused (reset or decommitted)
reducing (real) memory pressure and fragmentation, especially in long running
programs.
- __secure__: _mimalloc_ can be build in secure mode, adding guard pages,
- __secure__: _mimalloc_ can be built in secure mode, adding guard pages,
randomized allocation, encrypted free lists, etc. to protect against various
heap vulnerabilities. The performance penalty is only around 5% on average
heap vulnerabilities. The performance penalty is usually around 10% on average
over our benchmarks.
- __first-class heaps__: efficiently create and use multiple heaps to allocate across different regions.
A heap can be destroyed at once instead of deallocating each object separately.
- __bounded__: it does not suffer from _blowup_ \[1\], has bounded worst-case allocation
times (_wcat_), bounded space overhead (~0.2% meta-data, with low internal fragmentation),
and has no internal points of contention using only atomic operations.
- __fast__: In our benchmarks (see [below](#performance)),
_mimalloc_ outperforms all other leading allocators (_jemalloc_, _tcmalloc_, _Hoard_, etc),
and usually uses less memory (up to 25% more in the worst case). A nice property
is that it does consistently well over a wide range of benchmarks.
times (_wcat_) (up to OS primitives), bounded space overhead (~0.2% meta-data, with low
internal fragmentation), and has no internal points of contention using only atomic operations.
- __fast__: In our benchmarks (see [below](#bench)),
_mimalloc_ outperforms other leading allocators (_jemalloc_, _tcmalloc_, _Hoard_, etc),
and often uses less memory. A nice property is that it does consistently well over a wide range
of benchmarks. There is also good huge OS page support for larger server programs.
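
The free list sharding idea in the list above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `page_t`, `block_t`, `page_init`, `page_alloc`, `page_free`, `page_is_empty` and the two size constants are hypothetical names invented for this illustration, and the layout is not mimalloc's actual data structure. It shows one free list per page rather than one shared list per size class, plus the "page became empty" condition that the purging bullet refers to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes, chosen only to keep the sketch small. */
enum { BLOCK_SIZE = 32, BLOCKS_PER_PAGE = 8 };

/* A block holds a free-list link while free and user data while allocated. */
typedef union block_u {
  union block_u* next;
  unsigned char  payload[BLOCK_SIZE];
} block_t;

/* Each page owns its own small free list instead of sharing one global list. */
typedef struct page_s {
  block_t* free;                     /* head of this page's free list */
  size_t   used;                     /* number of blocks handed out */
  block_t  blocks[BLOCKS_PER_PAGE];  /* the page's block storage */
} page_t;

/* Thread all blocks of the page onto its local free list. */
static void page_init(page_t* page) {
  page->free = NULL;
  page->used = 0;
  for (size_t i = BLOCKS_PER_PAGE; i > 0; i--) {
    page->blocks[i - 1].next = page->free;
    page->free = &page->blocks[i - 1];
  }
}

/* Pop a block from the page-local list; returns NULL when the page is full. */
static void* page_alloc(page_t* page) {
  block_t* b = page->free;
  if (b == NULL) return NULL;
  page->free = b->next;
  page->used++;
  return b;
}

/* Push a block back onto the list of the page it came from. */
static void page_free(page_t* page, void* p) {
  assert(page->used > 0);
  block_t* b = (block_t*)p;
  b->next = page->free;
  page->free = b;
  page->used--;
}

/* An empty page is a candidate for being purged. */
static int page_is_empty(const page_t* page) {
  return page->used == 0;
}
```

Because every block returns to the list of its own page, blocks that are allocated close in time tend to sit in the same page, and a page whose `used` count drops to zero can be handed back as a whole.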

You can read more on the design of _mimalloc_ in the
[technical report](https://www.microsoft.com/en-us/research/publication/mimalloc-free-list-sharding-in-action)
@ -278,8 +281,7 @@ void* mi_zalloc_small(size_t size);
/// The returned size can be
/// used to call \a mi_expand successfully.
/// The returned size is always at least equal to the
/// allocated size of \a p, and, in the current design,
/// should be less than 16.7% more.
/// allocated size of \a p.
///
/// @see [_msize](https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/msize?view=vs-2017) (Windows)
/// @see [malloc_usable_size](http://man7.org/linux/man-pages/man3/malloc_usable_size.3.html) (Linux)
@ -304,7 +306,7 @@ size_t mi_good_size(size_t size);
/// in very narrow circumstances; in particular, when a long running thread
/// allocates a lot of blocks that are freed by other threads it may improve
/// resource usage by calling this every once in a while.
void mi_collect(bool force);
void mi_collect(bool force);

/// Deprecated
/// @param out Ignored, outputs to the registered output function or stderr by default.
@ -428,7 +430,7 @@ int mi_reserve_os_memory(size_t size, bool commit, bool allow_large);
/// allocated in some manner and available for use by mimalloc.
/// @param start Start of the memory area
/// @param size The size of the memory area.
/// @param commit Is the area already committed?
/// @param is_committed Is the area already committed?
/// @param is_large Does it consist of large OS pages? Set this to \a true as well for memory
/// that should not be decommitted or protected (like rdma etc.)
/// @param is_zero Does the area consist of zeros?
@ -453,7 +455,7 @@ int mi_reserve_huge_os_pages_interleave(size_t pages, size_t numa_nodes, size_t
/// Reserve \a pages of huge OS pages (1GiB) at a specific \a numa_node,
/// but stops after at most `timeout_msecs` milliseconds.
/// @param pages The number of 1GiB pages to reserve.
/// @param numa_node The NUMA node where the memory is reserved (start at 0).
/// @param numa_node The NUMA node where the memory is reserved (start at 0). Use -1 for no affinity.
/// @param timeout_msecs Maximum number of milli-seconds to try reserving, or 0 for no timeout.
/// @returns 0 if successful, \a ENOMEM if running out of memory, or \a ETIMEDOUT if timed out.
///
@ -486,6 +488,91 @@ bool mi_is_redirected();
/// on other systems as the amount of read/write accessible memory reserved by mimalloc.
void mi_process_info(size_t* elapsed_msecs, size_t* user_msecs, size_t* system_msecs, size_t* current_rss, size_t* peak_rss, size_t* current_commit, size_t* peak_commit, size_t* page_faults);

/// @brief Show all current arenas.
/// @param show_inuse Show the arena blocks that are in use.
/// @param show_abandoned Show the abandoned arena blocks.
/// @param show_purge Show arena blocks scheduled for purging.
void mi_debug_show_arenas(bool show_inuse, bool show_abandoned, bool show_purge);

/// Mimalloc uses large (virtual) memory areas, called "arena"s, from the OS to manage its memory.
/// Each arena has an associated identifier.
typedef int mi_arena_id_t;

/// @brief Return the size of an arena.
/// @param arena_id The arena identifier.
/// @param size Returned size in bytes of the (virtual) arena area.
/// @return base address of the arena.
void* mi_arena_area(mi_arena_id_t arena_id, size_t* size);

/// @brief Reserve huge OS pages (1GiB) into a single arena.
/// @param pages Number of 1GiB pages to reserve.
/// @param numa_node The associated NUMA node, or -1 for no NUMA preference.
/// @param timeout_msecs Max amount of milli-seconds this operation is allowed to take. (0 is infinite)
/// @param exclusive If exclusive, only a heap associated with this arena can allocate in it.
/// @param arena_id The arena identifier.
/// @return 0 if successful, \a ENOMEM if running out of memory, or \a ETIMEDOUT if timed out.
int mi_reserve_huge_os_pages_at_ex(size_t pages, int numa_node, size_t timeout_msecs, bool exclusive, mi_arena_id_t* arena_id);

/// @brief Reserve OS memory to be managed in an arena.
/// @param size Size to reserve.
/// @param commit Should the memory be initially committed?
/// @param allow_large Allow the use of large OS pages?
/// @param exclusive Is the returned arena exclusive?
/// @param arena_id The new arena identifier.
/// @return Zero on success, an error code otherwise.
int mi_reserve_os_memory_ex(size_t size, bool commit, bool allow_large, bool exclusive, mi_arena_id_t* arena_id);

/// @brief Manage externally allocated memory as a mimalloc arena. This memory will not be freed by mimalloc.
/// @param start Start address of the area.
/// @param size Size in bytes of the area.
/// @param is_committed Is the memory already committed?
/// @param is_large Does it consist of (pinned) large OS pages?
/// @param is_zero Is the memory zero-initialized?
/// @param numa_node Associated NUMA node, or -1 to have no NUMA preference.
/// @param exclusive Is the arena exclusive (where only heaps associated with the arena can allocate in it)
/// @param arena_id The new arena identifier.
/// @return `true` if successful.
bool mi_manage_os_memory_ex(void* start, size_t size, bool is_committed, bool is_large, bool is_zero, int numa_node, bool exclusive, mi_arena_id_t* arena_id);

/// @brief Create a new heap that only allocates in the specified arena.
/// @param arena_id The arena identifier.
/// @return The new heap or `NULL`.
mi_heap_t* mi_heap_new_in_arena(mi_arena_id_t arena_id);

/// @brief Create a new heap
/// @param heap_tag The heap tag associated with this heap; heaps only reclaim memory between heaps with the same tag.
/// @param allow_destroy Is \a mi_heap_destroy allowed? Not allowing this allows the heap to reclaim memory from terminated threads.
/// @param arena_id If not 0, the heap will only allocate from the specified arena.
/// @return A new heap or `NULL` on failure.
///
/// The \a arena_id can be used by runtimes to allocate only in a specified pre-reserved arena.
/// This is used for example for a compressed pointer heap in Koka.
/// The \a heap_tag enables heaps to keep objects of a certain type isolated to heaps with that tag.
/// This is used for example in the CPython integration.
mi_heap_t* mi_heap_new_ex(int heap_tag, bool allow_destroy, mi_arena_id_t arena_id);

/// A process can associate threads with sub-processes.
/// A sub-process will not reclaim memory from (abandoned heaps/threads)
/// other subprocesses.
typedef void* mi_subproc_id_t;

/// @brief Get the main sub-process identifier.
mi_subproc_id_t mi_subproc_main(void);

/// @brief Create a fresh sub-process (with no associated threads yet).
/// @return The new sub-process identifier.
mi_subproc_id_t mi_subproc_new(void);

/// @brief Delete a previously created sub-process.
/// @param subproc The sub-process identifier.
/// Only delete sub-processes if all associated threads have terminated.
void mi_subproc_delete(mi_subproc_id_t subproc);

/// Add the current thread to the given sub-process.
/// This should be called right after a thread is created (and no allocation has taken place yet)
void mi_subproc_add_current_thread(mi_subproc_id_t subproc);


/// \}

// ------------------------------------------------------
@ -495,20 +582,24 @@ void mi_process_info(size_t* elapsed_msecs, size_t* user_msecs, size_t* system_m
/// \defgroup aligned Aligned Allocation
///
/// Allocating aligned memory blocks.
/// Note that `alignment` always follows `size` for consistency with the unaligned
/// allocation API, but unfortunately this differs from `posix_memalign` and `aligned_alloc` in the C library.
///
/// \{

/// The maximum supported alignment size (currently 1MiB).
#define MI_BLOCK_ALIGNMENT_MAX (1024*1024UL)

/// Allocate \a size bytes aligned by \a alignment.
/// @param size number of bytes to allocate.
/// @param alignment the minimal alignment of the allocated memory. Must be less than #MI_BLOCK_ALIGNMENT_MAX.
/// @returns pointer to the allocated memory or \a NULL if out of memory.
/// The returned pointer is aligned by \a alignment, i.e.
/// `(uintptr_t)p % alignment == 0`.
///
/// @param alignment the minimal alignment of the allocated memory.
/// @returns pointer to the allocated memory or \a NULL if out of memory,
/// or if the alignment is not a power of 2 (including 0). The \a size is unrestricted
/// (and does not have to be an integral multiple of the \a alignment).
/// The returned pointer is aligned by \a alignment, i.e. `(uintptr_t)p % alignment == 0`.
/// Returns a unique pointer if called with \a size 0.
///
/// Note that `alignment` always follows `size` for consistency with the unaligned
/// allocation API, but unfortunately this differs from `posix_memalign` and `aligned_alloc` in the C library.
///
/// @see [aligned_alloc](https://en.cppreference.com/w/c/memory/aligned_alloc) (in the standard C11 library, with switched arguments!)
/// @see [_aligned_malloc](https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/aligned-malloc?view=vs-2017) (on Windows)
/// @see [aligned_alloc](http://man.openbsd.org/reallocarray) (on BSD, with switched arguments!)
/// @see [posix_memalign](https://linux.die.net/man/3/posix_memalign) (on Posix, with switched arguments!)
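
The argument-order remark and the power-of-2 rule above are easy to get wrong, so here is a minimal sketch of both using only the standard C library. It does not call mimalloc: `alloc_aligned_size_first`, `is_power_of_two` and `is_aligned` are hypothetical helpers written for this illustration. The wrapper takes `size` first, as the documented API does, and forwards to C11 `aligned_alloc`, which takes the alignment first. It also rounds the size up to a multiple of the alignment, because C11 originally required that of `aligned_alloc`, whereas the text above says the mimalloc functions accept any size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* True if x is a power of two; 0 is rejected, mirroring the documented NULL case. */
static bool is_power_of_two(size_t x) {
  return x != 0 && (x & (x - 1)) == 0;
}

/* The alignment property quoted in the documentation. */
static bool is_aligned(const void* p, size_t alignment) {
  assert(alignment != 0);
  return ((uintptr_t)p % alignment) == 0;
}

/* Hypothetical wrapper (not a mimalloc function): size first, alignment second. */
static void* alloc_aligned_size_first(size_t size, size_t alignment) {
  if (!is_power_of_two(alignment)) return NULL;
  /* Some C libraries want at least pointer alignment; a larger power of two
     still satisfies the requested alignment. */
  if (alignment < sizeof(void*)) alignment = sizeof(void*);
  if (size == 0) size = 1;                             /* request a real block */
  if (size > SIZE_MAX - (alignment - 1)) return NULL;  /* rounding would overflow */
  size_t rounded = (size + alignment - 1) & ~(alignment - 1);
  return aligned_alloc(alignment, rounded);            /* note: alignment comes first here */
}
```

A call such as `alloc_aligned_size_first(100, 64)` should return a pointer for which `is_aligned(p, 64)` holds, and the block is released with plain `free`.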
@ -522,11 +613,12 @@ void* mi_realloc_aligned(void* p, size_t newsize, size_t alignment);
/// @param size number of bytes to allocate.
/// @param alignment the minimal alignment of the allocated memory at \a offset.
/// @param offset the offset that should be aligned.
/// @returns pointer to the allocated memory or \a NULL if out of memory.
/// The returned pointer is aligned by \a alignment at \a offset, i.e.
/// `((uintptr_t)p + offset) % alignment == 0`.
///
/// @returns pointer to the allocated memory or \a NULL if out of memory,
/// or if the alignment is not a power of 2 (including 0). The \a size is unrestricted
/// (and does not have to be an integral multiple of the \a alignment).
/// The returned pointer is aligned by \a alignment, i.e. `(uintptr_t)p % alignment == 0`.
/// Returns a unique pointer if called with \a size 0.
///
/// @see [_aligned_offset_malloc](https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/aligned-offset-malloc?view=vs-2017) (on Windows)
void* mi_malloc_aligned_at(size_t size, size_t alignment, size_t offset);
void* mi_zalloc_aligned_at(size_t size, size_t alignment, size_t offset);
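
The `alignment` parameter above is described as applying at `offset`, which corresponds to the property `((uintptr_t)p + offset) % alignment == 0`. A minimal sketch of what that means, assuming plain `malloc` underneath, could look as follows. `offset_block_t` and `alloc_aligned_at` are hypothetical names for this illustration only and say nothing about how mimalloc implements the feature. A typical use is a block that starts with a header of `offset` bytes followed by a payload that needs the alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical result type: `base` is the pointer that must be given to free(). */
typedef struct offset_block_s {
  void* base;
  void* ptr;
} offset_block_t;

/* Over-allocate by `alignment` bytes and shift the start so that
   (ptr + offset) lands on an alignment boundary. */
static offset_block_t alloc_aligned_at(size_t size, size_t alignment, size_t offset) {
  offset_block_t r = { NULL, NULL };
  if (alignment == 0 || (alignment & (alignment - 1)) != 0) return r;  /* not a power of 2 */
  if (size > SIZE_MAX - alignment) return r;                           /* would overflow */
  unsigned char* base = malloc(size + alignment);
  if (base == NULL) return r;
  uintptr_t misalign = ((uintptr_t)base + offset) & (alignment - 1);
  size_t adjust = (size_t)((alignment - misalign) & (alignment - 1));  /* 0 <= adjust < alignment */
  r.base = base;
  r.ptr  = base + adjust;
  assert((((uintptr_t)r.ptr + offset) % alignment) == 0);
  return r;
}
```

Returning the base pointer separately is an artifact of building the sketch on `malloc`; it is only there so that the caller can free the block.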
@ -574,12 +666,12 @@ void mi_heap_delete(mi_heap_t* heap);
/// heap is set to the backing heap.
void mi_heap_destroy(mi_heap_t* heap);

/// Set the default heap to use for mi_malloc() et al.
/// Set the default heap to use in the current thread for mi_malloc() et al.
/// @param heap The new default heap.
/// @returns The previous default heap.
mi_heap_t* mi_heap_set_default(mi_heap_t* heap);

/// Get the default heap that is used for mi_malloc() et al.
/// Get the default heap that is used for mi_malloc() et al. (for the current thread).
/// @returns The current default heap.
mi_heap_t* mi_heap_get_default();

@ -764,6 +856,8 @@ typedef struct mi_heap_area_s {
  size_t committed; ///< current committed bytes of this area
  size_t used; ///< bytes in use by allocated blocks
  size_t block_size; ///< size in bytes of one block
  size_t full_block_size; ///< size in bytes of a full block including padding and metadata.
  int heap_tag; ///< heap tag associated with this area (see \a mi_heap_new_ex)
} mi_heap_area_t;

/// Visitor function passed to mi_heap_visit_blocks()
@ -788,6 +882,23 @@ typedef bool (mi_block_visit_fun)(const mi_heap_t* heap, const mi_heap_area_t* a
/// @returns \a true if all areas and blocks were visited.
bool mi_heap_visit_blocks(const mi_heap_t* heap, bool visit_all_blocks, mi_block_visit_fun* visitor, void* arg);

/// @brief Visit all areas and blocks in abandoned heaps.
/// @param subproc_id The sub-process id associated with the abandoned heaps.
/// @param heap_tag Visit only abandoned memory with the specified heap tag, use -1 to visit all abandoned memory.
/// @param visit_blocks If \a true visits all allocated blocks, otherwise
/// \a visitor is only called for every heap area.
/// @param visitor This function is called for every area in the heap
/// (with \a block as \a NULL). If \a visit_all_blocks is
/// \a true, \a visitor is also called for every allocated
/// block in every area (with `block!=NULL`).
/// return \a false from this function to stop visiting early.
/// @param arg extra argument passed to the \a visitor.
/// @return \a true if all areas and blocks were visited.
///
/// Note: requires the option `mi_option_visit_abandoned` to be set
/// at the start of the program.
bool mi_abandoned_visit_blocks(mi_subproc_id_t subproc_id, int heap_tag, bool visit_blocks, mi_block_visit_fun* visitor, void* arg);

/// \}

/// \defgroup options Runtime Options
@ -799,34 +910,38 @@ bool mi_heap_visit_blocks(const mi_heap_t* heap, bool visit_all_blocks, mi_block
/// Runtime options.
typedef enum mi_option_e {
  // stable options
  mi_option_show_errors, ///< Print error messages to `stderr`.
  mi_option_show_stats, ///< Print statistics to `stderr` when the program is done.
  mi_option_verbose, ///< Print verbose messages to `stderr`.
  mi_option_show_errors, ///< Print error messages.
  mi_option_show_stats, ///< Print statistics on termination.
  mi_option_verbose, ///< Print verbose messages.
  mi_option_max_errors, ///< issue at most N error messages
  mi_option_max_warnings, ///< issue at most N warning messages

  // the following options are experimental
  mi_option_eager_commit, ///< Eagerly commit segments (4MiB) (enabled by default).
  mi_option_large_os_pages, ///< Use large OS pages (2MiB in size) if possible
  mi_option_reserve_huge_os_pages, ///< The number of huge OS pages (1GiB in size) to reserve at the start of the program.
  mi_option_reserve_huge_os_pages_at, ///< Reserve huge OS pages at node N.
  mi_option_reserve_os_memory, ///< Reserve specified amount of OS memory at startup, e.g. "1g" or "512m".
  mi_option_segment_cache, ///< The number of segments per thread to keep cached (0).
  mi_option_page_reset, ///< Reset page memory after \a mi_option_reset_delay milliseconds when it becomes free.
  mi_option_abandoned_page_reset, //< Reset free page memory when a thread terminates.
  mi_option_use_numa_nodes, ///< Pretend there are at most N NUMA nodes; Use 0 to use the actual detected NUMA nodes at runtime.
  mi_option_eager_commit_delay, ///< the first N segments per thread are not eagerly committed (=1).
  mi_option_os_tag, ///< OS tag to assign to mimalloc'd memory
  mi_option_limit_os_alloc, ///< If set to 1, do not use OS memory for allocation (but only pre-reserved arenas)
  // advanced options
  mi_option_reserve_huge_os_pages, ///< reserve N huge OS pages (1GiB pages) at startup
  mi_option_reserve_huge_os_pages_at, ///< Reserve N huge OS pages at a specific NUMA node N.
  mi_option_reserve_os_memory, ///< reserve specified amount of OS memory in an arena at startup (internally, this value is in KiB; use `mi_option_get_size`)
  mi_option_allow_large_os_pages, ///< allow large (2 or 4 MiB) OS pages, implies eager commit. If false, also disables THP for the process.
  mi_option_purge_decommits, ///< should a memory purge decommit? (=1). Set to 0 to use memory reset on a purge (instead of decommit)
  mi_option_arena_reserve, ///< initial memory size for arena reservation (= 1 GiB on 64-bit) (internally, this value is in KiB; use `mi_option_get_size`)
  mi_option_os_tag, ///< tag used for OS logging (macOS only for now) (=100)
  mi_option_retry_on_oom, ///< retry on out-of-memory for N milli seconds (=400), set to 0 to disable retries. (only on windows)

  // v1.x specific options
  mi_option_eager_region_commit, ///< Eagerly commit large (256MiB) memory regions (enabled by default, except on Windows)
  mi_option_segment_reset, ///< Experimental
  mi_option_reset_delay, ///< Delay in milli-seconds before resetting a page (100ms by default)
  mi_option_purge_decommits, ///< Experimental

  // v2.x specific options
  mi_option_allow_purge, ///< Enable decommitting memory (=on)
  mi_option_purge_delay, ///< Decommit page memory after N milli-seconds delay (25ms).
  mi_option_segment_purge_delay, ///< Decommit large segment memory after N milli-seconds delay (500ms).
  // experimental options
  mi_option_eager_commit, ///< eager commit segments? (after `eager_commit_delay` segments) (enabled by default).
  mi_option_eager_commit_delay, ///< the first N segments per thread are not eagerly committed (but per page in the segment on demand)
  mi_option_arena_eager_commit, ///< eager commit arenas? Use 2 to enable just on overcommit systems (=2)
  mi_option_abandoned_page_purge, ///< immediately purge delayed purges on thread termination
  mi_option_purge_delay, ///< memory purging is delayed by N milli seconds; use 0 for immediate purging or -1 for no purging at all. (=10)
  mi_option_use_numa_nodes, ///< 0 = use all available numa nodes, otherwise use at most N nodes.
  mi_option_disallow_os_alloc, ///< 1 = do not use OS memory for allocation (but only programmatically reserved arenas)
  mi_option_limit_os_alloc, ///< If set to 1, do not use OS memory for allocation (but only pre-reserved arenas)
  mi_option_max_segment_reclaim, ///< max. percentage of the abandoned segments can be reclaimed per try (=10%)
  mi_option_destroy_on_exit, ///< if set, release all memory on exit; sometimes used for dynamic unloading but can be unsafe
  mi_option_arena_purge_mult, ///< multiplier for `purge_delay` for the purging delay for arenas (=10)
  mi_option_abandoned_reclaim_on_free, ///< allow to reclaim an abandoned segment on a free (=1)
  mi_option_purge_extend_delay, ///< extend purge delay on each subsequent delay (=1)
  mi_option_disallow_arena_alloc, ///< 1 = do not use arena's for allocation (except if using specific arena id's)
  mi_option_visit_abandoned, ///< allow visiting heap blocks from abandoned threads (=0)

  _mi_option_last
} mi_option_t;
@ -838,7 +953,10 @@ void mi_option_disable(mi_option_t option);
void mi_option_set_enabled(mi_option_t option, bool enable);
void mi_option_set_enabled_default(mi_option_t option, bool enable);

long mi_option_get(mi_option_t option);
long mi_option_get(mi_option_t option);
long mi_option_get_clamp(mi_option_t option, long min, long max);
size_t mi_option_get_size(mi_option_t option);

void mi_option_set(mi_option_t option, long value);
void mi_option_set_default(mi_option_t option, long value);

@ -852,21 +970,27 @@ void mi_option_set_default(mi_option_t option, long value);
///
/// \{

void* mi_recalloc(void* p, size_t count, size_t size);
size_t mi_malloc_size(const void* p);
size_t mi_malloc_usable_size(const void *p);

/// Just as `free` but also checks if the pointer `p` belongs to our heap.
void mi_cfree(void* p);
void* mi__expand(void* p, size_t newsize);

void* mi_recalloc(void* p, size_t count, size_t size);
size_t mi_malloc_size(const void* p);
size_t mi_malloc_good_size(size_t size);
size_t mi_malloc_usable_size(const void *p);

int mi_posix_memalign(void** p, size_t alignment, size_t size);
int mi__posix_memalign(void** p, size_t alignment, size_t size);
void* mi_memalign(size_t alignment, size_t size);
void* mi_valloc(size_t size);

void* mi_pvalloc(size_t size);
void* mi_aligned_alloc(size_t alignment, size_t size);

unsigned short* mi_wcsdup(const unsigned short* s);
unsigned char* mi_mbsdup(const unsigned char* s);
int mi_dupenv_s(char** buf, size_t* size, const char* name);
int mi_wdupenv_s(unsigned short** buf, size_t* size, const unsigned short* name);

/// Corresponds to [reallocarray](https://www.freebsd.org/cgi/man.cgi?query=reallocarray&sektion=3&manpath=freebsd-release-ports)
/// in FreeBSD.
void* mi_reallocarray(void* p, size_t count, size_t size);
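
For readers who have not met BSD's `reallocarray`: its point is to reject a `count * size` product that would overflow before reallocating. The following is a minimal sketch of that behavior in portable C, assuming nothing about how mimalloc implements `mi_reallocarray`. Both `checked_reallocarray` and `grow_ints` are hypothetical helpers written for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Reallocate to count * size bytes, failing with ENOMEM if the product overflows. */
static void* checked_reallocarray(void* p, size_t count, size_t size) {
  if (size != 0 && count > SIZE_MAX / size) {
    errno = ENOMEM;
    return NULL;  /* the original block is left untouched */
  }
  return realloc(p, count * size);
}

/* Grow an int array to new_count elements (new_count must be non-zero).
   Returns 0 on success and -1 on failure; on failure *arr is unchanged. */
static int grow_ints(int** arr, size_t new_count) {
  assert(arr != NULL && new_count > 0);
  void* q = checked_reallocarray(*arr, new_count, sizeof(int));
  if (q == NULL) return -1;
  *arr = q;
  return 0;
}
```

A request such as `grow_ints(&a, SIZE_MAX)` fails cleanly instead of silently allocating a tiny block, which is the failure mode an unchecked `realloc(p, count * size)` would have.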
@ -874,6 +998,9 @@ void* mi_reallocarray(void* p, size_t count, size_t size);
/// Corresponds to [reallocarr](https://man.netbsd.org/reallocarr.3) in NetBSD.
int mi_reallocarr(void* p, size_t count, size_t size);

void* mi_aligned_recalloc(void* p, size_t newcount, size_t size, size_t alignment);
void* mi_aligned_offset_recalloc(void* p, size_t newcount, size_t size, size_t alignment, size_t offset);

void mi_free_size(void* p, size_t size);
void mi_free_size_aligned(void* p, size_t size, size_t alignment);
void mi_free_aligned(void* p, size_t alignment);
@ -998,7 +1125,7 @@ mimalloc uses only safe OS calls (`mmap` and `VirtualAlloc`) and can co-exist
with other allocators linked to the same program.
If you use `cmake`, you can simply use:
```
find_package(mimalloc 1.0 REQUIRED)
find_package(mimalloc 2.1 REQUIRED)
```
in your `CMakeLists.txt` to find a locally installed mimalloc. Then use either:
```
@ -1071,38 +1198,63 @@ See \ref overrides for more info.

/*! \page environment Environment Options

You can set further options either programmatically (using [`mi_option_set`](https://microsoft.github.io/mimalloc/group__options.html)),
or via environment variables.
You can set further options either programmatically (using [`mi_option_set`](https://microsoft.github.io/mimalloc/group__options.html)), or via environment variables:

- `MIMALLOC_SHOW_STATS=1`: show statistics when the program terminates.
- `MIMALLOC_VERBOSE=1`: show verbose messages.
- `MIMALLOC_SHOW_ERRORS=1`: show error and warning messages.
- `MIMALLOC_PAGE_RESET=0`: by default, mimalloc will reset (or purge) OS pages when not in use to signal to the OS
that the underlying physical memory can be reused. This can reduce memory fragmentation in long running (server)
programs. By setting it to `0` no such page resets will be done which can improve performance for programs that are not long
running. As an alternative, the `MIMALLOC_DECOMMIT_DELAY=`<msecs> can be set higher (100ms by default) to make the page
reset occur less frequently instead of turning it off completely.
- `MIMALLOC_LARGE_OS_PAGES=1`: use large OS pages (2MiB) when available; for some workloads this can significantly
improve performance. Use `MIMALLOC_VERBOSE` to check if the large OS pages are enabled -- usually one needs
to explicitly allow large OS pages (as on [Windows][windows-huge] and [Linux][linux-huge]). However, sometimes

Advanced options:

- `MIMALLOC_ARENA_EAGER_COMMIT=2`: turns on eager commit for the large arenas (usually 1GiB) from which mimalloc
allocates segments and pages. Set this to 2 (default) to
only enable this on overcommit systems (e.g. Linux). Set this to 1 to enable explicitly on other systems
as well (like Windows or macOS) which may improve performance (as the whole arena is committed at once).
Note that eager commit only increases the commit but not the actual peak resident set
(rss) so it is generally ok to enable this.
- `MIMALLOC_PURGE_DELAY=N`: the delay in `N` milli-seconds (by default `10`) after which mimalloc will purge
OS pages that are not in use. This signals to the OS that the underlying physical memory can be reused which
can reduce memory fragmentation especially in long running (server) programs. Setting `N` to `0` purges immediately when
a page becomes unused which can improve memory usage but also decreases performance. Setting `N` to a higher
value like `100` can improve performance (sometimes by a lot) at the cost of potentially using more memory at times.
Setting it to `-1` disables purging completely.
- `MIMALLOC_PURGE_DECOMMITS=1`: By default "purging" memory means unused memory is decommitted (`MEM_DECOMMIT` on Windows,
`MADV_DONTNEED` (which decreases rss immediately) on `mmap` systems). Set this to 0 to instead "reset" unused
memory on a purge (`MEM_RESET` on Windows, generally `MADV_FREE` (which does not decrease rss immediately) on `mmap` systems).
Mimalloc generally does not "free" OS memory but only "purges" OS memory, in other words, it tries to keep virtual
address ranges and decommits within those ranges (to make the underlying physical memory available to other processes).
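
The three cases described for `MIMALLOC_PURGE_DELAY` (a positive delay, `0` for immediate purging, `-1` for no purging) can be summarized as a small decision function. This is a sketch of the documented semantics only, not mimalloc's code: `purge_action_t` and `purge_decision` are hypothetical names, the timestamps are assumed to be milliseconds from some monotonic clock, and treating every negative value like `-1` is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome of checking an unused page against the purge delay. */
typedef enum purge_action_e { PURGE_NEVER, PURGE_NOW, PURGE_LATER } purge_action_t;

/* delay_ms:        the configured purge delay (N in MIMALLOC_PURGE_DELAY=N)
   unused_since_ms: time at which the page became unused
   now_ms:          current time on the same clock */
static purge_action_t purge_decision(long delay_ms, int64_t unused_since_ms, int64_t now_ms) {
  assert(now_ms >= unused_since_ms);
  if (delay_ms < 0)  return PURGE_NEVER;  /* -1: purging disabled (assumed for all negatives) */
  if (delay_ms == 0) return PURGE_NOW;    /* 0: purge as soon as the page is unused */
  return (now_ms - unused_since_ms >= delay_ms) ? PURGE_NOW : PURGE_LATER;
}
```

With the default delay of `10`, a page that became unused at time 100 would be kept at time 105 and become eligible at time 110, which is the window in which a quick re-allocation avoids the cost of purging and re-committing.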
|
||||
|
||||
Further options for large workloads and services:
|
||||
|
||||
- `MIMALLOC_USE_NUMA_NODES=N`: pretend there are at most `N` NUMA nodes. If not set, the actual NUMA nodes are detected
|
||||
at runtime. Setting `N` to 1 may avoid problems in some virtual environments. Also, setting it to a lower number than
|
||||
the actual NUMA nodes is fine and will only cause threads to potentially allocate more memory across actual NUMA
|
||||
nodes (but this can happen in any case as NUMA local allocation is always a best effort but not guaranteed).
|
||||
- `MIMALLOC_ALLOW_LARGE_OS_PAGES=1`: use large OS pages (2 or 4MiB) when available; for some workloads this can significantly
|
||||
improve performance. When this option is disabled, it also disables transparent huge pages (THP) for the process
|
||||
(on Linux and Android). Use `MIMALLOC_VERBOSE` to check if the large OS pages are enabled -- usually one needs
|
||||
to explicitly give permissions for large OS pages (as on [Windows][windows-huge] and [Linux][linux-huge]). However, sometimes
|
||||
the OS is very slow to reserve contiguous physical memory for large OS pages so use with care on systems that
|
||||
can have fragmented memory (for that reason, we generally recommend to use `MIMALLOC_RESERVE_HUGE_OS_PAGES` instead when possible).
|
||||
- `MIMALLOC_RESERVE_HUGE_OS_PAGES=N`: where N is the number of 1GiB _huge_ OS pages. This reserves the huge pages at
|
||||
can have fragmented memory (for that reason, we generally recommend to use `MIMALLOC_RESERVE_HUGE_OS_PAGES` instead whenever possible).
|
||||
- `MIMALLOC_RESERVE_HUGE_OS_PAGES=N`: where `N` is the number of 1GiB _huge_ OS pages. This reserves the huge pages at
|
||||
startup and sometimes this can give a large (latency) performance improvement on big workloads.
|
||||
Usually it is better to not use
|
||||
`MIMALLOC_LARGE_OS_PAGES` in combination with this setting. Just like large OS pages, use with care as reserving
|
||||
Usually it is better to not use `MIMALLOC_ALLOW_LARGE_OS_PAGES=1` in combination with this setting. Just like large
|
||||
OS pages, use with care as reserving
|
||||
contiguous physical memory can take a long time when memory is fragmented (but reserving the huge pages is done at
|
||||
startup only once).
|
||||
Note that we usually need to explicitly enable huge OS pages (as on [Windows][windows-huge] and [Linux][linux-huge])). With huge OS pages, it may be beneficial to set the setting
|
||||
Note that we usually need to explicitly give permission for huge OS pages (as on [Windows][windows-huge] and [Linux][linux-huge]).
|
||||
With huge OS pages, it may be beneficial to set the setting
|
||||
`MIMALLOC_EAGER_COMMIT_DELAY=N` (`N` is 1 by default) to delay the initial `N` segments (of 4MiB)
|
||||
of a thread to not allocate in the huge OS pages; this prevents threads that are short lived
|
||||
and allocate just a little to take up space in the huge OS page area (which cannot be reset).
|
||||
- `MIMALLOC_RESERVE_HUGE_OS_PAGES_AT=N`: where N is the numa node. This reserves the huge pages at a specific numa node.
|
||||
(`N` is -1 by default to reserve huge pages evenly among the given number of numa nodes (or use the available ones as detected))
|
||||
and allocate just a little to take up space in the huge OS page area (which cannot be purged as huge OS pages are pinned
|
||||
to physical memory).
|
||||
The huge pages are usually allocated evenly among NUMA nodes.
|
||||
We can use `MIMALLOC_RESERVE_HUGE_OS_PAGES_AT=N` where `N` is the NUMA node (starting at 0) to allocate all
|
||||
the huge pages at a specific NUMA node instead.
|
||||
|
||||
Use caution when using `fork` in combination with either large or huge OS pages: on a fork, the OS uses copy-on-write
|
||||
for all pages in the original process including the huge OS pages. When any memory is now written in that area, the
|
||||
OS will copy the entire 1GiB huge page (or 2MiB large page) which can cause the memory usage to grow in big increments.
|
||||
OS will copy the entire 1GiB huge page (or 2MiB large page) which can cause the memory usage to grow in large increments.
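
To make the growth "in large increments" concrete, here is a small worked calculation of the worst-case memory copied after a `fork` when a process writes single bytes at a few offsets. The function name `cow_copied_bytes` and the constants are illustrative assumptions, and the model is simplified (every first write to a page copies exactly that whole page).

```c
#include <stddef.h>
#include <stdint.h>

#define LARGE_PAGE_SIZE ((uint64_t)2 << 20)   // 2MiB large OS page
#define HUGE_PAGE_SIZE  ((uint64_t)1 << 30)   // 1GiB huge OS page

// Bytes copied by copy-on-write when one byte is written at each offset,
// for an area backed by pages of `page_size` bytes. Each distinct page that
// is written to is copied once in its entirety.
static uint64_t cow_copied_bytes(const uint64_t *offsets, size_t count,
                                 uint64_t page_size) {
  uint64_t total = 0;
  for (size_t i = 0; i < count; i++) {
    uint64_t page = offsets[i] / page_size;
    int seen = 0;
    for (size_t j = 0; j < i; j++) {        // was this page already copied?
      if (offsets[j] / page_size == page) { seen = 1; break; }
    }
    if (!seen) total += page_size;
  }
  return total;
}
```

For example, three one-byte writes at offsets `0`, `100`, and `HUGE_PAGE_SIZE + 5` touch two distinct pages at every page size considered here: that is 8KiB copied with regular 4KiB pages, 4MiB with 2MiB large pages, and 2GiB with 1GiB huge pages.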
|
||||
|
||||
[linux-huge]: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/tuning_and_optimizing_red_hat_enterprise_linux_for_oracle_9i_and_10g_databases/sect-oracle_9i_and_10g_tuning_guide-large_memory_optimization_big_pages_and_huge_pages-configuring_huge_pages_in_red_hat_enterprise_linux_4_or_5
|
||||
[windows-huge]: https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/enable-the-lock-pages-in-memory-option-windows?view=sql-server-2017
|
||||
|
@ -1111,87 +1263,100 @@ OS will copy the entire 1GiB huge page (or 2MiB large page) which can cause the
|
|||
|
||||
/*! \page overrides Overriding Malloc
|
||||
|
||||
Overriding the standard `malloc` can be done either _dynamically_ or _statically_.
|
||||
Overriding the standard `malloc` (and `new`) can be done either _dynamically_ or _statically_.
|
||||
|
||||
## Dynamic override
|
||||
|
||||
This is the recommended way to override the standard malloc interface.
|
||||
|
||||
### Dynamic Override on Linux, BSD
|
||||
|
||||
### Linux, BSD
|
||||
|
||||
On these systems we preload the mimalloc shared
|
||||
On these ELF-based systems we preload the mimalloc shared
|
||||
library so all calls to the standard `malloc` interface are
|
||||
resolved to the _mimalloc_ library.
|
||||
|
||||
- `env LD_PRELOAD=/usr/lib/libmimalloc.so myprogram`
|
||||
```
|
||||
> env LD_PRELOAD=/usr/lib/libmimalloc.so myprogram
|
||||
```
|
||||
|
||||
You can set extra environment variables to check that mimalloc is running,
|
||||
like:
|
||||
```
|
||||
env MIMALLOC_VERBOSE=1 LD_PRELOAD=/usr/lib/libmimalloc.so myprogram
|
||||
> env MIMALLOC_VERBOSE=1 LD_PRELOAD=/usr/lib/libmimalloc.so myprogram
|
||||
```
|
||||
or run with the debug version to get detailed statistics:
|
||||
```
|
||||
env MIMALLOC_SHOW_STATS=1 LD_PRELOAD=/usr/lib/libmimalloc-debug.so myprogram
|
||||
> env MIMALLOC_SHOW_STATS=1 LD_PRELOAD=/usr/lib/libmimalloc-debug.so myprogram
|
||||
```
|
||||
|
||||
### MacOS
|
||||
### Dynamic Override on MacOS
|
||||
|
||||
On macOS we can also preload the mimalloc shared
|
||||
library so all calls to the standard `malloc` interface are
|
||||
resolved to the _mimalloc_ library.
|
||||
|
||||
- `env DYLD_FORCE_FLAT_NAMESPACE=1 DYLD_INSERT_LIBRARIES=/usr/lib/libmimalloc.dylib myprogram`
|
||||
```
|
||||
> env DYLD_INSERT_LIBRARIES=/usr/lib/libmimalloc.dylib myprogram
|
||||
```
|
||||
|
||||
Note that certain security restrictions may apply when doing this from
|
||||
the [shell](https://stackoverflow.com/questions/43941322/dyld-insert-libraries-ignored-when-calling-application-through-bash).
|
||||
|
||||
(Note: macOS support for dynamic overriding is recent; please report any issues.)
|
||||
|
||||
### Dynamic Override on Windows
|
||||
|
||||
### Windows
|
||||
|
||||
Overriding on Windows is robust and has the
|
||||
particular advantage to be able to redirect all malloc/free calls that go through
|
||||
<span id="override_on_windows">Dynamically overriding with mimalloc on Windows</span>
|
||||
is robust and has the particular advantage of being able to redirect all malloc/free calls that go through
|
||||
the (dynamic) C runtime allocator, including those from other DLL's or libraries.
|
||||
As it intercepts all allocation calls on a low level, it can be used reliably
|
||||
on large programs that include other 3rd party components.
|
||||
There are four requirements to make the overriding work robustly:
|
||||
|
||||
The overriding on Windows requires that you link your program explicitly with
|
||||
the mimalloc DLL and use the C-runtime library as a DLL (using the `/MD` or `/MDd` switch).
|
||||
Also, the `mimalloc-redirect.dll` (or `mimalloc-redirect32.dll`) must be available
|
||||
in the same folder as the main `mimalloc-override.dll` at runtime (as it is a dependency).
|
||||
The redirection DLL ensures that all calls to the C runtime malloc API get redirected to
|
||||
mimalloc (in `mimalloc-override.dll`).
|
||||
1. Use the C-runtime library as a DLL (using the `/MD` or `/MDd` switch).
|
||||
2. Link your program explicitly with the `mimalloc-override.dll` library.
|
||||
To ensure the `mimalloc-override.dll` is loaded at run-time it is easiest to insert some
|
||||
call to the mimalloc API in the `main` function, like `mi_version()`
|
||||
(or use the `/INCLUDE:mi_version` switch on the linker). See the `mimalloc-override-test` project
|
||||
for an example on how to use this.
|
||||
3. The [`mimalloc-redirect.dll`](bin) (or `mimalloc-redirect32.dll`) must be put
|
||||
in the same folder as the main `mimalloc-override.dll` at runtime (as it is a dependency of that DLL).
|
||||
The redirection DLL ensures that all calls to the C runtime malloc API get redirected to
|
||||
mimalloc functions (which reside in `mimalloc-override.dll`).
|
||||
4. Ensure the `mimalloc-override.dll` comes as early as possible in the import
|
||||
list of the final executable (so it can intercept all potential allocations).
|
||||
|
||||
To ensure the mimalloc DLL is loaded at run-time it is easiest to insert some
|
||||
call to the mimalloc API in the `main` function, like `mi_version()`
|
||||
(or use the `/INCLUDE:mi_version` switch on the linker). See the `mimalloc-override-test` project
|
||||
for an example on how to use this. For best performance on Windows with C++, it
|
||||
For best performance on Windows with C++, it
|
||||
is also recommended to override the `new`/`delete` operations (by including
|
||||
[`mimalloc-new-delete.h`](https://github.com/microsoft/mimalloc/blob/master/include/mimalloc-new-delete.h) a single(!) source file in your project).
|
||||
[`mimalloc-new-delete.h`](include/mimalloc-new-delete.h)
|
||||
in a single(!) source file in your project).
|
||||
|
||||
The environment variable `MIMALLOC_DISABLE_REDIRECT=1` can be used to disable dynamic
|
||||
overriding at run-time. Use `MIMALLOC_VERBOSE=1` to check if mimalloc was successfully redirected.
|
||||
|
||||
(Note: in principle, it is possible to even patch existing executables without any recompilation
|
||||
We cannot always re-link an executable with `mimalloc-override.dll`, and similarly, we cannot always
|
||||
ensure that the DLL comes first in the import table of the final executable.
|
||||
In many cases though we can patch existing executables without any recompilation
|
||||
if they are linked with the dynamic C runtime (`ucrtbase.dll`) -- just put the `mimalloc-override.dll`
|
||||
into the import table (and put `mimalloc-redirect.dll` in the same folder)
|
||||
Such patching can be done for example with [CFF Explorer](https://ntcore.com/?page_id=388)).
|
||||
|
||||
Such patching can be done for example with [CFF Explorer](https://ntcore.com/?page_id=388) or
|
||||
the [`minject`](bin) program.
|
||||
|
||||
## Static override
|
||||
|
||||
On Unix systems, you can also statically link with _mimalloc_ to override the standard
|
||||
On Unix-like systems, you can also statically link with _mimalloc_ to override the standard
|
||||
malloc interface. The recommended way is to link the final program with the
|
||||
_mimalloc_ single object file (`mimalloc-override.o`). We use
|
||||
_mimalloc_ single object file (`mimalloc.o`). We use
|
||||
an object file instead of a library file as linkers give preference to
|
||||
that over archives to resolve symbols. To ensure that the standard
|
||||
malloc interface resolves to the _mimalloc_ library, link it as the first
|
||||
object file. For example:
|
||||
```
|
||||
> gcc -o myprogram mimalloc.o myfile1.c ...
|
||||
```
|
||||
|
||||
```
|
||||
gcc -o myprogram mimalloc-override.o myfile1.c ...
|
||||
```
|
||||
Another way to override statically that works on all platforms, is to
|
||||
link statically to mimalloc (as shown in the introduction) and include a
|
||||
header file in each source file that re-defines `malloc` etc. to `mi_malloc`.
|
||||
This is provided by [`mimalloc-override.h`](https://github.com/microsoft/mimalloc/blob/master/include/mimalloc-override.h). This only works reliably though if all sources are
|
||||
under your control or otherwise mixing of pointers from different heaps may occur!
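
The header-based approach can be sketched as follows. To keep the example compilable without mimalloc, `demo_malloc` and `demo_free` are hypothetical stand-ins for `mi_malloc` and `mi_free`; the real redefinitions are the ones provided by `mimalloc-override.h`. The counters exist only to make the redirection observable.

```c
#include <stddef.h>
#include <stdlib.h>

// Hypothetical stand-ins for mi_malloc/mi_free. They are defined before the
// macros below, so inside them `malloc`/`free` still name the C runtime.
static size_t demo_alloc_count = 0;
static size_t demo_free_count = 0;

static void *demo_malloc(size_t n) { demo_alloc_count++; return malloc(n); }
static void demo_free(void *p) { if (p != NULL) demo_free_count++; free(p); }

// From here on, every textual use of malloc/free in this source file is
// rewritten by the preprocessor to the replacement allocator.
#define malloc(n) demo_malloc(n)
#define free(p)   demo_free(p)

// Ordinary code that is unaware of the override.
static int use_allocator(void) {
  int *xs = malloc(4 * sizeof(int));
  if (xs == NULL) return -1;
  xs[0] = 42;
  int v = xs[0];
  free(xs);
  return v;
}
```

Because the rewrite is purely textual, a source file that does not include the header keeps calling the original allocator, which is how pointers from different heaps can end up mixed.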
|
||||
|
||||
## List of Overrides:
|
||||
|
||||
|
|
|
@ -47,3 +47,14 @@ div.fragment {
|
|||
#nav-sync img {
|
||||
display: none;
|
||||
}
|
||||
h1,h2,h3,h4,h5,h6 {
|
||||
transition:none;
|
||||
}
|
||||
.memtitle {
|
||||
background-image: none;
|
||||
background-color: #EEE;
|
||||
}
|
||||
table.memproto, .memproto {
|
||||
text-shadow: none;
|
||||
font-size: 110%;
|
||||
}
|
||||
|
|