The ``_os_random_weak`` function is the only non-static function
besides ``_ZSt15get_new_handlerv`` that is not prefixed with ``mi`` or
``_mi``.
The discrepancy was discovered by CPython's smelly script. The checker
looks for exported symbols that don't have well-defined prefixes.
Signed-off-by: Christian Heimes <christian@python.org>
Fix warning ``warning: function declaration isn’t a prototype`` when
building mimalloc with the ``-Wstrict-prototypes`` flag. In C, functions
that take no arguments should be declared as ``func(void)``.
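As a minimal illustration of the declaration style (the function name
``counter_next`` is hypothetical and not part of mimalloc), a function
without parameters can be declared and defined like this:

```c
/* In C standards before C23, `int counter_next();` declares a function
   with an unspecified parameter list. That is the form that
   -Wstrict-prototypes reports. */

/* Prototype form: `(void)` states that the function takes no arguments. */
static int counter_next(void);

static int counter_next(void) {
    static int value = 0;  /* persists across calls */
    return ++value;
}
```

With the ``(void)`` form, the compiler can also reject calls that pass
arguments by mistake.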
Reproducer:
```shell
$ cmake ../.. -DCMAKE_C_FLAGS="-Wstrict-prototypes"
$ make VERBOSE=1
```
Co-authored-by: Sam Gross <colesbury@gmail.com>
Co-authored-by: Neil Schemenauer <nas@arctrix.com>
Signed-off-by: Christian Heimes <christian@python.org>
When building some code against mimalloc with C inside Visual Studio
with ClangCL, the compiler complains about __GNUC__ being undefined.
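One common way to avoid this kind of complaint is to test that the macro
is defined before comparing its value. The sketch below only illustrates
that pattern; ``SKETCH_HAS_GNUC4`` and ``sketch_has_gnuc4`` are
hypothetical names, and the version threshold is an assumption:

```c
/* Check that __GNUC__ exists before comparing it, so a compiler that
   does not define it never evaluates an undefined identifier. */
#if defined(__GNUC__) && (__GNUC__ >= 4)
#define SKETCH_HAS_GNUC4 1
#else
#define SKETCH_HAS_GNUC4 0
#endif

/* Small accessor so the result can be inspected at run time. */
static int sketch_has_gnuc4(void) {
    return SKETCH_HAS_GNUC4;
}
```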
Reported by Mojca Miklavec.
Close #422
Common cache line sizes are 32, 64 and 128 bytes. On x86_64 the standard
cache line size is 64B. Even though this is not architecturally required,
all x86_64 implementations stick to it. Some AArch64 processors also
follow the x86_64 style with 64B cache lines. However, on Apple M1
devices, the underlying hardware is using a 128B cache line size. Quote
from Apple Developer documentation [1]:
"Some features of Apple silicon are decidedly different than those of
Intel-based Mac computers, and may impact your code if you don't fetch
them dynamically. These features include:
* Cache line sizes are different. Fetch the hw.cachelinesize setting
using sysctl."
M1 cache lines are twice the size commonly used by x86_64 and other
Arm implementations. Cache line sizes on Arm depend on the implementation,
not the architecture. For example, TI AM57x (Cortex-A15) uses a 64B cache
line while TI AM437x (Cortex-A9) uses a 32B cache line. There are
even Arm implementations whose cache line size is configurable at boot time.
This patch attempts to detect the L1 cache line size at compile time. On
AArch64 hosts, the build process collects system information to determine
the L1 cache line size. At present, both macOS and Linux are supported. For
Arm targets, software packages are usually cross-compiled, so
developers should specify the appropriate MI_CACHE_LINE setting in
advance.
64B is the default cache line size if none of the above sets a value.
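For comparison, the same information can be queried at run time. The
sketch below is not the build-time detection this patch performs; it
only shows the ``hw.cachelinesize`` sysctl mentioned in the Apple quote,
a glibc ``sysconf`` query on Linux, and the 64B fallback. The names
``sketch_cache_line_size`` and ``SKETCH_DEFAULT_CACHE_LINE`` are
hypothetical:

```c
#include <stddef.h>

#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#elif defined(__linux__)
#include <unistd.h>
#endif

#define SKETCH_DEFAULT_CACHE_LINE 64  /* fallback when nothing else applies */

static size_t sketch_cache_line_size(void) {
#if defined(__APPLE__)
    size_t line = 0;
    size_t len = sizeof(line);
    /* macOS: ask the kernel for the cache line size. */
    if (sysctlbyname("hw.cachelinesize", &line, &len, NULL, 0) == 0 && line > 0) {
        return line;
    }
#elif defined(__linux__) && defined(_SC_LEVEL1_DCACHE_LINESIZE)
    /* Linux with glibc: may report 0 or -1 when the value is unknown. */
    long line = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
    if (line > 0) {
        return (size_t)line;
    }
#endif
    return SKETCH_DEFAULT_CACHE_LINE;
}
```

A run-time query cannot size statically laid out structures, which is
why a compile-time constant such as MI_CACHE_LINE is still needed.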
[1] https://developer.apple.com/documentation/apple-silicon/addressing-architectural-differences-in-your-macos-code
SI prefixes [the decimal prefixes] refer strictly to powers of 10. They
should not be used to indicate powers of 2. For example, one kilobit
represents 1000 bits, not 1024 bits. IEC 60027-2 symbols are
formed by adding an "i" to the SI symbol (e.g. G + i = Gi).
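The difference can be made concrete with a few constants. This is an
illustrative sketch; the ``SKETCH_`` macro names are hypothetical and
are not claimed to match the names used in the code base:

```c
#include <stddef.h>

/* SI (decimal) prefixes: powers of 10. */
#define SKETCH_KB  ((size_t)1000)
#define SKETCH_MB  (SKETCH_KB * 1000)

/* IEC (binary) prefixes: powers of 2. */
#define SKETCH_KiB ((size_t)1024)
#define SKETCH_MiB (SKETCH_KiB * 1024)
#define SKETCH_GiB (SKETCH_MiB * 1024)
```

One MiB is 1048576 bytes while one MB is 1000000 bytes, so mixing the
two notations misstates sizes by about 5% at this scale.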
Android's Bionic libc stores the thread ID in TLS slot 1 instead of 0
on 32-bit ARM and AArch64. Slot 0 contains a pointer to the ELF DTV
(Dynamic Thread Vector) instead, which is constant for each loaded DSO.
Because mimalloc uses the thread ID to determine whether operations are
thread-local or cross-thread (atomic), all threads having the same ID
causes internal data structures to get corrupted quickly when multiple
threads are using the allocator:
    mimalloc: assertion failed: at "external/mimalloc/src/page.c":563, mi_page_extend_free
      assertion: "page->local_free == NULL"
    mimalloc: assertion failed: at "external/mimalloc/src/page.c":74, mi_page_is_valid_init
      assertion: "page->used <= page->capacity"
    mimalloc: assertion failed: at "external/mimalloc/src/page.c":100, mi_page_is_valid_init
      assertion: "page->used + free_count == page->capacity"
    mimalloc: assertion failed: at "external/mimalloc/src/page.c":74, mi_page_is_valid_init
      assertion: "page->used <= page->capacity"
Add support for Android's alternate TLS layout to fix the crashes in
multi-threaded use cases.
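The idea can be sketched as follows. This is a simplified illustration
under stated assumptions, not the code in this patch: it covers only
AArch64 (not 32-bit ARM), it assumes that ``tpidr_el0`` points at the
TLS slot array on Bionic, and the ``sketch_`` names are hypothetical.
On every other target it falls back to the address of a thread-local
object:

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__BIONIC__) && defined(__aarch64__)
/* Assumption: on Bionic/AArch64, tpidr_el0 points at the TLS slot array. */
static inline void* sketch_tls_slot(size_t slot) {
    void** tcb;
    __asm__ volatile ("mrs %0, tpidr_el0" : "=r"(tcb));
    return tcb[slot];
}
/* Slot 0 holds the DTV pointer here, so the thread ID is read from slot 1. */
#define SKETCH_THREAD_ID_SLOT 1
#endif

static inline uintptr_t sketch_thread_id(void) {
#if defined(SKETCH_THREAD_ID_SLOT)
    return (uintptr_t)sketch_tls_slot(SKETCH_THREAD_ID_SLOT);
#else
    /* Portable fallback: each live thread has its own copy of `marker`,
       so its address differs between threads. */
    static _Thread_local char marker;
    return (uintptr_t)&marker;
#endif
}
```

Whatever the source of the value, the property that matters is that two
threads running at the same time never observe the same ID.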
Fixes #376.