Matthew Dillon has added KVABIO, an API that avoids the need to synchronize the TLB across all CPUs before continuing. What does this mean? Each CPU caches virtual-to-physical address translations in its TLB, and the more CPUs you are dealing with, the longer it takes to make sure all of them have the same cached view of virtual memory. There's a tradeoff: caching that view speeds up memory access, but the time cost of the synchronization can erase those benefits.
This API is now supported for NVMe and swap, HAMMER2, and tmpfs. Note that those last two links show a huge drop in IPI (inter-processor interrupt) messaging. In the real world, this translated to roughly a 5% performance improvement for CPU-intensive work like complete synth builds. (Based on IRC conversations.)
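
To get a feel for why skipping the all-CPU synchronization cuts IPI traffic, here is a toy model, not the DragonFly kernel code. It assumes that an eagerly synchronized mapping costs one IPI per other CPU, while a lazily synchronized one costs only a local TLB sync on each CPU that actually touches the buffer. Every name in it (`Counters`, `LazyBuffer`, `map_eager`, `map_lazy`, `touch_lazy`, `simulate`) is made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Counters:
    ipis: int = 0         # cross-CPU interrupts sent (hypothetical cost unit)
    local_syncs: int = 0  # TLB syncs done only on the accessing CPU


@dataclass
class LazyBuffer:
    # CPUs whose view of this buffer's mapping is known to be current.
    synced_cpus: set = field(default_factory=set)


def map_eager(ncpus: int, counters: Counters) -> None:
    """Eager model: a new mapping interrupts every other CPU right away."""
    counters.ipis += ncpus - 1


def map_lazy(buf: LazyBuffer) -> None:
    """Lazy model: a new mapping only forgets who was synchronized."""
    buf.synced_cpus.clear()


def touch_lazy(buf: LazyBuffer, cpu: int, counters: Counters) -> None:
    """Lazy model: a CPU syncs itself the first time it touches the data."""
    if cpu not in buf.synced_cpus:
        counters.local_syncs += 1
        buf.synced_cpus.add(cpu)


def simulate(ncpus: int, nbuffers: int, touching_cpus: list) -> tuple:
    """Map nbuffers buffers; each is accessed twice by every touching CPU."""
    eager, lazy = Counters(), Counters()
    buf = LazyBuffer()
    for _ in range(nbuffers):
        map_eager(ncpus, eager)
        map_lazy(buf)
        for cpu in touching_cpus:
            touch_lazy(buf, cpu, lazy)  # first access: local sync
            touch_lazy(buf, cpu, lazy)  # repeat access: already synced
    return eager, lazy


if __name__ == "__main__":
    eager, lazy = simulate(ncpus=64, nbuffers=1000, touching_cpus=[0, 1])
    print("eager IPIs:", eager.ipis)
    print("lazy IPIs:", lazy.ipis, "local syncs:", lazy.local_syncs)
```

In this model the eager cost grows with the total CPU count, while the lazy cost grows only with the number of CPUs that actually use each buffer. The real kernel has to deal with details this sketch ignores, such as tracking synchronization state safely across concurrent CPUs.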