mirror of
https://github.com/FEX-Emu/linux.git
synced 2025-01-10 11:30:49 +00:00
9536c8d2da
There's been reports of high NMI handler overhead, highlighted by such kernel messages: [ 3697.380195] perf samples too long (10009 > 10000), lowering kernel.perf_event_max_sample_rate to 13000 [ 3697.389509] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.331 msecs Don Zickus analyzed the source of the overhead and reported: > While there are a few places that are causing latencies, for now I focused on > the longest one first. It seems to be 'copy_user_from_nmi' > > intel_pmu_handle_irq -> > intel_pmu_drain_pebs_nhm -> > __intel_pmu_drain_pebs_nhm -> > __intel_pmu_pebs_event -> > intel_pmu_pebs_fixup_ip -> > copy_from_user_nmi > > In intel_pmu_pebs_fixup_ip(), if the while-loop goes over 50, the sum of > all the copy_from_user_nmi latencies seems to go over 1,000,000 cycles > (there are some cases where only 10 iterations are needed to go that high > too, but in generall over 50 or so). At this point copy_user_from_nmi > seems to account for over 90% of the nmi latency. The solution to that is to avoid having to call copy_from_user_nmi() for every instruction. Since we already limit the max basic block size, we can easily pre-allocate a piece of memory to copy the entire thing into in one go. Don reported this test result: > Your patch made a huge difference in improvement. The > copy_from_user_nmi() no longer hits the million of cycles. I still > have a batch of 100,000-300,000 cycles. My longest NMI paths used > to be dominated by copy_from_user_nmi, now it is not (I have to dig > up the new hot path). Reported-and-tested-by: Don Zickus <dzickus@redhat.com> Cc: jmario@redhat.com Cc: acme@infradead.org Cc: dave.hansen@linux.intel.com Cc: eranian@google.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131016105755.GX10651@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
---|---|---|
.. | ||
mcheck | ||
mtrr | ||
.gitignore | ||
amd.c | ||
bugs_64.c | ||
bugs.c | ||
centaur.c | ||
common.c | ||
cpu.h | ||
cyrix.c | ||
hypervisor.c | ||
intel_cacheinfo.c | ||
intel.c | ||
Makefile | ||
match.c | ||
mkcapflags.sh | ||
mshyperv.c | ||
perf_event_amd_ibs.c | ||
perf_event_amd_iommu.c | ||
perf_event_amd_iommu.h | ||
perf_event_amd_uncore.c | ||
perf_event_amd.c | ||
perf_event_intel_ds.c | ||
perf_event_intel_lbr.c | ||
perf_event_intel_uncore.c | ||
perf_event_intel_uncore.h | ||
perf_event_intel.c | ||
perf_event_knc.c | ||
perf_event_p4.c | ||
perf_event_p6.c | ||
perf_event.c | ||
perf_event.h | ||
perfctr-watchdog.c | ||
powerflags.c | ||
proc.c | ||
rdrand.c | ||
scattered.c | ||
topology.c | ||
transmeta.c | ||
umc.c | ||
vmware.c |