linux/Documentation
Ian Campbell x86, mm: Allow highmem user page tables to be disabled at boot time
Distros generally (I looked at Debian, RHEL5 and SLES11) seem to
enable CONFIG_HIGHPTE for any x86 configuration which has highmem
enabled. This means that the overhead applies even to machines which
have a fairly modest amount of high memory and which therefore do not
really benefit from allocating PTEs in high memory but still pay the
price of the additional mapping operations.

Running kernbench on a 4G box I found that with CONFIG_HIGHPTE=y but
no actual highptes being allocated there was a reduction in system
time used from 59.737s to 55.9s.

With CONFIG_HIGHPTE=y and highmem PTEs being allocated:
  Average Optimal load -j 4 Run (std deviation):
  Elapsed Time 175.396 (0.238914)
  User Time 515.983 (5.85019)
  System Time 59.737 (1.26727)
  Percent CPU 263.8 (71.6796)
  Context Switches 39989.7 (4672.64)
  Sleeps 42617.7 (246.307)

With CONFIG_HIGHPTE=y but with no highmem PTEs being allocated:
  Average Optimal load -j 4 Run (std deviation):
  Elapsed Time 174.278 (0.831968)
  User Time 515.659 (6.07012)
  System Time 55.9 (1.07799)
  Percent CPU 263.8 (71.266)
  Context Switches 39929.6 (4485.13)
  Sleeps 42583.7 (373.039)
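
To put the two kernbench runs above in proportion, the relative change in system time can be computed directly from the reported averages. This is a minimal sketch; the function and variable names are illustrative and not part of the patch.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional reduction going from `before` to `after`."""
    return (before - after) / before


# Average system time (seconds) from the two kernbench runs quoted above.
SYSTEM_TIME_HIGHMEM_PTES = 59.737     # highmem PTEs being allocated
SYSTEM_TIME_NO_HIGHMEM_PTES = 55.9    # no highmem PTEs being allocated

reduction = relative_reduction(SYSTEM_TIME_HIGHMEM_PTES,
                               SYSTEM_TIME_NO_HIGHMEM_PTES)
print(f"system time reduced by {reduction:.1%}")
```

The saving is 3.837s of system time, a little over 6% of the original figure. Elapsed and user time change far less between the two runs.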

This patch allows the user to control the allocation of PTEs in
highmem from the command line ("userpte=nohigh") but retains the
status quo as the default.
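
As an illustration of how the option might be checked from userspace, the sketch below looks for the "userpte=nohigh" token in a kernel command line string. The helper name is hypothetical, and the whitespace-based split is a simplification of mine, not the kernel's own parameter parser.

```python
def userpte_nohigh(cmdline: str) -> bool:
    """Return True if the command line contains the userpte=nohigh token.

    Simplified: tokens are split on whitespace, which ignores the
    quoting rules the kernel applies to its real command line.
    """
    return "userpte=nohigh" in cmdline.split()


# Example command line (made up for illustration).
example = "root=/dev/sda1 ro userpte=nohigh quiet"
print(userpte_nohigh(example))
```

On a running system the string could be read from /proc/cmdline and passed to the same helper.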

It is possible that some simple heuristic could be developed which
allows auto-tuning of this option; however, I don't have a sufficiently
large machine available to me to perform any particularly meaningful
experiments. We could probably handwave up an argument for a threshold
at 16G of total RAM.

Assuming 768M of lowmem we have 196608 potential lowmem PTE
pages. Each page can map 2M of RAM in a PAE-enabled configuration,
meaning a maximum of 384G of RAM could potentially be mapped using
lowmem PTEs.

Even allowing a generous factor of 10 to account for other required
lowmem allocations, generous slop to account for page sharing (which
reduces the total amount of RAM mappable by a given number of PT
pages) and other inaccuracies in the estimations, it would seem that
even a 32G machine would not have a particularly pressing need for
highmem PTEs. I think 32G could be considered to be at the upper bound
of what might be sensible on a 32-bit machine (although I think in
practice 64G is still supported).
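
The estimate in the two paragraphs above can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: 4K pages and 8-byte PAE page table entries (which give the 2M-per-PTE-page figure), and the whole of lowmem treated as available for PTE pages before the factor of 10 is applied. All names are illustrative.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

PAGE_SIZE = 4096      # assumed: 4K pages on x86
PAE_PTE_SIZE = 8      # assumed: PAE page table entries are 8 bytes


def lowmem_pte_coverage(lowmem_bytes: int, safety_factor: int = 1):
    """Return (number of lowmem PTE pages, bytes of RAM they can map)."""
    pte_pages = lowmem_bytes // PAGE_SIZE
    ptes_per_page = PAGE_SIZE // PAE_PTE_SIZE        # 512 entries
    bytes_per_pte_page = ptes_per_page * PAGE_SIZE   # 2M mapped per PTE page
    mappable = pte_pages * bytes_per_pte_page // safety_factor
    return pte_pages, mappable


pages, mappable = lowmem_pte_coverage(768 * MIB)
print(pages, "PTE pages can map", mappable // GIB, "G")

# Apply the factor of 10 allowed for other lowmem users.
_, derated = lowmem_pte_coverage(768 * MIB, safety_factor=10)
print("with a factor of 10:", round(derated / GIB, 1), "G")
```

Even after dividing by 10, the mappable amount stays above 32G, which is the basis for the claim that a 32G machine has no pressing need for highmem PTEs.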

It seems questionable whether HIGHPTE is even a win for any amount of RAM
you would sensibly run a 32-bit kernel on rather than going 64-bit.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
LKML-Reference: <1266403090-20162-1-git-send-email-ian.campbell@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-25 10:28:19 +01:00
..
ABI ima: rename PATH_CHECK to FILE_CHECK 2010-02-07 03:06:23 -05:00
accounting Documentation/: fix warnings from -Wmissing-prototypes in HOSTCFLAGS 2009-09-23 07:39:28 -07:00
acpi ACPI: support customizing ACPI control methods at runtime 2009-12-11 01:50:08 -05:00
aoe
arm OMAP: DSS2: Documentation for DSS2 2009-12-09 12:04:34 +02:00
auxdisplay includecheck fix: Documentation, cfag12864b-example.c 2009-09-24 07:20:57 -07:00
blackfin Blackfin: add an example showing how to use the gptimers API 2009-12-15 00:15:04 -05:00
block Documentation: Rename Documentation/DMA-mapping.txt 2010-01-02 10:09:44 -08:00
blockdev The DRBD driver 2009-10-01 21:17:49 +02:00
cdrom debugfs: Fix terminology inconsistency of dir name to mount debugfs filesystem. 2009-06-15 21:30:28 -07:00
cgroups blkio: Documentation 2009-12-03 19:28:53 +01:00
connector connector: Provide the sender's credentials to the callback 2009-10-02 10:54:01 -07:00
console
cpu-freq [CPUFREQ] fix default value for ondemand governor 2010-01-13 10:55:15 -05:00
cpuidle
cris
crypto async_tx: add support for asynchronous RAID6 recovery operations 2009-08-29 19:09:27 -07:00
development-process docs: Encourage better changelogs in the development process document 2009-06-04 10:32:49 -06:00
device-mapper dm snapshot: add merge target 2009-12-10 23:52:30 +00:00
DocBook DocBook: fix ioremap return type 2010-01-02 10:09:44 -08:00
driver-model Driver core: driver_attribute parameters can often be const* 2009-12-23 11:23:43 -08:00
dvb tree-wide: fix assorted typos all over the place 2009-12-04 15:39:55 +01:00
early-userspace
fault-injection fault injection: correct function names in documentation 2010-02-02 18:11:22 -08:00
fb viafb: documentation update 2009-12-16 07:20:05 -08:00
filesystems proc: partially revert "procfs: provide stack information for threads" 2010-01-11 09:34:06 -08:00
firmware_class driver core: fix documentation of request_firmware_nowait 2009-06-15 21:30:24 -07:00
frv
hwmon Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging 2010-01-11 09:45:55 -08:00
i2c i2c: Get rid of struct i2c_client_address_data 2009-12-14 21:17:25 +01:00
i2o
ia64 Documentation/: fix warnings from -Wmissing-prototypes in HOSTCFLAGS 2009-09-23 07:39:28 -07:00
ide ide: preserve Host Protected Area by default (v2) 2009-06-07 13:52:52 +02:00
infiniband IB: Fix typo in ipoib.txt 2009-12-09 14:21:36 -08:00
input Input: update multi-touch protocol documentation 2010-01-28 22:32:52 -08:00
ioctl docs: large update to ioctl-number.txt 2010-01-11 09:34:04 -08:00
isdn gigaset: documentation amendments 2009-12-08 20:30:41 -08:00
ja_JP block: rename CONFIG_LBD to CONFIG_LBDAF 2009-06-19 08:08:50 +02:00
kbuild kbuild: generate modules.builtin 2009-12-12 13:08:16 +01:00
kdump trivial: Miscellaneous documentation typo fixes 2009-06-12 18:01:47 +02:00
ko_KR
kvm KVM: x86: Extend KVM_SET_VCPU_EVENTS with selective updates 2009-12-27 13:36:33 -02:00
laptops thinkpad-acpi: update volume subdriver documentation 2009-12-26 22:37:58 -05:00
lguest tree-wide: fix assorted typos all over the place 2009-12-04 15:39:55 +01:00
m68k
make
mips
misc-devices ad525x_dpot: new driver for AD525x digital potentiometers 2009-12-15 08:53:25 -08:00
mn10300 trivial: Miscellaneous documentation typo fixes 2009-06-12 18:01:47 +02:00
mtd trivial: Miscellaneous documentation typo fixes 2009-06-12 18:01:47 +02:00
namespaces
netlabel
networking Documentation/3c509: document ethtool support 2010-01-11 15:53:45 -08:00
parisc
PCI Documentation: Rename Documentation/DMA-mapping.txt 2010-01-02 10:09:44 -08:00
pcmcia pcmcia: remove now-defunct cs_error, pcmcia_error_{func,ret} 2009-11-09 08:30:06 +01:00
power PM: Runtime PM documentation update 2009-12-22 20:43:40 +01:00
powerpc Merge commit 'kumar/next' into merge 2009-12-21 09:30:42 +11:00
pps LinuxPPS: core support 2009-06-18 13:04:04 -07:00
prctl
RCU rcu: Add synchronize_srcu_expedited() to the documentation 2009-10-26 09:40:31 +01:00
s390 [S390] s390dbf: Add description for usage of "%s" in sprintf events 2009-09-11 10:29:53 +02:00
scheduler sched: Documentation/sched-rt-group: Fix style issues & bump version 2009-06-21 13:12:46 +02:00
scsi Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2009-12-09 19:43:33 -08:00
serial tty: docs: serial/tty, add to ldisc methods 2009-12-11 15:18:05 -08:00
sh
sound ALSA: Fix a typo in Procfile.txt 2009-12-26 18:37:44 +01:00
sparc
spi tree-wide: fix assorted typos all over the place 2009-12-04 15:39:55 +01:00
sysctl doc: Add documentation for bootloader_{type,version} 2009-12-11 14:28:56 -08:00
telephony
thermal thermal: add sanity check for the passive attribute 2009-11-05 18:18:10 -05:00
timers fix URL in hpet.txt 2009-11-09 09:40:54 +01:00
trace tracing/documentation: Cover new frame pointer semantics 2010-01-26 17:00:39 -05:00
uml
usb USB: power management documentation update 2009-12-23 11:34:09 -08:00
video4linux V4L/DVB (13651): sh_mobile_ceu_camera: document the scaling and cropping algorithm 2009-12-16 09:27:20 -02:00
vm HWPOISON: Use correct name for MADV_HWPOISON in documentation 2009-12-16 12:20:00 +01:00
w1 ds2482: Discard obsolete detect method 2009-10-04 22:53:41 +02:00
watchdog Documentation/: fix warnings from -Wmissing-prototypes in HOSTCFLAGS 2009-09-23 07:39:28 -07:00
wimax
x86 USB: ehci-dbgp,documentation: Documentation updates for ehci-dbgp 2009-09-23 06:46:39 -07:00
zh_CN
00-INDEX Bluetooth: Add documentation for Marvell Bluetooth driver 2009-08-22 14:25:32 -07:00
applying-patches.txt
atomic_ops.txt Documentation/atomic_ops.txt: fix sample code 2009-06-16 19:47:52 -07:00
bad_memory.txt
basic_profiling.txt
binfmt_misc.txt
braille-console.txt trivial: Miscellaneous documentation typo fixes 2009-06-12 18:01:47 +02:00
bt8xxgpio.txt
btmrvl.txt Bluetooth: Add documentation for Marvell Bluetooth driver 2009-08-22 14:25:32 -07:00
BUG-HUNTING
cachetlb.txt
Changes netfilter: xtables: document minimal required version 2009-12-14 14:52:10 +01:00
CodingStyle trivial: fix typo milisecond/millisecond for documentation and source comments. 2009-06-12 18:01:46 +02:00
cpu-hotplug.txt cpumask: don't recommend set_cpus_allowed hack in Documentation/cpu-hotplug.txt 2009-12-17 11:43:29 +10:30
cpu-load.txt
cputopology.txt Documentation: ABI: /sys/devices/system/cpu/cpu#/ topology files 2009-10-30 14:59:52 -07:00
credentials.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt ieee1394: update URLs in debugging-via-ohci1394.txt 2009-10-03 09:28:11 +02:00
dell_rbu.txt trivial: Documentation/dell_rbu.txt: fix typos 2009-06-12 18:01:50 +02:00
devices.txt
DMA-API.txt trivial: Miscellaneous documentation typo fixes 2009-06-12 18:01:47 +02:00
DMA-attributes.txt
DMA-ISA-LPC.txt
dmaengine.txt
dontdiff dontdiff: add generated 2009-12-12 13:08:13 +01:00
dynamic-debug-howto.txt
edac.txt fix typos/grammos in Documentation/edac.txt 2009-12-04 15:39:53 +01:00
eisa.txt
email-clients.txt
feature-removal-schedule.txt feature-removal-schedule: Add v4l1 drivers obsoleted by gspca sub drivers 2010-01-17 11:31:35 -02:00
flexible-arrays.txt Update flex_arrays.txt 2009-10-15 07:25:20 -06:00
futex-requeue-pi.txt
gcov.txt trivial: fix typo in CONFIG_DEBUG_FS in gcov doc 2009-09-21 15:14:56 +02:00
gpio.txt gpiolib: add support for changing value polarity in sysfs 2009-12-16 07:20:01 -08:00
highuid.txt
HOWTO
hw_random.txt
initrd.txt
intel_txt.txt x86, intel_txt: Intel TXT boot support 2009-07-21 11:49:06 -07:00
Intel-IOMMU.txt intel-iommu: Kill DMAR_BROKEN_GFX_WA option. 2009-09-19 09:37:23 -07:00
io_ordering.txt
io-mapping.txt
IO-mapping.txt Documentation: fix ioremap return type 2010-01-02 10:09:44 -08:00
iostats.txt
IPMI.txt
IRQ-affinity.txt
IRQ.txt
irqflags-tracing.txt
isapnp.txt
java.txt
kernel-doc-nano-HOWTO.txt documentation: update kernel-doc-nano-HOWTO information 2010-01-11 09:34:07 -08:00
kernel-docs.txt
kernel-parameters.txt x86, mm: Allow highmem user page tables to be disabled at boot time 2010-02-25 10:28:19 +01:00
keys-request-key.txt
keys.txt KEYS: Add a keyctl to install a process's session keyring on its parent [try #6] 2009-09-02 21:29:22 +10:00
kmemcheck.txt kmemcheck: update documentation 2009-07-01 22:36:22 +02:00
kmemleak.txt kmemleak: add clear command support 2009-09-08 16:36:08 +01:00
kobject.txt trivial: Miscellaneous documentation typo fixes 2009-06-12 18:01:47 +02:00
kprobes.txt debugfs: Fix terminology inconsistency of dir name to mount debugfs filesystem. 2009-06-15 21:30:28 -07:00
kref.txt kref: double kref_put() in my_data_handler() 2009-09-18 09:48:52 -07:00
ldm.txt
leds-class.txt led: document sysfs interface 2009-08-28 15:21:12 -04:00
leds-lp3944.txt leds: LED driver for National Semiconductor LP3944 Funlight Chip 2009-06-23 20:21:38 +01:00
local_ops.txt trivial: Miscellaneous documentation typo fixes 2009-06-12 18:01:47 +02:00
lockdep-design.txt lockdep: Fix typos in documentation 2009-08-07 12:03:46 +02:00
lockstat.txt lockstat: Add usage info to Documentation/lockstat.txt 2009-12-06 13:20:02 +01:00
logo.gif
logo.txt
magic-number.txt
Makefile
ManagementStyle
mca.txt
md.txt md: add 'recovery_start' per-device sysfs attribute 2009-12-14 12:58:57 +11:00
memory-barriers.txt
memory-hotplug.txt mm: add numa node symlink for memory section in sysfs 2009-12-15 08:53:17 -08:00
memory.txt Documentation/memory.txt: remove some very outdated recommendations 2009-09-22 07:17:26 -07:00
mono.txt
mutex-design.txt
nmi_watchdog.txt
nommu-mmap.txt nommu: fix malloc performance by adding uninitialized flag 2009-12-15 08:53:24 -08:00
numastat.txt mm: fix NUMA accounting in numastat.txt 2009-09-22 07:17:39 -07:00
oops-tracing.txt docs: Describe the 'C' taint flag in oops-tracing.txt 2009-11-09 09:40:56 +01:00
parport-lowlevel.txt
parport.txt
pi-futex.txt
pnp.txt
preempt-locking.txt
printk-formats.txt
prio_tree.txt
rbtree.txt trivial: rbtree.txt: fix rb_entry() parameters in sample code 2009-06-12 18:01:47 +02:00
rfkill.txt rfkill: export persistent attribute in sysfs 2009-06-19 11:50:18 -04:00
robust-futex-ABI.txt futex: documentation: fix inconsistent description of futex list_op_pending 2009-06-18 13:03:56 -07:00
robust-futexes.txt
rt-mutex-design.txt
rt-mutex.txt
rtc.txt rtc: add boot_timesource sysfs attribute 2009-09-23 07:39:46 -07:00
SAK.txt
SecurityBugs
SELinux.txt
serial-console.txt
sgi-ioc4.txt
sgi-visws.txt
slow-work.txt SLOW_WORK: Move slow_work's proc file to debugfs 2009-12-01 08:20:31 -08:00
SM501.txt trivial: Miscellaneous documentation typo fixes 2009-06-12 18:01:47 +02:00
Smack.txt
sparse.txt
spinlocks.txt Documentation: rw_lock lessons learned 2009-12-14 09:46:56 -08:00
stable_api_nonsense.txt
stable_kernel_rules.txt Doc/stable rules: add new cherry-pick logic 2009-12-23 11:23:43 -08:00
SubmitChecklist doc: SubmitChecklist, add ioctls, remove OSDL reference 2009-12-16 07:20:06 -08:00
SubmittingDrivers
SubmittingPatches docs: update patch size in SubmittingPatches 2009-10-01 16:11:12 -07:00
svga.txt
sysfs-rules.txt
sysrq.txt sysrq, kdump: make sysrq-c consistent 2009-07-29 19:10:36 -07:00
tomoyo.txt
unaligned-memory-access.txt
unicode.txt
unshare.txt
VGA-softcursor.txt
vgaarbiter.txt vgaarbiter: fix a typo in the vgaarbiter Documentation 2009-12-16 11:28:58 -08:00
video-output.txt
volatile-considered-harmful.txt
voyager.txt
zorro.txt