linux

mirror of https://github.com/FEX-Emu/linux.git synced 2025-01-14 13:39:10 +00:00

Author	SHA1	Message	Date
Jan Beulich	234bb549ee	x86, cleanups: Use clear_page/copy_page rather than memset/memcpy When operating on whole pages, use clear_page() and copy_page() in favor of memset() and memcpy(); after all that's what they are intended for. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4C7FB8CA0200007800013F51@vpn.id2.novell.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-22 15:36:49 -07:00
Steven Rostedt	95fccd465e	jump label: Remove duplicate structure for x86 The structure in the x86 jump label code uses the typedef jump_label_t, which is defined by the #ifdef arch type. The structure does not need to be duplicated there. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2010-09-22 17:37:43 -04:00
Jason Baron	d9f5ab7b1c	jump label: x86 support add x86 support for jump label. I'm keeping this patch separate so its clear to arch maintainers what was required for x86 support this new feature. Hopefully, it wouldn't be too painful for other archs. Signed-off-by: Jason Baron <jbaron@redhat.com> LKML-Reference: <f838f49f40fbea0254036194be66dc48b598dcea.1284733808.git.jbaron@redhat.com> [ cleaned up some formatting ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2010-09-22 16:33:03 -04:00
Jason Baron	4c3ef6d793	jump label: Add jump_label_text_reserved() to reserve jump points Add a jump_label_text_reserved(void start, void end), so that other pieces of code that want to modify kernel text, can first verify that jump label has not reserved the instruction. Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Jason Baron <jbaron@redhat.com> LKML-Reference: <06236663a3a7b1c1f13576bb9eccb6d9c17b7bfe.1284733808.git.jbaron@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2010-09-22 16:30:46 -04:00
Jason Baron	bf5438fca2	jump label: Base patch for jump label base patch to implement 'jump labeling'. Based on a new 'asm goto' inline assembly gcc mechanism, we can now branch to labels from an 'asm goto' statment. This allows us to create a 'no-op' fastpath, which can subsequently be patched with a jump to the slowpath code. This is useful for code which might be rarely used, but which we'd like to be able to call, if needed. Tracepoints are the current usecase that these are being implemented for. Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jason Baron <jbaron@redhat.com> LKML-Reference: <ee8b3595967989fdaf84e698dc7447d315ce972a.1284733808.git.jbaron@redhat.com> [ cleaned up some formating ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2010-09-22 16:29:41 -04:00
Ingo Molnar	90edf27fb8	Merge branch 'linus' into perf/core Conflicts: kernel/hw_breakpoint.c Merge reason: resolve the conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-09-22 18:45:01 +02:00
Linus Torvalds	87ac6fa26e	Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: hw breakpoints: Fix pid namespace bug x86: Fix instruction breakpoint encoding oprofile: Add Support for Intel CPU Family 6 / Model 22 (Intel Celeron 540) kprobes: Fix Kconfig dependency	2010-09-21 13:21:42 -07:00
Yinghai Lu	74b3c444a9	x86, setup: Fix earlyprintk=serial,0x3f8,115200 earlyprintk can take and I/O port, so we need to handle this case in the setup code too, otherwise 0x3f8 will be treated as a baud rate. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4C7B05A6.4010801@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-21 10:18:33 -07:00
Yinghai Lu	83d9f65bda	x86, setup: Fix earlyprintk=serial,ttyS0,115200 Torsten reported that there is garbage output, after commit 8fee13a48e4879fba57725f6d9513df4bfa8e9f3 (x86, setup: enable early console output from the decompressor) It turns out we missed the offset for that case. Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4C7B0578.8090807@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-21 10:18:14 -07:00
Ingo Molnar	7ed569206e	Merge commit 'v2.6.36-rc5' into perf/core Merge reason: Pick up the latest fixes in -rc5. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-09-21 13:55:11 +02:00
Jiri Olsa	bb7ab785ad	oprofile: Add Support for Intel CPU Family 6 / Model 29 This patch adds CPU type detection for dunnington processor (Family 6 / Model 29) to be identified as core 2 family cpu type (wikipedia source). I tested oprofile on Intel(R) Xeon(R) CPU E7440 reporting itself as model 29, and it runs without an issue. Spec: http://www.intel.com/Assets/en_US/PDF/specupdate/320336.pdf Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Cc: stable@kernel.org Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-09-21 12:22:48 +02:00
Rusty Russell	9b6efcd2e2	lguest: update comments to reflect LHCALL_LOAD_GDT_ENTRY. We used to have a hypercall which reloaded the entire GDT, then we switched to one which loaded a single entry (to match the IDT code). Some comments were not updated, so fix them. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Reported by: Eviatar Khen <eviatarkhen@gmail.com>	2010-09-21 10:54:02 +09:30
H. Peter Anvin	c2b9ff24a0	x86, cpu: Re-run get_cpu_cap() after adjusting the CPUID level At least on Intel, adjusting the max CPUID level can expose new CPUID features, so we need to re-run get_cpu_cap() after changing the CPUID level. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-20 18:03:28 -07:00
H. Peter Anvin	ce5f68246b	x86, hotplug: In the MWAIT case of play_dead, CLFLUSH the cache line When we're using MWAIT for play_dead, explicitly CLFLUSH the cache line before executing MONITOR. This is a potential workaround for the Xeon 7400 erratum AAI65 after having a spurious wakeup and returning around the loop. "Potential" here because it is not certain that that erratum could actually trigger; however, the CLFLUSH should be harmless. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Acked-by: Venkatesh Pallipadi <venki@google.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan van de Ven <arjan@linux.kernel.org> Cc: Len Brown <lenb@kernel.org>	2010-09-20 15:51:59 -07:00
Jason Baron	fa6f2cc770	jump label: Make text_poke_early() globally visible Make text_poke_early available outside of alternative.c. The jump label patchset wants to make use of it in order to set up the optimal no-op sequences at run-time. Signed-off-by: Jason Baron <jbaron@redhat.com> LKML-Reference: <04cfddf2ba77bcabfc3e524f1849d871d6a1cf9d.1284733808.git.jbaron@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2010-09-20 18:19:51 -04:00
Jason Baron	f49aa44856	jump label: Make dynamic no-op selection available outside of ftrace Move Steve's code for finding the best 5-byte no-op from ftrace.c to alternative.c. The idea is that other consumers (in this case jump label) want to make use of that code. Signed-off-by: Jason Baron <jbaron@redhat.com> LKML-Reference: <96259ae74172dcac99c0020c249743c523a92e18.1284733808.git.jbaron@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2010-09-20 18:19:39 -04:00
Andreas Herrmann	23ac4ae827	x86, k8: Rename k8.[ch] to amd_nb.[ch] and CONFIG_K8_NB to CONFIG_AMD_NB The file names are somehow misleading as the code is not specific to AMD K8 CPUs anymore. The files accomodate code for other AMD CPU northbridges as well. Same is true for the config option which is valid for AMD CPU northbridges in general and not specific to K8. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100917160343.GD4958@loge.amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-20 14:22:58 -07:00
H. Peter Anvin	a68e5c94f7	x86, hotplug: Move WBINVD back outside the play_dead loop On processors with hyperthreading, when only one thread is offlined the other thread can cause a spurious wakeup on the idled thread. We do not want to re-WBINVD when that happens. Ideally, we should simply skip WBINVD unless we're the last thread on a particular core to shut down, but there might be similar issues elsewhere in the system. Thus, revert to previous behavior of only WBINVD outside the loop. Partly as a result, remove the mb()'s around it: they are not necessary since wbinvd() is a serializing instruction, but they were intended to make sure the compiler didn't do any funny loop optimizations. Reported-by: Asit Mallick <asit.k.mallick@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Arjan van de Ven <arjan@linux.kernel.org> Cc: Len Brown <lenb@kernel.org> Cc: Venkatesh Pallipadi <venki@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.hl> LKML-Reference: <tip-ea53069231f9317062910d6e772cca4ce93de8c8@git.kernel.org>	2010-09-17 17:10:23 -07:00
H. Peter Anvin	ea53069231	x86, hotplug: Use mwait to offline a processor, fix the legacy case The code in native_play_dead() has a number of problems: 1. We should use MWAIT when available, to put ourselves into a deeper sleep state. 2. We use the existence of CLFLUSH to determine if WBINVD is safe, but that is totally bogus -- WBINVD is 486+, whereas CLFLUSH is a much later addition. 3. We should do WBINVD inside the loop, just in case of something like setting an A bit on page tables. Pointed out by Arjan van de Ven. This code is based in part of a previous patch by Venki Pallipadi, but unlike that patch this one keeps all the detection code local instead of pre-caching a bunch of information. We're shutting down the CPU; there is absolutely no hurry. This patch moves all the code to C and deletes the global wbinvd_halt() which is broken anyway. Originally-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Reviewed-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Venkatesh Pallipadi <venki@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.hl> LKML-Reference: <20090522232230.162239000@intel.com>	2010-09-17 15:39:11 -07:00
H. Peter Anvin	bc83cccc76	x86, mwait: Move mwait constants to a common header file We have MWAIT constants spread across three different .c files, for no good reason. Move them all into a common header file. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Reviewed-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Len Brown <lenb@kernel.org> LKML-Reference: <tip-*@git.kernel.org>	2010-09-17 15:36:40 -07:00
Andreas Herrmann	900f9ac9f1	x86, k8-gart: Decouple handling of garts and northbridges So far we only provide num_k8_northbridges. This is required in different areas (e.g. L3 cache index disable, GART). But not all AMD CPUs provide a GART. Thus it is useful to split off the GART handling from the generic caching of AMD northbridge misc devices. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100917160254.GC4958@loge.amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-17 13:26:21 -07:00
Andreas Herrmann	3518dd14ca	x86, cacheinfo: Fix dependency of AMD L3 CID L3 cache index disable code uses PCI accesses to AMD northbridge functions. Currently the code is #ifdef CONFIG_CPU_SUP_AMD. But it should be #if (defined(CONFIG_CPU_SUP_AMD) && defined(CONFIG_PCI)) which in the end is a dependency to K8_NB. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100917160744.GF4958@loge.amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-17 13:25:56 -07:00
Cliff Wickman	3ee48b6af4	mm, x86: Saving vmcore with non-lazy freeing of vmas During the reading of /proc/vmcore the kernel is doing ioremap()/iounmap() repeatedly. And the buildup of un-flushed vm_area_struct's is causing a great deal of overhead. (rb_next() is chewing up most of that time). This solution is to provide function set_iounmap_nonlazy(). It causes a subsequent call to iounmap() to immediately purge the vma area (with try_purge_vmap_area_lazy()). With this patch we have seen the time for writing a 250MB compressed dump drop from 71 seconds to 44 seconds. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: kexec@lists.infradead.org Cc: <stable@kernel.org> LKML-Reference: <E1OwHZ4-0005WK-Tw@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-09-17 09:11:56 +02:00
Linus Torvalds	a5b617368c	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: hpet: Work around hardware stupidity x86, build: Disable -fPIE when compiling with CONFIG_CC_STACKPROTECTOR=y x86, cpufeature: Suppress compiler warning with gcc 3.x x86, UV: Fix initialization of max_pnode	2010-09-16 19:38:08 -07:00
Frederic Weisbecker	89e45aac42	x86: Fix instruction breakpoint encoding Lengths and types of breakpoints are encoded in a half byte into CPU registers. However when we extract these values and store them, we add a high half byte part to them: 0x40 to the length and 0x80 to the type. When that gets reloaded to the CPU registers, the high part is masked. While making the instruction breakpoints available for perf, I zapped that high part on instruction breakpoint encoding and that broke the arch -> generic translation used by ptrace instruction breakpoints. Writing dr7 to set an inst breakpoint was then failing. There is no apparent reason for these high parts so we could get rid of them altogether. That's an invasive change though so let's do that later and for now fix the problem by restoring that inst breakpoint high part encoding in this sole patch. Reported-by: Kelvie Wong <kelvie@ieee.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: Will Deacon <will.deacon@arm.com>	2010-09-17 03:24:13 +02:00
Patrick Simmons	c33f543d32	oprofile: Add Support for Intel CPU Family 6 / Model 22 (Intel Celeron 540) This patch adds CPU type detection for the Intel Celeron 540, which is part of the Core 2 family according to Wikipedia; the family and ID pair is absent from the Volume 3B table referenced in the source code comments. I have tested this patch on an Intel Celeron 540 machine reporting itself as Family 6 Model 22, and OProfile runs on the machine without issue. Spec: http://download.intel.com/design/mobile/SPECUPDT/317667.pdf Signed-off-by: Patrick Simmons <linuxrocks123@netscape.net> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: stable@kernel.org Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-09-16 12:35:56 +02:00
Suresh Siddha	fa47f7e528	x86, x2apic: Simplify apic init in SMP and UP builds Move enable_IR_x2apic() inside the default_setup_apic_routing(), and for SMP platforms, move the default_setup_apic_routing() after smp_sanity_check(). This cleans up the code that tries to avoid multiple calls to default_setup_apic_routing() when smp_sanity_check() fails (which goes through the APIC_init_uniprocessor() path). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20100827181049.173087246@sbsiddha-MOBL3.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-15 17:37:10 -07:00
Suresh Siddha	62a92f4c69	x86, intr-remap: Remove IRTE setup duplicate code Remove IRTE setup duplicate code with prepare_irte(). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20100827181049.095067319@sbsiddha-MOBL3.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-15 17:36:45 -07:00
Suresh Siddha	75e3cfbed6	x86, intr-remap: Set redirection hint in the IRTE Currently the redirection hint in the interrupt-remapping table entry is set to 0, which means the remapped interrupt is directed to the processors listed in the destination. So in logical flat mode in the presence of intr-remapping, this results in a single interrupt multi-casted to multiple cpu's as specified by the destination bit mask. But what we really want is to send that interrupt to one of the cpus based on the lowest priority delivery mode. Set the redirection hint in the IRTE to '1' to indicate that we want the remapped interrupt to be directed to only one of the processors listed in the destination. This fixes the issue of same interrupt getting delivered to multiple cpu's in the logical flat mode in the presence of interrupt-remapping. While there is no functional issue observed with this behavior, this will impact performance of such configurations (<=8 cpu's using logical flat mode in the presence of interrupt-remapping) Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20100827181049.013051492@sbsiddha-MOBL3.sc.intel.com> Cc: Weidong Han <weidong.han@intel.com> Cc: <stable@kernel.org> # [v2.6.32+] Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-15 17:36:37 -07:00
Namhyung Kim	6abded71d7	kprobes: Remove __dummy_buf Remove __dummy_buf which is needed for kallsyms_lookup only. use kallsysm_lookup_size_offset instead. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> LKML-Reference: <1284512670-2369-5-git-send-email-namhyung@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-09-15 10:44:02 +02:00
Namhyung Kim	6376b22975	kprobes: Make functions static Make following (internal) functions static to make sparse happier :-) * get_optimized_kprobe: only called from static functions * kretprobe_table_unlock: _lock function is static * kprobes_optinsn_template_holder: never called but holding asm code Signed-off-by: Namhyung Kim <namhyung@gmail.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> LKML-Reference: <1284512670-2369-4-git-send-email-namhyung@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-09-15 10:44:01 +02:00
Ingo Molnar	3aabae7d9d	Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core	2010-09-15 10:27:31 +02:00
Roland McGrath	eefdca043e	x86-64, compat: Retruncate rax after ia32 syscall entry tracing In commit d4d6715, we reopened an old hole for a 64-bit ptracer touching a 32-bit tracee in system call entry. A %rax value set via ptrace at the entry tracing stop gets used whole as a 32-bit syscall number, while we only check the low 32 bits for validity. Fix it by truncating %rax back to 32 bits after syscall_trace_enter, in addition to testing the full 64 bits as has already been added. Reported-by: Ben Hawkes <hawkes@sota.gen.nz> Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-14 16:08:47 -07:00
H. Peter Anvin	36d001c70d	x86-64, compat: Test %rax for the syscall number, not %eax On 64 bits, we always, by necessity, jump through the system call table via %rax. For 32-bit system calls, in theory the system call number is stored in %eax, and the code was testing %eax for a valid system call number. At one point we loaded the stored value back from the stack to enforce zero-extension, but that was removed in checkin d4d67150165df8bf1cc05e532f6efca96f907cab. An actual 32-bit process will not be able to introduce a non-zero-extended number, but it can happen via ptrace. Instead of re-introducing the zero-extension, test what we are actually going to use, i.e. %rax. This only adds a handful of REX prefixes to the code. Reported-by: Ben Hawkes <hawkes@sota.gen.nz> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@kernel.org> Cc: Roland McGrath <roland@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org>	2010-09-14 16:08:46 -07:00
H. Peter Anvin	c41d68a513	compat: Make compat_alloc_user_space() incorporate the access_ok() compat_alloc_user_space() expects the caller to independently call access_ok() to verify the returned area. A missing call could introduce problems on some architectures. This patch incorporates the access_ok() check into compat_alloc_user_space() and also adds a sanity check on the length. The existing compat_alloc_user_space() implementations are renamed arch_compat_alloc_user_space() and are used as part of the implementation of the new global function. This patch assumes NULL will cause __get_user()/__put_user() to either fail or access userspace on all architectures. This should be followed by checking the return value of compat_access_user_space() for NULL in the callers, at which time the access_ok() in the callers can also be removed. Reported-by: Ben Hawkes <hawkes@sota.gen.nz> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: James Bottomley <jejb@parisc-linux.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: <stable@kernel.org>	2010-09-14 16:08:45 -07:00
Thomas Gleixner	54ff7e595d	x86: hpet: Work around hardware stupidity This more or less reverts commits 08be979 (x86: Force HPET readback_cmp for all ATI chipsets) and 30a564be (x86, hpet: Restrict read back to affected ATI chipsets) to the status of commit 8da854c (x86, hpet: Erratum workaround for read after write of HPET comparator). The delta to commit 8da854c is mostly comments and the change from WARN_ONCE to printk_once as we know the call path of this function already. This needs really in depth explanation: First of all the HPET design is a complete failure. Having a counter compare register which generates an interrupt on matching values forces the software to do at least one superfluous readback of the counter register. While it is nice in theory to program "absolute" time events it is practically useless because the timer runs at some absurd frequency which can never be matched to real world units. So we are forced to calculate a relative delta and this forces a readout of the actual counter value, adding the delta and programming the compare register. When the delta is small enough we run into the danger that we program a compare value which is already in the past. Due to the compare for equal nature of HPET we need to read back the counter value after writing the compare rehgister (btw. this is necessary for absolute timeouts as well) to make sure that we did not miss the timer event. We try to work around that by setting the minimum delta to a value which is larger than the theoretical time which elapses between the counter readout and the compare register write, but that's only true in theory. A NMI or SMI which hits between the readout and the write can easily push us beyond that limit. This would result in waiting for the next HPET timer interrupt until the 32bit wraparound of the counter happens which takes about 306 seconds. So we designed the next event function to look like: match = read_cnt() + delta; write_compare_ref(match); return read_cnt() < match ? 0 : -ETIME; At some point we got into trouble with certain ATI chipsets. Even the above "safe" procedure failed. The reason was that the write to the compare register was delayed probably for performance reasons. The theory was that they wanted to avoid the synchronization of the write with the HPET clock, which is understandable. So the write does not hit the compare register directly instead it goes to some intermediate register which is copied to the real compare register in sync with the HPET clock. That opens another window for hitting the dreaded "wait for a wraparound" problem. To work around that "optimization" we added a read back of the compare register which either enforced the update of the just written value or just delayed the readout of the counter enough to avoid the issue. We unfortunately never got any affirmative info from ATI/AMD about this. One thing is sure, that we nuked the performance "optimization" that way completely and I'm pretty sure that the result is worse than before some HW folks came up with those. Just for paranoia reasons I added a check whether the read back compare register value was the same as the value we wrote right before. That paranoia check triggered a couple of years after it was added on an Intel ICH9 chipset. Venki added a workaround (commit 8da854c) which was reading the compare register twice when the first check failed. We considered this to be a penalty in general and restricted the readback (thus the wasted CPU cycles) to the known to be affected ATI chipsets. This turned out to be a utterly wrong decision. 2.6.35 testers experienced massive problems and finally one of them bisected it down to commit 30a564be which spured some further investigation. Finally we got confirmation that the write to the compare register can be delayed by up to two HPET clock cycles which explains the problems nicely. All we can do about this is to go back to Venki's initial workaround in a slightly modified version. Just for the record I need to say, that all of this could have been avoided if hardware designers and of course the HPET committee would have thought about the consequences for a split second. It's out of my comprehension why designing a working timer is so hard. There are two ways to achieve it: 1) Use a counter wrap around aware compare_reg <= counter_reg implementation instead of the easy compare_reg == counter_reg Downsides: - It needs more silicon. - It needs a readout of the counter to apply a relative timeout. This is necessary as the counter does not run in any useful (and adjustable) frequency and there is no guarantee that the counter which is used for timer events is the same which is used for reading the actual time (and therefor for calculating the delta) Upsides: - None 2) Use a simple down counter for relative timer events Downsides: - Absolute timeouts are not possible, which is not a problem at all in the context of an OS and the expected max. latencies/jitter (also see Downsides of #1) Upsides: - It needs less or equal silicon. - It works ALWAYS - It is way faster than a compare register based solution (One write versus one write plus at least one and up to four reads) I would not be so grumpy about all of this, if I would not have been ignored for many years when pointing out these flaws to various hardware folks. I really hate timers (at least those which seem to be designed by janitors). Though finally we got a reasonable explanation plus a solution and I want to thank all the folks involved in chasing it down and providing valuable input to this. Bisected-by: Nix <nix@esperi.org.uk> Reported-by: Artur Skawina <art.08.09@gmail.com> Reported-by: Damien Wyart <damien.wyart@free.fr> Reported-by: John Drescher <drescherjm@gmail.com> Cc: Venkatesh Pallipadi <venki@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: stable@kernel.org Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-09-15 00:55:13 +02:00
basile@opensource.dyc.edu	08c2b394b9	x86, build: Disable -fPIE when compiling with CONFIG_CC_STACKPROTECTOR=y The arch/x86/Makefile uses scripts/gcc-x86_$(BITS)-has-stack-protector.sh to check if cc1 supports -fstack-protector. When -fPIE is passed to cc1, these scripts fail causing stack protection to be disabled even when it is available. This fix is similar to commit c47efe5548abbf53c2f66e06dcb46183b11d6b22 Reported-by: Kai Dietrich <mail@cleeus.de> Signed-off-by: Magnus Granberg <zorry@gentoo.org> LKML-Reference: <20100913101319.748A1148E216@opensource.dyc.edu> Signed-off-by: Anthony G. Basile <basile@opensource.dyc.edu> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-13 15:53:16 -07:00
Tetsuo Handa	2fd818642a	x86, cpufeature: Suppress compiler warning with gcc 3.x Gcc 3.x generates a warning arch/x86/include/asm/cpufeature.h: In function `__static_cpu_has': arch/x86/include/asm/cpufeature.h:326: warning: asm operand 1 probably doesn't match constraints on each file. But static_cpu_has() for gcc 3.x does not need __static_cpu_has(). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> LKML-Reference: <201008300127.o7U1RC6Z044051@www262.sakura.ne.jp> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-13 14:48:41 -07:00
Stephane Eranian	b0b2072df3	perf_events: Fix BTS interrupt handling to avoid being dazed by NMI (v2) Fix a bug introduced with commit de725de and the change in the meaning of the return value of intel_pmu_handle_irq(). With the current code, when you are using the BTS, you get 'dazed by NMI' each time the BTS buffer fills up. BTS does interrupt on the PMU vector, thus NMI. You need to take this into account in the return value of the function. This version fixes initial patch which was missing changes to perf_event_intel_ds.c. Signed-off-by: Stephane Eranian <eranian@google.com> Acked-by: Don Zickus <dzickus@redhat.com> Cc: peterz@infradead.org Cc: paulus@samba.org Cc: davem@davemloft.net Cc: fweisbec@gmail.com Cc: perfmon2-devel@lists.sf.net Cc: eranian@gmail.com Cc: robert.richter@amd.com LKML-Reference: <4c8a1686.aae9d80a.5aa4.5e35@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-09-13 08:43:40 +02:00
Joe Perches	d0ed0c3266	x86: Remove pr_<level> uses of KERN_<level> Signed-off-by: Joe Perches <joe@perches.com> Cc: Jiri Kosina <trivial@kernel.org> LKML-Reference: <d40c60f4b036a26db8492848695bdafaa3b42791.1284267142.git.joe@perches.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-09-12 09:32:31 +02:00
Peter Zijlstra	5ee5e97ee9	x86, tsc: Fix a preemption leak in restore_sched_clock_state() A real life genuine preemption leak.. Reported-and-tested-by: Jeff Chua <jeff.chua.linux@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-09-10 18:17:45 -07:00
Venkatesh Pallipadi	351e5a703a	x86, mtrr: Support mtrr lookup for range spanning across MTRR range mtrr_type_lookup [start:end] looked up the resultant MTRR type for that range, based on fixed and all variable MTRR ranges. It did check for multiple MTRR var ranges overlapping [start:end] and returned the net type. However, if the [start:end] range spanned across any var MTRR range, mtrr_type_lookup would return an error return of 0xFE. This was based on typical usage of mtrr_type_lookup in PAT mapping, where region being mapped would not normally span across MTRR ranges and also trying to keep the code simple. Mark recently reported the problem with this limitation. When there are two continguous MTRR's of type "writeback" and if there is a memory mapping over a region starting in one MTRR range and ending in another MTRR range, such mapping will fallback to "uncached" due to the above limitation. Change below adds support for such lookups spanning multiple MTRR ranges. We now have a wrapper mtrr_type_lookup that dynamically splits such a region into smaller chunks that fit within one MTRR range and does a __mtrr_type_lookup on it and combine the results later. Reported-by: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Venkatesh Pallipadi <venki@google.com> LKML-Reference: <1284159350-19841-3-git-send-email-venki@google.com> Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-10 16:11:20 -07:00
Venkatesh Pallipadi	a7f07cfbaa	x86, mtrr: Refactor MTRR type overlap check code Move the MTRR type overlap check into a new function. No functional change in this patch. Just making it easier to add multiple region overlap check in the following patch. Signed-off-by: Venkatesh Pallipadi <venki@google.com> LKML-Reference: <1284159350-19841-2-git-send-email-venki@google.com> Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-10 16:11:10 -07:00
Jack Steiner	36ac4b987b	x86, UV: Fix initialization of max_pnode Fix calculation of "max_pnode" for systems where the the highest blade has neither cpus or memory. (And, yes, although rare this does occur). Signed-off-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <20100910150808.GA19802@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-09-10 17:15:49 +02:00
Linus Torvalds	be6200aac9	Merge branch 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Perform hardware_enable in CPU_STARTING callback KVM: i8259: fix migration KVM: fix i8259 oops when no vcpus are online KVM: x86 emulator: fix regression with cmpxchg8b on i386 hosts	2010-09-10 08:02:45 -07:00
Brian Gerst	b2b57fe053	x86, fpu: Merge fpu_save_init() Make 64-bit use the 32-bit version of fpu_save_init(). Remove unused clear_fpu_state(). Signed-off-by: Brian Gerst <brgerst@gmail.com> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1283563039-3466-13-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-09 14:17:36 -07:00
Brian Gerst	58a992b9cb	x86-32, fpu: Rewrite fpu_save_init() Rewrite fpu_save_init() to prepare for merging with 64-bit. Signed-off-by: Brian Gerst <brgerst@gmail.com> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1283563039-3466-12-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-09 14:17:31 -07:00
Brian Gerst	eec73f813a	x86, fpu: Remove PSHUFB_XMM5_* macros The PSHUFB_XMM5_* macros are no longer used. Signed-off-by: Brian Gerst <brgerst@gmail.com> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1283563039-3466-11-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-09 14:17:25 -07:00
Brian Gerst	8eb91a577d	x86, fpu: Remove unnecessary ifdefs from i387 code. Remove ifdefs for code that the compiler can optimize away on 64-bit. Signed-off-by: Brian Gerst <brgerst@gmail.com> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1283563039-3466-10-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-09 14:17:18 -07:00
Brian Gerst	a334fe43d8	x86-32, fpu: Remove math_emulate stub check_fpu() in bugs.c halts boot if no FPU is found and math emulation isn't enabled. Therefore this stub will never be used. Signed-off-by: Brian Gerst <brgerst@gmail.com> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1283563039-3466-9-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-09-09 14:17:11 -07:00

... 2 3 4 5 6 ...

11493 Commits