Commit Graph

2350 Commits

Author SHA1 Message Date
Ingo Molnar
f43fdad862 ftrace: fix kexec
disable the tracer while kexec pulls the rug from under the old
kernel.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-23 20:39:05 +02:00
Steven Rostedt
d61f82d066 ftrace: use dynamic patching for updating mcount calls
This patch replaces the indirect call to the mcount function
pointer with a direct call that will be patched by the
dynamic ftrace routines.

On boot up, the mcount function calls the ftace_stub function.
When the dynamic ftrace code is initialized, the ftrace_stub
is replaced with a call to the ftrace_record_ip, which records
the instruction pointers of the locations that call it.

Later, the ftraced daemon will call kstop_machine and patch all
the locations to nops.

When a ftrace is enabled, the original calls to mcount will now
be set top call ftrace_caller, which will do a direct call
to the registered ftrace function. This direct call is also patched
when the function that should be called is updated.

All patching is performed by a kstop_machine routine to prevent any
type of race conditions that is associated with modifying code
on the fly.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-23 20:33:47 +02:00
Steven Rostedt
3c1720f00b ftrace: move memory management out of arch code
This patch moves the memory management of the ftrace
records out of the arch code and into the generic code
making the arch code simpler.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-23 20:33:35 +02:00
Steven Rostedt
dfa60aba04 ftrace: use nops instead of jmp
This patch patches the call to mcount with nops instead
of a jmp over the mcount call.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-23 20:33:28 +02:00
Steven Rostedt
3d0833953e ftrace: dynamic enabling/disabling of function calls
This patch adds a feature to dynamically replace the ftrace code
with the jmps to allow a kernel with ftrace configured to run
as fast as it can without it configured.

The way this works, is on bootup (if ftrace is enabled), a ftrace
function is registered to record the instruction pointer of all
places that call the function.

Later, if there's still any code to patch, a kthread is awoken
(rate limited to at most once a second) that performs a stop_machine,
and replaces all the code that was called with a jmp over the call
to ftrace. It only replaces what was found the previous time. Typically
the system reaches equilibrium quickly after bootup and there's no code
patching needed at all.

e.g.

  call ftrace  /* 5 bytes */

is replaced with

  jmp 3f  /* jmp is 2 bytes and we jump 3 forward */
3:

When we want to enable ftrace for function tracing, the IP recording
is removed, and stop_machine is called again to replace all the locations
of that were recorded back to the call of ftrace.  When it is disabled,
we replace the code back to the jmp.

Allocation is done by the kthread. If the ftrace recording function is
called, and we don't have any record slots available, then we simply
skip that call. Once a second a new page (if needed) is allocated for
recording new ftrace function calls.  A large batch is allocated at
boot up to get most of the calls there.

Because we do this via stop_machine, we don't have to worry about another
CPU executing a ftrace call as we modify it. But we do need to worry
about NMI's so all functions that might be called via nmi must be
annotated with notrace_nmi. When this code is configured in, the NMI code
will not call notrace.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-23 20:33:09 +02:00
Steven Rostedt
6cd8a4bb2f ftrace: trace preempt off critical timings
Add preempt off timings. A lot of kernel core code is taken from the RT patch
latency trace that was written by Ingo Molnar.

This adds "preemptoff" and "preemptirqsoff" to /debugfs/tracing/available_tracers

Now instead of just tracing irqs off, preemption off can be selected
to be recorded.

When this is selected, it shares the same files as irqs off timings.
One can either trace preemption off, irqs off, or one or the other off.

By echoing "preemptoff" into /debugfs/tracing/current_tracer, recording
of preempt off only is performed. "irqsoff" will only record the time
irqs are disabled, but "preemptirqsoff" will take the total time irqs
or preemption are disabled. Runtime switching of these options is now
supported by simpling echoing in the appropriate trace name into
/debugfs/tracing/current_tracer.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-23 20:32:54 +02:00
Steven Rostedt
81d68a96a3 ftrace: trace irq disabled critical timings
This patch adds latency tracing for critical timings
(how long interrupts are disabled for).

 "irqsoff" is added to /debugfs/tracing/available_tracers

Note:
  tracing_max_latency
    also holds the max latency for irqsoff (in usecs).
   (default to large number so one must start latency tracing)

  tracing_thresh
    threshold (in usecs) to always print out if irqs off
    is detected to be longer than stated here.
    If irq_thresh is non-zero, then max_irq_latency
    is ignored.

Here's an example of a trace with ftrace_enabled = 0

=======
preemption latency trace v1.1.5 on 2.6.24-rc7
Signed-off-by: Ingo Molnar <mingo@elte.hu>
--------------------------------------------------------------------
 latency: 100 us, #3/3, CPU#1 | (M:rt VP:0, KP:0, SP:0 HP:0 #P:2)
    -----------------
    | task: swapper-0 (uid:0 nice:0 policy:0 rt_prio:0)
    -----------------
 => started at: _spin_lock_irqsave+0x2a/0xb7
 => ended at:   _spin_unlock_irqrestore+0x32/0x5f

                 _------=> CPU#
                / _-----=> irqs-off
               | / _----=> need-resched
               || / _---=> hardirq/softirq
               ||| / _--=> preempt-depth
               |||| /
               |||||     delay
   cmd     pid ||||| time  |   caller
      \   /    |||||   \   |   /
 swapper-0     1d.s3    0us+: _spin_lock_irqsave+0x2a/0xb7 (e1000_update_stats+0x47/0x64c [e1000])
 swapper-0     1d.s3  100us : _spin_unlock_irqrestore+0x32/0x5f (e1000_update_stats+0x641/0x64c [e1000])
 swapper-0     1d.s3  100us : trace_hardirqs_on_caller+0x75/0x89 (_spin_unlock_irqrestore+0x32/0x5f)

vim:ft=help
=======

And this is a trace with ftrace_enabled == 1

=======
preemption latency trace v1.1.5 on 2.6.24-rc7
--------------------------------------------------------------------
 latency: 102 us, #12/12, CPU#1 | (M:rt VP:0, KP:0, SP:0 HP:0 #P:2)
    -----------------
    | task: swapper-0 (uid:0 nice:0 policy:0 rt_prio:0)
    -----------------
 => started at: _spin_lock_irqsave+0x2a/0xb7
 => ended at:   _spin_unlock_irqrestore+0x32/0x5f

                 _------=> CPU#
                / _-----=> irqs-off
               | / _----=> need-resched
               || / _---=> hardirq/softirq
               ||| / _--=> preempt-depth
               |||| /
               |||||     delay
   cmd     pid ||||| time  |   caller
      \   /    |||||   \   |   /
 swapper-0     1dNs3    0us+: _spin_lock_irqsave+0x2a/0xb7 (e1000_update_stats+0x47/0x64c [e1000])
 swapper-0     1dNs3   46us : e1000_read_phy_reg+0x16/0x225 [e1000] (e1000_update_stats+0x5e2/0x64c [e1000])
 swapper-0     1dNs3   46us : e1000_swfw_sync_acquire+0x10/0x99 [e1000] (e1000_read_phy_reg+0x49/0x225 [e1000])
 swapper-0     1dNs3   46us : e1000_get_hw_eeprom_semaphore+0x12/0xa6 [e1000] (e1000_swfw_sync_acquire+0x36/0x99 [e1000])
 swapper-0     1dNs3   47us : __const_udelay+0x9/0x47 (e1000_read_phy_reg+0x116/0x225 [e1000])
 swapper-0     1dNs3   47us+: __delay+0x9/0x50 (__const_udelay+0x45/0x47)
 swapper-0     1dNs3   97us : preempt_schedule+0xc/0x84 (__delay+0x4e/0x50)
 swapper-0     1dNs3   98us : e1000_swfw_sync_release+0xc/0x55 [e1000] (e1000_read_phy_reg+0x211/0x225 [e1000])
 swapper-0     1dNs3   99us+: e1000_put_hw_eeprom_semaphore+0x9/0x35 [e1000] (e1000_swfw_sync_release+0x50/0x55 [e1000])
 swapper-0     1dNs3  101us : _spin_unlock_irqrestore+0xe/0x5f (e1000_update_stats+0x641/0x64c [e1000])
 swapper-0     1dNs3  102us : _spin_unlock_irqrestore+0x32/0x5f (e1000_update_stats+0x641/0x64c [e1000])
 swapper-0     1dNs3  102us : trace_hardirqs_on_caller+0x75/0x89 (_spin_unlock_irqrestore+0x32/0x5f)

vim:ft=help
=======

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-23 20:32:46 +02:00
Arnaldo Carvalho de Melo
16444a8a40 ftrace: add basic support for gcc profiler instrumentation
If CONFIG_FTRACE is selected and /proc/sys/kernel/ftrace_enabled is
set to a non-zero value the ftrace routine will be called everytime
we enter a kernel function that is not marked with the "notrace"
attribute.

The ftrace routine will then call a registered function if a function
happens to be registered.

[ This code has been highly hacked by Steven Rostedt and Ingo Molnar,
  so don't blame Arnaldo for all of this ;-) ]

Update:
  It is now possible to register more than one ftrace function.
  If only one ftrace function is registered, that will be the
  function that ftrace calls directly. If more than one function
  is registered, then ftrace will call a function that will loop
  through the functions to call.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-23 20:31:58 +02:00
Steven Rostedt
23adec554a x86: add notrace annotations to vsyscall.
Add the notrace annotations to the vsyscall functions - there we are
not in kernel context yet, so the tracer function cannot (and must not)
be called.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-23 20:31:39 +02:00
Linus Torvalds
737b0fbf44 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: correct mailing list address
  PCI: Correct last two HP entries in the bfsort whitelist
2008-05-20 10:55:04 -07:00
Linus Torvalds
e23a5f6687 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  [PATCH] return to old errno choice in mkdir() et.al.
  [Patch] fs/binfmt_elf.c: fix wrong return values
  [PATCH] get rid of leak in compat_execve()
  [Patch] fs/binfmt_elf.c: fix a wrong free
  [PATCH] avoid multiplication overflows and signedness issues for max_fds
  [PATCH] dup_fd() part 4 - race fix
  [PATCH] dup_fd() - part 3
  [PATCH] dup_fd() part 2
  [PATCH] dup_fd() fixes, part 1
  [PATCH] take init_files to fs/file.c
2008-05-19 16:37:45 -07:00
Linus Torvalds
88d53766bd Merge branch 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  KVM: LAPIC: ignore pending timers if LVTT is disabled
  KVM: Update MAINTAINERS for new mailing lists
  KVM: Fix kvm_vcpu_block() task state race
  KVM: ia64: Set KVM_IOAPIC_NUM_PINS to 48
  KVM: ia64: fix GVMM module including position-dependent objects
  KVM: ia64: Define new kvm_fpreg struture to replace ia64_fpreg
  KVM: PIT: take inject_pending into account when emulating hlt
  s390: KVM guest: fix compile error
  KVM: x86 emulator: fix writes to registers with modrm encodings
2008-05-19 13:53:21 -07:00
Tony Camuso
8d64c781f0 PCI: Correct last two HP entries in the bfsort whitelist
Replace Redundant Whitelist Entries with the Correct Ones

The ProLiant DL585 G2 and the DL585 G2 are entered reundantly in the
dmi_system_id table. What should have been there are the DL360 and DL380. This
patch simply replaces the redundant entries with the correct entries.

Signed-off-by: Tony Camuso <tony.camuso@hp.com>
Signed-off-by: Pat Schoeller <patrick.schoeller@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-19 12:21:36 -07:00
Marcelo Tosatti
54aaacee35 KVM: LAPIC: ignore pending timers if LVTT is disabled
Only use the APIC pending timers count to break out of HLT emulation if
the timer vector is enabled.

Certain configurations of Windows simply mask out the vector without
disabling the timer.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-18 14:39:39 +03:00
Marcelo Tosatti
eedaa4e2af KVM: PIT: take inject_pending into account when emulating hlt
Otherwise hlt emulation fails if PIT is not injecting IRQ's.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-18 14:34:15 +03:00
Avi Kivity
107d6d2efa KVM: x86 emulator: fix writes to registers with modrm encodings
A register destination encoded with a mod=3 encoding left dst.ptr NULL.
Normally we don't trap writes to registers, but in the case of smsw, we do.

Fix by pointing dst.ptr at the destination register.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-18 14:34:14 +03:00
Thomas Gleixner
e9623b3559 x86: disable mwait for AMD family 10H/11H CPUs
The previous revert of 0c07ee38c9 left
out the mwait disable condition for AMD family 10H/11H CPUs.

Andreas Herrman said:

It depends on the CPU. For AMD CPUs that support MWAIT this is wrong.
Family 0x10 and 0x11 CPUs will enter C1 on HLT. Powersavings then
depend on a clock divisor and current Pstate of the core.

If all cores of a processor are in halt state (C1) the processor can
enter the C1E (C1 enhanced) state. If mwait is used this will never
happen.

Thus HLT saves more power than MWAIT here.

It might be best to switch off the mwait flag for these AMD CPU
families like it was introduced with commit
f039b75471 (x86: Don't use MWAIT on AMD
Family 10)

Re-add the AMD families 10H/11H check and disable the mwait usage for
those.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-17 22:57:20 +02:00
Avi Kivity
31f4d870b0 x86: fix crash on cpu hotplug on pat-incapable machines
pat_disable() is __init, which means it goes away after booting is complete.
Unfortunately it is used by the hotplug code if the machine is not
pat-capable, causing a crash.

Fix by marking pat_disable() as __cpuinit.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-17 22:57:20 +02:00
Ingo Molnar
a738d897b7 x86: remove mwait capability C-state check
Vegard Nossum reports:

| powertop shows between 200-400 wakeups/second with the description
| "<kernel IPI>: Rescheduling interrupts" when all processors have load (e.g.
| I need to run two busy-loops on my 2-CPU system for this to show up).
|
| The bisect resulted in this commit:
|
| commit 0c07ee38c9
| Date:   Wed Jan 30 13:33:16 2008 +0100
|
|     x86: use the correct cpuid method to detect MWAIT support for C states

remove the functional effects of this patch and make mwait unconditional.

A future patch will turn off mwait on specific CPUs where that causes
power to be wasted.

Bisected-by: Vegard Nossum <vegard.nossum@gmail.com>
Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-17 22:57:20 +02:00
Al Viro
f52111b154 [PATCH] take init_files to fs/file.c
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-05-16 17:22:20 -04:00
Linus Torvalds
4ef7e3e90f Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: user_regset_view table fix for ia32 on 64-bit
  x86: arch/x86/mm/pat.c - fix warning
  x86: fix csum_partial() export
  x86: early_init_centaur(): use set_cpu_cap()
  x86: fix app crashes after SMP resume
  x86: wakeup.lds.S - section ordering fix
  x86: [VOYAGER] fix duplicate phys_cpu_present_map symbol
  x86/pci: fix broken ISA DMA
2008-05-13 12:33:56 -07:00
Roland McGrath
1f465f4e47 x86: user_regset_view table fix for ia32 on 64-bit
The user_regset_view table for the 32-bit regsets on the 64-bit build had
the wrong sizes for the FP regsets.  This bug had no user-visible effect
(just on kernel modules using the user_regset interfaces and the like).
But the fix is trivial and risk-free.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-13 19:40:20 +02:00
Pranith Kumar
afc8534380 x86: arch/x86/mm/pat.c - fix warning
fix this warning:

 arch/x86/mm/pat.c: In function `phys_mem_access_prot_allowed':
 arch/x86/mm/pat.c:558: warning: long long unsigned int format, long
 unsigned int arg (arg 6)
 arch/x86/mm/pat.c: In function `map_devmem':
 arch/x86/mm/pat.c:580: warning: long long unsigned int format, long
 unsigned int arg (arg 6)

Signed-off-by: D Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-13 19:39:30 +02:00
Ingo Molnar
89804c022f x86: fix csum_partial() export
Fix this symbol export problem:

    Building modules, stage 2.
    MODPOST 193 modules
    ERROR: "csum_partial" [fs/reiserfs/reiserfs.ko] undefined!
    make[1]: *** [__modpost] Error 1
    make: *** [modules] Error 2

This is due to a known weakness of symbol exports: if a symbol's
only in-core user is an EXPORT_SYMBOL from a lib-y section, the
symbol is not linked in.

The solution is to move the export to x8664_ksyms_64.c - but the real
solution would be to fix kbuild.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-13 19:38:47 +02:00
Andrew Morton
8c45a4e4f2 x86: early_init_centaur(): use set_cpu_cap()
arch/x86/kernel/setup_64.c:954: warning: passing argument 2 of 'set_bit' from incompatible pointer type

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-13 19:37:38 +02:00
Hugh Dickins
61165d7a03 x86: fix app crashes after SMP resume
After resume on a 2cpu laptop, kernel builds collapse with a sed hang,
sh or make segfault (often on 20295564), real-time signal to cc1 etc.

Several hurdles to jump, but a manually-assisted bisect led to -rc1's
d2bcbad5f3 x86: do not zap_low_mappings
in __smp_prepare_cpus.  Though the low mappings were removed at bootup,
they were left behind (with Global flags helping to keep them in TLB)
after resume or cpu online, causing the crashes seen.

Reinstate zap_low_mappings (with local __flush_tlb_all) for each cpu_up
on x86_32.  This used to be serialized by smp_commenced_mask: that's now
gone, but a low_mappings flag will do.  No need for native_smp_cpus_done
to repeat the zap: let mem_init zap BSP's low mappings just like on UP.

(In passing, fix error code from native_cpu_up: do_boot_cpu returns a
variety of diagnostic values, Dprintk what it says but convert to -EIO.
And save_pg_dir separately before zap_low_mappings: doesn't matter now,
but zapping twice in succession wiped out resume's swsusp_pg_dir.)

That worked well on the duo and one quad, but wouldn't boot 3rd or 4th
cpu on P4 Xeon, oopsing just after unlock_ipi_call_lock.  The TLB flush
IPI now being sent reveals a long-standing bug: the booting cpu has its
APIC readied in smp_callin at the top of start_secondary, but isn't put
into the cpu_online_map until just before that unlock_ipi_call_lock.

So native_smp_call_function_mask to online cpus would send_IPI_allbutself,
including the cpu just coming up, though it has been excluded from the
count to wait for: by the time it handles the IPI, the call data on
native_smp_call_function_mask's stack may well have been overwritten.

So fall back to send_IPI_mask while cpu_online_map does not match
cpu_callout_map: perhaps there's a better APICological fix to be
made at the start_secondary end, but I wouldn't know that.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-13 19:36:12 +02:00
Venki Pallipadi
77db988564 x86/PCI: X86_PAT & mprotect
Some versions of X used the mprotect workaround to change caching type from UC
to WB, so that it can then use mtrr to program WC for that region [1].  Change
the mmap of pci space through /sys or /proc interfaces from UC to UC_MINUS.
With this change, X will not need to use mprotect workaround to get WC type
since the MTRR mapping type will be honored.

The bug in mprotect that clobbers PAT bits is fixed in a follow on patch. So,
this X workaround will stop working as well.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-13 09:51:54 -07:00
Takashi Iwai
4a367f3a9d x86/PCI: fix broken ISA DMA
Rene Herman reported:

> commit 8779f2fc3b
>
> "x86: don't try to allocate from DMA zone at first"
>
> breaks all of ISA DMA. Or all of ALSA ISA DMA at least. All
> ISA soundcards are silent following that commit -- no error
> messages, everything appears fine, just silence.

That patch is buggy. We had an implicit assumption that
dev = NULL for ISA devices that require 24bit DMA.

The recent work on x86 dma_alloc_coherent() breaks the ISA DMA buffer
allocation, which is represented by "dev = NULL" and requires 24bit
DMA implicitly.

Bisected-by: Rene Herman <rene.herman@keyaccess.nl>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-13 09:51:53 -07:00
Cyrill Gorcunov
8c6b0ef2ea x86: wakeup.lds.S - section ordering fix
To allow linker to catch sections overlapping we have to declare
them in appropriate order.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-12 21:27:51 +02:00
James Bottomley
f8955ebe3e x86: [VOYAGER] fix duplicate phys_cpu_present_map symbol
The phys_cpu_present_map is an expected symbol in the SMP harness.
Unfortunately, x86 recently moved this and a few others to
kernel/setup.c where it doesn't quite work because voyager has to
define its own.  Use CONFIG_X86_LOCAL_APIC to isolate these
definitions and fix up another area in setup.c where CONFIG_X86_SMP
should be used instead of CONFIG_SMP.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Cc: toralf.foerster@gmx.de
Cc: Mike Travis <travis@sgi.com>
Cc: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-12 21:27:51 +02:00
Takashi Iwai
8965eb1938 x86/pci: fix broken ISA DMA
Rene Herman reported:

> commit 8779f2fc3b
>
> "x86: don't try to allocate from DMA zone at first"
>
> breaks all of ISA DMA. Or all of ALSA ISA DMA at least. All
> ISA soundcards are silent following that commit -- no error
> messages, everything appears fine, just silence.

That patch is buggy. We had an implicit assumption that
dev = NULL for ISA devices that require 24bit DMA.

The recent work on x86 dma_alloc_coherent() breaks the ISA DMA buffer
allocation, which is represented by "dev = NULL" and requires 24bit
DMA implicitly.

Bisected-by: Rene Herman <rene.herman@keyaccess.nl>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Rene Herman <rene.herman@keyaccess.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-12 21:27:50 +02:00
Linus Torvalds
3e1b83ab39 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: rdc: leds build/config fix
  x86: sysfs cpu?/topology is empty in 2.6.25 (32-bit Intel system)
  x86: revert commit 709f744 ("x86: bitops asm constraint fixes")
  x86: restrict keyboard io ports reservation to make ipmi driver work
  x86: fix fpu restore from sig return
  x86: remove spew print out about bus to node mapping
  x86: revert printk format warning change which is for linux-next
  x86: cleanup PAT cpu validation
  x86: geode: define geode_has_vsa2() even if CONFIG_MGEODE_LX is not set
  x86: GEODE: cache results from geode_has_vsa2() and uninline
  x86: revert geode config dependency
2008-05-10 21:10:48 -07:00
Ingo Molnar
82fd866701 x86: rdc: leds build/config fix
select NEW_LEDS for now until the Kconfig dependencies have been
fixed.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-10 19:31:45 +02:00
Helge Wagner
9096bd7a66 x86: restrict keyboard io ports reservation to make ipmi driver work
On some of our (single board computer) boards (x86) we are using an
IPMI controller that uses I/O ports 0x62 and 0x66 for a KCS (keyboard
controller style) IPMI system interface.

Trying to load the openipmi driver fails, because the ports
(0x62/0x66) are reserved for keyboard. keyboard reserves the full
range 0x60-0x6F while it doesn't need to.

Reserve only ports 0x60 and 0x64 for the legacy PS/2 i8042 keyboad
controller instead of 0x60-0x6F to allow the openipmi driver to work.

[ tglx: added 64bit fixup ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-10 19:31:45 +02:00
Suresh Siddha
fd3c3ed5d1 x86: fix fpu restore from sig return
If the task never used fpu, initialize the fpu before restoring the FP
state from the signal handler context. This will allocate the fpu
state, if the task never needed it before.

Reported-and-bisected-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Eric Sesterhenn <snakebyte@gmx.de>
Cc: Frederik Deweerdt <deweerdt@free.fr>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-10 19:31:45 +02:00
Yinghai Lu
0646153921 x86: remove spew print out about bus to node mapping
Jeff Garzik pointed out that this printout is not needed.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-10 19:31:45 +02:00
Thomas Gleixner
5ecddcebfb x86: revert printk format warning change which is for linux-next
commit 62179849b4
    x86: fix setup printk format warning

is for linux-next and not for .26

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-10 19:31:44 +02:00
Linus Torvalds
8d53910856 Revert "PCI: remove default PCI expansion ROM memory allocation"
This reverts commit 9f8daccaa0, which was
reported to break X startup (xf86-video-ati-6.8.0). See

	http://bugs.freedesktop.org/show_bug.cgi?id=15523

for details.

Reported-by: Laurence Withers <l@lwithers.me.uk>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08 19:02:55 -07:00
Linus Torvalds
f589274533 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  [ALSA] soc at91 minor bug fixes
  [ALSA] soc - at91-pcm - Fix line wrapping
  pcspkr: fix dependancies
2008-05-08 10:58:45 -07:00
Thomas Gleixner
8d4a430085 x86: cleanup PAT cpu validation
Move the scattered checks for PAT support to a single function. Its
moved to addon_cpuid_features.c as this file is shared between 32 and
64 bit.

Remove the manipulation of the PAT feature bit and just disable PAT in
the PAT layer, based on the PAT bit provided by the CPU and the
current CPU version/model white list.

Change the boot CPU check so it works on Voyager somewhere in the
future as well :) Also panic, when a secondary has PAT disabled but
the primary one has alrady switched to PAT. We have no way to undo
that.

The white list is kept for now to ensure that we can rely on known to
work CPU types and concentrate on the software induced problems
instead of fighthing CPU erratas and subtle wreckage caused by not yet
verified CPUs. Once the PAT code has stabilized enough, we can remove
the white list and open the can of worms.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-08 15:43:51 +02:00
Andres Salomon
547acec7ec x86: GEODE: cache results from geode_has_vsa2() and uninline
This moves geode_has_vsa2 into a .c file, caches the result we get from
the VSA virtual registers, and causes the function to no longer be inline.

[akpm@linux-foundation.org: cleanup]

Signed-off-by: Andres Salomon <dilinger@debian.org>
Cc: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-08 15:43:50 +02:00
Thomas Gleixner
ac44cc96fb x86: revert geode config dependency
commit e26a28d190
    x86: olpc build fix

was a fix to a patch that was withdrawn/delayed and then erroneously
commited to x86.git. Revert it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-08 15:43:50 +02:00
Stas Sergeev
e5e1d3cb20 pcspkr: fix dependancies
fix pcspkr dependancies: make the pcspkr platform
drivers to depend on a platform device, and
not the other way around.

Signed-off-by: Stas Sergeev <stsp@aknet.ru>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Dmitry Torokhov <dtor@mail.ru>
CC: Vojtech Pavlik <vojtech@suse.cz>
CC: Michael Opdenacker <michael-lists@free-electrons.com>
[fixed for 2.6.26-rc1 by tiwai]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-05-07 12:42:03 +02:00
Hugh Dickins
aeed5fce37 x86: fix PAE pmd_bad bootup warning
Fix warning from pmd_bad() at bootup on a HIGHMEM64G HIGHPTE x86_32.

That came from 9fc34113f6 x86: debug pmd_bad();
but we understand now that the typecasting was wrong for PAE in the previous
version: pagetable pages above 4GB looked bad and stopped Arjan from booting.

And revert that cded932b75 x86: fix pmd_bad
and pud_bad to support huge pages.  It was the wrong way round: we shouldn't
weaken every pmd_bad and pud_bad check to let huge pages slip through - in
part they check that we _don't_ have a huge page where it's not expected.

Put the x86 pmd_bad() and pud_bad() definitions back to what they have long
been: they can be improved (x86_32 should use PTE_MASK, to stop PAE thinking
junk in the upper word is good; and x86_64 should follow x86_32's stricter
comparison, to stop thinking any subset of required bits is good); but that
should be a later patch.

Fix Hans' good observation that follow_page() will never find pmd_huge()
because that would have already failed the pmd_bad test: test pmd_huge in
between the pmd_none and pmd_bad tests.  Tighten x86's pmd_huge() check?
No, once it's a hugepage entry, it can get quite far from a good pmd: for
example, PROT_NONE leaves it with only ACCESSED of the KERN_PGTABLE bits.

However... though follow_page() contains this and another test for huge
pages, so it's nice to keep it working on them, where does it actually get
called on a huge page?  get_user_pages() checks is_vm_hugetlb_page(vma) to
to call alternative hugetlb processing, as does unmap_vmas() and others.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Earlier-version-tested-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jeff Chua <jeff.chua.linux@gmail.com>
Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-06 13:08:58 -07:00
Linus Torvalds
bb896afe20 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes:
  sched: default to n for GROUP_SCHED and FAIR_GROUP_SCHED
  sched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
  sched, x86: add HAVE_UNSTABLE_SCHED_CLOCK
  sched: fix cpu clock
  sched: fair-group: fix a Div0 error of the fair group scheduler
  sched: fix missing locking in sched_domains code
  sched: make clock sync tunable by architecture code
  sched: fix debugging
  sched: fix sched_info_switch not being called according to documentation
  sched: fix hrtick_start_fair and CPU-Hotplug
  sched: fix SCHED_FAIR wake-idle logic error
  sched: fix RT task-wakeup logic
  sched: add statics, don't return void expressions
  sched: add debug checks to idle functions
  sched: remove old sched doc
  sched: make rt_sched_class, idle_sched_class static
  sched: optimize calc_delta_mine()
  sched: fix normalized sleeper
2008-05-05 17:31:14 -07:00
Ingo Molnar
a5574cf65b sched, x86: add HAVE_UNSTABLE_SCHED_CLOCK
add the HAVE_UNSTABLE_SCHED_CLOCK, for architectures to select.

the next change utilizes it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-05 23:56:18 +02:00
Linus Torvalds
108c196184 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  x86 PCI: call dmi_check_pciprobe()
  x86/pci: add pci=skip_isa_align command lines.
  x86/pci: remove flag in pci_cfg_space_size_ext
  x86: fix section mismatch in pci_scan_bus
2008-05-05 12:39:10 -07:00
Yinghai Lu
0df18ff366 x86 PCI: call dmi_check_pciprobe()
this change:

| commit 08f1c192c3
| Author: Muli Ben-Yehuda <muli@il.ibm.com>
| Date:   Sun Jul 22 00:23:39 2007 +0300
|
|    x86-64: introduce struct pci_sysdata to facilitate sharing of ->sysdata
|
|    This patch introduces struct pci_sysdata to x86 and x86-64, and
|    converts the existing two users (NUMA, Calgary) to use it.
|
|    This lays the groundwork for having other users of sysdata, such as
|    the PCI domains work.
|
|    The Calgary bits are tested, the NUMA bits just look ok.

replaces pcibios_scan_root with pci_scan_bus_parented...

but in pcibios_scan_root we have a DMI check:

    dmi_check_system(pciprobe_dmi_table);

when when have several peer root buses this could be called multiple
times (which is bad), so move that call to pci_access_init().

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-05 09:24:00 -07:00
Yinghai Lu
13a6ddb08e x86/pci: add pci=skip_isa_align command lines.
so we don't align the io port start address for pci cards.

also move out dmi check out acpi.c, because it has nothing to do with acpi.
it could spare some calling when we have several peer root buses.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-05 09:22:08 -07:00
Linus Torvalds
45ea2103d8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-fixes:
  x86: fix setup printk format warning
  x86: olpc build fix
  x86: video/fbdev.c: add MODULE_LICENSE
  x86: fix up bootparam.h for userspace inclusion
  x86: relocs ELF handling - use SELFMAG instead of numeric constant
  x86: vdso ELF handling - use SELFMAG instead of numeric constant
  x86: remove dell reboot dmi quirk board name match
  x86: es7000 build fix
  x86: make additional_cpus static
  x86: make start_secondary() static
  kbuild, suspend, x86: fix rebuild of wakeup.bin
  uml: fix gcc problem
  x86: undo visws/numaq build changes
2008-05-04 17:11:43 -07:00