The main benefit of using ACPI host bridge window information is that
we can do better resource allocation in systems with multiple host bridges,
e.g., http://bugzilla.kernel.org/show_bug.cgi?id=14183
Sometimes we need _CRS information even if we only have one host bridge,
e.g., https://bugs.launchpad.net/ubuntu/+source/linux/+bug/341681
Most of these systems are relatively new, so this patch turns on
"pci=use_crs" only on machines with a BIOS date of 2008 or newer.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Previously we used a table of size PCI_BUS_NUM_RESOURCES (16) for resources
forwarded to a bus by its upstream bridge. We've increased this size
several times when the table overflowed.
But there's no good limit on the number of resources because host bridges
and subtractive decode bridges can forward any number of ranges to their
secondary buses.
This patch reduces the table to only PCI_BRIDGE_RESOURCE_NUM (4) entries,
which corresponds to the number of windows a PCI-to-PCI (3) or CardBus (4)
bridge can positively decode. Any additional resources, e.g., PCI host
bridge windows or subtractively-decoded regions, are kept in a list.
I'd prefer a single list rather than this split table/list approach, but
that requires simultaneous changes to every architecture. This approach
only requires immediate changes where we set up (a) host bridges with more
than four windows and (b) subtractive-decode P2P bridges, and we can
incrementally change other architectures to use the list.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
No functional change; this converts loops that iterate from 0 to
PCI_BUS_NUM_RESOURCES through pci_bus resource[] table to use the
pci_bus_for_each_resource() iterator instead.
This doesn't change the way resources are stored; it merely removes
dependencies on the fact that they're in a table.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Now that we return the new resource start position, there is no
need to update "struct resource" inside the align function.
Therefore, mark the struct resource as const.
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
As suggested by Linus, align functions should return the start
of a resource, not void. An update of "res->start" is no longer
necessary.
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
On VIVT ARM, when we have multiple shared mappings of the same file
in the same MM, we need to ensure that we have coherency across all
copies. We do this via make_coherent() by making the pages
uncacheable.
This used to work fine, until we allowed highmem with highpte - we
now have a page table which is mapped as required, and is not available
for modification via update_mmu_cache().
Ralf Beache suggested getting rid of the PTE value passed to
update_mmu_cache():
On MIPS update_mmu_cache() calls __update_tlb() which walks pagetables
to construct a pointer to the pte again. Passing a pte_t * is much
more elegant. Maybe we might even replace the pte argument with the
pte_t?
Ben Herrenschmidt would also like the pte pointer for PowerPC:
Passing the ptep in there is exactly what I want. I want that
-instead- of the PTE value, because I have issue on some ppc cases,
for I$/D$ coherency, where set_pte_at() may decide to mask out the
_PAGE_EXEC.
So, pass in the mapped page table pointer into update_mmu_cache(), and
remove the PTE value, updating all implementations and call sites to
suit.
Includes a fix from Stephen Rothwell:
sparc: fix fallout from update_mmu_cache API change
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Right now xen's use of the x86 and ia64 handle_irq is just bizarre and very
fragile as it is very non-obvious the function exists and is is used by
code out in drivers/.... Luckily using handle_irq is completely unnecessary,
and we can just use the generic irq apis instead.
This still leaves drivers/xen/events.c as a problematic user of the generic
irq apis it has "static struct irq_info irq_info[NR_IRQS]" but that can be
fixed some other time.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <4B7CAAD2.10803@kernel.org>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
We broke "acpi=ht" in 2.6.32 by disabling MADT parsing
for acpi=disabled. e5b8fc6ac158f65598f58dba2c0d52ba3b412f52
This also broke systems which invoked acpi=ht via DMI blacklist.
acpi=ht is a really ugly hack,
but restore it for those that still use it.
http://bugzilla.kernel.org/show_bug.cgi?id=14886
Signed-off-by: Len Brown <len.brown@intel.com>
On x86, before prefill_possible_map(), nr_cpu_ids will be NR_CPUS aka
CONFIG_NR_CPUS.
Add nr_cpus= to set nr_cpu_ids. so we can simulate cpus <=8 are installed on
normal config.
-v2: accordging to Christoph, acpi_numa_init should use nr_cpu_ids in stead of
NR_CPUS.
-v3: add doc in kernel-parameters.txt according to Andrew.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1265793639-15071-34-git-send-email-yinghai@kernel.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Tony Luck <tony.luck@intel.com>
In its <asm/elf.h> ia64 defines SET_PERSONALITY in a way that unconditionally
sets the personality of the current process to PER_LINUX, losing any flag bits
from the upper 3 bytes of current->personality. This is wrong. Those bits are
intended to be inherited across exec (other code takes care of ensuring that
security sensitive bits like ADDR_NO_RANDOMIZE are not passed to unsuspecting
setuid/setgid applications).
Signed-off-by: Tony Luck <tony.luck@intel.com>
In particular, several occurances of funny versions of 'success',
'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address',
'beginning', 'desirable', 'separate' and 'necessary' are fixed.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: Joe Perches <joe@perches.com>
Cc: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This has been broken since May 2008 when Al Viro killed altroot support.
Since nobody has complained, it would appear that there are no users of
this code (A plausible theory since the main OSVs that support ia64 prefer
to use the IA32-EL software emulation).
Signed-off-by: Tony Luck <tony.luck@intel.com>
Pass the clocksource as an argument to the clocksource resume callback.
Needed so we can point out which CMT channel the sh_cmt.c driver shall
resume.
Signed-off-by: Magnus Damm <damm@opensource.se>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Disable kprobe booster when CONFIG_PREEMPT=y at this time,
because it can't ensure that all kernel threads preempted on
kprobe's boosted slot run out from the slot even using
freeze_processes().
The booster on preemptive kernel will be resumed if
synchronize_tasks() or something like that is introduced.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20100202214904.4694.24330.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Just a small change to a couple of scripts to go from
#!/usr/bin/env python
to
#!/usr/bin/python
This shouldn't effect anyone, unless they don't install python there.
In preparation for python3, Fedora is doing a big push to change the scripts
to use the system python. This allows developers to put the python3 in
their path without fear of breaking existing scripts.
Now I am pretty sure anyone using python3 for testing purposes will probably
not run any of the scripts I changed, but Fedora has this automated tool
that checks for this stuff so I thought I would try to push it upstream.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Acked-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
acpi_integer is now obsolete and removed from the ACPICA code base,
replaced by u64.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
What it is: vhost net is a character device that can be used to reduce
the number of system calls involved in virtio networking.
Existing virtio net code is used in the guest without modification.
There's similarity with vringfd, with some differences and reduced scope
- uses eventfd for signalling
- structures can be moved around in memory at any time (good for
migration, bug work-arounds in userspace)
- write logging is supported (good for migration)
- support memory table and not just an offset (needed for kvm)
common virtio related code has been put in a separate file vhost.c and
can be made into a separate module if/when more backends appear. I used
Rusty's lguest.c as the source for developing this part : this supplied
me with witty comments I wouldn't be able to write myself.
What it is not: vhost net is not a bus, and not a generic new system
call. No assumptions are made on how guest performs hypercalls.
Userspace hypervisors are supported as well as kvm.
How it works: Basically, we connect virtio frontend (configured by
userspace) to a backend. The backend could be a network device, or a tap
device. Backend is also configured by userspace, including vlan/mac
etc.
Status: This works for me, and I haven't see any crashes.
Compared to userspace, people reported improved latency (as I save up to
4 system calls per packet), as well as better bandwidth and CPU
utilization.
Features that I plan to look at in the future:
- mergeable buffers
- zero copy
- scalability tuning: figure out the best threading model to use
Note on RCU usage (this is also documented in vhost.h, near
private_pointer which is the value protected by this variant of RCU):
what is happening is that the rcu_dereference() is being used in a
workqueue item. The role of rcu_read_lock() is taken on by the start of
execution of the workqueue item, of rcu_read_unlock() by the end of
execution of the workqueue item, and of synchronize_rcu() by
flush_workqueue()/flush_work(). In the future we might need to apply
some gcc attribute or sparse annotation to the function passed to
INIT_WORK(). Paul's ack below is for this RCU usage.
(Includes fixes by Alan Cox <alan@linux.intel.com>,
David L Stevens <dlstevens@us.ibm.com>,
Chris Wright <chrisw@redhat.com>)
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
__per_cpu_idtrs is statically allocated ... on CONFIG_NR_CPUS=4096
systems it hogs 16MB of memory. This is way too much for a quite
probably unused facility (only KVM uses dynamic TR registers).
Change to an array of pointers, and allocate entries as needed on
a per cpu basis. Change the name too as the __per_cpu_ prefix is
confusing (this isn't a classic <linux/percpu.h> type object).
Signed-off-by: Tony Luck <tony.luck@intel.com>
Make sure compiler won't do weird things with limits. E.g. fetching
them twice may return 2 different values after writable limits are
implemented.
I.e. either use rlimit helpers added in
3e10e716abf3c71bdb5d86b8f507f9e72236c9cd
or ACCESS_ONCE if not applicable.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
pcibus_to_node can return -1 if we cannot determine which node a pci bus
is on. If passed -1, cpumask_of_node will negatively index the lookup array
and pull in random data:
# cat /sys/devices/pci0000:00/0000:00:01.0/local_cpus
00000000,00000003,00000000,00000000
# cat /sys/devices/pci0000:00/0000:00:01.0/local_cpulist
64-65
Change cpumask_of_node to check for -1 and return cpu_all_mask in this
case:
# cat /sys/devices/pci0000:00/0000:00:01.0/local_cpus
ffffffff,ffffffff,ffffffff,ffffffff
# cat /sys/devices/pci0000:00/0000:00:01.0/local_cpulist
0-127
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Len Brown pointed out that allmodconfig is broken for
ia64 because of:
arch/ia64/kvm/vmm.c: In function 'vmm_spin_unlock':
arch/ia64/kvm/vmm.c:70: error: 'spinlock_t' has no member named 'raw_lock'
KVM has it's own spinlock routines. It should not depend on the base kernel
spinlock_t type (which changed when ia64 switched to ticket locks). Define
its own vmm_spinlock_t type.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
The x86 and ia64 implementations of the function in $subject are
exactly the same.
Also, since the arch-specific implementations of setting _PDC have
been completely hollowed out, remove the empty shells.
Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The only thing arch-specific about calling _PDC is what bits get
set in the input obj_list buffer.
There's no need for several levels of indirection to twiddle those
bits. Additionally, since we're just messing around with a buffer,
we can simplify the interface; no need to pass around the entire
struct acpi_processor * just to get at the buffer.
Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Both x86 and ia64 initialize _PDC with mostly common bit settings.
Factor out the common settings and leave the arch-specific ones alone.
Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The x86 and ia64 implementations of arch_acpi_processor_init_pdc()
are almost exactly the same. The only difference is in what bits
they set in obj_list buffer.
Combine the boilerplate memory management code, and leave the
arch-specific bit twiddling in separate implementations.
Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
arch dependent helper function that tells us if we should attempt to
evaluate _PDC on this machine or not.
The x86 implementation assumes that the CPUs in the machine must be
homogeneous, and that you cannot mix CPUs of different vendors.
Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
* 'for-33' of git://repo.or.cz/linux-kbuild: (29 commits)
net: fix for utsrelease.h moving to generated
gen_init_cpio: fixed fwrite warning
kbuild: fix make clean after mismerge
kbuild: generate modules.builtin
genksyms: properly consider EXPORT_UNUSED_SYMBOL{,_GPL}()
score: add asm/asm-offsets.h wrapper
unifdef: update to upstream revision 1.190
kbuild: specify absolute paths for cscope
kbuild: create include/generated in silentoldconfig
scripts/package: deb-pkg: use fakeroot if available
scripts/package: add KBUILD_PKG_ROOTCMD variable
scripts/package: tar-pkg: use tar --owner=root
Kbuild: clean up marker
net: add net_tstamp.h to headers_install
kbuild: move utsrelease.h to include/generated
kbuild: move autoconf.h to include/generated
drop explicit include of autoconf.h
kbuild: move compile.h to include/generated
kbuild: drop include/asm
kbuild: do not check for include/asm-$ARCH
...
Fixed non-conflicting clean merge of modpost.c as per comments from
Stephen Rothwell (modpost.c had grown an include of linux/autoconf.h
that needed to be changed to generated/autoconf.h)
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (38 commits)
direct I/O fallback sync simplification
ocfs: stop using do_sync_mapping_range
cleanup blockdev_direct_IO locking
make generic_acl slightly more generic
sanitize xattr handler prototypes
libfs: move EXPORT_SYMBOL for d_alloc_name
vfs: force reval of target when following LAST_BIND symlinks (try #7)
ima: limit imbalance msg
Untangling ima mess, part 3: kill dead code in ima
Untangling ima mess, part 2: deal with counters
Untangling ima mess, part 1: alloc_file()
O_TRUNC open shouldn't fail after file truncation
ima: call ima_inode_free ima_inode_free
IMA: clean up the IMA counts updating code
ima: only insert at inode creation time
ima: valid return code from ima_inode_alloc
fs: move get_empty_filp() deffinition to internal.h
Sanitize exec_permission_lite()
Kill cached_lookup() and real_lookup()
Kill path_lookup_open()
...
Trivial conflicts in fs/direct-io.c
* git://git.infradead.org/iommu-2.6:
implement early_io{re,un}map for ia64
Revert "Intel IOMMU: Avoid memory allocation failures in dma map api calls"
intel-iommu: ignore page table validation in pass through mode
intel-iommu: Fix oops with intel_iommu=igfx_off
intel-iommu: Check for an RMRR which ends before it starts.
intel-iommu: Apply BIOS sanity checks for interrupt remapping too.
intel-iommu: Detect DMAR in hyperspace at probe time.
dmar: Fix build failure without NUMA, warn on bogus RHSA tables and don't abort
iommu: Allocate dma-remapping structures using numa locality info
intr_remap: Allocate intr-remapping table using numa locality info
dmar: Allocate queued invalidation structure using numa locality info
dmar: support for parsing Remapping Hardware Static Affinity structure
dma_mask is, when interpreted as address, the last valid byte, and hence
comparison msut also be done using the last valid of the buffer in
question.
Also fix the open-coded instances in lib/swiotlb.c.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Becky Bruce <beckyb@kernel.crashing.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently all architectures but microblaze unconditionally define
USE_ELF_CORE_DUMP. The microblaze omission seems like an error to me, so
let's kill this ifdef and make sure we are the same everywhere.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: <linux-arch@vger.kernel.org>
Cc: Michal Simek <michal.simek@petalogix.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Needed for commit 2c992208 ("intel-iommu: Detect DMAR in hyperspace at
probe time.) to build on IA64.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (26 commits)
clockevents: Convert to raw_spinlock
clockevents: Make tick_device_lock static
debugobjects: Convert to raw_spinlocks
perf_event: Convert to raw_spinlock
hrtimers: Convert to raw_spinlocks
genirq: Convert irq_desc.lock to raw_spinlock
smp: Convert smplocks to raw_spinlocks
rtmutes: Convert rtmutex.lock to raw_spinlock
sched: Convert pi_lock to raw_spinlock
sched: Convert cpupri lock to raw_spinlock
sched: Convert rt_runtime_lock to raw_spinlock
sched: Convert rq->lock to raw_spinlock
plist: Make plist debugging raw_spinlock aware
bkl: Fixup core_lock fallout
locking: Cleanup the name space completely
locking: Further name space cleanups
alpha: Fix fallout from locking changes
locking: Implement new raw_spinlock
locking: Convert raw_rwlock functions to arch_rwlock
locking: Convert raw_rwlock to arch_rwlock
...
Move definition of NUMA_NO_NODE from ia64 and x86_64 arch specific headers
to generic header 'linux/numa.h' for use in generic code. NUMA_NO_NODE
replaces bare '-1' where it's used in this series to indicate "no node id
specified". Ultimately, it can be used to replace the -1 elsewhere where
it is used similarly.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SPIN_LOCK_UNLOCKED is deprecated. Use __SPIN_LOCK_UNLOCKED instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-ia64@vger.kernel.org
Signed-off-by: Tony Luck <tony.luck@intel.com>
It's possible that SBA IOMMU might fail to find I/O space under heavy
I/Os. SBA IOMMU panics on allocation failure but it shouldn't; drivers
can handle the failure. The majority of other IOMMU drivers don't panic
on allocation failure.
This patch fixes SBA IOMMU path to handle allocation failure properly.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This is a patch related to this discussion.
http://www.spinics.net/lists/linux-ia64/msg07605.html
When INIT is sent, ip/psr/pfs register is stored to the I-resources
(iip/ipsr/ifs registers), and they are copied in the min-state save
area(pmsa_{iip,ipsr,ifs}).
Therefore, in creating pt_regs at ia64_mca_modify_original_stack(),
cr_{iip,ipsr,ifs} should be derived from pmsa_{iip,ipsr,ifs}. But
current code copies pmsa_{xip,xpsr,xfs} to cr_{iip,ipsr,ifs}
when PSR.ic is 0.
finish_pt_regs(struct pt_regs *regs, const pal_min_state_area_t *ms,
unsigned long *nat)
{
(snip)
if (ia64_psr(regs)->ic) {
regs->cr_iip = ms->pmsa_iip;
regs->cr_ipsr = ms->pmsa_ipsr;
regs->cr_ifs = ms->pmsa_ifs;
} else {
regs->cr_iip = ms->pmsa_xip;
regs->cr_ipsr = ms->pmsa_xpsr;
regs->cr_ifs = ms->pmsa_xfs;
}
It's ok when PSR.ic is not 0. But when PSR.ic is 0, this could be
a problem when we investigate kernel as the value of regs->cr_iip does
not point to where INIT really interrupted.
At first I tried to change finish_pt_regs() so that it uses always
pmsa_{iip,ipsr,ifs} for cr_{iip,ipsr,ifs}, but Keith Owens pointed out
it could cause another problem if I change it.
>The only problem I can think of is an MCA/INIT
>arriving while code like SAVE_MIN or SAVE_REST is executing. Back
>tracing at that point using pmsa_iip is going to be a problem, you have
>no idea what state the registers or stack are in.
I confirmed he was right, so I decided to keep it as-is and to
save pmsa_{iip,ipsr,ifs} to ia64_sal_os_state for debugging.
An attached patch is just adding new members into ia64_sal_os_state to
save pmsa_{iip,ipsr,ifs}.
Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Previously, we tried to use IA64_DEF_FIRST_DEVICE_VECTOR (0x30) as the
IA64_IRQ_MOVE_VECTOR. However, we allocate other IRQs from the device
vector range, so there's no guarantee that IA64_DEF_FIRST_DEVICE_VECTOR
will still be available when we register IA64_IRQ_MOVE_VECTOR.
This patch statically allocates 0x30 for IA64_IRQ_MOVE_VECTOR and
removes it from the device vector range.
Without this patch, we crash on machines like the HP rx3600 that use
vector 48 (0x30) as the ACPI SCI interrupt:
kernel BUG at arch/ia64/kernel/irq_ia64.c:647!
swapper[0]: bugcheck! 0 [1]
Modules linked in:
Pid: 0, CPU 0, comm: swapper
psr : 00001010084a2018 ifs : 800000000000030e ip : [<a000000100012ed0>] Not tainted (2.6.32-rc8-00184-gd5d4ec8)
ip is at ia64_native_register_percpu_irq+0x110/0x1e0
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Convert locks which cannot be sleeping locks in preempt-rt to
raw_spinlocks.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>