12659 Commits

Author SHA1 Message Date
Alexander Graf
616dff8602 KVM: PPC: Book3S PR: Handle Facility interrupt and FSCR
POWER8 introduced a new interrupt type called "Facility unavailable interrupt"
which contains its status message in a new register called FSCR.

Handle these exits and try to emulate instructions for unhandled facilities.
Follow-on patches enable KVM to expose specific facilities into the guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:22 +02:00
Alexander Graf
a5948fa092 KVM: PPC: Book3S PR: Emulate TIR register
In parallel to the Processor ID Register (PIR) threaded POWER8 also adds a
Thread ID Register (TIR). Since PR KVM doesn't emulate more than one thread
per core, we can just always expose 0 here.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:22 +02:00
Alexander Graf
f8f6eb0d18 KVM: PPC: Book3S PR: Ignore PMU SPRs
When we expose a POWER8 CPU into the guest, it will start accessing PMU SPRs
that we don't emulate. Just ignore accesses to them.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:22 +02:00
Alexander Graf
f24bc1ed45 KVM: PPC: Book3S: Move little endian conflict to HV KVM
With the previous patches applied, we can now successfully use PR KVM on
little endian hosts which means we can now allow users to select it.

However, HV KVM still needs some work, so let's keep the kconfig conflict
on that one.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:21 +02:00
Alexander Graf
cd087eefe6 KVM: PPC: Book3S PR: Do dcbz32 patching with big endian instructions
When the host CPU we're running on doesn't support dcbz32 itself, but the
guest wants to have dcbz only clear 32 bytes of data, we loop through every
executable mapped page to search for dcbz instructions and patch them with
a special privileged instruction that we emulate as dcbz32.

The only guests that want to see dcbz act as 32byte are book3s_32 guests, so
we don't have to worry about little endian instruction ordering. So let's
just always search for big endian dcbz instructions, also when we're on a
little endian host.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:21 +02:00
Alexander Graf
5deb8e7ad8 KVM: PPC: Make shared struct aka magic page guest endian
The shared (magic) page is a data structure that contains often used
supervisor privileged SPRs accessible via memory to the user to reduce
the number of exits we have to take to read/write them.

When we actually share this structure with the guest we have to maintain
it in guest endianness, because some of the patch tricks only work with
native endian load/store operations.

Since we only share the structure with either host or guest in little
endian on book3s_64 pr mode, we don't have to worry about booke or book3s hv.

For booke, the shared struct stays big endian. For book3s_64 hv we maintain
the struct in host native endian, since it never gets shared with the guest.

For book3s_64 pr we introduce a variable that tells us which endianness the
shared struct is in and route every access to it through helper inline
functions that evaluate this variable.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:21 +02:00
Alexander Graf
2743103f91 KVM: PPC: PR: Fill pvinfo hcall instructions in big endian
We expose a blob of hypercall instructions to user space that it gives to
the guest via device tree again. That blob should contain a stream of
instructions necessary to do a hypercall in big endian, as it just gets
passed into the guest and old guests use them straight away.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:20 +02:00
Alexander Graf
b59d9d26be KVM: PPC: Book3S PR: PAPR: Access RTAS in big endian
When the guest does an RTAS hypercall it keeps all RTAS variables inside a
big endian data structure.

To make sure we don't have to bother about endianness inside the actual RTAS
handlers, let's just convert the whole structure to host endian before we
call our RTAS handlers and back to big endian when we return to the guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:20 +02:00
Alexander Graf
1692aa3faa KVM: PPC: Book3S PR: PAPR: Access HTAB in big endian
The HTAB on PPC is always in big endian. When we access it via hypercalls
on behalf of the guest and we're running on a little endian host, we need
to make sure we swap the bits accordingly.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:20 +02:00
Alexander Graf
94810ba4ed KVM: PPC: Book3S PR: Default to big endian guest
The default MSR when user space does not define anything should be identical
on little and big endian hosts, so remove MSR_LE from it.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:20 +02:00
Alexander Graf
14a7d41dad KVM: PPC: Book3S_64 PR: Access shadow slb in big endian
The "shadow SLB" in the PACA is shared with the hypervisor, so it has to
be big endian. We access the shadow SLB during world switch, so let's make
sure we access it in big endian even when we're on a little endian host.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:19 +02:00
Alexander Graf
4e509af9f8 KVM: PPC: Book3S_64 PR: Access HTAB in big endian
The HTAB is always big endian. We access the guest's HTAB using
copy_from/to_user, but don't yet take care of the fact that we might
be running on an LE host.

Wrap all accesses to the guest HTAB with big endian accessors.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:19 +02:00
Alexander Graf
860540bc50 KVM: PPC: Book3S_32: PR: Access HTAB in big endian
The HTAB is always big endian. We access the guest's HTAB using
copy_from/to_user, but don't yet take care of the fact that we might
be running on an LE host.

Wrap all accesses to the guest HTAB with big endian accessors.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:19 +02:00
Alexander Graf
740f834eb2 KVM: PPC: Book3S: PR: Fix C/R bit setting
Commit 9308ab8e2d made C/R HTAB updates go byte-wise into the target HTAB.
However, it didn't update the guest's copy of the HTAB, but instead the
host local copy of it.

Write to the guest's HTAB instead.

Signed-off-by: Alexander Graf <agraf@suse.de>
CC: Paul Mackerras <paulus@samba.org>
Acked-by: Paul Mackerras <paulus@samba.org>
2014-05-30 14:26:18 +02:00
Aneesh Kumar K.V
7562c4fded KVM: PPC: BOOK3S: PR: Fix WARN_ON with debug options on
With debug option "sleep inside atomic section checking" enabled we get
the below WARN_ON during a PR KVM boot. This is because upstream now
have PREEMPT_COUNT enabled even if we have preempt disabled. Fix the
warning by adding preempt_disable/enable around floating point and altivec
enable.

WARNING: at arch/powerpc/kernel/process.c:156
Modules linked in: kvm_pr kvm
CPU: 1 PID: 3990 Comm: qemu-system-ppc Tainted: G        W     3.15.0-rc1+ #4
task: c0000000eb85b3a0 ti: c0000000ec59c000 task.ti: c0000000ec59c000
NIP: c000000000015c84 LR: d000000003334644 CTR: c000000000015c00
REGS: c0000000ec59f140 TRAP: 0700   Tainted: G        W      (3.15.0-rc1+)
MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI>  CR: 42000024  XER: 20000000
CFAR: c000000000015c24 SOFTE: 1
GPR00: d000000003334644 c0000000ec59f3c0 c000000000e2fa40 c0000000e2f80000
GPR04: 0000000000000800 0000000000002000 0000000000000001 8000000000000000
GPR08: 0000000000000001 0000000000000001 0000000000002000 c000000000015c00
GPR12: d00000000333da18 c00000000fb80900 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 00003fffce4e0fa1
GPR20: 0000000000000010 0000000000000001 0000000000000002 00000000100b9a38
GPR24: 0000000000000002 0000000000000000 0000000000000000 0000000000000013
GPR28: 0000000000000000 c0000000eb85b3a0 0000000000002000 c0000000e2f80000
NIP [c000000000015c84] .enable_kernel_fp+0x84/0x90
LR [d000000003334644] .kvmppc_handle_ext+0x134/0x190 [kvm_pr]
Call Trace:
[c0000000ec59f3c0] [0000000000000010] 0x10 (unreliable)
[c0000000ec59f430] [d000000003334644] .kvmppc_handle_ext+0x134/0x190 [kvm_pr]
[c0000000ec59f4c0] [d00000000324b380] .kvmppc_set_msr+0x30/0x50 [kvm]
[c0000000ec59f530] [d000000003337cac] .kvmppc_core_emulate_op_pr+0x16c/0x5e0 [kvm_pr]
[c0000000ec59f5f0] [d00000000324a944] .kvmppc_emulate_instruction+0x284/0xa80 [kvm]
[c0000000ec59f6c0] [d000000003336888] .kvmppc_handle_exit_pr+0x488/0xb70 [kvm_pr]
[c0000000ec59f790] [d000000003338d34] kvm_start_lightweight+0xcc/0xdc [kvm_pr]
[c0000000ec59f960] [d000000003336288] .kvmppc_vcpu_run_pr+0xc8/0x190 [kvm_pr]
[c0000000ec59f9f0] [d00000000324c880] .kvmppc_vcpu_run+0x30/0x50 [kvm]
[c0000000ec59fa60] [d000000003249e74] .kvm_arch_vcpu_ioctl_run+0x54/0x1b0 [kvm]
[c0000000ec59faf0] [d000000003244948] .kvm_vcpu_ioctl+0x478/0x760 [kvm]
[c0000000ec59fcb0] [c000000000224e34] .do_vfs_ioctl+0x4d4/0x790
[c0000000ec59fd90] [c000000000225148] .SyS_ioctl+0x58/0xb0
[c0000000ec59fe30] [c00000000000a1e4] syscall_exit+0x0/0x98

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:18 +02:00
Aneesh Kumar K.V
e5ee5422f8 KVM: PPC: BOOK3S: PR: Enable Little Endian PR guest
This patch make sure we inherit the LE bit correctly in different case
so that we can run Little Endian distro in PR mode

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:18 +02:00
Alexander Graf
8f20a3ab27 KVM: PPC: E500: Add dcbtls emulation
The dcbtls instruction is able to lock data inside the L1 cache.

We don't want to give the guest actual access to hardware cache locks,
as that could influence other VMs on the same system. But we can tell
the guest that its locking attempt failed.

By implementing the instruction we at least don't give the guest a
program exception which it definitely does not expect.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:17 +02:00
Alexander Graf
07fec1c2e7 KVM: PPC: E500: Ignore L1CSR1_ICFI,ICLFR
The L1 instruction cache control register contains bits that indicate
that we're still handling a request. Mask those out when we set the SPR
so that a read doesn't assume we're still doing something.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30 14:26:17 +02:00
Bjorn Helgaas
fdaf36bd36 Merge branch 'pci/misc' into next
* pci/misc:
  PCI: Fix return value from pci_user_{read,write}_config_*()
  PCI: Turn pcibios_penalize_isa_irq() into a weak function
  PCI: Test for std config alias when testing extended config space
2014-05-28 16:21:25 -06:00
Bjorn Helgaas
d1a2523d2a Merge branches 'pci/hotplug', 'pci/pci_is_bridge' and 'pci/virtualization' into next
* pci/hotplug:
  PCI: cpqphp: Fix possible null pointer dereference
  NVMe: Implement PCIe reset notification callback
  PCI: Notify driver before and after device reset

* pci/pci_is_bridge:
  pcmcia: Use pci_is_bridge() to simplify code
  PCI: pciehp: Use pci_is_bridge() to simplify code
  PCI: acpiphp: Use pci_is_bridge() to simplify code
  PCI: cpcihp: Use pci_is_bridge() to simplify code
  PCI: shpchp: Use pci_is_bridge() to simplify code
  PCI: rpaphp: Use pci_is_bridge() to simplify code
  sparc/PCI: Use pci_is_bridge() to simplify code
  powerpc/PCI: Use pci_is_bridge() to simplify code
  ia64/PCI: Use pci_is_bridge() to simplify code
  x86/PCI: Use pci_is_bridge() to simplify code
  PCI: Use pci_is_bridge() to simplify code
  PCI: Add new pci_is_bridge() interface
  PCI: Rename pci_is_bridge() to pci_has_subordinate()

* pci/virtualization:
  PCI: Introduce new device binding path using pci_dev.driver_override

Conflicts:
	drivers/pci/pci-sysfs.c
2014-05-28 16:21:07 -06:00
Greg Kroah-Hartman
7ca22cfa0f USB: remove CONFIG_USB_DEBUG from defconfig files
Now that CONFIG_USB_DEBUG is gone, remove it from a number of defconfig
files that were enabling it.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-28 09:40:45 -07:00
Linus Torvalds
4efdedca93 Small fixes for x86, slightly larger fixes for PPC, and a forgotten s390 patch.
The PPC fixes are important because they fix breakage that is new in 3.15.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJTdivEAAoJEBvWZb6bTYbyw3YQAIILnflhHNtklj1mfPnnibQf
 c3BLCkJ0gtK6A0FO2aAHgSja0kpgbEEnSphE/A/cb0vkLon3n5O0pQoSKjGUUbBO
 Mo0ndjzBYNmCP4MGxhkrg49VdqD40NaR0BjJAZudb4vUOw892WLFIJMIVmIqs9eG
 8V/y6S7mPLmrooAKHZxXql9y30UC77T1VZ3r4pXwYgKtUT51BQfTyWiSfjQBa8yI
 oGOSb8uqEC7YiOYPJYUNIMsyVqW4E6Qqs46rqtP4XZmSxzWXDzzgP4nQHHyJJCdZ
 aBYkeG+sJZG7ZwleJLejAncjWUY9Oq9GkMYNj0cTAoP/zA6jBGAll96KGKRbes9z
 bZUtCNL3ifLcgbIGeAxgjmYOq0XLGahHbqm9QISYW2XdRkBI+8EJs5FCP4YEHzZn
 FSm3zcCQ+wtbqjBbZZcqqLa6A/CGzjyO26qz+BCxrZ0BQkQX/2am3UykQ0JWam3H
 vX5ZM2ewJhs6SjFisPcswd20AN+SHjPyzPvErBLDfrqnAVbwj2ehgqyN2slVsqrj
 UyGzeKCfJgA0TiEH/4K6j6hvQWynUU+/2JglIfGE6AXmWddazCzl/qx4LvuGKFoB
 b8JSQ7YaHSsq/tHc8WhHkvcP0FSDZEiHcJN2iY1pwLKTSQp9JN3aPNruPKiO8dsW
 N+LoHL5fFcDi6Uu6wS7w
 =E2fU
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Small fixes for x86, slightly larger fixes for PPC, and a forgotten
  s390 patch.  The PPC fixes are important because they fix breakage
  that is new in 3.15"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: s390: announce irqfd capability
  KVM: x86: disable master clock if TSC is reset during suspend
  KVM: vmx: disable APIC virtualization in nested guests
  KVM guest: Make pv trampoline code executable
  KVM: PPC: Book3S: ifdef on CONFIG_KVM_BOOK3S_32_HANDLER for 32bit
  KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exit
  KVM: PPC: Book3S: HV: make _PAGE_NUMA take effect
2014-05-28 08:08:03 -07:00
Sam bobroff
1739ea9e13 powerpc: Fix regression of per-CPU DSCR setting
Since commit "efcac65 powerpc: Per process DSCR + some fixes (try#4)"
it is no longer possible to set the DSCR on a per-CPU basis.

The old behaviour was to minipulate the DSCR SPR directly but this is no
longer sufficient: the value is quickly overwritten by context switching.

This patch stores the per-CPU DSCR value in a kernel variable rather than
directly in the SPR and it is used whenever a process has not set the DSCR
itself. The sysfs interface (/sys/devices/system/cpu/cpuN/dscr) is unchanged.

Writes to the old global default (/sys/devices/system/cpu/dscr_default)
now set all of the per-CPU values and reads return the last written value.

The new per-CPU default is added to the paca_struct and is used everywhere
outside of sysfs.c instead of the old global default.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-28 13:35:40 +10:00
Sam bobroff
39a360ef72 powerpc: Split __SYSFS_SPRSETUP macro
Split the __SYSFS_SPRSETUP macro into two parts so that registers requiring
custom read and write functions can use common code for their show and store
functions.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-28 13:35:39 +10:00
Rickard Strandqvist
b717d98543 arch: powerpc/fadump: Cleaning up inconsistent NULL checks
Cleaning up inconsistent NULL checks.
There is otherwise a risk of a possible null pointer dereference.

Was largely found by using a static code analysis program called cppcheck.

Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-28 13:35:39 +10:00
Preeti U Murthy
4750afa2c5 powerpc: Fix comment around arch specific definition of RECLAIM_DISTANCE
Commit 32e45ff43eaf5c17f changed the default value of
RECLAIM_DISTANCE to 30. However the comment around arch
specifc definition of RECLAIM_DISTANCE is not updated to
reflect the same. Correct the value mentioned in the comment.

Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: KOSAKI Motohiro <Kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-28 13:35:38 +10:00
Michael Ellerman
e2186023f2 powerpc/powernv: Add support for POWER8 split core on powernv
Upcoming POWER8 chips support a concept called split core. This is where the
core can be split into subcores that although not full cores, are able to
appear as full cores to a guest.

The splitting & unsplitting procedure is mildly complicated, and explained at
length in the comments within the patch.

One notable detail is that when splitting or unsplitting we need to pull
offline cpus out of their offline state to do work as part of the procedure.

The interface for changing the split mode is via a sysfs file, eg:

 $ echo 2 > /sys/devices/system/cpu/subcores_per_core

Currently supported values are '1', '2' and '4'. And indicate respectively that
the core should be unsplit, split in half, and split in quarters. These modes
correspond to threads_per_subcore of 8, 4 and 2.

We do not allow changing the split mode while KVM VMs are active. This is to
prevent the value changing while userspace is configuring the VM, and also to
prevent the mode being changed in such a way that existing guests are unable to
be run.

CPU hotplug fixes by Srivatsa.  max_cpus fixes by Mahesh.  cpuset fixes by
benh.  Fix for irq race by paulus.  The rest by mikey and mpe.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-28 13:35:37 +10:00
Michael Ellerman
3102f7843c powerpc/kvm/book3s_hv: Use threads_per_subcore in KVM
To support split core on POWER8 we need to modify various parts of the
KVM code to use threads_per_subcore instead of threads_per_core. On
systems that do not support split core threads_per_subcore ==
threads_per_core and these changes are a nop.

We use threads_per_subcore as the value reported by KVM_CAP_PPC_SMT.
This communicates to userspace that guests can only be created with
a value of threads_per_core that is less than or equal to the current
threads_per_subcore. This ensures that guests can only be created with a
thread configuration that we are able to run given the current split
core mode.

Although threads_per_subcore can change during the life of the system,
the commit that enables that will ensure that threads_per_subcore does
not change during the life of a KVM VM.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Alexander Graf <agraf@suse.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-28 13:35:37 +10:00
Michael Ellerman
6f5e40a300 powerpc: Check cpu_thread_in_subcore() in __cpu_up()
To support split core we need to change the check in __cpu_up() that
determines if a cpu is allowed to come online.

Currently we refuse to online cpus which are not the primary thread
within their core.

On POWER8 with split core support this check needs to instead refuse to
online cpus which are not the primary thread within their *sub* core.

On POWER7 and other systems that do not support split core,
threads_per_subcore == threads_per_core and so the check is equivalent.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-28 13:35:36 +10:00
Michael Ellerman
5853aef1ac powerpc: Add threads_per_subcore
On POWER8 we have a new concept of a subcore. This is what happens when
you take a regular core and split it. A subcore is a grouping of two or
four SMT threads, as well as a handfull of SPRs which allows the subcore
to appear as if it were a core from the point of view of a guest.

Unlike threads_per_core which is fixed at boot, threads_per_subcore can
change while the system is running. Most code will not want to use
threads_per_subcore.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-28 13:35:35 +10:00
Michael Ellerman
8d6f7c5aa3 powerpc/powernv: Make it possible to skip the IRQHAPPENED check in power7_nap()
To support split core we need to be able to force all secondaries into
nap, so the core can detect they are idle and do an unsplit.

Currently power7_nap() will return without napping if there is an irq
pending. We want to ignore the pending irq and nap anyway, we will deal
with the interrupt later.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-28 13:35:35 +10:00
Michael Ellerman
441c19c8a2 powerpc/kvm/book3s_hv: Rework the secondary inhibit code
As part of the support for split core on POWER8, we want to be able to
block splitting of the core while KVM VMs are active.

The logic to do that would be exactly the same as the code we currently
have for inhibiting onlining of secondaries.

Instead of adding an identical mechanism to block split core, rework the
secondary inhibit code to be a "HV KVM is active" check. We can then use
that in both the cpu hotplug code and the upcoming split core code.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Alexander Graf <agraf@suse.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-28 13:35:34 +10:00
Nishanth Aravamudan
64bb80d87f powerpc/numa: Enable CONFIG_HAVE_MEMORYLESS_NODES
Based off fd1197f1 for ia64, enable CONFIG_HAVE_MEMORYLESS_NODES if
NUMA. Initialize the local memory node in start_secondary.

With this commit and the preceding to enable
CONFIG_USER_PERCPU_NUMA_NODE_ID, which is a prerequisite, in a PowerKVM
guest with the following topology:

numactl --hardware
available: 3 nodes (0-2)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46
47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94
95 96 97 98 99
node 0 size: 1998 MB
node 0 free: 521 MB
node 1 cpus: 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114
115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132
133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150
151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168
169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186
187 188 189 190 191 192 193 194 195 196 197 198 199
node 1 size: 0 MB
node 1 free: 0 MB
node 2 cpus:
node 2 size: 2039 MB
node 2 free: 1739 MB
node distances:
node   0   1   2
  0:  10  40  40
  1:  40  10  40
  2:  40  40  10

the unreclaimable slab is reduced by close to 130M:

Before:
        Slab:             418176 kB
        SReclaimable:      26624 kB
        SUnreclaim:       391552 kB

After:
        Slab:             298944 kB
        SReclaimable:      31744 kB
        SUnreclaim:       267200 kB

Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-28 13:35:34 +10:00
Nishanth Aravamudan
8c27226119 powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID
Based off 3bccd996 for ia64, convert powerpc to use the generic per-CPU
topology tracking, specifically:

    initialize per cpu numa_node entry in start_secondary
    remove the powerpc cpu_to_node()
    define CONFIG_USE_PERCPU_NUMA_NODE_ID if NUMA

Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-28 13:35:33 +10:00
Benjamin Herrenschmidt
86969cf733 Merge branch 'merge' into next
Merge the binutils and kexec fixes.
2014-05-28 13:30:12 +10:00
Srivatsa S. Bhat
011e4b02f1 powerpc, kexec: Fix "Processor X is stuck" issue during kexec from ST mode
If we try to perform a kexec when the machine is in ST (Single-Threaded) mode
(ppc64_cpu --smt=off), the kexec operation doesn't succeed properly, and we
get the following messages during boot:

[    0.089866] POWER8 performance monitor hardware support registered
[    0.089985] power8-pmu: PMAO restore workaround active.
[    5.095419] Processor 1 is stuck.
[   10.097933] Processor 2 is stuck.
[   15.100480] Processor 3 is stuck.
[   20.102982] Processor 4 is stuck.
[   25.105489] Processor 5 is stuck.
[   30.108005] Processor 6 is stuck.
[   35.110518] Processor 7 is stuck.
[   40.113369] Processor 9 is stuck.
[   45.115879] Processor 10 is stuck.
[   50.118389] Processor 11 is stuck.
[   55.120904] Processor 12 is stuck.
[   60.123425] Processor 13 is stuck.
[   65.125970] Processor 14 is stuck.
[   70.128495] Processor 15 is stuck.
[   75.131316] Processor 17 is stuck.

Note that only the sibling threads are stuck, while the primary threads (0, 8,
16 etc) boot just fine. Looking closer at the previous step of kexec, we observe
that kexec tries to wakeup (bring online) the sibling threads of all the cores,
before performing kexec:

[ 9464.131231] Starting new kernel
[ 9464.148507] kexec: Waking offline cpu 1.
[ 9464.148552] kexec: Waking offline cpu 2.
[ 9464.148600] kexec: Waking offline cpu 3.
[ 9464.148636] kexec: Waking offline cpu 4.
[ 9464.148671] kexec: Waking offline cpu 5.
[ 9464.148708] kexec: Waking offline cpu 6.
[ 9464.148743] kexec: Waking offline cpu 7.
[ 9464.148779] kexec: Waking offline cpu 9.
[ 9464.148815] kexec: Waking offline cpu 10.
[ 9464.148851] kexec: Waking offline cpu 11.
[ 9464.148887] kexec: Waking offline cpu 12.
[ 9464.148922] kexec: Waking offline cpu 13.
[ 9464.148958] kexec: Waking offline cpu 14.
[ 9464.148994] kexec: Waking offline cpu 15.
[ 9464.149030] kexec: Waking offline cpu 17.

Instrumenting this piece of code revealed that the cpu_up() operation actually
fails with -EBUSY. Thus, only the primary threads of all the cores are online
during kexec, and hence this is a sure-shot receipe for disaster, as explained
in commit e8e5c2155b (powerpc/kexec: Fix orphaned offline CPUs across kexec),
as well as in the comment above wake_offline_cpus().

It turns out that cpu_up() was returning -EBUSY because the variable
'cpu_hotplug_disabled' was set to 1; and this disabling of CPU hotplug was done
by migrate_to_reboot_cpu() inside kernel_kexec().

Now, migrate_to_reboot_cpu() was originally written with the assumption that
any further code will not need to perform CPU hotplug, since we are anyway in
the reboot path. However, kexec is clearly not such a case, since we depend on
onlining CPUs, atleast on powerpc.

So re-enable cpu-hotplug after returning from migrate_to_reboot_cpu() in the
kexec path, to fix this regression in kexec on powerpc.

Also, wrap the cpu_up() in powerpc kexec code within a WARN_ON(), so that we
can catch such issues more easily in the future.

Fixes: c97102ba963 (kexec: migrate to reboot cpu)
Cc: stable@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-28 13:24:26 +10:00
Guenter Roeck
7998eb3dc7 powerpc: Fix 64 bit builds with binutils 2.24
With binutils 2.24, various 64 bit builds fail with relocation errors
such as

arch/powerpc/kernel/built-in.o: In function `exc_debug_crit_book3e':
	(.text+0x165ee): relocation truncated to fit: R_PPC64_ADDR16_HI
	against symbol `interrupt_base_book3e' defined in .text section
	in arch/powerpc/kernel/built-in.o
arch/powerpc/kernel/built-in.o: In function `exc_debug_crit_book3e':
	(.text+0x16602): relocation truncated to fit: R_PPC64_ADDR16_HI
	against symbol `interrupt_end_book3e' defined in .text section
	in arch/powerpc/kernel/built-in.o

The assembler maintainer says:

 I changed the ABI, something that had to be done but unfortunately
 happens to break the booke kernel code.  When building up a 64-bit
 value with lis, ori, shl, oris, ori or similar sequences, you now
 should use @high and @higha in place of @h and @ha.  @h and @ha
 (and their associated relocs R_PPC64_ADDR16_HI and R_PPC64_ADDR16_HA)
 now report overflow if the value is out of 32-bit signed range.
 ie. @h and @ha assume you're building a 32-bit value. This is needed
 to report out-of-range -mcmodel=medium toc pointer offsets in @toc@h
 and @toc@ha expressions, and for consistency I did the same for all
 other @h and @ha relocs.

Replacing @h with @high in one strategic location fixes the relocation
errors. This has to be done conditionally since the assembler either
supports @h or @high but not both.

Cc: <stable@vger.kernel.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-28 13:24:05 +10:00
Benjamin Herrenschmidt
b9d800959e Merge remote-tracking branch 'scott/next' into next
<<
Highlights include a few new boards, a device tree binding for CCF
(including backwards-compatible device tree updates to distinguish
incompatible versions), and some fixes.
>>
2014-05-28 10:02:58 +10:00
Naoki MATSUMOTO
3a0d89d3f8 USB: delete CONFIG_USB_DEVICEFS from defconfig
It no longer occurs in Kconfig.
USB: remove CONFIG_USB_DEVICEFS(fb28d58b) leaked remove defconfig.

Signed-off-by: Naoki MATSUMOTO <nekomatu+linux@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-27 16:07:13 -07:00
Hanjun Guo
a43ae58c84 PCI: Turn pcibios_penalize_isa_irq() into a weak function
pcibios_penalize_isa_irq() is only implemented by x86 now, and legacy ISA
is not used by some architectures.  Make pcibios_penalize_isa_irq() a
__weak function to simplify the code.  This removes the need for new
platforms to add stub implementations of pcibios_penalize_isa_irq().

[bhelgaas: changelog, comments]
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
2014-05-27 16:23:58 -06:00
Yijing Wang
c888770eb2 powerpc/PCI: Use pci_is_bridge() to simplify code
Use pci_is_bridge() to simplify code.  No functional change.

Requires: 326c1cdae741 PCI: Rename pci_is_bridge() to pci_has_subordinate()
Requires: 1c86438c9423 PCI: Add new pci_is_bridge() interface
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-05-27 14:57:36 -06:00
David S. Miller
54e5c4def0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/bonding/bond_alb.c
	drivers/net/ethernet/altera/altera_msgdma.c
	drivers/net/ethernet/altera/altera_sgdma.c
	net/ipv6/xfrm6_output.c

Several cases of overlapping changes.

The xfrm6_output.c has a bug fix which overlaps the renaming
of skb->local_df to skb->ignore_df.

In the Altera TSE driver cases, the register access cleanups
in net-next overlapped with bug fixes done in net.

Similarly a bug fix to send ALB packets in the bonding driver using
the right source address overlaps with cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-24 00:32:30 -04:00
Grant Likely
ba52464a62 of: Stop naming platform_device using dcr address
There is now a way to ensure all platform devices get a unique name when
populated from the device tree, and the DCR_NATIVE code path is broken
anyway. PowerPC Cell (PS3) is the only platform that actually uses this
path.  Most likely nobody will notice if it is killed. Remove the code
and associated ugly #ifdef.

The user-visible impact of this patch is that any DCR device on Cell
will get a new name in the /sys/devices hierarchy.

Signed-off-by: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-23 08:28:02 +09:00
Scott Wood
e83eb028bb powerpc/fsl: Add fsl,portid-mapping to corenet1-cf chips
Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: Diana Craciun <diana.craciun@freescale.com>
2014-05-22 18:10:42 -05:00
Alexander Graf
8cb59788b3 PPC: ePAPR: Fix hypercall on LE guest
We get an array of instructions from the hypervisor via device tree that
we write into a buffer that gets executed whenever we want to make an
ePAPR compliant hypercall.

However, the hypervisor passes us these instructions in BE order which
we have to manually convert to LE when we want to run them in LE mode.

With this fixup in place, I can successfully run LE kernels with KVM
PV enabled on PR KVM.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-05-22 18:08:33 -05:00
harninder rai
1be62c6cce powerpc/mpc85xx: Add BSC9132 QDS Support
- BSC9132 is an integrated device that targets Femto base station market.
  It combines Power Architecture e500v2 and DSP StarCore SC3850 technologies
  with MAPLE-B2F baseband acceleration processing elements

- BSC9132QDS Overview
     2Gbyte DDR3 (on board DDR)
     32Mbyte 16bit NOR flash
     128Mbyte 2K page size NAND Flash
     256 Kbit M24256 I2C EEPROM
     128 Mbit SPI Flash memory
     SD slot
     eTSEC1: Connected to SGMII PHY
     eTSEC2: Connected to SGMII PHY
     DUART interface: supports one UARTs up to 115200 bps for console display

Signed-off-by: Harninder Rai <harninder.rai@freescale.com>
Signed-off-by: Ruchika Gupta <ruchika.gupta@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-05-22 18:08:32 -05:00
Lijun Pan
fd7e5b7a87 powerpc/mpc85xx: Remove P1023 RDS support
P1023RDS is no longer supported/manufactured by Freescale while P1023RDB is.

Signed-off-by: Lijun Pan <Lijun.Pan@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-05-22 18:08:31 -05:00
Scott Wood
aa80581da1 powerpc/mpic: Don't init the fsl error int until after mpic init
Besides other potential problems, if MPIC_NO_RESET is  not set,
the error interrupt will be masked after it is requested.

Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-05-22 18:08:30 -05:00
Prabhakar Kushwaha
0c0fc4d3a9 powerpc/fsl-booke: Add initial T104x_QDS board support
Add support for T104x board in board file t104x_qds.c, It is common for
 both T1040 and T1042 as they share same QDS board.

 T1040QDS board Overview
 -----------------------
 - SERDES Connections, 8 lanes supporting:
      — PCI Express: supporting Gen 1 and Gen 2;
      — SGMII
      — QSGMII
      — SATA 2.0
      — Aurora debug with dedicated connectors (T1040 only)
 - DDR Controller
     - Supports rates of up to 1600 MHz data-rate
     - Supports one DDR3LP UDIMM/RDIMMs, of single-, dual- or quad-rank types.
 -IFC/Local Bus
     - NAND flash: 8-bit, async, up to 2GB.
     - NOR: 8-bit or 16-bit, non-multiplexed, up to 512MB
     - GASIC: Simple (minimal) target within Qixis FPGA
     - PromJET rapid memory download support
 - Ethernet
     - Two on-board RGMII 10/100/1G ethernet ports.
     - PHY #0 remains powered up during deep-sleep (T1040 only)
 - QIXIS System Logic FPGA
 - Clocks
     - System and DDR clock (SYSCLK, “DDRCLK”)
     - SERDES clocks
 - Power Supplies
 - Video
     - DIU supports video at up to 1280x1024x32bpp
 - USB
     - Supports two USB 2.0 ports with integrated PHYs
     — Two type A ports with 5V@1.5A per port.
     — Second port can be converted to OTG mini-AB
 - SDHC
     - SDHC port connects directly to an adapter card slot, featuring:
     - Supporting SD slots for: SD, SDHC (1x, 4x, 8x) and/or MMC
     — Supporting eMMC memory devices
 - SPI
    -  On-board support of 3 different devices and sizes
 - Other IO
    - Two Serial ports
    - ProfiBus port
    - Four I2C ports

Add T104xQDS support in Kconfig and Makefile. Also create device tree.
Following features are currently not implmented.
  - SerDes: Aurora
  - IFC: GASIC, Promjet
  - QIXIS
  - Ethernet
  - DIU
  - power supplies management
  - ProfiBus

Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com>
Signed-off-by: Poonam Aggrwal <poonam.aggrwal@freescale.com>
Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-05-22 18:08:29 -05:00
Prabhakar Kushwaha
fb734eeebf powerpc/mpc85xx:Add initial device tree support of T104x
The QorIQ T1040/T1042 processor support four integrated 64-bit e5500 PA
processor cores with high-performance data path acceleration architecture
and network peripheral interfaces required for networking & telecommunications.

T1042 personality is a reduced personality of T1040 without Integrated 8-port
Gigabit Ethernet switch.

The T1040/T1042 SoC includes the following function and features:

 - Four e5500 cores, each with a private 256 KB L2 cache
 - 256 KB shared L3 CoreNet platform cache (CPC)
 - Interconnect CoreNet platform
 - 32-/64-bit DDR3L/DDR4 SDRAM memory controller with ECC and interleaving
   support
 - Data Path Acceleration Architecture (DPAA) incorporating acceleration
 for the following functions:
    -  Packet parsing, classification, and distribution
    -  Queue management for scheduling, packet sequencing, and congestion
    	management
    -  Cryptography Acceleration (SEC 5.0)
    - RegEx Pattern Matching Acceleration (PME 2.2)
    - IEEE Std 1588 support
    - Hardware buffer management for buffer allocation and deallocation
 - Ethernet interfaces
    - Integrated 8-port Gigabit Ethernet switch (T1040 only)
    - Four 1 Gbps Ethernet controllers
 - Two RGMII interfaces or one RGMII and one MII interfaces
 - High speed peripheral interfaces
   - Four PCI Express 2.0 controllers running at up to 5 GHz
   - Two SATA controllers supporting 1.5 and 3.0 Gb/s operation
   - Upto two QSGMII interface
   - Upto six SGMII interface supporting 1000 Mbps
   - One SGMII interface supporting upto 2500 Mbps
 - Additional peripheral interfaces
   - Two USB 2.0 controllers with integrated PHY
   - SD/eSDHC/eMMC
   -  eSPI controller
   - Four I2C controllers
   - Four UARTs
   - Four GPIO controllers
   - Integrated flash controller (IFC)
   - Change this to  LCD/ HDMI interface (DIU) with 12 bit dual data rate
   - TDM interface
 - Multicore programmable interrupt controller (PIC)
 - Two 8-channel DMA engines
 - Single source clocking implementation
 - Deep Sleep power implementaion (wakeup from GPIO/Timer/Ethernet/USB)

Signed-off-by: Poonam Aggrwal <poonam.aggrwal@freescale.com>
Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com>
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-05-22 18:08:29 -05:00