Commit Graph

19074 Commits

Author SHA1 Message Date
Bandan Das
19677e32fe KVM: nVMX: rearrange get_vmx_mem_address
Our common function for vmptr checks (in 2/4) needs to fetch
the memory address

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-06 18:59:57 +02:00
Ulrich Obergfell
1171903d89 KVM: x86: improve the usability of the 'kvm_pio' tracepoint
This patch moves the 'kvm_pio' tracepoint to emulator_pio_in_emulated()
and emulator_pio_out_emulated(), and it adds an argument (a pointer to
the 'pio_data'). A single 8-bit or 16-bit or 32-bit data item is fetched
from 'pio_data' (depending on 'size'), and the value is included in the
trace record ('val'). If 'count' is greater than one, this is indicated
by the string "(...)" in the trace output.

Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-05 22:42:05 +02:00
Marcelo Tosatti
e4c9a5a175 KVM: x86: expose invariant tsc cpuid bit (v2)
Invariant TSC is a property of TSC, no additional
support code necessary.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-04-29 15:22:43 +02:00
Xiao Guangrong
198c74f43f KVM: MMU: flush tlb out of mmu lock when write-protect the sptes
Now we can flush all the TLBs out of the mmu lock without TLB corruption when
write-proect the sptes, it is because:
- we have marked large sptes readonly instead of dropping them that means we
  just change the spte from writable to readonly so that we only need to care
  the case of changing spte from present to present (changing the spte from
  present to nonpresent will flush all the TLBs immediately), in other words,
  the only case we need to care is mmu_spte_update()

- in mmu_spte_update(), we haved checked
  SPTE_HOST_WRITEABLE | PTE_MMU_WRITEABLE instead of PT_WRITABLE_MASK, that
  means it does not depend on PT_WRITABLE_MASK anymore

Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-23 17:49:52 -03:00
Xiao Guangrong
7f31c9595e KVM: MMU: flush tlb if the spte can be locklessly modified
Relax the tlb flush condition since we will write-protect the spte out of mmu
lock. Note lockless write-protection only marks the writable spte to readonly
and the spte can be writable only if both SPTE_HOST_WRITEABLE and
SPTE_MMU_WRITEABLE are set (that are tested by spte_is_locklessly_modifiable)

This patch is used to avoid this kind of race:

      VCPU 0                         VCPU 1
lockless wirte protection:
      set spte.w = 0
                                 lock mmu-lock

                                 write protection the spte to sync shadow page,
                                 see spte.w = 0, then without flush tlb

				 unlock mmu-lock

                                 !!! At this point, the shadow page can still be
                                     writable due to the corrupt tlb entry
     Flush all TLB

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-23 17:49:51 -03:00
Xiao Guangrong
c126d94f2c KVM: MMU: lazily drop large spte
Currently, kvm zaps the large spte if write-protected is needed, the later
read can fault on that spte. Actually, we can make the large spte readonly
instead of making them un-present, the page fault caused by read access can
be avoided

The idea is from Avi:
| As I mentioned before, write-protecting a large spte is a good idea,
| since it moves some work from protect-time to fault-time, so it reduces
| jitter.  This removes the need for the return value.

This version has fixed the issue reported in 6b73a9606, the reason of that
issue is that fast_page_fault() directly sets the readonly large spte to
writable but only dirty the first page into the dirty-bitmap that means
other pages are missed. Fixed it by only the normal sptes (on the
PT_PAGE_TABLE_LEVEL level) can be fast fixed

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-23 17:49:50 -03:00
Xiao Guangrong
92a476cbfc KVM: MMU: properly check last spte in fast_page_fault()
Using sp->role.level instead of @level since @level is not got from the
page table hierarchy

There is no issue in current code since the fast page fault currently only
fixes the fault caused by dirty-log that is always on the last level
(level = 1)

This patch makes the code more readable and avoids potential issue in the
further development

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-23 17:49:49 -03:00
Xiao Guangrong
a086f6a1eb Revert "KVM: Simplify kvm->tlbs_dirty handling"
This reverts commit 5befdc385d.

Since we will allow flush tlb out of mmu-lock in the later
patch

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-23 17:49:48 -03:00
Nadav Amit
42bf549f3c KVM: x86: Processor mode may be determined incorrectly
If EFER.LMA is off, cs.l does not determine execution mode.
Currently, the emulation engine assumes differently.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-23 17:47:00 -03:00
Nadav Amit
e6e39f0438 KVM: x86: IN instruction emulation should ignore REP-prefix
The IN instruction is not be affected by REP-prefix as INS is.  Therefore, the
emulation should ignore the REP prefix as well.  The current emulator
implementation tries to perform writeback when IN instruction with REP-prefix
is emulated. This causes it to perform wrong memory write or spurious #GP
exception to be injected to the guest.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-23 17:46:59 -03:00
Nadav Amit
346874c950 KVM: x86: Fix CR3 reserved bits
According to Intel specifications, PAE and non-PAE does not have any reserved
bits.  In long-mode, regardless to PCIDE, only the high bits (above the
physical address) are reserved.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-23 17:46:57 -03:00
Nadav Amit
671bd9934a KVM: x86: Fix wrong/stuck PMU when guest does not use PMI
If a guest enables a performance counter but does not enable PMI, the
hypervisor currently does not reprogram the performance counter once it
overflows.  As a result the host performance counter is kept with the original
sampling period which was configured according to the value of the guest's
counter when the counter was enabled.

Such behaviour can cause very bad consequences. The most distrubing one can
cause the guest not to make any progress at all, and keep exiting due to host
PMI before any guest instructions is exeucted. This situation occurs when the
performance counter holds a very high value when the guest enables the
performance counter. As a result the host's sampling period is configured to be
very short. The host then never reconfigures the sampling period and get stuck
at entry->PMI->exit loop. We encountered such a scenario in our experiments.

The solution is to reprogram the counter even if the guest does not use PMI.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-23 17:46:52 -03:00
Bandan Das
e0ba1a6ffc KVM: nVMX: Advertise support for interrupt acknowledgement
Some Type 1 hypervisors such as XEN won't enable VMX without it present

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-22 18:41:34 -03:00
Bandan Das
77b0f5d67f KVM: nVMX: Ack and write vector info to intr_info if L1 asks us to
This feature emulates the "Acknowledge interrupt on exit" behavior.
We can safely emulate it for L1 to run L2 even if L0 itself has it
disabled (to run L1).

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-22 18:41:33 -03:00
Bandan Das
4b85507860 KVM: nVMX: Don't advertise single context invalidation for invept
For single context invalidation, we fall through to global
invalidation in handle_invept() except for one case - when
the operand supplied by L1 is different from what we have in
vmcs12. However, typically hypervisors will only call invept
for the currently loaded eptp, so the condition will
never be true.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-22 18:41:28 -03:00
Huw Davies
fd2a445a94 KVM: VMX: Advance rip to after an ICEBP instruction
When entering an exception after an ICEBP, the saved instruction
pointer should point to after the instruction.

This fixes the bug here: https://bugs.launchpad.net/qemu/+bug/1119686

Signed-off-by: Huw Davies <huw@codeweavers.com>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-22 18:37:43 -03:00
Nadav Amit
5c7411e293 KVM: x86: Fix CR3 and LDT sel should not be saved in TSS
According to Intel specifications, only general purpose registers and segment
selectors should be saved in the old TSS during 32-bit task-switch.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-21 17:33:49 -03:00
Michael S. Tsirkin
68c3b4d167 KVM: VMX: speed up wildcard MMIO EVENTFD
With KVM, MMIO is much slower than PIO, due to the need to
do page walk and emulation. But with EPT, it does not have to be: we
know the address from the VMCS so if the address is unique, we can look
up the eventfd directly, bypassing emulation.

Unfortunately, this only works if userspace does not need to match on
access length and data.  The implementation adds a separate FAST_MMIO
bus internally. This serves two purposes:
    - minimize overhead for old userspace that does not use eventfd with lengtth = 0
    - minimize disruption in other code (since we don't know the length,
      devices on the MMIO bus only get a valid address in write, this
      way we don't need to touch all devices to teach them to handle
      an invalid length)

At the moment, this optimization only has effect for EPT on x86.

It will be possible to speed up MMIO for NPT and MMU using the same
idea in the future.

With this patch applied, on VMX MMIO EVENTFD is essentially as fast as PIO.
I was unable to detect any measureable slowdown to non-eventfd MMIO.

Making MMIO faster is important for the upcoming virtio 1.0 which
includes an MMIO signalling capability.

The idea was suggested by Peter Anvin.  Lots of thanks to Gleb for
pre-review and suggestions.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-17 14:01:43 -03:00
Michael S. Tsirkin
f848a5a8dc KVM: support any-length wildcard ioeventfd
It is sometimes benefitial to ignore IO size, and only match on address.
In hindsight this would have been a better default than matching length
when KVM_IOEVENTFD_FLAG_DATAMATCH is not set, In particular, this kind
of access can be optimized on VMX: there no need to do page lookups.
This can currently be done with many ioeventfds but in a suboptimal way.

However we can't change kernel/userspace ABI without risk of breaking
some applications.
Use len = 0 to mean "ignore length for matching" in a more optimal way.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-17 14:01:42 -03:00
Nadav Amit
cd9ae5fe47 KVM: x86: Fix page-tables reserved bits
KVM does not handle the reserved bits of x86 page tables correctly:
In PAE, bits 5:8 are reserved in the PDPTE.
In IA-32e, bit 8 is not reserved.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-16 18:59:23 -03:00
Linus Torvalds
55101e2d6c Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Marcelo Tosatti:
 - Fix for guest triggerable BUG_ON (CVE-2014-0155)
 - CR4.SMAP support
 - Spurious WARN_ON() fix

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: remove WARN_ON from get_kernel_ns()
  KVM: Rename variable smep to cr4_smep
  KVM: expose SMAP feature to guest
  KVM: Disable SMAP for guests in EPT realmode and EPT unpaging mode
  KVM: Add SMAP support when setting CR4
  KVM: Remove SMAP bit from CR4_RESERVED_BITS
  KVM: ioapic: try to recover if pending_eoi goes out of range
  KVM: ioapic: fix assignment of ioapic->rtc_status.pending_eoi (CVE-2014-0155)
2014-04-14 16:21:28 -07:00
Marcelo Tosatti
b351c39cc9 KVM: x86: remove WARN_ON from get_kernel_ns()
Function and callers can be preempted.

https://bugzilla.kernel.org/show_bug.cgi?id=73721

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2014-04-14 17:50:43 -03:00
Feng Wu
66386ade2a KVM: Rename variable smep to cr4_smep
Rename variable smep to cr4_smep, which can better reflect the
meaning of the variable.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-14 17:50:40 -03:00
Feng Wu
de935ae15b KVM: expose SMAP feature to guest
This patch exposes SMAP feature to guest

Signed-off-by: Feng Wu <feng.wu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-14 17:50:37 -03:00
Feng Wu
e1e746b3c5 KVM: Disable SMAP for guests in EPT realmode and EPT unpaging mode
SMAP is disabled if CPU is in non-paging mode in hardware.
However KVM always uses paging mode to emulate guest non-paging
mode with TDP. To emulate this behavior, SMAP needs to be
manually disabled when guest switches to non-paging mode.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-14 17:50:35 -03:00
Feng Wu
97ec8c067d KVM: Add SMAP support when setting CR4
This patch adds SMAP handling logic when setting CR4 for guests

Thanks a lot to Paolo Bonzini for his suggestion to use the branchless
way to detect SMAP violation.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-14 17:50:34 -03:00
Feng Wu
56d6efc2de KVM: Remove SMAP bit from CR4_RESERVED_BITS
This patch removes SMAP bit from CR4_RESERVED_BITS.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-14 17:50:33 -03:00
Linus Torvalds
09c9b61d5d LLVMLinux Patches for v3.15
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iEYEABECAAYFAlNFtEkACgkQuseO5dulBZVcPwCdFWY81hKqQaKHaSPFh9m+n1lt
 yY0An2VZZGrFSkj162POFy1P2sPpGw5p
 =LRFZ
 -----END PGP SIGNATURE-----

Merge tag 'llvmlinux-for-v3.15' of git://git.linuxfoundation.org/llvmlinux/kernel

Pull llvm patches from Behan Webster:
 "These are some initial updates to support compiling the kernel with
  clang.

  These patches have been through the proper reviews to the best of my
  ability, and have been soaking in linux-next for a few weeks.  These
  patches by themselves still do not completely allow clang to be used
  with the kernel code, but lay the foundation for other patches which
  are still under review.

  Several other of the LLVMLinux patches have been already added via
  maintainer trees"

* tag 'llvmlinux-for-v3.15' of git://git.linuxfoundation.org/llvmlinux/kernel:
  x86: LLVMLinux: Fix "incomplete type const struct x86cpu_device_id"
  x86 kbuild: LLVMLinux: More cc-options added for clang
  x86, acpi: LLVMLinux: Remove nested functions from Thinkpad ACPI
  LLVMLinux: Add support for clang to compiler.h and new compiler-clang.h
  LLVMLinux: Remove warning about returning an uninitialized variable
  kbuild: LLVMLinux: Fix LINUX_COMPILER definition script for compilation with clang
  Documentation: LLVMLinux: Update Documentation/dontdiff
  kbuild: LLVMLinux: Adapt warnings for compilation with clang
  kbuild: LLVMLinux: Add Kbuild support for building kernel with Clang
2014-04-12 17:00:40 -07:00
Linus Torvalds
0b747172dc Merge git://git.infradead.org/users/eparis/audit
Pull audit updates from Eric Paris.

* git://git.infradead.org/users/eparis/audit: (28 commits)
  AUDIT: make audit_is_compat depend on CONFIG_AUDIT_COMPAT_GENERIC
  audit: renumber AUDIT_FEATURE_CHANGE into the 1300 range
  audit: do not cast audit_rule_data pointers pointlesly
  AUDIT: Allow login in non-init namespaces
  audit: define audit_is_compat in kernel internal header
  kernel: Use RCU_INIT_POINTER(x, NULL) in audit.c
  sched: declare pid_alive as inline
  audit: use uapi/linux/audit.h for AUDIT_ARCH declarations
  syscall_get_arch: remove useless function arguments
  audit: remove stray newline from audit_log_execve_info() audit_panic() call
  audit: remove stray newlines from audit_log_lost messages
  audit: include subject in login records
  audit: remove superfluous new- prefix in AUDIT_LOGIN messages
  audit: allow user processes to log from another PID namespace
  audit: anchor all pid references in the initial pid namespace
  audit: convert PPIDs to the inital PID namespace.
  pid: get pid_t ppid of task in init_pid_ns
  audit: rename the misleading audit_get_context() to audit_take_context()
  audit: Add generic compat syscall support
  audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALL
  ...
2014-04-12 12:38:53 -07:00
Linus Torvalds
eeb91e4f9d More ACPI and power management fixes and updates for 3.15-rc1
- Fix for a recently introduced CPU hotplug regression in ARM KVM
    from Ming Lei.
 
  - Fixes for breakage in the at32ap, loongson2_cpufreq, and unicore32
    cpufreq drivers introduced during the 3.14 cycle (-stable material)
    from Chen Gang and Viresh Kumar.
 
  - New powernv cpufreq driver from Vaidyanathan Srinivasan, with bits
    from Gautham R Shenoy and Srivatsa S Bhat.
 
  - Exynos cpufreq driver fix preventing it from being included into
    multiplatform builds that aren't supported by it from Sachin Kamat.
 
  - cpufreq cleanups related to the usage of the driver_data field in
    struct cpufreq_frequency_table from Viresh Kumar.
 
  - cpufreq ppc driver cleanup from Sachin Kamat.
 
  - Intel BayTrail support for intel_idle and ACPI idle from Len Brown.
 
  - Intel CPU model 54 (Atom N2000 series) support for intel_idle from
    Jan Kiszka.
 
  - intel_idle fix for Intel Ivy Town residency targets from Len Brown.
 
  - turbostat updates (Intel Broadwell support and output cleanups)
    from Len Brown.
 
  - New cpuidle sysfs attribute for exporting C-states' target residency
    information to user space from Daniel Lezcano.
 
  - New kernel command line argument to prevent power domains enabled
    by the bootloader from being turned off even if they are not in use
    (for diagnostics purposes) from Tushar Behera.
 
  - Fixes for wakeup sysfs attributes documentation from Geert Uytterhoeven.
 
  - New ACPI video blacklist entry for ThinkPad Helix from Stephen Chandler
    Paul.
 
  - Assorted ACPI cleanups and a Kconfig help update from Jonghwan Choi,
    Zhihui Zhang, Hanjun Guo.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJTRxLAAAoJEILEb/54YlRxnCwP/16UwO/eVE8SIi0TqQboikFC
 k8u7F3zgDYG+xPSzXlCR+J7thTxGueQlrb+aM18PYuMVgaw2rpy7U7SIqEk8s6oR
 uFnzZCWKA5ZebbZn+NlodnQaJmbgJxwsVJDuuechUka/e67CaIc54JULi2ynZ0lz
 Kg/nU3NJhu4S81cT5SOTkJ9xE63oxHcCwKbNqEmxn7x7ddFzGK/DThG67NMEnW1F
 vHbBTSyI6vmXXg1f9aobUtuo3PfJkkx5jD+nR1H2e6wmB64tW7JPVKV3mi6LJfYM
 ui/8/gNb3PUMHMX1QbL9EFbPxl9miQx2NJ7dgFKa1HZ/WPyiXpJjz7uGr9O3Fau3
 cFVREdaW8p2TAYWOEgH8luohhdK0j8UEpR/sEm0TrTjsK8wqczVf/hz6RraVJZiN
 ck6eVHjY6m3/bFQauZQ/r+DNeeNcdr+iLejgjbh/MXuF3j0kx+1dkKkzCEU2TgEZ
 9etF0uzjlgyXySyxNKBeSW13+ssVA6kF5/BHns7LHoxTfGu7Y4oVaWUi+j74i66O
 bc+2ileNal71mS4v9gomnj6Ffj8oH8KXFA7k0sEsAdwLZNgThB5bTppmY/U7Y5Ce
 hTS81tcGe2vOVQzF9iFOF7LNKKussAVAtrgkkrA8lJLeOTfQbIo4+fMhORxf3X/p
 3O7R/jc4cT+IXK8a2xRt
 =hGKg
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.15-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more ACPI and power management fixes and updates from Rafael Wysocki:
 "This is PM and ACPI material that has emerged over the last two weeks
  and one fix for a CPU hotplug regression introduced by the recent CPU
  hotplug notifiers registration series.

  Included are intel_idle and turbostat updates from Len Brown (these
  have been in linux-next for quite some time), a new cpufreq driver for
  powernv (that might spend some more time in linux-next, but BenH was
  asking me so nicely to push it for 3.15 that I couldn't resist), some
  cpufreq fixes and cleanups (including fixes for some silly breakage in
  a couple of cpufreq drivers introduced during the 3.14 cycle),
  assorted ACPI cleanups, wakeup framework documentation fixes, a new
  sysfs attribute for cpuidle and a new command line argument for power
  domains diagnostics.

  Specifics:

   - Fix for a recently introduced CPU hotplug regression in ARM KVM
     from Ming Lei.

   - Fixes for breakage in the at32ap, loongson2_cpufreq, and unicore32
     cpufreq drivers introduced during the 3.14 cycle (-stable material)
     from Chen Gang and Viresh Kumar.

   - New powernv cpufreq driver from Vaidyanathan Srinivasan, with bits
     from Gautham R Shenoy and Srivatsa S Bhat.

   - Exynos cpufreq driver fix preventing it from being included into
     multiplatform builds that aren't supported by it from Sachin Kamat.

   - cpufreq cleanups related to the usage of the driver_data field in
     struct cpufreq_frequency_table from Viresh Kumar.

   - cpufreq ppc driver cleanup from Sachin Kamat.

   - Intel BayTrail support for intel_idle and ACPI idle from Len Brown.

   - Intel CPU model 54 (Atom N2000 series) support for intel_idle from
     Jan Kiszka.

   - intel_idle fix for Intel Ivy Town residency targets from Len Brown.

   - turbostat updates (Intel Broadwell support and output cleanups)
     from Len Brown.

   - New cpuidle sysfs attribute for exporting C-states' target
     residency information to user space from Daniel Lezcano.

   - New kernel command line argument to prevent power domains enabled
     by the bootloader from being turned off even if they are not in use
     (for diagnostics purposes) from Tushar Behera.

   - Fixes for wakeup sysfs attributes documentation from Geert
     Uytterhoeven.

   - New ACPI video blacklist entry for ThinkPad Helix from Stephen
     Chandler Paul.

   - Assorted ACPI cleanups and a Kconfig help update from Jonghwan
     Choi, Zhihui Zhang, Hanjun Guo"

* tag 'pm+acpi-3.15-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (28 commits)
  ACPI: Update the ACPI spec information in Kconfig
  arm, kvm: fix double lock on cpu_add_remove_lock
  cpuidle: sysfs: Export target residency information
  cpufreq: ppc: Remove duplicate inclusion of fsl_soc.h
  cpufreq: create another field .flags in cpufreq_frequency_table
  cpufreq: use kzalloc() to allocate memory for cpufreq_frequency_table
  cpufreq: don't print value of .driver_data from core
  cpufreq: ia64: don't set .driver_data to index
  cpufreq: powernv: Select CPUFreq related Kconfig options for powernv
  cpufreq: powernv: Use cpufreq_frequency_table.driver_data to store pstate ids
  cpufreq: powernv: cpufreq driver for powernv platform
  cpufreq: at32ap: don't declare local variable as static
  cpufreq: loongson2_cpufreq: don't declare local variable as static
  cpufreq: unicore32: fix typo issue for 'clk'
  cpufreq: exynos: Disable on multiplatform build
  PM / wakeup: Correct presence vs. emptiness of wakeup_* attributes
  PM / domains: Add pd_ignore_unused to keep power domains enabled
  ACPI / dock: Drop dock_device_ids[] table
  ACPI / video: Favor native backlight interface for ThinkPad Helix
  ACPI / thermal: Fix wrong variable usage in debug statement
  ...
2014-04-11 13:20:04 -07:00
Linus Torvalds
40e9963e62 Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pullx86 core platform updates from Peter Anvin:
 "This is the x86/platform branch with the objectionable IOSF patches
  removed.

  What is left is proper memory handling for Intel GPUs, and a change to
  the Calgary IOMMU code which will be required to make kexec work
  sanely on those platforms after some upcoming kexec changes"

* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, calgary: Use 8M TCE table size by default
  x86/gpu: Print the Intel graphics stolen memory range
  x86/gpu: Add Intel graphics stolen memory quirk for gen2 platforms
  x86/gpu: Add vfunc for Intel graphics stolen memory base address
2014-04-11 12:04:15 -07:00
Linus Torvalds
8eab6cd031 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
 "This is a collection of minor fixes for x86, plus the IRET information
  leak fix (forbid the use of 16-bit segments in 64-bit mode)"

NOTE! We may have to relax the "forbid the use of 16-bit segments in
64-bit mode" part, since there may be people who still run and depend on
16-bit Windows binaries under Wine.

But I'm taking this in the current unconditional form for now to see who
(if anybody) screams bloody murder.  Maybe nobody cares.  And maybe
we'll have to update it with some kind of runtime enablement (like our
vm.mmap_min_addr tunable that people who run dosemu/qemu/wine already
need to tweak).

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels
  efi: Pass correct file handle to efi_file_{read,close}
  x86/efi: Correct EFI boot stub use of code32_start
  x86/efi: Fix boot failure with EFI stub
  x86/platform/hyperv: Handle VMBUS driver being a module
  x86/apic: Reinstate error IRQ Pentium erratum 3AP workaround
  x86, CMCI: Add proper detection of end of CMCI storms
2014-04-11 11:58:33 -07:00
H. Peter Anvin
b3b42ac2cb x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels
The IRET instruction, when returning to a 16-bit segment, only
restores the bottom 16 bits of the user space stack pointer.  We have
a software workaround for that ("espfix") for the 32-bit kernel, but
it relies on a nonzero stack segment base which is not available in
32-bit mode.

Since 16-bit support is somewhat crippled anyway on a 64-bit kernel
(no V86 mode), and most (if not quite all) 64-bit processors support
virtualization for the users who really need it, simply reject
attempts at creating a 16-bit segment when running on top of a 64-bit
kernel.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/n/tip-kicdm89kzw9lldryb1br9od0@git.kernel.org
Cc: <stable@vger.kernel.org>
2014-04-11 10:10:09 -07:00
Ingo Molnar
3151b942ba * Fix EFI boot regression introduced during the merge window where the
firmware was reading random values from the stack because we were
    passing a pointer to the wrong object type.
 
  * Kernel corruption has been reported when booting with the EFI boot
    stub which was tracked down to setting a bogus value for
    bp->hdr.code32_start, resulting in corruption during relocation.
 
  * Olivier Martin reported that the wrong file handles were being passed
    to efi_file_(read|close), which works for x86 by luck due to the way
    that the FAT driver is implemented, but doesn't work on ARM.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTR5QIAAoJEC84WcCNIz1VMn0P/01GF8A2frSK+NuCJCkmZoAa
 fcOvcHmQajNwG3WAVsVWlS/i2QsYwK1jAgameEusn+FFrnWIwaZ9qb1TjEMbJylu
 4odaRc1YYiLOJi9UD2jRB644374jJwgwteKGs0Vt99g4pa8HsgSbXTR6oF8PUDWr
 1HZUV9tq8O1eAzpQdMADEgWYieylnldfvHk+ArPTJyR5fTNx8xCYALlCthc6Tv+A
 cpi0rQj/YzNh+vqZF1YYZ8xqktvV1di2Hvmy3UVt05y1kwkaTquNY9478ZRF5UHm
 oUk3nAYyA9M/1gxVnvUfyLgUtrWtyF02N+iDTxLoz05KxeK5wVdKaIPZfSAUrglt
 hOvnL+5EOss6w9gG19zpPD4FVHCd696W+iCIBoqooWJqX8AqOVRr81GTYb3q3YDr
 EIH0wLipuV4XI4sdN8JMH9fIbfkRdAvaGUR2lPSYFq2Cm7nn2hs820UdKFYeH0wT
 fdgtGpWAdXhEq/SUW4KRZMCXLDz4XuNF3d/JREcC28CyiRgdjKFD/PMbZEShpisF
 fYE16+IiAq8UMgfgUDqlrSP2UMqkyZ2kp5itvJBrLbTD6rWzEcpK+CMXqykWTOwV
 ONzPAfZEbUmFuU3JhKOTFO5uf7dM9EG5BDKduWR6Wjl8VIVTQlD8R1OB5o1lbZPN
 ecFWo1eIQGZjeoMm36EM
 =rovT
 -----END PGP SIGNATURE-----

Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent

Pull EFI fixes from Matt Fleming:

"* Fix EFI boot regression introduced during the merge window where the
   firmware was reading random values from the stack because we were
   passing a pointer to the wrong object type.

 * Kernel corruption has been reported when booting with the EFI boot
   stub which was tracked down to setting a bogus value for
   bp->hdr.code32_start, resulting in corruption during relocation.

 * Olivier Martin reported that the wrong file handles were being passed
   to efi_file_(read|close), which works for x86 by luck due to the way
   that the FAT driver is implemented, but doesn't work on ARM."

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-11 10:25:28 +02:00
WANG Chao
0534af01cc x86, calgary: Use 8M TCE table size by default
New kexec-tools wants to pass kdump kernel needed memmap via E820
directly, instead of memmap=exactmap. This makes saved_max_pfn not
be passed down to 2nd kernel. To keep 1st kernel and 2nd kernel using
the same TCE table size, Muli suggest to hard code the size to max (8M).

We can't get rid of saved_max_pfn this time, for backward compatibility
with old first kernel and new second kernel. However new first kernel
and old second kernel can not work unfortunately.

v2->v1:
- retain saved_max_pfn so new 2nd kernel can work with old 1st kernel
  from Vivek

Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Muli Ben-Yehuda <mulix@mulix.org>
Acked-by: Jon Mason <jdmason@kudzu.us>
Link: http://lkml.kernel.org/r/1394463120-26999-1-git-send-email-chaowang@redhat.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-04-10 19:51:32 -07:00
Matt Fleming
47514c996f efi: Pass correct file handle to efi_file_{read,close}
We're currently passing the file handle for the root file system to
efi_file_read() and efi_file_close(), instead of the file handle for the
file we wish to read/close.

While this has worked up until now, it seems that it has only been by
pure luck. Olivier explains,

 "The issue is the UEFI Fat driver might return the same function for
  'fh->read()' and 'h->read()'. While in our case it does not work with
  a different implementation of EFI_SIMPLE_FILE_SYSTEM_PROTOCOL. In our
  case, we return a different pointer when reading a directory and
  reading a file."

Fixing this actually clears up the two functions because we can drop one
of the arguments, and instead only pass a file 'handle' argument.

Reported-by: Olivier Martin <olivier.martin@arm.com>
Reviewed-by: Olivier Martin <olivier.martin@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-04-10 21:20:03 +01:00
Matt Fleming
7e8213c1f3 x86/efi: Correct EFI boot stub use of code32_start
code32_start should point at the start of the protected mode code, and
*not* at the beginning of the bzImage. This is much easier to do in
assembly so document that callers of make_boot_params() need to fill out
code32_start.

The fallout from this bug is that we would end up relocating the image
but copying the image at some offset, resulting in what appeared to be
memory corruption.

Reported-by: Thomas Bächler <thomas@archlinux.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-04-10 21:19:52 +01:00
Matt Fleming
396f1a08db x86/efi: Fix boot failure with EFI stub
commit 54b52d8726 ("x86/efi: Build our own EFI services pointer
table") introduced a regression because the 64-bit file_size()
implementation passed a pointer to a 32-bit data object, instead of a
pointer to a 64-bit object.

Because the firmware treats the object as 64-bits regardless it was
reading random values from the stack for the upper 32-bits.

This resulted in people being unable to boot their machines, after
seeing the following error messages,

    Failed to get file info size
    Failed to alloc highmem for files

Reported-by: Dzmitry Sledneu <dzmitry.sledneu@gmail.com>
Reported-by: Koen Kooi <koen@dominion.thruhere.net>
Tested-by: Koen Kooi <koen@dominion.thruhere.net>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-04-10 21:19:47 +01:00
Jan-Simon Möller
fd2d0a19ab x86 kbuild: LLVMLinux: More cc-options added for clang
Protect more options for x86 with cc-option so that we don't get errors when
using clang instead of gcc.  Add more or different options when using clang as
well. Also need to enforce that SSE is off for clang and the stack is 8-byte
aligned.

Signed-off-by: Jan-Simon Möller <dl9pf@gmx.de>
Signed-off-by: Behan Webster <behanw@converseincode.com>
Signed-off-by: Mark Charlebois <charlebm@gmail.com>
2014-04-09 13:44:35 -07:00
David Rientjes
d0057ca4c1 arch/x86/mm/kmemcheck/kmemcheck.c: use kstrtoint() instead of sscanf()
Kmemcheck should use the preferred interface for parsing command line
arguments, kstrto*(), rather than sscanf() itself.  Use it
appropriately.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Vegard Nossum <vegardno@ifi.uio.no>
Acked-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-08 16:48:52 -07:00
Rafael J. Wysocki
8c73c4d831 Merge branch 'pm-cpuidle'
* pm-cpuidle:
  cpuidle: sysfs: Export target residency information
  intel_idle: fine-tune IVT residency targets
  tools/power turbostat: Run on Broadwell
  tools/power turbostat: simplify output, add Avg_MHz
  intel_idle: Add CPU model 54 (Atom N2000 series)
  intel_idle: support Bay Trail
  intel_idle: allow sparse sub-state numbering, for Bay Trail
  ACPI idle: permit sparse C-state sub-state numbers
2014-04-08 13:27:40 +02:00
Rafael J. Wysocki
73df623add Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux into pm-cpuidle
Pull intel_idle and turbostat material for v3.15-rc1 from Len Brown.

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  intel_idle: fine-tune IVT residency targets
  tools/power turbostat: Run on Broadwell
  tools/power turbostat: simplify output, add Avg_MHz
  intel_idle: Add CPU model 54 (Atom N2000 series)
  intel_idle: support Bay Trail
  intel_idle: allow sparse sub-state numbering, for Bay Trail
  ACPI idle: permit sparse C-state sub-state numbers
2014-04-08 13:25:25 +02:00
Linus Torvalds
26c12d9334 Merge branch 'akpm' (incoming from Andrew)
Merge second patch-bomb from Andrew Morton:
 - the rest of MM
 - zram updates
 - zswap updates
 - exit
 - procfs
 - exec
 - wait
 - crash dump
 - lib/idr
 - rapidio
 - adfs, affs, bfs, ufs
 - cris
 - Kconfig things
 - initramfs
 - small amount of IPC material
 - percpu enhancements
 - early ioremap support
 - various other misc things

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (156 commits)
  MAINTAINERS: update Intel C600 SAS driver maintainers
  fs/ufs: remove unused ufs_super_block_third pointer
  fs/ufs: remove unused ufs_super_block_second pointer
  fs/ufs: remove unused ufs_super_block_first pointer
  fs/ufs/super.c: add __init to init_inodecache()
  doc/kernel-parameters.txt: add early_ioremap_debug
  arm64: add early_ioremap support
  arm64: initialize pgprot info earlier in boot
  x86: use generic early_ioremap
  mm: create generic early_ioremap() support
  x86/mm: sparse warning fix for early_memremap
  lglock: map to spinlock when !CONFIG_SMP
  percpu: add preemption checks to __this_cpu ops
  vmstat: use raw_cpu_ops to avoid false positives on preemption checks
  slub: use raw_cpu_inc for incrementing statistics
  net: replace __this_cpu_inc in route.c with raw_cpu_inc
  modules: use raw_cpu_write for initialization of per cpu refcount.
  mm: use raw_cpu ops for determining current NUMA node
  percpu: add raw_cpu_ops
  slub: fix leak of 'name' in sysfs_slab_add
  ...
2014-04-07 16:38:06 -07:00
Mark Salter
5b7c73e009 x86: use generic early_ioremap
Move x86 over to the generic early ioremap implementation.

Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:36:15 -07:00
Dave Young
6b550f6f20 x86/mm: sparse warning fix for early_memremap
This patch series takes the common bits from the x86 early ioremap
implementation and creates a generic implementation which may be used by
other architectures.  The early ioremap interfaces are intended for
situations where boot code needs to make temporary virtual mappings
before the normal ioremap interfaces are available.  Typically, this
means before paging_init() has run.

This patch (of 6):

There's a lot of sparse warnings for code like below: void *a =
early_memremap(phys_addr, size);

early_memremap intend to map kernel memory with ioremap facility, the
return pointer should be a kernel ram pointer instead of iomem one.

For making the function clearer and supressing sparse warnings this patch
do below two things:
1. cast to (__force void *) for the return value of early_memremap
2. add early_memunmap function and pass (__force void __iomem *) to iounmap

From Boris:
  "Ingo told me yesterday, it makes sense too.  I'd guess we can try it.
   FWIW, all callers of early_memremap use the memory they get remapped
   as normal memory so we should be safe"

Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:36:14 -07:00
Christoph Lameter
b3ca1c10d7 percpu: add raw_cpu_ops
The kernel has never been audited to ensure that this_cpu operations are
consistently used throughout the kernel.  The code generated in many
places can be improved through the use of this_cpu operations (which
uses a segment register for relocation of per cpu offsets instead of
performing address calculations).

The patch set also addresses various consistency issues in general with
the per cpu macros.

A. The semantics of __this_cpu_ptr() differs from this_cpu_ptr only
   because checks are skipped. This is typically shown through a raw_
   prefix. So this patch set changes the places where __this_cpu_ptr()
   is used to raw_cpu_ptr().

B. There has been the long term wish by some that __this_cpu operations
   would check for preemption. However, there are cases where preemption
   checks need to be skipped. This patch set adds raw_cpu operations that
   do not check for preemption and then adds preemption checks to the
   __this_cpu operations.

C. The use of __get_cpu_var is always a reference to a percpu variable
   that can also be handled via a this_cpu operation. This patch set
   replaces all uses of __get_cpu_var with this_cpu operations.

D. We can then use this_cpu RMW operations in various places replacing
   sequences of instructions by a single one.

E. The use of this_cpu operations throughout will allow other arches than
   x86 to implement optimized references and RMV operations to work with
   per cpu local data.

F. The use of this_cpu operations opens up the possibility to
   further optimize code that relies on synchronization through
   per cpu data.

The patch set works in a couple of stages:

I. Patch 1 adds the additional raw_cpu operations and raw_cpu_ptr().
    Also converts the existing __this_cpu_xx_# primitive in the x86
    code to raw_cpu_xx_#.

II. Patch 2-4 use the raw_cpu operations in places that would give
     us false positives once they are enabled.

III. Patch 5 adds preemption checks to __this_cpu operations to allow
    checking if preemption is properly disabled when these functions
    are used.

IV. Patches 6-20 are patches that simply replace uses of __get_cpu_var
   with this_cpu_ptr. They do not depend on any changes to the percpu
   code. No preemption tests are skipped if they are applied.

V. Patches 21-46 are conversion patches that use this_cpu operations
   in various kernel subsystems/drivers or arch code.

VI.  Patches 47/48 (not included in this series) remove no longer used
    functions (__this_cpu_ptr and __get_cpu_var).  These should only be
    applied after all the conversion patches have made it and after we
    have done additional passes through the kernel to ensure that none of
    the uses of these functions remain.

This patch (of 46):

The patches following this one will add preemption checks to __this_cpu
ops so we need to have an alternative way to use this_cpu operations
without preemption checks.

raw_cpu_ops will be the basis for all other ops since these will be the
operations that do not implement any checks.

Primitive operations are renamed by this patch from __this_cpu_xxx to
raw_cpu_xxxx.

Also change the uses of the x86 percpu primitives in preempt.h.
These depend directly on asm/percpu.h (header #include nesting issue).

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Alex Shi <alex.shi@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bryan Wu <cooloney@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: David Daney <david.daney@cavium.com>
Cc: David Miller <davem@davemloft.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Hedi Berriche <hedi@sgi.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Mike Travis <travis@sgi.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Robert Richter <rric@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:36:13 -07:00
Josh Triplett
b06dd879f5 x86: always define BUG() and HAVE_ARCH_BUG, even with !CONFIG_BUG
This ensures that BUG() always has a definition that causes a trap (via
an undefined instruction), and that the compiler still recognizes the
code following BUG() as unreachable, avoiding warnings that would
otherwise appear (such as on non-void functions that don't return a
value after BUG()).

In addition to saving a few bytes over the generic infinite-loop
implementation, this implementation traps rather than looping, which
potentially allows for better error-recovery behavior (such as by
rebooting).

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:36:10 -07:00
Linus Torvalds
467a9e1633 CPU hotplug notifiers registration fixes for 3.15-rc1
The purpose of this single series of commits from Srivatsa S Bhat (with
 a small piece from Gautham R Shenoy) touching multiple subsystems that use
 CPU hotplug notifiers is to provide a way to register them that will not
 lead to deadlocks with CPU online/offline operations as described in the
 changelog of commit 93ae4f978c (CPU hotplug: Provide lockless versions
 of callback registration functions).
 
 The first three commits in the series introduce the API and document it
 and the rest simply goes through the users of CPU hotplug notifiers and
 converts them to using the new method.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJTQow2AAoJEILEb/54YlRxW4QQAJlYRDUzwFJzJzYhltQYuVR+
 4D74XMtvXgoJfg3cwdSWvMKKpJZnA9BVN0f7Hcx9wYmgdexYUuHeZJmMNyc3S2+g
 KjKBIsugvgmZhHbbLd6TJ6GBbhGT5JLt9VmSfL9zIkveInU1YHFUUqL/mxdHm4J0
 BSGKjk2rN3waRJgmY+xfliFLtQjDKFwJpMuvrgtoUyfas3f4sIV43UNbqdvA/weJ
 rzedxXOlKH/id4b56lj/4iIzcoL3mwvJJ7r6n0CEMsKv87z09kqR0O+69Tsq/cgs
 j17CsvoJOmZGk3QTeKVMQWBsvk6aPoDu3zK83gLbQMt+qjOpSTbJLz/3HZw4/TrW
 ss4nuZne1DLMGS+6hoxYbTP+6Ni//Kn+l/LrHc5jb7m1X3lMO4W2aV3IROtIE1rv
 lEP1IG01NU4u9YwkVj1dyhrkSp8tLPul4SrUK8W+oNweOC5crjJV7vJbIPJgmYiM
 IZN55wln0yVRtR4TX+rmvN0PixsInE8MeaVCmReApyF9pdzul/StxlBze5BKLSJD
 cqo1kNPpsmdxoDucqUpQ/gSvy+IOl2qnlisB5PpV93sk7De6TFDYrGHxjYIW7jMf
 StXwdCDDQhzd2Q8Kfpp895A1dbIl8rKtwA6bTU2eX+BfMVFzuMdT44cvosx1+UdQ
 sWl//rg76nb13dFjvF+q
 =SW7Q
 -----END PGP SIGNATURE-----

Merge tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull CPU hotplug notifiers registration fixes from Rafael Wysocki:
 "The purpose of this single series of commits from Srivatsa S Bhat
  (with a small piece from Gautham R Shenoy) touching multiple
  subsystems that use CPU hotplug notifiers is to provide a way to
  register them that will not lead to deadlocks with CPU online/offline
  operations as described in the changelog of commit 93ae4f978c ("CPU
  hotplug: Provide lockless versions of callback registration
  functions").

  The first three commits in the series introduce the API and document
  it and the rest simply goes through the users of CPU hotplug notifiers
  and converts them to using the new method"

* tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (52 commits)
  net/iucv/iucv.c: Fix CPU hotplug callback registration
  net/core/flow.c: Fix CPU hotplug callback registration
  mm, zswap: Fix CPU hotplug callback registration
  mm, vmstat: Fix CPU hotplug callback registration
  profile: Fix CPU hotplug callback registration
  trace, ring-buffer: Fix CPU hotplug callback registration
  xen, balloon: Fix CPU hotplug callback registration
  hwmon, via-cputemp: Fix CPU hotplug callback registration
  hwmon, coretemp: Fix CPU hotplug callback registration
  thermal, x86-pkg-temp: Fix CPU hotplug callback registration
  octeon, watchdog: Fix CPU hotplug callback registration
  oprofile, nmi-timer: Fix CPU hotplug callback registration
  intel-idle: Fix CPU hotplug callback registration
  clocksource, dummy-timer: Fix CPU hotplug callback registration
  drivers/base/topology.c: Fix CPU hotplug callback registration
  acpi-cpufreq: Fix CPU hotplug callback registration
  zsmalloc: Fix CPU hotplug callback registration
  scsi, fcoe: Fix CPU hotplug callback registration
  scsi, bnx2fc: Fix CPU hotplug callback registration
  scsi, bnx2i: Fix CPU hotplug callback registration
  ...
2014-04-07 14:55:46 -07:00
Linus Torvalds
d1d9cfc330 A number of cleanups plus support for the RDSEED instruction, which
will be showing up in Intel Broadwell CPU's.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJTPaDyAAoJENNvdpvBGATwIVkP/Ao77ATdIBKr7kJJ+7q/jroo
 hfg2EOBzYwFSTZ4EUCwTwCNOqcAnr8xBYRWDDTxoiT0ORQN+4QkIXlukTBZbeCUg
 5ZCvRYhNcrieCBX69X/0aQ7yAkMR5/KSjccww3Tr6bLEItEBS7/gkeGv0H/E1DOg
 /1C3vbDDRPchM2qGM/TS/2Wd3WU5jWJk4cURJUlweBWHGXItOk6yHIY/+h3t4Bj1
 eJfxwG21BLJ1t159uvMj5Sw9Ketwnzgc4hsU6Ai/o37+Q12LYJ6xRCHsAatr6aA8
 dsObrfnD15T3gH/elVzB0tpBkor9pESkJAvUxWdMNMNGVU01ONuBz6rbSUqLdKu4
 GXA+9iHq4Ti43f4M886PweSweFr1+lFIiQrdAQaWZdrooLCMJCgjMNM+7wnlu8ho
 Eit65rq4k6y5Fb5GTB8Am/TCb3HNYhXnrMERA0meKx+/K8yaMygzHtPoT+DHdn8z
 GWDwNiMYrtTeiJ5nduljO3M2lYJDIWLkCD9SwupXe5HEj3qSHS/aiT5ng6YP+oOK
 mmUiWWgpVv3unO6pJw49pbzuAWXnbHVWMV2VDILVxn7mZZ0NAOL3OGAmK7US+Z/J
 qww9s89A2oba8vTbdrW7JuiP9ESlujpfrLQQgnW6WInMHY2dfo5hSo84ULLeLMXQ
 SqT1yywl4U888jqBYioB
 =RtMT
 -----END PGP SIGNATURE-----

Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random

Pull /dev/random changes from Ted Ts'o:
 "A number of cleanups plus support for the RDSEED instruction, which
  will be showing up in Intel Broadwell CPU's"

* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
  random: Add arch_has_random[_seed]()
  random: If we have arch_get_random_seed*(), try it before blocking
  random: Use arch_get_random_seed*() at init time and once a second
  x86, random: Enable the RDSEED instruction
  random: use the architectural HWRNG for the SHA's IV in extract_buf()
  random: clarify bits/bytes in wakeup thresholds
  random: entropy_bytes is actually bits
  random: simplify accounting code
  random: tighten bound on random_read_wakeup_thresh
  random: forget lock in lockless accounting
  random: simplify accounting logic
  random: fix comment on "account"
  random: simplify loop in random_read
  random: fix description of get_random_bytes
  random: fix comment on proc_do_uuid
  random: fix typos / spelling errors in comments
2014-04-04 21:25:28 -07:00
Linus Torvalds
7df934526c Merge branch 'cross-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull renameat2 system call from Miklos Szeredi:
 "This adds a new syscall, renameat2(), which is the same as renameat()
  but with a flags argument.

  The purpose of extending rename is to add cross-rename, a symmetric
  variant of rename, which exchanges the two files.  This allows
  interesting things, which were not possible before, for example
  atomically replacing a directory tree with a symlink, etc...  This
  also allows overlayfs and friends to operate on whiteouts atomically.

  Andy Lutomirski also suggested a "noreplace" flag, which disables the
  overwriting behavior of rename.

  These two flags, RENAME_EXCHANGE and RENAME_NOREPLACE are only
  implemented for ext4 as an example and for testing"

* 'cross-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ext4: add cross rename support
  ext4: rename: split out helper functions
  ext4: rename: move EMLINK check up
  ext4: rename: create ext4_renament structure for local vars
  vfs: add cross-rename
  vfs: lock_two_nondirectories: allow directory args
  security: add flags to rename hooks
  vfs: add RENAME_NOREPLACE flag
  vfs: add renameat2 syscall
  vfs: rename: use common code for dir and non-dir
  vfs: rename: move d_move() up
  vfs: add d_is_dir()
2014-04-04 14:03:05 -07:00