Commit cbf8277ef4 ("iommu/arm-smmu: Treat IOMMU_DOMAIN_DMA as bypass
for now") ignores requests to attach a device to the default domain
since, without IOMMU-basked DMA ops available everywhere, the default
domain will just lead to unexpected transaction faults being reported.
Unfortunately, the way this was implemented on SMMUv2 causes a
regression with VFIO PCI device passthrough under KVM on AMD Seattle.
On this system, the host controller device is associated with both a
pci_dev *and* a platform_device, and can therefore end up with duplicate
SMR entries, resulting in a stream-match conflict at runtime.
This patch amends the original fix so that attaching to IOMMU_DOMAIN_DMA
is rejected even before configuring the SMRs. This restores the old
behaviour for now, but we'll need to look at handing host controllers
specially when we come to supporting the default domain fully.
Reported-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Yang Shi <yang.shi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Commit 9257b4a2 ('iommu/iova: introduce per-cpu caching to iova allocation')
introduced per-CPU IOVA caches to massively improve scalability. Use them.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
[dwmw2: split out VT-d part into a separate patch]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
IOVA allocation has two problems that impede high-throughput I/O.
First, it can do a linear search over the allocated IOVA ranges.
Second, the rbtree spinlock that serializes IOVA allocations becomes
contended.
Address these problems by creating an API for caching allocated IOVA
ranges, so that the IOVA allocator isn't accessed frequently. This
patch adds a per-CPU cache, from which CPUs can alloc/free IOVAs
without taking the rbtree spinlock. The per-CPU caches are backed by
a global cache, to avoid invoking the (linear-time) IOVA allocator
without needing to make the per-CPU cache size excessive. This design
is based on magazines, as described in "Magazines and Vmem: Extending
the Slab Allocator to Many CPUs and Arbitrary Resources" (currently
available at https://www.usenix.org/legacy/event/usenix01/bonwick.html)
Adding caching on top of the existing rbtree allocator maintains the
property that IOVAs are densely packed in the IO virtual address space,
which is important for keeping IOMMU page table usage low.
To keep the cache size reasonable, we bound the IOVA space a CPU can
cache by 32 MiB (we cache a bounded number of IOVA ranges, and only
ranges of size <= 128 KiB). The shared global cache is bounded at
4 MiB of IOVA space.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
[dwmw2: split out VT-d part into a separate patch]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Make intel-iommu map/unmap/invalidate work with IOVA pfns instead of
pointers to "struct iova". This avoids using the iova struct from the IOVA
red-black tree and the resulting explicit find_iova() on unmap.
This patch will allow us to cache IOVAs in the next patch, in order to
avoid rbtree operations for the majority of map/unmap operations.
Note: In eliminating the find_iova() operation, we have also eliminated
the sanity check previously done in the unmap flow. Arguably, this was
overhead that is better avoided in production code, but it could be
brought back as a debug option for driver development.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, fixed to not break iova api, and reworded
the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This patch avoids taking the device_domain_lock in iommu_flush_dev_iotlb()
for domains with no dev iotlb devices.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[gvdl@google.com: fixed locking issues]
Signed-off-by: Godfrey van der Linden <gvdl@google.com>
[mad@cs.technion.ac.il: rebased and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Current unmap implementation unmaps the entire area covered by the IOVA
range, which is a power-of-2 aligned region. The corresponding map,
however, only maps those pages originally mapped by the user. This
discrepancy can lead to unmapping of already unmapped entries, which is
unneeded work.
With this patch, only mapped pages are unmapped. This is also a baseline
for a map/unmap implementation based on IOVAs and not iova structures,
which will allow caching.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Change flush_unmaps() to correctly pass iommu_flush_iotlb_psi()
dma addresses. (x86_64 mm and dma have the same size for pages
at the moment, but this usage improves consistency.)
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The IOMMU's IOTLB invalidation is a costly process. When iommu mode
is not set to "strict", it is done asynchronously. Current code
amortizes the cost of invalidating IOTLB entries by batching all the
invalidations in the system and performing a single global invalidation
instead. The code queues pending invalidations in a global queue that
is accessed under the global "async_umap_flush_lock" spinlock, which
can result is significant spinlock contention.
This patch splits this deferred queue into multiple per-cpu deferred
queues, and thus gets rid of the "async_umap_flush_lock" and its
contention. To keep existing deferred invalidation behavior, it still
invalidates the pending invalidations of all CPUs whenever a CPU
reaches its watermark or a timeout occurs.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Currently, deferred flushes' info is striped between several lists in
the flush tables. Instead, move all information about a specific flush
to a single entry in this table.
This patch does not introduce any functional change.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Remove the usage of of_parse_phandle_with_args() and replace
it by the phandle-iterator implementation so that we can
parse out all of the potentially present 128 stream-ids.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
"devid" needs to be signed for the error handling to work.
Fixes: b097d11a0f ('iommu/amd: Manage iommu_group for ACPI HID devices')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Solve IOMMU support issues with PCIe non-transparent bridges that use
Requester ID look-up tables (RID-LUT), e.g., the PEX8733.
The NTB connects devices in two independent PCI domains. Devices separated
by the NTB are not able to discover each other. A PCI packet being
forwared from one domain to another has to have its RID modified so it
appears on correct bus and completions are forwarded back to the original
domain through the NTB. The RID is translated using a preprogrammed table
(LUT) and the PCI packet propagates upstream away from the NTB. If the
destination system has IOMMU enabled, the packet will be discarded because
the new RID is unknown to the IOMMU. Adding a DMA alias for the new RID
allows IOMMU to properly recognize the packet.
Each device behind the NTB has a unique RID assigned in the RID-LUT. The
current DMA alias implementation supports only a single alias, so it's not
possible to support mutiple devices behind the NTB when IOMMU is enabled.
Enable all possible aliases on a given bus (256) that are stored in a
bitset. Alias devfn is directly translated to a bit number. The bitset is
not allocated for devices that have no need for DMA aliases.
More details can be found in the following article:
http://www.plxtech.com/files/pdf/technical/expresslane/RTC_Enabling%20MulitHostSystemDesigns.pdf
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Commit 61289cb ('iommu/amd: Remove old alias handling code')
removed the old alias handling code from the AMD IOMMU
driver because this is now handled by the IOMMU core code.
But this also removed the handling of PCI aliases, which is
not handled by the core code. This caused issues with PCI
devices that have hidden PCIe-to-PCI bridges that rewrite
the request-id.
Fix this bug by re-introducing some of the removed functions
from commit 61289cbaf6 and add a alias field
'struct iommu_dev_data'. This field carrys the return value
of the get_alias() function and uses that instead of the
amd_iommu_alias_table[] array in the code.
Fixes: 61289cbaf6 ('iommu/amd: Remove old alias handling code')
Cc: stable@vger.kernel.org # v4.4+
Tested-by: Tomasz Golinski <tomaszg@math.uwb.edu.pl>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Teach the short-descriptor format to create Device mappings when asked.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Teach the LPAE format to create Device mappings when asked.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
My static checker complains that "dma_alias" is uninitialized unless we
are dealing with a pci device. This is true but harmless. Anyway, we
can flip the condition around to silence the warning.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Since commit cd6438c5f8 ("iommu/rockchip: Reconstruct to support multi
slaves") rk_iommu_is_stall_active() always returns false because the
bitwise AND operates on the boolean flag promoted to an integer and a
value that is either zero or BIT(2).
Explicitly convert the right-hand value to a boolean so that both sides
are guaranteed to be either zero or one.
rk_iommu_is_paging_enabled() does not suffer from the same problem since
RK_MMU_STATUS_PAGING_ENABLED is BIT(0), but let's apply the same change
for consistency and to make it clear that it's correct without needing
to lookup the value.
Fixes: cd6438c5f8 ("iommu/rockchip: Reconstruct to support multi slaves")
Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The Freescale PAMU can be enabled on both 32 and 64-bit
Power chips. Commit 477ab7a19c restricted PAMU to PPC32.
PPC covers both.
Signed-off-by: Andy Fleming <afleming@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
IOMMU drivers that do not support default domains, but make
use of the the group->domain pointer can get that pointer
overwritten with NULL on device add/remove.
Make sure this can't happen by only overwriting the domain
pointer when it is NULL.
Cc: stable@vger.kernel.org # v4.4+
Fixes: 1228236de5 ('iommu: Move default domain allocation to iommu_group_get_for_dev()')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
AMD Uart DMA belongs to ACPI HID type device, and its driver
is basing on AMBA Bus, need also IOMMU support.
This patch is just to set the AMD iommu callbacks for amba bus.
Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch creates a new function for finding or creating an IOMMU
group for acpihid(ACPI Hardware ID) device.
The acpihid devices with the same devid will be put into same group and
there will have the same domain id and share the same page table.
Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Current IOMMU driver make assumption that the downstream devices are PCI.
With the newly added ACPI-HID IVHD device entry support, this is no
longer true. This patch is to add dev type check and to distinguish the
pci and acpihid device code path.
Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch is to make the call-sites of get_device_id aware of its
return value.
Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch introduces a new kernel parameter, ivrs_acpihid.
This is used to override existing ACPI-HID IVHD device entry,
or add an entry in case it is missing in the IVHD.
Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch introduces acpihid_map, which is used to store
the new IVHD device entry extracted from BIOS IVRS table.
It also provides a utility function add_acpi_hid_device(),
to add this types of devices to the map.
Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The IVRS in more recent AMD system usually contains multiple
IVHD block types (e.g. 0x10, 0x11, and 0x40) for each IOMMU.
The newer IVHD types provide more information (e.g. new features
specified in the IOMMU spec), while maintain compatibility with
the older IVHD type.
Having multiple IVHD type allows older IOMMU drivers to still function
(e.g. using the older IVHD type 0x10) while the newer IOMMU driver can use
the newer IVHD types (e.g. 0x11 and 0x40). Therefore, the IOMMU driver
should only make use of the newest IVHD type that it can support.
This patch adds new logic to determine the highest level of IVHD type
it can support, and use it throughout the to initialize the driver.
This requires adding another pass to the IVRS parsing to determine
appropriate IVHD type (see function get_highest_supported_ivhd_type())
before parsing the contents.
[Vincent: fix the build error of IVHD_DEV_ACPI_HID flag not found]
Signed-off-by: Wan Zongshun <vincent.wan@amd.com>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch modifies the existing struct ivhd_header,
which currently only support IVHD type 0x10, to add
new fields from IVHD type 11h and 40h.
It also modifies the pointer calculation to allow
support for IVHD type 11h and 40h
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The IVHD header type 11h and 40h introduce the PCSup bit in
the EFR Register Image bit fileds. This should be used to
determine the IOMMU performance support instead of relying
on the PNCounters and PNBanks.
Note also that the PNCouters and PNBanks bits in the IOMMU
attributes field of IVHD headers type 11h are incorrectly
programmed on some systems.
So, we should not rely on it to determine the performance
counter/banks size. Instead, these values should be read
from the MMIO Offset 0030h IOMMU Extended Feature Register.
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch fixes one existing alignment checkpatch check
warning of the type "Alignment should match open parenthesis"
in the OMAP IOMMU debug source file.
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The OMAP IOMMU page table needs to be aligned on a 16K boundary,
and the current code uses a BUG_ON on the alignment sanity check
in the .domain_alloc() ops implementation. Replace this with a
less severe WARN_ON and bail out gracefully.
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The iopgtable_store_entry_core() function uses a BUG() statement
for an unsupported page size entry programming. Replace this with
a less severe WARN_ON() and perform a graceful bailout on error.
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The function iopgtable_clear_entry_all() is used for clearing all
the page table entries. These entries are neither created nor
initialized during the OMAP IOMMU driver probe, and are managed
only when a client device attaches to the IOMMU. So, there is no
need to invoke this function on a driver remove.
Removing this fixes a NULL pointer dereference crash if the IOMMU
device is unbound from the driver with no client device attached
to the IOMMU device.
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
dma_pte_free_pagetable no longer depends on last level ptes
being clear, it clears them itself. Fix up the comment to
match.
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
If we do, devres prints a "invalid resource" string in the error
loglevel.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Remove new line in error logs, avoid duplicate and explicit pr_fmt.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Fixes: 0ac2491f57 ('x86, dmar: move page fault handling code to dmar.c')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Fault rates can easily overwhelm the console and make the system
unresponsive. Ratelimit to allow an opportunity for maintenance.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Fixes: 0ac2491f57 ('x86, dmar: move page fault handling code to dmar.c')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In MT8173, Normally the first 1GB PA is for the HW SRAM and Regs,
so the PA will be 33bits if the dram size is 4GB. We have a
"DRAM 4GB mode" toggle bit for this. If it's enabled, from CPU's
point of view, the dram PA will be from 0x1_00000000~0x1_ffffffff.
In short descriptor, the pagetable descriptor is always 32bit.
Mediatek extend bit9 in the lvl1 and lvl2 pgtable descriptor
as the 4GB mode.
In the 4GB mode, the bit9 must be set, then M4U help add 0x1_00000000
based on the PA in pagetable. Thus the M4U output address to EMI is
always 33bits(the input address is still 32bits).
We add a special quirk for this MTK-4GB mode. And in the standard
spec, Bit9 in the lvl1 is "IMPLEMENTATION DEFINED", while it's AP[2]
in the lvl2, therefore if this quirk is enabled, NO_PERMS is also
expected.
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
With the change to stashing just the IOVA-page-aligned remainder of the
CPU-page offset rather than the whole thing, the failure path in
__invalidate_sg() also needs tweaking to account for that in the case of
differing page sizes where the two offsets may not be equivalent.
Similarly in __finalise_sg(), lest the architecture-specific wrappers
later get the wrong address for cache maintenance on sync or unmap.
Fixes: 164afb1d85 ("iommu/dma: Use correct offset in map_sg")
Reported-by: Magnus Damm <damm+renesas@opensource.se>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Cc: stable@ver.kernel.org # v4.4+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This time with:
* Updates for the Exynos IOMMU driver to make use of default
domains and to add support for the SYSMMU v5
* New Mediatek IOMMU driver
* Support for the ARMv7 short descriptor format in the
io-pgtable code
* Default domain support for the ARM SMMU
* Couple of other small fixes all over the place
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJW7/7rAAoJECvwRC2XARrjKvgP/2sgR6lzIGksKpZRNNNoyJEp
PbFt3zxBvIPYow6rQtfMqU82FAi6psq+EVKq+M0EOeJrjFGawwWpN9H/e0ZCs5Z/
/s6DIljRFKrbty59eFsHn57Pd+302Pt0GkwnSgdgBJD7FimozyyeMJnAOs5gPjYT
jF2ajV9FYa5rIRrMsSD2KjLKgBb3xVsgUlW72NU2WwldnOB6fSsfg4ll01kbzTon
IQENT5ywk9zZFouLyrX6EvcvowHslO/sZhGe3Py9qOOHpu9roW7EE7rEGYdabn47
PGpw8O5NOeSrQNzlmhXje5tuKxkh33DV55s7vVcaOy66kWbYExJGoz1/V7Vju4n1
pok82L3N8eauMs3xqNOiQMV8UsWIXOzdMMaGypM18pCVKMaAUiz9vO9rLSmR4Z20
IYFiX0yBXhc1AXMnrRlq/xR2WjBX2L2s0VguvYoSssdmJUZ9aKYxsurF8Ylqpm+1
wymOj+gjM056DqAXcYBVg4ZPOEezRjnUe2qD8lZ4et3xOVUL3LXRi8FmacztEB97
chUSB5mur/XRy6bOVI2l1uRQaqdfErgbCey0fa9N/SWKSHKWtAfR6CYYVpoR6m0L
H/xL7yCn6jUEoadKxZyTKnX8GIN6wNcZdI+58OOMz3sjlmWs69wgdPt8Xx2RNHpm
7caf/9sTdpUeh+V2fySD
=uiAk
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
- updates for the Exynos IOMMU driver to make use of default domains
and to add support for the SYSMMU v5
- new Mediatek IOMMU driver
- support for the ARMv7 short descriptor format in the io-pgtable code
- default domain support for the ARM SMMU
- couple of other small fixes all over the place
* tag 'iommu-updates-v4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (41 commits)
iommu/ipmmu-vmsa: Add r8a7795 DT binding
iommu/mediatek: Check for NULL instead of IS_ERR()
iommu/io-pgtable-armv7s: Fix kmem_cache_alloc() flags
iommu/mediatek: Fix handling of of_count_phandle_with_args result
iommu/dma: Fix NEED_SG_DMA_LENGTH dependency
iommu/mediatek: Mark PM functions as __maybe_unused
iommu/mediatek: Select ARM_DMA_USE_IOMMU
iommu/exynos: Use proper readl/writel register interface
iommu/exynos: Pointers are nto physical addresses
dts: mt8173: Add iommu/smi nodes for mt8173
iommu/mediatek: Add mt8173 IOMMU driver
memory: mediatek: Add SMI driver
dt-bindings: mediatek: Add smi dts binding
dt-bindings: iommu: Add binding for mediatek IOMMU
iommu/ipmmu-vmsa: Use ARCH_RENESAS
iommu/exynos: Support multiple attach_device calls
iommu/exynos: Add Maintainers entry for Exynos SYSMMU driver
iommu/exynos: Add support for v5 SYSMMU
iommu/exynos: Update device tree documentation
iommu/exynos: Add support for SYSMMU controller with bogus version reg
...
Pull x86 protection key support from Ingo Molnar:
"This tree adds support for a new memory protection hardware feature
that is available in upcoming Intel CPUs: 'protection keys' (pkeys).
There's a background article at LWN.net:
https://lwn.net/Articles/643797/
The gist is that protection keys allow the encoding of
user-controllable permission masks in the pte. So instead of having a
fixed protection mask in the pte (which needs a system call to change
and works on a per page basis), the user can map a (handful of)
protection mask variants and can change the masks runtime relatively
cheaply, without having to change every single page in the affected
virtual memory range.
This allows the dynamic switching of the protection bits of large
amounts of virtual memory, via user-space instructions. It also
allows more precise control of MMU permission bits: for example the
executable bit is separate from the read bit (see more about that
below).
This tree adds the MM infrastructure and low level x86 glue needed for
that, plus it adds a high level API to make use of protection keys -
if a user-space application calls:
mmap(..., PROT_EXEC);
or
mprotect(ptr, sz, PROT_EXEC);
(note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice
this special case, and will set a special protection key on this
memory range. It also sets the appropriate bits in the Protection
Keys User Rights (PKRU) register so that the memory becomes unreadable
and unwritable.
So using protection keys the kernel is able to implement 'true'
PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies
PROT_READ as well. Unreadable executable mappings have security
advantages: they cannot be read via information leaks to figure out
ASLR details, nor can they be scanned for ROP gadgets - and they
cannot be used by exploits for data purposes either.
We know about no user-space code that relies on pure PROT_EXEC
mappings today, but binary loaders could start making use of this new
feature to map binaries and libraries in a more secure fashion.
There is other pending pkeys work that offers more high level system
call APIs to manage protection keys - but those are not part of this
pull request.
Right now there's a Kconfig that controls this feature
(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled
(like most x86 CPU feature enablement code that has no runtime
overhead), but it's not user-configurable at the moment. If there's
any serious problem with this then we can make it configurable and/or
flip the default"
* 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
x86/mm/pkeys: Fix mismerge of protection keys CPUID bits
mm/pkeys: Fix siginfo ABI breakage caused by new u64 field
x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA
mm/core, x86/mm/pkeys: Add execute-only protection keys support
x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags
x86/mm/pkeys: Allow kernel to modify user pkey rights register
x86/fpu: Allow setting of XSAVE state
x86/mm: Factor out LDT init from context init
mm/core, x86/mm/pkeys: Add arch_validate_pkey()
mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits()
x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU
x86/mm/pkeys: Add Kconfig prompt to existing config option
x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps
x86/mm/pkeys: Dump PKRU with other kernel registers
mm/core, x86/mm/pkeys: Differentiate instruction fetches
x86/mm/pkeys: Optimize fault handling in access_error()
mm/core: Do not enforce PKEY permissions on remote mm access
um, pkeys: Add UML arch_*_access_permitted() methods
mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys
x86/mm/gup: Simplify get_user_pages() PTE bit handling
...
Whilst the default SLUB allocator happily just merges the original
allocation flags from kmem_cache_create() with those passed through
kmem_cache_alloc(), there is a code path in the SLAB allocator which
will aggressively BUG_ON() if the cache was created with SLAB_CACHE_DMA
but GFP_DMA is not specified for an allocation:
kernel BUG at mm/slab.c:2536!
Internal error: Oops - BUG: 0 [#1] SMP ARM
Modules linked in:[ 1.299311] Modules linked in:
CPU: 1 PID: 1 Comm: swapper/0 Not tainted
4.5.0-rc6-koelsch-05892-ge7e45ad53ab6795e #2270
Hardware name: Generic R8A7791 (Flattened Device Tree)
task: ef422040 ti: ef442000 task.ti: ef442000
PC is at cache_alloc_refill+0x2a0/0x530
LR is at _raw_spin_unlock+0x8/0xc
...
[<c02c6928>] (cache_alloc_refill) from [<c02c6630>] (kmem_cache_alloc+0x7c/0xd4)
[<c02c6630>] (kmem_cache_alloc) from [<c04444bc>]
(__arm_v7s_alloc_table+0x5c/0x278)
[<c04444bc>] (__arm_v7s_alloc_table) from [<c0444e1c>]
(__arm_v7s_map.constprop.6+0x68/0x25c)
[<c0444e1c>] (__arm_v7s_map.constprop.6) from [<c0445044>]
(arm_v7s_map+0x34/0xa4)
[<c0445044>] (arm_v7s_map) from [<c0c18ee4>] (arm_v7s_do_selftests+0x140/0x418)
[<c0c18ee4>] (arm_v7s_do_selftests) from [<c0201760>]
(do_one_initcall+0x100/0x1b4)
[<c0201760>] (do_one_initcall) from [<c0c00d4c>]
(kernel_init_freeable+0x120/0x1e8)
[<c0c00d4c>] (kernel_init_freeable) from [<c067a364>] (kernel_init+0x8/0xec)
[<c067a364>] (kernel_init) from [<c0206b68>] (ret_from_fork+0x14/0x2c)
Code: 1a000003 e7f001f2 e3130001 0a000000 (e7f001f2)
---[ end trace 190f6f6b84352efd ]---
Keep the peace by adding GFP_DMA when allocating a table.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The function can return negative value so it should be assigned to signed
variable. The patch changes also type of related i variable to make code
more compact and coherent.
The problem has been detected using patch
scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In the PCI hotplug path of the Intel IOMMU driver, replace
the usage of the BUS_NOTIFY_DEL_DEVICE notifier, which is
executed before the driver is unbound from the device, with
BUS_NOTIFY_REMOVED_DEVICE, which runs after that.
This fixes a kernel BUG being triggered in the VT-d code
when the device driver tries to unmap DMA buffers and the
VT-d driver already destroyed all mappings.
Reported-by: Stefani Seibold <stefani@seibold.net>
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Detach the device that is about to be removed from its
domain (if it has one) to clear any related state like DTE
entry and device's ATS state.
Reported-by: Kelly Zytaruk <Kelly.Zytaruk@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
IOMMU_DMA does indeed depend on scatterlists having a DMA length, but
the NEED_SG_DMA_LENGTH symbol should be selected, not depended upon.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When CONFIG_PM is unset, we get a harmless warning for this driver:
drivers/iommu/mtk_iommu.c:665:12: error: 'mtk_iommu_suspend' defined but not used [-Werror=unused-function]
drivers/iommu/mtk_iommu.c:680:12: error: 'mtk_iommu_resume' defined but not used [-Werror=unused-function]
Marking the functions as __maybe_unused gits rid of the two functions
and lets the compiler silently drop the object code, while still
doing syntax checking on them for build-time verification.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 0df4fabe20 ("iommu/mediatek: Add mt8173 IOMMU driver")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The newly added Mediatek IOMMU driver uses the IOMMU_DMA infrastructure,
but unlike other such drivers, it does not select 'ARM_DMA_USE_IOMMU',
which is a prerequisite, leading to a link error:
warning: (MTK_IOMMU) selects IOMMU_DMA which has unmet direct dependencies (IOMMU_SUPPORT && NEED_SG_DMA_LENGTH)
drivers/iommu/built-in.o: In function `iommu_put_dma_cookie':
mtk_iommu.c:(.text+0x11fe): undefined reference to `put_iova_domain'
drivers/iommu/built-in.o: In function `iommu_dma_init_domain':
mtk_iommu.c:(.text+0x1316): undefined reference to `init_iova_domain'
drivers/iommu/built-in.o: In function `__iommu_dma_unmap':
mtk_iommu.c:(.text+0x1380): undefined reference to `find_iova'
This adds the same select that the other drivers have. On a related
note, I wonder if we should just always select ARM_DMA_USE_IOMMU
whenever any IOMMU driver is enabled. Are there any cases where
we would enable an IOMMU but not use it?
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 0df4fabe20 ("iommu/mediatek: Add mt8173 IOMMU driver")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Drivers should use generic readl/writel calls to access HW registers, so
replace all __raw_readl/writel with generic version.
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The exynos iommu driver changed an incorrect cast from pointer
to 'unsigned int' to an equally incorrect cast to a 'phys_addr_t',
which results in an obvious compile-time error when phys_addr_t
is wider than pointers are:
drivers/iommu/exynos-iommu.c: In function 'alloc_lv2entry':
drivers/iommu/exynos-iommu.c:918:32: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
The code does not actually want the physical address (which would
involve using virt_to_phys()), but just checks the alignment,
so we can change it to use a cast to uintptr_t instead.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 740a01eee9 ("iommu/exynos: Add support for v5 SYSMMU")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The AMD Family 15h Models 30h-3Fh (Kaveri) BIOS and Kernel Developer's
Guide omitted part of the BIOS IOMMU L2 register setup specification.
Without this setup the IOMMU L2 does not fully respect write permissions
when handling an ATS translation request.
The IOMMU L2 will set PTE dirty bit when handling an ATS translation with
write permission request, even when PTE RW bit is clear. This may occur by
direct translation (which would cause a PPR) or by prefetch request from
the ATC.
This is observed in practice when the IOMMU L2 modifies a PTE which maps a
pagecache page. The ext4 filesystem driver BUGs when asked to writeback
these (non-modified) pages.
Enable ATS write permission check in the Kaveri IOMMU L2 if BIOS has not.
Signed-off-by: Jay Cornwall <jay@jcornwall.me>
Cc: <stable@vger.kernel.org> # v3.19+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The setup code for the performance counters in the AMD IOMMU driver
tests whether the counters can be written. It tests to setup a counter
for device 00:00.0, which fails on systems where this particular device
is not covered by the IOMMU.
Fix this by not relying on device 00:00.0 but only on the IOMMU being
present.
Cc: stable@vger.kernel.org
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Make use of ARCH_RENESAS in place of ARCH_SHMOBILE.
This is part of an ongoing process to migrate from ARCH_SHMOBILE to
ARCH_RENESAS the motivation for which being that RENESAS seems to be a more
appropriate name than SHMOBILE for the majority of Renesas ARM based SoCs.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
IOMMU core calls attach_device callback without detaching device from
the previous domain. This patch adds support for such unballanced calls.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch adds support for v5 of SYSMMU controller, found in Samsung
Exynos 5433 SoCs. The main difference of v5 is support for 36-bit physical
address space and some changes in register layout and core clocks hanging.
This patch also adds support for ARM64 architecture, which is used by
Exynos 5433 SoCs.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
SYSMMU on some SoCs reports bogus values in VERSION register. Force
hardware version to 1.0 for such controllers. This patch also moves reading
version register to driver's probe() function.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch simplifies the code for handling of flpdcache invalidation.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch rewrites sysmmu_init_config function to make it easier to read
and understand.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch provides a new implementation for page fault handing code. The
new implementation is ready for future extensions. No functional changes
have been made.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch changes some internal functions to have access to the state of
sysmmu device instead of having only it's registers. This will make the
code ready for future extensions.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
All clock API function can be called on NULL clock, so simplify code avoid
checking of master clock presence.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch replaces custom ARM-specific code for performing CPU cache flush
operations with generic code based on DMA-mapping. Domain managing code
is independent of particular SYSMMU device, so the first registered SYSMMU
device is used for DMA-mapping calls. This simplification works fine
because all SYSMMU controllers are in the same address space (where
DMA address equals physical address) and the DMA-mapping calls are done
mainly to flush CPU cache to make changes visible to SYSMMU controllers.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch adds support for DMA domain type. Such domain have DMA cookie
prepared and can be used by generic DMA-IOMMU glue layer.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This patch replaces custom code in add_device implementation with
iommu_group_get_for_dev() call and provides the needed callback.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Since iommu_map() code added pgsize value to the paddr, trace_map()
used wrong paddr. So, this patch adds "orig_paddr" value in the
iommu_map() to use for the trace_map().
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We try to enforce protection keys in software the same way that we
do in hardware. (See long example below).
But, we only want to do this when accessing our *own* process's
memory. If GDB set PKRU[6].AD=1 (disable access to PKEY 6), then
tried to PTRACE_POKE a target process which just happened to have
some mprotect_pkey(pkey=6) memory, we do *not* want to deny the
debugger access to that memory. PKRU is fundamentally a
thread-local structure and we do not want to enforce it on access
to _another_ thread's data.
This gets especially tricky when we have workqueues or other
delayed-work mechanisms that might run in a random process's context.
We can check that we only enforce pkeys when operating on our *own* mm,
but delayed work gets performed when a random user context is active.
We might end up with a situation where a delayed-work gup fails when
running randomly under its "own" task but succeeds when running under
another process. We want to avoid that.
To avoid that, we use the new GUP flag: FOLL_REMOTE and add a
fault flag: FAULT_FLAG_REMOTE. They indicate that we are
walking an mm which is not guranteed to be the same as
current->mm and should not be subject to protection key
enforcement.
Thanks to Jerome Glisse for pointing out this scenario.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dominik Dingel <dingel@linux.vnet.ibm.com>
Cc: Dominik Vogt <vogt@linux.vnet.ibm.com>
Cc: Eric B Munson <emunson@akamai.com>
Cc: Geliang Tang <geliangtang@163.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Low <jason.low2@hp.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Shachar Raindel <raindel@mellanox.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xie XiuQi <xiexiuqi@huawei.com>
Cc: iommu@lists.linux-foundation.org
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-s390@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Until all upstream devices have their DMA ops swizzled to point at the
SMMU, we need to treat the IOMMU_DOMAIN_DMA domain as bypass to avoid
putting devices into an empty address space when detaching from VFIO.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The ARM SMMU attach_dev implementations returns -EEXIST if the device
being attached is already attached to a domain. This doesn't play nicely
with the default domain, resulting in splats such as:
WARNING: at drivers/iommu/iommu.c:1257
Modules linked in:
CPU: 3 PID: 1939 Comm: virtio-net-tx Tainted: G S 4.5.0-rc4+ #1
Hardware name: FVP Base (DT)
task: ffffffc87a9d0000 ti: ffffffc07a278000 task.ti: ffffffc07a278000
PC is at __iommu_detach_group+0x68/0xe8
LR is at __iommu_detach_group+0x48/0xe8
This patch fixes the problem by forcefully detaching the device from
its old domain, if present, when attaching to a new one. The unused
->detach_dev callback is also removed the iommu_ops structures.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Borrow the disable_bypass parameter from the SMMUv3 driver as a handy
debugging/security feature so that unmatched stream IDs (i.e. devices
not attached to an IOMMU domain) may be configured to fault.
Rather than introduce unsightly inconsistency, or repeat the existing
unnecessary use of module_param_named(), fix that as well in passing.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
We are saving pointer to iommu DT node in of_iommu_set_ops()
hence we should increment DT node ref count.
Reviewed-by: Ray Jui <rjui@broadcom.com>
Reviewed-by: Scott Branden <sbranden@broadcom.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
With DMA mapping ops provided by the iommu-dma code, only a minimal
contribution from the IOMMU driver is needed to create a suitable
DMA-API domain for them to use. Implement this for the ARM SMMUs.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The IOMMU API has no concept of privilege so assumes all devices and
mappings are equal, and indeed most non-CPU master devices on an AMBA
interconnect make little use of the attribute bits on the bus thus by
default perform unprivileged data accesses.
Some devices, however, believe themselves more equal than others, such
as programmable DMA controllers whose 'master' thread issues bus
transactions marked as privileged instruction fetches, while the data
accesses of its channel threads (under the control of Linux, at least)
are marked as unprivileged. This poses a problem for implementing the
DMA API on an IOMMU conforming to ARM VMSAv8, under which a page that is
unprivileged-writeable is also implicitly privileged-execute-never.
Given that, there is no one set of attributes with which iommu_map() can
implement, say, dma_alloc_coherent() that will allow every possible type
of access without something running into unexecepted permission faults.
Fortunately the SMMU architecture provides a means to mitigate such
issues by overriding the incoming attributes of a transaction; make use
of that to strip the privileged/unprivileged status off incoming
transactions, leaving just the instruction/data dichotomy which the
IOMMU API does at least understand; Four states good, two states better.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
As the number of io-pgtable implementations grows beyond 1, it's time
to rationalise the quirks mechanism before things have a chance to
start getting really ugly and out-of-hand.
To that end:
- Indicate exactly which quirks each format can/does support.
- Fail creating a table if a caller wants unsupported quirks.
- Properly document where each quirk applies and why.
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In certain unmapping situations it is quite possible to end up issuing
back-to-back TLB synchronisations, which at best is a waste of time and
effort, and at worst causes some hardware to get rather confused. Whilst
the pagetable implementations, or the IOMMU drivers, or both, could keep
track of things to avoid this happening, it seems to make the most sense
to prevent code duplication and add some simple state tracking in the
common interface between the two.
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Add some simple wrappers to avoid having the guts of the TLB operations
spilled all over the page table implementations, and to provide a point
to implement extra common functionality.
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Add a nearly-complete ARMv7 short descriptor implementation, omitting
only a few legacy and CPU-centric aspects which shouldn't be necessary
for IOMMU API use anyway.
Reviewed-by: Yong Wu <yong.wu@mediatek.com>
Tested-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Minor register size and interrupt acknowledgement fixes which only showed
up in testing on newer hardware, but mostly a fix to the MM refcount
handling to prevent a recursive refcount issue when mmap() is used on
the file descriptor associated with a bound PASID.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iEYEABECAAYFAlbC/gAACgkQdwG7hYl686OY8QCfUPH+IB0zou9/MH3JNMz1ujot
I6wAoK0R4KiOFXvjNeNPy+XroZ9xKqv/
=RM+0
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20160216' of git://git.infradead.org/intel-iommu
Pull IOMMU SVM fixes from David Woodhouse:
"Minor register size and interrupt acknowledgement fixes which only
showed up in testing on newer hardware, but mostly a fix to the MM
refcount handling to prevent a recursive refcount issue when mmap() is
used on the file descriptor associated with a bound PASID"
* tag 'for-linus-20160216' of git://git.infradead.org/intel-iommu:
iommu/vt-d: Clear PPR bit to ensure we get more page request interrupts
iommu/vt-d: Fix 64-bit accesses to 32-bit DMAR_GSTS_REG
iommu/vt-d: Fix mm refcounting to hold mm_count not mm_users
According to the VT-d specification we need to clear the PPR bit in
the Page Request Status register when handling page requests, or the
hardware won't generate any more interrupts.
This wasn't actually necessary on SKL/KBL (which may well be the
subject of a hardware erratum, although it's harmless enough). But
other implementations do appear to get it right, and we only ever get
one interrupt unless we clear the PPR bit.
Reported-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@vger.kernel.org
In below commit alias DTE is set when its peripheral is
setting DTE. However there's a code bug here to wrongly
set the alias DTE, correct it in this patch.
commit e25bfb56ea
Author: Joerg Roedel <jroedel@suse.de>
Date: Tue Oct 20 17:33:38 2015 +0200
iommu/amd: Set alias DTE in do_attach/do_detach
Signed-off-by: Baoquan He <bhe@redhat.com>
Tested-by: Mark Hounschell <markh@compro.net>
Cc: stable@vger.kernel.org # v4.4
Signed-off-by: Joerg Roedel <jroedel@suse.de>
There are some IPs, such as video encoder/decoder, contains 2 slave iommus,
one for reading and the other for writing. They share the same irq and
clock with master.
This patch reconstructs to support this case by making them share the same
Page Directory, Page Tables and even the register operations.
That means every instruction to the reading MMU registers would be
duplicated to the writing MMU and vice versa.
Signed-off-by: ZhengShunQian <zhengsq@rock-chips.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Trying to build a kernel for ARC with both options CONFIG_COMPILE_TEST
and CONFIG_IOMMU_IO_PGTABLE_LPAE enabled (e.g. as a result of "make
allyesconfig") results in the following build failure:
| CC drivers/iommu/io-pgtable-arm.o
| linux/drivers/iommu/io-pgtable-arm.c: In
| function ‘__arm_lpae_alloc_pages’:
| linux/drivers/iommu/io-pgtable-arm.c:221:3:
| error: implicit declaration of function ‘dma_map_single’
| [-Werror=implicit-function-declaration]
| dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
| ^
| linux/drivers/iommu/io-pgtable-arm.c:221:42:
| error: ‘DMA_TO_DEVICE’ undeclared (first use in this function)
| dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
| ^
Since IOMMU_IO_PGTABLE_LPAE depends on DMA API, io-pgtable-arm.c should
include linux/dma-mapping.h. This fixes the reported failure.
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Lada Trimasova <ltrimas@synopsys.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The updates include:
* Small code cleanups in the AMD IOMMUv2 driver
* Scalability improvements for the DMA-API implementation of the
AMD IOMMU driver. This is just a starting point, but already
showed some good improvements in my tests.
* Removal of the unused Renesas IPMMU/IPMMUI driver
* Updates for ARM-SMMU include:
* Some fixes to get the driver working nicely on
Broadcom hardware
* A change to the io-pgtable API to indicate the unit in
which to flush (all callers converted, with Ack from
Laurent)
* Use of devm_* for allocating/freeing the SMMUv3
buffers
* Some other small fixes and improvements for other drivers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJWnkkIAAoJECvwRC2XARrjpD8QAMCYfoqiq35QLQYn7Jh/LA5E
ZotdNv6hONwCahQiNSSsaoP2f8IBpyvo7Nrgz1Fj3SEQYzAiBn6mWgXFu7WdQarD
kw1SLwUIUweF/qjgpOvGD29F7mC4XIRYfPOFbLEPvkBwx6Vm4NSJkclfMZJeNCFm
ghWxGdva+7HFyJX+gMS1flihfUzN31U5hKWRxQqHXcLbHuVOdEnL1by5ozbpcJNI
vkpbkCcWaD1uKju918akFYJultcwMGb7Wm6HwKB2EjG2aOoe2Siw61MrJ1DUreOh
J0fJubltaZwkMxFUTuNwrP9E+FH6arPtJBmvpMMz8ZQeLyQQQnBcHKDZFAgHu23Z
/wOkjoA5uG5iy2XiPWbUFJQKp4q+Dlkp8LqT1RAKvp8kVbrrsSGUXQzBIf+DE5F7
U0ghAWB70g6fREys/cvs0q7huX42Cuf3M82JKP9rksLj9ArWoT4TtkI2nvbNyKE8
KhX57xj4OSROriZV8+XmaU/W7bK6BVXr7B0aVOCvf5y7GsIYhf6zSH+0cP/TmLuQ
ZLtOr2UHFzvjZq7LHgRfEs1CYn+PhKw6kUM3rxjm/QZxiBSft7ABhhxJZKlMyE/f
jTnPS5DH2XT+UKtt8D0nfS558h0kxqwXzhICQHC30lpJLIoWj9ulZcmOXdzlY1xM
R5+4TTJ4l1tovPtQ9nUW
=bm5E
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"The updates include:
- Small code cleanups in the AMD IOMMUv2 driver
- Scalability improvements for the DMA-API implementation of the AMD
IOMMU driver. This is just a starting point, but already showed
some good improvements in my tests.
- Removal of the unused Renesas IPMMU/IPMMUI driver
- Updates for ARM-SMMU include:
* Some fixes to get the driver working nicely on Broadcom hardware
* A change to the io-pgtable API to indicate the unit in which to
flush (all callers converted, with Ack from Laurent)
* Use of devm_* for allocating/freeing the SMMUv3 buffers
- Some other small fixes and improvements for other drivers"
* tag 'iommu-updates-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (46 commits)
iommu/vt-d: Fix up error handling in alloc_iommu
iommu/vt-d: Check the return value of iommu_device_create()
iommu/amd: Remove an unneeded condition
iommu/amd: Preallocate dma_ops apertures based on dma_mask
iommu/amd: Use trylock to aquire bitmap_lock
iommu/amd: Make dma_ops_domain->next_index percpu
iommu/amd: Relax locking in dma_ops path
iommu/amd: Initialize new aperture range before making it visible
iommu/amd: Build io page-tables with cmpxchg64
iommu/amd: Allocate new aperture ranges in dma_ops_alloc_addresses
iommu/amd: Optimize dma_ops_free_addresses
iommu/amd: Remove need_flush from struct dma_ops_domain
iommu/amd: Iterate over all aperture ranges in dma_ops_area_alloc
iommu/amd: Flush iommu tlb in dma_ops_free_addresses
iommu/amd: Rename dma_ops_domain->next_address to next_index
iommu/amd: Remove 'start' parameter from dma_ops_area_alloc
iommu/amd: Flush iommu tlb in dma_ops_aperture_alloc()
iommu/amd: Retry address allocation within one aperture
iommu/amd: Move aperture_range.offset to another cache-line
iommu/amd: Add dma_ops_aperture_alloc() function
...
This is a 32-bit register. Apparently harmless on real hardware, but
causing justified warnings in simulation.
Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@vger.kernel.org
Holding mm_users works OK for graphics, which was the first user of SVM
with VT-d. However, it works less well for other devices, where we actually
do a mmap() from the file descriptor to which the SVM PASID state is tied.
In this case on process exit we end up with a recursive reference count:
- The MM remains alive until the file is closed and the driver's release()
call ends up unbinding the PASID.
- The VMA corresponding to the mmap() remains intact until the MM is
destroyed.
- Thus the file isn't closed, even when exit_files() runs, because the
VMA is still holding a reference to it. And the MM remains alive…
To address this issue, we *stop* holding mm_users while the PASID is bound.
We already hold mm_count by virtue of the MMU notifier, and that can be
made to be sufficient.
It means that for a period during process exit, the fun part of mmput()
has happened and exit_mmap() has been called so the MM is basically
defunct. But the PGD still exists and the PASID is still bound to it.
During this period, we have to be very careful — exit_mmap() doesn't use
mm->mmap_sem because it doesn't expect anyone else to be touching the MM
(quite reasonably, since mm_users is zero). So we also need to fix the
fault handler to just report failure if mm_users is already zero, and to
temporarily bump mm_users while handling any faults.
Additionally, exit_mmap() calls mmu_notifier_release() *before* it tears
down the page tables, which is too early for us to flush the IOTLB for
this PASID. And __mmu_notifier_release() removes every notifier from the
list, so when exit_mmap() finally *does* tear down the mappings and
clear the page tables, we don't get notified. So we work around this by
clearing the PASID table entry in our MMU notifier release() callback.
That way, the hardware *can't* get any pages back from the page tables
before they get cleared.
Hardware designers have confirmed that the resulting 'PASID not present'
faults should be handled just as gracefully as 'page not present' faults,
the important criterion being that they don't perturb the operation for
any *other* PASID in the system.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@vger.kernel.org
Only check for error when iommu->iommu_dev has been assigned
and only assign drhd->iommu when the function can't fail
anymore.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This adds the proper check to alloc_iommu to make sure that
the call to iommu_device_create has completed successfully
and if not return the error code to the caller after freeing
up resources allocated previously.
Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When mapping a non-page-aligned scatterlist entry, we copy the original
offset to the output DMA address before aligning it to hand off to
iommu_map_sg(), then later adding the IOVA page address portion to get
the final mapped address. However, when the IOVA page size is smaller
than the CPU page size, it is the offset within the IOVA page we want,
not that within the CPU page, which can easily be larger than an IOVA
page and thus result in an incorrect final address.
Fix the bug by taking only the IOVA-aligned part of the offset as the
basis of the DMA address, not the whole thing.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
get_device_id() returns an unsigned short device id. It never fails and
it never returns a negative so we can remove this condition.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Preallocate between 4 and 8 apertures when a device gets it
dma_mask. With more apertures we reduce the lock contention
of the domain lock significantly.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Make this pointer percpu so that we start searching for new
addresses in the range we last stopped and which is has a
higher probability of being still in the cache.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This allows to build up the page-tables without holding any
locks. As a consequence it removes the need to pre-populate
dma_ops page-tables.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Don't flush the iommu tlb when we free something behind the
current next_bit pointer. Update the next_bit pointer
instead and let the flush happen on the next wraparound in
the allocation path.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The flushing of iommu tlbs is now done on a per-range basis.
So there is no need anymore for domain-wide flush tracking.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
It points to the next aperture index to allocate from. We
don't need the full address anymore because this is now
tracked in struct aperture_range.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Moving it before the pte_pages array puts in into the same
cache-line as the spin-lock and the bitmap array pointer.
This should safe a cache-miss.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
There have been present PTEs which in theory could have made
it to the IOMMU TLB. Flush the addresses out on the error
path to make sure no stale entries remain.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
If CONFIG_PHYS_ADDR_T_64BIT=n:
drivers/iommu/ipmmu-vmsa.c: In function 'ipmmu_domain_init_context':
drivers/iommu/ipmmu-vmsa.c:434:2: warning: right shift count >= width of type
ipmmu_ctx_write(domain, IMTTUBR0, ttbr >> 32);
^
As io_pgtable_cfg.arm_lpae_s1_cfg.ttbr[] is an array of u64s, assigning
it to a phys_addr_t may truncates it. Make ttbr u64 to fix this.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Doug reports that the equivalent page allocator on 32-bit ARM exhibits
particularly pathalogical behaviour under memory pressure when
fragmentation is high, where allocating a 4MB buffer takes tens of
seconds and the number of calls to alloc_pages() is over 9000![1]
We can drastically improve that situation without losing the other
benefits of high-order allocations when they would succeed, by assuming
memory pressure is relatively constant over the course of an allocation,
and not retrying allocations at orders we know to have failed before.
This way, the best-case behaviour remains unchanged, and in the worst
case we should see at most a dozen or so (MAX_ORDER - 1) failed attempts
before falling back to single pages for the remainder of the buffer.
[1]:http://lists.infradead.org/pipermail/linux-arm-kernel/2015-December/394660.html
Reported-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
dma-iommu.c was naughtily relying on an implicit transitive #include of
linux/vmalloc.h, which is apparently not present on some architectures.
Add that, plus a couple more headers for other functions which are used
similarly.
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Those are stupid and code should use static_cpu_has_safe() or
boot_cpu_has() instead. Kill the least used and unused ones.
The remaining ones need more careful inspection before a conversion can
happen. On the TODO.
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1449481182-27541-4-git-send-email-bp@alien8.de
Cc: David Sterba <dsterba@suse.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* Two similar fixes for the Intel and AMD IOMMU drivers to add
proper access checks before calling handle_mm_fault.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJWdCp7AAoJECvwRC2XARrjjAIP/0ihW2zF4R622RgY1C1Cm62j
0eb/R4UqjI3PG0KsURgDHcIm9JP5Z//dgKTOtNX9KOkHlXLcO9MMSD5chVBd4HKG
+Mgx7RM+Mr7f6ElRUa6s1GY1tcJlGf43fW5cMQ44BJIqVXlE47go4U09D86DVgXy
KgyBxQldeOrkXZvAG82WLjGgkdGALQjbDlI8ktmfYWXAvIRWNGJqWY16BwAYOWfb
9d3+1JPekSSBWHC6H+qbkDb8ueO69/Ux0HL5z2Q0zchqGjBb1gnfwLcz865KZpOB
qUwsKFSXTl+jPCrAaLYJnVqAnH4qqKaF6WKAJSIHObTSVqXKHpFHrQrlGVzOvYNn
s3216KIMsxG2nnvSgXCOFGqM/810MH2MSo8YcF5A3celrka3j2Gj08mxInrZXN7D
3p51HSwq8ePo4i5jppT5ldOBSjNV9N3wKWcjDb4OL+OfkJc/u2VbSHNQtpvTclsV
V6VSfWLDC8BCmUveMH2TrawQWkKOz0LqgqfQPX+VvSCIM7tgkrgVsTJrijPtGOs1
zid/A/cfqMdBezSVALrZfB4OVBaM2UL2LJmmLJgApYV+N55Oxmx+nxnMr0aT5KlY
crjcnVaypkq3rG1Wjpt+nTTwtllB0yXNEywQcu2edeswmaQCqsEgQRsDqi6S2/+S
c8l9JKoTrB4+vToYjXyW
=qrAB
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v4.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"Two similar fixes for the Intel and AMD IOMMU drivers to add proper
access checks before calling handle_mm_fault"
* tag 'iommu-fixes-v4.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/vt-d: Do access checks before calling handle_mm_fault()
iommu/amd: Do proper access checking before calling handle_mm_fault()
When tearing down page tables, we return early for the final level
since we know that we won't have any table pointers to follow.
Unfortunately, this also means that we forget to free the final level,
so we end up leaking memory.
Fix the issue by always freeing the current level, but just don't bother
to iterate over the ptes if we're at the final level.
Cc: <stable@vger.kernel.org>
Reported-by: Zhang Bo <zhangbo_a@xiaomi.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
It is ILLEGAL to set STE.S1STALLD to 1 if stage 1 is enabled and
either the stall or terminate models are not supported.
This patch fixes the STALLD check and ensures that we don't set STALLD
in the STE when it is not supported.
Signed-off-by: Prem Mallappa <pmallapp@broadcom.com>
[will: consistently use IDR0_STALL_MODEL_* prefix]
Signed-off-by: Will Deacon <will.deacon@arm.com>
When acknowledging global errors, the GERRORN register should be written
with the original GERROR value so that active errors are toggled.
This patch fixed the driver to write the original GERROR value to
GERRORN, instead of an active error mask.
Signed-off-by: Prem Mallappa <pmallapp@broadcom.com>
[will: reworked use of active bits and fixed commit log]
Signed-off-by: Will Deacon <will.deacon@arm.com>
There is no need to keep a useful accessor for a public structure hidden
away in a private implementation. Move it out alongside the structure
definition so that other implementations may reuse it.
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When invalidating an IOVA range potentially spanning multiple pages,
such as when removing an entire intermediate-level table, we currently
only issue an invalidation for the first IOVA of that range. Since the
architecture specifies that address-based TLB maintenance operations
target a single entry, an SMMU could feasibly retain live entries for
subsequent pages within that unmapped range, which is not good.
Make sure we hit every possible entry by iterating over the whole range
at the granularity provided by the pagetable implementation.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: added missing semicolons...]
Signed-off-by: Will Deacon <will.deacon@arm.com>
IOMMU hardware with range-based TLB maintenance commands can work
happily with the iova and size arguments passed via the tlb_add_flush
callback, but for IOMMUs which require separate commands per entry in
the range, it is not straightforward to infer the necessary granularity
when it comes to issuing the actual commands.
Add an additional argument indicating the granularity for the benefit
of drivers needing to know, and update the ARM LPAE code appropriately
(for non-leaf invalidations we currently just assume the worst-case
page granularity rather than walking the table to check).
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In the case of corrupted page tables, or when an invalid size is given,
__arm_lpae_unmap() may recurse beyond the maximum number of levels.
Unfortunately the detection of this error condition only happens *after*
calculating a nonsense offset from something which might not be a valid
table pointer and dereferencing that to see if it is a valid PTE.
Make things a little more robust by checking the level is valid before
doing anything which depends on it being so.
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Whilst the architecture only defines a few of the possible CERROR values,
we should handle unknown values gracefully rather than go out of bounds
trying to print out an error description.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The basic flow for add a device:
arm_smmu_add_device
|->iommu_group_get_for_dev
|->iommu_group_get
return group; (1)
|->ops->device_group : Init/increase reference count to/by 1.
|->iommu_group_add_device : Increase reference count by 1.
return group (2)
|->return 0;
Since we are adding one device, the flow is (2) and the group reference
count will be increased by 2. So, we need to add iommu_group_put at the
end of arm_smmu_add_device to decrease the count by 1.
Also take the failure path into consideration when fail to add a device.
Signed-off-by: Peng Fan <van.freenix@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When we initialise a bypass STE, we memset the structure to zero and
set the Valid and Config fields to indicate that the stream should
bypass the SMMU. Unfortunately, this results in an SHCFG field of 0
which means that the shareability of any incoming transactions is
overridden with non-shareable, leading to potential coherence problems
down the line.
This patch fixes the issue by initialising bypass STEs to use the
incoming shareability attributes. When translation is in effect at
either stage 1 or stage 2, the shareability is determined by the
page tables.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The free_io_pgtable_ops() function tests whether its argument is NULL
and then returns immediately. Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The ARM SMMUv3 driver uses dma_{alloc,free}_coherent to manage its
queues and configuration data structures.
This patch converts the driver to the managed (dmam_*) API, so that
resources are freed automatically on device teardown. This greatly
simplifies the failure paths and allows us to remove a bunch of
handcrafted freeing code.
Signed-off-by: Will Deacon <will.deacon@arm.com>
PRIQ_0_OF has been removed from the SMMUv3 architecture, so remove its
corresponding (and unused) #define from the driver.
Signed-off-by: Will Deacon <will.deacon@arm.com>
commit db0fa0cb01 "scatterlist: use sg_phys()" did replacements of
the form:
phys_addr_t phys = page_to_phys(sg_page(s));
phys_addr_t phys = sg_phys(s) & PAGE_MASK;
However, this breaks platforms where sizeof(phys_addr_t) >
sizeof(unsigned long). Revert for 4.3 and 4.4 to make room for a
combined helper in 4.5.
Cc: <stable@vger.kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes: db0fa0cb01 ("scatterlist: use sg_phys()")
Suggested-by: Joerg Roedel <joro@8bytes.org>
Reported-by: Vitaly Lavrov <vel21ripn@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
As of commit 44d88c754e ("ARM: shmobile: Remove legacy SoC code
for R-Mobile A1"), the Renesas IPMMU/IPMMUI driver is no longer used.
In theory it could still be used on SH-Mobile AG5 and R-Mobile A1 SoCs,
but that requires adding DT support to the driver, which is not
planned.
Remove the driver, it can be resurrected from git history when needed.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
These new helpers simplify implementing multi-driver modules and
properly handle failure to register one driver by unregistering all
previously registered drivers.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This mmu_notifier_ops structure is never modified, so declare it as
const, like the other mmu_notifier_ops structures.
Done with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Get rid of the three error paths that look the same and move
error handling to a single place.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-By: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Instead of just checking for a write access, calculate the
flags that are passed to handle_mm_fault() more precisly and
use the pre-defined macros.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-By: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Not doing so is a bug and might trigger a BUG_ON in
handle_mm_fault(). So add the proper permission checks
before calling into mm code.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-By: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The handle_mm_fault function expects the caller to do the
access checks. Not doing so and calling the function with
wrong permissions is a bug (catched by a BUG_ON).
So fix this bug by adding proper access checking to the io
page-fault code in the AMD IOMMUv2 driver.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-By: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Fix these warnings:
CHECK drivers/iommu/s390-iommu.c
drivers/iommu/s390-iommu.c:52:21: warning: symbol 's390_domain_alloc' was not declared. Should it be static?
drivers/iommu/s390-iommu.c:76:6: warning: symbol 's390_domain_free' was not declared. Should it be static?
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>