Tegra124 was accidentally left out when the number of TLB lines was
parameterized in commit 11cec15bf3fb ("iommu/tegra-smmu: Parameterize
number of TLB lines"). Fortunately this doesn't cause any noticeable
regressions upstream, presumably because there aren't any use-cases
that exercise enough pressure on the SMMU. But it is a regression
nonetheless, so let's fix it.
Fixes: 11cec15bf3fb ("iommu/tegra-smmu: Parameterize number of TLB lines")
Signed-off-by: Vince Hsu <vince.h@nvidia.com>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
[treding@nvidia.com: extract from unrelated patch]
Signed-off-by: Thierry Reding <treding@nvidia.com>
This time the IOMMU updates are mostly cleanups or fixes. No big new
features or drivers this time. In particular the changes include:
* Bigger cleanup of the Domain<->IOMMU data structures and the
code that manages them in the Intel VT-d driver. This makes
the code easier to understand and maintain, and also easier to
keep the data structures in sync. It is also a preparation
step to make use of default domains from the IOMMU core in the
Intel VT-d driver.
* Fixes for a couple of DMA-API misuses in ARM IOMMU drivers,
namely in the ARM and Tegra SMMU drivers.
* Fix for a potential buffer overflow in the OMAP iommu driver's
debug code
* A couple of smaller fixes and cleanups in various drivers
* One small new feature: Report domain-id usage in the Intel
VT-d driver to easier detect bugs where these are leaked.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJV7sCEAAoJECvwRC2XARrjz3YP/Au4IIfqykfPvmI0cmPhVnAV
Q72tltwkbK2u2iP+pHheveaMngJtAshsZrnhBon4KJRIt/KTLZQvsFplHDaRhPfY
yw3LIxhC5kLG/S6irY9Ozb0+uTMdQ3BU2uS23pyoFVfCz+RngBrAwDBcTKqZDCDG
8dNd+T21XlzxuyeGr58h9upz2VFtq6feoGFhLU5PNxTlf4JWZe77D7NlbSvx6Nwy
7Ai8dVRgpV9ciUP7w8FXrCUvbMZQDIoTMiWGNSlogVMgA0dllGES91UZYhWf3pil
abuX6DeFul/cOhEOnH2xa+j5zz2O/upe9stU4wAFw6IhPiAELTHc2NKlWAhwb0SY
bpDRf7dgLnUfqpmZLpWjTwN4jllc0qS2MIHj+eUu0uhdFi4Z0BuH2wSCdbR7xkqk
u5u0Jq7hDNKs5FmQTSsWSiAdjakMsRjIN7jMrBbOeZnBSmUnLx74KGPLTb63ncR3
WIOi4Iyu+LSXBIvZDiLu3lIIh7Atzd+y7IDnb8KXdyqfy+h53OZZOJNbP/qTWHgT
ZUdm/qrqjIQpTQfleOEadC7vY/y3fR5sBtOQHUamfntni3oYCc4AMRlNdf3eV9lb
Tyss6F699mU7d/vennTaIToBgVwaXdLYtmvGWjnoT/kqOMclyDf3cIUtZGtp2rJR
ddmzDA3vBUC5pGj8Hd8R
=yoGE
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates for from Joerg Roedel:
"This time the IOMMU updates are mostly cleanups or fixes. No big new
features or drivers this time. In particular the changes include:
- Bigger cleanup of the Domain<->IOMMU data structures and the code
that manages them in the Intel VT-d driver. This makes the code
easier to understand and maintain, and also easier to keep the data
structures in sync. It is also a preparation step to make use of
default domains from the IOMMU core in the Intel VT-d driver.
- Fixes for a couple of DMA-API misuses in ARM IOMMU drivers, namely
in the ARM and Tegra SMMU drivers.
- Fix for a potential buffer overflow in the OMAP iommu driver's
debug code
- A couple of smaller fixes and cleanups in various drivers
- One small new feature: Report domain-id usage in the Intel VT-d
driver to easier detect bugs where these are leaked"
* tag 'iommu-updates-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (83 commits)
iommu/vt-d: Really use upper context table when necessary
x86/vt-d: Fix documentation of DRHD
iommu/fsl: Really fix init section(s) content
iommu/io-pgtable-arm: Unmap and free table when overwriting with block
iommu/io-pgtable-arm: Move init-fn declarations to io-pgtable.h
iommu/msm: Use BUG_ON instead of if () BUG()
iommu/vt-d: Access iomem correctly
iommu/vt-d: Make two functions static
iommu/vt-d: Use BUG_ON instead of if () BUG()
iommu/vt-d: Return false instead of 0 in irq_remapping_cap()
iommu/amd: Use BUG_ON instead of if () BUG()
iommu/amd: Make a symbol static
iommu/amd: Simplify allocation in irq_remapping_alloc()
iommu/tegra-smmu: Parameterize number of TLB lines
iommu/tegra-smmu: Factor out tegra_smmu_set_pde()
iommu/tegra-smmu: Extract tegra_smmu_pte_get_use()
iommu/tegra-smmu: Use __GFP_ZERO to allocate zeroed pages
iommu/tegra-smmu: Remove PageReserved manipulation
iommu/tegra-smmu: Convert to use DMA API
iommu/tegra-smmu: smmu_flush_ptc() wants device addresses
...
The number of TLB lines was increased from 16 on Tegra30 to 32 on
Tegra114 and later. Parameterize the value so that the initial default
can be set accordingly.
On Tegra30, initializing the value to 32 would effectively disable the
TLB and hence cause massive latencies for memory accesses translated
through the SMMU. This is especially noticeable for isochronuous clients
such as display, whose FIFOs would continuously underrun.
Fixes: 891846516317 ("memory: Add NVIDIA Tegra memory controller support")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Recent versions of the Tegra MC hardware extend the size of the client
ID bitfield in the MC_ERR_STATUS register by one bit. While one could
simply extend the bitfield for older hardware, that would allow data
from reserved bits into the driver code, which is generally a bad idea
on principle. So this patch instead passes in the client ID mask from
from the per-SoC MC data.
There's no MC support for T210 (yet), but when that support winds up
in the kernel, the appropriate soc->client_id_mask value for that chip
will be 0xff.
Based on an original patch by David Ung <davidu@nvidia.com>.
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Paul Walmsley <pwalmsley@nvidia.com>
Cc: Thierry Reding <treding@nvidia.com>
Cc: David Ung <davidu@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Drivers should not be using __cpuc_* functions nor outer_cache_flush()
directly. This change partly cleans up tegra-smmu.c.
The only difference between cache handling of the tegra variants is
Denver, which omits the call to outer_cache_flush(). This is due to
Denver being an ARM64 CPU, and the ARM64 architecture does not provide
this function. (This, in itself, is a good reason why these should not
be used.)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[treding@nvidia.com: fix build failure on 64-bit ARM]
Signed-off-by: Thierry Reding <treding@nvidia.com>
In order to ease testing, expose the list of supported EMC frequencies
via debugfs.
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
This file in debugfs can be used to get or set the EMC frequency.
Reading the file will return the currently set frequency in Hz, while
writing the file sets the specified frequency rounded to the next
highest frequency supported by the board.
Will be very useful when tuning memory scaling.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
[treding@nvidia.com: add "emc" debugfs directory]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Implements functionality needed to change the rate of the memory bus
clock.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The EMC driver needs to know the number of external memory devices and
also needs to update the EMEM configuration based on the new rate of the
memory bus.
To know how to update the EMEM config, looks up the values of the burst
regs in the DT, for a given timing.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
As this interrupt is just for development purposes, as the TRM says, and
the sheer amount of interrupts fired can seriously disrupt userspace
when testing the lower frequencies supported by the EMC.
From the TRM:
"There is one performance warning type interrupt: ARBITRATION_EMEM. It
fires when the MC detects that a request has been pending in the Row
Sorter long enough to hit the DEADLOCK_PREVENTION_SLACK_THRESHOLD. In
addition to true performance problems, this interrupt may fire in
situations such as clock-change where the EMC backpressures pending
traffic for long periods of time. This interrupt helps developers
identify and debug performance issues and configuration issues."
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The memory controller on Tegra132 is very similar to the one found on
Tegra124. But the Denver CPUs don't have an outer cache, so dcache
maintenance is done slightly differently.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Subsequent patches will add debugfs files that print the status of the
SWGROUPs. Add a new names field and complement the SoC tables with the
names of the individual SWGROUPs.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The memory controller on NVIDIA Tegra exposes various knobs that can be
used to tune the behaviour of the clients attached to it.
Currently this driver sets up the latency allowance registers to the HW
defaults. Eventually an API should be exported by this driver (via a
custom API or a generic subsystem) to allow clients to register latency
requirements.
This driver also registers an IOMMU (SMMU) that's implemented by the
memory controller. It is supported on Tegra30, Tegra114 and Tegra124
currently. Tegra20 has a GART instead.
The Tegra SMMU operates on memory clients and SWGROUPs. A memory client
is a unidirectional, special-purpose DMA master. A SWGROUP represents a
set of memory clients that form a logical functional unit corresponding
to a single device. Typically a device has two clients: one client for
read transactions and one client for write transactions, but there are
also devices that have only read clients, but many of them (such as the
display controllers).
Because there is no 1:1 relationship between memory clients and devices
the driver keeps a table of memory clients and the SWGROUPs that they
belong to per SoC. Note that this is an exception and due to the fact
that the SMMU is tightly integrated with the rest of the Tegra SoC. The
use of these tables is discouraged in drivers for generic IOMMU devices
such as the ARM SMMU because the same IOMMU could be used in any number
of SoCs and keeping such tables for each SoC would not scale.
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>