Up until now we only supported bridged interfaces. Packets ingressing
through the switch ports were either classified to FIDs (in the case of
the VLAN-aware bridge) or vFIDs (in the case of VLAN-unaware bridges).
The packets were then forwarded according to the FDB. Routing was done
entirely in slowpath, by splitting the vFID range in two and using the
lower 0.5K vFIDs as dummy bridges that simply flooded all incoming
traffic to the CPU.
Instead, allow packets to be routed in the device by creating router
interfaces (RIFs) that will direct them to the router block.
Specifically, the RIFs introduced here are Sub-port RIFs used for VLAN
devices and port netdevs. Packets ingressing from the {Port / LAG ID, VID}
with which the RIF was programmed with will be assigned to a special
kind of FIDs called rFIDs and from there directed to the router.
Create a RIF whenever the first IPv4 address was programmed on a VLAN /
LAG / port netdev. Destroy it upon removal of the last IPv4 address.
Receive these notifications by registering for the 'inetaddr'
notification chain. A non-zero (10) priority is used for the
notification block, so that RIFs will be created before routes are
offloaded via FIB code.
Note that another trigger for RIF destruction are CHANGEUPPER
notifications causing the underlying FID's reference count to go down to
zero. This can happen, for example, when a VLAN netdev with an IP address
is put under bridge. While this configuration doesn't make sense it does
cause the device and the kernel to get out of sync when the netdev is
unbridged. We intend to address this in the future, hopefully in current
cycle.
Finally, Remove the lower 0.5K vFIDs, as they are deprecated by the RIFs,
which will trap packets according to their DIP.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We are just about to introduce router interfaces (RIFs), but before that
we need to be able update the device with the correct RIF attributes
whenever they change for the netdev the RIF is backing. Two such
attributes are MTU and MAC.
The MAC is used both to set the source MAC of packets egressing from the
RIF and also to program an FDB rule that will direct packets to the
router block.
Use the existing netdevice notification block and respond to CHANGEADDR
and CHANGEMTU accordingly. Store both attributes in the RIF struct
in case we need to revert to old attributes following a failed update.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add functions that iterate over lower devices and find port device.
As a dependency add netdev_for_each_all_lower_dev and
netdev_for_each_all_lower_dev_rcu macro with
netdev_all_lower_get_next and netdev_all_lower_get_next_rcu shelpers.
Also, add functions to return mlxsw struct according to lower device
found and mlxsw_port struct with a reference to lower device.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement ipv4 FIB entries addition and removal. Initially, we support
local and broadcast routes using "ip2me" trap action.
Also, unicast routes without nexthop are supported using "local" action.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Serves for adding, updating and removing fib entries.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Virtual router is a construct used inside HW. In this implementation
we map kernel tables to virtual routers one to one. Introduce management
logic to create virtual routers when needed and destroy in case they are
no longer in use. According to that, call into LPM tree management.
Each virtual router is always bound to one LPM tree.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce basic LPM tree management allowing to share the trees in
between tables if the used prefixes in the tables are the same.
Build the tree structure according to the used prefixes. Although it is
not optimal for many use cases, this initial implementation does only
simple linear left-tree. More advanced structures will be introduced
later on, possibly including mechanisms to change trees on the fly.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This register is used to bind virtual router and protocol to an
allocated LPM tree.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Serves to build LPM tree structure.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Register serves for allocation and deallocation of LPM search tree.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shadow FIB is needed in order to hold additional information for FIB
entries and keep track of used prefixes. That is needed for the LPM tree
construction to be introduced later on in this set.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Rothwell reports a build warnings(powerpc ppc64_defconfig)
drivers/net/tun.c: In function 'tun_do_read.part.5':
/home/sfr/next/next/drivers/net/tun.c:1491:6: warning: 'err' may be
used uninitialized in this function [-Wmaybe-uninitialized]
int err;
This is because tun_ring_recv() may return an uninitialized err, fix this.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Raghu Vatsavayi says:
====================
liquidio updates and bug fixes
Following V2 patchset contains updates and bug fixes for
liquidio driver. This patchset also has changes that you
suggested in vxlan code. Please apply the patches in following
order as some of the patches depend on earlier patches in the
series.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes response header to be able to communicate
with new firmware interface.
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes redundant file includes and conditions.
Provides some meaningful comments and code alignment.
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes redudant droq num validation.
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch limits the MTU between 68 bytes and 16000 bytes.
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the issue of proper freeing of queue
memory resources during free device. It also has fix for
correct pcie error reporting.
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the dependency of number of iq/oq's on
number of cpus.
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates the delay constant for softcommands in
accrodance with new requirements.
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch tries to protect against bh preemption with
sc_buf_pool. It also modifies the syncronization primitives
during input queue processing.
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch has minor replacements of ACCESS_ONCE macros with
WRITE_ONCE and replacement of BUG_ON with polite version WARN_ON.
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for Vxaln offloads in liquidio driver.
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If skb_clear_hash() was invoked due to mangling of relevant headers and
BPF program needs skb->hash later on, we can add a helper to trigger hash
recalculation via bpf_get_hash_recalc().
The helper will return the newly retrieved hash directly, but later access
can also be done via skb context again through skb->hash directly (inline)
without needing to call the helper once more.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds samples for pktgen to use with new mode to inject pkts into
the qdisc layer. This also doubles as nice test cases to test any
patches against qdisc layer.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add another xmit_mode to pktgen to allow testing xmit functionality
of qdiscs. The new mode "queue_xmit" injects packets at
__dev_queue_xmit() so that qdisc is called.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Silent a few smatch warnings about indentation.
This include the removal of a 'return' statement in 'resource_tracker.c'.
This 'return' will still be performed when breaking out of the
corresponding 'switch' block.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extremely useful for setting packet type to host so i dont
have to modify the dst mac address using pedit (which requires
that i know the mac address)
Example usage:
tc filter add dev eth0 parent ffff: protocol ip pref 9 u32 \
match ip src 5.5.5.5/32 \
flowid 1:5 action skbedit ptype host
This will tag all packets incoming from 5.5.5.5 with type
PACKET_HOST
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 8067302973 ("net-next: mediatek: add support for IRQ grouping")
failed to properly update the irq handling inside mtk_poll_controller()
causing compile errors if netconsole was enabled. Fix this by updating
the code to use the new separated irq handler function for RX.
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko says:
====================
mlxsw: Lay the groundwork for the introduction of router interfaces
This is first patchset on a way to introduce ipv4 routing offload support
in mlxsw driver. Does preparations before router interfaces will
be introduced in mlxsw.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
ip2me:
To instruct HW to send trapped ip2me traffic to kernel, we have to add
this trap. Selection ip2me traffic is introduced later on in this set.
ARPs:
We are going to stop flooding to CPU port when netdev isn't bridged and
only get packets destined to the netdev's IP address and certain control
packets.
Add traps for ARP request (broadcast) and response (unicast) in order to
get these to the CPU and resolve neighbours.
host miss:
If a packet is routed through a directly connected route and its
destination IP is not in the device's neighbour table, then we need to
trap it to CPU. This will cause the host to resolve the MAC of the
neighbour, which will be eventually programmed to the device's table.
router ingress:
In order to trap packets in router part.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When removing packet traps we should use action 'discard' instead of
'forward', as some trap IDs we'll add cannot be configured with the
later. However, result is the same, as packets are not trapped to the
CPU.
In the future we will be able to reverse the operation properly by
detaching the trap group from the CPU.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the Router Interface Table Register (RITR), which allows us to
create and configure router interfaces (RIFs).
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Incoming packets are directed to the router when they match an FDB
entry with action forward to IP router.
Add this action, which was mistakenly named "TRAP".
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When enabling the router in the device we will represent L3 netdevs
using router interfaces (RIFs). These will be specified whenever
programming routes or neighbours on the netdev.
Introduce the basic RIF infrastructure which allows one to lookup a RIF
by its netdev. Later patches in the series will extend this, but the
basic routines are needed now in order to direct traffic to CPU.
Pointers to the RIF structs are stored in an array indexed by the RIF's
number. This will allow us to efficiently update the kernel's neighbour
table when regularly dumping the device's table.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create a skeleton router file and do basic HW initialization of router.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>