smp_mb() inside bnx2_tx_avail() is used twice in the normal
bnx2_start_xmit() path (see illustration below). The full memory
barrier is only necessary during race conditions with tx completion.
We can speed up the tx path by replacing smp_mb() in bnx2_tx_avail()
with a compiler barrier. The compiler barrier is to force the
compiler to fetch the tx_prod and tx_cons from memory.
In the race condition between bnx2_start_xmit() and bnx2_tx_int(),
we have the following situation:
    bnx2_start_xmit()                       bnx2_tx_int()
        if (!bnx2_tx_avail())
            BUG();

        ...

        if (!bnx2_tx_avail())
            netif_tx_stop_queue();          update_tx_index();
            smp_mb();                       smp_mb();
            if (bnx2_tx_avail())            if (netif_tx_queue_stopped() &&
                netif_tx_wake_queue();          bnx2_tx_avail())
With smp_mb() removed from bnx2_tx_avail(), we need to add smp_mb() to
bnx2_start_xmit() as shown above to properly order netif_tx_stop_queue()
and bnx2_tx_avail() to check the ring index. If it is not strictly
ordered, the tx queue can be stopped forever.
This improves performance by about 5% with 2 ports running bi-directional
64-byte packets.
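
The ordering requirement can be sketched in userspace with C11 atomics. This is only an illustration under stated assumptions: struct tx_ring, RING_SIZE and every function name below are hypothetical, the ring arithmetic is much simpler than the driver's, and atomic_thread_fence(memory_order_seq_cst) stands in for smp_mb(). The relaxed loads in tx_avail() play the role of the cheaper, barrier-free availability check.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8u                /* hypothetical ring size for this sketch */

struct tx_ring {
    atomic_uint prod;               /* advanced by the xmit path */
    atomic_uint cons;               /* advanced by the completion path */
    atomic_bool stopped;            /* stands in for the netif queue state */
};

static void ring_init(struct tx_ring *r)
{
    atomic_init(&r->prod, 0u);
    atomic_init(&r->cons, 0u);
    atomic_init(&r->stopped, false);
}

/* Cheap availability check: relaxed loads only, no full memory barrier. */
static unsigned int tx_avail(struct tx_ring *r)
{
    unsigned int prod = atomic_load_explicit(&r->prod, memory_order_relaxed);
    unsigned int cons = atomic_load_explicit(&r->cons, memory_order_relaxed);

    return RING_SIZE - (prod - cons);
}

/* xmit side: queue one descriptor and stop the queue when the ring fills. */
static bool start_xmit(struct tx_ring *r)
{
    if (tx_avail(r) == 0)
        return false;               /* the driver treats this as a bug */

    atomic_fetch_add_explicit(&r->prod, 1u, memory_order_relaxed);

    if (tx_avail(r) == 0) {
        atomic_store_explicit(&r->stopped, true, memory_order_relaxed);
        /* Full barrier: order the stop against the re-check below. */
        atomic_thread_fence(memory_order_seq_cst);
        if (tx_avail(r) > 0)
            atomic_store_explicit(&r->stopped, false, memory_order_relaxed);
    }
    return true;
}

/* Completion side: publish the new consumer index, then maybe wake. */
static void tx_int(struct tx_ring *r, unsigned int done)
{
    atomic_fetch_add_explicit(&r->cons, done, memory_order_relaxed);
    /* Full barrier: order the index update against the stopped check. */
    atomic_thread_fence(memory_order_seq_cst);
    if (atomic_load_explicit(&r->stopped, memory_order_relaxed) &&
        tx_avail(r) > 0)
        atomic_store_explicit(&r->stopped, false, memory_order_relaxed);
}
```

With both fences in place, at least one side observes the other's update, so the queue cannot stay stopped while descriptors are free.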
Reviewed-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on original patch by Breno Leitão <leitao@linux.vnet.ibm.com>.
Allocate the actual number of vectors and make use of fewer vectors
if pci_enable_msix() returns > 0. We must allocate one additional
vector for the cnic driver.
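
A minimal sketch of the retry logic, assuming the return convention described above (0 on success, a positive count when fewer vectors are available, negative on failure). fake_enable_msix(), setup_msix() and MIN_MSIX_VEC are hypothetical names for illustration and are not the driver's code.

```c
#define MIN_MSIX_VEC 2  /* assumed minimum: one ring vector plus one for cnic */

/* Hypothetical stand-in for pci_enable_msix(). */
static int fake_enable_msix(int available, int requested)
{
    if (available <= 0)
        return -1;              /* hard failure */
    if (requested > available)
        return available;       /* caller may retry with this many */
    return 0;                   /* success */
}

/*
 * Ask for one vector per ring plus one extra for cnic, and fall back to
 * the smaller count reported by the enable call. Returns the number of
 * vectors obtained, or 0 if MSI-X cannot be used.
 */
static int setup_msix(int available, int wanted_rings)
{
    int total = wanted_rings + 1;

    while (total >= MIN_MSIX_VEC) {
        int rc = fake_enable_msix(available, total);

        if (rc == 0)
            return total;
        if (rc < 0)
            return 0;
        total = rc;             /* retry with what is actually available */
    }
    return 0;
}
```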
Cc: Breno Leitão <leitao@linux.vnet.ibm.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We were using the wrong tx multicast counter instead of the rx multicast
counter.
Reported-by: Peter Snellman <peter.snellman@cinnober.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the DMA API, as the PCI equivalents will be deprecated. This change
also allows allocating with GFP_KERNEL in some places.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the core network stack is able to handle 64-bit netdevice stats
on 32-bit arches, we can provide them for bnx2, since the hardware
maintains some 64-bit counters.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These config register values will be useful when the memory registers
are returning 0xffffffff, a condition that has been reported.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add skb->rxhash support for TCP packets only because the bnx2 RSS hash
does not hash UDP ports.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Minor change to use MSI-X even if there is only one CPU. This allows
the CNIC driver to always have a dedicated MSI-X vector to handle
iSCSI events, instead of sharing the MSI vector.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This removes dma_get_ops() prefetch optimization in bnx2.
bnx2 uses dma_get_ops() to see if dma_sync_single_for_cpu() is
noop. bnx2 does prefetch if it's noop.
But dma_get_ops() isn't available on all the architectures (only the
architectures that use the dma_map_ops struct have it). Using
dma_get_ops() in drivers leads to compilation breakage on many
architectures.
This patch removes dma_get_ops() and changes bnx2 to do prefetch on
all the architectures. This adds useless prefetch on non-coherent
architectures but this is harmless. It is also unlikely to cause a
performance drop.
[ Remove now unused local variable 'pdev' -DaveM ]
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/bnx2.c: In function 'bnx2_disable_forced_2g5':
drivers/net/bnx2.c:1489: warning: 'bmcr' may be used uninitialized in this function
We fix it by checking the return values of all bnx2_read_phy() calls and
proceeding to do read-modify-write only if the read operation is successful.
The related bnx2_enable_forced_2g5() is also fixed the same way.
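
The pattern can be sketched as follows. Everything here is hypothetical (the fake PHY, the register index and the FORCE_2500 bit value); it only shows why the local variable is never used uninitialized once the read result is checked.

```c
#include <stdint.h>

#define FAKE_BMCR   0u          /* hypothetical register index */
#define FORCE_2500  0x0020u     /* hypothetical bit value */
#define NUM_REGS    32u

struct fake_phy {
    uint32_t regs[NUM_REGS];
    int fail_reads;             /* set to simulate a failing read */
};

static int phy_read(struct fake_phy *phy, unsigned int reg, uint32_t *val)
{
    if (phy->fail_reads || reg >= NUM_REGS)
        return -1;              /* *val is left untouched on failure */
    *val = phy->regs[reg];
    return 0;
}

static int phy_write(struct fake_phy *phy, unsigned int reg, uint32_t val)
{
    if (reg >= NUM_REGS)
        return -1;
    phy->regs[reg] = val;
    return 0;
}

/* Read-modify-write that proceeds only when the read succeeded. */
static int disable_forced(struct fake_phy *phy)
{
    uint32_t bmcr;              /* assigned only by a successful read */
    int err = phy_read(phy, FAKE_BMCR, &bmcr);

    if (err)
        return err;             /* never touch bmcr after a failed read */

    bmcr &= ~FORCE_2500;
    return phy_write(phy, FAKE_BMCR, bmcr);
}
```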
Reported-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The regression is caused by:
commit 4327ba435a56ada13eedf3eb332e583c7a0586a9
bnx2: Fix netpoll crash.
If ->open() and ->close() are called multiple times, the same napi structs
will be added to dev->napi_list multiple times, corrupting the dev->napi_list.
This causes free_netdev() to hang during rmmod.
We fix this by calling netif_napi_del() during ->close().
Also, bnx2_init_napi() must not be in the __devinit section since it is
called by ->open().
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on original patch from Stanislaw Gruszka <sgruszka@redhat.com>.
Using netif_carrier_off() is better than updating all the ->trans_start
on all the tx queues.
netif_carrier_off() needs to be called after bnx2_disable_int_sync()
to guarantee no race conditions with the serdes timers that can
modify the carrier state.
If the chip or phy is reset, carrier will turn back on when we get the
link interrupt. If there is no reset, we need to turn carrier back on
in bnx2_netif_start(). Again, the phy_lock prevents race conditions with
the serdes timers.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New firmware fixes a performance regression on small packets.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dump the correct MCP registers and add EMAC_RX_STATUS register during
NETDEV_WATCHDOG for debugging.
Signed-off-by: Eddie Wai <waie@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add prefetches of the skb and the next rx descriptor to speed up rx path.
Use prefetchw() for the skb [suggested by Eric Dumazet].
The rx descriptor is in skb->data which is mapped for streaming mode DMA.
Eric Dumazet pointed out that we should not prefetch the data before
dma_sync. So we prefetch only if dma_sync is no_op on the system.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
And turn on NETIF_F_GRO by default [requested by DaveM].
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bonding driver calls ndo_vlan_rx_register() while holding bond->lock.
The bnx2 driver calls bnx2_netif_stop() to stop the rx handling while
changing the vlgrp. The call also stops the cnic driver, which sleeps
while the bond->lock is held and causes the warning.
This code path only needs to stop the NAPI rx handling while we are
changing the vlgrp. Since no reset is going to occur, there is no need
to stop cnic in this case. By adding a parameter to bnx2_netif_stop()
to skip stopping cnic, we can avoid the warning.
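
A rough sketch of the idea, with hypothetical types and names (struct sketch_dev, netif_stop(), cnic_stop(), vlan_rx_register()); the restart helper is an assumption added for symmetry and is not described in this message.

```c
#include <stdbool.h>

struct sketch_dev {
    bool napi_enabled;
    bool cnic_running;
};

/* In the real driver, stopping cnic may sleep. */
static void cnic_stop(struct sketch_dev *d)
{
    d->cnic_running = false;
}

/* The extra parameter lets atomic-context callers skip the cnic stop. */
static void netif_stop(struct sketch_dev *d, bool stop_cnic)
{
    if (stop_cnic)
        cnic_stop(d);
    d->napi_enabled = false;
}

/* Hypothetical counterpart that only re-enables NAPI. */
static void netif_restart_napi(struct sketch_dev *d)
{
    d->napi_enabled = true;
}

/* Called with a lock held: only rx handling needs to pause. */
static void vlan_rx_register(struct sketch_dev *d)
{
    netif_stop(d, false);       /* no reset follows, so leave cnic alone */
    /* ... the vlan group pointer would be swapped here ... */
    netif_restart_napi(d);
}
```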
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It has been reported that under certain heavy traffic conditions in MSI-X
mode, the driver can lose an MSI-X vector causing all packets in the
associated rx/tx ring pair to be dropped. The problem is caused by
the chip dropping the write to unmask the MSI-X vector by the kernel
(when migrating the IRQ for example).
This can be prevented by increasing the GRC timeout value for these
register read and write operations.
Thanks to Dell for helping us debug this problem.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Converts the list and the core code manipulating it to be the same as uc_list.
+uses two functions for adding/removing mc addresses (normal and "global"
variant) instead of a function parameter.
+removes dev_mcast.c completely.
+exposes netdev_hw_addr_list_* macros along with __hw_addr_* functions for
manipulating lists on a sandbox (used in bonding and 80211 drivers)
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netpoll needs to call the proper handler depending on the IRQ mode
and the vector.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bnx2 driver calls netif_napi_add() for all the NAPI structs during
->probe() time but not all of them will be used if we're not in MSI-X
mode. This creates a problem for netpoll since it will poll all the
NAPI structs in the dev_list whether or not they are scheduled, resulting
in a crash when we access structure fields not initialized for that vector.
We fix it by moving the netif_napi_add() call to ->open() after the number
of IRQ vectors has been determined.
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the VPD searching code is abstracted away, the outer loop used
to detect the read-only large resource data type section is useless.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the pci_vpd_find_info_keyword() helper function to
find information field keywords within read-only and read-write large
resource data type sections.
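
A userspace sketch of such a keyword search, assuming the usual VPD information field layout: a two-character keyword, a one-byte length, then the data. The names below are hypothetical stand-ins and this is not the kernel helper itself.

```c
#include <stddef.h>
#include <stdint.h>

#define INFO_FLD_HDR_SIZE 3u    /* 2 keyword bytes + 1 length byte */

/* Length of the data that follows an information field header. */
static size_t info_field_size(const uint8_t *field)
{
    return field[2];
}

/*
 * Walk the information fields in buf[off .. off + len) and return the
 * offset of the field whose keyword matches kw, or -1 if none does.
 */
static int find_info_keyword(const uint8_t *buf, size_t off, size_t len,
                             const char *kw)
{
    size_t end = off + len;

    while (off + INFO_FLD_HDR_SIZE <= end) {
        if (buf[off] == (uint8_t)kw[0] && buf[off + 1] == (uint8_t)kw[1])
            return (int)off;
        off += INFO_FLD_HDR_SIZE + info_field_size(&buf[off]);
    }
    return -1;
}
```

The caller would then read info_field_size() at the returned offset and copy the data that starts INFO_FLD_HDR_SIZE bytes later.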
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a preprocessor constant to describe the PCI VPD
information field header size and an inline function to extract the
size of the information field itself.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the pci_vpd_find_tag() helper function to find VPD
resource data types in a buffer.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces more VPD preprocessor definitions to identify some
small and large resource data type item names. The patch then continues
to correct how the tg3 and bnx2 drivers search for the "read-only data"
large resource data type.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a preprocessor constant to describe the PCI VPD large
resource data type tag size and an inline function to extract the large
resource section size from the large resource data type tag.
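
For illustration, the tag layout can be sketched in userspace as below. It assumes the standard VPD encoding: a large resource tag is one tag byte with bit 7 set followed by a 16-bit little-endian length, and a small resource tag carries its length in the low three bits. All names are hypothetical stand-ins, not the kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define LRDT_FLAG      0x80u    /* bit 7 set: large resource data type */
#define LRDT_TAG_SIZE  3u       /* tag byte + 16-bit length */
#define SRDT_LEN_MASK  0x07u    /* small resource: length in low 3 bits */
#define SRDT_TAG_SIZE  1u

/* Section size stored in a large resource tag (little-endian). */
static size_t lrdt_size(const uint8_t *tag)
{
    return (size_t)tag[1] | ((size_t)tag[2] << 8);
}

/*
 * Walk resources in buf[off .. len) and return the offset of the first
 * one whose tag matches rdt, or -1 if it is not found.
 */
static int find_tag(const uint8_t *buf, size_t off, size_t len, uint8_t rdt)
{
    while (off < len) {
        uint8_t tag = buf[off];

        if (tag & LRDT_FLAG) {
            if (off + LRDT_TAG_SIZE > len)
                break;          /* truncated tag */
            if (tag == rdt)
                return (int)off;
            off += LRDT_TAG_SIZE + lrdt_size(&buf[off]);
        } else {
            if ((uint8_t)(tag & ~SRDT_LEN_MASK) == rdt)
                return (int)off;
            off += SRDT_TAG_SIZE + (tag & SRDT_LEN_MASK);
        }
    }
    return -1;
}
```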
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
Remove #define PFX
Use pr_<level>
Use netdev_<level>
Use netif_<level>
Remove periods from formats
Coalesce long formats
Coalesce some printks
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Increase FTQ depth to 256 to enhance performance.
- Fix RV2P context corruption on 5709 when flow control is enabled.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes the problem of dropping the carry when adding 2 32-bit values.
Switch to use array indexing for better readability.
Reported by and fix provided by Patrick Rabau.
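
The class of bug can be shown with a small sketch (hypothetical function names, not the driver's code): when two 32-bit values are added in 32-bit arithmetic the carry is lost before the result is widened, so one operand must be widened first.

```c
#include <stdint.h>

/* Buggy form: the addition wraps at 32 bits before it is widened. */
static uint64_t sum_dropping_carry(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}

/* Fixed form: widen one operand so the addition is done in 64 bits. */
static uint64_t sum_keeping_carry(uint32_t a, uint32_t b)
{
    return (uint64_t)a + b;
}
```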
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove unnecessary code that works around older versions of ethtool
that can pass down invalid advertisement speed values. This old
code prevents the user from specifying multiple advertisement values.
The new code uses simple masking to mask out invalid advertisement bits.
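
A minimal sketch of the masking approach. The bit names and values below are illustrative placeholders rather than the ethtool definitions, and sanitize_advertising() is a hypothetical name.

```c
#include <stdint.h>

/* Illustrative advertisement bits for this sketch only. */
#define ADV_10_HALF    (1u << 0)
#define ADV_10_FULL    (1u << 1)
#define ADV_100_HALF   (1u << 2)
#define ADV_100_FULL   (1u << 3)
#define ADV_1000_FULL  (1u << 5)

#define ADV_SUPPORTED  (ADV_10_HALF | ADV_10_FULL | ADV_100_HALF | \
                        ADV_100_FULL | ADV_1000_FULL)

/* Keep every valid bit the user asked for and drop anything else. */
static uint32_t sanitize_advertising(uint32_t requested)
{
    return requested & ADV_SUPPORTED;
}
```

Because the request is only masked, several speeds can be advertised at once instead of being collapsed to a single value.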
Reported-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current water marks are too high and can cause unnecessary flow
control frames.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New status blocks are allocated during MTU change so we need to
update this information for the cnic driver.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Checking the flag is more correct than checking bp->irq_nvecs. By
accident it is not a problem because we always have more than 1
vector when using MSI-X mode.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch replaces dev->mc_count in all drivers (hopefully I didn't miss
anything). Used spatch and did small tweaks and coding style changes where
suitable.
Jirka
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces three macros to work with uc list from net drivers.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
MTU changes, ring size changes, etc. cause the chip to be reset and the
statistics flushed. To keep track of the accumulated statistics, we
add code to save the whole statistics block before reset. We also
modify the macros and statistics functions to return the sum of the
saved and current counters.
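
The accumulate-before-reset idea can be sketched as follows. The layout and names (struct sketch_dev, save_stats(), get_stat()) are hypothetical, and the hardware block is simplified to an array of 32-bit counters.

```c
#include <stdint.h>
#include <string.h>

enum { RX_UCAST, TX_UCAST, NUM_STATS };

struct sketch_dev {
    uint32_t cur[NUM_STATS];    /* counters the chip clears on reset */
    uint64_t saved[NUM_STATS];  /* totals accumulated before resets */
};

/* Fold the live counters into the saved copy. */
static void save_stats(struct sketch_dev *d)
{
    for (int i = 0; i < NUM_STATS; i++)
        d->saved[i] += d->cur[i];
}

/* Simulated reset: save first, then the chip counters start from zero. */
static void reset_chip(struct sketch_dev *d)
{
    save_stats(d);
    memset(d->cur, 0, sizeof(d->cur));
}

/* Reported value is the sum of the saved and current counters. */
static uint64_t get_stat(const struct sketch_dev *d, int idx)
{
    return d->saved[idx] + d->cur[idx];
}
```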
Based on original patch by Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refine the statistics macros by passing in just the name of the
counter field. This makes it a lot easier and cleaner to add
counters saved before the last reset in the next patch.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MSI-X table size needs to be properly set before pci_enable_msix()
is called. But on certain machines, the writes are delayed and the
MSI-X table size is incorrectly read. Reading the
BNX2_PCI_MSIX_CONTROL register flushes the writes and
ensures that the MSI-X table size is set correctly before MSI-X
is enabled on the device.
This patch was originally diagnosed and authored by
Kalyan Ram Chintalapati <kalyanc@vmware.com>.
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Kalyan Ram Chintalapati <kalyanc@vmware.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The error was introduced while merging:
commit 4529819c45161e4a119134f56ef504e69420bc98
bnx2: reset_task is crashing the kernel. Fixing it.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>