aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2017-10-25IB/mlx5: Add tunneling offloads supportMaor Gottlieb4-6/+61
The device can support receive Stateless Offloads for the inner packet's fields only when the packet is processed by TIR which is enabled to support tunneling. Otherwise, the device treats the packet as an ordinary non-tunneling packet and receive offloads can be done only for the outer packet's field. In order to enable receive Stateless Offloading support for incoming tunneling traffic the TIR should be created with tunneled_offload_en. Tunneling offloads is supported only be raw ethernet QP. This patch includes: * New QP creation flag for tunneling offloads. * Reports device capabilities. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25IB/mlx5: Update tunnel offloads bitsMaor Gottlieb1-1/+3
This patch updates the mlx5_ifc with the following: - Fix tunnel_stateless_gre typo. - max_geneve_opt_len - Maximum geneve options length. - tunnel_stateless_geneve_rx - If set, receive Stateless Offloads for Geneve tunneled (inner) packets are supported. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25IB/mlx5: Support padded 128B CQE featureGuy Levi5-7/+42
In some benchmarks and some CPU architectures, writing the CQE on a full cache line size improves performance by saving memory access operations (read-modify-write) relative to partial cache line change. This patch lets the user to configure the device to pad the CQE up to 128B in case its content is less than 128B. Currently the driver supports only padding for a CQE size of 128B. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25IB/mlx5: Support 128B CQE compression featureGuy Levi3-5/+16
In commit 1cbe6fc86ccf ("IB/mlx5: Add support for CQE compressing") the concept of CQE compression was introduced and added a support for 64B CQE size. This change update the code to support 128B CQE size as well. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25IB/mlx5: Add 128B CQE compression and padding HW bitsGuy Levi1-1/+3
Adding new bits in mlx5_ifc_cmd_hca_cap to get the hardware capabilities for: - compression_128: Support 128B CQE compression - cqe_128_always: Support 128B CQE padding Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25IB/mlx5: Allow creation of a multi-packet RQNoa Osherovich4-9/+56
Allow creation of a multi-packet receive queue. In order to create a multi-packet RQ, the following fields in the mlx5_ib_rwq should be set: - log_num_strides: Log of number of strides per WQE - single_stride_log_num_of_bytes: Log of a single stride size - two_byte_shift_en: When enabled, hardware pads 2 bytes of zeros before writing the message to memory (e.g. for the IP alignment). Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25IB/mlx5: Expose multi-packet RQ capabilitiesNoa Osherovich3-0/+35
This patch reports the device's striding RQ capabilities to the user-space: - min/max_single_stride_log_num_of_bytes: Log of min/max number of bytes in a single stride. - min/max_single_wqe_log_num_of_strides: Log of min/max number of strides in a single WQE. - supported_qpts: A bit mask to know which QP types support multi- packet RQ, for now only Raw Packet QPs. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25RDMA/hns: Add modify CQ support for hip08oulijun5-0/+51
It is needed to call modify cq API for modifying cq context fields for controlling event generation moderations. This patch mainly adds it. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25RDMA/hns: Update the PD&CQE&MTT specification in hip08Wei Hu(Xavier)1-4/+4
This patch updates the PD specification to 16M for hip08. And it updates the numbers of mtt and cqe segments for the buddy. As the CQE supports hop num 1 addressing, the CQE specification is 64k. This patch updates to set the CQE specification to 64k. Signed-off-by: Shaobo Xu <xushaobo2@huawei.com> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25RDMA/hns: Update the IRRL table chunk size in hip08Wei Hu(Xavier)6-16/+21
As the increase of the IRRL specification in hip08, the IRRL table chunk size needs to be updated. This patch updates the IRRL table chunk size to 256k for hip08. Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Shaobo Xu <xushaobo2@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25RDMA/hns: Support WQE/CQE/PBL page size configurable feature in hip08Wei Hu(Xavier)5-57/+142
This patch updates to support WQE, CQE and PBL page size configurable feature, which includes base address page size and buffer page size. Signed-off-by: Shaobo Xu <xushaobo2@huawei.com> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25IB/ipoib: Change number of TX wqe to 64Erez Shitrit1-1/+1
NAPI budget is 64 packets, while maximum polling size for the send CQ is 16. Let's bring them in sync, so the NAPI budget will be reused completely. Cc: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25IB/ipoib: Use NAPI in UD/TX flowsErez Shitrit5-79/+136
Instead of explicit call to poll_cq of the tx ring, use the NAPI mechanism to handle the completions of each packet that has been sent to the HW. The next major changes were taken: * The driver init completion function in the creation of the send CQ, that function triggers the napi scheduling. * The driver uses CQ for RX for both modes UD and CM, and CQ for TX for CM and UD. Cc: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25IB/ipoib: Get rid of the tx_outstanding variable in all modesErez Shitrit3-11/+10
The first step toward using NAPI in the UD/TX flow is to separate between two flows, the NAPI and the xmit, meaning no use of shared variables between both flows. This patch takes out the tx_outstanding variable that was used in both flows and instead the driver uses the 2 cyclic ring variables: tx_head and tx_tail, tx_head used in the xmit flow and tx_tail in the NAPI flow. Cc: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25RDMA: Remove Sean's and Hal's emails from MAINTAINER fileLeon Romanovsky1-2/+0
RDMA subsystem has one active maintainer who can apply patches - Doug Ledford, but the RDMA entries in MAINTAINER file are not stating it. The following patch removes Sean's and Hal's emails from the maintainers list. It will allow clean get_maintaner.pl output which is needed for people outside of our community and various semi-automatic tools. Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18i40iw: Move cqp_cmd_head init to CQP initializationBob Sharp1-1/+1
Control QP (CQP) command backlog list is initialized at device initialization time. It is not reinitialized in the reset flow. Move the initialization to CQP creation time so the list can be initialized correctly for reset as well. Fixes: 86dbcd0f12e9 ("i40iw: add file to handle cqp calls") Signed-off-by: Bob Sharp <Robert.O.Sharp@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18i40iw: Remove UDA QP from QoS list if creation failsIvan Barrera3-3/+5
If User-space Direct Access (UDA) QP creation fails, the QP entry is not removed from QoS list. Fix this by removing QP from QoS list if create QP fails. Fixes: 0fc2dc58896f ("i40iw: Add Quality of Service support") Signed-off-by: Ivan Barrera <ivan.d.barrera@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18i40iw: Clear CQP Head/Tail during initializationChristopher Bednarz1-0/+3
Clear Control Queue Pair (CQP) Head/Tail on CQP initialization as during an adapter reset, these values are not reinitialized. Tail is cleared by writing 0 to CQP's tail register. Head is cleared by writing 0 to CQP's doorbell register. Fixes: 86dbcd0f12e9 ("i40iw: add file to handle cqp calls") Signed-off-by: Christopher Bednarz <christopher.n.bednarz@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18i40iw: Refactor queue depth calculationShiraz Saleem4-43/+67
Queue depth calculations use a mix of work requests and actual number of bytes. Consolidate all calculations using minimum WQE size to avoid confusion. Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18i40iw: Reinitialize IEQ on MTU changeShiraz Saleem5-4/+26
On a netdev MTU change event, the iWARP Exception Queue (IEQ) buffers may not be sized properly to handle the new MTU. Reinitialize the IEQ with new MTU size on MTU change event. Also, add define for the max ethernet frame size field in IEQ QP context instead of the snd_mss define which is for iWARP QPs' MSS field. Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18i40iw: Move ceq_valid to i40iw_sc_dev structureShiraz Saleem4-16/+15
Completion Event Queues are created and destroyed on a per device basis as opposed to per User-space Direct Access resource. Move ceq_valid to the correct place in i40iw_sc_dev from i40iw_puda_rsrc. Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18i40iw: Account for IPv6 header when setting MSSShiraz Saleem7-13/+15
The IPv6 header size is not subtracted from MTU when MSS is set for QPs. Save MTU opposed to MSS in the vsi struct during initialization and calculate the MSS based on IPv4 vs IPv6 connection. Fixes: f27b4746f378 ("i40iw: add connection management code") Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18i40iw: Remove unused structuresMustafa Ismail1-18/+0
Some structures for post SQ operation are not used. Remove the following: i40iw_post_send_w_inv i40iw_post_inline_send_w_inv send_w_sol send_w_inv send_w_sol_inv inline_send_w_sol inline_send_w_inv inline_send_w_sol_inv Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18i40iw: Move exception_lan_queue to VSI structureMustafa Ismail3-8/+7
Consolidate exception_lan_queue under VSI structure where it belongs. Remove it from device and QP structures. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18i40iw: Remove unused static_rsrc from i40iw_create_qp_infoMustafa Ismail3-7/+0
The field static_rsrc in i40iw_create_qp_info is unused and not needed. Remove it. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18i40iw: Ignore AE source field in AEQE for some AEsMustafa Ismail2-0/+50
The AE source field in Asynchronous Event Queue Entry (AEQE) is not set by the hardware for some AEs, but the code assumes it does. This results in incorrect processing of some AEs. Fix this by setting ae_src to I40IW_AE_SOURCE_RSVD for those AEs. Fixes: 86dbcd0f12e9 ("i40iw: add file to handle cqp calls") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18i40iw: Cleanup AE processingMustafa Ismail3-14/+3
Remove unimplemented Asynchronous Events (AE) and correct names. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18Merge branch 'timer_setup' into for-nextDoug Ledford43-192/+158
Conflicts: drivers/infiniband/hw/cxgb4/cm.c drivers/infiniband/hw/qib/qib_driver.c drivers/infiniband/hw/qib/qib_mad.c There were minor fixups needed in these files. Just minor context diffs due to patches from independent sources touching the same basic area. Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18Merge branch 'for-next-early' into for-nextDoug Ledford168-2878/+10295
The early for-next branch was based on v4.14-rc2, while the shared pull request I got from Mellanox used a v4.14-rc4 base. I'm making the branch that was the shared Mellanox pull request the new for-next branch and merging the early for-next branch into it. Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/mlx5: Use ARRAY_SIZEJérémy Lefaure1-3/+3
Using the ARRAY_SIZE macro improves the readability of the code. Found with Coccinelle with the following semantic patch: @r depends on (org || report)@ type T; T[] E; position p; @@ ( (sizeof(E)@p /sizeof(*E)) | (sizeof(E)@p /sizeof(E[...])) | (sizeof(E)@p /sizeof(T)) ) Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/core: Fix calculation of maximum RoCE MTUParav Pandit2-11/+15
The original code only took into consideration the largest header possible after the IB_BTH_BYTES. This was incorrect, as the largest possible header size is the largest possible combination of headers we might run into. The new code accounts for all possible headers in the largest possible combination and subtracts that from the MTU to make sure that all packets will fit on the wire. Link: https://www.spinics.net/lists/linux-rdma/msg54558.html Fixes: 3c86aa70bf67 ("RDMA/cm: Add RDMA CM support for IBoE devices") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reported-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/core: Fix use workqueue without WQ_MEM_RECLAIMParav Pandit1-1/+1
The IB/core provides address resolution service and invokes callback handler when address resolve request completes of requester in worker thread context. Such caller might allocate or free memory in callback handler depending on the completion status to make further progress or to terminate a connection. Most ULPs resolve route which involves allocating route entry and path record elements in callback event handler. It has been noticed that WQ_MEM_RECLAIM flag should not be used for workers that tend to allocate memory in this [1] thread discussion. In order to mitigate this situation, WQ_MEM_RECLAIM flag was dropped for other such WQs in this [2] patch. Similar problem might arise with address resolution path, though its not yet noticed. The ib_addr workqueue is not memory reclaim path due to its nature of invoking callback that might allocate memory or don't free any memory under memory pressure. [1] https://www.spinics.net/lists/linux-rdma/msg53239.html [2] https://www.spinics.net/lists/linux-rdma/msg53416.html Fixes: f54816261c2b ("IB/addr: Remove deprecated create_singlethread_workqueue") Fixes: 5fff41e1f89d ("IB/core: Fix race condition in resolving IP to MAC") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/core: Fix unable to change lifespan entry for hw_countersParav Pandit1-1/+15
This patch fixes the case where 'lifespan' entry of the hw_counters is not writable. Currently write callback is not exposed for for the hw_counters sysfs operation. Due to this, modifying lifespan value results into permission denied error in below example. echo 10 > /sys/class/infiniband/mlx5_0/ports/1/hw_counters/lifespan -bash: /sys/class/infiniband/mlx5_0/ports/1/hw_counters/lifespan: Permission denied This patch adds the hook to modify any attribute which implements store() operation. Fixes: b40f4757daa1 ("IB/core: Make device counter infrastructure dynamic") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB: Let ib_core resolve destination mac addressParav Pandit7-50/+9
Since IB/core resolves the destination mac address for user and kernel consumers, avoid resolving in multiple provider drivers. Only ib_core resolves DMAC now, therefore resolve_eth_dmac is removed as exported symbol. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/core: Introduce and use rdma_create_user_ahParav Pandit4-12/+55
Introduce rdma_create_user_ah API which allows passing udata to provider driver and additionally which resolves DMAC for RoCE. ib_resolve_eth_dmac() resolves destination mac address for unicast, multicast, link local ipv4 mapped ipv6 and ipv6 destination gid entry. This allows all RoCE provider drivers to avoid duplicating such code. Such change brings consistency where IB core always resolves dmac and pass it to RoCE provider drivers for user and kernel consumers, with this ah_attr->roce.dmac is always an input field for provider drivers. This uniformity avoids exporting ib_resolve_eth_dmac symbol to providers or other modules. Therefore its removed as exported symbol at later in the patch series. Now uverbs and umad both makes use of rdma_create_user_ah API which fixes the issue where umad has invalid DMAC for address. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18RDMA/cxgb4: Convert timers to use timer_setup()Kees Cook3-10/+5
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Also removes an unused timer and drops a redundant initialization. Cc: Steve Wise <swise@chelsio.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: linux-rdma@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18RDMA/i40iw: Convert timers to use timer_setup() (part 2)Kees Cook2-8/+6
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. This includes the remaining timers missed in the earlier i40iw patch. Cc: Faisal Latif <faisal.latif@intel.com> Cc: Shiraz Saleem <shiraz.saleem@intel.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: linux-rdma@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18RDMA/cxgb3: Convert timers to use timer_setup()Kees Cook3-9/+5
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Also removes an unused timer. Cc: Steve Wise <swise@chelsio.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: linux-rdma@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/hfi1: Convert timers to use timer_setup()Kees Cook8-32/+26
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Switches test of .data field to .function, since .data will be going away. Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: linux-rdma@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/rdmavt: Convert timers to use timer_setup()Kees Cook1-6/+4
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. setup_timer() was already being called before the open-coded init_timer() and .data assignment. These are removed as well. Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: linux-rdma@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18Merge tag 'mlx5-updates-2017-10-11' of ↵Doug Ledford24-796/+1437
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux into mlnx-shared mlx5-updates-2017-10-11: IPoIB Muli Pkey support This series provides the support for IPoIB Multi Pkey. InfiniBand Pkeys are the equivalent of Ethernet vlans. Currently IPoIB device driver supports only default Pkey and IPoIB Pkey child interfaces are not supported with IPoIB offloads mode, this series will add the support for that by allowing creating mlx5 multiple IPoIB netdevices with a non-default Pkey. mlx5 IPoIB Pkey child interface is smaller version of mlx5i IPoIB interfaces and shares most of its resources with the parent IPoIB interface, namely RX steering and ring queue resources. The only mlx5 resources a child Pkey interface will be creating are the TX rings, since they should be assigned to a specific Pkey. mlx5i Pkey netdev is implemented via new mlx5e netdev profile implemented in mlx5/core/ipoib/ipoib_vlan.c. The series starts with a refactoring of mlx5e PTP and mlx5 clock implementation to move the code to be part of mlx5 core rather than mlx5e netdevice, in order to make mlx5 clock and PTP registration part of the core to be shared with mlx5e master Ethernet netdev/IPoIB parent netdev and mlx5_ib in the near future. Add the support for attaching multiple underlay QPs for the different Pkeys in mlx5 core RX steering. Add Pkey index to rdma_netdev to add the ability to set PKEY index to lower IPoIB offload netdev. Use hash-table to map between DQPN (Destination QP number) to child netdev for the IPoIB parent netdev to forward RX packets to the corresponding child Pkey netdev, since the RX rings are shared. The reset of the series adds the ipoib child Pkey: mlx5e netdev profile, netdev nods implementation and minimal set of ethtool callbacks. Thanks, Saeed. Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/srp: Make CM timeout dependent on subnet timeoutBart Van Assche1-2/+22
For small networks it is safe to reduce the subnet timeout from its default value (18 for opensm) to 16. Make the SRP CM timeout dependent on the subnet timeout such that decreasing the subnet timeout also causes SRP failover and failback to occur faster. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/srp: Cache global rkeyBart Van Assche2-14/+16
This is a micro-optimization for the hot path. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/srp: Remove second argument of srp_destroy_qp()Bart Van Assche1-6/+6
This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/srp: Avoid that a cable pull can trigger a kernel crashBart Van Assche1-6/+19
This patch fixes the following kernel crash: general protection fault: 0000 [#1] PREEMPT SMP Workqueue: ib_mad2 timeout_sends [ib_core] Call Trace: ib_sa_path_rec_callback+0x1c4/0x1d0 [ib_core] send_handler+0xb2/0xd0 [ib_core] timeout_sends+0x14d/0x220 [ib_core] process_one_work+0x200/0x630 worker_thread+0x4e/0x3b0 kthread+0x113/0x150 Fixes: commit aef9ec39c47f ("IB: Add SCSI RDMA Protocol (SRP) initiator") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Cc: <stable@vger.kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/srpt: Change default behavior from using SRQ to using RCBart Van Assche2-35/+123
Although in the RC mode more resources are needed that mode has three advantages over SRQ: - It works with all RDMA adapters, even those that do not support SRQ. - Posting WRs and polling WCs does not trigger lock contention because only one thread at a time accesses a WR or WC queue in non-SRQ mode. - The end-to-end flow control mechanism is used. >From the IB spec: C9-150.2.1: For QPs that are not associated with an SRQ, each HCA receive queue shall generate end-to-end flow control credits. If a QP is associated with an SRQ, the HCA receive queue shall not generate end-to-end flow control credits. Add new configfs attributes that allow to configure which mode to use (/sys/kernel/config/target/srpt/$GUID/$GUID/attrib/use_srq). Note: only the attribute for port 1 is relevant on multi-port adapters. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/srpt: Cache global L_KeyBart Van Assche2-3/+6
This patch is a micro-optimization for the hot path. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/srpt: Limit the send and receive queue sizes to what the HCA supportsBart Van Assche1-4/+5
Additionally, correct the comment about ch->rq_size. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/srpt: Do not accept invalid initiator port namesBart Van Assche1-5/+4
Make srpt_parse_i_port_id() return a negative value if hex2bin() fails. Fixes: commit a42d985bd5b2 ("ib_srpt: Initial SRP Target merge for v3.3-rc1") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: <stable@vger.kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18RDMA/uverbs: Make the code in ib_uverbs_cmd_verbs() less confusingBart Van Assche1-10/+3
This patch reduces the number of #ifdefs and also avoids that smatch reports the following: drivers/infiniband/core/uverbs_ioctl.c:276: ib_uverbs_cmd_verbs() warn: if statement not indented drivers/infiniband/core/uverbs_ioctl.c:280: ib_uverbs_cmd_verbs() warn: possible memory leak of 'ctx' drivers/infiniband/core/uverbs_ioctl.c:315: ib_uverbs_cmd_verbs() warn: if statement not indented References: commit fac9658cabb9 ("IB/core: Add new ioctl interface") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Acked-by: Matan Barak <matanb@mellanox.com> Cc: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>

Privacy Policy