idpf-linux: block changing ring params while af_xdp is active #25
base: idpf-libie-new
Commits on Jul 16, 2024
netdevice: convert private flags > BIT(31) to bitfields
Make dev->priv_flags a `u32` again and define bits higher than 31 as bitfield booleans, as per Jakub's suggestion. This simplifies the code which accesses these bits with no optimization loss (testb both before and after), avoids extending &netdev_priv_flags each time, and also scales better: bits > 63 in the future would only add a new u64 to the structure with no complications, whereas extending ::priv_flags would require converting it to a bitmap. Note that I picked `unsigned long :1` to not lose any potential optimizations compared to `bool :1` etc. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
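A minimal userspace sketch of the layout change described above: flags that fit in 32 bits stay in a plain `u32` bitmask, while the former > BIT(31) flags become one-bit bitfields. The struct name, field names, and the bit position of IFF_NO_QUEUE here are illustrative stand-ins, not the actual netdevice definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IFF_NO_QUEUE	(1U << 4)	/* hypothetical bit position */

struct fake_netdev {
	uint32_t	priv_flags;	/* classic flags, bits 0..31 */

	/* former > BIT(31) flags, one bool-like bitfield each */
	unsigned long	lltx:1;
	unsigned long	netns_local:1;
	unsigned long	fcoe_mtu:1;
};

static bool fake_netdev_has_lltx(const struct fake_netdev *dev)
{
	/* a bitfield read compiles to a single test, same as a bitmask */
	return dev->lltx;
}
```

A new high flag only costs one more `:1` member; once 64 of them exist, the compiler simply starts a new underlying `unsigned long`, with no bitmap conversion needed.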
Commit: d06d98d
netdev_features: remove unused __UNUSED_NETIF_F_1
NETIF_F_NO_CSUM was removed in 3.2-rc2 by commit 34324dc ("net: remove NETIF_F_NO_CSUM feature bit") and became __UNUSED_NETIF_F_1. It's not used anywhere in the code, so remove this wasted bit. There was no need to rename the flag instead of removing it, as netdev features are not uAPI/ABI: Ethtool passes their names and values separately with no fixed positions, and the userspace Ethtool code doesn't have any hardcoded feature names/bits, so a new Ethtool will work on older kernels and vice versa. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 1c76f46
netdev_features: convert NETIF_F_LLTX to dev->lltx
NETIF_F_LLTX can't be changed via Ethtool and is not a feature, but rather an attribute, very similar to IFF_NO_QUEUE (and hot). Free one netdev_features_t bit and make it a "hot" private flag. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: ad1f18e
netdev_features: convert NETIF_F_NETNS_LOCAL to dev->netns_local
"Interface can't change network namespaces" is rather an attribute, not a feature, and it can't be changed via Ethtool. Make it a "cold" private flag instead of a netdev_feature and free one more bit. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 1e66f4b
netdev_features: convert NETIF_F_FCOE_MTU to dev->fcoe_mtu
The ability to handle maximum FCoE frames of 2158 bytes can never be changed and is thus more of an attribute than a toggleable feature. Move it from netdev_features_t to the "cold" priv flags (bitfield bool) and free yet another feature bit. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 66bab81
net: netdev_features: remove NETIF_F_ALL_FCOE
NETIF_F_ALL_FCOE is used only in vlan_dev.c, 2 times. Now that it's only 2 bits, open-code it and remove the definition from netdev_features.h. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 44c8d76
Commits on Jul 17, 2024
idpf: fix memory leaks and crashes while performing a soft reset
The second tagged commit introduced a UAF, as it removed restoring the q_vector->vport pointers after reinitializing the structures. This is because all queue allocation functions here are performed with the new temporary vport structure and those functions rewrite the backpointers to the vport. Then, this new struct is freed and the pointers start leading to nowhere. But generally speaking, the current logic is very fragile. It claims to be more reliable when the system is low on memory, but in fact it consumes twice as much memory, as at the moment of running this function there are two vports allocated with their queues and vectors. Moreover, it claims to prevent the driver from running into a "bad state", but in fact any error during the rebuild leaves the old vport in a partially allocated state. Finally, if the interface is down when the function is called, it always allocates a new queue set, but when the user decides to enable the interface later on, vport_open() allocates them once again, IOW there's a clear memory leak here. There's no oneliner way to fix all this. Instead, rewrite the function from scratch without playing with two vports and memcpy()s. Just perform everything on the current structure and do the minimum set of stuff needed to rebuild the vport. Don't allocate the queues at all, as vport_open(), no matter whether it's called here or during the next ifup, will do that for us. Fixes: 02cbfba ("idpf: add ethtool callbacks") Fixes: e4891e4 ("idpf: split &idpf_queue into 4 strictly-typed queue structures") Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 1b85df9
idpf: fix memleak in vport interrupt configuration
The initialization of vport interrupts consists of two functions: 1) idpf_vport_intr_init(), where a generic configuration is done, and 2) idpf_vport_intr_req_irq(), where the irq for each q_vector is requested. The first function used to create a base name for each interrupt using a kasprintf() call. Unfortunately, although that call allocated memory for a text buffer, that memory was never released. Fix this by no longer creating the interrupt base name in 1). Instead, always create the full interrupt name in 2), as there is no need to create a base name separately, considering that 2) is never called outside of the idpf_vport_intr_init() context. Fixes: d4d5587 ("idpf: initialize interrupts and enable vport") Cc: stable@vger.kernel.org # 6.7 Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: e2cb9d7
idpf: fix UAFs when destroying the queues
The second tagged commit started sometimes (very rarely, but still) throwing WARNs from net/core/page_pool.c:page_pool_disable_direct_recycling(). It turned out idpf frees interrupt vectors with embedded NAPIs *before* freeing the queues, making the page_pools' NAPI pointers lead to freed memory before these pools are destroyed by libeth. It's not clear whether there are other accesses to the freed vectors when destroying the queues, but in any case, we usually free queue/interrupt vectors only when the queues are destroyed and the NAPIs are guaranteed to not be referenced anywhere. Invert the allocation and freeing logic, making queue/interrupt vectors be allocated first and freed last. Vectors don't require queues to be present, so this is safe. Additionally, this change allows removing the useless queue->q_vector pointer cleanup, as vectors are still valid when freeing the queues (plus both are freed within one function, so it's not clear why nullify the pointers at all). Fixes: 1c325aa ("idpf: configure resources for TX queues") Fixes: 90912f9 ("idpf: convert header split mode to libeth + napi_build_skb()") Reported-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 5f5e2bd
unroll: add generic loop unroll helpers
There are cases when we need to explicitly unroll loops, for example cache operations, filling DMA descriptors at very high speeds, etc. Make MIPS' unroll header a generic one to have an "unroll always" macro, which works on any compiler and system, and add compiler-specific attribute macros. Example usage: #define UNROLL_BATCH 8 unrolled_count(UNROLL_BATCH) for (u32 i = 0; i < UNROLL_BATCH; i++) op(var, i); Note that sometimes the compilers won't unroll loops if they think that would give worse optimization and perf than a loop, and that unroll attributes are available only starting from GCC 8. In this case, you can still use unrolled_call(UNROLL_BATCH, op), which works in the range of [1...32] iterations. For better unrolling/parallelization, don't have any variables that interfere between iterations except for the iterator itself. Co-developed-by: Jose E. Marchesi <jose.marchesi@oracle.com> # pragmas Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Co-developed-by: Paul Burton <paulburton@kernel.org> # unrolled_call() Signed-off-by: Paul Burton <paulburton@kernel.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
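A reduced userspace sketch of the two flavors described above. The real header is more general; the names and bodies here are illustrative only, with a fixed-count expansion standing in for the generic unrolled_call().

```c
#include <assert.h>

#define __UNROLL_STR(n)	#n

/* "unroll always" hint via the loop pragma; only available from GCC 8 on,
 * so older compilers fall back to a plain (non-unrolled) loop
 */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ < 8
#define unrolled_count(n)
#else
#define unrolled_count(n)	_Pragma(__UNROLL_STR(GCC unroll n))
#endif

/* fixed expansion in the spirit of unrolled_call(): works on any compiler,
 * shown here for a batch of 4 only
 */
#define unrolled_call_4(op)	do { op(0); op(1); op(2); op(3); } while (0)

static int toy_sum;
#define toy_add(i)		(toy_sum += (i))
```

The pragma variant is a hint the compiler may honor or ignore; the macro-expansion variant guarantees the body is repeated, which is why iterations must not share state beyond the iterator.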
Commit: f0145fb
libeth: add common queue stats
Define common structures, inline helpers and Ethtool helpers to collect, update and export the statistics (RQ, SQ, XDPSQ). Use u64_stats_t right from the start, as well as the corresponding helpers to ensure tear-free operations. For the NAPI parts of both Rx and Tx, also define small onstack containers to update them in polling loops and then sync the actual containers once a loop ends. In order to implement fully generic Netlink per-queue stats callbacks, &libeth_netdev_priv is introduced and is required to be embedded at the start of the driver's netdev_priv structure. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 86249af
libie: add Tx buffer completion helpers
Software-side Tx buffers for storing DMA addresses, frame size, skb pointers, etc. are pretty much generic and every driver defines them the same way. The same can be said for software Tx completions -- the same napi_consume_skb()s and all that... Add a couple of simple wrappers for doing that, to stop repeating the old tale at least within the Intel code. Drivers are free to use the 'priv' member at the end of the structure. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 7c49fd3
idpf: convert to libie Tx buffer completion
&idpf_tx_buffer is almost identical to the previous generations, as well as the way it's handled. Moreover, relying on dma_unmap_addr() and !!buf->skb instead of explicitly defining the buffer's type was never good. Use the newly added libie helpers to do it properly and reduce the copy-paste around the Tx code. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: c385b1f
netdevice: add netdev_tx_reset_subqueue() shorthand
Add a shorthand similar to other net*_subqueue() helpers for resetting the queue by its index w/o obtaining &netdev_tx_queue beforehand manually. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 9194f24
idpf: refactor Tx completion routines
This patch adds a mechanism to guard against stashing partial packets into the hash table, which makes the driver more robust and leads to more efficient decision making when cleaning. Don't stash partial packets. This can happen when an RE completion is received in flow scheduling mode, or when an out-of-order RS completion is received. The first buffer with the skb is stashed, but some or all of its frags are not because the stack is out of reserve buffers. This leaves the ring in a weird state since the frags are still on the ring. Use a field to track the number of fragments/tx_bufs representing the packet. The clean routines check to make sure there are enough reserve buffers on the stack before stashing any part of the packet. If there are not, next_to_clean is left pointing to the first buffer of the packet that failed to be stashed. This leaves the whole packet on the ring, and the next time around, cleaning will start from this packet. An RS completion is still expected for this packet in either case, so instead of being cleaned from the hash table, it will be cleaned from the ring directly. This should all still be fine since DESC_UNUSED and BUFS_UNUSED will reflect the state of the ring. If we ever fall below the thresholds, the TXQ will still be stopped, giving the completion queue time to catch up. This may lead to stopping the queue more frequently, but it guarantees the Tx ring will always be in a good state. Also, always use the idpf_tx_splitq_clean() function to clean descriptors, i.e. use it from clean_buf_ring as well. This way we avoid duplicating the logic and make sure we're using the same reserve-buffers guard rail. This does require a switch from the s16 next_to_clean overflow descriptor ring wrap calculation to u16 and the normal ring size check. Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
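The guard rail above can be sketched as a toy model: a packet occupying `nbufs` buffers is stashed only if the reserve stack has room for all of them; otherwise next_to_clean is left at the packet's first buffer so the whole packet stays on the ring for the next pass. All names and the flat counters are illustrative, not the driver's actual structures.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_stash {
	unsigned int free_slots;	/* reserve buffers left on the stack */
};

static bool toy_try_stash_pkt(struct toy_stash *stash, unsigned int nbufs,
			      unsigned int *next_to_clean, unsigned int first)
{
	if (stash->free_slots < nbufs) {
		/* not enough reserve buffers: leave NTC pointing at the
		 * first buffer so cleaning restarts from this packet
		 */
		*next_to_clean = first;
		return false;
	}

	stash->free_slots -= nbufs;
	*next_to_clean = first + nbufs;	/* whole packet consumed */
	return true;
}
```

Either the entire packet is stashed or none of it is, so the ring never ends up holding orphaned frags of a half-stashed packet.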
Commit: d0bd302
idpf: fix netdev Tx queue stop/wake
netif_txq_maybe_stop() returns -1, 0, or 1, while idpf_tx_maybe_stop_common() says it returns 0 or -EBUSY. As a result, there sometimes are Tx queue timeout warnings despite the queue being empty or having at least enough space to restart it. Make idpf_tx_maybe_stop_common() inline and return true or false, handling the return value of netif_txq_maybe_stop() properly. Use a correct goto in idpf_tx_maybe_stop_splitq() to avoid stopping the queue or incrementing the stops counter twice. Fixes: 6818c4d ("idpf: add splitq start_xmit") Fixes: a5ab9ee ("idpf: add singleq start_xmit and napi poll") Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: cc9628b
Tell hardware to write back completed descriptors even when interrupts are disabled. Otherwise, descriptors might not be written back until the hardware can flush a full cacheline of descriptors. This can cause unnecessary delays when traffic is light (or even trigger a Tx queue timeout). The example scenario to reproduce the Tx timeout if the fix is not applied: - configure at least 2 Tx queues to be assigned to the same q_vector, - generate a huge Tx traffic on the first Tx queue, - try to send a few packets using the second Tx queue. In such a case, a Tx timeout will appear on the second Tx queue because no completion descriptors are written back for that queue while interrupts are disabled due to NAPI polling. The patch is necessary to start work on the AF_XDP implementation for the idpf driver, because there may be a case where a regular LAN Tx queue and an XDP queue share the same NAPI. Fixes: c2d548c ("idpf: add TX splitq napi poll support") Fixes: a5ab9ee ("idpf: add singleq start_xmit and napi poll") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Co-developed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Commit: 6cae557
idpf: switch to libeth generic statistics
Fully reimplement idpf's per-queue stats using the libeth infra. Embed &libeth_netdev_priv at the beginning of &idpf_netdev_priv, call the necessary init/deinit helpers and the corresponding Ethtool helpers. Update hotpath counters such as hsplit and tso/gso using the onstack containers instead of direct accesses to queue->stats. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 56c1d9b
bpf, xdp: constify some bpf_prog * function arguments
In lots of places, the bpf_prog pointer is used only for tracing or other stuff that doesn't modify the structure itself. The same is true for net_device. Address at least some of them and add `const` attributes there. The object code didn't change, but this may prevent unwanted data modifications and also allows more helpers to have const arguments. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 1db8f94
xdp, xsk: constify read-only arguments of some static inline helpers
Lots of read-only helpers for &xdp_buff and &xdp_frame, such as getting the frame length, skb_shared_info, etc., don't have their arguments marked with `const` for no reason. Add the missing annotations to leave less room for mistakes and more for optimization. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: afcb93d
xdp: allow attaching already registered memory model to xdp_rxq_info
One may need to register a memory model separately from xdp_rxq_info. One simple example is the XDP test-run code, but in general it might be useful when memory model registration is managed by one layer and the XDP RxQ info by a different one. Allow such scenarios by adding a simple helper which "attaches" an already registered memory model to the desired xdp_rxq_info. As this is mostly needed for Page Pool, add a special function to do that for a &page_pool pointer. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: da6f0f1
net: Register system page pool as an XDP memory model
To make the system page pool usable as a source for allocating XDP frames, we need to register it with xdp_reg_mem_model(), so that page return works correctly. This is done in preparation for using the system page pool for the XDP live frame mode in BPF_TEST_RUN; for the same reason, make the per-cpu variable non-static so we can access it from the test_run code as well. Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Commit: 1859320
page_pool: make page_pool_put_page_bulk() actually handle array of pages
Currently, page_pool_put_page_bulk() indeed takes an array of pointers to the data, not pages, despite the name. As one side effect, when you're freeing frags from &skb_shared_info, xdp_return_frame_bulk() converts page pointers to virtual addresses and then page_pool_put_page_bulk() converts them back. Make page_pool_put_page_bulk() actually handle an array of pages. Pass frags directly and use virt_to_page() when freeing xdpf->data, so that the PP core will then get the compound head and take care of the rest. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 2cd561a
page_pool: allow mixing PPs within one bulk
The main reason for this change was to allow mixing pages from different &page_pools within one &xdp_buff/&xdp_frame. Why not? Adjust xdp_return_frame_bulk() and page_pool_put_page_bulk() so that they won't be tied to a particular pool. Let the latter splice the bulk when it encounters a page whose PP is different and flush it recursively. This greatly optimizes xdp_return_frame_bulk(): no more hashtable lookups. Also make xdp_flush_frame_bulk() inline, as it's just one if + a function call + one u32 read, not worth extending the call ladder. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
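A reduced model of a bulk that is no longer tied to one pool: items carry an owner id, and hitting an item from a different owner (or a full bulk) flushes what was collected so far before starting a new run. Counters stand in for the actual page returns; everything here is illustrative, not the page_pool implementation.

```c
#include <assert.h>

#define BULK_SIZE 8

struct toy_bulk {
	int owner;		/* owner of the queued items, -1 = empty */
	unsigned int count;
	unsigned int flushes;	/* how many times we released to a pool */
	unsigned int released;	/* total items handed back */
};

static void toy_bulk_flush(struct toy_bulk *b)
{
	if (!b->count)
		return;

	b->flushes++;		/* one release call per homogeneous run */
	b->released += b->count;
	b->count = 0;
	b->owner = -1;
}

static void toy_bulk_put(struct toy_bulk *b, int owner)
{
	/* owner change or full bulk: release the current run first */
	if (b->count && (b->owner != owner || b->count == BULK_SIZE))
		toy_bulk_flush(b);

	b->owner = owner;
	b->count++;
}
```

The win is that each run still goes back to its pool in one call, while the caller no longer has to sort frames by pool up front.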
Commit: 23c4f5c
xdp: get rid of xdp_frame::mem.id
Initially, xdp_frame::mem.id was used to search for the corresponding &page_pool to return the page correctly. However, now that struct page contains a direct pointer to its PP, keeping this field makes no sense. xdp_return_frame_bulk() still uses it to do a lookup, but this is rather a leftover. Remove xdp_frame::mem and replace it with ::mem_type, as only the memory type still matters and we need to know it to be able to free the frame correctly. As a cute side effect, we can now make every scalar field in &xdp_frame 4 bytes wide, speeding up accesses to them. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 92b506a
xdp: add generic xdp_buff_add_frag()
The code piece which attaches a frag to an &xdp_buff is almost identical across the drivers supporting XDP multi-buffer on Rx. Make it a generic, elegant oneliner. Also, I see lots of drivers calculating frags_truesize as `xdp->frame_sz * nr_frags`. I can't say this is fully correct, since frags might be backed by chunks of different sizes, especially with stuff like the header split. Even page_pool_alloc() can give you two different truesizes on two subsequent requests to allocate the same buffer size. Add a field to &skb_shared_info (unionized as there's no free slot currently on x86_64) to track the "true" truesize. It can be used later when updating the skb. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
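An illustrative reduction of the generic helper: each frag reports its own truesize, and a running xdp_frags_truesize accumulates the real values instead of assuming `frame_sz * nr_frags`. The struct, the frag-count limit, and the failure convention are simplified stand-ins, not the actual skb_shared_info layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_FRAGS 17	/* stand-in for MAX_SKB_FRAGS */

struct toy_shinfo {
	unsigned int nr_frags;
	size_t xdp_frags_truesize;	/* sum of per-frag truesizes */
	size_t frag_len[TOY_MAX_FRAGS];
};

static bool toy_buff_add_frag(struct toy_shinfo *si, size_t len,
			      size_t truesize)
{
	if (si->nr_frags >= TOY_MAX_FRAGS)
		return false;	/* caller frees the frag on failure */

	si->frag_len[si->nr_frags++] = len;
	si->xdp_frags_truesize += truesize;	/* real value, not frame_sz */
	return true;
}
```

Because every frag contributes its actual truesize, mixed chunk sizes (e.g. with header split) are accounted for correctly when the skb is later updated.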
Commit: d5f4287
xdp: add generic xdp_build_skb_from_buff()
The code which builds an skb from an &xdp_buff keeps multiplying itself around the drivers with almost no changes. Let's try to stop that by adding a generic function. There's __xdp_build_skb_from_frame() already, so just convert it to take an &xdp_buff instead, while making the original one a wrapper. The original one always took an already-allocated skb; allow both variants here -- if no skb is passed, which is expected when calling from a driver, pick one via napi_build_skb(). Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 3f75758
xsk: allow attaching XSk pool via xdp_rxq_info_reg_mem_model()
When you register an XSk pool as the XDP Rxq info memory model, you then need to manually attach it after the registration. Let the user combine both actions into one by just passing a pointer to the pool directly to xdp_rxq_info_reg_mem_model(), which will take care of calling xsk_pool_set_rxq_info(). This looks similar to how a &page_pool gets registered and reduces repeated driver code. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 1b659d2
xsk: make xsk_buff_add_frag really add a frag via __xdp_buff_add_frag()
Currently, xsk_buff_add_frag() only adds a frag to the pool linked list, not doing anything with the &xdp_buff. The drivers do that manually, and the logic is the same. Make it really add an skb frag, just like xdp_buff_add_frag() does, freeing frags on error if needed. This allows removing repeated code from i40e and ice and not adding the same code again and again. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: bddf7b1
xsk: add generic XSk &xdp_buff -> skb conversion
Same as with converting &xdp_buff to skb on Rx, the code which allocates a new skb and copies the XSk frame there is identical across the drivers, so make it generic. This includes copying all the frags if they are present in the original buff. System percpu Page Pools help here a lot: when available, allocate pages from there instead of the MM layer. This greatly improves XDP_PASS performance on XSk: instead of page_alloc() + page_free(), the net core recycles the same pages, so the only overhead left is memcpy()s. Note that the passed buff gets freed if the conversion is done w/o any error, assuming you don't need this buffer after you convert it to an skb. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 5f5c62c
xsk: add helper to get &xdp_desc's DMA and meta pointer in one go
Currently, when you send an XSk frame without metadata, you need to do the following: * call external xsk_buff_raw_get_dma(); * call inline xsk_buff_get_metadata(), which calls external xsk_buff_raw_get_data() and then do some inline checks. This effectively means that the following piece: addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr; is done twice per frame, plus you have 2 external calls per frame, plus this: meta = pool->addrs + addr - pool->tx_metadata_len; if (unlikely(!xsk_buff_valid_tx_metadata(meta))) is always inlined, even if there's no meta or it's invalid. Add xsk_buff_raw_get_ctx() (xp_raw_get_ctx() to be precise) to do that in one go. It returns a small structure with 2 fields: DMA address, filled unconditionally, and metadata pointer, valid only if it's present. The address correction is performed only once and you also have only 1 external call per XSk frame, which does all the calculations and checks outside of your hotpath. You only need to check `if (ctx.meta)` for the metadata presence. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
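The one-call variant can be sketched like this: compute the corrected address once, fill the DMA unconditionally, and set the metadata pointer only when metadata is present. The pool layout, field names, and the (omitted) validity check are mocked for illustration; this is not the actual xp_raw_get_ctx().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_xsk_ctx {
	uint64_t dma;	/* always filled */
	void *meta;	/* NULL when metadata is absent */
};

struct toy_pool {
	uint64_t dma_base;
	uint8_t *addrs;
	size_t tx_metadata_len;	/* 0 = no Tx metadata configured */
};

static struct toy_xsk_ctx toy_raw_get_ctx(const struct toy_pool *pool,
					  uint64_t addr)
{
	/* the address correction happens exactly once, for both fields */
	struct toy_xsk_ctx ctx = {
		.dma = pool->dma_base + addr,
	};

	if (pool->tx_metadata_len)
		ctx.meta = pool->addrs + addr - pool->tx_metadata_len;

	return ctx;
}
```

The hotpath caller then only checks `if (ctx.meta)`, with all the address math done in one external call instead of two.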
Commit: 3908174
skbuff: allow 2-4-argument skb_frag_dma_map()
skb_frag_dma_map(dev, frag, 0, skb_frag_size(frag), DMA_TO_DEVICE) is repeated across dozens of drivers and really wants a shorthand. Add a macro which counts the arguments and handles any number from 2 to 5. Semantics: skb_frag_dma_map(dev, frag) -> __skb_frag_dma_map(dev, frag, 0, skb_frag_size(frag), DMA_TO_DEVICE) skb_frag_dma_map(dev, frag, offset) -> __skb_frag_dma_map(dev, frag, offset, skb_frag_size(frag) - offset, DMA_TO_DEVICE) skb_frag_dma_map(dev, frag, offset, size) -> __skb_frag_dma_map(dev, frag, offset, size, DMA_TO_DEVICE) skb_frag_dma_map(dev, frag, offset, size, dir) -> __skb_frag_dma_map(dev, frag, offset, size, dir) No object code size changes for the existing callers. Callers passing fewer arguments also won't get bigger code compared to the full equivalent call. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
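A userspace mock of the argument-counting dispatch listed above: the frag is reduced to a size and the mapping to an integer encoding of (offset, len, dir), so only the macro mechanics are real; none of these definitions are the actual skbuff.h ones.

```c
#include <assert.h>

enum toy_dir { DMA_TO_DEVICE = 1, DMA_FROM_DEVICE = 2 };

struct toy_frag { int size; };

/* stand-in for the real mapper; encodes its args so tests can inspect them */
static int __skb_frag_dma_map(int dev, const struct toy_frag *frag,
			      int offset, int len, enum toy_dir dir)
{
	(void)dev;
	(void)frag;
	return offset * 1000 + len * 10 + dir;
}

#define _SFDM_2(dev, frag) \
	__skb_frag_dma_map(dev, frag, 0, (frag)->size, DMA_TO_DEVICE)
#define _SFDM_3(dev, frag, off) \
	__skb_frag_dma_map(dev, frag, off, (frag)->size - (off), DMA_TO_DEVICE)
#define _SFDM_4(dev, frag, off, len) \
	__skb_frag_dma_map(dev, frag, off, len, DMA_TO_DEVICE)
#define _SFDM_5(dev, frag, off, len, dir) \
	__skb_frag_dma_map(dev, frag, off, len, dir)

/* pick the variant by how many args slid past the selector's fixed slots */
#define _SFDM_SEL(_1, _2, _3, _4, _5, NAME, ...) NAME
#define skb_frag_dma_map(...) \
	_SFDM_SEL(__VA_ARGS__, _SFDM_5, _SFDM_4, _SFDM_3, _SFDM_2, 0)(__VA_ARGS__)
```

Since the dispatch is resolved entirely at preprocessing time, every call collapses to a plain __skb_frag_dma_map() invocation, which is why the object code cannot grow.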
Commit: 2b9703f
jump_label: export static_key_slow_{inc,dec}_cpuslocked()
Sometimes, there's a need to modify a lot of static keys or modify the same key multiple times in a loop. In that case, it seems more optimal to take cpus_read_lock() once and then call the _cpuslocked() variants. The enable/disable functions are already exported; the refcounted counterparts, however, are not. Fix that to allow modules to save some cycles. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: ae28117
libeth: support native XDP and register memory model
Expand libeth's Page Pool functionality by adding native XDP support. This means picking the appropriate headroom and DMA direction. Also, register all the created &page_pools as XDP memory models. A driver then can call xdp_rxq_info_attach_page_pool() when registering its RxQ info. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 35d3947
libeth: add a couple of XDP helpers (libeth_xdp)
"Couple" is a bit humble... Add the following functionality to libeth: * XDP shared queue management * XDP_TX bulk sending infra * .ndo_xdp_xmit() infra * adding buffers to &xdp_buff * running XDP prog and managing its verdict * completing XDP Tx buffers * ^ repeat everything for XSk Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> # lots of stuff Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 6f307ba
idpf: make complq cleaning dependent on scheduling mode
Extend completion queue cleaning function to support queue-based scheduling mode needed for XDP queues. Add 4-byte descriptor for queue-based scheduling mode and perform some refactoring to extract the common code for both scheduling modes. Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: a6d1416
idpf: remove SW marker handling from NAPI
SW marker descriptors on completion queues are used only when a queue is about to be destroyed. It's far from the hotpath, and handling it in the hotpath NAPI poll makes no sense. Instead, run a simple poller after the virtchnl message for destroying the queue is sent and wait for the replies. If replies for all of the queues are received, the synchronization is done correctly and we can go on with stopping the link. Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
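A toy version of such a poller: instead of handling marker descriptors in the NAPI hotpath, spin a bounded number of times until replies for all queues have arrived. The reply source is a mocked callback here; the real code polls hardware, and all names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>

static bool toy_poll_markers(unsigned int num_queues,
			     unsigned int (*recv_replies)(void *ctx),
			     void *ctx, unsigned int budget)
{
	unsigned int seen = 0;

	while (budget--) {
		seen += recv_replies(ctx);	/* replies from this pass */
		if (seen >= num_queues)
			return true;	/* all queues ACKed: safe to stop */
	}

	return false;	/* budget exhausted, synchronization failed */
}

/* mock reply source: one reply per poll iteration */
static unsigned int toy_one_reply(void *ctx)
{
	(void)ctx;
	return 1;
}
```

Bounding the loop keeps the teardown path from hanging forever if a reply is lost, while the hot NAPI poll no longer carries any marker-handling code.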
Commit: 1ecd090
idpf: prepare structures to support xdp
Extend the basic structures of the driver (e.g. 'idpf_vport', 'idpf_*_queue', 'idpf_vport_user_config_data') by adding members necessary to support XDP. Add extra XDP Tx queues needed to support XDP_TX and XDP_REDIRECT actions without interfering with regular Tx traffic. Also add functions dedicated to supporting XDP initialization for Rx and Tx queues and call those functions from the existing algorithms of queue configuration. Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Co-developed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 707e479
idpf: implement XDP_SETUP_PROG in ndo_bpf for splitq
Implement loading an XDP program via the ndo_bpf callback for splitq with the XDP_SETUP_PROG parameter. Add functions for stopping, reconfiguring and restarting all queues when needed. Also implement the XDP hot swap mechanism for when the existing XDP program is replaced by another one, without the need to reconfigure anything.

Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 8a6f7e0
idpf: use generic functions to build xdp_buff and skb
In preparation for XDP support, move from having an skb as the main frame container during Rx polling to &xdp_buff. This makes it possible to use generic and libie helpers for building an XDP buffer and changes the logic: now we try to allocate an skb only after all descriptors related to the frame have been processed. Store a &libeth_xdp_stash instead of the skb pointer on the Rx queue. It's only 8 bytes wider and there's a place to fit it in.

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 7aa5b1a
idpf: add support for XDP on Rx
Use the libeth XDP infra to support running an XDP program on Rx polling, covering all possible verdicts/actions. XDP Tx queues are cleaned only in "lazy" mode, when fewer than 1/4 of the descriptors on the ring are free. The libeth helper macros for defining driver-specific XDP functions ensure the compiler can uninline them when needed. Use __LIBETH_WORD_ACCESS to parse descriptors more efficiently where applicable; it gives solid performance boosts and code size reduction on x86_64.

Co-developed-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 242d36e
idpf: add support for .ndo_xdp_xmit()
Use the libeth XDP infra to implement .ndo_xdp_xmit() in idpf. The Tx callbacks are reused from the XDP_TX code. The XDP redirect target feature is set/cleared depending on XDP prog presence, since for now we still don't allocate XDP Tx queues when there's no program.

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 7e0b8ef
Add &xdp_metadata_ops with a callback to get the RSS hash hint from the descriptor. Declare the splitq 32-byte descriptor as four u64s to parse it more efficiently when possible.

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 74ed675
idpf: add vc functions to manage selected queues
Implement VC functions dedicated to enabling, disabling and configuring arbitrarily selected queues. Also, refactor the existing implementation to make the code more modular. Introduce new generic functions for sending VC messages consisting of chunks, in order to isolate the sending algorithm from its implementation for specific VC messages. Finally, rewrite the function for mapping queues to q_vectors using the new modular approach to avoid duplicating the code that implements the VC message sending algorithm.

Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Co-developed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: dd01c33
idpf: add XSk pool initialization
Add functionality to set up an XSk buffer pool, including the ability to stop, reconfigure and start only the selected queues rather than the whole device. Pool DMA mapping is managed by libeth.

Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 62a153f
idpf: implement Tx path for AF_XDP
Implement Tx handling for AF_XDP feature in zero-copy mode using the libeth (libeth_xdp) XSk infra. When the NAPI poll is called, XSk Tx queues are polled first, before regular Tx and Rx. They're generally faster to serve and have higher priority comparing to regular traffic. Co-developed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: af817d9
idpf: implement Rx path for AF_XDP
Implement Rx packet processing specific to AF_XDP ZC using the libeth XSk infra. Initialize queue registers before allocating buffers to avoid redundant ifs when updating the queue tail.

Co-developed-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: 3ce0d15
idpf: enable XSk features and ndo_xsk_wakeup
Now that the AF_XDP functionality is fully implemented, advertise the XSk XDP feature and add the .ndo_xsk_wakeup() callback so it can be used with this driver.

Co-developed-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Commit: de4b645
Commits on Jul 18, 2024
idpf-linux: block changing ring params while af_xdp is active
Ring parameters, especially the ring size, must not be changed while an AF_XDP socket is assigned to any Rx ring. Implement a function that checks all Rx queues for an assigned AF_XDP socket, and block changing the queue parameters if at least one Rx queue has one.

Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Commit: 04fdca7