Add support for multichannel, CAN-FD, and STM32G4 #176
base: master
… received USB frame This is a preparation patch, this label will be used several times in the next patches.
…BD_GS_CAN_SendToHost() Since USBD_GS_CAN_SendFrame() isn't used anymore outside of usbd_gs_can.c, mark it as static.
The USBD_GS_CAN_SendToHost() function is used to send a struct gs_host_frame_object to the host. Until this patch, once the sending process had been started, the outgoing frame object was immediately added to the list of free objects, and the variable USBD_GS_CAN_HandleTypeDef::TxState was used to track whether the transfer to the host was in progress. Instead, hold the outgoing object in USBD_GS_CAN_HandleTypeDef::to_host_buf and move it to the free list after the transfer is finished in USBD_GS_CAN_DataIn(). Use this to track if a transfer is ongoing.
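The ownership change described in this commit message can be pictured with a minimal, hardware-free sketch. The type and function names below (`frame_object`, `usb_handle`, `send_to_host`, `data_in`) are simplified stand-ins chosen for illustration, not the firmware's actual definitions; only the `to_host_buf` idea comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the firmware's types. */
struct frame_object {
    struct frame_object *next;
    int id;
};

struct usb_handle {
    struct frame_object *to_host_buf; /* non-NULL while a transfer is in flight */
    struct frame_object *free_list;   /* singly linked pool of unused objects */
};

/* Start a transfer; refuse if one is already in flight. */
bool send_to_host(struct usb_handle *h, struct frame_object *obj)
{
    if (h->to_host_buf != NULL) {
        return false;       /* busy: the caller keeps ownership of obj */
    }
    h->to_host_buf = obj;   /* the handle owns obj until data_in() runs */
    /* ...real code would start the USB IN transfer here... */
    return true;
}

/* Transfer-complete callback: only now does the object return to the pool. */
void data_in(struct usb_handle *h)
{
    struct frame_object *obj = h->to_host_buf;

    if (obj == NULL) {
        return;             /* nothing was in flight */
    }
    obj->next = h->free_list;
    h->free_list = obj;
    h->to_host_buf = NULL;  /* marks the endpoint as idle again */
}
```

Because the pointer doubles as the "busy" indicator, a separate state flag is no longer needed in this sketch, and the object cannot be reused while the USB peripheral may still be reading from it.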
…eceive() with IRQs enabled
…te functions No functional change intended.
No functional change.
…N_CfgDesc is copied to RAM
…e feature for CAN-FD frames
...having so many arguments doesn't scale.
Add code to support the M_CAN core found on the newer STM32 devices. Co-developed-by: Ryan Edwards <ryan.edwards@gmail.com> Co-developed-by: Jonas Martin <j.martin@pengutronix.de> Co-developed-by: Venelin Efremov <ghent360@iqury.us> Co-developed-by: Phil Greenland <phil@beamconnectivity.com> Co-developed-by: Marc Kleine-Budde <mkl@pengutronix.de>
Also, for those working on a G4 solution: I've been working on a board because I needed a solution for 3x CANFD channels + 1 LIN channel. BOM cost is about $10 from JLCPCB. Have some coming in next week for testing so I'll throw this code onto the boards. As always, if anyone wants one let me know :)
https://github.com/ryedwards/budgetcan_g4_hw
I solved the problem of FDCAN1RX sharing BOOT0 with some magical IC power switching wizardry.
So the HAL has a callback, HAL_FDCAN_TxBufferCompleteCallback(hfdcan, TransmittedBuffers), that could be used to control the echo frames. You can set up an IRQ flag for it. Would need some crafty design around it to track the buffers.
This would be the way to do it. I'd suggest that, unless it turned out to be related to the bus lockup / starvation issue mentioned above, it should be left for a later PR. This one's already pretty big. The current behaviour, while not ideal, matches the mainline.
@fenugrec Are you still working your way through the PR in the background / do we have a route to completing it? I've got limited time on this myself atm.
I believe Marc has updated his PR to reflect some compatibility issues with older versions of GCC. I'd look to rebase my changes on top of his again. If I can reproduce / understand the unexpected dependency between the channels I'll look to address that.
Going forward, implementing the short-echo optimisation discussed above would be a simple addition to increase the TX throughput, especially for CAN FD. Likewise, echoing back to confirm the transmission after the hardware reports its completion, rather than when it's scheduled for transmission, should be a relatively small but useful addition.
ACK
Drop me a note if I should add support for it to the kernel side.
Agreed. Let's leave that aside for later.
very slowly - sometimes I don't have enough hardware to test, and I'm averse to merging stuff that I haven't personally tested. The current roadblock was the handful of commits reworking the TxState flag.
Indeed I saw that, but haven't looked at whether it fixes current problems on master or only in the later multichan commits. After that, I was hoping to merge the error frame optimizations earlier, hoping to remove throughput bottlenecks on F0.
Please, everyone, do not add this on to any of the monster PRs already pending.
It's a problem with some compiler versions, it's in the m_can driver, so mainline is not affected.
Found some time to look at the inter-channel issue.
I attempted to reproduce the issue in the way I likely triggered it myself before. This involved configuring and bringing up both channels, sending frames on a channel which isn't connected to a bus, then attempting to transmit and receive on the other interface, which is connected to a bus. I attempted the same thing with both channels connected to a bus but one channel configured for the wrong speed. In both cases the correctly configured channel was working, passing a test with canfdtest and able to communicate with my customer's hardware.
@ryedwards if you're able to reproduce the issue your customer reported on this branch, let me know what I might try.
One thing I did notice, which may be the potential cause of any starvation: in the case of transmitting on a bus with no nodes, the local controller continuously retries as per its default configuration. In doing so it reports acknowledge errors, which are in turn reported to the host. Having sent a single frame on the mis-configured bus, watching the USB transfers in Wireshark shows 16,000 transfers/s on my G0 based dual channel adapter. Running the same firmware on a single channel adapter based on an F0 results in 15,000 transfers/s. This matches the master branch, which when tested on the F0 exhibits the same behaviour. It's correct in that the errors it's reporting are genuine, but it might be better to try to suppress them after the first notification?
Doing a bit of CAN stuff today. I can confirm that disabling the SOF interrupt on current master has no measurable effect (~16200 f/s). Also tried the
I’ve got on order 2 of the “SH-C31A USB to CAN Adapter with FD” from DSD Tech. Their BOM lists the stm32g431C8, which I’m guessing is close to the stm32g4CB (flash size difference). It also has the TJA1051T transceiver, and for my needs it has to run at a bitrate of 83.3k. It’s been a while since I’ve delved into embedded devices and Linux drivers but I can read code and understand Linux kernel code. I’d like to help the best I can to progress but admittedly may need some pointers along the way. Once I get hardware and tool chain configured, I’ll git what’s up and report what I find.
Hey @curreyr, I have the same adapter and it came with a candlelight compatible firmware. It doesn't support CAN-FD, though. But you can configure a bitrate of 83.3 kbit/s without problems. With Linux kernel v6.3 or newer you can configure the bitrate like this:
ip link set can0 up type can bitrate 83333
Older kernels need some more help, as they don't calculate a proper sjw:
ip link set can0 up type can bitrate 83333 sjw 8
In both cases please check the bitrate configuration with:
ip --details -s link show can0
167: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
link/can promiscuity 0 allmulti 0 minmtu 0 maxmtu 0
can state ERROR-ACTIVE restart-ms 1000
bitrate 83333 sample-point 0.875
tq 88 prop-seg 59 phase-seg1 59 phase-seg2 17 sjw 8 brp 15
^^^^^^^^^^^^^^^^^^^
gs_usb: tseg1 2..256 tseg2 2..128 sjw 1..128 brp 1..512 brp_inc 1
clock 170000000
re-started bus-errors arbit-lost error-warn error-pass bus-off
0 0 0 0 0 0 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536 parentbus usb parentdev 2-2:1.0
RX: bytes packets errors dropped missed mcast
0 0 0 0 0 0
TX: bytes packets errors dropped carrier collsns
0 0 0 0 0 0
edit: use 83.3 kbit/s not 833.3 kbit/s
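As a cross-check of the values printed above, the bitrate and sample point can be recomputed from the clock, prescaler and segment lengths. The sketch below uses integer arithmetic only; the struct and function names are hypothetical helpers for this illustration, not part of the kernel or the firmware.

```c
#include <stdint.h>

/* Hypothetical container for the values printed by `ip --details`. */
struct bit_timing {
    uint32_t clock_hz;
    uint32_t brp;
    uint32_t prop_seg;
    uint32_t phase_seg1;
    uint32_t phase_seg2;
};

/* Time quanta per bit: one sync quantum plus the three segments. */
uint32_t tq_per_bit(const struct bit_timing *bt)
{
    return 1u + bt->prop_seg + bt->phase_seg1 + bt->phase_seg2;
}

/* Resulting bitrate in bit/s (integer division, so it rounds down). */
uint32_t bitrate_bps(const struct bit_timing *bt)
{
    return bt->clock_hz / (bt->brp * tq_per_bit(bt));
}

/* Sample point in tenths of a percent, e.g. 875 means 87.5 %. */
uint32_t sample_point_permille(const struct bit_timing *bt)
{
    return (1000u * (1u + bt->prop_seg + bt->phase_seg1)) / tq_per_bit(bt);
}
```

With the values from the output (clock 170000000, brp 15, prop-seg 59, phase-seg1 59, phase-seg2 17) this gives 136 time quanta per bit, 170000000 / (15 * 136) = 83333 bit/s, and a sample point of 119 / 136 = 0.875, matching what `ip` reports.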
I think I parse. My plan is to get the pair on a bench working with ‘hello’ at 83.3k before I attach to the W906 Mercedes. Bench is cheap, van not so much.
When I tap into the Mercedes just to listen, any specific settings? As in not ack a door unlock vs ack and the other modules also do. I’m a CAN newb.
I don't have any experience with CAN and cars. There are probably CAN car hacker forums out there that can help you.
Yea, they exist. Usually full of bad. Y’all know the hardware and I’ll explore my domain. I just want to sniff this, then perhaps venture to sends.
https://github.com/marckleinebudde I appreciate you. When I get hardware and computers configured, let me know how I can help in development.
A lot of what you can do depends on the vehicle make & architecture. Most cars now only expose the OBD2 compliant diagnostic messages via the DLC connector. For CAN that is usually pins 6/14. All of the other buses are buried behind a gateway. Unfortunately there isn't much to sniff if you are on the DLC side of the gateway. You'd need to make valid diagnostic requests to see anything. Suggest reading up on UDS and the types of things you can do with it.
This is getting way off-topic for this PR, you're welcome to continue the discussion in a separate PR, please!
Sorry for getting off topic, but I do have taps to the full bus (not simply OBD-II). Actually 3 separate CAN buses … it’s a German thing I guess. Anyhoo, I do want to contribute and can set up a bench to debug, test, QA code as best I can. The car isn’t FDCAN but I can bench one for testing if it helps.
I just want to say thanks for everyone's work on this! This PR builds for me and produces firmware for both the MKS CANable 2.0 and MKS CANable 2.0 PRO. It works on all three CAN buses I've tried it on (250 kbps and 500 kbps CAN, and 1Mbps/4Mbps CAN-FD). I have not run any speed tests, just basic functional tests, and only sniffing packets from the CAN bus, not outputting packets to the bus. Would love to see this progress towards mainline, what can I do to help?
Speed and high bus load tests are important since that is when edge cases and subtle problems appear.
Not much atm, I review and merge a few commits now and then. There is a lot of common content in here and #139, I was hoping for some clarification on certain things from @marckleinebudde but haven't heard back (and I haven't followed up or anything since then, so it's not marc's fault - he's been waiting on me forever)
I'll try to set up some speed tests and report back here, but i ran into a glitch.
I have a CANable2 running b4dd771, and it's mostly working well. I have two problems, which seem like they might be related. This is all on Ubuntu 23.10, with dfu-util 0.11.
Failure to reattach after flashing
Start with the CANable2 running candleLight_fw multichannel, use dfu-util to detach it:
Not sure what to make of the
Writing the flash seems to crash the CANable2:
At this point the device does not show up in lsusb, and the green LED ("WORK") and red LED ("PWR") are lit. If I unplug the device from USB and reconnect it the green LED ("WORK") and blue LED ("SATA") flash and the device shows up as a working CAN adapter:
So it all seems to work, except for the resetting into the newly flashed firmware. Using the "make flash" helper produces different log output, but otherwise fails in a similar way:
Failure to boot after a short power cycle
For this i have the firmware flashed on the device and it's working well. Unplug USB and reconnect it quickly, < 2 seconds of downtime, and it comes up in bootloader mode (0483:df11). Unplug USB and leave it disconnected for > 3 seconds and it boots the candleLight firmware. This behavior is totally repeatable.
Any thoughts on what's going on here? I'd really like to be able to boot into new firmware right after flashing, and i'd like the device to boot even after a very brief power loss.
Week 23 will be our annual technology week, and I plan to work on the candlelight firmware as my project.
@fenugrec Let me know when you're done merging Marc's commits and I'll look to rebase this one on top. I'm not likely to find any time to test anything over the next few months unfortunately. My test setup with different MCUs has long been broken up and spread to the far corners of the office. Happy to help with any questions / feedback though. My G0 based adapters running this branch have been my daily drivers for some time now.
Do we have a timeline when we think this will be merged? I think I've found a way to handle echo_id responses only AFTER a message has been transmitted using the HAL. I'm capturing this here since the new PR will need to be created only after this PR is merged.
So, I started messing with the HAL_FDCAN_GetTxEvent() function. At first it didn't appear to be working, but I uncovered that it wasn't enabled, as we needed to set TxHeader.TxEventFifoControl = FDCAN_STORE_TX_EVENTS. Once I changed this value in the can_send() function I was getting a response while polling the GetTxEvent function.
The cool part: We can use the MessageMarker member to store the echo_id.
The other cool part: This also allows for the HAL timestamp feature to be implemented, where the echo frame can contain the timestamp of when the message was actually TX'd on the bus.
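The marker idea can be sketched without the HAL. The code below is a hardware-free illustration under stated assumptions: `tx_request`, `tx_event` and `echo_frame` are hypothetical stand-ins for the HAL's Tx header, the Tx event FIFO element and the frame echoed to the host, and the marker is assumed to be 8 bits wide, so the echo_id has to fit in that range.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the parts of the Tx header used here. */
struct tx_request {
    uint8_t marker;          /* would be written to MessageMarker */
    bool    store_tx_event;  /* would select FDCAN_STORE_TX_EVENTS */
};

/* Hypothetical stand-in for one element read from the Tx event FIFO. */
struct tx_event {
    uint8_t  marker;         /* marker copied back by the hardware */
    uint16_t timestamp;      /* time at which the frame left the controller */
};

/* Hypothetical echo record handed to the USB side. */
struct echo_frame {
    uint32_t echo_id;
    uint32_t timestamp;
};

/* Prepare a transmission; fails if the echo_id does not fit the marker. */
bool prepare_tx(uint32_t echo_id, struct tx_request *req)
{
    if (echo_id > 0xFFu) {
        return false;
    }
    req->marker = (uint8_t)echo_id;
    req->store_tx_event = true;
    return true;
}

/* Build the echo only once the hardware has reported the transmission. */
struct echo_frame make_echo(const struct tx_event *ev)
{
    struct echo_frame echo;

    echo.echo_id = ev->marker;
    echo.timestamp = ev->timestamp;
    return echo;
}
```

In a real implementation the event would come from polling HAL_FDCAN_GetTxEvent() as described above; the point of the sketch is only that no separate bookkeeping table is needed when the marker carries the echo_id itself.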
No. Been extremely busy with work these past 2 months, will be unusually busy for the rest of summer too. I have about 20 minutes right now, am merging a few easy pickings from 139.
@@ -166,7 +177,15 @@ USBD_StatusTypeDef USBD_LL_Init(USBD_HandleTypeDef *pdev)
HAL_PCDEx_PMAConfig((PCD_HandleTypeDef*)pdev->pData, 0x00, PCD_SNG_BUF, 0x18);
So when working on another project I noticed that for the PMA memory on the G0/G4 the "base address" is different from the F0 USB (not sure why): the buffers need to start at offset 64 instead of 24.
This means that the PMAConfig is 0x40, 0x80, 0x00C00100, 0x01400180
Maybe someone can find why the offset is different. Or maybe it's been wrong the whole time since nobody can seem to explain the PMA memory space :)
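One possible reading of the 24 versus 64 figures, offered as an assumption rather than something confirmed in this thread: if the buffer descriptor table at the start of the PMA takes 8 bytes per endpoint number, then 24 bytes leaves room for three endpoint numbers and 64 bytes for eight. The sketch below only does that arithmetic, plus the packing of two buffer addresses into one 32-bit value as used for the double-buffered entries; the function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: each endpoint number uses 8 bytes of descriptor table. */
#define BTABLE_ENTRY_BYTES 8u

/* First PMA offset that does not overlap the descriptor table. */
uint32_t pma_first_buffer_offset(uint32_t num_endpoint_numbers)
{
    return num_endpoint_numbers * BTABLE_ENTRY_BYTES;
}

/*
 * Pack two buffer addresses into one 32-bit value: the first buffer in
 * the low half, the second in the high half (assumed double-buffer form).
 */
uint32_t pma_double_buffer(uint16_t addr0, uint16_t addr1)
{
    return ((uint32_t)addr1 << 16) | addr0;
}
```

Under that assumption, `pma_first_buffer_offset(3)` is 0x18 and `pma_first_buffer_offset(8)` is 0x40, and `pma_double_buffer(0x0100, 0x00C0)` reproduces the 0x00C00100 value quoted above. The reference manual for each family remains the authority on the actual table layout.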
It impacted when I tried to use endpoints beyond 0x81 and 0x02. Not an issue for this project but unsure what other issues it may cause.
PMA offsets, on F0, are a function of selected buffer sizes, and the encoding of the relevant registers is absolutely not intuitive, and easy to get wrong.
I haven't read the G0 refman but that part about PMA buffers should be examined closely. I mean, in theory HAL should make it so we don't need to do that, but yeah, theory.
To replace their proprietary, slcan based, firmware, I have added the support for the WeActStudio USB2CANFDV1 based on this branch and it is working nicely. I have pushed the changes to my repo for now but I would be happy to make a PR here once this PR is merged.
Picked up where @marckleinebudde left off.
Resolved the performance issue, which appeared exactly as @fenugrec had described in his comment.
Added support for the STM32G4, specifically the Makerbase MKS Canable2 (which presumably is based on the canable.io Canable 2.0, although I couldn't find a schematic for the canable.io to compare). A bit of scope creep granted, my goals were support for this board and improved performance for my G0 board.
Followed @fenugrec 's tests using:
Along with the following for canfd as canbusload doesn't support fd frames at present:
Have tested with an F0 based board (DSD_TECH_SH_C30A_fw), G0 based board (my own dual channel adapter) and a G4 based board (CANable2_MKS_fw). Couldn't find anything with an F4 on to test with unfortunately. Although there's only a dev board listed, so may not be too much of an issue?
For reference details of my G0 based adapter can be found on my blog, and board files on a separate branch.
Tested TX and RX performance for each unit, with the same PC/USB controller (and port). Communicating with a Peak PCAN-FD USB adapter, using Linux 6.1.0 on Debian.
Performance is as follows (in frames per second):
The poor / variable performance in CAN FD mode appears to be partly down to how the gs_usb driver in the Linux kernel behaves.
The MCU (on Linux) during RX sends an appropriately sized frame (either a classic or FD depending on the payload).
As such CAN2.0 RX performance is maintained regardless of operating mode.
The Linux driver, in FD mode, appears to always send FD sized frames, even for CAN2.0 frames, see: https://elixir.bootlin.com/linux/v6.1/source/drivers/net/can/usb/gs_usb.c#L778
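To put rough numbers on the difference between classic and FD sized transfers, here is a small sketch. It rests on assumptions that are not stated in this PR: a 12-byte host frame header (echo_id, can_id, dlc, channel, flags, reserved), no timestamp field, and a 64-byte full-speed bulk packet size. The names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define HOST_FRAME_HEADER_BYTES 12u  /* assumption: header without timestamp */
#define CLASSIC_DATA_BYTES       8u
#define FD_DATA_BYTES           64u
#define USB_FS_BULK_PACKET      64u  /* assumption: full-speed bulk endpoint */

/* Size of one host frame as transferred over USB. */
size_t host_frame_bytes(bool fd_sized)
{
    return HOST_FRAME_HEADER_BYTES +
           (fd_sized ? FD_DATA_BYTES : CLASSIC_DATA_BYTES);
}

/* Number of bulk packets needed to carry a transfer of the given size. */
size_t usb_packets_needed(size_t transfer_bytes)
{
    return (transfer_bytes + USB_FS_BULK_PACKET - 1u) / USB_FS_BULK_PACKET;
}
```

Under these assumptions a classic sized frame is 20 bytes and fits in one packet, while an FD sized frame is 76 bytes and needs two, even when the CAN payload is only 8 bytes. That would be consistent with the lower TX rate seen in FD mode, though it is not a measurement.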
The performance issue was found to be the error frame generation, although it isn't fully clear how it's changed the performance so significantly when compared to the master branch. I stopped short of reading through assembly.
Given the main loop coordinates both USB and CAN, I added a simple loop iteration counter. Measuring the number of iterations of the main loop performed per second. I did this on the F0 which is supported by both branches.
multichannel main loop iterations per second
with error checking 121156
without error checking 229495
master main loop iterations per second
with error checking 137564
without error checking 242311
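A loop iteration counter of the kind used for the measurements above could look roughly like this. It is a sketch, not the code from the branch: the struct and function names are hypothetical, and the millisecond tick is assumed to come from something like HAL_GetTick().

```c
#include <stdint.h>

struct loop_counter {
    uint32_t window_start_ms;  /* tick at which the current window began */
    uint32_t count;            /* iterations seen in the current window */
    uint32_t last_rate;        /* iterations counted in the last full window */
};

/* Call once per main loop pass with the current millisecond tick. */
void loop_counter_tick(struct loop_counter *lc, uint32_t now_ms)
{
    lc->count++;
    /* Unsigned subtraction keeps working when the tick counter wraps. */
    if ((uint32_t)(now_ms - lc->window_start_ms) >= 1000u) {
        lc->last_rate = lc->count;
        lc->count = 0u;
        lc->window_start_ms = now_ms;
    }
}
```

`last_rate` can then be read from a debugger or printed once per second; the counter itself adds only an increment and a comparison to each pass.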
It appears that retrieving a frame from the free pool, partially preparing it as an error frame, and then usually discarding it on each iteration of the loop is quite time consuming.
As the error parsing / frame generation code looked fairly tidy and tricky to split, I've opted to only run it when there's a change in the controller error status register, for which it might want to generate an error frame.
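The "only when the status changes" gate can be sketched as a tiny helper. The names are hypothetical, and the status word stands in for whatever error status register value the controller driver reads.

```c
#include <stdbool.h>
#include <stdint.h>

struct err_tracker {
    uint32_t last_status;  /* error status seen on the previous check */
};

/*
 * Returns true only when the error status differs from the value seen
 * last time, so the expensive error frame path runs on changes only.
 */
bool error_status_changed(struct err_tracker *t, uint32_t status)
{
    if (status == t->last_status) {
        return false;
    }
    t->last_status = status;
    return true;
}
```

In the main loop this would wrap the existing parsing code, for example `if (error_status_changed(&tracker, status)) { /* take a frame from the pool and parse */ }`, so the common case costs one register read and one comparison.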
For the G0 and G4 families I've also added bus-off recovery. Unlike the F0's bx_can module, the m_can module will not perform automatic bus-off recovery. This requires software to detect that it's fallen back into initialisation mode in response to bus-off and to request that it advance back into normal mode. Passed the highly advanced screwdriver between CAN L/H after that.
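The recovery step can be illustrated against a simulated register pair. This is a sketch under assumptions: `mcan_regs` is a plain struct standing in for the memory-mapped controller, and the bit positions (INIT as bit 0 of the control register, bus-off as bit 7 of the protocol status register) are assumed for the illustration and should be checked against the reference manual.

```c
#include <stdbool.h>
#include <stdint.h>

#define CCCR_INIT (1u << 0)  /* assumed: controller is in initialisation mode */
#define PSR_BO    (1u << 7)  /* assumed: controller reports bus-off */

/* Simulated registers; real code would use volatile hardware registers. */
struct mcan_regs {
    uint32_t cccr;
    uint32_t psr;
};

/*
 * If the controller dropped into initialisation mode because of bus-off,
 * request normal operation again. Returns true if recovery was requested.
 */
bool mcan_recover_bus_off(struct mcan_regs *regs)
{
    if ((regs->psr & PSR_BO) == 0u) {
        return false;   /* not bus-off: nothing to recover from */
    }
    if ((regs->cccr & CCCR_INIT) == 0u) {
        return false;   /* recovery already requested or in progress */
    }
    regs->cccr &= ~CCCR_INIT;  /* ask the core to rejoin the bus */
    return true;
}
```

Checking the bus-off bit as well as INIT keeps the helper from restarting a channel that was deliberately put into initialisation mode, for example while the host has the interface down.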
Found one slight issue which I haven't been able to resolve with the G4 yet. It has problems with the double buffered USB endpoint used for CAN transmission. It performs as expected in CAN2.0 mode but confuses the kernel in CAN FD mode, eventually causing the gs_usb driver to freeze and stop transmitting (due to its 10 echo IDs being exhausted). Switching it to single buffered mode solves the issue, without a notable performance decrease. I've left double buffering enabled for the other targets, which don't appear to suffer from whatever's happening in the stack/hw.