[BUG] Thermal Runaway - But Marlin does not catch it #20749

Closed
zenturacp opened this issue Jan 11, 2021 · 42 comments

Comments

@zenturacp

zenturacp commented Jan 11, 2021

Bug Description

Suddenly, while printing, the temperature reporting for both bed and hotend stops updating. It stays at the last recorded value: if that value is below target, the hotend/bed heater is kept on; if it is above target, the heater is shut off. As a result, the hotend/bed is either turned off or permanently on.

Thermal runaway protection does not detect the issue because the firmware keeps reporting the same value over and over again.
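
A minimal sketch of why a frozen reading could slip past a hysteresis-based check. This is a simplified, hypothetical model and not Marlin's actual code: `RunawayCheck`, `trips_with_frozen_reading`, the 4 °C hysteresis, and the 40 s period are all assumptions chosen for illustration. It assumes the check only trips when the reported value stays more than the hysteresis below target for the whole period.

```cpp
// Simplified, hypothetical model of a hysteresis-based runaway check.
// NOT Marlin's implementation; names and thresholds are assumptions.
struct RunawayCheck {
    double hysteresis_c;   // allowed drop below target, in degrees C
    int    period_s;       // seconds the drop must persist before tripping
    int    seconds_below;  // consecutive seconds spent below the band

    // Feed one reading per second; returns true when the check trips.
    bool update(double reported_c, double target_c) {
        if (reported_c < target_c - hysteresis_c)
            ++seconds_below;
        else
            seconds_below = 0;  // reading is "close enough", so start over
        return seconds_below >= period_s;
    }
};

// Returns true if the check trips within `seconds` while the reported
// value never changes (the situation described in this issue).
bool trips_with_frozen_reading(double frozen_c, double target_c, int seconds) {
    RunawayCheck check{4.0, 40, 0};
    for (int s = 0; s < seconds; ++s)
        if (check.update(frozen_c, target_c)) return true;
    return false;
}
```

Under these assumptions, a reading frozen at 209.84 against a target of 210 never leaves the hysteresis band, so the check stays silent no matter how hot the heater really gets.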

After a reset it can't start because it says the hotend is above the threshold, which it really was. The target was 210, and when I finally got it running it reported 260/120. The target/status before the reboot was 210/60, and it just stayed there; even when I set it to cool down, the heat was not shut off.

I'm running this version now
https://github.com/MarlinFirmware/Marlin/tree/cf1f8aff7781c221d76c671e94a88d6d851b2d4d

I'm not aware of any recent changes to the printer. The firmware that was on the board was from mid-December, and I just updated it yesterday to the version I referenced.

It will print, and the problem does not happen every time. It could be a hardware issue, but then I really don't know why it works again after a reboot.

It has happened 3 times now on the same model, sliced with different parameters.

Model: https://www.thingiverse.com/thing:2482299

CatsandwichBowl_V2.zip
Sliced GCode

Configuration Files

Configuration.zip

Steps to Reproduce

Slice the model with the default Cura settings for the Ender 3 Pro
Upload the GCode to Octoprint
Print the model - after X hours of printing I see some deformation on the model.
When I stop the print, the actual temperature does not drop, and the report on the console is the same again and again:
Recv: T:209.84 /0.00 B:59.84 /0.00 @:0 B@:0
Recv: T:209.84 /0.00 B:59.84 /0.00 @:0 B@:0
Recv: T:209.84 /0.00 B:59.84 /0.00 @:0 B@:0
Recv: T:209.84 /0.00 B:59.84 /0.00 @:0 B@:0

After a reboot / reset the temperature is actually:
Recv: T:237.50 /0.00 B:112.57 /0.00 @:0 B@:0

Expected behavior:

The temperature keeps updating, and the values visible to Marlin are the correct (actual) ones.

Actual behavior:

After some hours of printing, the printer has "static" readings from both bed and nozzle. It seems like the loop is not updating the actual values, and it affects both bed and hotend at once.

Additional Information

Motherboard is SKR 1.4 Turbo
Display BTT 3.5 V2
Printer Ender 3 V2
Hotend Microswiss All metal hotend

IMG_3846
What I see on the model when it happens

Pre
Status before Reset

Post
Status after reset (several times, because it just raises an alarm stating that the hotend is above threshold)

ConsoleLog.zip
Here is the terminal output, where you can see that the motherboard sends temperature updates, but it's just the same values over and over again.
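
A host-side sketch for spotting this pattern in a console log, assuming report lines shaped exactly like the `Recv: T:… /… B:… /…` lines quoted above. `TempReport`, `parse_report`, and `longest_identical_run` are hypothetical helper names, not part of OctoPrint or Marlin.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Measured and target values from one temperature report line.
struct TempReport { double hotend, hotend_target, bed, bed_target; };

// Parses a line shaped like "Recv: T:209.84 /0.00 B:59.84 /0.00 @:0 B@:0".
// Returns false if the line does not contain such a report.
bool parse_report(const std::string& line, TempReport& out) {
    const std::size_t pos = line.find("T:");
    if (pos == std::string::npos) return false;
    return std::sscanf(line.c_str() + pos, "T:%lf /%lf B:%lf /%lf",
                       &out.hotend, &out.hotend_target,
                       &out.bed, &out.bed_target) == 4;
}

// Length of the longest run of consecutive reports whose measured hotend
// and bed values are exactly identical. Unparseable lines are skipped.
std::size_t longest_identical_run(const std::vector<std::string>& lines) {
    std::size_t best = 0, run = 0;
    TempReport prev{}, cur{};
    for (const std::string& line : lines) {
        if (!parse_report(line, cur)) continue;
        if (run > 0 && cur.hotend == prev.hotend && cur.bed == prev.bed)
            ++run;
        else
            run = 1;
        if (run > best) best = run;
        prev = cur;
    }
    return best;
}
```

A long run of byte-identical readings on both sensors while a heater is active would be a hint that the values are stale; what counts as "long" depends on the report interval and is left to the caller.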

@zenturacp zenturacp changed the title [BUG] Thermal Runaway [BUG] Thermal Runaway - But Marlin does not catch it Jan 11, 2021
@loetefix

Exactly the same problem here with
Board: SKR1.4Turbo
TFT: TFT35E3V3
Hotend: Microswiss All metal hotend
Printer: Tronxy XY-2 Pro

Yesterday the reported temperature froze at 81° for the bed and 234° for the hotend.
The bed was actually cold and the hotend was heating too much. Bed temperature was set to 80 and hotend to 240.
The printer was still printing (the printer itself was not frozen); only the temperature measurement was frozen.

@ellensp
Contributor

ellensp commented Jan 22, 2021

What firmware version is on the TFT35E3V3?

@loetefix

loetefix commented Jan 22, 2021 via email

@loetefix

Hi,
It is V3.0.26 APR 11 2020

@loetefix

loetefix commented Jan 23, 2021

Today I updated the TFT to the latest firmware and made some configuration changes in Marlin as described in the config.ini of the TFT:

General options:
M115_GEOMETRY_REPORT (in Configuration_adv.h)
M114_DETAIL (in Configuration_adv.h)
REPORT_FAN_CHANGE (in Configuration_adv.h)
EMERGENCY_PARSER (in Configuration_adv.h)
SERIAL_FLOAT_PRECISION 4 (in Configuration_adv.h)
HOST_ACTION_COMMANDS (in Configuration_adv.h)

Options to support printing from onboard SD:
SDSUPPORT (in Configuration.h)
AUTO_REPORT_TEMPERATURES (in Configuration_adv.h)
AUTO_REPORT_SD_STATUS (in Configuration_adv.h)
LONG_FILENAME_HOST_SUPPORT (in Configuration_adv.h)
SDCARD_CONNECTION ONBOARD (in Configuration_adv.h)

Some of these options were not set before the update.

It seems to work pretty well so far.

2 Prints without any problems.

@zenturacp
Author

> what version firmware is on the TFT35E3V3 ?

I run the original from April.

But I really never use the display, or only use it in Marlin mode. I checked the OctoPrint terminal: the reported temperature was constant, but reports were still arriving on the console.

@MakerMeik

I have exactly the same problem here. SKR 1.4 Turbo with Marlin 2.0.7.2.
Unfortunately, it happens to me only at longer intervals that the temperature indicator on the display freezes and the 3D printer heats up the Nozzle and the bed more and more to the point of smoke.
Has anyone already found the reason for this? Has the TFT firmware fixed the problem?
I run the printer almost exclusively via Octoprint, which is an Octopi on a Raspi connected to the board via USB. My guess is that there may be a problem in this regard.

@thinkyhead
Member

> I have exactly the same problem here. SKR 1.4 Turbo with Marlin 2.0.7.2.

Please test the bugfix-2.0.x branch to see where it stands. If the problem has been resolved then we can close this issue. If the issue isn't resolved yet, then we should investigate further.

@MakerMeik

OK, I have tried the bugfix branch. There still seems to be a fundamental problem. Even though I didn't run into the overheating problem now, after two or three more test runs I now had the case that the temperature was displayed with the desired 210°C after the usual warm-up phase, but the nozzle actually only had about 60°C after a while. This was only displayed correctly after a reset of the mainboard.

However, I believe that the temperature was reached initially for a short time, because the PLA started to flow. That the Nozzle temperature is obviously too low, I noticed only by the fact that the filament was suddenly no longer flowable. I now occasionally check the temperature with my multimeter thermistor when I have doubts.

Since it sometimes affects the bed and sometimes the nozzle, I would exclude hardware problems at the temperature sensors.

However, to be able to recreate the problem reasonably reliably, I sent the job via Octoprint again this time. I am not sure if the problem persists if I restrict myself exclusively to the internal SD card and cut the USB connection to the Raspi.

@zenturacp
Author

zenturacp commented Mar 13, 2021 via email

@MakerMeik

I've had this problem on several models now. I switched to the SKR board only a few weeks ago and am therefore still running various optimization tests. The error occurred both with models that I had sliced myself in Cura (xyzCube) and with a model that I downloaded from a website (Link). I would therefore also exclude that it is due to any special slicer settings.

My suspicion goes as said in the direction of Octoprint or the serial interface. But I have no deeper understanding of how the Marlin code works, so this is amateurish guessing ;-)

Image00001
Image00002

@zenturacp
Author

zenturacp commented Mar 14, 2021 via email

@zeleps
Contributor

zeleps commented Mar 30, 2021

It happened to me twice as well, on two random occasions, while using Octoprint.

The second time, I was preheating the printer with the temperature graph open. The temperature rise appeared to slow down abnormally as it approached the target temp (200°C), hovering around 185°C, then suddenly it jumped to 260°C+. The printer halted, and after reconnection it cooled off gradually, so the reading was probably accurate.

I suspect it has something to do with M155 temperature reporting (a buffering issue maybe?), but I haven't had the time yet to try to reproduce or debug it. I'll get back to it when I get the chance.

@MakerMeik

In the meantime, I removed the USB cable from my Raspi and have since done quite a few prints via SD card. Some of them took over ten hours. The problem has not occurred since then. I will connect the Raspi again in the near future to see if the problem could be related to this. I can imagine that there is a connection with #21010.

@MakerMeik

Screenshot 2021-04-17 153040
OK, I am now 99.9% sure: there is a relationship with Octoprint or the serial communication. I have now printed at least 10 models exclusively from the SD card, with Octoprint (that is, the USB cable to the Raspberry Pi) completely disconnected. During this time there were no temperature problems at all.

Today I needed a terminal to send some GCODES, so I reconnected the Raspi and set the nozzle to 210°C via Octoprint. This worked as expected for the first few minutes, until the heating problem came back while the display continued to show my set 210°C. I recognize the smell instantly by now and was able to quickly reset the board, because once the filament is fried in the nozzle, a major cleaning or replacement is usually required. I have already ruined a nozzle this way.

After the reset, the actual temperature of just under 300°C was displayed again as before, which then slowly normalized.

So this time I didn't even have a print running, but just set the temperature and played around with the following GCODES:

M92
G91
G1 E100 F50
M92 E100
M500

However, I suspect that the GCODES did not play a role in this, because initially everything ran as expected. Only after two or three runs the described problem occurred.

@zenturacp
Author

I can confirm I also used OctoPrint.. But it's only on certain prints.. Have had long prints working after this..

But exactly same here 210 on display and 300+ on nozzle

@MakerMeik

Today I had the problem for the first time even though Octopi was not connected via USB. In the meantime, surely 20 to 30 prints have run. I would stick to the idea that the USB serial connection makes the problem more frequent, but it seems to occur with SD card-only printing as well.

@flat-jack

flat-jack commented Nov 21, 2021

I have the same problem, also using a BigTreeTech TFT. I never had problems with the temperature before, even on 10-hour prints. The printer is running Marlin 2.0.9.2. But suddenly the problems occur at random. I also realized that my reset button isn't working correctly anymore. Has anyone tried changing the TFT to see if that fixes the problem? I am not using Octoprint; I always print from SD card.

@pillopaolo

pillopaolo commented Dec 27, 2021

I had exactly the same problem too! With BigTreeTech SKR 1.3 + TFT35 E3 V3, while printing from SD card via the TFT.
Temperature was frozen just below setpoint, with the heater always ON and the printer running normally.
Material (PP) started to bubble and make noise --> temperature was in fact > 350 °C, as indicated by Marlin after restart.

No matter who is "interfering" with Marlin (Octoprint, BTT TFT, etc):

  1. Marlin should not stop updating the temperature!
  2. To cope with temperature freeze (whether caused by Marlin or by the hardware), Marlin should have a kind of freeze check, i.e. if T is not moving (say +-0.1) for a while (say 30 secs), then KILL!
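
A minimal sketch of the freeze check suggested in point 2, using the ±0.1 and 30 s figures from the suggestion. `FreezeCheck` is a hypothetical class, not an existing Marlin API, and how it would hook into the firmware (where readings and timestamps come from, what happens on a trip) is left open here.

```cpp
#include <cmath>

// Hypothetical stale-reading watchdog; not part of Marlin.
class FreezeCheck {
public:
    FreezeCheck(double tolerance_c, unsigned long timeout_ms)
        : tolerance_c_(tolerance_c), timeout_ms_(timeout_ms) {}

    // Call with every new reading and the current time in milliseconds.
    // Returns true once the reading has stayed within +/- tolerance of the
    // reference value for at least timeout_ms; the caller decides what to
    // do then (for example, shut down the heaters and halt).
    bool update(double temp_c, unsigned long now_ms) {
        if (!started_ || std::fabs(temp_c - reference_c_) > tolerance_c_) {
            // The reading moved (or this is the first one): restart the timer.
            reference_c_ = temp_c;
            since_ms_ = now_ms;
            started_ = true;
            return false;
        }
        return (now_ms - since_ms_) >= timeout_ms_;
    }

private:
    double tolerance_c_;
    unsigned long timeout_ms_;
    double reference_c_ = 0.0;
    unsigned long since_ms_ = 0;
    bool started_ = false;
};
```

Usage would look like `FreezeCheck check(0.1, 30000);` followed by `check.update(reading, now)` on each sample. One caveat with this design: an idle heater at room temperature can legitimately read a constant value, so such a check would presumably need to run only while the heater is powered, and the tolerance would have to sit below the sensor's normal noise.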

I never had problems with Marlin-2.0.7.2, downloaded Oct-2020.
Problems started with Marlin-2.0.9.2, downloaded Oct-2021.
By the way, the watchdog was properly set with "#define USE_WATCHDOG".

This is a serious FIRE HAZARD, I encourage the most skilled developers to spend some more time on this.
Thanks

@robbycandra
Contributor

robbycandra commented Dec 27, 2021

@pillopaolo, have you updated your firmware to the latest bugfix? What is your firmware version?

@pillopaolo

I did not update because the problem is very difficult to replicate, it happened only once to me.
I prefer to wait until somebody acknowledges the issue, troubleshoots it, finds a solution, and documents/comments the code accordingly, to be 100% sure the issue is really solved.
We are talking about a FIRE HAZARD here! I cannot proceed with trial'n'error.

I have some knowledge of coding, happy to contribute with troubleshooting if you tell me what lines of code are suspected.

BEEPER_PIN: most likely not configured. BUT a beeper does not solve the problem...

@pillopaolo

BEEPER_PIN is NOT defined in my case. It is defined when "HAS_WIRED_LCD is defined", which in turn is defined when "IS_ULTRA_LCD is defined", which is not my case.
I defined REPRAP_DISCOUNT_FULL_GRAPHIC_SMART_CONTROLLER, needed when BTT TFT35 in Marlin mode.

@robbycandra
Contributor

robbycandra commented Dec 27, 2021

@pillopaolo The FIRE HAZARD problem can be caused by hardware too.

When the printer board turns off the heater, it cuts the GND line (the 0 V line).
Now, let's look at the wires from our heater. Usually, near the heat block, there is a part that is slightly exposed, without insulation.
If the heater's GND wire touches the heat block, the heater will always be on, because the board only disconnects the heater from GND.
In some printers, I add a MOSFET to move the cut-off to the +12/24 V side.

@robbycandra
Contributor

That is because in some printers the printer body, including the nozzle heat block, may be connected to the GND line.
I think this is a serious problem for 3D printers, but until now no one has talked about it.

@pillopaolo

@robbycandra: I will check the HW later, as you suggested.

However I see now that a more recent version of the code has been corrected as follows:
inline void loud_kill(FSTR_P const lcd_msg, const heater_id_t heater_id) {
  marlin_state = MF_KILLED;
  thermalManager.disable_all_heaters();

While in the older version I have, the thermalManager.disable_all_heaters() command was WRONGLY put under "if USE_BEEPER" as follows:
inline void loud_kill(PGM_P const lcd_msg, const heater_id_t heater_id) {
  marlin_state = MF_KILLED;
  #if USE_BEEPER
    thermalManager.disable_all_heaters();

Therefore the heaters were NOT properly disabled if the beeper was not defined.

Correct?

@pillopaolo

> @pillopaolo The FIRE HAZARD problem can be caused by Hardware too.
>
> When the printer board turn-off the heater, it cut the GND line The 0V lines. Now, let's look at the wires from our heater. Usually, near the heat block, there is a part that is slightly exposed, without insulation. If the GND wire in the heater is stuck to the Heatblock. Then the heater will always be on. Because the board disconnects the heater from GND. In some printers, I add a MOSFET to change the cut-off to +12/24V.

If I understand correctly, the above only explains why the heaters stay ON; it does not explain why Marlin stops reading the temperature and keeps displaying the old one. Or am I missing something?

@robbycandra
Contributor

Well, actually, if the beeper is not used, disable_all_heaters() is called in kill(). I don't think the beeper is the cause.

disable_all_heaters() was moved to the top because Marlin now parks the nozzle when the printer goes into a thermal halt; the move only ensures that disable_all_heaters() is called first.

@robbycandra
Contributor

@pillopaolo, yes, I still don't understand that part.

@zeleps
Contributor

zeleps commented Dec 27, 2021

@pillopaolo what type of temp sensor do you use?

@robbycandra
Contributor

I forgot to say that the thermistor is connected to GND. If the heater GND is connected to the thermistor GND, then it will keep heating.

@pillopaolo

When the incident occurred, everything except the heater was working well. This makes me think that kill() was not called (I assume kill() stops the motors too).
GND is not the issue: 1) I checked, 2) the issue was gone as soon as I reset the extruder, and 3) it does not explain why the temperature reading was frozen.

Could there be an issue with the temperature-reading code, maybe simply not being called, or terminating/exiting prematurely?
Maybe something is wrong with the related interrupt (if any).

What about implementing a temperature freeze check (in a different interrupt / part of the code), as suggested above?

@pillopaolo

> I forgot to say that the thermistor is connected to GND, If the heater GND is connected to thermistor GND, then it will stay heating.

Standard 100K ohm NTC 3950, since years in 50+ printer/extruders.

Still does not explain why Temperature reading is frozen

@robbycandra
Contributor

robbycandra commented Dec 27, 2021

When it comes to printers stopping or freezing, my guess leans more towards gcode reading. But this is only based on experience; I don't have any proof. And I have never experienced a printer heating up, only stopping or freezing.

@zeleps
Contributor

zeleps commented Dec 27, 2021

@pillopaolo I wanted to take a look at temperature.cpp, knowing the sensor type eliminates some possible code paths.

I had something similar happening to me a few months ago, but it hasn't occurred since then (although I exclusively use octoprint to print stuff). Can you reproduce the issue? If yes, it would be interesting if you could enable some debug logging and try it again.

@pillopaolo

pillopaolo commented Dec 27, 2021

> @pillopaolo I wanted to take a look at temperature.cpp, knowing the sensor type eliminates some possible code paths.
>
> I had something similar happening to me a few months ago, but it hasn't occurred since then (although I exclusively use octoprint to print stuff). Can you reproduce the issue? If yes, it would be interesting if you could enable some debug logging and try it again.

Thermistor type = 1 = Standard 100K ohm NTC 3950, since years in 50+ printer/extruders.

Unfortunately I cannot reproduce it. It only happened once (and was pretty bad!).
Then I read other people had the same issue. Temperature freeze (while other things are functioning) is something very peculiar that could help in troubleshooting.

@zeleps
Contributor

zeleps commented Dec 27, 2021

Do you remember if bed temp was updating properly when the problem occurred?

@pillopaolo

> Do you remember if bed temp was updating properly when the problem occured?

Bed was cold, so pretty constant. I did not pay attention.

@thisiskeithb
Member

#23373 has been merged.

@github-actions github-actions bot removed the Needs: More Data We need more data in order to proceed label Jan 2, 2022
@descipher
Contributor

Does the runaway occur when bang bang is used instead of PID?

@zeleps
Contributor

zeleps commented Feb 21, 2022

No, @zenturacp's case (as well as mine) are PID setups (both for hotend and bed).

@descipher
Contributor

> No, @zenturacp's case (as well as mine) are PID setups (both for hotend and bed).

Just to verify, we have no reported incidents when using Bang Bang? We only see it when using PID so far.

@github-actions

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Apr 23, 2022