[WIP] Zero slowdown linear advance #27259

dbuezas · 2024-07-10T22:12:14Z

Superseded by ------> #27352 <--------

Description

Linear advance currently limits acceleration by k * xy_acceleraton * mm_of_extruder_per_mm_of_motion < e_jerk. This is to instantly achieve the required nozzle pressure when an extruding movement begins/ends acceleration/deceleration.

This PR removes that limitation by allowing nozzle pressure to build up incrementally as demonstrated by this gsheets simulation:

If the e-acceleration is set too small there will be some under-extrusion at the beginning and end of the acceleration phase, and over-extrusion at the beginning and end of deceleration phase of the trapezoidal acceleration profile.

I'm getting ~~excellent~~ mediocre but workinig results (with 5000mm/s2 movement acceleration and 3000mm/s2 extruder acceleration). ~~Maybe even better than the current implementation, which I hypothesise is because the extruder acceleration slope makes extrusion behave more linearly~~. (too soon, there's a working better approach in the works)

I don't have a bowden machine to test this, but those will see the most drastic print time improvements.
Accelerations are a lot faster than with classic LA, so you may need to reduce them a bit.

Another good sign is that now the time estimation in Orca Slicer is accurate, regardless of k factor.

Computing slowdown of current approach

In my printer (direct drive extruder with a fairly long filament path of 14mm)

line width = 0.4mm
layer height = 0.2mm
e_jerk = 5mm/s
k = 0.1

Results in

mm_of_extruder_per_mm_of_motion = (0.4mm * 0.2mm * 1mm) / (pi * (1.75 / 2)^2) = 0.0332

Which applied to the equation above gives

0.1 * xy_acceleraton * 0.0332 < 5
xy_acceleraton < 5/0.0332/0.1
xy_acceleraton < 1506mm/s^2

Find your own max acceleration (for the same layer height and line width) with your_max_xy_acceleraton = e_jerk/0.0332/k

It doesn't matter if the movement axes max acceleration is configured to 10k, linear advance will limit all extruding movements to your_max_xy_acceleraton (1500mm/s^2 in my case).

Todos:

Inline code documentation
A configuration_adv flag to switch between classic and zero slowdown LA
Reduce computations done in the block_phase_isr (e.g precompute more)
Reduce ram usage (e.g remove added value from the block)
A lot of testing

What I wish for

Testing in 8 bit boards
Testing in bowden extruders
Confirmation that I'm not accidentally exceeding the max_e_acceleration
Before and after pics :)

Left away for follow up PRs

S_CURVE_ACCELERATION matching LA: Make la_rate adapt to the acceleration of each step event
Target the final la_rate to match the acceleration of the next block
Smoothing out LA beyond the acceleration/deceleration phases and across contiguous blocks

Requirements

Benefits

Faster printing, particularly in machines using large k values.

Configurations

#define LIN_ADVANCE

Related Issues

ellensp · 2024-07-11T07:02:47Z

Fails to build on test config [Test BIGTREE_BTT002] BigTreeTech BTT002 with Prusa MK3S and related options..

Marlin/src/module/stepper.cpp: In static member function 'static uint32_t Stepper::block_phase_isr()':
Marlin/src/module/stepper.cpp:2891:31: error: 'acc_step_rate' was not declared in this scope
2891 | uint32_t sum_rate = acc_step_rate + current_block->la_step_rate;
| ^~~~~~~~~~~~~

currently acc_step_rate is only defined as so...

#if DISABLED(S_CURVE_ACCELERATION)
  static uint32_t acc_step_rate; // needed for deceleration start point
#endif

This just builds standard configs here https://github.com/MarlinFirmware/Configurations/tree/bugfix-2.1.x/config/examples/Prusa/MK3S-BigTreeTech-BTT002

dbuezas · 2024-07-11T07:26:39Z

@ellensp It's not compatible with S_CURVE_ACCELERATION yet.

dbuezas · 2024-07-11T15:42:30Z

To test this, I suggest e-acceleration is set as high as possible (but before it starts rattling of course), I had it way too low and with this algorithm it is a limiting factor on how high pressure advance can react to changes in acceleration

dbuezas · 2024-07-11T19:43:13Z

Marlin/src/module/planner.cpp

@@ -874,6 +874,10 @@ void Planner::calculate_trapezoid_for_block(block_t * const block, const_float_t
      const float comp = extruder_advance_K[E_INDEX_N(block->extruder)] * block->steps.e / block->step_event_count;
      block->max_adv_steps = cruise_rate * comp;
      block->final_adv_steps = final_rate * comp;
+      #if ENABLED(LA_ZERO_SLOWDOWN)
+        block->la_step_rate = 0;


Could block->la_step_rate be a static variable of the stepper class or the planner to save ram?
I think it could since blocks run one at a time.

The variable keeps track of the linear advance term of current extruder velocity

In principle the LA or acceleration parameters could change between blocks. So I think la_step_rate has to be stored for each block.

dbuezas · 2024-07-11T19:44:50Z

Marlin/src/module/planner.cpp

@@ -874,6 +874,10 @@ void Planner::calculate_trapezoid_for_block(block_t * const block, const_float_t
      const float comp = extruder_advance_K[E_INDEX_N(block->extruder)] * block->steps.e / block->step_event_count;
      block->max_adv_steps = cruise_rate * comp;
      block->final_adv_steps = final_rate * comp;
+      #if ENABLED(LA_ZERO_SLOWDOWN)
+        block->la_step_rate = 0;
+        block->max_e_acc = (uint32_t)(float(planner.max_acceleration_steps_per_s2[E_AXIS + E_INDEX_N(extruder)]) * float(block->step_event_count) / float(block->steps[E_AXIS]) * (float(1UL << 24) / (STEPPER_TIMER_RATE)));


I wonder if block->max_e_acc could also be a static variable computed once when the block is dequeued to save more ram. It is a very expensive operation though so it shouldn't go in the isr

Advice on where else to put it?

I don't think it needs to be calculated in Planner::calculate_trapezoid_for_block() which can get called multiple times for any given block. All the variables it depends on are set in stone before or during Planner::_populate_block(). I think the logical place for it is in the #if ENABLED(LIN_ADVANCE) block at around line 2377.

It can't be static because max acceleration could change between blocks.

I think scaling by float(block->step_event_count) / float(block->steps[E_AXIS]) is incorrect. block->max_e_acc is going to be multiplied by an interval measured in stepper timer ticks and the result should be delta speed for the E axis. So it just needs to be steps / s / tick, which you would get without the scaling.

I intended to scale it to movement steps (because "the Bresenham algorithm will convert this step rate into extruder steps").
I inferred it from these snippets that are already in the code:

#define LIMIT_ACCEL_FLOAT(AXIS,INDX) do{ \ if (block->steps[AXIS] && max_acceleration_steps_per_s2[AXIS+INDX] < accel) { \ const float max_possible = float(max_acceleration_steps_per_s2[AXIS+INDX]) * float(block->step_event_count) / float(block->steps[AXIS]); \ NOMORE(accel, max_possible); \ } \ }while(0)

and

block->acceleration_rate = (uint32_t)(accel * (float(1UL << 24) / (STEPPER_TIMER_RATE)));

Which is then used like this in Stepper::block_phase_isr

acc_step_rate = STEP_MULTIPLY(acceleration_time, current_block->acceleration_rate) + current_block->initial_rate;

Am I seeing it wrong? I did it like that so I can multiply by interval, which I believe determines the time until the next call to block_phase_isr

Oh, you're right.

dbuezas · 2024-07-11T19:48:01Z

Marlin/src/module/stepper.cpp

+            #if ENABLED(LA_ZERO_SLOWDOWN)
+              const uint32_t step_events_left_until_ramp_down = accelerate_before - step_events_completed;
+              const uint32_t max_e_accel =  STEP_MULTIPLY(interval, current_block->max_e_acc); 
+              // TODO: respect max e accel by also considering the acceleration from the step_rate delta between two timer steps


The TODO is about using max_e_accel - step_rate_acceleration on ramp up, and max_e_accel + step_rate_acceleration on ramp down, since step_rate_acceleration is also seen by the extruder

Something like

const uint32_t max_delta_v = STEP_MULTIPLY(interval, current_block->max_e_acc); const uint32_t acc_delta_v = STEP_MULTIPLY(interval, current_block->acceleration_rate); const uint32_t max_delta_v_ramp_up = max_delta_v - acc_delta_v; const uint32_t max_delta_v_ramp_down = max_delta_v + acc_delta_v; const bool is_ramping_up = step_events_left_in_phase * max_delta_v_ramp_down >= current_block->current_la_step_rate; if (is_ramping_up){ current_block->current_la_step_rate += _MIN(max_delta_v_ramp_up, current_block->la_advance_rate - current_block->current_la_step_rate); } else { current_block->current_la_step_rate -= _MIN(max_delta_v_ramp_down, current_block->current_la_step_rate); }

This could have been covered in planner.cpp I think when initially calculating current_block->max_e_acc.

dbuezas · 2024-07-11T19:49:47Z

Marlin/src/module/stepper.cpp

-            const uint32_t la_step_rate = la_advance_steps < current_block->max_adv_steps ? current_block->la_advance_rate : 0;
+            #if ENABLED(LA_ZERO_SLOWDOWN)
+              const uint32_t step_events_left_until_ramp_down = accelerate_before - step_events_completed;
+              const uint32_t max_e_accel =  STEP_MULTIPLY(interval, current_block->max_e_acc); 


is the interval constant across the execution of the whole block? if so, this could be computed only once at the beginning of each block.

Here is where I have big plans for S_CURVE_ACCELERATION in a follow up PR: I'll keep track of the current acceleration of each step event and compute la_rate from it. That will make linear advance exactly match the s curve and finally be perfect.

@tombrazier I tried this, and the resulting quality is fantastic.
It's rather simple:

I removed all the LA code for acceleration/deacceleration/cruise/new_block from stepper.cpp

stored current_step_rate in a variable so I can access it at the end regardless of the current phase

added this at the end of block_phase_isr

if (la_active && current_block) { // K = mm of filament compression needed per 1mm/s extrusion speed [mm/mm/s]. // K = steps of filament compression needed per 1steps/time extrusion speed [steps/steps/time]. float e_step_rate = float(curr_step_rate) * xy_to_e_steps; float dt = float(interval)/float(STEPPER_TIMER_RATE); float dt_inv = 1.0f / dt; float target_la_step_count = k * e_step_rate; float distance_to_target = target_la_step_count - current_la_step_count; float one_shot_v = distance_to_target * dt_inv; float a = ABS(one_shot_v - current_la_step_rate) * dt_inv; a = a > a_max ? a_max : a; float stopping_distance = current_la_step_rate*current_la_step_rate / (2 * a_max); if (ABS(distance_to_target) <= stopping_distance){ // If we're within stopping distance, decelerate a *= current_la_step_rate<0 ? 1 : -1; } else{ // Otherwise, accelerate towards the target a *= distance_to_target<0 ? -1 : 1; } current_la_step_rate += a * dt; current_la_step_count += current_la_step_rate * dt; const int32_t la_step_rate = current_la_step_rate * e_to_xy_steps; set_la_interval((int32_t)curr_step_rate + la_step_rate); }

The idea is simple:

keep track of the current nozzle pressure

calculate target nozzle pressure

accelerate/decelerate extruder to achieve target nozzle pressure

aim at reaching target nozzle pressure with zero velocity (to avoid overshooting)

This will always lag a bit, but the resulting quality at full speed is already impressive (better than the code in this PR)

What I did is obviously extremely inefficient (i have a 550MHz mcu) but the results prove that there's a lot of quality to win from full speed LA. The quality with and without S_CURVE was pretty much the same

I can see that working very nicely for a fast 32 bit processor, especially if it has floating point instructions.

I'll make prints and pics tomorrow, I think I'm getting better results than the current (slower) LA, even though this method takes 30% less time to print.
I hope I can figure out how to make this in the planner so this can be efficient. One thing I believe to be making a difference too is that it corrects for jerk, so it reduces blobs a bit. It will need to look at contiguous blocks.

Thanks for the review and the kind words btw

dbuezas · 2024-07-11T19:56:31Z

Marlin/src/module/stepper.cpp

+              const uint32_t step_events_left_until_ramp_down = accelerate_before - step_events_completed;
+              const uint32_t max_e_accel =  STEP_MULTIPLY(interval, current_block->max_e_acc); 
+              // TODO: respect max e accel by also considering the acceleration from the step_rate delta between two timer steps
+              const bool is_ramping_up = step_events_left_until_ramp_down * max_e_accel >= current_block->la_step_rate;


equivalent to

uint32_t step_events_it_would_take_to_slow_down = current_block->la_step_rate / max_e_accel; const bool is_ramping_up = step_events_left_until_ramp_down >= step_events_it_would_take_to_slow_down;

but without the more expensive division operation

dbuezas · 2024-07-11T19:58:35Z

Marlin/src/module/stepper.cpp

@@ -2460,7 +2460,20 @@ hal_timer_t Stepper::block_phase_isr() {

        #if ENABLED(LIN_ADVANCE)
          if (la_active) {
-            const uint32_t la_step_rate = la_advance_steps < current_block->max_adv_steps ? current_block->la_advance_rate : 0;


Why are these checks done? as long as we are in the acceleration phase, la_step_rate should always be current_block->la_advance_rate. What am I missing

Is la_advance_rate chosen in a way which ensures that max_adv_steps can't be surpassed under any circumstances during the accel phase? Or is there a possibility for rounding errors or similar problems that would allow la_advance_steps to slightly overshoot max_adv_steps? This check looks like it was originally meant to catch errors like that. Don't know if it's still useful with this PR in place.

Possibly. I assume it makes sure the exact added pressure is removed, although I believe this to have no effect in print quality (if anything it would result in a tiny bit of underextrusion since the extra pressure will likely have escaped the nozzle already somewhere else).

The idea of this PR is allowing for a less mathematically pedantic LA in favour of print speed, so if I understood correctly this check should indeed be irrelevant for it.

I agree, small deviations on a single accel are unlikely to have any visible effect. But you still need to make sure that la_advance_steps returns to (precisely) zero in the following decel phase, otherwise the error will accummulate over multiple blocks.

I think it shouldn't accumulate since la_step_rate is reset to zero on every cruise phase and at the beginning of each block. If the extruder ran a couple of steps too many, then the increased pressure will bleed out as over extrusion.
as a sanity check I printed some benchies and I don't see any cumulative effects.

do you have any bowden machine left? I'm really curious about how much faster this can run without starting to degrade quality. If acceleration is pushed beyond the extruder's best effort, this PR would start looking like k=0. And of course even if the extrusion were perfect, the increased acceleration can worsen other issues like ringing

I've still got an Ender 3 Pro with Bowden setup, but that starts layer shifting above 6000 mm/s² on the Y axis (bed) as far as I remember my last tests. But I run #define DEFAULT_EJERK 15.0 (originally even 20.0) with #define ADVANCE_K 0.45 and didn't have the impression that LA was the limiting factor for accel. According to your calculations in the first post, it should be, though.

According to my calculations, LA should limit acceleration to 1004mm/s2.
Have you tried this PR? What I expect to happen is that the printer goes so fast that the extruder won't be able to fully keep up and it will result in less LA corrections applied. But reducing printing acc to say 3000 may be a good compromise. Also try increasing extruder acceleration as much as possible.
I'm super curious about any results you may get :)

When I improved linear advance back in #24533 there was a lot of discussion about not causing cumulative errors in E steps. This check was part of preventing that. (IIRC the reasoning was this: Before my changes E steps were locked in sync with XYZ steps which gave this guarantee. After my changes there was the potential of rounding errors.)

I agree that this could be relaxed for PR which sacrifices exact step numbers for speed and lets the minor differences bleed out into the print pretty much invisibly.

The last slowdown can be done in cruise mode and stay within max_e_acc, or in the next step event

dbuezas · 2024-07-13T17:26:39Z

The underextrusion on the top right of the cube and the typical bulged corners are gone, all without slowing down the print.
All took the same to print (33 minutes).

Needless to say, a cube printed with the current LA implementation can look slightly better since it will be printed at lower average speed.

dbuezas · 2024-07-18T20:01:01Z

@ellensp forgot to mention it, but it is now compatible with s curve acceleration. It is just not optimized to perfectly match nozzle pressure along the curve (current LA is not either)

tombrazier · 2024-07-22T08:03:36Z

Well done. I have wanted to implement something like this for ages but have not had the time.

tombrazier · 2024-07-22T08:26:06Z

Marlin/src/module/planner.h

@@ -260,6 +260,10 @@ typedef struct PlannerBlock {
    uint8_t  la_scaling;                    // Scale ISR frequency down and step frequency up by 2 ^ la_scaling
    uint16_t max_adv_steps,                 // Max advance steps to get cruising speed pressure
             final_adv_steps;               // Advance steps for exit speed pressure
+    #if ENABLED(LA_ZERO_SLOWDOWN)
+      uint32_t current_la_step_rate;        // Current (gradually changing) linear advance rate


I think current_la_step_rate does not need to be in PlannerBlock as it only applies to the currently executing block in the stepper module. Move it to stepper.h to save a bunch of RAM and also remove a bunch of indirection overhead.

tombrazier · 2024-07-22T10:13:26Z

In addition to my previous post, I have added some comments to the existing conversations about things I think could be improved and one thing I think is incorrect. They are here, here and here.

On the whole I really like this PR. I understand it sacrifices step accuracy for speed and that overall it will result in under-extrusion on accel and decel ramps. So that's a tradeoff. Whenever I have thought about implementing this I have also considered limiting XYZ speed to extruder speed / k factor. But I never fancied implementing that level of complexity.

In addition to the under-extrusion already noted, there is an over-extrusion at the end of accel ramps because the interval value used to work out when to start decelerating LA steps is larger than the remainder of the interval values that will be encountered. The opposite happens at the end of deceleration but instead of under-extrusion it will result in a burst of decel, accel, decel, accel, etc. segments. Getting the "when to start decelerating" calculation right is the other thing that always stopped me implementing this feature. The approximation in this PR is pretty simple and if it works well then all I can say is well played.

dbuezas · 2024-07-23T17:12:14Z

I made some extra tests. Same gcode, same k=0.15

Left is this PR, took 37 minutes
Middle is the pressure follower experiment, took 37 minutes BEST AND FASTEST
Right is normal LA, took 48 minutes

There's two take aways:

Extrusion with the original approach of this PR is not consistent. Fast but less quality, needs more work.
The computationally expensive pressure follower (middle pic) gives perfect quality (even slightly better than slow current LA!) and it cuts 23% of printing time. Corners are sharper and top surface is more uniform. I need to precompute the curve in the planner to make it efficient.

The fact that the the "pressure follower" is better than slow LA must come from the fact it also takes jerk into account, allows PA changes during cruise phase, and doesn't assume all intervals are equal (an issue @tombrazier mentioned above).

dbuezas · 2024-08-17T19:31:39Z

Superseded by #27352

dbuezas marked this pull request as draft July 10, 2024 22:14

thisiskeithb added Needs: Work More work is needed F: Linear Advance C: Motion labels Jul 11, 2024

ellensp added Needs: Testing Testing is needed for this change F: Linear Advance C: Motion and removed F: Linear Advance C: Motion labels Jul 11, 2024

dbuezas force-pushed the dbuezas/zero-slowdown-linear-advance branch from ceaadad to 5c6c3cc Compare July 11, 2024 17:10

dbuezas commented Jul 11, 2024

View reviewed changes

dbuezas marked this pull request as ready for review July 11, 2024 20:13

dbuezas added 8 commits July 12, 2024 23:11

First working version of zero slowdown linear advance

09e5dd1

undo respect e acc which was wrong.

9c9fd30

Only SERIAL_ECHOLNPGM in debug mode

44f5a51

comment

1f7ac23

Remove unnecessary extra slow down step.

d2d16c7

The last slowdown can be done in cruise mode and stay within max_e_acc, or in the next step event

Add LA_ZERO_SLOWDOWN option and fix incompatibility with S_CURVE

d0a944d

Mark as experimental

a26c6f5

Better var names (1/2)

91302ac

dbuezas force-pushed the dbuezas/zero-slowdown-linear-advance branch from 90c57df to ca96ab0 Compare July 12, 2024 21:11

Better var names (2/2)

8c23760

dbuezas force-pushed the dbuezas/zero-slowdown-linear-advance branch from ca96ab0 to 8c23760 Compare July 12, 2024 21:30

thisiskeithb marked this pull request as draft July 19, 2024 17:35

thisiskeithb changed the title ~~Zero slowdown linear advance~~ [WIP] Zero slowdown linear advance Jul 19, 2024

tombrazier reviewed Jul 22, 2024

View reviewed changes

thinkyhead force-pushed the bugfix-2.1.x branch from c792921 to 37fb26b Compare August 12, 2024 01:01

dbuezas mentioned this pull request Aug 13, 2024

New zero slowdown linear advance algorithm #27352

Draft

dbuezas closed this Aug 17, 2024

thisiskeithb added the S: Superseded label Aug 19, 2024

[WIP] Zero slowdown linear advance #27259

[WIP] Zero slowdown linear advance #27259

Conversation

dbuezas commented Jul 10, 2024 • edited Loading

Superseded by ------> #27352 <--------

Description

Computing slowdown of current approach

Todos:

What I wish for

Left away for follow up PRs

Requirements

Benefits

Configurations

Related Issues

ellensp commented Jul 11, 2024 • edited Loading

dbuezas commented Jul 11, 2024

dbuezas commented Jul 11, 2024 • edited Loading

dbuezas Jul 11, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dbuezas Jul 11, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dbuezas Jul 13, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dbuezas Jul 11, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dbuezas Jul 11, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dbuezas commented Jul 13, 2024

dbuezas commented Jul 18, 2024

tombrazier commented Jul 22, 2024

Choose a reason for hiding this comment

tombrazier commented Jul 22, 2024

dbuezas commented Jul 23, 2024

dbuezas commented Aug 17, 2024

dbuezas commented Jul 10, 2024 •

edited

Loading

ellensp commented Jul 11, 2024 •

edited

Loading

dbuezas commented Jul 11, 2024 •

edited

Loading

dbuezas Jul 11, 2024 •

edited

Loading

dbuezas Jul 11, 2024 •

edited

Loading

dbuezas Jul 13, 2024 •

edited

Loading

dbuezas Jul 11, 2024 •

edited

Loading

dbuezas Jul 11, 2024 •

edited

Loading