Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Zero slowdown linear advance #27259

Conversation

dbuezas
Copy link
Contributor

@dbuezas dbuezas commented Jul 10, 2024

Superseded by ------> #27352 <--------

Description

Linear advance currently limits acceleration by k * xy_acceleraton * mm_of_extruder_per_mm_of_motion < e_jerk. This is to instantly achieve the required nozzle pressure when an extruding movement begins/ends acceleration/deceleration.

This PR removes that limitation by allowing nozzle pressure to build up incrementally as demonstrated by this gsheets simulation:

image

If the e-acceleration is set too small there will be some under-extrusion at the beginning and end of the acceleration phase, and over-extrusion at the beginning and end of deceleration phase of the trapezoidal acceleration profile.

I'm getting excellent mediocre but workinig results (with 5000mm/s2 movement acceleration and 3000mm/s2 extruder acceleration). Maybe even better than the current implementation, which I hypothesise is because the extruder acceleration slope makes extrusion behave more linearly. (too soon, there's a working better approach in the works)

I don't have a bowden machine to test this, but those will see the most drastic print time improvements.
Accelerations are a lot faster than with classic LA, so you may need to reduce them a bit.

Another good sign is that now the time estimation in Orca Slicer is accurate, regardless of k factor.

Computing slowdown of current approach

In my printer (direct drive extruder with a fairly long filament path of 14mm)

  • line width = 0.4mm
  • layer height = 0.2mm
  • e_jerk = 5mm/s
  • k = 0.1

Results in

  • mm_of_extruder_per_mm_of_motion = (0.4mm * 0.2mm * 1mm) / (pi * (1.75 / 2)^2) = 0.0332

Which applied to the equation above gives

  • 0.1 * xy_acceleraton * 0.0332 < 5
  • xy_acceleraton < 5/0.0332/0.1
  • xy_acceleraton < 1506mm/s^2

Find your own max acceleration (for the same layer height and line width) with your_max_xy_acceleraton = e_jerk/0.0332/k

It doesn't matter if the movement axes max acceleration is configured to 10k, linear advance will limit all extruding movements to your_max_xy_acceleraton (1500mm/s^2 in my case).

Todos:

  • Inline code documentation
  • A configuration_adv flag to switch between classic and zero slowdown LA
  • Reduce computations done in the block_phase_isr (e.g precompute more)
  • Reduce ram usage (e.g remove added value from the block)
  • A lot of testing

What I wish for

  • Testing in 8 bit boards
  • Testing in bowden extruders
  • Confirmation that I'm not accidentally exceeding the max_e_acceleration
  • Before and after pics :)

Left away for follow up PRs

  • S_CURVE_ACCELERATION matching LA: Make la_rate adapt to the acceleration of each step event
  • Target the final la_rate to match the acceleration of the next block
  • Smoothing out LA beyond the acceleration/deceleration phases and across contiguous blocks

Requirements

Benefits

Faster printing, particularly in machines using large k values.

Configurations

#define LIN_ADVANCE

Related Issues

@dbuezas dbuezas marked this pull request as draft July 10, 2024 22:14
@ellensp
Copy link
Contributor

ellensp commented Jul 11, 2024

Fails to build on test config [Test BIGTREE_BTT002] BigTreeTech BTT002 with Prusa MK3S and related options..

Marlin/src/module/stepper.cpp: In static member function 'static uint32_t Stepper::block_phase_isr()':
Marlin/src/module/stepper.cpp:2891:31: error: 'acc_step_rate' was not declared in this scope
2891 | uint32_t sum_rate = acc_step_rate + current_block->la_step_rate;
| ^~~~~~~~~~~~~

currently acc_step_rate is only defined as so...

#if DISABLED(S_CURVE_ACCELERATION)
  static uint32_t acc_step_rate; // needed for deceleration start point
#endif

This just builds standard configs here https://github.com/MarlinFirmware/Configurations/tree/bugfix-2.1.x/config/examples/Prusa/MK3S-BigTreeTech-BTT002

@dbuezas
Copy link
Contributor Author

dbuezas commented Jul 11, 2024

@ellensp It's not compatible with S_CURVE_ACCELERATION yet.

@dbuezas
Copy link
Contributor Author

dbuezas commented Jul 11, 2024

To test this, I suggest e-acceleration is set as high as possible (but before it starts rattling of course), I had it way too low and with this algorithm it is a limiting factor on how high pressure advance can react to changes in acceleration

@dbuezas dbuezas force-pushed the dbuezas/zero-slowdown-linear-advance branch from ceaadad to 5c6c3cc Compare July 11, 2024 17:10
@@ -874,6 +874,10 @@ void Planner::calculate_trapezoid_for_block(block_t * const block, const_float_t
const float comp = extruder_advance_K[E_INDEX_N(block->extruder)] * block->steps.e / block->step_event_count;
block->max_adv_steps = cruise_rate * comp;
block->final_adv_steps = final_rate * comp;
#if ENABLED(LA_ZERO_SLOWDOWN)
block->la_step_rate = 0;
Copy link
Contributor Author

@dbuezas dbuezas Jul 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could block->la_step_rate be a static variable of the stepper class or the planner to save ram?
I think it could since blocks run one at a time.

The variable keeps track of the linear advance term of current extruder velocity

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle the LA or acceleration parameters could change between blocks. So I think la_step_rate has to be stored for each block.

@@ -874,6 +874,10 @@ void Planner::calculate_trapezoid_for_block(block_t * const block, const_float_t
const float comp = extruder_advance_K[E_INDEX_N(block->extruder)] * block->steps.e / block->step_event_count;
block->max_adv_steps = cruise_rate * comp;
block->final_adv_steps = final_rate * comp;
#if ENABLED(LA_ZERO_SLOWDOWN)
block->la_step_rate = 0;
block->max_e_acc = (uint32_t)(float(planner.max_acceleration_steps_per_s2[E_AXIS + E_INDEX_N(extruder)]) * float(block->step_event_count) / float(block->steps[E_AXIS]) * (float(1UL << 24) / (STEPPER_TIMER_RATE)));
Copy link
Contributor Author

@dbuezas dbuezas Jul 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if block->max_e_acc could also be a static variable computed once when the block is dequeued to save more ram. It is a very expensive operation though so it shouldn't go in the isr

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Advice on where else to put it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it needs to be calculated in Planner::calculate_trapezoid_for_block() which can get called multiple times for any given block. All the variables it depends on are set in stone before or during Planner::_populate_block(). I think the logical place for it is in the #if ENABLED(LIN_ADVANCE) block at around line 2377.

It can't be static because max acceleration could change between blocks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think scaling by float(block->step_event_count) / float(block->steps[E_AXIS]) is incorrect. block->max_e_acc is going to be multiplied by an interval measured in stepper timer ticks and the result should be delta speed for the E axis. So it just needs to be steps / s / tick, which you would get without the scaling.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intended to scale it to movement steps (because "the Bresenham algorithm will convert this step rate into extruder steps").
I inferred it from these snippets that are already in the code:

#define LIMIT_ACCEL_FLOAT(AXIS,INDX) do{ \
      if (block->steps[AXIS] && max_acceleration_steps_per_s2[AXIS+INDX] < accel) { \
        const float max_possible = float(max_acceleration_steps_per_s2[AXIS+INDX]) * float(block->step_event_count) / float(block->steps[AXIS]); \
        NOMORE(accel, max_possible); \
      } \
    }while(0)

and

block->acceleration_rate = (uint32_t)(accel * (float(1UL << 24) / (STEPPER_TIMER_RATE)));

Which is then used like this in Stepper::block_phase_isr

acc_step_rate = STEP_MULTIPLY(acceleration_time, current_block->acceleration_rate) + current_block->initial_rate;

Am I seeing it wrong? I did it like that so I can multiply by interval, which I believe determines the time until the next call to block_phase_isr

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you're right.

#if ENABLED(LA_ZERO_SLOWDOWN)
const uint32_t step_events_left_until_ramp_down = accelerate_before - step_events_completed;
const uint32_t max_e_accel = STEP_MULTIPLY(interval, current_block->max_e_acc);
// TODO: respect max e accel by also considering the acceleration from the step_rate delta between two timer steps
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TODO is about using max_e_accel - step_rate_acceleration on ramp up, and max_e_accel + step_rate_acceleration on ramp down, since step_rate_acceleration is also seen by the extruder

Copy link
Contributor Author

@dbuezas dbuezas Jul 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like

const uint32_t max_delta_v =  STEP_MULTIPLY(interval, current_block->max_e_acc); 
const uint32_t acc_delta_v = STEP_MULTIPLY(interval, current_block->acceleration_rate);
const uint32_t max_delta_v_ramp_up = max_delta_v - acc_delta_v;
const uint32_t max_delta_v_ramp_down = max_delta_v + acc_delta_v;
const bool is_ramping_up = step_events_left_in_phase * max_delta_v_ramp_down >= current_block->current_la_step_rate;

if (is_ramping_up){
  current_block->current_la_step_rate += _MIN(max_delta_v_ramp_up, current_block->la_advance_rate - current_block->current_la_step_rate);
} else {
  current_block->current_la_step_rate -= _MIN(max_delta_v_ramp_down, current_block->current_la_step_rate);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could have been covered in planner.cpp I think when initially calculating current_block->max_e_acc.

const uint32_t la_step_rate = la_advance_steps < current_block->max_adv_steps ? current_block->la_advance_rate : 0;
#if ENABLED(LA_ZERO_SLOWDOWN)
const uint32_t step_events_left_until_ramp_down = accelerate_before - step_events_completed;
const uint32_t max_e_accel = STEP_MULTIPLY(interval, current_block->max_e_acc);
Copy link
Contributor Author

@dbuezas dbuezas Jul 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the interval constant across the execution of the whole block? if so, this could be computed only once at the beginning of each block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is where I have big plans for S_CURVE_ACCELERATION in a follow up PR: I'll keep track of the current acceleration of each step event and compute la_rate from it. That will make linear advance exactly match the s curve and finally be perfect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tombrazier I tried this, and the resulting quality is fantastic.
It's rather simple:

  • I removed all the LA code for acceleration/deacceleration/cruise/new_block from stepper.cpp
  • stored current_step_rate in a variable so I can access it at the end regardless of the current phase
  • added this at the end of block_phase_isr
if (la_active && current_block) {
      // K = mm of filament compression needed per 1mm/s extrusion speed [mm/mm/s].
      // K = steps of filament compression needed per 1steps/time extrusion speed [steps/steps/time].
      
      float e_step_rate = float(curr_step_rate) * xy_to_e_steps;
      float dt = float(interval)/float(STEPPER_TIMER_RATE);
      float dt_inv = 1.0f / dt; 

      float target_la_step_count = k * e_step_rate;
      float distance_to_target = target_la_step_count - current_la_step_count;

      float one_shot_v = distance_to_target * dt_inv;
      float a = ABS(one_shot_v - current_la_step_rate) * dt_inv;
      a = a > a_max ? a_max : a;
      
      float stopping_distance = current_la_step_rate*current_la_step_rate / (2 * a_max);
      
      if (ABS(distance_to_target) <= stopping_distance){
          // If we're within stopping distance, decelerate
          a *= current_la_step_rate<0 ? 1 : -1;
      }
      else{
          // Otherwise, accelerate towards the target
          a *= distance_to_target<0 ? -1 : 1;
      }

      current_la_step_rate += a * dt;
      current_la_step_count += current_la_step_rate * dt;

      const int32_t la_step_rate = current_la_step_rate * e_to_xy_steps;
      set_la_interval((int32_t)curr_step_rate + la_step_rate);
    } 

The idea is simple:

  • keep track of the current nozzle pressure
  • calculate target nozzle pressure
  • accelerate/decelerate extruder to achieve target nozzle pressure
  • aim at reaching target nozzle pressure with zero velocity (to avoid overshooting)

This will always lag a bit, but the resulting quality at full speed is already impressive (better than the code in this PR)

What I did is obviously extremely inefficient (i have a 550MHz mcu) but the results prove that there's a lot of quality to win from full speed LA. The quality with and without S_CURVE was pretty much the same

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see that working very nicely for a fast 32 bit processor, especially if it has floating point instructions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make prints and pics tomorrow, I think I'm getting better results than the current (slower) LA, even though this method takes 30% less time to print.
I hope I can figure out how to make this in the planner so this can be efficient. One thing I believe to be making a difference too is that it corrects for jerk, so it reduces blobs a bit. It will need to look at contiguous blocks.

Thanks for the review and the kind words btw

const uint32_t step_events_left_until_ramp_down = accelerate_before - step_events_completed;
const uint32_t max_e_accel = STEP_MULTIPLY(interval, current_block->max_e_acc);
// TODO: respect max e accel by also considering the acceleration from the step_rate delta between two timer steps
const bool is_ramping_up = step_events_left_until_ramp_down * max_e_accel >= current_block->la_step_rate;
Copy link
Contributor Author

@dbuezas dbuezas Jul 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

equivalent to

uint32_t step_events_it_would_take_to_slow_down = current_block->la_step_rate / max_e_accel;
const bool is_ramping_up = step_events_left_until_ramp_down >= step_events_it_would_take_to_slow_down;

but without the more expensive division operation

@@ -2460,7 +2460,20 @@ hal_timer_t Stepper::block_phase_isr() {

#if ENABLED(LIN_ADVANCE)
if (la_active) {
const uint32_t la_step_rate = la_advance_steps < current_block->max_adv_steps ? current_block->la_advance_rate : 0;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these checks done? as long as we are in the acceleration phase, la_step_rate should always be current_block->la_advance_rate. What am I missing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is la_advance_rate chosen in a way which ensures that max_adv_steps can't be surpassed under any circumstances during the accel phase? Or is there a possibility for rounding errors or similar problems that would allow la_advance_steps to slightly overshoot max_adv_steps? This check looks like it was originally meant to catch errors like that. Don't know if it's still useful with this PR in place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly. I assume it makes sure the exact added pressure is removed, although I believe this to have no effect in print quality (if anything it would result in a tiny bit of underextrusion since the extra pressure will likely have escaped the nozzle already somewhere else).

The idea of this PR is allowing for a less mathematically pedantic LA in favour of print speed, so if I understood correctly this check should indeed be irrelevant for it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, small deviations on a single accel are unlikely to have any visible effect. But you still need to make sure that la_advance_steps returns to (precisely) zero in the following decel phase, otherwise the error will accummulate over multiple blocks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it shouldn't accumulate since la_step_rate is reset to zero on every cruise phase and at the beginning of each block. If the extruder ran a couple of steps too many, then the increased pressure will bleed out as over extrusion.
as a sanity check I printed some benchies and I don't see any cumulative effects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you have any bowden machine left? I'm really curious about how much faster this can run without starting to degrade quality. If acceleration is pushed beyond the extruder's best effort, this PR would start looking like k=0. And of course even if the extrusion were perfect, the increased acceleration can worsen other issues like ringing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've still got an Ender 3 Pro with Bowden setup, but that starts layer shifting above 6000 mm/s² on the Y axis (bed) as far as I remember my last tests. But I run #define DEFAULT_EJERK 15.0 (originally even 20.0) with #define ADVANCE_K 0.45 and didn't have the impression that LA was the limiting factor for accel. According to your calculations in the first post, it should be, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to my calculations, LA should limit acceleration to 1004mm/s2.
Have you tried this PR? What I expect to happen is that the printer goes so fast that the extruder won't be able to fully keep up and it will result in less LA corrections applied. But reducing printing acc to say 3000 may be a good compromise. Also try increasing extruder acceleration as much as possible.
I'm super curious about any results you may get :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I improved linear advance back in #24533 there was a lot of discussion about not causing cumulative errors in E steps. This check was part of preventing that. (IIRC the reasoning was this: Before my changes E steps were locked in sync with XYZ steps which gave this guarantee. After my changes there was the potential of rounding errors.)

I agree that this could be relaxed for PR which sacrifices exact step numbers for speed and lets the minor differences bleed out into the print pretty much invisibly.

@dbuezas dbuezas marked this pull request as ready for review July 11, 2024 20:13
@dbuezas dbuezas force-pushed the dbuezas/zero-slowdown-linear-advance branch from 90c57df to ca96ab0 Compare July 12, 2024 21:11
@dbuezas dbuezas force-pushed the dbuezas/zero-slowdown-linear-advance branch from ca96ab0 to 8c23760 Compare July 12, 2024 21:30
@dbuezas
Copy link
Contributor Author

dbuezas commented Jul 13, 2024

The underextrusion on the top right of the cube and the typical bulged corners are gone, all without slowing down the print.
All took the same to print (33 minutes).

image

Needless to say, a cube printed with the current LA implementation can look slightly better since it will be printed at lower average speed.

@dbuezas
Copy link
Contributor Author

dbuezas commented Jul 18, 2024

@ellensp forgot to mention it, but it is now compatible with s curve acceleration. It is just not optimized to perfectly match nozzle pressure along the curve (current LA is not either)

@thisiskeithb thisiskeithb marked this pull request as draft July 19, 2024 17:35
@thisiskeithb thisiskeithb changed the title Zero slowdown linear advance [WIP] Zero slowdown linear advance Jul 19, 2024
@tombrazier
Copy link
Contributor

Well done. I have wanted to implement something like this for ages but have not had the time.

@@ -260,6 +260,10 @@ typedef struct PlannerBlock {
uint8_t la_scaling; // Scale ISR frequency down and step frequency up by 2 ^ la_scaling
uint16_t max_adv_steps, // Max advance steps to get cruising speed pressure
final_adv_steps; // Advance steps for exit speed pressure
#if ENABLED(LA_ZERO_SLOWDOWN)
uint32_t current_la_step_rate; // Current (gradually changing) linear advance rate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think current_la_step_rate does not need to be in PlannerBlock as it only applies to the currently executing block in the stepper module. Move it to stepper.h to save a bunch of RAM and also remove a bunch of indirection overhead.

@tombrazier
Copy link
Contributor

In addition to my previous post, I have added some comments to the existing conversations about things I think could be improved and one thing I think is incorrect. They are here, here and here.

On the whole I really like this PR. I understand it sacrifices step accuracy for speed and that overall it will result in under-extrusion on accel and decel ramps. So that's a tradeoff. Whenever I have thought about implementing this I have also considered limiting XYZ speed to extruder speed / k factor. But I never fancied implementing that level of complexity.

In addition to the under-extrusion already noted, there is an over-extrusion at the end of accel ramps because the interval value used to work out when to start decelerating LA steps is larger than the remainder of the interval values that will be encountered. The opposite happens at the end of deceleration but instead of under-extrusion it will result in a burst of decel, accel, decel, accel, etc. segments. Getting the "when to start decelerating" calculation right is the other thing that always stopped me implementing this feature. The approximation in this PR is pretty simple and if it works well then all I can say is well played.

@dbuezas
Copy link
Contributor Author

dbuezas commented Jul 23, 2024

I made some extra tests. Same gcode, same k=0.15

There's two take aways:

  • Extrusion with the original approach of this PR is not consistent. Fast but less quality, needs more work.
  • The computationally expensive pressure follower (middle pic) gives perfect quality (even slightly better than slow current LA!) and it cuts 23% of printing time. Corners are sharper and top surface is more uniform. I need to precompute the curve in the planner to make it efficient.

The fact that the the "pressure follower" is better than slow LA must come from the fact it also takes jerk into account, allows PA changes during cruise phase, and doesn't assume all intervals are equal (an issue @tombrazier mentioned above).

@dbuezas
Copy link
Contributor Author

dbuezas commented Aug 17, 2024

Superseded by #27352

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C: Motion F: Linear Advance Needs: Testing Testing is needed for this change Needs: Work More work is needed S: Superseded
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants