Option to smooth ETA and *_per_second #394

aawsome · 2022-03-12T21:39:03Z

First, thanks a lot for this wonderful crate. It is awesome to use!

I have a minor issue, I have a workload with highly changing operation rates (in bytes per seconds) which is kind of desired behavior as sometimes caching is used and sometimes not. Now, displaying ETA and bytes_per_second works but is of course very oscillating.
Is it possible to add an option which allows to smooth these value, e.g. that a longer time period is used for the calculation?

djc · 2022-03-12T22:01:42Z

They are already smoothed to some extent, but not enough for your workload, I guess? On current main, you can add your own formatters, this would allow you to estimate based on total elapsed time / current position. Would that work for you?

I've been thinking about including formatters for that in the default set, likely called linear_eta (or maybe uniform_eta).

aawsome · 2022-03-13T07:47:13Z

Thanks a lot for your fast answer! This is a very good idea, I'm now using

            .with_key("my_eta", |s| 
                 match (s.pos(), s.len()){
                    (0, _) => "-".to_string(),
                    (pos,len) => format!("{:#}", HumanDuration(Duration::from_secs(s.elapsed().as_secs() * (len-pos)/pos))),
                })

I'd appreciate to have that in the default set of formatters. If you like, I can try to prepare a PR for this.

There is still some "flickering" for the rate *_per_seconds in my use case, but this can be very much tolerated as it correctly reflects the situation.

djc · 2022-03-13T12:11:57Z

Can't you use a similar solution to make your own take on _per_seconds?

aawsome · 2022-03-17T22:26:00Z

Can't you use a similar solution to make your own take on _per_seconds?

Yes, of course that would work. But as this has more a "local" character (gives the current rate) I decided to simply use the default formatter.

For the ETA, this is more a of a global character and therefore I used the solution above.
What would you consider a good name? "linear_eta"? Or "global_eta"

If you like, I can submit a PR to add two default formatters "_eta" and "_eta_precise".

1Dragoon · 2022-03-22T20:18:38Z

I was thinking the same thing just now, but also applicable to other measurements rather than just ETA. For example, bytes_per_sec keeps bouncing all over the place - it doesn't seem like it is a 1 second interval because I end up watching it bounce rapidly from the KiB scale to the GiB scale.

FWIW this is for network transfers using the win32 copy api directly rather than the std copy, and I've added a callback to have it report its progress directly to indicatif during the transfer. I'm also doing multiple file transfers in parallel. It would totally make sense for indicatif to see a lot of bumpy progress even though it's relatively stable.

djc · 2022-03-23T16:20:11Z

I'm pretty convinced by now that we should have eta and *_per_seconds things that basically just extrapolate elapsed time over the current position. What I'm still wondering about is whether that should be the default behavior, replacing the more complicated, fragile Estimator, and moving the Estimator to different names. Or even remove the Estimator completely, replacing it with building blocks that allow people to build their own more accurate versions? I guess that can already be done in a custom formatter, given access to pos() and elapsed()?

@chris-laplante any thoughts on the matter?

1Dragoon · 2022-03-23T17:19:42Z

I experimented a bit with storing the current pos at approximate intervals and averaging. It smooths it out a little bit, but I think (for *_per_sec at least) it will probably need a stateful formatter that doesn't immediately jump downward from one SI unit to another, but does so only after staying at the higher one for a minimum amount of time. Might also want to have fixed decimal points and reserve white space to the left every time more digits are needed, that way we don't bounce around the text should the numbers decrease to a lower number of digits.

chris-laplante · 2022-03-23T18:19:00Z

I'm pretty convinced by now that we should have eta and *_per_seconds things that basically just extrapolate elapsed time over the current position. What I'm still wondering about is whether that should be the default behavior, replacing the more complicated, fragile Estimator, and moving the Estimator to different names. Or even remove the Estimator completely, replacing it with building blocks that allow people to build their own more accurate versions? I guess that can already be done in a custom formatter, given access to pos() and elapsed()?

@chris-laplante any thoughts on the matter?

I haven't looked at this deeply yet (and I'm actually taking off from work today because of a cold, so I'm only just half here), but it may be worth looking at what some other popular progress libraries do? For example, in Python: https://github.com/tqdm/tqdm. Maybe not applicable directly but could give us some ideas.

1Dragoon · 2022-03-24T05:38:52Z

I submitted PR #418 as a proof of concept. Not really intending that this be merged as is, but it does provide a tool for smoothing out *_per_sec and ETA

1Dragoon · 2022-03-24T21:57:34Z

Has anybody thought of turning estimator into a trait? Thinking maybe when the progressbar is constructed, a struct implementing that trait can be passed as a parameter. If it's not specified then just use the existing method, which uses the existing struct. Also possibly have the formatters ask the estimator for any state information that it might want them to have. Say for example, have the estimator indicate whether a *_per_sec indicator should currently be in Kilo SI unit, Mega SI unit, etc. If it returns say a None for a state, then just do what they already do.

It may even be worth feature gating a few more complex implementations of estimator for convenience. They wouldn't have to alter the compute cost of any code that doesn't need them.

1Dragoon · 2022-03-25T10:50:16Z

Made another stab at it, see PR #420.

1Dragoon · 2022-03-26T04:33:21Z

Pushed more changes. For some reason the ETA doesn't work the way it's supposed to when I use my alternate estimator when it's running on the janky transfer example I put in there. I haven't debugged it to see why that flow works differently, but it's based on the same calculation, the only difference is the normal eta method passes it through Duration instead. However, it does work correctly if it's a non-janky progression.

1Dragoon · 2022-03-28T20:44:49Z

I think what I ultimately came up with in PR #420 after lots of refactoring (and I'm still learning Rust, in addition to being a very inexperienced programmer in general) is basically the same thing as what is already there, but tweaked slightly. The end result can work for any use case that I can imagine, including being able to mimic the behavior of the existing estimator (just specify 15 samples and zero duration.) IMO it should just replace the existing one, provide some reasonable defaults for sample size and interval period and just allow the user the option of overriding the defaults. Another advantage of my approach is it might yield better performance for most users, depending on the settings chosen.

If any maintainers agree with that then I'll go ahead and modify the code in the pull request accordingly.

djc · 2022-03-29T12:18:15Z

So I now think we should probably do something like this:

trait ProgressTracker: std::fmt::Display + Default { // name  to be bikeshedded
    fn tick(&mut self, state: &ProgressState);
}

This would replace the fn(_: &ProgressState) -> String (taking care of the concerns in #391) and the Estimator in #420. We'd create a Default::default() on when cloning the ProgressStyle and leverage the Display impl when formatting in ProgressStyle::format_state().

1Dragoon · 2022-03-30T00:59:49Z

I had actually drafted some code towards that end yesterday, and I couldn't see any way to do it without mutably borrowing ProgressState. Here's the gist of it:

// User code
    pb.update(|state| {
        state.est = Box::new(MyEstimator::new(3, Duration::from_millis(500)));
        state.ext = Some(Box::new(MyExtension::new()));
    });

// state.rs
pub trait Extension {
    fn tick(&mut self, state: &mut ProgressState, now: Instant);
    fn reset(&mut self, now: Instant);
    fn data(&self) -> Vec<ReturnPath>; // Considering replacing this with MPSC
}

... snip ...
// imple BarState
    pub(crate) fn tick(&mut self, now: Instant) {
        if self.ticker.is_none() || self.state.tick == 0 {
            self.state.tick = self.state.tick.saturating_add(1);
        }

        let pos = self.state.pos.pos.load(Ordering::Relaxed);
        self.state.est.record(pos, now);
        let _ = self.draw(false, now);

        if let Some(mut ext) = self.state.ext.take() {
            ext.tick(&mut self.state, now); // Can't immutably borrow without breaking the double mutable borrow rule
            self.state.ext = Some(ext);
        }
    }

It compiles and runs fine it seems. Somebody more knowledgeable than me would have to take a crack at it to get it immutably. Though the only real downside I see of having it mutable is that back-end changes could break user code, even in minor releases.

djc · 2022-03-30T09:25:12Z

I was actually thinking these would live as values in the ProgressStyle::format_map(), where that might be less of an issue. As in your Extension trait, I think it would make sense to also pass Instant to tick(). I'm not convinced about the need for reset() and I don't understand what data() would be for.

1Dragoon · 2022-03-30T15:13:09Z

In my particular use case, I'm using stateful information that is based on what is already known about the progress. If we were to call a reset, then that information would immediately become inconsistent and we might see some unexpected behavior. Also this could be used for more than one formatter, so say I needed one set of information about ETA, (which you don't see implemented in #420, but nonetheless there is a use for it) and another set of information about bytes_per_sec, then I'll need some way to communicate that back to the the calling function, which is what data is for (though admittedly, I didn't flesh that out terribly well, what I did in the PR is just to get my point across and let people more knowledgeable than me determine the best path from there.)

But yeah I see your point about format map, which could let us do away with the need for a data return path. I'm not sure how that avoids a mutable borrow of progressstate though. In order to tick it and even reset it, I think the calls would have to originate from a mutable reference to barstate, wouldn't it?

At any rate, could you have a look at the smoothing example I submitted as part of #420? Those (two separate "downloads") at least demonstrate my two main intended use cases. Maybe there's a better way to achieve those than the mess I created?

djc · 2022-03-30T19:15:17Z

My goal here is to create the simplest possible API that can serve decently complicated use cases. One goal of that is that I can avoid digging into the details of everyone's use cases. Sharing state between multiple formatters definitely passes beyond my complexity threshold. If you want, you can display two pieces of information next to each other in a single fmt::Display, which seems like a reasonable solution if you have two things that badly need to share state.

1Dragoon · 2022-03-30T19:53:22Z

Yep that's perfectly reasonable. And to be honest, it felt kludgy as I was writing it. I'll take another stab later. The only reason I'm offering my use case is I might be missing something obvious, and if somebody else has a better solution to that problem, all the better. I see how we could easily nix the data part. Especially considering that it's still possible to share state anyways without any API help needed (many ways to do that, actually, thread safe too.) Are you still wanting the reset nixed though?

djc · 2022-03-30T20:50:57Z

I think having reset could make sense.

1Dragoon · 2022-03-31T06:53:05Z

Somebody correct me if I'm wrong, but I don't think a trait could ever live inside of format_map because of the fact that ProgressStyle is Clone. Likewise, neither will Default, and for the same reason. The only way this is really possible is if we wrap it into an Arc<Mutex<T>>. I have an aversion to locks, especially within APIs -- deadlocks are infuriating to troubleshoot, can seemingly come out of nowhere at times in edge cases, and take away from that high assurance that I like to aim for. Maybe that's just my autistic paranoia talking? I'd like a more seasoned developer to offer input on that.

I've got an alternative in mind that is lock free while *ahem* technically staying within your rules :) I'll share more after I've made sure it's viable.

djc · 2022-03-31T08:16:42Z

This is why I specified a Default constraint on ProgressTracker, we can create a new instance of the ProgressTracker on clone.

1Dragoon · 2022-03-31T16:13:31Z

Maybe I'm just naive, but isn't any trait method that returns Self going to break object safety? Which AFAIK, default would do that. Also, how would independent clones of an object maintain the same state with one another? Maybe I'm not understanding the purpose behind cloning ProgressStyle.

There are only two other ways I could see doing this while fitting within your parameters:

Require a stateful ProgressTracker to only use struct fields with internal mutability (in which case we can just wrap it in an Arc) which would mean tick() would have to be &self instead of &mut self.
ProgressTracker lives within i.e. BarState, and upon being tick()'ed, BarState sends a message to ProgressStyle We could still do that while keeping the end user API simple, in other words keeping this trait definition:

pub trait ProgressTracker {
    fn tick(&mut self, state: &mut ProgressState, now: Instant);
    fn reset(&mut self, now: Instant);
}

It would just require a little bit of magic from crossbeam_channel that the API user doesn't have to bother with or even be aware of, thus maintaining a simple API. We can also do it without making any of the internal bits complex either, and no change in performance for anybody not using stateful formatting.

1Dragoon · 2022-04-03T18:42:35Z

I just revised PR #420. Although the stateful formatters don't live in ProgressStyle (they live in ProgressState) this approach allows the user to still define them via ProgressStyle and still have them persist, keeping the API simple. I've also worked it so that the estimator could be made optional in the future (i.e. there's no estimator at all unless it's actually needed) though it still remains mandatory right now in order to maintain compatibility with existing user code. And, it can also be swapped out with another one entirely if the user so desires.

Just one trait is added: StatefulFormatter, and it only has two methods: tick and reset. Estimators can be made using this same trait, the user just has to be sure to update rate in ProgressState.

Although some additional locks are added, they won't ever lock in situations where ProgressState isn't already locked, so in some sense there are no additional locks. Either way, a deadlock should be outright impossible here. ProgressStyle should never have to pause due to a lock.

There are a few things I can improve as I'm not entirely satisfied with this yet, but let me know how this looks to you so far.

1Dragoon · 2022-04-04T14:57:41Z

It looks like the StatefulFormatter trait can have its sync requirement relaxed with no real consequences, at least with the way things are right now. So the final trait can look like this:

pub trait StatefulFormatter: Display + Send {
    fn tick(&mut self, state: &mut ProgressState, now: Instant);
    fn reset(&mut self, now: Instant);
}

The reason I called it a StatefulFormatter rather than ProgressTracker is because it's flexible enough that the user can define it for whatever purpose they need. That may or may not be for tracking progress. See the case of my formatter that smooths out bumpy SI unit transitions, which doesn't necessarily have to do with progress tracking.

I'm tempted to add a position parameter requirement to the reset function so that the user can call a reset_eta while allowing their stateful formatters to acquire the bar position immediately rather than having to wait for it to be tick()'ed again. Or maybe even just make it require &mut ProgressState entirely. Not sure yet.

The way I transfer ownership of the stateful formatters feels like a bit of a kludge, but at the same time it makes the API seem a lot more obvious to the user (seems a little weird defining formatters through the style struct and then doing it separately for the pb as a whole for stateful ones) and eliminates some constraints I would otherwise have to put on it. So I'm less inclined to modify it than I was yesterday.

Not sure how the maintainers feel about this, but given tick() requires mutable references to its parent object due to Rust's borrowing rules, perhaps there should be some kind of container struct for these that only exposes a minimal set of fields and functions? Would allow the maintainers to more freely modify ProgressState without having to worry about whether they're breaking any user code, but it would also substantially add to the overall footprint. Given I'm not a maintainer, nor do I have any experience maintaining a popular library like this, I don't feel I'm qualified to make any kind of vote on that one way or another.

There's also a bug with the code I posted where a reset_eta is required by the user in order for user defined estimators to actually work, which I'd rather not be there. I haven't had time to step through it yet to sort out why or what to do about it.

ForgQi · 2022-05-04T13:08:12Z

Thank you for your work.
I encountered a performance problem on 0.16.2 and updated to 0.17.0-rc10, "bytes_ per_ sec" becomes not smooth, What can I do to make it like before

djc · 2022-07-04T11:22:07Z

@ForgQi depends. Would you mind using git bisect to figure out when it became not smooth for you? (Or providing a minimal reproduction that would enable doing the same.)

AronParker · 2022-07-04T13:22:17Z

I believe (without knowing the concrete example of FrogQi) the problem is as follows:

The current Estimator tracks the last 16 speeds that were measured when incrementing the progress bar and yields the average. This works fine when the variance of speeds between each tick in a 16 sized window does not change substantially.

That assumption falls short for anything that is cached. Suppose one uses a BufWriter with a 64K buffer around a File. Any writes that amount to lower than 64K will seem instantaneous, whereas that last write that causes a flush will pessimize the measurement.

Using the above example, when writing 1K chunks it'll overestimate the speed drastically (since we're only measuing writes to memory). Once we have to flush to disk, we'll pessimize the estimations drastically (because once the 64K are written, we're good to go for another 64 rounds before we need to flush the cache again, whereas the estimator believes this will happen every 16 steps - (1 flush + 15 writes) extrapolated)

Possible solutions include using a simple linear extrapolation or using a hybrid approach where half of the estimation is done via the current estimator and the another half is done using a simple linear extrapolation.

I believe linear extrapolation would be the most reasonable and unsurprising default, as operations that need progress bars are often more complex than they are varying in speed.

djc · 2022-07-04T13:30:23Z

I mean, I get that part. #420 will make it possible to provide a fully customizable way to estimate. What I'm wondering about is specifically the regression that @ForgQi is describing between 0.16.2 and 0.17.0-rc.10, since IIRC the core strategy should not have changed between those releases.

desbma · 2022-11-10T09:01:47Z

If anyone is stumbling upon this and wants a quick solution, this is what I use with indicatif 0.17.1 to get smoothed ETA and "per_sec":

          .with_key(
              "smoothed_eta",
              |s: &ProgressState, w: &mut dyn Write| match (s.pos(), s.len()) {
                  (pos, Some(len)) => write!(
                      w,
                      "{:#}",
                      HumanDuration(Duration::from_millis(
                          (s.elapsed().as_millis() * (len as u128 - pos as u128) / (pos as u128))
                              as u64
                      ))
                  )
                  .unwrap(),
                  _ => write!(w, "-").unwrap(),
              },
          )
          .with_key(
              "smoothed_per_sec",
              |s: &ProgressState, w: &mut dyn Write| match (s.pos(), s.elapsed().as_millis()) {
                  (pos, elapsed_ms) if elapsed_ms > 0 => {
                      write!(w, "{:.2}/s", pos as f64 * 1000_f64 / elapsed_ms as f64).unwrap()
                  }
                  _ => write!(w, "-").unwrap(),
              },
          ),

aawsome · 2022-12-09T20:40:11Z

As this is still open, here are my thoughts:

I don't think we need a more fancy method to smooth out things in the estimator - a sliding window is simple and should work. Maybe allow the users to choose the window size as an option to finetune the smoothing.
I think the main problem is that for high-interval ticks or incs, the sliding window is completely refilled before anything is even drawn to screen due to the ratelimiter in DrawTarget (and setting a higher draw frequency doesn't help with the user experiencing the flickering). So I would propose to only update the estimator when we really draw the progress bar. IMO this makes sense as an inc(1000) between two draws should show the same result as 1000 inc(1)s in between - but currently the result in the estimator is completely different.
The result of my proposal can be seen when using a enable_steady_tick together with Update estimate when using a steady ticker #495 - then the estimator is really only updated at the interval of the steady tick.

@djc: If you like this proposal, give me a ping and I'll start a PR.

djc · 2022-12-09T20:53:28Z

Yes, this seems reasonable!

DonSheddow · 2023-04-15T13:20:32Z

@desbma's code gives an estimation averaged over the entire duration of the progress bar, but if you want a moving average based on the last second, I use the following code:

#[derive(Clone, Default)]
struct MovingAvgRate {
    samples: VecDeque<(std::time::Instant, u64)>,
}

impl ProgressTracker for MovingAvgRate {
    fn clone_box(&self) -> Box<dyn ProgressTracker> {
        Box::new(self.clone())
    }

    fn tick(&mut self, state: &indicatif::ProgressState, now: std::time::Instant) {
        // sample at most every 20ms
        if self
            .samples
            .back()
            .map_or(true, |(prev, _)| (now - *prev) > Duration::from_millis(20))
        {
            self.samples.push_back((now, state.pos()));
        }

        while let Some(first) = self.samples.front() {
            if now - first.0 > Duration::from_secs(1) {
                self.samples.pop_front();
            } else {
                break;
            }
        }
    }

    fn reset(&mut self, _state: &indicatif::ProgressState, _now: std::time::Instant) {
        self.samples = Default::default();
    }

    fn write(&self, _state: &indicatif::ProgressState, w: &mut dyn std::fmt::Write) {
        match (self.samples.front(), self.samples.back()) {
            (Some((t0, p0)), Some((t1, p1))) if self.samples.len() > 1 => {
                let elapsed_ms = (*t1 - *t0).as_millis();
                let rate = (p1 - p0) as f64 * 1000f64 / elapsed_ms as f64;
                write!(w, "{rate:.2}/s").unwrap()
            }
            _ => write!(w, "-").unwrap(),
        }
    }
}

which can be used like this: .with_key("smoothed_per_sec", MovingAvgRate::default()). It only samples 50 times per second, so I think it's reasonably performant.

Ten0 · 2023-04-25T16:52:45Z

@desbma

this is what I use with indicatif 0.17.1 to get smoothed ETA

Looks like this is an instant panic: attempt to divide by zero if current progress is 0.

Setting (pos, Some(len)) if pos != 0 fixes this.

afontenot · 2023-04-28T05:19:08Z

I'm in agreement that a exponentially weighted moving average is the best default - with a couple of caveats. I have a branch that implements this in indicatif if you'd like to try it out: https://github.com/afontenot/indicatif/tree/exponential-weighting

Couple of notes:

There are still a few issues to be ironed out, it's not ready to be merged. E.g. I haven't touched the tests at all. Feel free to submit PRs against my branch if you'd like to help out.
My branch is intended to (eventually) allow programs that use indicatif to tune the weighting. I have functions for this but haven't figured out the best way to expose them yet. (Again, help appreciated.) I set a default weight of 15 seconds, where that means that the last 15 seconds receive 90% of the weight in the weighted average.
Actually, I'm not quite convinced that a straight weighted average is the way to go. I've made some real world test data and a program for reproducing indicatif results available, and what I notice is that an exponentially weighted average is extremely jittery:

The blue dots are what an actual, real-world file transfer looks like (with the Rust version of Magic Wormhole, which uses indicatif). This is after setting a steady tick rate of 250 ms! There's a whole lot of stalling that happens as the TCP buffer fills and waits to be emptied.

This results in a lot of jitter in the estimate, which is observable to the end user as the ETA flickering up and down.

So what I did instead is implement a double weighted average, which is enabled by default (but can also be disabled). This introduces a small amount of additional lag (as does any moving average), but as you can see from the graph the resulting estimate is much better behaved.

Ideas / comments appreciated!

fintelia · 2023-04-28T06:08:21Z

How well does the moving average handle very infrequent updates, say only a couple events per minute?

afontenot · 2023-04-28T07:53:32Z

How well does the moving average handle very infrequent updates, say only a couple events per minute?

Not great by default. Necessarily if you are using a short window with a moving average, you will see cyclic behavior if you have infrequent events. You'd want to make the window significantly larger, probably at least ~10 times the average time between events. Maybe there's some way to do dynamic window scaling without it getting too complicated.

chris-laplante · 2023-04-28T14:52:53Z

I don't have a lot to add here, but it may be worth considering what other progress libraries do in other ecosystems. tqdm seems to use exponential moving average, for example: https://github.com/tqdm/tqdm/blob/master/tqdm/std.py#L213.

afontenot · 2023-04-28T19:53:36Z

I don't have a lot to add here, but it may be worth considering what other progress libraries do in other ecosystems. tqdm seems to use exponential moving average, for example: https://github.com/tqdm/tqdm/blob/master/tqdm/std.py#L213.

Interesting - this doesn't appear to do any kind of time weighting, so it will return inaccurate results if the rate is not independent of the time between events. I wrote the exponential average to properly weight by time, with a target value (in seconds) at which data reaches a particular dilution (10%).

alexblanck · 2023-05-02T23:21:38Z

I agree completely with @aawsome

I think the main problem is that for high-interval ticks or incs, the sliding window is completely refilled before anything is even drawn to screen due to the ratelimiter in DrawTarget

Even adjusting the existing estimators to only update when something is actually drawn would help immensely here for programs that update the progress very frequently.

After that is resolved, we can move on to discussing sliding windows vs exponential moving average vs time-weighted, etc..

aawsome · 2023-05-03T13:15:20Z

Even adjusting the existing estimators to only update when something is actually drawn would help immensely here for programs that update the progress very frequently.

I actually tried to work on it - but this turns out to be a bit complex.
There are several ratelimiters currently in use:

Using a steady ticker, the ProgressState is only updated by this ticker
Using inc() without a steady ticker, the ProgressState gets only updated depending on what is allowed by AtomicPosition (didn't get how/why it is working like it is)
Finally, there is the ratelimiter on the DrawTarget which isn't coupled to the ProgressState but controls when the progressbar is actually drawn.

So I wasn't able to present a nice solution without either

adding another ratelimiter which would introduce even more complexity, or
trying to refactor the current concept of ratelimiters with the chance of breaking anything - honestly I wasn't brave enough to do this.

As the the ProgressState gets updated as wanted when using a steady ticker and I only need progressbars with steady tickers, I stoped working on this and just use what's already working.

afontenot · 2023-05-03T17:45:38Z

Even adjusting the existing estimators to only update when something is actually drawn would help immensely here for programs that update the progress very frequently.

If I'm not mistaken, this is already the case if you're using a steady ticker. The estimator update function only gets called when there's a tick, and the tick only happens when the timer calls it. The problems with the status quo are

There is a mathematical error in how the average is determined using the rolling buffer that makes it extremely inaccurate under certain conditions; that's my bug With burst-heavy writes, eta and bitrate show incorrect values when using enable_steady_tick #534
A flat average is less ideal than an exponentially weighted average for most use cases because an EWA (a) prioritizes recency, and (b) doesn't have the limited history of a rolling buffer.
When progress stalls (e.g. you're transferring a file over the Internet and someone pulls out your Ethernet cable), the current estimator stalls, with the ETA showing the last determined value until progress is made again. The situation where this is most annoying is when you have regular, predictable stalls that don't indicate a problem with the transfer. I also discuss this in 534.

The good news is that my approach to the EWA solves all three problems simultaneously. Because I properly time weight new data in the estimator, I don't have the issue identified in (1), which means that you should see very similar estimations whether you use a steady tick or not! And because the predictor factors in the time since the last progress was made, the prediction corrects for (3) and the estimator never stalls.

Basically, the only thing you should have to worry about with the new estimator is choosing an appropriate weighting value for your use case. A decent rule of thumb might be to set the value to something like the 98th percentile for foreseeable event delays (or 15 seconds, whichever is greater). Some future work could try to make this "self-tuning", but I don't think you need that to see a big improvement over the current estimator.

If your progress events are sufficiently rare and unpredictable, I think you're likely to see the best results by (a) choosing a very high weight for the EWA, and (b) disabling steady tick. There's nothing wrong in principle with using a value like 1000000 for the weight, and because I do time weighting, there are no negative consequences to disabling steady tick (except that your ETA won't update until there's an event, but presumably you don't want that).*

* I actually don't know if this is true, I need to test it. The progress bar might decide to update itself even when there's been no tick, but I don't think it does.

alexblanck · 2023-05-03T19:00:39Z

Awesome, using the steady ticker does indeed seem to stabilize things a bit. Would be nice if that wasn't necessary but it's a good workaround for now.

djc · 2023-05-07T13:25:08Z

@afontenot your double EWA sounds very promising, I'd love to see a clean PR for this. Thanks for working on this!

afontenot · 2023-05-07T21:22:12Z

@djc I can submit a PR of my branch as soon as I have a few tests written. The thing I'm hung up on at present is that all the Estimator functions (prior to my work) take an Instant argument that they treat as the current time. I changed this to just call Instant::now in the functions themselves.

Is there a reason that indicatif was passing time values around like this, other than to make testing easier? I could go back to the old way of doing it, if so, or if it just doesn't matter. As it is I've been down a rabbit hole trying to mock Instant for testing...

afontenot · 2023-05-08T01:51:43Z

@djc I have a draft PR up that leaves the now arguments in the Estimator functions. I can add a commit with some refactoring if that is desirable. #539

djc · 2023-06-03T21:52:20Z

With #539 merged, I'm going to close this issue as fixed because I believe we've made a great improvement to the smoothness of ETA and *_per_second. If you still run into issues with 0.17.5 or later, please file a new issue.

aawsome · 2023-06-04T04:29:19Z

Thanks a lot for your work @afontenot and @djc!
I'll test 0.17.5 with and without a steady ticker. (Though I think I'll keep a steady ticker as I sometimes have have pauses for more than a second and also present an elapsed time - users would expect the elapsed time to be updated regularly)

1Dragoon mentioned this issue Jul 8, 2022

Stateful Formatters #420

Merged

This was referenced Sep 12, 2022

ProgressBarIter and finding stream length via seeking breaks rate estimation #480

Closed

Bump version to 0.17.1 #481

Merged

djc mentioned this issue Nov 15, 2022

per_sec and eta values fluctuate wildly from one tick to the next #494

Closed

shssoichiro mentioned this issue Nov 16, 2022

Custom keys on ProgressStyle do not work together with enable_steady_tick #493

Closed

Walther mentioned this issue Jan 8, 2023

2023-01-08 tiny fixes Walther/clovers#173

Merged

This was referenced Apr 17, 2023

Progress bar bitrate intermittently drops on the sending side magic-wormhole/magic-wormhole.rs#192

Closed

With burst-heavy writes, eta and bitrate show incorrect values when using enable_steady_tick #534

Closed

djc mentioned this issue May 10, 2023

Switch Estimator to use an exponential weighting #539

Merged

djc closed this as completed Jun 3, 2023

teor2345 mentioned this issue Jun 29, 2023

Estimates become extremely large if progress updates are infrequent #556

Open

seqre mentioned this issue Mar 7, 2024

Revise showing ETA and rates for progress bars praetorian-inc/noseyparker#140

Open

Option to smooth ETA and *_per_second #394

Option to smooth ETA and *_per_second #394

Comments

aawsome commented Mar 12, 2022

djc commented Mar 12, 2022

aawsome commented Mar 13, 2022 • edited Loading

djc commented Mar 13, 2022

aawsome commented Mar 17, 2022

1Dragoon commented Mar 22, 2022

djc commented Mar 23, 2022

1Dragoon commented Mar 23, 2022

chris-laplante commented Mar 23, 2022

1Dragoon commented Mar 24, 2022

1Dragoon commented Mar 24, 2022 • edited Loading

1Dragoon commented Mar 25, 2022

1Dragoon commented Mar 26, 2022

1Dragoon commented Mar 28, 2022 • edited Loading

djc commented Mar 29, 2022

1Dragoon commented Mar 30, 2022 • edited Loading

djc commented Mar 30, 2022

1Dragoon commented Mar 30, 2022 • edited Loading

djc commented Mar 30, 2022

1Dragoon commented Mar 30, 2022 • edited Loading

djc commented Mar 30, 2022

1Dragoon commented Mar 31, 2022 • edited Loading

djc commented Mar 31, 2022

1Dragoon commented Mar 31, 2022 • edited Loading

1Dragoon commented Apr 3, 2022 • edited Loading

1Dragoon commented Apr 4, 2022 • edited Loading

ForgQi commented May 4, 2022

djc commented Jul 4, 2022

AronParker commented Jul 4, 2022

djc commented Jul 4, 2022

desbma commented Nov 10, 2022

aawsome commented Dec 9, 2022

djc commented Dec 9, 2022

DonSheddow commented Apr 15, 2023 • edited Loading

Ten0 commented Apr 25, 2023 • edited Loading

afontenot commented Apr 28, 2023

fintelia commented Apr 28, 2023

afontenot commented Apr 28, 2023

chris-laplante commented Apr 28, 2023

afontenot commented Apr 28, 2023

alexblanck commented May 2, 2023

aawsome commented May 3, 2023 • edited Loading

afontenot commented May 3, 2023

alexblanck commented May 3, 2023

djc commented May 7, 2023

afontenot commented May 7, 2023

afontenot commented May 8, 2023

djc commented Jun 3, 2023

aawsome commented Jun 4, 2023 • edited Loading

aawsome commented Mar 13, 2022 •

edited

Loading

1Dragoon commented Mar 24, 2022 •

edited

Loading

1Dragoon commented Mar 28, 2022 •

edited

Loading

1Dragoon commented Mar 30, 2022 •

edited

Loading

1Dragoon commented Mar 30, 2022 •

edited

Loading

1Dragoon commented Mar 30, 2022 •

edited

Loading

1Dragoon commented Mar 31, 2022 •

edited

Loading

1Dragoon commented Mar 31, 2022 •

edited

Loading

1Dragoon commented Apr 3, 2022 •

edited

Loading

1Dragoon commented Apr 4, 2022 •

edited

Loading

DonSheddow commented Apr 15, 2023 •

edited

Loading

Ten0 commented Apr 25, 2023 •

edited

Loading

aawsome commented May 3, 2023 •

edited

Loading

aawsome commented Jun 4, 2023 •

edited

Loading