
Resampler - work in progress #14

Open
HpLightcorner opened this issue Mar 17, 2022 · 4 comments

@HpLightcorner
Contributor

Hi,
@matthiasstraka, I have several questions regarding the newly implemented Resampler. Since we can discuss source code directly here, it is probably easiest to handle them in this issue and simply close it once we are confident everything is working.

Out of Range issues
I think the following check should be

ODK_ASSERT_LTE(idx + 1, fp.size() - 1);

but is

ODK_ASSERT_LTE(idx + 1, fp.size());

First Sample Timestamp
Shouldn't the first sample timestamp be calculated based on the current number of samples?
So, instead of

first_sample_timestamp -= m_input_buffer.size() * estimated_sample_duration;

I suggest

first_sample_timestamp -= num_samples * estimated_sample_duration;

Furthermore, an additional safety check is a good idea to prevent negative timestamps

first_sample_timestamp -= num_samples * estimated_sample_duration;
ODK_ASSERT_GTE(first_sample_timestamp, 0);

What are your thoughts on that?

@matthiasstraka
Member

  1. You are correct, the ASSERT should be less-than fp.size(). Since we currently do not have ASSERT_LT, I will change it to

ODK_ASSERT_LTE(idx + 2, fp.size());

to prevent issues with the unsigned size() being 0 (or add ASSERT_LT, which requires a change in two codebases). But this immediately triggers a unit test failure -- I will look into that.
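For illustration only, a minimal sketch of the interpolation pattern the assert protects (not the actual Resampler code): both fp[idx] and fp[idx + 1] are read, so idx + 1 must remain a valid index, which idx + 2 <= fp.size() expresses without hitting the unsigned wrap-around of fp.size() - 1 on an empty vector.

#include <cstddef>
#include <vector>

// Minimal sketch of linear interpolation between two neighbouring samples.
double lerpSample(const std::vector<double>& fp, std::size_t idx, double frac)
{
    // Accesses fp[idx] and fp[idx + 1], hence the bound idx + 2 <= fp.size().
    return fp[idx] + frac * (fp[idx + 1] - fp[idx]);
}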

  2. If m_input_buffer is not empty, there are samples left over from a previous call that will sit in front of the new "num_samples" samples (or, as in the code, the new samples are simply appended after the old ones). We need them so the interpolation code can run smoothly across blocks of data.
    So,

    first_sample_timestamp -= m_input_buffer.size() * estimated_sample_duration;

    simply means: correct the first timestamp of the final m_input_buffer. I am pretty sure that this works, but the code definitely keeps more "old" samples than needed. And the more estimated_sample_duration-scaled timestamps we accumulate, the larger the estimation error becomes - so we might have to improve on that once there is a problem with real-world data. Is that understandable?
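To make the difference between the two corrections explicit, here is a small sketch (hypothetical free functions, not the actual Resampler members), assuming the incoming timestamp refers to the end of the newly added block:

#include <cstddef>

// Timestamp of the first sample of the newly added block only.
double firstTimestampOfNewBlock(double end_of_block, std::size_t num_samples,
                                double estimated_sample_duration)
{
    return end_of_block - num_samples * estimated_sample_duration;
}

// Timestamp of the first sample of the whole m_input_buffer (old samples + new block).
double firstTimestampOfBuffer(double end_of_block, std::size_t buffer_size,
                              double estimated_sample_duration)
{
    return end_of_block - buffer_size * estimated_sample_duration;
}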

  3. Additional ASSERTs for timestamps are definitely a good idea, especially since this code has not been used with non-synthetic data so far, and who knows what kind of data users will throw at it.

I will look into the ASSERT problems and push an updated version probably by tomorrow.

@HpLightcorner
Contributor Author

Great! Regarding 2)

It seems that I am having issues here with real-world data, especially at the start of the stream. I cannot guarantee that the first packet of the stream (let's call it start of stream, SoS) arrives perfectly in time with the Oxygen sample time restart, so I have to add zeros to align the SoS with the Oxygen time (NaN would probably be better, but it is not supported by the Resampler for now). This is an arbitrary number of zeros, hence I cannot guarantee a constant packet size, which seems to cause issues with the calculation of first_sample_time based on m_input_buffer. Using num_samples instead of m_input_buffer.size() solves this for now, but may cause other issues; I guess we will have to take another look at that.

@matthiasstraka
Member

I added a fix that addresses the problem with reading past the last element of the input vector.
I have a hard time picturing what real-world data looks like; I only rely on synthetic data provided to me by @moberhofer, which works similarly to what is generated in

auto samples = generateSignalRamp(block_size); // synthetic block of block_size ramp samples
double t = m_t_prev + samples.size() / m_true_sample_rate->getValue().m_val; // advance the timestamp by the duration of this block at the true sample rate
m_resampler.addSamples(host, out_channel->getLocalId(), t, samples.data(), samples.size()); // hand the block and its timestamp to the resampler

Currently, the assumption is that all data starts at time 0, but it should not be hard to initialize m_last_timestamp of the Resampler to a different value here:

ADD_CONTIGUOUS_SAMPLES should be able to handle pauses at the beginning of the stream (after the initial pause, all samples must be provided since it is a sync channel).
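Sketched from the caller's side (stream_start_time is hypothetical, the other names are taken from the snippet above), the idea would be to start the time bookkeeping at the first packet's arrival time instead of 0, provided m_last_timestamp is initialized to match:

double stream_start_time = 0.25; // hypothetical: first packet arrives at Oxygen time 0.25 s
m_t_prev = stream_start_time;    // instead of implicitly starting the stream at 0
auto samples = generateSignalRamp(block_size);
double t = m_t_prev + samples.size() / m_true_sample_rate->getValue().m_val;
m_resampler.addSamples(host, out_channel->getLocalId(), t, samples.data(), samples.size());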

@HpLightcorner
Contributor Author

HpLightcorner commented Mar 18, 2022

I will test the improvements on my code base.

@matthiasstraka When talking about real-world data, let's first define a use case: for example, microcontrollers stream ADC samples to the host by sending, e.g., 1000 samples every 100 ms, giving us 10,000 samples/second. The microcontrollers are synced via, e.g., gPTP and have a common understanding of when exactly the 1000 samples were acquired, so the timestamp is a global time shared among all microcontrollers.

Scenario A
The microcontrollers start their stream perfectly in sync with the Oxygen sampling (re)start. This is never going to happen unless Oxygen implements some further communication with these devices (e.g., sending a start command).

Scenario B
The first packet of the stream arrives at an arbitrary Oxygen time. The Resampler needs to "fill up" the time from 0 (start of Oxygen time) to the point at which the first packet arrived with NaN or zeros. Furthermore, this first packet maps the Oxygen time to the global microcontroller time; from my understanding this should happen only once and is then valid for all microcontrollers in the network (however, this only holds when all devices are synced via, e.g., gPTP, so we should either make this a requirement when working with multiple streams from multiple sources and the Resampler, or make it optional through an OOP design pattern). We can further demand that every subsequent packet contains the same number of samples.

So, I think the Resampler should fill up samples with 0 or NaN such that the stream of data is aligned with the Oxygen time - or ADD_CONTIGUOUS_SAMPLES should be able to pause a stream at the beginning 👍
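As a minimal sketch of the alignment I have in mind (hypothetical helper functions; sos_oxygen_time is the Oxygen time at which the SoS packet arrives):

#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of fill samples (0 or NaN) needed to align the
// start of stream (SoS) with Oxygen time 0 at the nominal sample rate.
std::size_t leadingFillSamples(double sos_oxygen_time, double nominal_sample_rate)
{
    return static_cast<std::size_t>(std::llround(sos_oxygen_time * nominal_sample_rate));
}

// Build the fill block, using NaN (or 0.0 as long as NaN is not supported).
std::vector<double> makeFillBlock(std::size_t count, double fill_value)
{
    return std::vector<double>(count, fill_value);
}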

Packet Loss
The Resampler must be able to deal with packet losses in order to recover from network faults. There are multiple ways to implement that; a very simple one could be:

  • We know the timestamp of the last packet that has arrived
  • We know the timestamp of the packet indicating stream recovery
  • We know the nominal sample rate

Hence we can calculate the number of missing samples and fill up the stream again with either 0 or NaN.
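A sketch of that simple approach (hypothetical helper; last_packet_end is the timestamp right after the last received packet, recovery_time the timestamp of the packet indicating stream recovery):

#include <cmath>
#include <cstddef>

// Hypothetical helper: number of samples lost between the last received packet
// and the packet signalling stream recovery, at the nominal sample rate.
std::size_t missingSamples(double last_packet_end, double recovery_time,
                           double nominal_sample_rate)
{
    const double gap = recovery_time - last_packet_end;
    return gap > 0.0 ? static_cast<std::size_t>(std::llround(gap * nominal_sample_rate)) : 0;
}

The gap would then be filled with that many 0 or NaN samples before appending the recovered packet's data.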
