As an air quality expert I would like to update the automatic processing of the low-cost sensors of A22 #27

Closed
3 tasks done
rcavaliere opened this issue Jun 7, 2023 · 12 comments
rcavaliere commented Jun 7, 2023

The current processing formula for NO2 is: NO2 = a0 + a1*NO2raw^2 + a2*NO2raw + a3*O3raw^0.1 + a4*Tint^4
The new formula is now: NO2 = a0 + a1*NO2raw^2 + a2*NO2raw + a3*O3raw + a4*Tint^4
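
As a minimal sketch, the two formulas can be written as plain Python functions. The function names and the coefficient tuple layout are assumptions made for illustration, and the reading of the middle terms as `a1*NO2raw^2 + a2*NO2raw` is inferred from the formula text, not taken from the repository code.

```python
def no2_old(a, no2_raw, o3_raw, t_int):
    """Old calibration: the O3 term enters with exponent 0.1."""
    a0, a1, a2, a3, a4 = a
    return a0 + a1 * no2_raw**2 + a2 * no2_raw + a3 * o3_raw**0.1 + a4 * t_int**4


def no2_new(a, no2_raw, o3_raw, t_int):
    """New calibration: the O3 term enters linearly."""
    a0, a1, a2, a3, a4 = a
    return a0 + a1 * no2_raw**2 + a2 * no2_raw + a3 * o3_raw + a4 * t_int**4
```

The only difference between the two is the exponent on `O3raw`, so the same coefficient set gives different results depending on which formula is applied.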

This is due to the fact that there are new sensors now, which perform better with this kind of calibration. To do list here:

clezag added a commit that referenced this issue Jun 14, 2023
- Refactored code to only download data that is actually needed (both stations and datatypes)
- Directly load CSV instead of requiring translation in JSON via tool
- Add new calibration coefficients
- Modify NO2 formula, maintain old formula for old records (new field in processorParameters.csv)
@clezag
Member

clezag commented Jun 14, 2023

@rcavaliere I've done the requested changes, and fixed some things:

  • Implemented the new processing formula. The CSV has a new column 'L': if it is 1, the old formula is used, otherwise the new one. This way, stations that didn't get the new coefficients still work. We can remove this if you want.
  • updated coefficients in the CSV
  • Elaboration is now running in testing
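
The per-station switch between old and new formula could look roughly like the sketch below. Only the column 'L' and its meaning come from this comment; the other column names, the sample rows, and the function names are hypothetical and do not reflect the real processorParameters.csv layout.

```python
import csv
import io

# Hypothetical CSV layout: only the 'L' column is described in the issue,
# the remaining column names and all values are made up for this example.
SAMPLE = """station,a0,a1,a2,a3,a4,L
old_station,1.0,0.0,1.0,1.0,0.0,1
new_station,1.0,0.0,1.0,1.0,0.0,0
"""


def load_parameters(text):
    """Return {station: (coefficients, use_old_formula)}."""
    params = {}
    for row in csv.DictReader(io.StringIO(text)):
        coeffs = tuple(float(row[k]) for k in ("a0", "a1", "a2", "a3", "a4"))
        params[row["station"]] = (coeffs, row["L"].strip() == "1")
    return params


def calibrate_no2(params, station, no2_raw, o3_raw, t_int):
    """Apply the old formula when L == 1, otherwise the new one."""
    (a0, a1, a2, a3, a4), use_old = params[station]
    o3_term = o3_raw**0.1 if use_old else o3_raw
    return a0 + a1 * no2_raw**2 + a2 * no2_raw + a3 * o3_term + a4 * t_int**4
```

Keeping the flag next to the coefficients means a station's formula changes only when its CSV row is updated.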

Additional changes:

  • The CSV is now loaded automatically, there is no need to generate the JSON anymore
  • Now specifically only requests stations that are present in the CSV file
  • Now specifically only requests data types that are needed by the elaboration
  • Reworked algorithm that determines time windows of historical (raw) data requests

These changes should prevent future issues when other non-related environment stations get added

@rcavaliere
Member Author

@clezag wonderful!
Are the elaborations already running and applied on the available measurements? In order to check if the new calculations are done correctly, we need to ensure that the other basic computations that are needed before this non linear calibration are reactivated as well, check https://github.com/noi-techpark/bdp-elaborations/tree/main/environment-a22-non-linear-calibration (readme). In other words the raw values (period = 60) are first elaborated in order to compute the hourly averages (https://github.com/noi-techpark/bdp-elaborations/tree/main/environment-a22-averages), and then this elaboration can start. It seems to me that the hourly averages computation is still not active, can you check this please?

@clezag clezag self-assigned this Jun 15, 2023
dulvui pushed a commit that referenced this issue Jun 19, 2023
@clezag
Member

clezag commented Jun 19, 2023

@rcavaliere Still struggling a bit with this issue.

I've seen that the averages elaboration requires a minimum number of data points within each 1h time frame. Currently it's 16 records / h.
If an hour has fewer records than that, the elaboration previously failed; I've now changed it to skip that hour and proceed with averaging the next one, so it doesn't get stuck forever.
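
A rough sketch of this skip-and-continue behaviour, assuming records arrive as `(timestamp, value)` pairs. The function name and the data layout are illustrative assumptions, not the elaboration's actual code.

```python
from collections import defaultdict
from datetime import datetime


def hourly_averages(records, min_records=16):
    """Average (datetime, value) records per hour.

    Hours with fewer than min_records points are skipped instead of
    raising an error, so later hours are still processed.
    """
    buckets = defaultdict(list)
    for ts, value in records:
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(value)
    return {
        hour: sum(values) / len(values)
        for hour, values in sorted(buckets.items())
        if len(values) >= min_records
    }
```

With data arriving every 5 minutes an hour holds at most 12 points, so a threshold of 16 would skip every hour, while a threshold of 5 lets those hours through.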

Now both in production and test, the data seems to come in quite irregularly with 5, 10 or even 20 minute intervals between points.
I'm not sure where the issue is exactly, but the data collector being configured to push data every 5 minutes is probably one piece of the puzzle.
Stations that now receive data don't have enough records/h, that's why they don't update

Just for testing purposes, I've lowered the number of records / hour to 5 in testing, and some stations now update correctly.

Is this parameter of records/hour something documented and intended?
Do we have to look into the data collector / data provider side, since the period is supposed to be 60s?

@rcavaliere
Member Author

@clezag yes, this control was intended to avoid elaborating intervals in which the instrument did not work well, and I would not change it. But the elaboration should be configured so that, for each station, it runs up to the point at which new measurements are available. So it could happen that some stations have more up-to-date elaborations, while others stop at the time of their last available measurement. We should ensure that this behaviour is guaranteed.

@clezag
Member

clezag commented Jun 19, 2023

@rcavaliere Yes, I think I've fixed this now. It should skip periods without data or without sufficient data quality.

Still, all the new stations don't have the necessary update frequency, it seems to be a systemic issue. They update every 5 minutes at best.

I think there is a mistake in the configuration, currently it's set up to sync stations every minute, but data only every 5.
Maybe this mistake happened during the migration from Jenkins. Should I switch this around?

@rcavaliere
Member Author

@clezag If I understand correctly, there are issues in how the Data Collector works, right? If that is the case, try to fix it so that all stations correctly provide real-time data. Please be aware that some sensors (i.e. the ones for which we have updated the calibration coefficients / equation) are at present offline.

@clezag
Member

clezag commented Jun 20, 2023

@rcavaliere turns out both test and production were using the same credentials, clientID and topic.
This resulted in them snatching away messages from each other and constantly having their sessions closed.

I've configured a different clientId for testing, which means both test and production should now receive each message separately without interference.
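
To illustrate why a shared client ID causes sessions to be closed: an MQTT broker keeps at most one live connection per client ID, and a new connection with the same ID takes over from the existing one. The class below is a toy model of that single rule, not a real MQTT client or broker, and all names in it are made up.

```python
class ToyBroker:
    """Toy model of one MQTT broker rule: a client ID maps to at most one
    live connection, and a new connection with the same ID evicts the old one."""

    def __init__(self):
        self.sessions = {}   # client_id -> name of the connected instance
        self.evictions = []  # instances whose connection was taken over

    def connect(self, client_id, instance):
        previous = self.sessions.get(client_id)
        if previous is not None and previous != instance:
            self.evictions.append(previous)
        self.sessions[client_id] = instance
```

If test and production both connect as the same ID and reconnect after every disconnect, they keep evicting each other; with distinct IDs both connections stay open.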

I've noticed some other things though:

  • The application not only collects data, but also sends back to AUGE via MQTT. Both test and production point to the same server, so I think we're publishing everything twice.
  • The data collector still does the same calculations that the elaboration does. The results are published both to ODH and AUGE. The elaboration only looks at the 1h average data, the 1min data is done directly by the data collector. Is this correct? Should we change this?

@rcavaliere
Member Author

@clezag ah, this could be the trigger of all the issues!
Yes, the Data Collector also writes back elaborations... but I think they are managing this by throwing away one of the two identical messages. Can we change the Data Collector so that it reads our APIs and provides the elaborations in that way, without computing them?

@clezag
Member

clezag commented Jun 20, 2023

@rcavaliere Just writing down what we've decided in our meeting:

  • lower required datapoints / h in average elaboration to 5
  • modify data collector so that it only collects raw data, disable push of processed data to ODH or AUGE

@clezag
Member

clezag commented Jun 22, 2023

@rcavaliere I think most of the pipeline is working now. The processing elaboration returns 0 for most of the up-to-date stations, which happens when the calculated value is below 0. I'm not sure if this is because of outdated parameters or a wrong formula. I think we have to verify this on a case-by-case basis.
Do you know a station which should be working correctly so I can use that one for my tests?
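
For case-by-case checks, the clamping described here can be sketched as below. The function name and the extra `was_clamped` flag are assumptions added for illustration; the issue only states that a calculated value below 0 ends up as 0.

```python
def clamp_calibrated(calibrated):
    """Return (value_to_store, was_clamped) for one calibrated NO2 result.

    Negative results are stored as 0; the flag tells a clamped value
    apart from a result that was genuinely 0.
    """
    if calibrated < 0:
        return 0.0, True
    return calibrated, False
```

Looking at the unclamped value for a station that reports 0 shows how far below zero the formula lands, which helps to tell outdated coefficients from a formula problem.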

@rcavaliere
Member Author

@clezag thanks for your work in this sprint. Yes, the processing does not seem to be stable, but this is something BrennerLEC partners are still working on. Currently we do not have a stable situation: the active sensors do not have calibration coefficients calculated for the processing, while the sensors that do have computed coefficients are not active because they are being reinstalled on the highway. I would say, let's close this issue and open another one if we still need to fix something.

@clezag
Member

clezag commented Jun 26, 2023

Released in production

@clezag clezag closed this as completed Jun 26, 2023
StefanoTavonatti pushed a commit to StefanoTavonatti/bdp-elaborations that referenced this issue Feb 22, 2024