4D data from cdf in dataobj() should match its specified dimensions #41

Open
thomas-nilsson-irfu opened this issue Dec 8, 2017 · 1 comment
thomas-nilsson-irfu (Member) commented Dec 8, 2017

NOTE: This issue is added here mainly as a placeholder until it is fully solved, as fixing it in 9440568 generated a lot more dependent problems.

Step 1: Latest code?

  • This issue is present in the most recent code (i.e. after running git pull this is still a problem).

Step 2: Describe your environment

  • irfu-matlab branch (i.e. "master", "devel"): devel
  • Matlab version used (i.e. "Matlab R2017a"): R2017b
  • Operating system (i.e. "Windows 7", "Mac OS X 10.12.3", "Linux Ubuntu 16.04"): All of the above
  • A 64-bit installation of Matlab was used.

Step 3: Describe the problem

Expected behavior

Objects and variables created using dataobj() SHALL match the dimensions of the variables inside the CDF files from which they were created (which in turn should of course match the appropriate instrument papers, data products guide, release notes or other documentation). That is, Depend_0 (Epoch in most cases) and the subsequent Depend_i for i=1:3 (i.e. 4D data variables) should be in the correct positions, and the data should have the appropriate size in each dimension.

Actual behavior

Presently, dataobj()'s permute function changes the data according to the layout of files created using the old Cluster method. Files created according to the most recent instructions from NASA SPDF should not be permuted in the same way: these should only have the main time dependency moved to the first dimension, without reshuffling the remaining dimensions. A detailed overview can be viewed here: https://docs.google.com/spreadsheets/d/1L58nLwy1Mwl9y4OATjkgjb58p6dCiNFerEx2cVbsB0M/edit#gid=0
For MMS this impacts FPI data, and for future missions (Solar Orbiter?) it could impact any other files created according to the correct instructions (see "Test with SPDFCDFWRITE" in the spreadsheet).
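
To illustrate the difference, here is a minimal sketch with made-up sizes (not taken from any actual CDF file) of the two layouts and the permutations they would need:

% Sketch only: dummy sizes, not read from a real CDF file.
m = 32; n = 16; p = 8; N = 100;                 % record size (m x n x p), N records

% SPDF/MMS-style file: data arrives as recSize x nRec, i.e. m x n x p x N.
% Only the time dependency should be moved to the front (cyclic permutation):
dataSpdf = zeros(m, n, p, N);
dataSpdf = permute(dataSpdf, [4 1 2 3]);        % -> N x m x n x p

% Old Cluster-style file: data arrives as n x p x m x N and needs the
% full re-ordering that dataobj() currently applies to everything:
dataCluster = zeros(n, p, m, N);
dataCluster = permute(dataCluster, [4 3 1 2]);  % -> N x m x n x p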

Relevant code:

When all the dependent files are ready, the following can be put back into place:

function fix_order_of_array_dimensions
  for iDimension = 3:4
    % Check if dimensions match expectations (ignoring Depend_0) and that
    % it is not a single vector (which could be "Nx1").
    indDatasets = find(cellfun(@(x) all(x(:) > 1) && numel(x)==iDimension-1, info.Variables(:,2)));
    for iDataset = 1:numel(indDatasets)
      if iDimension==3
        % Permute it to N x m x n, where N corresponds to the record (i.e. Depend_0).
        data{indDatasets(iDataset)} = permute(data{indDatasets(iDataset)}, [3 1 2]);
      elseif iDimension==4
        % Check if it should be a simple cyclic permutation (MMS), where data
        % was read in as "recSize x nRec" (or possibly a matrix of size
        % "recSize" if it contained only one single record), or if the data
        % should be permuted in a more complex way (Cluster), i.e. where
        % recSize does not match the data.
        recSize  = info.Variables{indDatasets(iDataset), 2}; % Expected record size, (m x n x p)
        nRec     = info.Variables{indDatasets(iDataset), 3}; % Number of records, (N)
        dataSize = size(data{indDatasets(iDataset)});        % Size of data, as it was read
        if isequal([recSize, nRec], dataSize) || (nRec==1 && isequal(recSize, dataSize))
          % Simple cyclic permutation to get N x m x n x p
          data{indDatasets(iDataset)} = permute(data{indDatasets(iDataset)}, [4 1 2 3]);
        else
          % Re-order Cluster data, from (n x p x m x N) to (N x m x n x p).
          data{indDatasets(iDataset)} = permute(data{indDatasets(iDataset)}, [4 3 1 2]);
        end
      end
    end
  end
end
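
As a follow-up sanity check (a sketch only; it assumes info and data have the same layout as in fix_order_of_array_dimensions above, i.e. info.Variables{i,1} is the variable name, info.Variables{i,2} the record size and info.Variables{i,3} the number of records), one could then verify that every permuted 4D variable ends up as N x m x n x p:

% Sketch: verify that each 4D variable matches [nRec, recSize] after permutation.
for i = 1:size(info.Variables, 1)
  recSize = info.Variables{i, 2};   % expected record size (m x n x p)
  nRec    = info.Variables{i, 3};   % expected number of records (N)
  if numel(recSize) == 3 && all(recSize > 1)
    assert(isequal(size(data{i}), [nRec, recSize]), ...
      'Variable %s: size %s does not match the expected [N m n p].', ...
      info.Variables{i, 1}, mat2str(size(data{i})));
  end
end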

thomas-nilsson-irfu self-assigned this Dec 8, 2017
thomas-nilsson-irfu added a commit that referenced this issue Dec 8, 2017
thomas-nilsson-irfu added a commit that referenced this issue Dec 13, 2017
Important to note: Depend_3 now contains Energy, but dataobj() has the energy dependency in the second column. When dataobj() has been corrected in issue #41 and the code has been put in place to align Depend_i with its corresponding column number in the data, this part must be updated accordingly!

Speed up example (repeated 3 times to ensure "caching" does not impact the times).

Tint = irf.tint('2017-10-01T00:00:01Z/2017-10-01T01:59:59Z');
tic; ePDist1=mms.get_data('PDe_fpi_fast_l2', Tint, 1); toc;

Old average: 5.7 seconds, New average: 2.9 seconds.
thomas-nilsson-irfu added a commit that referenced this issue Dec 13, 2017
Important to note: once dataobj is corrected in issue #41 this will require some change.

Speed up example (repeated 3 times to ensure "caching" does not impact the times).

Tint = irf.tint('2017-10-01T00:50:23.004446000Z/2017-10-01T00:52:22.975294000Z'); % I.e. one FPI brst file
tic; ePDist1=mms.get_data('PDe_fpi_brst_l2', Tint, 1); toc;

Old average: 19.47 seconds, New average: 14.65 seconds.
ErikPGJ (Member) commented Sep 6, 2019

I believe I have seen the same kind of (or a similar) phenomenon for a 1+3 dimensional zVar (1 record dimension, 3 dimensions per record).

  • 1ea1d71 thomas-nilsson-irfu (2019-09-06 13:15:26 +0200) (HEAD -> SOdevel, origin/SOdevel) Merge remote-tracking branch 'origin/devel' into SOdevel
    MATLAB R2016a
    Ubuntu 18.04.3 LTS

Found when the Solar Orbiter/RPW calibration s/w "Bicas" reads a BIAS RCT (calibration file).
dataobj returns different, incorrect dimensions depending on the number of CDF zVar records.
(Note: TRANSFER_FUNCTION_COEFFS:DEPEND_0 = Epoch_L)

Do = dataobj(...)

zVar/file with 1 CDF record:
cdfdump: "TRANSFER_FUNCTION_COEFFS CDF_DOUBLE/1 3:[2,8,4] F/TTT"
size(Do.data.TRANSFER_FUNCTION_COEFFS.data) == [ 4 2 8]

zVar/file with 2 CDF records:
cdfdump: "TRANSFER_FUNCTION_COEFFS CDF_DOUBLE/1 3:[2,8,4] T/TTT"
size(Do.data.TRANSFER_FUNCTION_COEFFS.data) == [2 4 2 8]

NOTE: Created separate issue #50 for the different behaviour depending on number of CDF records. Related?
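
For reference, a sketch of the corresponding check if dataobj honoured the declared dimension order (rctFile below is a placeholder path, and the "expected" sizes are my reading of the declared record size 3:[2,8,4] with Epoch_L as Depend_0):

% Sketch: expected vs. observed size for TRANSFER_FUNCTION_COEFFS.
Do = dataobj(rctFile);                            % rctFile: placeholder path to the BIAS RCT
sz = size(Do.data.TRANSFER_FUNCTION_COEFFS.data);
% 1 CDF record : expected [1 2 8 4] (record dimension first), observed [4 2 8]
% 2 CDF records: expected [2 2 8 4],                          observed [2 4 2 8]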
