Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propagate to plane / Kalman on plane / Matriplex with support for scalar operations and VDT #148

Closed
wants to merge 23 commits into from

Conversation

osschar
Copy link
Collaborator

@osschar osschar commented May 15, 2024

PR description:

Preliminary PR to discuss physics and computational performance of

  • propagateToPlane / kalmanUpdateOnPlane
  • Modifications to Matriplex: support scalar operations, usage of VDT.
  • Use Matriplex modifications to re-express vectorized code for propToPlane (mostly in PropagationMPlex.icc).

Computational performance notes

Before Matriplex / VDT changes the track finding ran about 4-times slower. Matriplex and VDT changes brought this down by about 20% -- so we still have an enormous slowdown which is not understood at all.

Further, the Matriplex vectorization of components by use of Matriplex scalars needs to be re-checked for the new operators and for VDT. Matevz examined the assembler code for the MiniPropagators which used the prototype implementation of this concept (also including some VDT functions ... but not embedded in Matriplex).

Things to check / look at
  1. Compare vtune/igProf profiles for propToPlane vs. the standard implementation. Figure out where time is actually spent. Giuseppe is still working on the Kalman update on plane / in local coordinates.

  2. Matriplex vectorization should be reviewed / consistently tested, along with VDT (there is special pre-processor define that causes fallback from vdt::fast_xyzz() to std::xyzz() functions.

    • Not quite sure how to go about this on an "industrial scale" -- for all functions / operators and for all usages in specific CMSSW functions.
    • I have not added simd pragmas ... they were not necessary in the MiniPropagators test. But maybe they can help by triggering some vectorization reports in case they fail.
  3. Known problems / issues

    • Mario reported a CMSSW problem with static_assert assuring float/double Mplex type for VDT - I only ran standalone builds with gcc-13 on Fedora 39.
    • unary minus operator is giving me some trouble (member function without an argument didn't work, had to define it outside of the class) -- maybe I did something silly.

cerati and others added 7 commits April 25, 2024 04:51
….cc), move structs BeamSpot and DeadRegion out of Hit.h.
…ify the

new propagateToPlane code.

* Add VDT support to Matriplex, mostly to work on Matriplex scalar types.
  - Functions are prefixed as fast_xyzz(), same as in VDT.
  - Controlled by define MPLEX_VDT. Additionally, if MPLEX_VDT_USE_STD is also
    defined, the fast_xyzz() functions fall back to using std:: variants.
    This is useful for performance comparisons.
  - Add reduction operator and an assigner class / method to extract scalars
    (one i,j element) or to assign to it.

* Massage propagateToPlane low level implementation
  - Use the new Matriplex functionality to simplify code.
  - Remove the nmin, nmax indices initially introdcued to support GPU code
    (this was actually a wrong way ... Matriplex does support MatriplexVectors
    that should have been used instead -- and extended as needed).
// fixme? //(pf.use_param_b_field ? 0.01f * Const::sol * Config::bFieldFromZR(psPar(n, 2, 0), hipo(psPar(n, 0, 0), psPar(n, 1, 0))) : 0.01f * Const::sol * Config::Bfield);
const float bF = 0.01f * Const::sol * Config::Bfield;
const float qh2 = bF * lp(n,0,0);
const float t1r = std::sqrt(1. + lp(n,0,1)*lp(n,0,1) + lp(n,0,2)*lp(n,0,2))*pzSign(n,0,0);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const float t1r = std::sqrt(1. + lp(n,0,1)*lp(n,0,1) + lp(n,0,2)*lp(n,0,2))*pzSign(n,0,0);
const float t1r = std::sqrt(1.f + lp(n,0,1)*lp(n,0,1) + lp(n,0,2)*lp(n,0,2))*pzSign(n,0,0);

?

@@ -1009,15 +1137,17 @@ namespace mkfit {

void kalmanUpdatePlane(const MPlexLS& psErr,
const MPlexLV& psPar,
const MPlexQI& Chg,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const MPlexQI& Chg,
const MPlexQI& chg,

naming style

// fixme? //(pf.use_param_b_field ? 0.01f * Const::sol * Config::bFieldFromZR(psPar(n, 2, 0), hipo(psPar(n, 0, 0), psPar(n, 1, 0))) : 0.01f * Const::sol * Config::Bfield);
const float bF = 0.01f * Const::sol * Config::Bfield;//fixme: cache?
const float qh2 = bF * lp_upd(n,0,0);
const float cosl1 = 1./vn(n,0,2);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const float cosl1 = 1./vn(n,0,2);
const float cosl1 = 1.f/vn(n,0,2);

const float vj = vn(n,0,0)*rot(n,0,0) + vn(n,0,1)*rot(n,0,1) + vn(n,0,2)*rot(n,0,2);
const float vk = vn(n,0,0)*rot(n,1,0) + vn(n,0,1)*rot(n,1,1) + vn(n,0,2)*rot(n,1,2);
const float cosz = vn(n,0,2)*qh2;
jacLoc2Curv(n,0,0) = 1.;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
jacLoc2Curv(n,0,0) = 1.;
jacLoc2Curv(n,0,0) = 1.f;

although assignments should be OK

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

template<typename TT>
Matriplex& negate_if_ltz(const Matriplex<TT, D1, D2, N> &sign) {
for (idx_t i = 0; i < kTotSize; ++i) {
if (sign.fArray[i] < 0)
fArray[i] = -fArray[i];
}
return *this;
}

Brute forcing the matrix to positive...

osschar and others added 8 commits May 23, 2024 15:09
- Collect unique ModuleShapes in MkFitGeometryESProducer and store them
in a vector for each LayerInfo.

- Add shape_id member to ModuleInfo (and remove half-legtn that is now available
from the ModuleShape).

- Add LayerInfo::m_has_charge to the prinout.

- List ModuleShapes when detailed TrackerInfo is requested.

- TrackerInfo::print_tracker() now takes additional argument 'precision' that
determines number of decimal places to use for printing of module/shape
parameters.
osschar and others added 8 commits May 31, 2024 09:13
- Collect unique ModuleShapes in MkFitGeometryESProducer and store them
in a vector for each LayerInfo.

- Add shape_id member to ModuleInfo (and remove half-legtn that is now available
from the ModuleShape).

- Add LayerInfo::m_has_charge to the prinout.

- List ModuleShapes when detailed TrackerInfo is requested.

- TrackerInfo::print_tracker() now takes additional argument 'precision' that
determines number of decimal places to use for printing of module/shape
parameters.
}
MPlex56 jacCCS2Curv(0.0f);
jacCCS2Curv.aij(0, 3) = mpt::negate_if_ltz(sinT, inChg);
jacCCS2Curv.aij(0, 5) = mpt::negate_if_ltz(cosT, inChg);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

source in TrackState::jacobianCCSToCurvilinear has jac(0, 5) = charge * cosT * invpt;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

source in TrackState::jacobianCCSToCurvilinear has jac(0, 5) = charge * cosT * invpt;

@cerati which one do you think is a bug?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not familiar with mpt::negate_if_ltz... maybe @osschar ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my version before this change was the same as TrackState::jacobianCCSToCurvilinear

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my point is that the 1/pT is missing

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was not missing in my version (you can see it if you look at the code that was removed)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh rats, it's my fault again :)
or maybe it's better because of it ;)

@osschar
Copy link
Collaborator Author

osschar commented Sep 25, 2024

Changes added to #151, this is now obsolete, closing.

@osschar osschar closed this Sep 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants