Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

m_as with a non-unit quantity #1415

Open
lovasoa opened this issue Nov 17, 2021 · 14 comments
Open

m_as with a non-unit quantity #1415

lovasoa opened this issue Nov 17, 2021 · 14 comments
Labels
Milestone

Comments

@lovasoa
Copy link

lovasoa commented Nov 17, 2021

I would expect

reference = ureg.Quantity("10 megawatt")
ureg.Quantity("100 megawatt").m_as(reference)

to either throw an error, or return 10.

However, it returns 100, which is a little bit surprising (and unsafe).

If m_as accepts a quantity as its unit parameter, than shouldn't it behave the same as if I had defined a new unit with my reference quantity and passed that to m_as ?

If the current behavior is actually intented, then maybe it could be highlighted in the documentation.

@jules-ch
Copy link
Collaborator

When using Quantity with m_as you will get the magnitude with the same unit as this Quantity.

It is the expected behaviour, I've just realized this function is not present in the documentation. You are right about this.
But I doubt it will change.

@cpascual
Copy link
Contributor

I understand @lovasoa confusion, but I want to say that I personally always understood m_as just as @jules-ch explained.

So personally I would be in favour of just documenting it, but keeping as it is.

@jules-ch jules-ch added the docs label Nov 18, 2021
@burnpanck
Copy link

burnpanck commented Dec 18, 2021

I on the other hand would expect m_as to behave as @lovasoa does: IMHO q.m_as(p) == (q/p).m_as("") should hold, if it executes at all.
There is a good reason for that. The whole point of a units package is to prevent silent mis-interpretation of units. You use m_as in exactly those cases, where you have to strip units for external reasons, so it is an important sentry point to guarantee physically correct behaviour. And in physical computation, a quantity "1 km" is exactly equivalent to "1000 m". Thus, I would expect that .m_as("1 km") does exactly the same as .m_as("1000 m"). If pint now silently strips the pre-factor from that second number, it is silently modifying the meaning of an operation based on a physically inconsequential property. If you really want to write "give me quantity a expressed in units of quantity b", then write that: a.m_as(b.units) - The Python Zen (PEP20) says: Explicit is better than implicit.
Thus I second @lovasoa: The current behaviour is silently producing results that are wrong or physically meaningless, the exact thing this library intends to prevent. (Again the Python ZEN: Errors should never pass silently.). Thus, I see two options: either m_as only accepts units but not quantities, or it behaves the same when physically equivalent values are passed.

@lovasoa
Copy link
Author

lovasoa commented Dec 18, 2021

Yes. As a side note, this is the reason why I opened this issue. I caught a bug that could have had catastrophical consequences in the real world. And avoiding catastrophes is the reason we used pint in the first place.

@cpascual
Copy link
Contributor

The current behaviour is silently producing results that are wrong or physically meaningless

I cannot agree with that affirmation. I'd rather say that it is producing a result that you did not expect but it is so because your expectation is "wrong" (in the sense that it is not the expectation with which it was implemented), but which is perfectly correct and physically meaningful according to its intended behaviour .

IMHO, both expectations are equally reasonable, but the fact is that the one which was in the developer's mind was "give me the magnitude in the same units that this other quantity has". And I see no reason for changing the behaviour (which would introduce backwards-compatibility issues).

So, I think that the current behavior should be correctly documented, also including a note to explicitly warn about this potential misunderstanding.

At most, if the potential for confusion (even with a well-documented method) is deemed too high, I think that the m_as method or its usage with a Quantity as an argument could be deprecated. But in no way its return value should be changed because that could indeed have catastrophic consequences in existing code that relies in a legitimate expectation that also happens to be the intended one.

@burnpanck
Copy link

burnpanck commented Dec 19, 2021

As it is not documented, there is no definite "intended" behaviour. If at all, then it would be up to @hgrecco to decide. Even then, in the interest of continuous improvement, API's can change. If there is a behavioural change like this one - then it's usually done with a deprecation period and suitable warnings. But I repeat - as long as it wasn't documented, there was never a "public" API here.

I cannot agree with that affirmation. I'd rather say that it is producing a result that you did not expect but it is so because your expectation is "wrong" (in the sense that it is not the expectation with which it was implemented), but which is perfectly correct and physically meaningful according to its intended behaviour .

Then please explain the physical meaning of the current behaviour. Given the length of a car l and the diameter of its wheel d - what is the meaning of l.m_as(d)? A stand-alone unit is a well-known ("blessed") quantity, that has been given a name everyone is supposed to understand - so that has physical meaning. But as part of a quantity, the specific unit used to express that quantity has no physical meaning (it is not "visible" to any physical behaviour) - as the example before tried to illustrate.

Note that there is some asymmetry between .to(...) and .m_as(...): While they share many traits, they have different physical meaning, and are used for different purposes. The former does not form part of physical computation, its purpose is one of representation. The latter however is a central piece for physical computation, namely when there is a need to (temporarily) go outside the realm of pint for external reasons.

The physical behaviour of .to is the "identity" function - you are guaranteed that a == a.to(b)(physically - that is up to rounding). Therefore .to has considerable leeway in the choice of unit to select, e.g. when given a quantity argument such as b = 1000*u.m - the unit itself of the resulting quantity carries no physical meaning either way on it's own, so as long as the resulting quantity is correct, it is up to the library to choose what unit to use.

Very different with .m and .m_as(...): These operations create a new value with a different physical meaning. If that behaviour has no physical meaning, then it defeats the purpose of a units package. The only sensible meaning of a.m_as(b) is "the ratio of a/b", that is the length of a measured by the physical quantity b. That is extremely important: .m_as is the only method in pint able to do that - q.m has no physical meaning, even if defined as q.m_as(q.units) - as q.units itself has no physical meaning either.

You are concerned with breaking user code on relying on this behaviour? Take for example a.m_as(b+c) (note the argument may be passed in through a function or another long chain of operations), that expression is at risk of breaking anytime today: There is no guarantee what unit the result of a+b has - it could be a.units, b.units or any other compatible unit. But there is a guarantee that the result is a quantity with the physical meaning of the physical sum of the two quantities - expressed in any suitable unit.

I do therefore believe that the current behaviour is at danger of silently creating wrong results whenever you cannot guarantee the unit of the argument. Because almost nothing in pint (or any physical quantity package for that matter) guarantees a specific unit beyond the physical meaning of a quantity, that means the only case where a breaking change would impact code that is not already broken is the case of a literal argument, e.g. q.m_as("1000 m"), which seems to server little purpose. That said, if we want to be very cautious, what we can do is completely prohibit quantity arguments to .m_as, at least for some time.

@cpascual
Copy link
Contributor

cpascual commented Dec 20, 2021

IMHO, the key point of contention here is "whether the current behaviour should be considered API or not"

My argument for that is that, in absence of explicit documentation, the current implementation is the source of truth.

This of course is not a justification for all sorts of buggy behaviours, but in this case the implementation makes all the sense from a pragmatical programmatic point of view for concisely solving use cases such as "I've got two quantities, and I want the magnitude of the first one expressed in the same units as the second one". In other words, I want a concise way of expressing a.to(b.units).magnitude. Furthermore: If the intended behaviour was to provide a normalization, implementing m_as would not be needed since one could simply and very concisely write a/b

With that said, I think I've sufficiently expressed my point of view and that the opposing point of view is also sufficiently documented, so I will avoid feeding what to me starts resembling a bit like a flame war (it may be just my subjective impression, please do not take offence and accept my apologies if that is not the case).

@burnpanck
Copy link

burnpanck commented Dec 21, 2021

It is only a flame war, if we're pitting opinions against each other, rather than objective arguments. I was trying to make clear that there are objective arguments why I believe that behaviour that the current behaviour is effectively inconsistent. You bring forwards counter arguments - so I don't think this is a flame war. And I would like to counter your counter arguments 😉 as follows:

  • I believe a.m_as(b.units) is both concise and very clear (and a.to(b).m works too and that will remain so)
  • a/b is still a dimensionless quantity, but not unitless. .m_as is needed when you cannot take a quantity, but need a base number, usually for external reasons (e.g. to feed into scipy). .m_as it is the only good way to do so, as became apparent in How to get base numbers from quantities consistently, efficiently and safely? #1348. Alternatives that do not work are:
    • (a/b).magnitude: This is very dangerous and thus defeats the purpose of a units package. You loose units checking and even worse, you may get the magnitude of a scaled dimensionless quantity (such as percent), even when b is not a scaled dimensionless quantity.
    • a.to(b).magnitude: For the case that b is a quantity (as opposed to a unit), it has no way to generate a (magnitude,unit) pair that has b in the unit position. Contrary to .m_as, .to retains physical meaning if it just takes the unit of the supplied quantity b, so IMHO it's ok to do that.
    • float(a/b): Does only work for scalar quantities, and therefore cannot be used in typical numpy-style functions that automatically vectorise over their arguments if an array is provided.
    • np.array(a/b): Is essentially the same as (a/b).magnitude with all its dangers. See also a comment to a tread on the difference between Scalars and Arrays.

@keewis
Copy link
Contributor

keewis commented Dec 21, 2021

how about (a / b).to("dimensionless").magnitude or (a / b).m_as("dimensionless") for now? As far as I can tell m_as was only ever intended to be a shortcut for .to(...).magnitude, and I wouldn't change that.

Also, to me the current behavior of .to is very surprising – it would probably be more appropriate to call it .to_like – so I'd vote to either deprecate passing a quantity or to start supporting scaled units in .to (the latter looks way more complicated, though).

@burnpanck
Copy link

It seems that (a/b).m_as("dimensionless") is probably the best we have today - though I would still prefer to have an API that has a documented, guaranteed physical meaning - which m_as currently does.

I see the desire that .m_as should be exactly equivalent to .to(...).magnitude. That could guarantee physical meaning if .to could guarantee the output unit, e.g. if quantity arguments were deprecated/disallowed. So +1 for .to only with units and potentially a .to_like(q) as a shortcut for .to(q.units).

@cpascual
Copy link
Contributor

+1 to deprecate passing a quantity to .to

Regarding a.to_like(b), I don't see it as a problem but I am not a fan of adding methods for similar stuff (I personally would just use a.to(b.units), which happens to be just one extra character). For the same reason I would not be sad if .m_as would be deprecated entirely. The only thing I am against is to change the current return value of .m_as (due to the potential risks to existing codes relying on its current behaviour).

@jules-ch
Copy link
Collaborator

jules-ch commented Dec 22, 2021

A bit of context here, a lot of methods in pint accepts UnitLike types (str, Unit, UnitContainer, Quantity) to use it as a unit in public methods.
This is the case here in m_as, where you can use Quantity with m_as to say I want this quantity as a magnitude with the same unit as this other Quantity.

This is perfectly valid and saying "this has no physical meaning" just does not make sense. It really is not an argument IMHO ❤️, we are a community of scientist, just because the current implementation does not suit you, saying that it's physically wrong does not make it True.
It was a choice to use Quantity in some methods to represent Unit in those.
The current implementation/API reflects those past choices.

Though I hear your arguments :), right now the API should be flexible enough to let you normalize quantities with another like @cpascual said with (a/b).m_as("").
If that is not concise enough, implement it in a custom function.

I personally had this use case where I needed magnitude normalized with a UnitLike '10^6 N' for computation.

We can discuss a new method to normalize quantities with another while retaining dimension check as you said. That's something we can consider.

I'm not in favour of changing the current behaviour with the scaled one.
I'll be more inclined to use a dedicated function to normalize quantities and either keep the same behaviour for m_as or deprecate the use of Quantity as UnitLike type.

@burnpanck
Copy link

What I mean by non-physical is that neither the magnitude nor the unit are a property of a physical quantity (given a steel rod, you can measure its length but neither its magnitude nor its unit). They are a property of the representation of that quantity. Because of that, any computation whose result depends on the representation is necessarily not a model of a physical process, and thus a "wrong" result within a scientific computation.

The same goes with the number type used to represent the magnitude. In the old days, rocket science would mean having a guidance system implemented using integer computation only. Any value that depends on a computation falling between two integer value would cause a "wrong" thrust output. A good engineer can deal with that - he chooses an implicit fixed-point representation with a suitable number of fractional digits. However, a lot of care has to be taken to always take note of the implicit exponent that is carried along with the computation. Things are much simpler with 64bit doubles, the workhorse of scientific computation. The processor automatically takes care of that exponent.

A units library extends the same concept to the unit aspect of a physical computation. This leads to fewer errors, just like the use of doubles avoids most rounding errors you would get with integers. However, if I want to make use of that, I need to rely on the behaviour to be consistent. Otherwise, it instills distrust in the library. Consider what would happen if divisions with doubles would silently discard the fractional part of the dividend. If that were the case, one could just work around and be sure to never to do a division with non-integers and you are back to "mathematical" behaviour. Yet, I'm sure doubles would just be banned from all code-guidelines in fields with a heightened scrutiny.

I accept that we are not going to change the return value of m_as. How can we get to a decision if Quantity arguments are to be deprecated?

@hgrecco
Copy link
Owner

hgrecco commented Dec 23, 2021

I agree with @jules-ch: we need some context to understand how we got here. Looking at the tracker, a large fraction of the recent issues have the same root: the creation of the Unit class as a first class registry citizen. I am not saying that this was an error but it had some unintended consequences. In the old days of Pint, a Unit was just a quantity with a unit magnitude, therefore we did not have these these discussions for the API.

In #227 it is clear that the case raised here was overlooked. The proponent was clearly thinking about a unit, not a quantity.

@burnpanck and @lovasoa are right that this API might lead to unexpected results and silent bugs. We should fix it. But I also think that @cpascual is right: changing the behaviour will break a lot of code, deprecation of Quantity as an argument might be the best option.

The question is how we move forward in a way that is consistent, surprise free and does not break peoples code.
My suggestion would be along the following lines.

  1. Make a list of the functions that accepts quantity as UnitLike and are ambiguous about the result.
  2. In those functions, raise DeprecationWarning (version n+1) -> Warning (version n+2) -> TypeError (version n+3)

Alternatively, we could split in two cases (but might be unnecesary complicated):

  • When the argument is Quantity with a unit magnitude, raise
    DeprecationWarning (version n+1) -> Warning (version n+2) -> TypeError (version n+3)
  • When the argument is a Quantity with a non unit magnitude, raise
    Warning (version n+1) -> TypeError (version n+2)

(For strings, i on the result after parsing.)

In all cases provide a usefull message, could be something like this:

The use of a quantity object as UnitLike argument has been deprecated. 
Please change your code to `q1.m_as(q2)` to `q1.m_as(q2.units)`

I am open to create another method for some of the use cases described here, but I would suggest we discuss this in another issue.

@hgrecco hgrecco added this to the 0.23 milestone Apr 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants