Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement PositionsBound. #2223

Merged
merged 1 commit into from
Mar 7, 2018

Conversation

markusbaumeister
Copy link
Contributor

Currently there is no library method to return the set of all bound
entries of a list. This commit implements this functionality in an
efficient manner.

To see the effect consider the following time difference:

gap> t := 10^7;
10000000
gap> L := [];
[  ]
gap> L[t] := t;;
gap> Filtered( [1..Length(L)], i -> IsBound(L[i]) ); time;
[ 10000000 ]
637
gap> R := Runtime();
2198
gap> res := [];;
gap> pos := 1;;
gap> for i in [1..Length(L)] do
> if IsBound(L[i]) then
> res[pos] := i;
> pos := pos+1;
> fi;
> od;
gap> res;
[ 10000000 ]
gap> Runtime() - R;
239
gap>

@fingolfin fingolfin added the kind: enhancement Label for issues suggesting enhancements; and for pull requests implementing enhancements label Feb 28, 2018
@fingolfin
Copy link
Member

FYI, PositionsProperty(L, ReturnTrue); also does the job and has about the same performance.

Copy link
Member

@fingolfin fingolfin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution! Here are some further remarks.

lib/list.gd Outdated
@@ -1002,6 +1002,34 @@ DeclareOperation( "PositionsProperty", [ IsList, IsFunction ] );
DeclareOperation( "PositionBound", [ IsList ] );


#############################################################################
##
#O PositionsBound( <list> ) . . . . position of first bound element in a list
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That summary does not seem to fit what the function does -- it returns all bound positions, no?

lib/list.gd Outdated
##
## <#GAPDoc Label="PositionsBound">
## <ManSection>
## <Oper Name="PositionsBound" Arg='list'/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function will not turn up in the manual unless you explicitly include it in one of the doc/ref/*.xml files. It would hence stay undocumented for endusers.

lib/list.gd Outdated
## <Oper Name="PositionsBound" Arg='list'/>
##
## <Description>
## returns the set of all indices for which an element is bound in the list
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds slightly off to me... perhaps "returns the set of all indices of the list which are bound to an element" ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I oriented myself on the documentation of PositionProperty just above. Should I change that as well (if I'm already at it)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Though we might also want to get another opinion by a native speaker like @ChrisJefferson or @stevelinton

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure which is better. I don't really like "indices which are bound" since it isn't really the indices which are bound, but the corresponding list positions. "indices for which an element is bound in the list" is a little convoluted, but precise. Neither is genuinely likely to be confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you like "the set of all bound positions of the list"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@codecov
Copy link

codecov bot commented Mar 1, 2018

Codecov Report

Merging #2223 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #2223      +/-   ##
==========================================
+ Coverage   70.34%   70.34%   +<.01%     
==========================================
  Files         481      481              
  Lines      253181   253190       +9     
==========================================
+ Hits       178091   178102      +11     
+ Misses      75090    75088       -2
Impacted Files Coverage Δ
lib/list.gi 71.18% <100%> (+0.18%) ⬆️
hpcgap/lib/hpc/stdtasks.g 38.61% <0%> (-0.26%) ⬇️
src/objset.c 81.49% <0%> (-0.23%) ⬇️
src/hpc/traverse.c 95.45% <0%> (-0.03%) ⬇️
src/gap.c 63.52% <0%> (+0.33%) ⬆️

@frankluebeck
Copy link
Member

I guess that with the hint to PositionsProperty(L, ReturnTrue); this pull request became moot?

Let me nevertheless add three comments, the first two are minor, the third is not:

(1) It is a general observation that in List or Filtered when called with very quick functions as second argument, the function call overhead itself can be significant. That is what we see in the example of the first post.

(2) Why don't you use Add in the proposed function and avoid the use of the variable pos?

(3) Efficient functions for lists must be implemented as functions, not methods, to avoid method selection overhead for lists. Example: While your function is faster for the example in the first post, try this:

gap> a:=List([1..100],i->[1]);;
gap> b:=List([1..100],i->[a]);;
gap> ll:=List([1..10^7],i->b);;
gap> Length(Filtered([1..Length(ll)],i->IsBound(ll[i]))); time;
10000000
3417
gap> DeclareOperation( "PositionsBound", [ IsList ] );
gap> InstallMethod( PositionsBound,["IsList"],
> function( list )
>     local i, bound, pos;
>     bound := [];
>     for i in [ 1 .. Length( list ) ] do
>         if IsBound( list[i] )  then
>             Add(bound,i);
>         fi;
>     od;
>     return bound;
> end);
gap> Length(PositionsBound(ll)); time;

I have not waited for an answer (probably you need to wait a few hours).

@markusbaumeister
Copy link
Contributor Author

It is probably not completely moot (since the use of PositionsProperty to check for bound entries is kind of obscure). But I do not know if one of these options (either replace the body of the code by PositionsProperty or document the use of PositionsProperty) is the preferred option of most of the developers. If so, I'm happy to modify/retract this PR as needed.

Concerning (2): I assumed that Add would be slower. After checking I found this to be mistaken.

I will modify the PR in the way (3) suggests as soon as I know which option is preferred by the developers (i.e. whether this PR becomes moot).

@markuspf
Copy link
Member

markuspf commented Mar 4, 2018

I am for adding a function for PositionsBound, even if it is just PositionsProperty(l, ReturnTrue).
I can't count the number of times I found out that PositionsBound can be implemented in that short (and obscure) way.

Note also that the fga package implements BoundPositions as

InstallGlobalFunction( BoundPositions,
    l -> Filtered([1..Length(l)], i -> IsBound(l[i])) );

and fr happily uses it...

@fingolfin
Copy link
Member

Yes, I also think that having PositionsBound would be useful, even though I pointed out the PositionsProperty(l, ReturnTrue) trick -- but I meant that as a suggestion for the implementation, and not to say that PositionsBound is useless. Sure, it's implementation can be very short, but I think it very helpful, esp. to people are not yet long-year "masters" of the GAP language.

I also think that it improves code readability. People will either immediately guess what PositionsBound(list) does, or else they can simply look it up in the manual. Whereas seeing PositionsProperty(list, ReturnTrue) the first time can be quite confounding, and make the code less transparent.

In that regard, I think it is similar to e.g. PrimeDivisors(n), which is "just" an alias for Set(FactorsInt(n)), but makes it clear with one glance what's going on.

lib/list.gd Outdated
## </ManSection>
## <#/GAPDoc>
##
DeclareOperation( "PositionsBound", [ IsList ] );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As Frank pointed out, this should be turned into a global function for optimal performance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case it should not refer back to PositionsProperty (since that is not a global function).

@ChrisJefferson
Copy link
Contributor

I would also like to add PositionsBound, I have needed this a few times, and always just write a for loop.

I agree that it should be a global function. It could even just be (in my opinion) L -> PositionsProperty(L, ReturnTrue). However, I'd never thought of writing it in that way, and I suspect most other people wouldn't either.

@frankluebeck
Copy link
Member

I also support adding this function. (I raised the question in my previous comment because I was unsure about the status of the discussion, and didn't want to imply that I'm against such a function.)

Although the function is mainly intended to be called with lists that have holes, one could add efficient handling of dense lists:

if IsDenseList(list) then
  return [1..Length(list)];
fi;

Currently there is no library method to return the set of all bound
entries of a list. This commit implements this functionality in an
efficient manner.
@fingolfin fingolfin merged commit 20b545a into gap-system:master Mar 7, 2018
@fingolfin fingolfin added the release notes: added PRs introducing changes that have since been mentioned in the release notes label Mar 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: enhancement Label for issues suggesting enhancements; and for pull requests implementing enhancements release notes: added PRs introducing changes that have since been mentioned in the release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants