Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding NonEmptyVector #1137

Merged
merged 11 commits into from
Jul 13, 2016
Merged

Adding NonEmptyVector #1137

merged 11 commits into from
Jul 13, 2016

Conversation

zainab-ali
Copy link
Contributor

Adding NonEmptyVector as discussed in #1088

@ceedubs
Copy link
Contributor

ceedubs commented Jun 17, 2016

@zainab-ali thanks for working on this!

It looks like there are already merge conflicts, so you'll want to pull in the latest changes from master.

Also please note the second paragraph of the description for #1088 -- I think we'll want to change this implementation to not use the "cons" approach for Vector.

@@ -11,11 +10,6 @@ package object data {
def NonEmptyList[A](head: A, tail: A*): NonEmptyList[A] =
OneAnd[List, A](head, tail.toList)

def NonEmptyVector[A](head: A, tail: Vector[A] = Vector.empty): NonEmptyVector[A] =
OneAnd(head, tail)
def NonEmptyVector[A](head: A, tail: A*): NonEmptyVector[A] =
Copy link
Contributor

@kailuowang kailuowang Jun 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, this API (head : A, tail: A*) would be handy.

*/
def show(implicit A: Show[A]): String =
s"NonEmptyVector(${A.show(head)}, ${Show[Vector[A]].show(tail)})"
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this show correct?
Or should I have s"NonEmptyVector(${Show[Vector[A]].show(vector)})" instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better than the cons like show.

@codecov-io
Copy link

codecov-io commented Jun 18, 2016

Current coverage is 89.09%

Merging #1137 into master will increase coverage by 0.29%

@@             master      #1137   diff @@
==========================================
  Files           232        234     +2   
  Lines          3073       3137    +64   
  Methods        3021       3082    +61   
  Messages          0          0          
  Branches         49         52     +3   
==========================================
+ Hits           2729       2795    +66   
+ Misses          344        342     -2   
  Partials          0          0          

Sunburst

Powered by Codecov. Last updated by a639f66...e173928


object NonEmptyVector extends NonEmptyVectorInstances {

def apply[A](head: A, tail: Vector[A]): NonEmptyVector[A] = NonEmptyVector(head +: tail)
Copy link
Contributor

@non non Jun 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to support this as:

def apply[A](body: Vector[A], last: A): NonEmptyVector[A] = NonEmptyVector(body :+ last)

In general I don't like treating non-empty vector as a head/tail pair because it's so inefficient -- prepending to a vector is O(n) and reallocates the entire vector. Appending is a lot more efficient, so in this case I think it makes more sense.

Honestly I'd probably prefer an interface like:

def apply[A](xs: Vector[A]): Option[NonEmptyVector[A]] = ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably missed something. I have been holding the belief that according to
http://docs.scala-lang.org/overviews/collections/performance-characteristics.html
Prepending to a vector is as efficient as appending? I thought it makes sense based on my (not very thorough) understanding of the implementation of vector. Is there some caveat in this claim?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might propose the following instantiation methods:

// to be consistent with var-arg `apply` methods for other collections
def apply[A](head: A, tail: A*): NonEmptyVector[A]

def fromVector[A](vector: Vector[A]): Option[NonEmptyVector[A]]

// this will throw an exception if the vector is empty
def fromVectorUnsafe[A](vector: Vector[A]): NonEmptyVector[A]

def prepend[A](head: A, tail: Vector[A]): NonEmptyVector[A]

def append[A](vector: Vector[A], last: A): NonEmptyVector[A]

I think that would cover all of the use-cases. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right @kailuowang. I forgot that vectors maintain a start index to allow this kind of thing to be effectively constant time. I was wrong.

I think maybe I got confused because for awhile Vector(x) ++ xs was very slow. But this was just a bug, not related to x +: xs.

@ceedubs
Copy link
Contributor

ceedubs commented Jun 18, 2016

@zainab-ali thanks, this is getting really close I think!

*/
def flatMap[B](f: A => NonEmptyVector[B]): NonEmptyVector[B] =
NonEmptyVector(vector.flatMap(a => f(a).vector))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about reduce? Since, we know that this is non-empty the reduce(fn: (A, A) => A): A is a safe signature. That or we could have: def combineAll(implicit: sg: Semigroup[T]): T (or we could have both).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reduce and several other methods are available via the Reducible instance, but I agree that it would probably be nice to have some of these helpers on the class itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they should be called reduceLeft(f: (A, A) => A): A and reduce(implicit S: Semigroup[A]): A to be consistent with cats.Reducible. AFAK, the standard library has similar names.

@zainab-ali
Copy link
Contributor Author

I've been mulling over the naming conventions for safe / unsafe methods.

I agree with the following conventions:

  1. If a function is unsafe, it should be suffixed by unsafe. This is because cats is safe by default.
  2. We should stick to the standard library conventions whenever they are safe.

So when a function is safe, but is analogous to an unsafe standard library function, what should we do?
e.g updated(i: Int): Option[A], apply(i: Int): Option[A], Vector.apply(v: Vector[A]): Option[NonEmptyVector[A]]

Since cats is safe by default, we could keep the names as is, but as @johnynek points out, is causes confusion for users of the standard library, when functions they expect to be unsafe start returning options.

In the above comments, we've decided rename the functions instead:
updated to updatedOption
apply(i: Int) to get, and the unsafe variant to getUnsafe
apply(v: Vector[A]) to fromVector and the unsafe variant to fromVectorUnsafe

TBH, I'm uneasy about this, because I find variation in method names more confusing than variation in signatures. However, since I can't think of a better solution, I'll go with it.

We should use the same convention for the NonEmptyList in #1120

* Left-associative reduce using the Semigroup of A
*/
def reduce(implicit S: Semigroup[A]): A =
reduceLeft(S.combine)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

semigroup has combineOption which can possibly be faster than using combine.


/** Gets the element at the index, if it exists */
def get(i: Int): Option[A] =
Try(getUnsafe(i)).toOption
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that toVector.lift(i) would probably be a more efficient implementation (though this is performance speculation without a benchmark).


/** Updates the element at the index, if it exists */
def updated(i: Int, a: A): Option[NonEmptyVector[A]] =
Try(updatedUnsafe(i, a)).toOption
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More performance speculation: if (toVector.isDefinedAt(i)) Some(toVector.updated(i, a)) else None.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops I guess that would need to wrap it in a NonEmptyVector too.

@ceedubs
Copy link
Contributor

ceedubs commented Jul 1, 2016

👍 thanks @zainab-ali!

I've left some minor comments about possibly more performant implementations of a few things, but I'm happy to let those happen in a followup PR.

* Left-associative reduce using the Semigroup of A
*/
def reduce(implicit S: Semigroup[A]): A =
S.combineAllOption(tail).foldLeft(head)(S.combine)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

personally I would do: S.combineAllOption(toVector).get since we know the get can't throw since toVector.nonEmpty.

@kailuowang
Copy link
Contributor

👍

* universal .toString method.
*/
def show(implicit A: Show[A]): String =
s"NonEmptyVector(${Show[Vector[A]].show(toVector)})"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will give us something like NonEmptyVector(Vector(1, 2, 3)) for NonEmptyVector(1, 2, 3).
Maybe we could do "NonEmpty" + Show[Vector[A]].show(toVector) which should give us NonEmptyVector(1,2,3) ?

@peterneyens
Copy link
Collaborator

This looks great @zainab-ali, I left some minor comments.

Another thing which might be useful is an unapply method on the NonEmptyVector object so you can pattern match on NonEmptyVector, but this can probably be added in a later PR.

def concat(other: NonEmptyVector[A]): NonEmptyVector[A] = NonEmptyVector(toVector ++ other.toVector)

/**
* Alias for concat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to make this Alias for [[concat]] so that the scaladoc has a link.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see that we are using overloading, so I don't know how that works with scaladoc. I'm generally skeptical of overloading, though it's probably fine in this instance. I think it would prevent us from being able to turn NonEmptyVector into a value class though, wouldn't it?

@non
Copy link
Contributor

non commented Jul 13, 2016

It looks we have have multiple thumbs-up. I'm going to give mine too 👍 and merge this. There is probably some follow-up work to do, but let's open a new PR for that.

@non non merged commit 556e97b into typelevel:master Jul 13, 2016
ceedubs added a commit to ceedubs/cats that referenced this pull request Jul 15, 2016
This follows up on several small suggestions from comments in typelevel#1137. The
only meaningful modification is changing the format of `show` from
something like `NonEmptyVector(Vector(1, 2, 3))` to something like
`NonEmptyVector(1, 2, 3)`.

I've also added a `length` method to `NonEmptyVector`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants