Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for hinted deserialization? #29

Closed
nefilim opened this issue May 28, 2015 · 13 comments
Closed

Support for hinted deserialization? #29

nefilim opened this issue May 28, 2015 · 13 comments

Comments

@nefilim
Copy link

nefilim commented May 28, 2015

Thanks for creating and open sourcing ScalaPB, love the idea of using the Google proto parser.

I've been using ScalaBuff for a while and contributed some support for hinted deserialization, basically just a map from a class identifier to function to deserialize a byte stream for the matching proto message. Here's an example generated with ScalaBuff:

trait HintedDeserializer {
  def deserializePayload(payload: Array[Byte], payloadType: String): com.google.protobuf.GeneratedMessageLite
}

...

object VoyagerInternalProtos extends HintedDeserializer {
        def registerAllExtensions(registry: com.google.protobuf.ExtensionRegistryLite) {
        }

        private val fromBinaryHintMap = collection.immutable.HashMap[String, Array[Byte] ⇒ com.google.protobuf.GeneratedMessageLite](
                 "VPResult" -> (bytes ⇒ VPResult.parseFrom(bytes)),
                 "VPPlayerIdentifiers" -> (bytes ⇒ VPPlayerIdentifiers.parseFrom(bytes))
        )
        def deserializePayload(payload: Array[Byte], payloadType: String): com.google.protobuf.GeneratedMessageLite = {
                fromBinaryHintMap.get(payloadType) match {
                        case Some(f) ⇒ f(payload)
                        case None    ⇒ throw new IllegalArgumentException(s"unimplemented deserialization of message payload of type [${payloadType}]")
                }
        }
}

SandroGrzicic/ScalaBuff#82

I havent looked behind the scenes for ScalaPB yet, do you think it's reasonable to add something like this to ScalaPB? Could you point me in the right direction where to look & get started?

I imagine for ScalaPB an implementation of a trait along these lines:

trait ScalaPBHintedDeserializer {
  def deserializePayload(payload: Array[Byte], payloadType: String): com.trueaccord.scalapb.GeneratedMessage
}

Any advice appreciated.

Thanks
Peter

@thesamet
Copy link
Contributor

thesamet commented Jun 2, 2015

Hi Peter,

Thanks for the pull request. Wouldn't it be simpler to build the map using reflection outside ScalaPB? Initialize it as a lazy val, iterate over the descriptors to discover all messages. Then calls to deserializePayload() are as fast as your implementation since the reflection has already been done.

This approach (assuming there is some envelope that contains the full message name) is not generic enough to worth the price for adding this feature.

Feel free to re-open if you think I missed anything - if this can be achieved using reflection outside ScalaPB then that would be my preference. I'd be happy to post a code sample in our documentation on how to do it.

@thesamet thesamet closed this as completed Jun 2, 2015
@nefilim
Copy link
Author

nefilim commented Jun 2, 2015

Could you expand on "iterate over the descriptors to discover all messages" ?

I've run into this issue (https://issues.scala-lang.org/browse/SI-7046) multiple times and has made me hesitant to use macros or runtime reflection for anything like this, hence moved to codegen.

@thesamet
Copy link
Contributor

thesamet commented Jun 2, 2015

ScalaPB (in master, which will become version 0.6) provides descriptors (https://developers.google.com/protocol-buffers/docs/reference/java/index) for the generated code. These classes provide a way to list all protocol buffers, the fields in them, and so on in a structured way. In this context, we can use the FileDescriptor to get the list of all messages, and then having the name of the message, use reflection to find the parseFrom method.

It would probably be easiest if I posted a code sample. Let me get back to you in a few days.

@nefilim
Copy link
Author

nefilim commented Jun 2, 2015

Ah, great, was about to suggest generating at least a list of messages but a collection of Descriptors should do the trick nicely.

Do you have an estimated timeline for a 0.6 release?

@thesamet
Copy link
Contributor

thesamet commented Jun 2, 2015

0.6 should come within the next two weeks. It's mostly complete, I want to
have more test coverage for the interoperability between proto2 and proto3
before it goes live.

On Tue, Jun 2, 2015 at 9:33 AM, Peter vR notifications@github.com wrote:

Ah, great, was about to suggest generating at least a list of messages but
a collection of Descriptors should do the trick nicely.

Do you have an estimated timeline for a 0.6 release?


Reply to this email directly or view it on GitHub
#29 (comment).

-Nadav

@nefilim
Copy link
Author

nefilim commented Jun 16, 2015

hi @thesamet how is 0.6 looking? when do you think I can start integrating with master branch?

@thesamet
Copy link
Contributor

Can you try working with 0.5.8? It is built from the master branch.

On Tue, Jun 16, 2015 at 7:50 AM, Peter vR notifications@github.com wrote:

hi @thesamet https://github.com/thesamet how is 0.6 looking? when do
you think I can start integrating with master branch?


Reply to this email directly or view it on GitHub
#29 (comment).

-Nadav

@nefilim
Copy link
Author

nefilim commented Jun 17, 2015

running into compilation errors:

java.util.List[com.google.protobuf.Descriptors.FieldDescriptor] does not take parameters
  __fieldsMap(__fields(0)).asInstanceOf[Int],

here's the generated code:

   def fromFieldsMap(__fieldsMap: Map[com.google.protobuf.Descriptors.FieldDescriptor, Any]): com.minomonsters.voyager.proto.ACChatroomAndParticipants = {
    require(__fieldsMap.keys.forall(_.getContainingType() == descriptor), "FieldDescriptor does not match message type.")
      val __fields = descriptor.getFields
      com.minomonsters.voyager.proto.ACChatroomAndParticipants(
        __fieldsMap(__fields(0)).asInstanceOf[Int],
        __fieldsMap(__fields(1)).asInstanceOf[Int]
      )
    }

and the proto

message ACChatroomAndParticipants {
    required uint32 roomId = 1;
    required uint32 participantCount = 2;
}

I can open a separate issue for this if I'm not doing something stupid. Also I have noticed 0.5 uses protobuf 3 which is still in alpha? I was hoping for something production ready

@nefilim
Copy link
Author

nefilim commented Jun 17, 2015

just a bit more info, i use protoc-jar and i used both -v261 and -v300 with the same results

@nefilim
Copy link
Author

nefilim commented Jun 17, 2015

@thesamet curious why https://github.com/trueaccord/ScalaPB/blob/master/compiler-plugin/src/main/scala/com/trueaccord/scalapb/compiler/ProtobufGenerator.scala#L590 isn't using the List.get accessor like the optional and repeated fields?

@thesamet
Copy link
Contributor

Thanks for reporting this bug. Please try 0.5.9 (just pushed to Sonatype, should become available in the next hour).

0.5.x uses the runtime of protocol buffers 3 which is in Alpha, but supports both proto2 and proto3 language levels. You can even use ScalaPB 0.5.x with protoc 2.6.1 if your protos are in proto2 syntax. Having said that, if you would like to be completely safe and avoid alpha releases then you should stick with ScalaPB 0.4.x.

The field descriptors in 0.5.x are still not solving the problem you are trying to fix easily. I think that either in 0.4.x or 0.6.x it would be best to manually maintain a list of the messages you care about. Would something like this do:

  val companionsByName: Map[String, GeneratedMessageCompanion[_ <: GeneratedMessage]] =
    Seq(Message1, Message2, Message3).map {
      c => c.descriptor.name -> c
    }.toMap

  def deserializePayload(payload: Array[Byte], payloadType: String): GeneratedMessage = {
    companionsByName(payloadType).parseFrom(payload)
  }

@nefilim
Copy link
Author

nefilim commented Jun 24, 2015

Thanks @thesamet for the bug fix.

It's a bit disappointing to hear that the new version won't be solving my problem as originally suggested. I'm sure you can appreciate manually maintaining a list of messages is not ideal and not a viable long term strategy for me personally.

It seems like it should be theoretically possible to generate a list of descriptors at compile time, I had a quick stab at it but couldn't make it work, especially with the bounded types (as in your example):

  def getObjectReference(clazz: String): GeneratedMessageCompanion[_] = {
    currentMirror.reflectModule(currentMirror.staticModule(clazz)).instance.asInstanceOf[GeneratedMessageCompanion[_]]
  }

  val companionsByName: Map[String, GeneratedMessageCompanion[_]] =
    VoyagerFrontendProto.descriptor.getMessageTypes.asScala.map { c =>
      c.getName -> getObjectReference(c.getFullName)
    }.toMap

Not an area of expertise for me. Would you be open to at least code generating a list of message Descriptors in the generated object (for each proto file)? This seems like an easy solution, adding very little to the generated code.

@thesamet
Copy link
Contributor

Hi Peter,

The list of all descriptors is available in the FileDescriptor [*], see
getMessageTypes(). For every proto file, ScalaPB generates an object named
FileNameProto which contains the file descriptor for it as a lazy val.
Would that work for you?

[*]
https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/Descriptors.FileDescriptor

On Wed, Jun 24, 2015 at 8:21 AM, Peter vR notifications@github.com wrote:

Thanks @thesamet https://github.com/thesamet for the bug fix.

It's a bit disappointing to hear that the new version won't be solving my
problem as originally suggested. I'm sure you can appreciate manually
maintaining a list of messages is not ideal and not a viable long term
strategy for me personally.

It seems like it should be theoretically possible to generate a list of
descriptors at compile time, I had a quick stab at it but couldn't make it
work, especially with the bounded types (as in your example):

def getObjectReference(clazz: String): GeneratedMessageCompanion[] = {
currentMirror.reflectModule(currentMirror.staticModule(clazz)).instance.asInstanceOf[GeneratedMessageCompanion[
]]
}

val companionsByName: Map[String, GeneratedMessageCompanion[_]] =
VoyagerFrontendProto.descriptor.getMessageTypes.asScala.map { c =>
c.getName -> getObjectReference(c.getFullName)
}.toMap

Not an area of expertise for me. Would you be open to at least code
generating a list of message Descriptors in the generated object (for each
proto file)? This seems like an easy solution, adding very little to the
generated code.


Reply to this email directly or view it on GitHub
#29 (comment).

-Nadav

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants