Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

separate protocol and implementation? #1

Closed
Dushistov opened this issue Jan 24, 2020 · 20 comments · Fixed by #7
Closed

separate protocol and implementation? #1

Dushistov opened this issue Jan 24, 2020 · 20 comments · Fixed by #7

Comments

@Dushistov
Copy link
Contributor

Dushistov commented Jan 24, 2020

@lkolbly
I use ublox GPS receiver, but it is connected via SPI on my board.
Do you interesting split this crate into two: one for UBX protocol and other part with rs232 and so on?

@lkolbly
Copy link
Collaborator

lkolbly commented Jan 24, 2020

Yes, I don't think it makes sense to force users of Device to have a serial port. Would it make sense to template Device on the port type (which is now a Box<dyn serialport::SerialPort> anyway), and then the user could just construct their own serial port/SPI/whatever and pass it in? I don't know if we necessarily need a whole separate crate for each protocol.

We could have a RS232-friendly wrapper, and a SPI-friendly wrapper, as features though.

We'd probably have to change how the poll_for method works, but I think that's fine.

@Dushistov
Copy link
Contributor Author

Dushistov commented Jan 24, 2020

Would it make sense to template Device

Not sure that it is possible to create effective abstraction.
For example for spi basic operations is like this transfer(input buffer, output buffer)
you send and receive data at the same time. While with rs232 often protocols based on send then receive pattern.

Plus it is possible to have several devices connected via SPI to CPU. And you activate gpio1 or gpio2 to choose your preferable device. The problem in this configuration is that "Device" object that control spi is may be not corresponding to only GPS,
but also to pressure and temperature sensor.

@Dushistov
Copy link
Contributor Author

Perfect image of crate for me, is UBX protocol state machine parser - you give it bytes and it time to time give to you packets, plus packet unpacker, plus functions to create packets. Parser is preferably based on nom but this is just preference.

@lkolbly
Copy link
Collaborator

lkolbly commented Jan 24, 2020

Yeah, SPI's simultaneous read/write model does not match well with the RS232 model.

I very much like nom as well, I wrote this before I had discovered it. I'd be down for a rewrite. Having a bytes-in, bytes-out state machine I think would have to be able to handle to possibility of processing multiple packets (or a few packets followed by a partial packet) in any particular sequence of bytes handed to the DeviceStateMachine class.

@lkolbly
Copy link
Collaborator

lkolbly commented Jan 24, 2020

I can make some prototypes for what such a class might look like this weekend.

@Dushistov
Copy link
Contributor Author

Dushistov commented Jan 25, 2020

Plus I should add that serial crate that use as dependency (serialport) may be not suitable for all use cases. For example I use "Raspberry Pi 3" for work with navigation device, there is nice crate for it: rppal. It covers almost all interfaces that I need: i2c, spi and uart. So if my gps receiver will be connected via serial port (it doesn't, but if) then I will be prefer to use rppal crate for work with uart, because of I already uses it for other interfaces.

@lkolbly
Copy link
Collaborator

lkolbly commented Jan 27, 2020

Good point. I just pushed a commit which splits up the parsing from all the I/O dependencies and policy - now you can use the Segmenter struct to extract packets from an input byte stream, as well as you can turn packets into bytes using a to_bytes method.

Usage looks something like:

write_to_spi(ublox::CfgMsg{classid: 1, msgid: 7, rates: [0, 0, 0, 0, 1, 0]}.to_bytes());
let mut s = Segmenter::new();
loop {
   let bytes = read_from_spi();
   for packet in s.consume_all() {
      match packet {
         Some(Packet::NavPosVelTime(packet)) => {
            // Do something with the fix
         }
         None => { /* don't care about other packets */ }
      }
   }
}

Thoughts?

@Dushistov
Copy link
Contributor Author

Dushistov commented Jan 27, 2020

@lkolbly

Thoughts?

I like general idea, here some remarks:

  1. I don't meet variable size packages during manual reading, so to_bytes I suppose should not allocate memory and return
    fixed size array as result, and size of array will depend on struct Type.
  2. Here I suppose typo and consume_all takes bytes as argument?
  3. I don't really like Result<Vec<Packet>> as result type of consume_all, why it should allocate anything? It can return something like Result<impl Iterator> which will construct suitable struct "in place" without need of heap allocation. If it is impossible to use Iterator, may be
    FallibleStreamingIterator instead.

And it would be great to remove all dependencies, my dependencies tree is now huge enough,
I don't see any practical reason for any dependencies for package parsing, except nom, but you aren't using it now.

@lkolbly
Copy link
Collaborator

lkolbly commented Jan 27, 2020

I don't see any practical reason for any dependencies for package parsing, except nom, but you aren't using it now.

The packets are all simple enough I can get away with bincode, which needs things to implement Serialize or Deserialize from serde. The code is somewhat shorter with bincode than nom, I think.

Here I suppose typo and consume_all takes bytes as argument?

consume_all takes &[u8], just like consume.

I don't meet variable size packages during manual reading, so to_bytes I suppose should not allocate memory and return fixed size array as result, and size of array will depend on struct Type.

Some packets, like the response for AlpSrv, don't have a size known at compile time.

Though I suppose, the packets could implement the Read trait, though that would complicate parsing a bit.

EDIT: Or for writing, the Write trait.

@Dushistov
Copy link
Contributor Author

Dushistov commented Jan 27, 2020

The packets are all simple enough I can get away with bincode

Oh, this is unexpected for me, because of there is no guarantee about preserving of binary format between major versions update, so it need something like 100% test coverage for easy way to migrate between major releases, but this is interesting way for parse such simple structures.

Though I suppose, the packets could implement the Read trait, though that would complicate parsing

Or may be it can be lazy parsing, extracting only what need user and without even use additional memory, like:

struct Foo {
  repr: [u8; N],
}

impl Foo {
   fn get_fielda(&self) -> u32 { u32::from_le_bytes([self.repr[7], self.repr[8],... ]) }
}

struct FooRef<'a> {
  repr: &'a [u8]
}

impl<'a> FooRef<'_> {
   fn get_fielda(&self) -> u32 { u32::from_le_bytes([self.repr[7], self.repr[8],... ]) }
}

So structures would hold already packed data and use getters to get exactly what user want and only that fields. It save some CPU cycles, plus parser can return FooRef, just 16 bytes for each packet.
For unlimited packages it can be struct Foo { Box<[u8]> }.

The downside of this idea is that there is need for some kind of builder pattern for simplify way to construct them, plus it should be some kind of macros or "derive" crate that simplify description of getters and constructors based on "build pattern", but it can reduce dependency graph to something like syn/quote.

@lkolbly
Copy link
Collaborator

lkolbly commented Jan 29, 2020

I don't think constructing structures like that will be too hard, since they could be constructed much the way they are now, and there would be an easy way to handle different protocol versions (since the class Foo could know the protocol version and parse accordingly.).

I don't know whether the parser would be able to return a FooRef in general, since it's tricky for the user to guarantee that they have the whole packet without parsing it themselves. Although maybe an API like this would work:

pub fn make_packet_view<'a>(data: &'a [u8]) -> (usize, Option<PacketView<'a>>) {
   // Returned PacketView holds a reference to incoming data, iff a complete packet exists.
   // Also returns offset in the buffer until after this packet
}

Zero copies required, with a usage like:

let mut start = 0;
let mut end = 0;
let mut gpsbuffer: [u8; 128] = [0; 128]; // Better hope you never get a packet >128 bytes big :P
loop {
   let (offset, packet) = ublox::make_packet_view(gpsbuffer[start..end])?;
   process_optional_packet(packet);
   start += offset;
   end -= offset;
   if end == start {
      start = 0;
      end = 0;
   }
   if end == 128 {
      // Move the array
   }
   end += read_from_spi(&gpsbuffer[start..])?;
}

The trick being that any given packet has to be a continuous chunk in memory.

You're right though, that some sort of macro or derive crate would have to be created to build all the views. I don't know much about building macros in Rust (just a little bit about macro_rules), but I've been looking for a good opportunity to learn :)

@Dushistov
Copy link
Contributor Author

Dushistov commented Jan 29, 2020

@lkolbly

For sending I don't see why use references, that's why I describe two types: Foo and FooRef,
the both have the same "getters", but one own memory while other just reference.
The second one will reference memory inside Segmenter, so I don't see any issue with "continuous chunk in memory".

Like:

struct NavPosLLH {
  repr: [u8; 28 + 2/*checksum*/ + 4/*header*/],
}
impl NavPosLLH {
  fn bytes(&self) -> &[u8] { &self.repr }
  fn lat(&self) -> f64 { ... }
}

struct NavPosLLHRef<'a> {
   repr: &'a [u8],
}
impl<'a> NavPosLLHRef<'_> {
  fn lat(&self) -> f64 { ... }
}

If you some strange reason want to send Nav-PosLLH you need NavPosLLH,
if you parsing parsing data you get NavPosLLHRef from Segmenter::consume_all, which match header,
check length, check checksum and only then construct NavPosLLHRef.

Though may be better Segmenter::consume_one. Because of Iterator have some issues with references, but I suppose it is possible to know only after draft implementation.

So I don't see any issues with "continuous chunk in memory".

but I've been looking for a good opportunity to learn

I don't think that with syn/quote you have to learn something worth to mention.
I fixed some bugs in:
https://github.com/yangby-cryptape/rust-property
and
https://github.com/Hoverbear/getset

In general you just:

let ast: DeriveInput = syn::parse(input).expect_or_abort("Couldn't parse for getters");

and you already has analog of Vec<Field> after that you simply have to just generate Rust code that you want.

@lkolbly
Copy link
Collaborator

lkolbly commented Feb 1, 2020

I've spent a bit of time poking around with syn/quote, I agree it won't be too hard to make a handy little macro to generate getters. Maybe I can even get it to handle bitfields:

#[ubx_packet]
struct MyPacket {
   #[ubx_bitfield(2)]
   fielda: u32, // Two-bit field

   #[ubx_bitfield(4)]
   fieldb: u32, // 4-bit field immediately after fielda
}

Or something.

The second one will reference memory inside Segmenter, so I don't see any issue with "continuous chunk in memory".

Inconveniently, Segmenter internally uses a circular buffer to store data, so it's possible that a packet straddles the end of the circle (and thus is made up of two discontinuous chunks of memory).

Segmenter could be changed to not use a circular buffer, but that would mean moving data around as it gets parsed. (packet A arrives, packet B arrives, packet A is read, packet B must now be moved to where packet A was so that packet C can arrive)

The Ref type could conceivably be circular-buffer-aware, but I think that would be highly error prone.

Instead of a Segmenter, though, perhaps one could use a function which itself operates directly on a contiguous chunk of memory:

let rx_buffer = read_spi_bus();
let (packet, advance, bytes_required) = ublox::scan_for_packet(rx_buffer);
match packet {
   Some(ublox::NavPosLLHRef(packet)) => ...
}
// If bytes_required > 0, wait for more bytes, unless it would overflow the buffer, in which case use bytes_required to spin until we get the next packet.
// If advance > 0, shuffle rx_buffer around a bit.

It's a bit unwieldy of an API, since you're doing all your own buffer management, but that might be fine if you're in a particularly restricted environment.

To your point though, it would be convenient for sending packets, since you could build them in-place in a transmit buffer, if the structs all had setters.

@Dushistov
Copy link
Contributor Author

@lkolbly

Segmenter could be changed to not use a circular buffer, but that would mean moving data > around as it gets parsed.

Actually it can be circular buffer, but not for bytes, but for packets.
Something like VecDeque<Vec<u8>>, so one Vec<u8> per packet.
After "adoptation" to GPS receiver sending speed and CPU processing speed,
it should stop memory allocation and almost every time reuse already allocated memory for Vec<u8> and VecDeque

@Dushistov
Copy link
Contributor Author

@lkolbly

I made prototype to benchmark different parser implementations.
As input I take 188 megabytes of logs from real ublox, split into random size chunks from 1 to 100,
and use three implementation to parse this logs by chunks and compare time and parse result (to make sure that all implementations works correct).

Three implementations:

  1. Segmenter from master
  2. Super naive implementation: it parse one byte at time, when package ready it clear buffer and create suitable type by copying data like Segmenter
  3. Vec as buffer and zero-copy view of packets (as described above),
    time to time it uses Vec::drain to memmove rest of bytes to begin of buffer.

I measure time with clock(3) functions from libc and run benchmark several times:

  1. 0.75556 seconds
  2. 0.596255 seconds
  3. 0.199282 seconds

So I suppose I start work with parser that uses (3) as main strategy,
it looks good enough.

Dushistov added a commit to Dushistov/ublox that referenced this issue Feb 23, 2020
This need for future steps, like separate protocol parser/generator,
and code generators
ref ublox-rs#1
@lkolbly
Copy link
Collaborator

lkolbly commented Feb 23, 2020

Cool! I did end up playing around a bit with syn/quote, and wrote up some of the packets to that. It includes generating views into packets, so I think it wouldn't be too hard to write Packet::deserialize so that it returns a packet view.

I haven't had a chance to port all of the packets to this system, but I pushed what I have done.

@Dushistov
Copy link
Contributor Author

@lkolbly

I haven't had a chance to port all of the packets to this system, but I pushed what I have done

Great, I need to start communicate with my ublox receiver ASAP.
So I am going to combine our discussion into real code, and even partly implemented derive macro will be not bad help.

@lkolbly
Copy link
Collaborator

lkolbly commented Feb 23, 2020

The derive macro should be able to represent any packet, except I think ones with strings (or any generic array of bytes). If you need to parse one of those I might be able to whip something up later (I'm out most of today though), but also just handwriting something wouldn't be too bad.

@Dushistov
Copy link
Contributor Author

@lkolbly

The derive macro should be able to represent any packet

There are many things that were out of our discussion, but nice to have.
For example, to reduce code size generation I suppose we need two derives, like Serialize, Deserialize:
something like UbxProtocolSend, UbxProtocolRecv, so we can generate code only for receiving or only for sending (I suppose only several packets need both).

Also it would be nice to have mapper functions, like:

#[derive(UbxProtocolRecv)]
struct NavPosLlh {
   #[ubx_protocol(map= "lat_degree")]
   lat: i32,
}

fn lat_degree(lat: i32) -> f64 {
   f64::from(lat) * 1e-7
}

and as result it generate getter for NavPosLlhRef that not only extract i32 field,
but also will apply mapper function and result type of getter will be not i32 but f64.
And name will be lat_degree. fn lat_degree(&self) -> f64.

This is because of scaling via 1e-7 is part of protocol, so I don't see why user of crate should care about this.

I suppose I will add missed thing during work with this crate and make PR.

@lkolbly
Copy link
Collaborator

lkolbly commented Feb 23, 2020

As it is now, it generates both a Ref view and a MutRef view, so as long as you have a buffer to write into, you can set fields as desired. It generates getter and setter traits as well.

I like the mapper idea, although Position and Velocity implement From traits for a few different packets where's it relevant (ex. NavVelNED), so you can use that as well. Ultimately I want more stuff exposed in the higher-level abstractions, but for now you should be able to get away with those without having to care about the protocol's scaling factors.

Dushistov added a commit to Dushistov/ublox that referenced this issue Mar 11, 2020
Features:
- removes unnecessary dependicies
- more type safe code
- faster parser: 3x
fixes ublox-rs#1
Dushistov added a commit to Dushistov/ublox that referenced this issue Mar 12, 2020
Features:
- removes unnecessary dependicies
- more type safe code
- faster parser: 3x
fixes ublox-rs#1
Dushistov added a commit to Dushistov/ublox that referenced this issue Mar 13, 2020
Features:
- removes unnecessary dependicies
- more type safe code
- faster parser: 3x
fixes ublox-rs#1
lkolbly pushed a commit that referenced this issue Mar 24, 2020
* Use own proc macro to generate code to deal with packets

Features:
- removes unnecessary dependicies
- more type safe code
- faster parser: 3x
fixes #1

* make all names in ubx_packets starts with Ubx for uniformity

* export packet size as constant

Usefull for SPI, where we need provide read buffer simultaneously
with write buffer

* feature: make possible to print (+validate) extension field in MonVer

* feature: new packages: Nav-Dop, Nav-Sol, AckNak

* feature: nice method to check if ack for certain message

* fix: generted code for handling i8 field in ubx_packet_recv

* cleanups/typo fixes plus formating

* fix several clippy warnings

* parser/refactoring: simplify work with &data[pos..]

* parser_tests/refactroing: split big test into small one

* make possible to reuse ublox dump test

* feature: automatically calculate max packet size for parser

* refactoring: rename CfgMsg(3|8) -> CfgMsg(Single|All)Port(s)

* feature: Debug implementation for FixStatusInfo

* ublox_derive/src/input refactoring: use "use syn::Error"

* ublox_derive: check that rest_handling not already defined

* refactoring: simplify generate_send_code_for_packet

* fix comment grammar

* ublox_derive/src/input: improve error message

* fix tests

* refactoring: usage of fold, plus more appropriate names
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants