Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for DAC build 8 timing data readback #299

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Add support for DAC build 8 timing data readback #299

wants to merge 2 commits into from

Conversation

patzinak
Copy link

@patzinak
Copy link
Author

To potential reviewers:

It appears not that much of the original code that is needed to read the timing data from DAC build 8 has been removed.

This enhancement fix should be backward compatible since the added code is logically protected by isinstance(runner, dac.DacRunner_Build8) check. If necessary, the readback functionality for other old builds could be restored by a trivial change to, for example, isinstance(runner, (dac.DacRunner_Build8, dac.DacRunner_BuildX)) (of course, in such case, it is better to move the tuple out into dac.py module as a predefined constant).

The only nontrivial change, is the following two lines:

if isinstance(runner, dac.DacRunner_Build8):
    extracted = (extracted,)

Without this a few errors are thrown from the pylabrad package modules without any explicit links to the server methods that caused them (one of these cases, when the error traceback is pretty confusing). I guess that the original error is raised due to the improper usage of LabRAD types or something like that but I am not aware of anything that fits better than '*i' for the corresponding run_sequence setting. This semi-hack was inspired by the ADC extract methods. Alternative approaches to this two-liner fix, apart from modifying the LabRAD type mask, would be to modify extract method(s) for the DACs (this seems to be less backward-compatible) or put the hack into run_sequence setting (the purpose of the hack will be even more obscure).

The disadvantages of the current implementation is an added extremely small computational overhead and somewhat loosen type casting requirements for the run_sequence method return.

The server has been tested in an actual setup configuration that includes DACs build 8 and ADC build 7.

Finally, I haven't touched the server version number but the change probably justifies bumping the last digit.

Hope this helps.

@maffoo
Copy link
Contributor

maffoo commented Jan 5, 2016

@patzinak, thanks for this change. I'm looking at the code in more detail, but have two concerns right off the bat. First, I think we should enforce that the boards from which data is being read back are either all DACs or all ADCs. Returning data as a single rectangular array of ints doesn't make sense in the mixed case (at best, we'd have to do something like return a cluster of ADC data and DAC data because the shapes wouldn't match), but in any case I don't see anyone wanting to use the boards in a configuration where they're getting both ADC demod data and DAC timing data back at the same time.

Second, even in the DAC timing data case it's possible to run multiple boards at once, so I think the data format returned by run_sequence should be *2i, where the first index runs over boards in the timing order and the second index runs over the sequence repetitions for each boards. Even with one board, you'd still get a 2D array, but containing only a single row. I think we've found it's a bad idea to vary the return type of a setting based on the parameters used to call the setting (e.g. *i if running one board, *2i if running multiple boards); it's much better to keep things as uniform as possible (*2i always). Of course, we already have the problem that run_sequence produces different response types based on whether you are running ADCs or DACs, and in the ADC case also depending on whether you are running in average or demod mode. But we can at least try not to make the situation too much worse :)

Edit: I think I misunderstood the point of *i in your change, which seems to be superfluous since the way you're adding an extra tuple of wrapping around the timing results you'll always get a *3i. I guess my preference would be go ahead and add *2i back in as it was before for returning DAC timing data, rather than try to fit into the shape of ADC data.

@maffoo
Copy link
Contributor

maffoo commented Jan 5, 2016

I worked on this a bit and came up with the following, which includes validating the timing order up front to ensure we don't try to mix DACs and ADCs or demod and average mode: 634f498

@patzinak
Copy link
Author

patzinak commented Jan 9, 2016

@maffoo, first of all, thank you for putting your time in this particular issue. We as a research group, certainly appreciate any efforts on your side that will allow us to seamlessly benefit from any future updates in this repo. Also, though this is not specifically related to the issue, we are currently in a serious process of improving the reliability and measurement efficiency of our setups. I'm going to open a few new issues here. Some questions may sound naive but we try to do our homework: at first we try to find out whether a problem has been already addressed somewhere, if not, one or two of us try to find a solution, and when we have a some sort of consensus in the group on the nature of the problem, we post an issue here.

Specifically to this issue, your first concern is something that I have actually thought about and we did in fact at some point had a chip that we intended to characterize running Preamp/DAC and ADC boards simultaneously. The chip turned out to have some fabricational issues but the idea of a mixed experiment (Preamp/DAC readout and ADC demodulation) does not sound crazy for us (think of a qubit coupled to two resonators). Having said this, I am not familiar with the hardware/Ethernet technical details to judge whether this is simply not an intended usage scenario or this is just something that nobody yet wanted to do. My short answer is that we would like to see the more general case implemented but this is not of the highest priority at the moment.

As for the second concern, I was a bit confused about the overall vision for the way of returning structurally different data sets. The things makes more sense to me now. Your are right, I intended to do *i but couldn't trace down an issue that I had and came up with a hack that worked. I totally agree that *2i as I understand this now is a better approach: we occasionally do experiments when several DAC boards return the data.

I've looked through your modified code. It looks good to me. If I have a chance I'll test it next week and let you know how well it works for us.

Thanks again!

Addition: What is the problem with returning always *4i in a way such that the first dimension corresponds to the board index? When the discrepancy exists between lengths in one or another dimension the shorter dimensions is padded with an equivalent of numpy.nan if this equivalent exists. It is assumed that the user knows the order of the boards and is able properly interpret the data.

@maffoo
Copy link
Contributor

maffoo commented Jan 25, 2016

@patzinak, I think if we want to support the mixed case of returning both DAC timing data and ADC demod data in one experiment, it would be better to return a cluster like (*4i, *2i) where the first element contains the ADC results and the second contains the DAC results. That seems better to me that putting all the DAC data into one array with special padding that the user had to know how to ignore. One problem is that there is no equivalent to NaN for ints, since every bit pattern is a valid integer, so we'd have to pick a special sentinal value; and of course all that padding still takes up space on the wire so it'd be rather less efficient.

Supporting the mixed case like this would be fairly easy, but I think given that the ADC and DAC data are in two lists we'd probably want to have the user specify to separate "timing order" lists corresponding to these two entries. Again, not too hard, but would probably want to create new settings for this rather than further complicating the existing settings.

@pomalley
Copy link
Contributor

Sorry to jump in late here, but I seem to remember spending a lot of time getting the qubit sequencer to support simultaneous DAC and ADC readback (this was back when we were first developing resonator readout). So I'm guessing that the FPGA server must have supported it, too. Was it removed at some point?

@maffoo
Copy link
Contributor

maffoo commented Jan 26, 2016

If it ever worked in the qubit sequencer of fpga server, it must have been removed from both. The problem in both places is the same, namely that the "shape" of data returned by DACs and ADCs is not the same, so they can't really be combined into one rectangular array, which is what we currently return.

@patzinak
Copy link
Author

Yes, I see now and agree that padding is not a great option. I also don't really have a strong opinion on this issue as long as the interface is easily comprehensible.

Ideally, however, it would be great if:

  1. None of the functionality is ever removed unless there is a very serious reason for it.
  2. If the backward compatibility is broken it is a good practice to write a document/a wiki page/etc. with examples such as "if you were doing /that/ you should be now doing /this/ and if you were doing /that/ you should be now doing /this/"...
  3. Clustering data in general is pretty confusing, it is assumed the user code somehow keeps track of the board orders. This is not that hard per se. However, the interface could be more transparent if the run_squence and data retrieval are split into a couple of settings. For example, this seems to be very intuitive:
p.run(reps=1020)
p.data('Board Group 1 DAC 3', key='dac3')
p.data('Board Group 1 ADC 7', key='adc7') 
p.data('Board Group 1 ADC 8', key='adc8') 

Not sure weather relying on contexts here is a good idea from the performance standpoint. Yet, if the performance is an issue than Python is not the best option... On the other hand, our group experience with the GHz FPGA boards is definitely not hassle-free — for us reliability and transparency would be of a higher priority. For now, however, we would be more than happy just to see the critical gaps eliminated between our code repositories.

@DanielSank
Copy link
Member

1. None of the functionality is ever removed unless there is a very serious reason for it.

Agreed.

2. If the backward compatibility is broken it is a good practice to write a document/a wiki page/etc. with examples such as "if you were doing /that/ you should be now doing /this/ and if you were doing /that/ you should be now doing /this/"...

I think the proper place for that information is in the code. Trying to sync wiki articles or other external media with the code adds unnecessary complexity.

3. Clustering data in general is pretty confusing, it is assumed the user code somehow keeps track of the board orders.

Shouldn't the doc string for run_sequence be the one-stop-shop for understanding the data returned by run_sequence? Presently, the doc string for run_sequence says

Returns:
    If ADC boards all in average mode, data returned as a *3i. The three
    indices are:
        (board index in timing order, I/Q, time sample index).

   If ADC boards all in demodulate mode, data returned as a *4i.
    The four indices label:
        (demod channel, stat, retrigger, I/Q).
    retrigger indexes multiple triggers in a sequence.

   If only DACs present, we return no data.

   ADC boards must be either all in average mode or all in demodulate mode.

The obvious formatting issues aside, if you could suggest an improvement I am happy to put it in.

p.run(reps=1020)
p.data('Board Group 1 DAC 3', key='dac3')
p.data('Board Group 1 ADC 7', key='dac7') 
p.data('Board Group 1 ADC 8', key='adc8') 

How about

timing_order = ['DAC 1', DAC 2']
fpga.timing_order(timing_order)
result = fpga.run_sequence(whatever)
result_by_board_name = dict((board_name, data)
                             for (board_name, data) in zip(timing_order, result))

@patzinak
Copy link
Author

I think the proper place for that information is in the code. Trying to sync wiki articles or other external media with the code adds unnecessary complexity.

I totally agree — I'd personally love to see everything in the doc strings and in the head comments. Wikis should be reserved for the higher level examples: i.e. tutorials on integrating multiple modules, code structure, vision and so on. However, some people may object having examples that are related to the previous versions when the backward compatibility is broken. In such cases, designated wiki pages could be actually helpful: they should refer to specific code version transitions and must not be updated later (apart from error/typo elimination) because both, the old and new versions, are a part of the history once the transition is implemented. Because of this the information is never outdated: as the time goes it becomes just less useful but is still correct.

Shouldn't the doc string for run_sequence be the one-stop-shop for understanding the data returned by run_sequence?

It could be, I don't have a strong opinion here. The problem is how to return the heterogeneous data from a single setting. Seems like two ADC modes and DAC timing data create some mess, otherwise we wouldn't have this discussion. But honestly, I'm okay with this mess. I would however argue that doc string is not perfect.

Returns:
    If ADC boards all in average mode, data returned as a *3i. The three
    indices are:
        (board index in timing order, I/Q, time sample index).

Looks great to me!

   If ADC boards all in demodulate mode, data returned as a *4i.
    The four indices label:
        (demod channel, stat, retrigger, I/Q).
    retrigger indexes multiple triggers in a sequence.

What is stat stands for? Let's think... What was the reason not to keep I/Q as the second dimension? Not clear... Not perfect but okay.

   If only DACs present, we return no data.

This was not previously the case. What was the reason to remove the DAC support? What should I do if I need to collect data from DACs? In an ideal world, I'd expect to see "If you need to acquire the DAC data use version x.y.z of this server." or "The DACs support was removed starting from version x.y.z because the DAC builds that return timing data do not support XYZ protocol.'

   ADC boards must be either all in average mode or all in demodulate mode.

"because XYZ communication protocol does not allow this..." or "The mixed mode case is not implemented due to the absence of any real need."

If the example from the previous comment works with

timing_order = ['DAC 1', 'ADC 2']
...

it will be really-really great.

@DanielSank
Copy link
Member

However, some people may object having examples that are related to the previous versions when the backward compatibility is broken.

If we make a breaking change I see no problem to note that in the doc string, at least for a while. When it gets to the point that the change is far enough in the past then it's easy enough to remove a section of documentation.

In such cases, designated wiki pages could be actually helpful: they should refer to specific code version transitions and must not be updated later (apart from error/typo elimination) because both, the old and new versions, are a part of the history once the transition is implemented. Because of this the information is never outdated: as the time goes it becomes just less useful but is still correct.

I think a note in the documentation of the relevant code would be considerably more direct and a lot easier to maintain.

Clustering data in general is pretty confusing, it is assumed the user code somehow keeps track of the board orders.

Shouldn't the doc string for run_sequence be the one-stop-shop for understanding the data returned by run_sequence?

It could be, I don't have a strong opinion here. The problem is how to return the heterogeneous data from a single setting. Seems like two ADC modes and DAC timing data create some mess, otherwise we wouldn't have this discussion. But honestly, I'm okay with this mess. I would however argue that doc string is not perfect.

I'm trying to respond to one thing at a time. You said that clustering data is confusing because it assumes that the use code somehow keeps track of the board orders. I pointed out the doc string of run_sequence because that doc string makes it clear that the user has to have decided on that ordering anyway; it's not an extra assumption. This point is independent of the issue of inhomogeneous data.

If the example from the previous comment works with

timing_order = ['DAC 1', 'ADC 2']
...

it will be really-really great.

I was not trying to suggest that this actually works in the current version of the code. I was responding to your suggested code wherein each board's data is "fetched" via an independent call to the fpga server. In other words, in pursuing support for the data type you guys want, I think we should aim for an interface that looks like the example I posted, rather than one which involves a separate call to get each board's data.

@patzinak
Copy link
Author

I should have made my point clearer and should have specified in the comment that starts with

Yes, I see now and agree that padding is not a great option...

that the comment mostly refers to:

Supporting the mixed case like this would be fairly easy, but I think given that the ADC and DAC data are in two lists we'd probably want to have the user specify to separate "timing order" lists corresponding to these two entries. Again, not too hard, but would probably want to create new settings for this rather than further complicating the existing settings.

In fact in one of the earlier comments I said:

It is assumed that the user knows the order of the boards and is able properly interpret the data.

implying that keeping track of the board orders is not that hard and could be still assumed in some of the proposed implementations. You can say that I'm contradicting to myself while I consider this to be a discussion of different possibilities.

The main point was the following: if some settings are going to be introduced in order to simplify the inhomogeneous data transfer why not to simply the interface? Note, all this could be done by keeping as it is (or improving) the current run_sequence setting interface, i.e. by introducing new settings that are some sort of wrappers around run_sequence.

Again, this is just an idea. You guys are in a better position to make any decision.

As for doc strings, as I said I'm in strong favor to keep everything close to the code, i.e. in the doc-strings. I think we are totally on the same page here.

@patzinak
Copy link
Author

Btw, totally unrelated,

result_by_board_name = dict((board_name, data)
                             for (board_name, data) in zip(timing_order, result))

is arguably less readable than

result_by_board_name = {board: result[k] for k, board in enumerate(timing_order)}

@DanielSank
Copy link
Member

Yes it is more readable. Please make a PR.

I probably didn't know about dict comprehensions when that code was written. Heck, they may not have been in the language at that time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants