BitString decoding is ambiguous in version >=2.5.0 #276

Open
emlun opened this issue Mar 1, 2023 · 2 comments

@emlun

emlun commented Mar 1, 2023

Hi! I'm encountering an issue with the BitString decoding since version 2.5.0. Specifically, the issue was introduced in commit 18b3b7d, which strips the "unused bits" byte and shifts the unused bits from the end to the beginning of the bit string.

The issue is that ASN.1 BitStrings are big-endian, and the Decoder doesn't output the number of unused bits, so as far as I can tell there's no way to tell which of the bits in the output corresponds to which bit in the encoded BitString. Here's a practical example:

The FIDO U2F Authenticator Transports Extension is an X.509 extension defined thus:

-- FIDO U2F certificate extensions
id-fido-u2f-ce-transports OBJECT IDENTIFIER ::= { id-fido-u2f-ce 1 }

fidoU2FTransports EXTENSION ::= {
  WITH SYNTAX FIDOU2FTransports
  ID id-fido-u2f-ce-transports
}

FIDOU2FTransports ::= BIT STRING {
  bluetoothRadio(0), -- Bluetooth Classic
  bluetoothLowEnergyRadio(1),
  uSB(2),
  nFC(3),
  uSBInternal(4)
}

An extension value representing the list [uSB(2), nFC(3)] is encoded in DER as: 03 02 04 30, representing the bit string 0011 .... where each of the 4 unused bits is written as "."
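As a cross-check of that encoding, here is a minimal sketch that builds the DER bytes from a list of booleans. The helper name `encode_bit_string` is hypothetical and not part of the asn1 library, and the sketch only handles short-form lengths.

```python
def encode_bit_string(bits):
    """Encode a list of booleans as a DER BIT STRING (short-form length only)."""
    unused = (-len(bits)) % 8          # padding bits needed to fill the last octet
    padded = list(bits) + [False] * unused
    content = bytearray([unused])      # first content octet: unused-bit count
    for start in range(0, len(padded), 8):
        octet = 0
        for bit in padded[start:start + 8]:
            octet = (octet << 1) | int(bit)   # bit 0 ends up most significant
        content.append(octet)
    if len(content) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([0x03, len(content)]) + bytes(content)


# Bits 0..3 of FIDOU2FTransports: only uSB(2) and nFC(3) are set.
encoded = encode_bit_string([False, False, True, True])
print(encoded.hex(" "))  # → 03 02 04 30
```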

Now consider this example script:

import asn1


def show_bits(bstr):
    return [f"{b:08b}" for b in bstr]


ext_value = bytes([0x03, 0x02, 0x04, 0x30])

dec = asn1.Decoder()
dec.start(ext_value)
tag, value = dec.read()

print(ext_value.hex(), show_bits(ext_value))
print(value.hex(), show_bits(value))

In asn1 version 2.4.2 and earlier, this outputs:

03020430 ['00000011', '00000010', '00000100', '00110000']
0430 ['00000100', '00110000']

and you can extract the bit flags by identifying bit i in the spec with the bit 1 << (8 - i - 1) in the second byte of value.
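A minimal sketch of that extraction, assuming the pre-2.5.0 shape where the first byte of value is the unused-bit count and the rest are data octets. The function name `flags_from_legacy_value` is made up for illustration.

```python
def flags_from_legacy_value(value):
    """Return the indices of set bits in a value shaped like the <=2.4.2 output.

    Assumes value[0] is the unused-bit count and value[1:] are the data octets.
    """
    unused = value[0]
    data = value[1:]
    total_bits = len(data) * 8 - unused
    return [
        i for i in range(total_bits)
        # Spec bit i maps to mask 1 << (8 - i % 8 - 1) within octet i // 8.
        if data[i // 8] & (1 << (8 - i % 8 - 1))
    ]


print(flags_from_legacy_value(bytes([0x04, 0x30])))  # → [2, 3]
```

Indices 2 and 3 correspond to uSB(2) and nFC(3) in the spec above.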

In asn1 version 2.5.0 and later, this outputs:

03020430 ['00000011', '00000010', '00000100', '00110000']
03 ['00000011']

The decoder no longer returns how many bits are unused, so there seems to be no way to tell if the shifted bits in the output represent [uSB(2), nFC(3)] (bit string 0011 ....) or [bluetoothRadio(0), bluetoothLowEnergyRadio(1)] (bit string 11.. ....), or any other combination. Had the bits been reversed, you could identify bit i in the spec with the bit 1 << i in the output, but the output is still big-endian so that's not possible without also extracting the "unused bits" value from the original input.

Am I missing something?

I think the decoder needs to either also return the number of unused bits, or return some higher-level data model (for example, list[bool] or an accessor class instance) that allows extracting and/or iterating over the BitString flags.
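To make the list[bool] option concrete, here is a rough sketch that parses the raw DER bytes directly rather than going through the library. `decode_bit_string` is a hypothetical helper, not an existing asn1 API, and it only accepts primitive encodings with short-form lengths.

```python
def decode_bit_string(der):
    """Decode a primitive, short-form DER BIT STRING into a list of booleans."""
    if len(der) < 3 or der[0] != 0x03:
        raise ValueError("not a primitive BIT STRING")
    length = der[1]
    if length >= 0x80 or length != len(der) - 2:
        raise ValueError("only exact short-form lengths are handled in this sketch")
    unused, data = der[2], der[3:]
    if unused > 7 or (unused and not data):
        raise ValueError("invalid unused-bit count")
    total_bits = len(data) * 8 - unused
    # Spec bit i is the (i % 8)-th most significant bit of octet i // 8.
    return [bool(data[i // 8] & (0x80 >> (i % 8))) for i in range(total_bits)]


print(decode_bit_string(bytes([0x03, 0x02, 0x04, 0x30])))  # → [False, False, True, True]
print(decode_bit_string(bytes([0x03, 0x02, 0x05, 0x60])))  # → [False, True, True]
```

Both inputs decode to 03 in version 2.5.0, but the lists differ in both length and content, so the flags stay distinguishable.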

@andrivet
Owner

andrivet commented Mar 1, 2023

I do not fully understand the issue yet. I have to look at it in detail, and I will. I am currently a little busy with other projects, so it may take some time.

@emlun
Author

emlun commented Mar 2, 2023

Thanks, no rush! I'd be happy to help design and implement a solution. I'll probably experiment a bit with an accessor pattern, but let me know if you'd prefer a particular direction on how to handle this. I'm not familiar with what conventions there are in the library, but I'll take a look around and try to find something that can fit in.

Also, here's an expanded example (again using the FIDO transports X.509 extension as a practical example) to perhaps better show the nuances here:

import asn1

def show_bytes(bstr):
    return " ".join([f"{b:02x}" for b in bstr])

def show_bits(bstr):
    return [f"{b:08b}" for b in bstr]


U2F_TRANSPORTS = [
    'bluetoothRadio', 'BLE', 'USB', 'NFC', 'USBInternal', 'lightning']

def decode_transports(flags, unused_bits):
    return [
        U2F_TRANSPORTS[i]
        for i in range(8 - unused_bits)
        if flags & (1 << (8 - i - 1))]


examples = [
    bytes([0x03, 0x02, 7, 0b10000000]),
    bytes([0x03, 0x02, 6, 0b01000000]),
    bytes([0x03, 0x02, 5, 0b00100000]),
    bytes([0x03, 0x02, 2, 0b00000100]),

    bytes([0x03, 0x02, 5, 0b01100000]),
    bytes([0x03, 0x02, 4, 0b00110000]),

    bytes([0x03, 0x02, 4, 0b11000000]),
    bytes([0x03, 0x02, 2, 0b00110000]),

    bytes([0x03, 0x02, 1, 0b01100000]),
    bytes([0x03, 0x02, 0, 0b00110000]),
]

for ext_value in examples:
    dec = asn1.Decoder()
    dec.start(ext_value)
    tag, value = dec.read()
    unused = ext_value[2]
    desc = ", ".join(decode_transports(ext_value[3], unused))

    print(f"\nExample: {desc}; {unused} unused")
    print(f"Encoded: {show_bytes(ext_value): <12}    bits: {show_bits(ext_value)}")
    print(f"Decoded: {show_bytes(value): <12}    bits: {show_bits(value)}")

Running with asn1 version 2.5.0, this outputs:


Example: bluetoothRadio; 7 unused
Encoded: 03 02 07 80     bits: ['00000011', '00000010', '00000111', '10000000']
Decoded: 01              bits: ['00000001']

Example: BLE; 6 unused
Encoded: 03 02 06 40     bits: ['00000011', '00000010', '00000110', '01000000']
Decoded: 01              bits: ['00000001']

Example: USB; 5 unused
Encoded: 03 02 05 20     bits: ['00000011', '00000010', '00000101', '00100000']
Decoded: 01              bits: ['00000001']

Example: lightning; 2 unused
Encoded: 03 02 02 04     bits: ['00000011', '00000010', '00000010', '00000100']
Decoded: 01              bits: ['00000001']

Example: BLE, USB; 5 unused
Encoded: 03 02 05 60     bits: ['00000011', '00000010', '00000101', '01100000']
Decoded: 03              bits: ['00000011']

Example: USB, NFC; 4 unused
Encoded: 03 02 04 30     bits: ['00000011', '00000010', '00000100', '00110000']
Decoded: 03              bits: ['00000011']

Example: bluetoothRadio, BLE; 4 unused
Encoded: 03 02 04 c0     bits: ['00000011', '00000010', '00000100', '11000000']
Decoded: 0c              bits: ['00001100']

Example: USB, NFC; 2 unused
Encoded: 03 02 02 30     bits: ['00000011', '00000010', '00000010', '00110000']
Decoded: 0c              bits: ['00001100']

Example: BLE, USB; 1 unused
Encoded: 03 02 01 60     bits: ['00000011', '00000010', '00000001', '01100000']
Decoded: 30              bits: ['00110000']

Example: USB, NFC; 0 unused
Encoded: 03 02 00 30     bits: ['00000011', '00000010', '00000000', '00110000']
Decoded: 30              bits: ['00110000']

Notice how all of these decoded values in the example are ambiguous:

  • 00000001 only says that some single bit flag is set; which one depends on how many bits were unused
  • 00000011 could mean either [BLE, USB] or [USB, NFC], again depending on how many bits were unused
  • ...and so on with the other examples with different numbers of unused bits.
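The collisions can be reproduced without the library. The sketch below models the observed 2.5.0 output for a single data octet as a right shift by the unused-bit count. That model is an assumption inferred from the outputs above, not the library's actual code, and `shifted_like_2_5_0` is a made-up name.

```python
from collections import defaultdict


def shifted_like_2_5_0(content):
    """Model the observed 2.5.0 output for one data octet (an assumption)."""
    unused, octet = content
    return octet >> unused


# (unused bits, data octet) pairs taken from the examples above.
contents = [(7, 0x80), (6, 0x40), (5, 0x20), (2, 0x04), (5, 0x60), (4, 0x30)]

collisions = defaultdict(list)
for content in contents:
    collisions[shifted_like_2_5_0(content)].append(content)

for decoded, sources in collisions.items():
    print(f"{decoded:08b} <- {len(sources)} distinct encodings")
```

Four different encodings collapse to 00000001 and two collapse to 00000011, matching the output listed above.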

I should also mention that what I said above:

Had the bits been reversed, you could identify bit i in the spec with the bit 1 << i in the output

is only true in this particular case, where undefined is equivalent to false. But that cannot be assumed true in the general case, and in that case there's no way to distinguish zero bits from undefined bits. The BitStrings 0... ...., 00.. .... and 000. .... all get decoded to 0000 0000, but [False] may not always be equivalent to [False, False, False].
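One possible shape for the accessor idea is sketched below. It keeps the data octets together with the unused-bit count so that length is preserved. `BitStringValue` is a hypothetical class for illustration and not something the library provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BitStringValue:
    """Hypothetical accessor: data octets plus the unused-bit count."""
    data: bytes
    unused: int

    def __len__(self):
        return len(self.data) * 8 - self.unused

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError("bit index out of range")
        # Spec bit i is the (i % 8)-th most significant bit of octet i // 8.
        return bool(self.data[i // 8] & (0x80 >> (i % 8)))


one = BitStringValue(bytes([0x00]), 7)    # 0... ....
two = BitStringValue(bytes([0x00]), 6)    # 00.. ....
three = BitStringValue(bytes([0x00]), 5)  # 000. ....
print(list(one), list(two), list(three))
```

With the count retained, the three all-zero bit strings remain distinct values of length 1, 2 and 3.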
