Add guard against long topics in (un)subscribe #164

Open
wnelis opened this issue Jan 16, 2025 · 10 comments

Comments

@wnelis

wnelis commented Jan 16, 2025

In file mqtt_as/__init__.py, in both methods subscribe and unsubscribe, the field remaining length is encoded with a single octet. Lines 536 and 560 both read:

    struct.pack_into("!BH", pkt, 1, sz, pid)

The (implicit) assumption is that the payload, a topic in both cases, is short enough for the message to fit in at most 127 octets. A check that this assumption holds might be needed.

Personally, I would prefer a check like:

    assert sz < 128, "Topic too long"
    struct.pack_into("!BH", pkt, 1, sz, pid)

Once the application using this module has been tested and the topics in use are known to be short enough, the assert statements can be effectively removed by increasing the optimization level of MicroPython, which also decreases the size of the bytecode.
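To illustrate why the guard matters, here is a minimal sketch (run under CPython, not taken from mqtt_as) of what a receiver sees when a length of 128 or more is written as a single octet. The helper `decode_remaining_length` is hypothetical and only follows the variable-length scheme of the MQTT specification; the packet layout is reduced to the fixed header and packet identifier.

```python
import struct


def decode_remaining_length(buf, pos=1):
    """Decode an MQTT Remaining Length starting at buf[pos].

    Returns (value, number_of_octets_used). Hypothetical helper.
    """
    value = 0
    shift = 0
    used = 0
    while True:
        byte = buf[pos + used]
        used += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # top bit clear: last octet of the length
            return value, used
        if used == 4:  # the length field may use at most four octets
            raise ValueError("malformed Remaining Length")
        shift += 7


sz, pid = 136, 1  # 136 is the Remaining Length for a 131-octet topic
pkt = bytearray(b"\x82\0\0\0")
struct.pack_into("!BH", pkt, 1, sz, pid)  # single-octet encoding, as quoted above

# 0x88 has its continuation bit set, so the decoder swallows the first
# octet of the packet identifier as part of the length.
wrong, used = decode_remaining_length(pkt)
```

With these values the decoder reports a length of 8 spread over two octets instead of 136, so the rest of the packet is misparsed.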

@bobveringa
Contributor

The assert statement would at least prevent users from getting confused when using longer topic lengths.

But looking at the spec, and other elements of the code, I would argue for just supporting any length for (un)subscribe. The logic is simple, and (un)subscribes happen rarely enough that it shouldn't really impact performance. The downside is that it cannot be optimized away with MicroPython optimization levels. In my opinion, this small tradeoff is worth it for being able to use any length topic.

@wnelis
Author

wnelis commented Jan 17, 2025

As a side note: if methods (un)subscribe are to support any length topics, encoding (and sending?) field remaining length will probably be implemented in a separate method. If so, this new method can also be used in method connect. In that case a cosmetic change in the latter method could be implemented, by effectively changing (at line 309)

    premsg = bytearray(b"\x10\0\0\0\0\0")
    msg = bytearray(b"\x04MQTT\x00\0\0\0")

into

    premsg = bytearray(b"\x10\0\0\0\0")
    msg = bytearray(b"\0\x04MQTT\x00\0\0\0")

(which requires an update of all indices in byte-array msg). The new method would replace code snippet (at line 336)

        i = 1
        while sz > 0x7F:
            premsg[i] = (sz & 0x7F) | 0x80
            sz >>= 7
            i += 1
        premsg[i] = sz
        await self._as_write(premsg, i + 2)

in which the last line looks like a coding error, but is not with the current definitions of the aforementioned byte-arrays.
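A minimal sketch of what such a separate method could look like, assuming it returns the encoded octets rather than writing them into `premsg`. The name `encode_remaining_length` is hypothetical and is not part of mqtt_as; the loop is the same as in the snippet above.

```python
def encode_remaining_length(sz):
    """Encode sz as an MQTT Remaining Length of one to four octets.

    Hypothetical helper, not the mqtt_as implementation.
    """
    if not 0 <= sz <= 0x0FFFFFFF:  # four octets carry at most 28 bits
        raise ValueError("Remaining Length out of range")
    out = bytearray()
    while sz > 0x7F:
        out.append((sz & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        sz >>= 7
    out.append(sz)  # final octet, continuation bit clear
    return bytes(out)
```

The same helper could then serve connect, subscribe and unsubscribe, since the fixed header would be built as the packet type octet followed by the returned octets.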

@bobveringa
Contributor

Seems like you have a good handle on the situation. I wish I could give more input at this time, but I am leaving for a holiday soon and do not have time at the moment. I can investigate further (or implement the feature) in about 2 weeks.

@peterhinch
Owner

peterhinch commented Jan 18, 2025

[EDIT]
I have now spent some time with the specs for V3.1.1 and V5 and I think we have a few problems with [un]subscribe.

The Remaining Length byte sz

In V3.1.1 this is a single byte value <= 127 (limiting allowable topic length).
In V5 this is 1-4 bytes encoded as a Variable Byte Integer (spec 1.5.5) as per @wnelis. We need to address this.

UNSUBACK

This is currently not handled; unsubscribe causes the connection to drop after a period of waiting.

Suggested Plan

I will fix the V3.1.1 bugs, namely

  • Unsubscribe error as per original post.
  • Guard on topic length.
  • Handle UNSUBACK for V3.1.1 (will require adapting for V5).
  • Add code comments re possible V5 issues (labelled TODO).

I will also refactor to remove code duplicated in the [un]subscribe methods.
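For reference, a rough sketch of checking a V3.1.1 UNSUBACK, assuming the complete four-octet packet has already been read into a buffer. The function name `parse_unsuback` is hypothetical, and a V5 UNSUBACK carries additional fields that this sketch does not handle.

```python
import struct

UNSUBACK = 0xB0  # packet type 11 in the upper nibble, flags 0


def parse_unsuback(pkt):
    """Return the packet identifier from a complete V3.1.1 UNSUBACK.

    Hypothetical helper; V5 packets are not handled.
    """
    # V3.1.1: type octet, Remaining Length of 2, then the packet identifier.
    if len(pkt) != 4 or pkt[0] != UNSUBACK or pkt[1] != 2:
        raise ValueError("not a V3.1.1 UNSUBACK")
    (pid,) = struct.unpack_from("!H", pkt, 2)
    return pid
```

The returned identifier would be compared with the one sent in the UNSUBSCRIBE so that the pending request can be marked as acknowledged.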

@bobveringa

By the time you return I will have pushed an update and reported back. Please could you check my observations on the V5 spec and when you get time submit a PR?

@wnelis
Author

wnelis commented Jan 18, 2025

The Remaining Length byte sz

In V3.1.1 this is a single byte value <= 127 (limiting allowable topic length).

The specification I've used is taken from https://mqtt.org/mqtt-specification/. I read in the specification of v3.1.1 that the field remaining length is always encoded with a variable length, see section 2.2. In the description of the subscribe control message, section 3.8.1, there is no mention of encoding the length in a single octet. The same is true for unsubscribe, section 3.10.1. Moreover, both allow multiple topics to be included in the payload, increasing the probability that the payload will be longer than 127 octets. Thus there seems to be no difference between v3.1.1 and v5 with respect to encoding the length of the payload.

@peterhinch
Owner

The V5 spec 3.8.1 is specific:

This is the length of Variable Header plus the length of the Payload, encoded as a Variable Byte Integer.

However, I've not spotted anything to that effect in the V3.1.1 spec. It would simplify the code if both versions use VBI and remove the need for a guard. Please can you point me at a specific statement to that effect in the V3.1.1 spec?

@bobveringa
Contributor

bobveringa commented Jan 19, 2025

I only have access to my phone at the moment (and MQTT docs are not ideal to read on a phone).

But I think section 2.2.3 in the v3.1.1 spec is clear on the matter.

The Remaining Length is the number of bytes remaining within the current packet, including data in the variable header and the payload. The Remaining Length does not include the bytes used to encode the Remaining Length.

I think that it is fair to assume this applies to all packets. The only reason to doubt this is that a lot of packets say byte 2... - remaining length but for subscribe it just says byte 2. But given section 2.2.3 I think this is just a visualization error.

EDIT: I just looked at the source code of paho.mqtt (python) and it seems like they use a variable length header for (un)subscribe on both v5 and v3.1.1.

@wnelis
Author

wnelis commented Jan 20, 2025

Using the paho mqtt client, version 2.1.0, with MQTT version 3.1.1 selected, a subscription to a topic with a name of 131 (ASCII) characters was performed. The subscription was successful. The network packets were captured with tcpdump and analyzed with Wireshark. Wireshark showed a remaining length of 136 octets, encoded in two octets as 0x88 0x01, in the subscribe control message.
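The observed header can be reproduced with a small sketch that builds a V3.1.1 SUBSCRIBE packet for a single topic. The function `build_subscribe` is hypothetical and is neither the paho nor the mqtt_as implementation; the packet identifier 1 is an arbitrary choice.

```python
import struct


def build_subscribe(topic, pid, qos=0):
    """Build a V3.1.1 SUBSCRIBE packet for one topic (hypothetical helper)."""
    t = topic.encode()
    # packet identifier (2) + topic length prefix (2) + topic + requested QoS (1)
    sz = 2 + 2 + len(t) + 1
    pkt = bytearray(b"\x82")  # SUBSCRIBE, with the reserved flag bits 0010
    while sz > 0x7F:  # variable-length Remaining Length
        pkt.append((sz & 0x7F) | 0x80)
        sz >>= 7
    pkt.append(sz)
    pkt += struct.pack("!H", pid)
    pkt += struct.pack("!H", len(t)) + t
    pkt.append(qos)
    return bytes(pkt)


pkt = build_subscribe("t" * 131, pid=1)
header = pkt[:3]  # type octet followed by the two length octets
```

For a 131-character topic the remaining length is 2 + 2 + 131 + 1 = 136, and the first three octets come out as 0x82 0x88 0x01, matching the capture.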

@peterhinch
Owner

I now have a build which supports long topics, both for publication and subscription. This works under V3.1.1 and V5.

A historical note.

mqtt_as was conceived as an adaptation of the official library on the assumption that the implementation of the V3.1.1 protocol was correct. This bug originated here. It's presumably never been spotted because topics in microcontroller applications tend to be short.

@peterhinch
Owner

I have raised this issue against the official library.


3 participants