$\begingroup$

Some time ago, someone asked in chat why MTProto (Telegram's protocol) is supposedly worse than Axolotl (Signal's protocol), given that both protocols were invented by their respective companies and both therefore amount to "rolling your own crypto", which is generally agreed to be bad.

To be clear, calling MTProto "worse" was my own wording and should not influence answers.

To get a more complete answer to this question of wide interest, I'm asking for a canonical answer that compares MTProto with Axolotl and relies on well-defined, well-accepted cryptographic knowledge and standards. A responder may (but does not have to) conclude by recommending one protocol over the other.

An answer may include the following points, but is neither required to cover them nor limited to them:

  1. Broken security ("On the CCA (in)security of MTProto" by Jakobsen and Orlandi).
  2. Highly non-standard modes (e.g. IGE).
  3. Highly discouraged modes (e.g. MAC-and-Encrypt).
$\endgroup$

3 Answers

$\begingroup$

Alright, I'll bite.

First, let me propose bounding the discussion to just the core of the protocol.

In particular, let's not get hung up on:

  • Social engineering attacks
  • How broadly the end-to-end encryption is applied (i.e. are all conversations in the app encrypted?)
  • The backgrounds of the inventors and reasoning for inventing the protocol
  • Metadata leakage (i.e. who is talking to whom, which is pretty much the same between the two protocols)

There's a surprising amount of vitriol floating around the internet on those points, NONE of which is germane to the discussion at hand.


MTProto

EDIT: the following is specific to MTProto 1.0. MTProto 2.0 was released in late 2017. Some of the potential problems discussed below have been fixed (e.g. they migrated from SHA-1 to SHA-256), while others have not seen updates (e.g. MTProto 2.0 still uses IGE). I encourage people to read the new spec (the URL is the same; the content has been updated). I would also encourage any sufficiently motivated, enterprising individual to write a new answer that takes the updates to the protocols into account. :)


Let's quickly review the basics of MTProto 1.0:

(Full specification for those interested: spec v1 )

The protocol starts with a standard Diffie-Hellman key exchange. Users can compare a hash of the generated key, exposed as either an image or a hex code, which the two parties exchange over an existing secure channel (preferably in person). There was a theoretical vulnerability where the original image wasn't large enough (and hence the key exchange could be man-in-the-middled by someone with sufficient resources), but that has reportedly been fixed by increasing the size of the image / hex printout. The session key ("auth_key") is derived from this original key (see the spec for particulars).
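To make the key-comparison step concrete, here is a minimal sketch of a Diffie-Hellman exchange followed by a fingerprint check. Everything in it is an assumption chosen for illustration: the small prime `P`, the base `G`, the use of SHA-256 for the fingerprint, and the function names. None of it is taken from the MTProto spec, which uses a much larger group and its own fingerprint format.

```python
import hashlib
import secrets

# Toy parameters for illustration only: a real deployment uses a vetted
# 2048-bit (or larger) group, not this 127-bit Mersenne prime.
P = 2**127 - 1
G = 3


def dh_keypair():
    """Return (private exponent, public value G^x mod P)."""
    private = secrets.randbelow(P - 3) + 2  # 2 <= private <= P - 2
    return private, pow(G, private, P)


def dh_shared(private, peer_public):
    """Both sides compute the same value G^(xy) mod P."""
    return pow(peer_public, private, P)


def fingerprint(shared):
    """Hypothetical fingerprint: hex digest of a hash of the shared secret."""
    raw = shared.to_bytes((P.bit_length() + 7) // 8, "big")
    return hashlib.sha256(raw).hexdigest()


a_priv, a_pub = dh_keypair()
b_priv, b_pub = dh_keypair()
a_key = dh_shared(a_priv, b_pub)
b_key = dh_shared(b_priv, a_pub)

# The users compare these out of band. A man in the middle would have
# negotiated a different key with each side, so the values would differ.
a_print = fingerprint(a_key)
b_print = fingerprint(b_key)
```

The shorter the displayed fingerprint, the cheaper it is for an attacker to search for two keys whose fingerprints match, which is why the size of the image or hex printout matters.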

The meat of the specification hinges on the message envelope. Briefly, it takes the payload, prepends a salt and session ID, hashes that (SHA-1), and uses that as the "msg key" (MAC). This also gets mixed with the auth_key (SHA-1 based KDF) to produce a per-message encryption key (256 bit) and initialization vector (256 bit). The salt, session ID, payload, and 0-15 bytes of arbitrary padding are then encrypted with AES in IGE mode. Finally, the auth_key_id, "msg key" (MAC) and encrypted data are concatenated to produce the final message.

Whew. Actually, the diagram from the spec makes much more sense.
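The same steps can also be read as code. Below is a minimal sketch of the "msg key" and key-derivation steps, using only Python's standard library. The function names are mine, the byte offsets are how I understand the v1 spec and should be checked against it, and the plaintext is reduced to salt + session ID + payload as in the description above. The AES-IGE step itself is left out because the standard library has no AES.

```python
import hashlib


def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def msg_key_v1(plaintext: bytes) -> bytes:
    """Low 128 bits of SHA-1 over the unpadded plaintext."""
    return sha1(plaintext)[4:20]


def kdf_v1(auth_key: bytes, msg_key: bytes, x: int = 0):
    """Derive a 256-bit AES key and a 256-bit IGE IV.

    x shifts the window into auth_key depending on message direction.
    The exact offsets are an assumption; verify them against the spec.
    """
    a = sha1(msg_key + auth_key[x:x + 32])
    b = sha1(auth_key[x + 32:x + 48] + msg_key + auth_key[x + 48:x + 64])
    c = sha1(auth_key[x + 64:x + 96] + msg_key)
    d = sha1(msg_key + auth_key[x + 96:x + 128])
    aes_key = a[0:8] + b[8:20] + c[4:16]
    aes_iv = a[8:20] + b[0:8] + c[16:20] + d[0:8]
    return aes_key, aes_iv


def pad_v1(plaintext: bytes, filler: bytes) -> bytes:
    """Append 0-15 arbitrary bytes so the length is a multiple of 16."""
    need = -len(plaintext) % 16
    return plaintext + filler[:need]


auth_key = bytes(range(256))            # placeholder 2048-bit key
salt, session_id = b"S" * 8, b"I" * 8   # placeholder 64-bit values
plaintext = salt + session_id + b"hello"

key = msg_key_v1(plaintext)
aes_key, aes_iv = kdf_v1(auth_key, key)

# Two different paddings of the same message share one msg key.
padded_1 = pad_v1(plaintext, b"\x00" * 15)
padded_2 = pad_v1(plaintext, b"\xff" * 15)
```

Note that the msg key covers only the unpadded plaintext, so `padded_1` and `padded_2` encrypt to different ciphertexts that both verify. As I understand it, that is the kind of malleability the chosen-ciphertext result relies on.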

MTProto makes what many cryptographers would consider odd choices. You pointed out several in your question.

  • A hash algorithm that's reaching the tail-end of its life (SHA1)
  • An old attempt at authenticated encryption (IGE), of which the authentication part has long-since been broken, but which MTProto doesn't actually use to provide authentication.
  • A discouraged mode (MAC-then-encrypt)
  • Vulnerability to a chosen-ciphertext attack (which may not have been a choice as such).
  • A non-standard padding algorithm (which is to say, "append whatever bytes catch your fancy")

Those are the sorts of decisions that give cryptographers the willies - and rightfully so. They've all been broken in various ways.

However, I'll point out that, just as choosing good primitives doesn't necessarily guarantee a secure protocol, choosing bad primitives doesn't necessarily guarantee an insecure protocol. Anecdotally, bad primitives have probably led to insecure protocols more often than good primitives have led to secure protocols, but that's beside the point. Bad primitives can still be used securely as long as you side-step the parts of them that are broken.

Let's examine that:

  • Would a break in SHA1 lead to a break in MTProto? (it would probably allow an attacker to forge a single chat message per collision; but we're not quite there yet as of 2016)
  • Does the known problem with IGE lead to a break in MTProto? (no; they're not depending on the authentication)
  • Is MAC-then-encrypt used improperly in MTProto? (no)
  • Does the IND-CCA attack invalidate the security guarantees of MTProto? (the authors admit not, at least tentatively)
  • Does the padding algorithm make MTProto susceptible to any attacks? (technically yes, since it's used in the IND-CCA attack, but again, it doesn't lead to known plaintext-recovery attacks)

Then why do those choices give cryptographers the willies? First, because of experience: even if two choices are not problems on the surface, they can sometimes interact in non-intuitive ways to produce real problems. In a sense, problematic choices are anecdotally multiplicative in complexity and probability of breakage. (By analogy, that's an inherent property of high-dimensional design spaces.) Second, cryptographers have to think about lots of unknowns: are nation-state attackers using unpublished vulnerabilities in any of those underlying algorithms? Those are particularly concerning since, again, they can interact badly with other design decisions.


Axolotl / Signal Protocol

The Signal Protocol (formerly known as Axolotl) is, to some degree, an orange to MTProto's apple. It solves different problems and is concerned with a slightly different security model.

For example, Signal is primarily concerned with key derivation and updates/ratcheting (the Axolotl Ratchet, now called the Double Ratchet Algorithm; DRA for short), while MTProto is primarily concerned with the message envelope format, assuming that key exchange has already taken place via fairly generic Diffie-Hellman.

The goal of the Double Ratchet algorithm is to minimize the impact of compromised keys. This article explains how it works better than I ever could, but the gist is this:

DRA takes the best properties of DH ratcheting (used in OTR - the "Off The Record" IM protocol) and symmetric key ratcheting (used in SCIMP, the Silent Circle Instant Messaging Protocol).

Breaking that down:

  • SCIMP uses a (one-way) KDF to derive a new key after every message is sent, and the previous key is promptly forgotten. This way, if the key is ever compromised, it can only be used to decrypt future messages, not past messages.
  • OTR renegotiates DH keys for every round-trip of messages between parties, and thus self-heals compromised keys after a round-trip has taken place. This limits the exposure window, but the window can extend into the past in this case, unlike with SCIMP.

DRA on the other hand, regenerates per-message keys like SCIMP, but also renegotiates the DH keys for every round-trip.
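The combination of the two ratchets can be sketched roughly as follows. This is a heavily simplified toy, not the Double Ratchet as specified: the real algorithm uses an elliptic-curve DH and HKDF and keeps separate sending and receiving chains. The toy group `P`, the labels, and the function names are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

P = 2**127 - 1  # toy group, illustration only
G = 3


def kdf(key: bytes, label: bytes) -> bytes:
    """One-way derivation step (HMAC-SHA256 as a stand-in for HKDF)."""
    return hmac.new(key, label, hashlib.sha256).digest()


def symmetric_step(chain_key: bytes):
    """SCIMP-style step: return (next chain key, per-message key)."""
    return kdf(chain_key, b"chain"), kdf(chain_key, b"message")


def dh_step(root_key: bytes, my_private: int, their_public: int):
    """OTR-style step: mix a fresh DH output into the root key.

    Returns (new root key, first chain key of the new chain).
    """
    shared = pow(their_public, my_private, P).to_bytes(16, "big")
    return kdf(root_key, b"root" + shared), kdf(root_key, b"chain0" + shared)


root = b"\x00" * 32  # placeholder for the initial shared secret
a_priv = secrets.randbelow(P - 3) + 2
b_priv = secrets.randbelow(P - 3) + 2
a_pub, b_pub = pow(G, a_priv, P), pow(G, b_priv, P)

# Sender: one DH step per round-trip, one symmetric step per message.
root_a, chain_a = dh_step(root, a_priv, b_pub)
chain_a, mk1 = symmetric_step(chain_a)
chain_a, mk2 = symmetric_step(chain_a)

# Receiver: the same DH output from the other side yields the same chain.
root_b, chain_b = dh_step(root, b_priv, a_pub)
chain_b, mk1_b = symmetric_step(chain_b)
chain_b, mk2_b = symmetric_step(chain_b)
```

Because `kdf` is one-way, learning `chain_a` after the second message does not reveal `mk1` or `mk2`. Because each round-trip mixes in a fresh DH output, an attacker who learned a chain key is locked out again once new DH keys are exchanged.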

The envelope used in the Signal Protocol is unremarkable (in a good way), when compared with MTProto:

  • Encrypt-then-MAC (using HMAC-SHA256)
  • AES in CBC mode, PKCS5 padding (>= version 3, at least)

No red flags there.
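For contrast with the MTProto envelope, here is a minimal sketch of the Encrypt-then-MAC pattern. To stay within Python's standard library, the cipher is a toy keystream built from SHA-256 standing in for AES. That substitution, the key and nonce handling, and the function names are assumptions for illustration and do not reflect Signal's actual message format.

```python
import hashlib
import hmac


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode), standing in for AES."""
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + nonce + i.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], block))
    return bytes(out)


def seal(enc_key, mac_key, nonce, plaintext):
    """Encrypt first, then MAC the nonce and ciphertext."""
    ciphertext = keystream_xor(enc_key, nonce, plaintext)
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return ciphertext, tag


def open_(enc_key, mac_key, nonce, ciphertext, tag):
    """Verify the MAC before touching the ciphertext, then decrypt."""
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("bad MAC")
    return keystream_xor(enc_key, nonce, ciphertext)


enc_key, mac_key, nonce = b"e" * 32, b"m" * 32, b"n" * 16
ciphertext, tag = seal(enc_key, mac_key, nonce, b"attack at dawn")
recovered = open_(enc_key, mac_key, nonce, ciphertext, tag)
```

The point of the ordering is that the tag covers every byte the receiver will process, so any modified ciphertext is rejected before decryption or padding removal ever runs.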

A formal analysis showed that the Signal Protocol (then called TextSecure) is vulnerable to an Unknown Key-Share attack, whereby B can trick A into sending message M to C, such that A believes it has actually sent M to B. The paper suggests a way to mitigate that attack, but I wasn't able to find any information as to whether that change has actually been applied.


Conclusion

In the envelope, Signal uses a simple Authenticated-Encryption (AE) system, whereas MTProto uses a custom wrapper that, through a series of odd choices, doesn't qualify as AE or IND-CCA. Although there are no known plaintext-recovery attacks in MTProto (yet), those properties could well lead to theoretical (or practical) attacks.

For key derivation, Signal has a good story to tell: compromising the current key will only compromise a set of future messages (possibly zero; until the next message from the other party arrives). However, in MTProto, compromising the authorization key (which is the closest equivalent key) allows all past and future messages to be compromised.

Personally, I (as a developer) wouldn't use MTProto except if I needed interoperability with Telegram. The folks behind the Signal Protocol (Open Whisper Systems) have fairly universally been praised for the security of their protocol, which makes it an obvious choice.

$\endgroup$
  • 1
    $\begingroup$ Great answer. Not sure I agree that meta-data leakage is not germane to crypto, but clearly the issue is broad enough without including that comparison here. $\endgroup$
    – otus
    Commented Jun 23, 2016 at 4:23
  • 1
    $\begingroup$ @otus - fair point. I reworded that sentence to be a little less contentious. $\endgroup$ Commented Jun 23, 2016 at 12:45
  • 1
    $\begingroup$ @JoshuaWarner Thanks for your solid answer but can you please update the answer to cover the new MTProto 2.0 that was released with Telegram 4.6 in early December 2017. I've read on some infosec blogs that the IND-CCA vulnerability has been fixed. It also now uses SHA-256 instead of SHA1 for hashing, as well as many other things. $\endgroup$ Commented Feb 1, 2018 at 18:34
  • 1
    $\begingroup$ @TaherElhouderi I don't have time at the moment to look in-depth, but I did update my answer to mention the new version and point out that the analysis is out of date, to avoid misleading any newcomers. $\endgroup$ Commented Feb 2, 2018 at 3:46
  • $\begingroup$ Very sweet write-up, thank you very much. Would you mind posting somewhere a blog-style comparison with the Keybase encryption model? Now in 2021 there is a big fight over whether Signal or Telegram is the best service to replace the other commonly used one. But what if Keybase was in fact the best platform? $\endgroup$
    – Mathieu J.
    Commented Jan 20, 2021 at 11:21
$\begingroup$

Besides Joshua Warner's excellent answer, I do also want to point out that someone has to "roll their own crypto" at some point for there to be any designs and implementations at all.

On that front, the Signal protocol was chiefly designed by Moxie Marlinspike, a cryptography and security researcher with a solid track record and credited with multiple published novel attacks against real-world cryptosystems. MTProto (the protocol behind Telegram) appears to have been developed chiefly by Nikolai Durov, a mathematician who (to my knowledge) has no public background in cryptography.

While neither of these points speaks directly to the security of either protocol, it's generally considered a positive thing to have protocol designers and implementers who have prior experience with both attack and design.

$\endgroup$
$\begingroup$

The differences between the protocols have been very well answered above, and there's little else to add. But I feel it would be remiss not to mention two related topics that should be borne in mind by someone taking the comparison seriously.

First, assuming that the messages are secure in transit, a logical place for a determined attacker to attempt to intercept messages is at either end of the communication channel; that is, by attempting to hack the app, or the OS hosting the app. A discussion of the relative attack surface of the main implementations of each protocol (the Signal app and the Telegram app) is out of scope, but interesting. Simple apps with fewer features have less to attack.

Second, an important factor in choosing the right messaging protocol for you may be its adoption rate. The more people use a service, and the more normal this behaviour is in aggregate, the harder any one message becomes to pick out from the crowd, and the less suspicious it looks. And, of course, an app that nobody else uses is only good for the lonely.

$\endgroup$
