ECC in OpenPGP (as of March 2021)

In this bug report, I describe how we can improve the OpenPGP specification for ECC, so that different implementations can interoperate well and an implementation can avoid bugs.

Firstly, I point out a single issue: an opaque data element should be available in OpenPGP. I name it Strange Octet String (SOS).

Secondly, I explain where we should use SOS to represent ECC things in OpenPGP.

Thirdly, I describe how ECC things were defined in OpenPGP, and I explain how they were implemented in GnuPG for classic ECC, then for modern ECC (Ed25519 and Curve25519).

Fourthly, I describe my bug in GnuPG 2.3-beta regarding Ed25519 keys and signatures, which I'm fixing now.

Fifthly, I explain the implementation of Ed448 signatures and X448 encryption in GnuPG 2.3-beta, using SOS.

Lastly, I conclude that SOS may be the best approach for OpenPGP.

1: Opaque Data Element should be available in OpenPGP

RFC 4880 and RFC 6637 were good when it was common to represent data in big-endian format.

These days, in modern ECC, things are defined in little-endian format.

In this situation, it's good to have a data element which can represent something in native format (or anything, hopefully).

Don't repeat, just refer

It's good for a specification to be able to simply refer to another specification when it uses the technology of the latter.

When we repeat definitions from another specification in our own, mistakes may creep in. It's good when we can avoid this.

Strange Octet String (SOS)

Here, I propose an opaque data element in OpenPGP, named Strange Octet String (SOS).

I had tried introducing a straightforward Octet String to OpenPGP (length in octets plus an opaque string), but for backward compatibility, I concluded that a "strange" one is better instead.

It is defined as:

An SOS consists of two pieces: a two-octet scalar that is the length of the SOS in bits followed by an opaque string of octets.

The most important point is that no semantics are given to the data element itself; its interpretation is up to the underlying crypto algorithm.

Another point is that it has exactly the same structure as an MPI; thus, every MPI can be handled as an SOS.

When an SOS data object is handled as an MPI, it may look "strange", as it may include leading zeros, which are not allowed in a well-formed MPI.
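As a minimal sketch of the definition above (not the GnuPG code), reading an SOS could look like this in Python. The names parse_sos and is_wellformed_mpi are hypothetical, and the example octets are made up for illustration. The only thing taken from the definition is the layout: a two-octet big-endian bit count followed by the octets.

```python
import struct


def parse_sos(buf: bytes, offset: int = 0):
    """Read one SOS (or MPI) at offset.

    Returns (declared_bits, payload, next_offset). The payload is kept
    as opaque octets; no integer conversion is done here.
    """
    if len(buf) - offset < 2:
        raise ValueError("truncated length field")
    (bits,) = struct.unpack_from(">H", buf, offset)
    nbytes = (bits + 7) // 8          # same rule as for an MPI
    start = offset + 2
    payload = buf[start:start + nbytes]
    if len(payload) != nbytes:
        raise ValueError("truncated payload")
    return bits, payload, start + nbytes


def is_wellformed_mpi(bits: int, payload: bytes) -> bool:
    """True if the declared bit count equals the exact bit length of the
    payload read as a big-endian integer (so no leading zeros)."""
    return int.from_bytes(payload, "big").bit_length() == bits


# A well-formed MPI is also a valid SOS.
bits, payload, _ = parse_sos(bytes.fromhex("000901ff"))
print(bits, payload.hex(), is_wellformed_mpi(bits, payload))

# An SOS with a leading zero octet parses fine, but looks "strange" as an MPI.
bits, payload, _ = parse_sos(bytes.fromhex("00100001"))
print(bits, payload.hex(), is_wellformed_mpi(bits, payload))
```

The parser does not care whether the payload is an integer, an EC point, or something else; only the check in is_wellformed_mpi tells the two views apart.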

2: Let us use SOS for ECC things

I'd like to propose using SOS for ECC things. We can use SOS for the following data.

  • EdDSA Signature
    • R: to represent an EC point, use SOS
    • S: to represent an integer, use SOS
  • EdDSA Public Key
    • For an EC point, use SOS
  • ECDH Public Key with modern curves
    • For an EC point, use SOS
  • EdDSA Secret Key
    • Just use SOS
  • ECDH Secret Key with modern curves
    • Just use SOS
  • ECDH Ephemeral Public Key with modern curves
    • For an EC point, use SOS

(If needed,) existing definitions for ECC can replace MPI with SOS.

3: ECC in RFC 6637, Ed25519 and Curve25519 in OpenPGP

You might say it's a mess. But that's how it evolved.

There are (somewhat) old standards under SECG (Standards for Efficient Cryptography Group), and others which followed them. Those use big-endian format. Curves in this category include secp256k1, NIST P-256, NIST P-384, NIST P-521, and the Brainpool curves. I call ECC with such curves "classic".

These days, we have another category called "SafeCurves", which includes Curve25519 and Curve448. I call ECC with SafeCurves "modern". Those use little-endian format.

As for the OpenPGP specification, so far it only defines classic ECC.

Implementations have already adopted modern ECC; specifically, Ed25519 signatures and encryption with ECDH on Curve25519 are common. Implementations took 22 as the EdDSA algorithm number. EdDSA requires a new algorithm number because its signature components are defined differently: while EdDSA's (R,S) are an EC point and an integer, ECDSA's (R,S) are two integers.

Note that an implementation like GnuPG has infrastructure which was written in the days of the big-endian formats of DH, ElGamal, DSA, and RSA.

Naturally, classic ECC was introduced to GnuPG using this big-endian infrastructure.

After that, Ed25519 signatures were implemented in GnuPG. Then, ECDH with Curve25519 was implemented.

Classic ECC with MPI

In RFC 6637, keys and ECDH messages are defined using MPI as:

9. Encoding of Public and Private Keys
[...]
   Algorithm-Specific Fields for ECDSA keys:
[...]
      o  MPI of an EC point representing a public key
[...]
     Algorithm-Specific Fields for ECDH keys:
[...]
         -  MPI of an EC point representing a public key
[...]
     Algorithm-Specific Fields for ECDH or ECDSA secret keys:
[...]
      o  an MPI of an integer representing the secret key, which is a
         scalar of the public EC point
[...]
10.  Message Encoding with Public Keys
[...]
   Algorithm-Specific Fields for ECDH:
[...]
      o an MPI of an EC point representing an ephemeral public key

And the ECDSA signature is effectively defined using MPI as:

Algorithm-Specific Fields for ECDSA signatures:

  - MPI of ECDSA value r.

  - MPI of ECDSA value s.

Ed25519 data format

If I were to describe the current format for Ed25519...

Keys are something like this:

9. Encoding of Public and Private Keys
[...]
   Algorithm-Specific Fields for EdDSA keys:
[...]
      o  MPI of an EC point representing a public key with prefix 0x40
         so that we can avoid removing leading zeros.
[...]
     Algorithm-Specific Fields for EdDSA secret keys:
[...]
      o  an MPI of an integer representing the secret key,
         in little-endian format, removing leading zeros to be
         a well-formed MPI.

Signatures are something like this:

Algorithm-Specific Fields for EdDSA signatures:

  - MPI of R of EdDSA, which represents an EC point without
    prefix 0x40, removing leading zeros to make it a well-formed
    MPI.

  - MPI of EdDSA value S, an integer in little-endian format,
    removing leading zeros to be a well-formed MPI.

Please note that we don't put 0x40 in front of the EC point in the R part of a signature. Please also note that we have to remove leading zeros to form well-formed MPIs for secret keys (non-encrypted) and for the R and S parts of a signature.
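To make the two notes above concrete, here is a minimal Python sketch (not the GnuPG code). The function names are hypothetical. It assumes that every native Ed25519 value is 32 octets, and that the reader recovers removed zeros by left-padding the MPI payload back to 32 octets.

```python
def mpi_encode(octets: bytes) -> bytes:
    """Encode octets as a well-formed MPI: leading zero octets are
    removed and the bit count starts at the most significant set bit."""
    value = int.from_bytes(octets, "big")
    bits = value.bit_length()
    return bits.to_bytes(2, "big") + value.to_bytes((bits + 7) // 8, "big")


def ed25519_pubkey_mpi(point: bytes) -> bytes:
    """Public key: prefix 0x40 goes first, so the first octet is never
    zero and nothing is removed from the 32-octet point."""
    if len(point) != 32:
        raise ValueError("expected a 32-octet point")
    return mpi_encode(b"\x40" + point)


def recover_fixed(payload: bytes, size: int = 32) -> bytes:
    """Undo the zero removal for R, S, or a secret key."""
    if len(payload) > size:
        raise ValueError("payload longer than the native size")
    return payload.rjust(size, b"\x00")


# A made-up 32-octet value which happens to start with two zero octets.
r_native = b"\x00\x00" + bytes(range(1, 31))

pub = ed25519_pubkey_mpi(r_native)
print(len(pub) - 2)                  # 33 octets: prefix plus full point

r_mpi = mpi_encode(r_native)         # no prefix for R in a signature
print(len(r_mpi) - 2)                # shorter than 32: zeros were removed
print(recover_fixed(r_mpi[2:]) == r_native)
```

With the prefix, the bit count of a public key is always 263 (7 bits of 0x40 plus 256 bits). Without the prefix, the length of R or S on the wire depends on the value, and the reader has to know the native size to get the octets back.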

Curve25519 data format

If I were to describe the current format for Curve25519, it would be something like this:

9. Encoding of Public and Private Keys
[...]
     Algorithm-Specific Fields for ECDH keys:
[...]
         -  MPI of an EC point representing a public key with prefix 0x40
            so that we can avoid removing leading zeros.
[...]
     Algorithm-Specific Fields for ECDH secret keys:
[...]
      o  an MPI of an integer representing the secret key, which is a
         scalar of the public EC point (!!big-endian integer!!)
[...]
10.  Message Encoding with Public Keys
[...]
   Algorithm-Specific Fields for ECDH:
[...]
      o an MPI of an EC point representing an ephemeral public key
        with prefix 0x40 so that we can avoid removing leading zeros.

It's a mixture of little-endian and big-endian, unfortunately.
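The big-endian secret key can be illustrated with a small Python sketch (not the GnuPG code). The function names are hypothetical. The sketch assumes that "big-endian integer" here means the octets of the native 32-octet little-endian X25519 secret in reversed order, stored as a well-formed MPI payload.

```python
def native_to_mpi_payload(secret_le: bytes) -> bytes:
    """Native little-endian secret -> big-endian MPI payload
    (leading zero octets removed, as a well-formed MPI requires)."""
    if len(secret_le) != 32:
        raise ValueError("expected a 32-octet secret")
    return secret_le[::-1].lstrip(b"\x00")


def mpi_payload_to_native(payload: bytes) -> bytes:
    """Big-endian MPI payload -> native little-endian secret."""
    if len(payload) > 32:
        raise ValueError("payload longer than 32 octets")
    return payload.rjust(32, b"\x00")[::-1]


# A made-up secret whose two most significant octets (the last two in
# little-endian order) are zero.
secret = bytes(range(1, 31)) + b"\x00\x00"
payload = native_to_mpi_payload(secret)
print(len(payload))
print(mpi_payload_to_native(payload) == secret)
```

So the public key and the ephemeral key keep their native octet order behind the 0x40 prefix, while the secret key goes through an octet reversal on writing and on reading.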

4: Ed25519 interoperability issue in GnuPG 2.3-beta (currently being fixed)

In GnuPG 2.3-beta, Ed25519 keys and signatures are handled as SOS, so we have an interoperability issue for Ed25519.

See:

5: Ed448 and X448 in GnuPG 2.3-beta, using SOS

Keys and signatures are simply defined using SOS; there is no prefix, and no removal or recovery of zeros.

As described above, it may look a bit strange as OpenPGP data if you interpret it as an MPI (it may include leading zeros), but it is much simpler.

Conclusion

For new things, like Ed448 and X448, it's better to use SOS.

Lastly, let me make an excuse. Curve25519 in GnuPG was implemented that way because the intention at that time was to align with RFC 6637 and the existing implementation (the X25519 specification was not yet available).