In this article, I try to explain a specific way to improve the current (as of March 2021) situation of OpenPGP with regard to EdDSA and ECDH with modern curves.

Specifically, I suggest a way to introduce SOS. Even if you don't adopt the SOS methodology in your implementation of OpenPGP, this article should be useful for checking corner cases of OpenPGP using EdDSA and ECDH with modern curves.

## SOS definition

It is defined as:

An SOS consists of two pieces: a two-octet scalar that is the length of the SOS in bits followed by an opaque string of octets.

There may be two different interpretations for "the length of the SOS".

- It can be defined as 8 times the length (in octets) of the opaque string of octets.
- It can be defined as the length (in bits) of the big-endian number that the octets represent.

By allowing these two interpretations, any MPI can be interpreted as an SOS.

While an MPI has semantics as an integer, an SOS is a catch-all container that leaves the semantics to the underlying cryptographic algorithm.
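
To make the two interpretations concrete, here is a minimal sketch in Python. The function names (`read_sos`, `write_sos`, `write_mpi`) are my own for illustration and come from no specification or library. The reader accepts both interpretations of the length, because it only rounds the bit count up to whole octets.

```python
from typing import Tuple


def read_sos(buf: bytes, offset: int = 0) -> Tuple[int, bytes, int]:
    """Read one SOS at `offset`; return (declared_bits, octets, next_offset).

    Hypothetical helper: the octets are returned untouched, with no
    integer semantics applied by this layer.
    """
    if len(buf) < offset + 2:
        raise ValueError("truncated length field")
    declared_bits = int.from_bytes(buf[offset:offset + 2], "big")
    n_octets = (declared_bits + 7) // 8  # works for both interpretations
    start = offset + 2
    end = start + n_octets
    if end > len(buf):
        raise ValueError("truncated body")
    return declared_bits, buf[start:end], end


def write_sos(octets: bytes) -> bytes:
    """Write an SOS using the first interpretation (8 * length in octets)."""
    return (8 * len(octets)).to_bytes(2, "big") + octets


def write_mpi(value: int) -> bytes:
    """Write a classic MPI: exact bit length, leading zero octets removed."""
    bits = value.bit_length()
    return bits.to_bytes(2, "big") + value.to_bytes((bits + 7) // 8, "big")
```

With this sketch, `read_sos(write_mpi(0x1ff))` yields the octets `01 ff` with a declared length of 9 bits, so an MPI reads fine as an SOS. In the other direction, `write_sos(b"\x00\x01")` keeps the leading zero octet, while `write_mpi(1)` would drop it.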

## Classic ECC in OpenPGP using SOS

With SOS, the existing specification for classic ECC can simply replace the word "MPI" with "SOS". Namely, we will use SOS to represent a scalar and an EC point.

It may also improve the specification by removing one (minor) mistake.

IMHO, in RFC 6637 (Section 6. Conversion Primitives), we made this mistake:

The point is encoded in the Multiprecision Integer (MPI) format [RFC4880]. The content of the MPI is the following:

B = 04 || x || y

where x and y are coordinates of the point P = (x, y), each encoded in the big-endian format and zero-padded to the adjusted underlying field size. The adjusted underlying field size is the underlying field size that is rounded up to the nearest 8-bit boundary.

Here, we abuse an MPI to represent an EC point, although it's not an integer but a composite of two integers.

By using SOS, we can just refer to an external standard, like:

(When an OID is one of NIST Curves,) See RFC 8422 (Section 5.4.1. Uncompressed Point Format for NIST Curves), for its semantics.
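
As an illustration of the `04 || x || y` format quoted above, here is a small Python sketch. The helper names are hypothetical, and the code does not check that the point lies on the curve. It only shows the octet layout that an SOS would carry as an opaque string.

```python
from typing import Tuple


def encode_uncompressed_point(x: int, y: int, field_bits: int) -> bytes:
    """Build B = 04 || x || y with zero-padded big-endian coordinates."""
    # Adjusted underlying field size: rounded up to the nearest 8-bit boundary.
    size = (field_bits + 7) // 8
    return b"\x04" + x.to_bytes(size, "big") + y.to_bytes(size, "big")


def decode_uncompressed_point(blob: bytes, field_bits: int) -> Tuple[int, int]:
    """Split 04 || x || y back into the two coordinates (no curve check)."""
    size = (field_bits + 7) // 8
    if len(blob) != 1 + 2 * size or blob[0] != 0x04:
        raise ValueError("not an uncompressed point of the expected size")
    x = int.from_bytes(blob[1:1 + size], "big")
    y = int.from_bytes(blob[1 + size:], "big")
    return x, y
```

For a 256-bit field the result is always 65 octets. Because the first octet is 0x04, the value never loses leading zeros when treated as an MPI, which is why the "abuse" happens to work in practice.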

## Modern ECC in OpenPGP using SOS

We can use SOS to represent a scalar and an EC point.

With SOS, we can use the native format of the underlying cryptographic algorithm.

## Special care for things that were non-standard as of March 2021

In OpenPGP implementations, Ed25519 keys and signatures were introduced before this definition of SOS. Curve25519 keys and encryption by ECDH with Curve25519 were introduced before this definition of SOS, and before the definition of X25519.

Because of that, we need special care for such data.

### Special care for EdDSA with Ed25519

For keys, existing implementations use the prefix 0x40 to represent an EC point. On the other hand, this prefix is not used for the EC point in the R part of a signature.

Thanks to the prefix 0x40, leading-zero removal does not occur when a key is handled as an MPI. But implementations must support recovery of leading zeros for the R part and the S part of a signature.

If SOS had been introduced at that time, we would not have used the prefix, and with handling as an SOS there would have been no leading-zero removal.

For interoperability with existing implementations that use MPI handling, implementations that use SOS handling must still use the prefix and support recovery of leading zeros for Ed25519.
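
The two Ed25519 special cases can be sketched in Python as follows. The helper names and the constant are hypothetical. The size of 32 octets is the Ed25519 size for a public key and for each of R and S.

```python
ED25519_SIZE = 32  # octets in an Ed25519 public key, and in each of R and S


def recover_leading_zeros(octets: bytes, size: int = ED25519_SIZE) -> bytes:
    """Restore leading zero octets that MPI handling may have removed."""
    if len(octets) > size:
        raise ValueError("value longer than the native size")
    return octets.rjust(size, b"\x00")


def strip_key_prefix(octets: bytes, size: int = ED25519_SIZE) -> bytes:
    """Remove the 0x40 prefix from a public key, leaving the native point."""
    if len(octets) != size + 1 or octets[0] != 0x40:
        raise ValueError("expected 0x40 prefix followed by the native point")
    return octets[1:]
```

A reader would apply `recover_leading_zeros` to R and to S before passing them to the EdDSA routine, and `strip_key_prefix` to the public key. No zero recovery is needed for the key, because the prefix keeps its length fixed.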

### Special care for ECDH with Curve25519

For keys, existing implementations use the prefix 0x40 to represent an EC point. In a secret key, the secret scalar is represented as a big-endian integer.

If SOS **and** X25519 had been introduced at that time, we would not
have used the prefix, and the secret scalar would have been defined in
little-endian format. And we would call it ECDH with X25519.

For interoperability with existing implementations that use MPI handling, implementations that use SOS handling must still use the prefix, and the secret scalar must be interpreted as a big-endian integer.
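
For the secret scalar, the conversion amounts to zero recovery followed by a byte reversal. A minimal Python sketch, with a hypothetical function name, assuming the X25519 routine expects its 32-octet scalar in little-endian order:

```python
X25519_SIZE = 32  # octets in an X25519 scalar


def openpgp_secret_to_x25519(secret_octets: bytes, size: int = X25519_SIZE) -> bytes:
    """Convert the big-endian secret scalar stored in an OpenPGP key into
    the little-endian octet string used natively by X25519."""
    if len(secret_octets) > size:
        raise ValueError("secret scalar longer than the native size")
    # Recover leading zeros first (big-endian), then reverse the octet order.
    return secret_octets.rjust(size, b"\x00")[::-1]
```

The order of the two steps matters: padding after the reversal would put the zeros at the wrong end.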

## Why zero-removal, in the first place?

In the culture of ASN.1, BER requires an integer to be represented in the minimum number of octets. So does DER.

I suspect this is one of the causes of the problem.

With the SOS mindset, it should be the underlying cryptographic algorithm (layer) that does zero-removal, if needed. It must not be the OpenPGP layer that does zero-removal.

## Discussions

The idea of an SOS is to have the OpenPGP format itself just **convey**
the data of the underlying cryptographic algorithm, and to let the
underlying cryptographic algorithm define the data format natively.

We can consider other approaches. Here, we show three approaches.

### Practically easiest

Do for Ed448 just what was done for Ed25519 (prefix 0x40 for keys and no prefix for signatures, with zero-removal/zero-recovery). Do for Curve448 just what was done for Curve25519 (prefix 0x40 for keys and a big-endian secret scalar).

Pros: for an existing implementation (and possibly for many existing implementations), it might be easier to implement.

Cons: code complexity and quality can be sacrificed. Providing test vectors would require deeper consideration to cover different cases.

### Each data format definition by each curve

We can define the data format per curve in OpenPGP. For example, the data format of an Ed448 signature in OpenPGP can be defined specifically for Ed448.

Pros: an implementation may dispatch to the underlying cryptographic algorithm routine naturally. In the hypothetical situation where we start writing an OpenPGP implementation from scratch, it would be easier, because there are no layers.

Cons: it's unfriendly to non-standard things. An implementation cannot proceed when it encounters an unknown curve. The SOS approach allows an implementation to proceed (to other parts of the data), even if it cannot process the data for the unknown curve.
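
The point about proceeding past an unknown curve can be sketched like this. The function name and the idea of passing the number of fields as `count` are my assumptions for illustration. The only knowledge the code needs is the SOS framing, not the curve.

```python
from typing import List, Tuple


def split_sos_fields(buf: bytes, count: int) -> Tuple[List[bytes], bytes]:
    """Split off `count` SOS fields and return (fields, remaining_octets).

    Hypothetical helper: the field contents stay opaque, so this works
    even when the curve is unknown to the implementation.
    """
    fields = []
    pos = 0
    for _ in range(count):
        if pos + 2 > len(buf):
            raise ValueError("truncated length field")
        bits = int.from_bytes(buf[pos:pos + 2], "big")
        end = pos + 2 + (bits + 7) // 8
        if end > len(buf):
            raise ValueError("truncated body")
        fields.append(buf[pos + 2:end])
        pos = end
    return fields, buf[pos:]
```

An implementation can keep the opaque fields aside, report the curve as unsupported, and continue parsing from the remaining octets.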

### Non-strange but Just an Octet String

We can define a simpler Octet String (with its length in octets), say, JOS.

Pros: JOS would clearly be better if we were writing the OpenPGP specification from scratch.

Cons: introducing JOS into the current OpenPGP would require new algorithm numbers for EdDSA-JOS and ECDH-JOS.

## Conclusion (of mine)

SOS is a compromise to quickly introduce another modern curve of ECC, in a way that doesn't widen the wound.

While it looks a bit "strange", its backward compatibility with existing data is good. As for adding SOS support, it is friendly to existing implementations.