The Kerberos.NET library relies on a collection of cryptographic primitives to implement the protocol. This post describes what those primitives are and how they're used.

Kerberos as used in the real world is an implementation of a handful of specifications. If you take a look at the deep dive document you can see the various specs that make up the functional aspects of the protocol. There are additional specs that define the cryptographic algorithms used by Kerberos:

You can follow those specs further down the rabbit hole into distinct primitives. These specs in particular define the characteristics of things like key derivation, padding, negotiation, etc.

This library implements the RFCs above and, with one exception, relies on the platform to do any real crypto operations. All of these primitives are exposed via a platform abstraction layer, the CryptoPal. This Pal exposes the dozen or so operations required by the library and in most cases just wraps .NET's APIs.

The code is structured into three broad areas: publicly consumable APIs (CryptoService, KerberosCryptoTransformer, DecryptedKrbMessage, etc.), algorithm-specific implementations (Aes128Transformer, etc.), and platform-specific primitive implementations or wrappers (BCryptDiffieHellman, etc.).

There is one primitive implemented directly in the library: RC4. It's only there for backwards compatibility and should not be used.

The publicly consumable APIs are the Kerberos-specific algorithms. They directly implement the above RFCs, including things like key-stretching and derivation. The logical structure looks something like this:

[Figure: kerbcrypto.png]

There’s a common problem that many applications run into when executing cryptographic operations: the keys they use tend to exist within the application itself. This is problematic because there’s no protection of the keys. They’re recoverable if you get a dump of the application memory, or if you’re able to execute arbitrary code within the application. The solution to this problem is relatively straightforward: keep the keys out of the application.

In order for that to be effective you also need to move the crypto operations out of the application. This means any attacks on the application won’t yield the keys. This is the fundamental design for services like Azure Key Vault, Amazon CloudHSM, or any other HSM service or device. In fact, even Windows subscribes to this design with CNG keys using LSASS.

I was thinking about this problem over the weekend and realized there isn’t really any good reference architecture out there that shows you how to build this design into your application. The services I mentioned earlier do great jobs at protecting secrets, but they’re kind of designed with certain applications in mind — greenfield, cloud first, or disconnected. This can make it difficult to migrate existing applications to use these services, or maybe you just can’t use them for whatever reason (regulatory, legal, etc.). On top of all that, you don’t even know how to start.

So, I did something dumb. I built a reference implementation that lets you move crypto operations out of your application and into a separate process.

Introducing: Enclave.NET.

READ THIS SECURITY WARNING

This is a reference implementation. That means it’s not designed to handle production loads, and absolutely is not built to withstand attack. It’s a sample intended to show how you might offload certain operations. There are probably some horrible bugs in here and there might even be vulnerabilities.

Was that sufficiently scary?

The basic idea is simple. You have an application that hosts a service. The service has a set of commands available to it:

  • Generate key
  • Encrypt
  • Decrypt
  • Sign
  • Validate

Your application calls these commands with the necessary payloads, the service does the thing, and returns a result. The service is HTTP-based, and protected with pinned client certificates.

The crypto operations are real, backed by the jose-jwt library, but they’re ephemeral — the keys are just stored in memory. The idea is that you can inject your own implementations as you see fit, so any migrations you might undergo can be gradual and painless.

There are two classes that need to be implemented — the crypto operations class, and the storage service. You can use the built-in crypto operations class InMemoryCryptoProcessor at your own risk, but you absolutely need to implement the storage, lest you lose all the keys when the app shuts down.

Crypto Processor

Storage Service

You can modify the startup code on your own, or you can implement IStartupTransform and configure it:

Calling the service is simple:

For more information take a look at the sample app.

It’s been a few months since there’s been any public activity on the project but I’ve quietly been working on cleaning it up and there’s even been a PR from the community (thanks ZhongZhaofeng!).

Part of that cleanup process has been adding support for AES 128/256 tokens. At first glance you might think it’s fairly trivial to do (just run the encrypted data through an AES transform and you’re good to go), but let me tell you: it’s not that simple.

On Securing Shared Secrets

There’s primarily one big difference between how RC4 and AES are used in Kerberos, which is that AES salts the secret (DES actually does this too, but DES is dead), whereas RC4 just uses an MD4 hash of the secret. That means you need to think about how you store secrets when using the two algorithms. With RC4 you can store either the cleartext or the hash and still use it anywhere. With AES you need the cleartext secret any time you reconfigure things, because the salt (more on this later) may change. That means the secret isn’t portable, which is a good thing. The problem, though, is that the salt uses information that isn’t available at runtime, meaning there’s an extra step involved when setting up the service. This is frustrating.

On Calculating Salts

A salt is just a value that makes a secret harder to guess or reuse. Generally salts are just random bytes or strings you tack on to the end of a secret, but in the Kerberos case the salt is a special computed value based on known attributes. This actually serves a purpose because the salt can be derived from a token without needing extra information; the token would need to carry extra information if the salt were random, which might break compatibility or just increase the token size. The Kerberos spec is a bit vague about how this is supposed to work, but it’s basically just concatenating the realm with the principal names of the service:

salt = [realm] + [principalName0]...[principalNameN]
salt = FOO.COMserver01

This actually works well because it’s simple to compute and solves the portability problem. Both the ticket issuer and the ticket receiver know enough to encrypt and decrypt the token.
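
As a rough illustration of the formula above, here is a minimal Python sketch. The function name `kerberos_salt` is made up for this example and is not part of Kerberos.NET; it assumes the realm and name components are plain strings encoded as UTF-8.

```python
from typing import Sequence


def kerberos_salt(realm: str, principal_names: Sequence[str]) -> bytes:
    """Default-style salt: the realm followed by every principal name component.

    Hypothetical helper for illustration only.
    """
    return (realm + "".join(principal_names)).encode("utf-8")


# Same inputs as the example above.
print(kerberos_salt("FOO.COM", ["server01"]))  # b'FOO.COMserver01'
```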

Enter Active Directory.

For reasons not entirely clear, Active Directory decided to do it a little differently. At least it’s documented. The difference is not trivial and introduces a piece of information that’s not in the request, which means you need prior knowledge for this to work: the new computation requires the samAccountName of the service account without any trailing dollar signs.

salt = [realm.ToUpper()] + "host" + [samAccountNameWithoutDollarSign] + "." + [realm.ToLower()]
salt = FOO.COMhostserver01.foo.com

Yeah, I don’t know either.
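
For comparison, a minimal Python sketch of the Active Directory variant, following the formula above literally. The name `ad_service_salt` is hypothetical, and the sketch only strips trailing dollar signs; it does not attempt any other normalization a real implementation might need.

```python
def ad_service_salt(realm: str, sam_account_name: str) -> bytes:
    """Active Directory style salt, mirroring the formula in the text.

    Hypothetical helper for illustration only.
    """
    # Drop any trailing '$' from the account name (e.g. "server01$").
    name = sam_account_name.rstrip("$")
    salt = realm.upper() + "host" + name + "." + realm.lower()
    return salt.encode("utf-8")


print(ad_service_salt("foo.com", "server01$"))  # b'FOO.COMhostserver01.foo.com'
```

Note that the account name is an input here, which is exactly the prior knowledge the request itself doesn’t carry.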

So now you have a couple of options. You can either pre-compute the secret/salt combo one time and just store that somewhere, or store the two values and use them at runtime. Pre-computing the key isn’t a bad idea. It’s actually how most environments work, by generating a keytab file. This means the secret doesn’t need to be known by the service, but it adds a deployment step.

I recommend just storing the computed value.

On Computing the Key

I mentioned previously that pre-computing the key means the secret isn’t known by the service. This is because the key itself is cryptographically derived from the secret and salt in an overly convoluted way:

tkey = random2key(PBKDF2(secret, salt, iterations, keyLength))
key = DK(tkey, "kerberos")

Basically, feed the secret and salt into the PBKDF2 key derivation function for a given number of iterations, output a key of the length needed by the AES flavor, and then run that through the Kerberos DK function with the known constant string “kerberos”. Oh, and the iteration count is configurable! That means different implementations can select different values AND IT’S NOT INCLUDED ANYWHERE IN THE REQUEST BUT THAT’S OKAY JUST USE THE DEFAULT OF 4096. Oh yeah, and random2key does nothing.
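
The PBKDF2 half of that is easy to sketch with Python’s standard library. This is only the `tkey` step, using HMAC-SHA1 as the PBKDF2 PRF (which is what RFC 3962 specifies); the DK step is left out because it needs an AES block cipher, which the standard library doesn’t provide. The name `string_to_tkey` and the sample password are made up for this example.

```python
import hashlib

DEFAULT_ITERATIONS = 4096  # the default mentioned above


def string_to_tkey(secret: str, salt: bytes, key_length: int,
                   iterations: int = DEFAULT_ITERATIONS) -> bytes:
    """PBKDF2-HMAC-SHA1 over the secret and salt; random2key is the identity.

    Hypothetical helper: the result still has to go through DK(tkey, "kerberos")
    before it is a usable Kerberos key.
    """
    return hashlib.pbkdf2_hmac("sha1", secret.encode("utf-8"), salt,
                               iterations, dklen=key_length)


salt = b"FOO.COMhostserver01.foo.com"
tkey_aes128 = string_to_tkey("P@ssw0rd!", salt, key_length=16)
tkey_aes256 = string_to_tkey("P@ssw0rd!", salt, key_length=32)
print(len(tkey_aes128), len(tkey_aes256))  # 16 32
```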

On Decrypting the Token

Decrypting the token is a relatively tame process by comparison. You take that computed key and run it through that DK function again, this time including the expected usage (KrbApReq vs KrbAsReq etc.) and various other constants a few times, and then run the ciphertext through the AES transform with an empty initialization vector (16 0x0 bytes). Of course, the AES mode isn’t something normal; it uses CTS mode, which is complicated and probably insecure, so much so that .NET doesn’t implement it. Thankfully it’s relatively easy to bolt on to the AES CBC mode.
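
To show what bolting CTS on to CBC looks like, here is a minimal Python sketch of the swap-the-last-two-blocks variant as I understand it from RFC 3962. It is not the library’s code. The block cipher is passed in as a function, and `toy_encrypt`/`toy_decrypt` are deliberately insecure stand-ins so the example runs without a crypto dependency; a real implementation would plug in single-block AES instead.

```python
from typing import Callable

BLOCK = 16
BlockFn = Callable[[bytes], bytes]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def cts_encrypt(encrypt_block: BlockFn, plaintext: bytes,
                iv: bytes = bytes(BLOCK)) -> bytes:
    if len(plaintext) < BLOCK:
        raise ValueError("CTS needs at least one full block")
    # Zero-pad to a block boundary and run ordinary CBC.
    padded = plaintext + bytes(-len(plaintext) % BLOCK)
    blocks, prev = [], iv
    for i in range(0, len(padded), BLOCK):
        prev = encrypt_block(_xor(padded[i:i + BLOCK], prev))
        blocks.append(prev)
    if len(blocks) == 1:
        return blocks[0]  # a single block is not swapped
    # Swap the last two blocks, then truncate to the original length.
    blocks[-2], blocks[-1] = blocks[-1], blocks[-2]
    return b"".join(blocks)[:len(plaintext)]


def cts_decrypt(decrypt_block: BlockFn, ciphertext: bytes,
                iv: bytes = bytes(BLOCK)) -> bytes:
    if len(ciphertext) < BLOCK:
        raise ValueError("CTS needs at least one full block")
    if len(ciphertext) == BLOCK:
        return _xor(decrypt_block(ciphertext), iv)
    tail_len = len(ciphertext) % BLOCK or BLOCK
    head = ciphertext[:len(ciphertext) - BLOCK - tail_len]
    c_last = ciphertext[len(head):len(head) + BLOCK]   # full final CBC block
    c_prev_part = ciphertext[len(head) + BLOCK:]       # truncated second-to-last block
    # The bytes "stolen" from the second-to-last block sit in the tail of D(c_last).
    c_prev = c_prev_part + decrypt_block(c_last)[tail_len:]
    # Undo the swap and run ordinary CBC decryption.
    blocks = [head[i:i + BLOCK] for i in range(0, len(head), BLOCK)]
    blocks += [c_prev, c_last]
    out, prev = [], iv
    for block in blocks:
        out.append(_xor(decrypt_block(block), prev))
        prev = block
    return b"".join(out)[:len(ciphertext)]


# Toy, NOT secure, invertible 16-byte "cipher" used only to exercise the mode.
_TOY_KEY = bytes(range(BLOCK))


def toy_encrypt(block: bytes) -> bytes:
    x = _xor(block, _TOY_KEY)
    return x[1:] + x[:1]


def toy_decrypt(block: bytes) -> bytes:
    x = block[-1:] + block[:-1]
    return _xor(x, _TOY_KEY)


message = b"sixteen byte hdr" + b"ticket payload"
ciphertext = cts_encrypt(toy_encrypt, message)
print(len(ciphertext) == len(message))
print(cts_decrypt(toy_decrypt, ciphertext) == message)
```

The point of the mode is visible in the first print: the ciphertext is exactly as long as the plaintext, with no padding to strip.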

You do get the real token once you run it through the decryptor though. But then you need to verify the checksum using a convoluted HMAC SHA1 scheme.
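
For reference, RFC 3962 describes the checksum as HMAC-SHA1 truncated to its first 96 bits, computed over the decrypted bytes with a separate integrity key. A minimal Python sketch of just that truncate-and-compare step follows; it assumes the integrity key (here `ki`, a hypothetical name) has already been derived, and it does not reproduce how this project computes the checksum.

```python
import hashlib
import hmac

CHECKSUM_LENGTH = 12  # 96 bits


def hmac_sha1_96(ki: bytes, decrypted: bytes) -> bytes:
    """HMAC-SHA1 over the decrypted bytes, truncated to 96 bits."""
    return hmac.new(ki, decrypted, hashlib.sha1).digest()[:CHECKSUM_LENGTH]


def checksum_matches(ki: bytes, decrypted: bytes, expected: bytes) -> bool:
    # compare_digest avoids leaking where the first mismatching byte is.
    return hmac.compare_digest(hmac_sha1_96(ki, decrypted), expected)


ki = bytes(16)  # placeholder key for the example only
data = b"decrypted token bytes"
tag = hmac_sha1_96(ki, data)
print(len(tag), checksum_matches(ki, data, tag))  # 12 True
```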

You may notice all this exists in a separate project. This is because the HMAC scheme requires access to intermediate bits of the SHA1 transform, which aren’t available using the built in .NET algorithms. Enter BouncyCastle. 🙁

Funny story: I originally thought I needed BouncyCastle for AES CTS support, but after finding their implementation broken (or more likely my usage of their implementation) I found out how to do it using the Framework implementation. If it wasn’t for the checksum I wouldn’t need BouncyCastle — doh!

That said, if anyone knows how to do this using the native algorithms please let me know or fork and fix!

The resulting value is then handed back to the core implementation and we don’t care about the AES bits anymore (well, after repeating the same process above for the Authenticator).

On Cleaning Up the Code

I mentioned originally that I did some cleanup as well. This was necessary to get the AES bits added in without making it hacky and weird. Now keys are treated as special objects instead of just as bytes of data. This makes them easier to secure in the future (encrypt in memory, or offload to secure processes, etc.).

I also added in PAC decoding because you get some pretty useful information like group memberships.

There’s still plenty to do to make this production worthy, but those are relatively simple to do now that all this stuff is done.