PKCS#11 and SSL Performance

Unless you have written consumers or providers of crypto functionality you probably don’t have much interaction with PKCS#11. What is it? Quoting from RSA Labs, “this standard specifies an API, called Cryptoki, to devices which hold cryptographic information and perform cryptographic functions”.

As a web server administrator you generally don’t have to deal with it either, even though you probably know that the JES Web Server supports PKCS#11 compatible devices for the crypto functionality used by SSL/TLS.

Recently I added a feature, planned for inclusion in an upcoming update of JES Web Server 7.0, which allows bypassing the PKCS#11 layer for SSL operations. What does that mean? Well, basically it means your server will be able to handle SSL requests faster ;-)

Here is a coarse high level diagram of the relationship between the various building blocks which comprise the SSL support in the web server:

As you can see, NSS always interacts with the crypto device via the PKCS#11 API, whether you’re using a crypto accelerator or the Solaris 10 crypto framework or even when using the built-in software token (softoken).

Of course, this adds some overhead. NSS must construct PKCS#11 structures and call the right APIs and the softoken must decode these structures and pass on the data to the underlying crypto primitives. Since the softoken is part of NSS one might argue that NSS could simply call the crypto functions directly. In a nutshell, that is what the PKCS#11 bypass does. This is conceptually represented in the diagram above by the dotted red arrow.

For most use cases this is a no-brainer – once this feature is released SSL will perform better. The bypass will be on by default so you don’t even need to do anything other than upgrade your server. It’ll just be faster (I should say even faster!).

There are some details to be aware of, however.

First, NSS’s interaction with the crypto device is “sticky”. If the server key (associated with the certificate given in the server-cert-nickname element of server.xml) lives in Token-A, operations related to SSL sessions initiated with this key will also be performed by Token-A. Note that this applies to all subsequent operations (as long as Token-A is capable of performing them), not only the handshake operations which directly use the private key material (those are always performed by the token in which the private key is stored, regardless of bypass!)

Bypass removes this “stickiness”. When bypass is active NSS will directly use its softoken for all subsequent operations, bypassing not just the PKCS#11 overhead but also the external device. In order for this to occur, some of the session-related key material must be extracted from the device and made available to the softoken. It is possible some devices might not support this operation, in which case bypass cannot occur. The good news is that the server will test for this condition – if the device cannot support this requirement, bypass is automatically disabled.

What if the external crypto device is faster than the softoken? Realistically, this is not very likely. It would have to be so much faster that it can overcome not only the overhead of the PKCS#11 calls but also the overhead of transporting the data to and from the device. But as with all performance-related tunables, you should test both settings in an environment which matches your production setup and loads.

The final detail is related to certification compliance rather than actual functionality. NSS’s FIPS-140 testing has been done using the PKCS#11 layer, which means the certification covers that configuration. Using the bypass results in a different code path. Even though ultimately it is the same crypto implementation being used, technically it is not a validated setup. So, if your environment requires running in precisely the FIPS-140 certified configuration, you should not use the bypass.


More on Request Limiting

Continuing with my coverage of check-request-limits (see request limiting and concurrency limiting), today I’ll give a few examples using the <If> tags introduced in Web Server 7.0.

As I showed in the previous articles, using server variables with the monitor parameter can be quite flexible. However, there are times when you may want to apply check-request-limits in ways which cannot be expressed using monitor parameter values. Check out the <If> expressions also introduced in Web Server 7.0.

Let’s look at an example. Say you want to apply limits only to request paths which end in *.jsp. You could write:

<If $path = "*.jsp">
PathCheck fn="check-request-limits" max-rps="10"
</If>

Simple enough! There are some pitfalls to watch out for, though. Take a look at this example:

<If $path = "*.pl">
PathCheck fn="check-request-limits" max-rps="100"
</If>
<If $path = "*.jsp">
PathCheck fn="check-request-limits" max-rps="10"
</If>

At first glance one might think this limits all “*.pl” paths to 100rps and all “*.jsp” paths to 10rps. Not so! Recall that request counts and averages are tracked separately for each value of the monitor parameter (thus, for example, when monitor=”$ip” counts are kept separately for every client IP). In the above two invocations of check-request-limits there are no monitor parameters. So where are the counts kept? When the monitor parameter is not given, counts are kept in a default unnamed slot.

By now you can probably see the issue… the counts for the two calls above are kept in the same counter. So, whenever a “*.pl” request is processed, the counter increases. But this same counter is the one used for “*.jsp” requests! If the server is processing an average of 20rps worth of “*.pl” requests, no request for “*.jsp” will ever be serviced… Not quite what I wanted!

Fortunately, this is easy to correct by simply tracking each type of request separately, using the monitor parameter:

<If $path = "*.pl">
PathCheck fn="check-request-limits" max-rps="100" monitor="pl"
</If>
<If $path = "*.jsp">
PathCheck fn="check-request-limits" max-rps="10" monitor="jsp"
</If>
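
As an aside, if you also want each of these per-type limits tracked per client rather than shared across all clients, the monitor value can mix a literal label with a server variable. This is just a sketch of mine, assuming such mixed monitor values are interpolated the same way as the $ip:$uri example from the request limiting article:

<If $path = "*.jsp">
PathCheck fn="check-request-limits" max-rps="10" monitor="jsp:$ip"
</If>

With this, each client IP gets its own 10rps budget for JSP requests, so one abusive client cannot exhaust the JSP allowance for everyone else.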

Web Server 7.0 Concurrency Limiting

Last week I talked about the new request limiting feature in Web Server 7.0. I’ll expand on this topic today by covering the concurrency limiting feature of check-request-limits.

Once again let’s start with the simplest possible usage:

PathCheck fn="check-request-limits" max-connections="2"

The max-connections option tracks instantaneous connections, as opposed to averages over time as max-rps did. The above tells check-request-limits that I want to allow only 2 simultaneous requests being processed at any one time. If two requests are being serviced and a third one arrives, the third one will be rejected with the desired error code (as before, the default is 503).

As before, this minimal example isn’t really useful. For one thing, I probably never want to limit the entire server to just two simultaneous requests. As with max-rps, the use cases become more interesting when the monitor parameter is introduced:

PathCheck fn="check-request-limits" max-connections="2" monitor="$ip"

Now I’m limiting the server to only process two simultaneous requests from any one client (by IP) but there is no limit to the number of distinct clients which can be serviced at the same time (well, there are other limits which apply such as server capacity and worker threads ;-)

I won’t repeat the various examples I gave last time, but suffice to say that all of them can be used with max-connections just as well as with max-rps. You can use any server variable or combinations of multiple variables to establish the domain over which the limits apply.

Note that if a request exceeds the limit it is rejected right away. If you wanted to limit the number of active requests being processed but allow additional ones to queue up you can do that by setting the maximum worker threads instead.
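
For reference, a minimal sketch of what that thread pool tuning might look like in server.xml is below. The element names (thread-pool, min-threads, max-threads, queue-size) are from memory and the values are made up for illustration, so double-check them against the 7.0 documentation for your release:

<thread-pool>
	<min-threads>32</min-threads>
	<max-threads>128</max-threads>
	<queue-size>1024</queue-size>
</thread-pool>

With a configuration along these lines, requests beyond max-threads wait in the queue (up to queue-size) instead of being rejected outright, which is the queueing behavior described above.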

Let me know of any interesting and useful use cases for check-request-limits that you come up with… I’ll add some more examples and notes about it in a future entry. Have fun with it!


Web Server 7.0 Request Limiting

One of the cool new features in Web Server 7.0 is the check-request-limits SAF. In a nutshell, it can be used to selectively track request rates and refuse requests if limits are exceeded. While its primary purpose is to help protect against denial of service attacks consisting of high request rates, it is also useful in other scenarios where limiting request counts is of interest.

Here’s the simplest possible invocation:

PathCheck fn="check-request-limits" max-rps="10"

This instructs the SAF to track the average requests per second (rps) and refuse client connections if the average exceeds 10rps. The average rate is recomputed every 30 seconds (the default interval) based on the number of requests received in those past 30 seconds. If the average rps exceeds 10, all subsequent requests are rejected with HTTP response code 503 (Service Unavailable), the default rejection response. Requests will continue to be rejected until the average rps falls below the threshold of 10. Since the average is only recomputed once per interval, this means it’ll be at least one interval before normal service resumes. Naturally, these defaults can all be changed. The following invocation is equivalent to the prior one but shows the default values explicitly:

PathCheck fn="check-request-limits" max-rps="10" interval="30"
                                    continue="threshhold" error="503"

The other possibility for the continue option is “silence”. If set, the incoming request count must fall to zero (for an entire interval) before normal service resumes. You can use this if you want to force the offending requests to truly “go away” before allowing any more to be serviced.
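
For example, here is the same invocation as above with the stricter behavior selected (only the continue value changes):

PathCheck fn="check-request-limits" max-rps="10" interval="30"
                                    continue="silence" error="503"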

Now, in most cases one would not use a line such as the above in a real server because it is tracking all requests globally (for that web server process) and that is overly heavy handed unless you really want to limit the entire server to such a low average request rate. Perhaps if it is a home server, but not in most cases.

You may have read about the server variables and <If> tag also introduced in Web Server 7.0. Let’s use some of those capabilities to make the request limiting more interesting:

PathCheck fn="check-request-limits" max-rps="10" monitor="$ip"

The “monitor” parameter is optional but in nearly every case you will want to give it a value. It instructs check-request-limits to track request statistics using separate counters for each monitored value. In the example above, separate stats will be kept for every client IP ($ip expands to the client’s IP) making a request to the server. That’s more like it! Now, any client which exceeds my set limit (10rps) will be refused service but all other clients continue to experience normal operation.

You can use any of the supported server variables as the value of “monitor”, of course. Another interesting one might be $uri:

PathCheck fn="check-request-limits" max-rps="10" monitor="$uri"

Here, instead of setting limits for each client, we set the limit for each URI on the server. Perhaps some areas of your server are harder hit and you wish to limit use of those while allowing normal servicing of other areas? The above directive will accomplish that.

In fact, you can combine variables as well. This is also legal:

PathCheck fn="check-request-limits" max-rps="10" monitor="$ip:$uri"

Here the SAF will limit only specific clients which request the same URI(s) too frequently (over 10rps, that is) but all other URIs for those clients and all other clients continue to be serviced normally. Cool!

This functionality is fairly flexible so experiment with it for a bit. While these examples will get you started, I’ll describe a few other scenarios later on.


My Filesystem Is Broken?

In my previous humorous take on password encryption I inserted a few phrases meant to highlight some of the issues. Last week I commented on one of them, this time I’ll take a look at another.

I mentioned that “An attacker who manages to crack or bypass the file protections will be able to obtain the cleartext password, which is a problem”. True enough, as far as the statement goes.

Imagine for a moment that this attacker has indeed managed to bypass the operating system’s file system permissions, so he can read the supposedly protected file and extract the password. The requested solution is to avoid storing the password (or, somehow, store it encrypted) so that when the attacker breaks the file permissions he will not get the [cleartext] password. Attack foiled!

Or was it?

If faced with this situation, what would the attacker do next?

We know the web server still ultimately needs the passwords, so we know the data will exist within the process at some point even if it is not on disk (or is only on disk encrypted). Since the attacker bypassed the file system protections, how about modifying libns-httpd40.so (the core implementation of the server) to insert malicious code which emails him the password once the server process obtains it? That’s just one example; there are nearly endless injection points where the attacker could insert similar trickery if we assert that the file system permissions have been bypassed.

Fortunately the file system permissions aren’t quite so weak. While it is true that an attacker who breaks them could indeed then read passwords out of a file, it is also true that if the attacker has this power then the entire system is compromised beyond help.


Tea and No Tea

If you had the opportunity to play that classic game you probably eventually succeeded in having tea and no tea at the same time. Of course, those events took place in a universe which had the benefit of the Infinite Improbability Drive.

Some time ago I wrote a humorous take on encrypting passwords. Hopefully it was clear that I was poking fun at a few nonsensical implementations. However, every so often I get requests to implement something along the lines of what I described in that article.

There are two types of passwords handled by the web server. First, there are passwords which (one way or the other) will be sent by a client to the server, and the server needs to (by various mechanisms) validate whether they match the stored password. Second, there are passwords which the web server process will itself need to know during its lifetime because it will interact with some other entity using a protocol which requires the web server to possess the password in the clear.

The first kind is used, for example, when authenticating web clients using HTTP Basic or Digest auth or Servlet FORM authentication. The server doesn’t need to know the actual password of that user. It only needs to know a one-way hash of the password in a suitable form. The suitable form can vary depending on the protocol but we can ignore the details for the moment – the high level point being that the server only needs a hash of the password, therefore it doesn’t need to store the clear text password nor does it need to be able to compute the clear text password (say, by decrypting) at any time.
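
As a toy illustration of that first kind (this is not how the web server actually stores its auth databases, just the concept), the stored value is a one-way hash and verification simply repeats the hash on the supplied password and compares:

% echo secret | digest -a sha1 > stored.hash      # enrollment: keep only the hash
% echo secret | digest -a sha1 | cmp -s - stored.hash && echo "password OK"

The real formats differ per protocol (HTTP Digest, for example, hashes username:realm:password) but the property that matters is the same: the cleartext never needs to be stored or recovered.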

The second kind is different. Above I defined these to be those which the server will need to know in the clear at some point. So the question is raised, how should these be stored? Storing one-way hashes of the password is clearly out, since the one-way-ness handily breaks the stated requirement of recovering the original password.

We can encrypt these passwords with a suitably strong reversible algorithm! In the previous article I wrote “encryption is really hard to break, so that will certainly improve the security even further”. I chose that wording to highlight the common misconception that encryption is a magic bullet that makes problems go away.

Unfortunately we also need to handle the key management issues if encryption is introduced. It is certainly possible to encrypt those passwords with a strong cipher such as AES and have the web server store only the encrypted data on disk. But we have already asserted that the web server process needs to obtain the clear text of those passwords at some point during the life of the process. How will the process do that? As long as it has the encryption key it can decrypt the data to obtain the passwords.

So where is that key coming from?

There are two possibilities:

  1. The key is not stored anywhere on disk; a human must enter it into the console when the server process requests it.
  2. The key is kept on disk in a form which allows the server process to obtain it programmatically, without human interaction.

The first choice has some real benefits. The key is never stored anywhere and the passwords are only stored on disk securely encrypted. Of course, there is a major drawback to this option – the server cannot start without the help of the human who needs to enter the key. The scenarios where this is practical are limited.

The second choice is the one I covered in the previous article.


More On Web Server ECC Performance

Last summer I talked a bit about ECC performance in Web Server 7.0 while comparing different ECC and RSA key sizes.

In the previous article I had a table which showed the approximate equivalency in strength of RSA vs. ECC key sizes. This time, I’ll pick one row from that table and compare the performance of several cipher suites in those key sizes. I decided to use 3072 bit RSA keys – roughly equivalent to 256 bit ECC keys. For my JES Web Server 7.0 instance, I generated a 3072 bit RSA keypair and an ECC keypair on NIST P-256 curve (256 bit key).
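
In case you want to set up something similar, generating such keypairs in the instance's NSS database with certutil looks roughly like this (a sketch only; the certificate request and issuance steps, and your actual database directory, are omitted):

% certutil -G -d config -k rsa -g 3072
% certutil -G -d config -k ec -q nistp256

The first command generates a 3072 bit RSA keypair, the second an ECC keypair on the NIST P-256 curve.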

Using the various cipher suites shown below, I ran a fixed number of requests to the web server; the Y axis shows the time taken to complete these. As in the previous article, I ran each scenario at various percentages of SSL session reuse; these are shown on the X axis.

As before, when the SSL session is always reused (the far left of the X axis on the graph) the cipher suite and server key size hardly matter since there is only 1 full handshake in the entire run, therefore its cost is lost in the noise. As the percentage of new handshakes increases, the computational load on the server increases and the differences between key sizes and cipher suites become increasingly visible.

Now, remember, in this graph every line is using equivalent server key sizes (3072 bit for RSA, 256 bit for ECC), so we’re focusing on the differences between the cipher suites themselves.

The red line shows the traditional RSA server keypair (3072 bit).

The black line is interesting as it is a good bit slower than all the others. The TLS_ECDHE_RSA_* cipher suites deserve some comment. When using these suites, the web server still authenticates with an RSA keypair; only the key exchange uses ephemeral ECDH. These suites can therefore be used by your existing web server without generating any new keys or having to obtain new server certificates from your CA. Instant ECC adoption! The tradeoff, however, can be seen from the graph… performance is not the best.

The TLS_ECDHE_ECDSA_* (blue line) came out the same (within experimental error, I measured these numbers on my desktop and not a dedicated server) as the RSA cipher suite (at this particular key size – at larger key sizes ECC would have an advantage since the cost of RSA computation increases faster as key sizes grow).

Finally, the green line shows the TLS_ECDH_ECDSA_* runs, which were significantly faster than RSA.

Hopefully this small experiment sheds some light on selecting appropriate cipher suites for your server installation, as there are a number of tradeoffs in performance, convenience and flexibility. ECC offers significant performance advantages but as with any technology it is important to understand the details. For example, if performance is very important in your server you should look into generating an ECC key pair instead of attempting to use TLS_ECDHE_RSA_* suites with your existing RSA-based server keypair/certificate.

I should probably also point out that here I looked only at the server side of the performance coin. But if you have small devices (mobile phones, etc) using ECC, the benefit to those clients from ECC over RSA can be substantial.


Using Web Server 7 with Microsoft Active Directory

Among the many new security-related features in Web Server 7 are a few new configuration elements for the LDAP auth-db (authentication database).

Here is a summary:

search-filter [optional]: The search filter used to find the user. The default is uid.
group-search-filter [optional]: The search filter used to find the user's group memberships. The default is uniquemember.
group-target-attr [optional]: The LDAP attribute name that contains group name entries. The default is CN.

One use case for these configurable search options is to interoperate with Microsoft Active Directory (MSAD). The problem with MSAD is that user ids are not kept (by default) in the usual uid attribute. For this reason, when the LDAP auth-db attempts to search a MSAD directory to find a user, it will never be able to obtain a match since it is attempting to match on the uid attribute.

In 7.0 we can now set the search-filter property to override the usual default. In MSAD the user id is kept in an attribute called sAMAccountName. Here is a sample LDAP auth-db configuration for MSAD (showing a minimal configuration; other options can of course be specified as usual):

<auth-db>
	<name>ldapMSAD</name>
	<url>ldap://crashbox.sfbay/dc=sfbay,dc=sun,dc=com</url>
	<property>
		<name>search-filter</name>
		<value>samAccountName</value>
	</property>
</auth-db>
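
If you also need group-based access control against MSAD, the group-related properties summarized above can be overridden the same way. The values below are my assumption of a typical MSAD layout (group membership stored in the member attribute, group names in cn), so verify them against your directory schema before relying on them:

<auth-db>
	<name>ldapMSAD</name>
	<url>ldap://crashbox.sfbay/dc=sfbay,dc=sun,dc=com</url>
	<property>
		<name>search-filter</name>
		<value>samAccountName</value>
	</property>
	<property>
		<name>group-search-filter</name>
		<value>member</value>
	</property>
	<property>
		<name>group-target-attr</name>
		<value>cn</value>
	</property>
</auth-db>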

P.S. Of course, I should probably point out that a better solution is to simply upgrade to Sun’s own Directory Server instead!


Web Server 7 ECC Performance Notes

As I have mentioned earlier, the upcoming Web Server 7 will include ECC support.

While relative performance predictions comparing RSA and ECC are available in various papers, I was curious to get a glimpse into how it performed in practice in our web server. So, I did a few runs and graphed the results below.

The X axis corresponds to the percentage of new TLS session handshakes during the run (several thousand requests). If a single client were issuing all the requests it would perform one handshake during the initial connection and reuse that TLS session for all remaining requests. As one could expect, in this case there isn’t really any difference between the algorithms and keysizes since no matter how fast or slow the very first connection was, it is a minute portion of the total runtime. At the other extreme is the case of 100% new handshakes – every request comes from a new client and that client doesn’t reuse the session again.

Neither of these extremes is realistic for web server traffic, of course. Normal usage patterns will fall somewhere in between.

The following table shows approximate equivalency in strength between RSA and ECC, to provide some context to the results above:

RSA key size (bits)    ECC key size (bits)
1024                   160
2048                   224
3072                   256
7680                   384
15360                  521

So, we can see that while 1024 bit RSA isn’t too much of a performance burden even if our server experiences lots of new handshakes, things look quite different at 2048 bit RSA. And 4096 bit RSA is nearly off the chart. On the other hand, while ECC with the nistp256 curve is roughly equivalent in strength to 3072 bit RSA, it performed faster than 2048 bit RSA. Not bad.

The higher key lengths won’t be so interesting for years to come (barring unforeseen advances) but it is interesting to note how the performance compares as key sizes grow. ECC with nistp521 is substantially faster than 4096 bit RSA, even though it is roughly equivalent to 15360 bit RSA in strength!

Note: I ran the web server on a single CPU, single core server for these tests. Such machines are hard to come by these days, so I ran it on a very old box I had sitting around. The absolute numbers aren’t very interesting so I left the Y axis numbering out.


Secure Password Storage

If you are using SSL with your Web Server 6.1, your server has one or more private keys. These keys are kept in the NSS database and they are encrypted. In order for the server to read its own keys it will need to decrypt this store, for which it will need the password to the NSS database. When the server is started, if SSL is needed, the server prompts for the password it needs, which is then used to unlock the NSS database.

Often, this is inconvenient. After all, servers need to start unattended. In that case, the only solution is to store the required password somewhere so the server can automatically get it during startup.

This is handled in 6.1 by storing the password in a file called password.conf in the config directory. This file is owned and readable only by the user the web server runs as, so other users on the system cannot get at it.
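
For context, password.conf is just a small text file mapping a token name to its password, one per line; in the simplest case (only the internal software token) it looks something like this, with the value below obviously being a placeholder:

% cat config/password.conf
internal:MySecretPassword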

However, the operating system filesystem permissions are sometimes seen as being too weak. An attacker who manages to crack or bypass the file protections will be able to obtain the cleartext password, which is a problem.

Let’s see how we can improve this situation.

I’ll make a small modification to the start script so it obtains the password by invoking an executable (I’ll call it wsgetpwd) instead of prompting interactively:

99c99,101
<               ./$PRODUCT_BIN -r $SERVER_ROOT -d $INSTANCE_CONFIG_DIR -n $INSTANCE_NAME $@
---
>               PWD=`wsgetpwd $INSTANCE_CONFIG_DIR`
>               echo $PWD | ./$PRODUCT_BIN -r $SERVER_ROOT -d $INSTANCE_CONFIG_DIR -n $INSTANCE_NAME $@

Important Note: The content of the start script is not a public interface. This means any changes to it are unsupported and it also means you cannot expect any such changes to continue working after a service pack or version upgrade. You’ve been warned. No production servers were harmed in the writing of this article.

Ok, with this tiny bit of infrastructure in place we can experiment with various implementations of wsgetpwd until we find something superior to keeping the cleartext password in password.conf.

Let’s start simple to see if it works. I created a file called password in the instance config directory to contain the password and implemented wsgetpwd to simply print it out:

% rm -f config/password.conf
% echo password > config/password
% chmod 600 config/password
% cat ../bin/https/bin/wsgetpwd
cat $1/password

Starting the server shows that it works fine. So far so good. We haven’t accomplished much yet though – if our attacker can bypass the filesystem permissions on password.conf, they might also bypass the permissions on the new password file.

So, let’s improve wsgetpwd. This time I’ll obfuscate the password using base64 (btoa/atob are small utils from NSS which do this encoding) so it is no longer human-readable.

% echo password | btoa > config/password
% cat config/password
cGFzc3dvcmQK
% cat ../bin/https/bin/wsgetpwd
cat $1/password | atob

Now, even if an attacker manages to read the password file, they’ll only get “cGFzc3dvcmQK” which doesn’t really do them much good (unless they know about atob or base64, but how likely is that?)

Nonetheless, it’s been suggested that the password will be safer if it is encrypted with a proper encryption algorithm. Encryption is really hard to break, so that will certainly improve the security even further. I’ll use encrypt(1).

Some prep work is needed first. I’ll use AES encryption, obtaining the key bits from /dev/random, and then encrypt the password with that key into the password file. Finally, I reimplement wsgetpwd to decrypt the password at runtime.

% dd if=/dev/random of=config/encrypt.key bs=1 count=16
% chmod 600 config/encrypt.key
% echo password | encrypt -a aes -k config/encrypt.key > config/password
% cat config/password | od -x
0000000 0000 0100 0000 e803 058c 6f08 5b48 c607
0000020 3517 7fc5 65ce c64e 95ff 576d ee95 4cb4
0000040 e990 5834 df32 9042 df90 9937 47f4 a464
0000060 2720 f8db ad0a d089
0000070
% cat ../bin/https/bin/wsgetpwd
decrypt -a aes -k $1/encrypt.key -i $1/password

Start the server and.. it works! The password is now encrypted with AES on disk and the server is still able to start automatically. When our attackers crack the filesystem permissions on the password file all they will get is the encrypted bits (shown by the od -x output above) – the cleartext password remains secure.

If you’ve read this far, one final thought: Should I file this article under “Web Server Security” or under “Light Comedy”?
