Wednesday, June 19, 2013

Aaaaaah! Metro, you made my day!

As I have mentioned before, WS-security unit tests in metro are using SAAJ to handle SOAP messages. A typical test goes like this:
  1. A SOAP message is created and populated with some elements by the test code.
  2. A WS-Policy-based configuration that defines what to do with the message is created.
  3. A security operation is performed.
  4. The resulting SOAP message is written to a file.
  5. A new SOAP message instance is created from the file.
  6. A new WS-Policy-based configuration that defines how to validate the message is created.
  7. Validation is performed.
I decided to add some JAX-WS code to one of the tests. Originally the test did both signing and encryption, but I removed the encryption: it is much easier to see what is going on without it. Then I verified that the test was still green, and that it failed if I modified the generated file before it was read back in. Just in case: you can never be too careful.

I have added the following steps at the end of the test:
  1. The file with the message, created after the security operation, is read in streaming mode as a JAX-WS message.
  2. A new WS-Policy-based validation configuration is created. It is actually exactly the same code as in the SAAJ SOAP message case.
  3. Validation is performed. This is done by a different set of classes than in the SAAJ case, although the class names are similar. Here metro shines again: although the operations are similar, the way validation is invoked is different. Worse, all the parameters for the JAX-WS way of validating are type-compatible with SAAJ validation. Trying to use the SAAJ validation code with a JAX-WS message compiles but fails with an NPE.

After some failed attempts I got a version that not only compiled but went quite deep into metro code, and then failed on signature validation. The error was "Reference #idxxx: signature digest values mismatch" or something like that.

This was ... interesting. The same message is OK if validated as a SOAPMessage instance and fails if validated as a JAX-WS message. Something fishy was going on. Metro had not only provided multiple implementations of XML-Security, but had also managed to make them incompatible. Remind me, what does that "WSIT" stand for?

Of course the problem might have been in the way I had set up JAX-WS-based validation, but I was quite sure it was another genuine "feature" of metro.

In order to understand what the error message means it is necessary to understand what the signed message looks like. It is a SOAP message with a lot of additional stuff added to the SOAP Header (a lot of details omitted):
<Envelope ...
   <Header>
       <Header1 wsu:Id="id_h1".../>
...
          <ds:Signature >
          <ds:SignedInfo>
...
              <ds:Reference URI="#id_h1">
                  <ds:DigestValue>some base64</ds:DigestValue>
              </ds:Reference>
              <ds:Reference URI="#id_body">... </ds:Reference>
...
          </ds:SignedInfo>
          <ds:SignatureValue>some base64</ds:SignatureValue>
          </ds:Signature>
       <HeaderN wsu:Id="id_hN".../>
   </Header>
   <Body wsu:Id="id_body">...</Body>
</Envelope>

Each <ds:Reference> element "describes" a particular thing that is signed, typically some element from the same message, as well as how that thing has to be preprocessed before calculating the digest, what algorithm is used to produce the message digest, and the message digest value. The URI attribute of <ds:Reference> specifies which element is digested.

<ds:SignedInfo> can contain a lot of such <ds:Reference> elements. And then comes <ds:SignatureValue> that is actually a digitally signed message digest of <ds:SignedInfo> element.

The order of the signed header elements and <ds:Signature> is not important. Some say the signature must come after the to-be-signed header elements to facilitate streaming, but this is a moot point. More often than not the SOAP Body also has to be signed, so you can kiss that nice streaming theory goodbye.

Anyway, the error I was getting, "Reference #idxxx: signature digest values mismatch", was about the very first <ds:Reference> in the message. It meant that the verifying code looked at the URI attribute, URI="#id_h1" in this case, found the corresponding header element by its id, <Header1 wsu:Id="id_h1".../>, and calculated its digest. And the calculated digest did not match the <ds:DigestValue> of the <ds:Reference>.

I switched on the logging and repeated the test several times, with the same result. I was not sure what I wanted to see, and the logging did not show anything exciting. But then I noticed a pattern. The output contained the calculated digest and the expected digest, the latter taken from <ds:DigestValue> of the <ds:Reference>. The values were unreadable because a digest is just a byte[], and the metro guys did not bother with encoding or pretty-printing them. While the expected digest was clearly changing from run to run, the calculated digest looked the same every time. This was clearly wrong because the digest should have been calculated over the header element including all its attributes. While most things remained unchanged, the wsu:Id attribute differed from run to run, so the calculated digest had to differ as well.

Checking the verification code under the debugger confirmed this: the calculated digest was exactly the same every time the test was executed. So what exactly is metro using as the source for the digest calculation? It turned out: in this particular case, nothing.

Yep, nothing. So the calculated "digest" is probably some fixed initial state of the message digest implementation.
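To make the "same digest every run" symptom concrete, here is a minimal sketch in plain JDK code (nothing from metro). It assumes SHA-1, the algorithm named in the message's <ds:DigestMethod>, and the two header strings are made-up stand-ins for a header whose wsu:Id changes between runs. A digest that is never fed any bytes always finalizes to the same value; whether the constant metro printed was exactly this one is not something I verified.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

class EmptyDigestDemo {

    /** SHA-1 over the given bytes, Base64-encoded the way a ds:DigestValue is. */
    static String sha1Base64(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            md.update(data);
            return Base64.getEncoder().encodeToString(md.digest());
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical header content: only the wsu:Id differs between "runs".
        byte[] run1 = "<Header1 wsu:Id=\"_5002\"/>".getBytes(StandardCharsets.UTF_8);
        byte[] run2 = "<Header1 wsu:Id=\"_7311\"/>".getBytes(StandardCharsets.UTF_8);

        // Real input: the digest changes whenever the id changes.
        System.out.println(sha1Base64(run1));
        System.out.println(sha1Base64(run2));

        // No input at all: the digest is a constant.
        System.out.println(sha1Base64(new byte[0])); // 2jmj7l5rSw0yVb/vlWAYkK/YBwk=
    }
}
```

Had the log printed the digests Base64-encoded, a value like the last one would have been a strong hint that nothing was being digested.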

The problem had nothing to do with how I used metro API. The real reason was the signed message itself. Time to show the relevant <ds:Reference> in its full glory:
<ds:Reference URI="#_5002">
    <ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
    <ds:DigestValue>dKof/iss1y+eaCxi5xQGzXZw8RQ=</ds:DigestValue>
</ds:Reference>

The key to the problem is not what is there, but rather what is absent. An important piece is missing, namely the "instructions" on how the data has to be preprocessed before calculating the digest. Normally a <ds:Reference> looks like this:
<ds:Reference URI="#_5006">
    <ds:Transforms>
        <ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#">
            <exc14n:InclusiveNamespaces PrefixList="S"/>
        </ds:Transform>
    </ds:Transforms>
    <ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
    <ds:DigestValue>JWq3aJtzUP98fkiThJ0WYtcrWCY=</ds:DigestValue>
</ds:Reference>
<ds:Transforms> and its child <ds:Transform> elements are such "instructions".
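For comparison, this is roughly how the two shapes of <ds:Reference> come about when built with the plain JSR 105 API that ships with the JDK. It is only a sketch: it is not metro's policy code, the class and method names are mine, and the URIs and the "S" prefix are simply copied from the two snippets above.

```java
import java.security.InvalidAlgorithmParameterException;
import java.security.NoSuchAlgorithmException;
import java.util.Collections;
import javax.xml.crypto.dsig.CanonicalizationMethod;
import javax.xml.crypto.dsig.DigestMethod;
import javax.xml.crypto.dsig.Reference;
import javax.xml.crypto.dsig.Transform;
import javax.xml.crypto.dsig.XMLSignatureFactory;
import javax.xml.crypto.dsig.spec.ExcC14NParameterSpec;

class ReferenceShapes {

    private static final XMLSignatureFactory FACTORY = XMLSignatureFactory.getInstance("DOM");

    /** A reference with a digest method only: no ds:Transforms element at all. */
    static Reference bareReference(String uri) {
        try {
            DigestMethod sha1 = FACTORY.newDigestMethod(DigestMethod.SHA1, null);
            return FACTORY.newReference(uri, sha1);
        } catch (NoSuchAlgorithmException | InvalidAlgorithmParameterException e) {
            throw new IllegalStateException(e);
        }
    }

    /** A reference carrying one exclusive-c14n transform with an InclusiveNamespaces prefix. */
    static Reference referenceWithExcC14n(String uri, String inclusivePrefix) {
        try {
            DigestMethod sha1 = FACTORY.newDigestMethod(DigestMethod.SHA1, null);
            Transform excC14n = FACTORY.newTransform(
                    CanonicalizationMethod.EXCLUSIVE,
                    new ExcC14NParameterSpec(Collections.singletonList(inclusivePrefix)));
            return FACTORY.newReference(uri, sha1, Collections.singletonList(excC14n), null, null);
        } catch (NoSuchAlgorithmException | InvalidAlgorithmParameterException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Reference bare = bareReference("#_5002");
        Reference normal = referenceWithExcC14n("#_5006", "S");
        System.out.println(bare.getURI() + " transforms: " + bare.getTransforms().size());
        System.out.println(normal.getURI() + " transforms: " + normal.getTransforms().size());
    }
}
```

Both variants are things the API lets you construct, so a verifier should expect to meet either of them.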

The rest was easy: knowing that there were no <ds:Transform> elements I looked at what metro does in this case with the referenced element. Well, nothing. No data is digested.

Some questions remained though:
  1. Why did the signing code produce a message without <ds:Transform> elements?
  2. Why was the SAAJ message successfully validated?
  3. What is the expected behavior, according to the specification, when there are no <ds:Transform> elements?

Let's start with the last question. The answer is: I do not care. It might be valid. It might be invalid, but in any case it is definitely not a "signature digest values mismatch". This is actually an answer to the first question as well. Why did the signing code produce a message without <ds:Transform> elements? It does not matter, because metro might need to process such messages anyway, no matter how they are created.

Why was the SAAJ message successfully validated? Well, the validation was performed by the same code as the signing. For SAAJ messages metro delegates all the work to the JSR 105 "Java XML Digital Signature API Specification" implementation, which is now part of the JDK. Basically it is some older version of Apache Santuario, repackaged by Sun. I checked the Apache Santuario source and found some remarkable similarities with metro code. Except that Santuario's code does not have this particular bug, because after applying all the transforms it checks whether there is any data left to be processed, and processes it. Metro does not perform this check. The check has existed in Santuario's code for ages, from the first check-in of JSR 105 support in 2005. I guess metro had "borrowed" some even older version of that code. As a result metro fails if there are no <ds:Transform> elements, and might also fail if there are <ds:Transform> elements. I did not check the logic there completely, but it looks like some combinations of <ds:Transform> elements might result in the same error. The "I" in WSIT looks more and more like a joke.
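The difference between the two behaviors can be sketched in a few lines. This is a deliberately simplified model of my own, not code from metro or Santuario: transforms are plain byte[] functions, the digest is SHA-1, and all names are hypothetical. A real implementation would also canonicalize a leftover node-set before digesting it; here the leftover bytes are digested as they are.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.UnaryOperator;

class LeftoverDataSketch {

    static final List<UnaryOperator<byte[]>> NO_TRANSFORMS = Collections.emptyList();
    // Stand-in for a real transform: trims surrounding whitespace.
    static final List<UnaryOperator<byte[]>> ONE_TRANSFORM = Collections.singletonList(
            b -> new String(b, StandardCharsets.UTF_8).trim().getBytes(StandardCharsets.UTF_8));

    private static MessageDigest sha1() {
        try {
            return MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Only the last transform writes into the digest; with no transforms nothing is written. */
    static byte[] digestWithoutLeftoverCheck(byte[] referenced, List<UnaryOperator<byte[]>> transforms) {
        MessageDigest md = sha1();
        byte[] data = referenced;
        for (int i = 0; i < transforms.size(); i++) {
            data = transforms.get(i).apply(data);
            if (i == transforms.size() - 1) {
                md.update(data);
            }
        }
        return md.digest();
    }

    /** Same loop, plus the check: data that nobody has digested yet is digested at the end. */
    static byte[] digestWithLeftoverCheck(byte[] referenced, List<UnaryOperator<byte[]>> transforms) {
        MessageDigest md = sha1();
        byte[] pending = referenced;
        for (int i = 0; i < transforms.size(); i++) {
            byte[] out = transforms.get(i).apply(pending);
            if (i == transforms.size() - 1) {
                md.update(out);
                pending = null; // already consumed by the digest
            } else {
                pending = out;
            }
        }
        if (pending != null) {
            md.update(pending);
        }
        return md.digest();
    }

    public static void main(String[] args) {
        byte[] a = "<Header1 wsu:Id=\"_1\"/>".getBytes(StandardCharsets.UTF_8);
        byte[] b = "<Header1 wsu:Id=\"_2\"/>".getBytes(StandardCharsets.UTF_8);
        // Without the check, different inputs collapse to the same "digest".
        System.out.println(Arrays.equals(digestWithoutLeftoverCheck(a, NO_TRANSFORMS),
                                         digestWithoutLeftoverCheck(b, NO_TRANSFORMS)));
        // With the check, they differ as they should.
        System.out.println(Arrays.equals(digestWithLeftoverCheck(a, NO_TRANSFORMS),
                                         digestWithLeftoverCheck(b, NO_TRANSFORMS)));
    }
}
```

With at least one transform both methods agree, which matches the observation that messages carrying <ds:Transform> elements verified fine.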

By the way, why did the signing code produce a message without <ds:Transform> elements? The transforms to use come from WS-Policy, but not directly. At least when I tested some metro samples, the generated messages had <ds:Transform> elements, but the WS-Policy declarations used by the samples did not mention transforms explicitly. The metro runtime probably adds them while parsing WS-Policy. The signing code in the unit test uses a combination of a WS-Policy file and some runtime policy manipulation to create the final policy for the signature process. What exactly has to be signed is defined by Java code, so this is probably why the final policy ended up having no transforms specified. Sure enough, after I found out how to add this info to the Java-created policy and modified the code, signed messages were produced with <ds:Transform> in <ds:Reference>. And the JAX-WS way of verifying the message went OK as well.


What can I add? At least interoperability-wise metro really shines.
