Monday, June 17, 2013

Web services: in search of interoperability

I have some experience with web services, SOAP, SAAJ and whatnot, but until recently I did not have the "pleasure" of working with the rest of the WS-* standards: WS-Addressing, WS-Crap, WS-Security, WS-whatever.

Not that long ago, however, my luck ran out.

The product I am working on can operate as a WS service or a WS client. And it works. But there are situations when "it works" is not good enough. Usually it has to do with artificial barriers created to protect certain markets. This is all accompanied by some "specification" to which a product must conform. As a rule such a specification is a badly edited collection of copy-pasted fragments from other standards documents from well-known authorities like W3C, OASIS, etc., often with contradictory statements. Nobody cares.

Recently we needed to prove that our product conforms to one of these specs. We were lucky. Not only was there a specification, there was also a "compliancy service" provided. The "compliancy service" could accept SOAP requests and check whether the sender conforms to the specification, or it could send SOAP requests and check the SOAP responses to see whether the receiver conforms to the specification.

Last year I had to deal with another "specification" and "compliancy service" from the same people. Do you know what one needs to have a good compliancy service? No, you do not have to conform to well-known standards, or even to your own specification. Monopoly is good enough. Add some crappy software and you are set.

For example, the software they used then (and still use) could not handle some erroneous HTTP requests. Instead of returning any kind of response the "compliancy service" did nothing. Literally. The connection was kept open, but not a single byte of response was sent back. Eventually the client timed out trying to read a response. It took us more than a month of collecting data and e-mailing them before they agreed that the problem was on their side. The problem is still not fixed.

So I knew I had a lot of fun ahead, I just did not know how much.

This time everything revolved around WS-Addressing and optionally WS-Security. How exactly the WS-* stuff had to be applied was specified in an "interoperability standard" document. The document was unclear on a couple of dozen points, but it was a "standard", so our product had to be "standard"-compliant.

The "compliancy service" found no problem in our requests and responses as long as no WS-Security had to be applied. Adding XML signatures to the picture changed everything.

First, the "compliancy service" did not like which request elements our product signed. It complained that we were signing more than needed. It turned out to be a case of "do what I do and not what I say". The "standard" defined some elements as mandatory and allowed some optional elements to be present. In the section that described what has to be signed it said that all mandatory and optional (if present) elements must be signed. But the "compliancy service" did not like that our requests had optional elements that were signed. OK, no optional stuff then. And no more complaints from the "compliancy service".

But when I started testing our product as a web service provider all hell broke loose. No matter what I did the "compliancy service" said "signature verification failed". Just that.

Since then I have learned what JWSDP, XWSS, WSIT, Metro, you name it, mean. I have seen monsters much worse than in JBoss code.

And I found out that in 2008 there were still XML parsers in the wild that would reject valid XML documents as invalid, and that in 2013 somebody would still be using such a parser. OK, OK, granted, I am not really sure whether that XML parsing problem is a feature of the parser itself. It might very well be that the parser was improved as part of the "compliancy service" development. But still... failing to parse XML if there is a character entity representing a whitespace character between some XML elements? Like this:

<Envelope>
<Header>
<header1 …/>
<header2 …/>&#x20;
</Header>&#x20;
<Body …/>
</Envelope>

Remove either of these two entities and the problem goes away, even if &#x20; is added anywhere else. +100 to "compliance level". Grrr.
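For the record, a character reference such as &#x20; is perfectly legal in element content. Here is a minimal Java sketch that feeds a message of that shape to the parser bundled with the JDK. The class name, method name and the stripped-down envelope are mine, made up for illustration; this is not the parser the "compliancy service" uses.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

class EntityWhitespaceCheck {
    // Returns true if the JDK's default DOM parser accepts the text as well-formed XML.
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newDocumentBuilder()
                   .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            // Any parse or configuration error counts as "rejected" in this sketch.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical, stripped-down envelope with the two character references.
        String xml =
            "<Envelope>\n"
          + "<Header>\n"
          + "<header1/>\n"
          + "<header2/>&#x20;\n"
          + "</Header>&#x20;\n"
          + "<Body/>\n"
          + "</Envelope>";
        System.out.println("accepted: " + parses(xml));
    }
}
```

A conforming parser should accept this input, which is why I suspect the problem sits somewhere in the service rather than in the XML.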

After a lot of experiments and quite some test code to generate XML signatures I found out that the "compliancy service" did not like new lines in our response messages. Only after I produced a signed response that did not contain any new line characters did the "compliancy service" give up and accept the response.

This was really strange because request messages with new lines did not cause any trouble. Submitting a bug report to them was not really an option. We did not have another month.

I found out that the "compliancy service" uses some WS-* toolkit from Sun; I am not sure of the exact name and version of the toolkit. Nowadays it goes under the name "Metro". Or is it WSIT? Beats me. Anyway, based on some stack traces I have seen, it was a version from around 2008. Oh, Sun! I had the pleasure of debugging the JAX-WS RI some time ago. Fine experience, unforgettable.

So I decided to download the latest version of that toolkit to play with it. The decision opened up a bright world of project code names and their relationships. Googling classes from the stack traces resulted in XWSS, JWSDP, WSIT, with XWSS being the primary suspect. Project migrations, consolidations, broken download links and Oracle buying Sun added even more fun.

All the roads led to metro and WSIT. The latest version is 2.3, so be it.

Setting it up and running some samples went mostly flawlessly, but when I started experimenting with soapUI, I immediately ran into an issue. The sample I was using was a SOAP 1.2 web service, but I sent it a SOAP 1.1 request. Granted, it was a faulty request, but a SOAPFault with NullPointerException and nothing more is quite an extreme way to say "I do not support SOAP 1.1 here".
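Telling the two SOAP versions apart is not rocket science: the namespace of the Envelope element says it all, and the SOAP specifications even define a VersionMismatch fault code for this situation. A minimal Java sketch of such a check follows. The class and method names are hypothetical and this is certainly not how metro does it; only the two namespace URIs come from the SOAP specifications.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class SoapVersionCheck {
    static final String SOAP_11_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String SOAP_12_NS = "http://www.w3.org/2003/05/soap-envelope";

    // Returns "1.1", "1.2" or "unknown", judging only by the root element.
    static String soapVersion(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder()
                         .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not parseable as XML", e);
        }
        Element root = doc.getDocumentElement();
        if (!"Envelope".equals(root.getLocalName())) {
            return "unknown";
        }
        if (SOAP_11_NS.equals(root.getNamespaceURI())) {
            return "1.1";
        }
        if (SOAP_12_NS.equals(root.getNamespaceURI())) {
            return "1.2";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        String request = "<e:Envelope xmlns:e=\"" + SOAP_11_NS + "\"><e:Body/></e:Envelope>";
        // A SOAP 1.2 endpoint could answer this with a VersionMismatch fault instead of an NPE.
        System.out.println("SOAP version: " + soapVersion(request));
    }
}
```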

By the way, do you know what WSIT stands for? Web Services Interoperability Technologies. Yep, "Interoperability".

I also tested how character entities are parsed. I could not reproduce the problem. At least this one is solved. Or it was not a problem of the toolkit at all.

The real fun began when I started sending signed requests from soapUI. First I got bitten by the fact that soapUI modified my messages in a "friendly" way.

The next problem I ran into was much more serious: some of the signed messages my test code produced were happily accepted by metro and some were rejected with an error that sounded like "Signature verification of SOAP Body failed".

Some of the messages accepted by metro had new line characters, so again the problem we had with the "compliancy service", if it was a problem of the toolkit, was solved. Needless to say, when I generated response messages with exactly the same formatting, they were still rejected by the "compliancy service".

And what about the test messages that metro rejected? I actually found the cause quite quickly. Under some circumstances metro chops off whitespace characters, and probably also comments, that are direct child nodes of the SOAP Body. They probably do it in order to promote the "I" ("-nteroperability"). What else can be the reason? And of course whitespace is significant for XML digital signatures.

For example, this:

<Envelope>
…
<Body>
    <elementX .../>
</Body>
</Envelope>

is treated by metro as if it were:

<Envelope>
…
<Body><elementX .../></Body>
</Envelope>

But not always. Who said life is easy?
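To see why this matters: the signature carries a digest of the canonicalized Body, and XML canonicalization keeps whitespace text nodes inside the document element, so the two forms above hash to different values. Below is a minimal Java sketch of that effect. It is a simplification under loud assumptions: the strings stand in for the canonical octets, SHA-1 is my pick for the digest algorithm, and the class and method names are made up. A real toolkit runs a proper canonicalization first.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

class BodyDigestDemo {
    // Base64-encoded SHA-1 over the given text, the general shape of a DigestValue.
    static String digest(String canonicalXml) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] hash = sha1.digest(canonicalXml.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // What the sender signed: Body with indentation around its child.
        String asSigned = "<Body>\n    <elementX></elementX>\n</Body>";
        // What the receiver digests after the whitespace has been chopped off.
        String asVerified = "<Body><elementX></elementX></Body>";

        System.out.println("signed:   " + digest(asSigned));
        System.out.println("verified: " + digest(asVerified));
        System.out.println("match:    " + digest(asSigned).equals(digest(asVerified)));
    }
}
```

The two digests differ, and a verifier that compares them can only report that the signature check of the Body failed.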

Looking back I know that I was lucky when I tested our product as a WS client sending data to the "compliancy service". Purely by chance the request messages did not have any whitespace characters in the Body.

I should say the source code of metro is a ... well, I do not know. Saying "mess" would not do it justice. It would be more like a huge compliment. Classes with the same name in multiple packages, maybe doing the same thing? Or maybe not the same? Methods longer than a couple of thousand lines? Copy-paste? You name it, and it is there.

I also found that the problem had been reported to them, maybe even multiple times. It is always easy when you know exactly what you are looking for. And of course it was reported as fixed. Ha! This is another thing I do not understand: you have found a problem, you have fixed it. Is it so much work to fire "find usage" in your IDE? One of the remaining places is in the file next to the one you have just modified! To me it says a lot about the quality of the project and the people working on it.

The problem is in the JAX-WS integration code of WSIT, but given the complexity of metro, JAX-WS is probably the only way metro is used, so the problem affects everybody who is using metro with XML-Security. And people are still running into this problem. The answer from the "Interoperability" specialists is of course "it is your problem". Unfortunately it is true.

Another "stamp of quality" is their unit tests. The WS-Security subproject has only 11 tests that do something about WS-Security. Compare that with 27 tests around WS-Policy parsing. An even more interesting fact is that their WS-Security tests do not test the JAX-WS code paths. The Metro web site claims that they use XML streaming to improve performance. And part of their code is using XMLStreamReader. Whether it improves performance I do not know, since they like to copy data into XMLStreamBuffer objects to use them in other places as XMLStreamReader. But their unit tests are using SAAJ to read SOAP messages, and not the streaming code. As a result the code that is actually used by a metro-based WS client or server is not tested.

I should probably not even mention the possibility of having some unit tests for testing the interoperability with other toolkits. I doubt they would understand the concept.

Anyway, knowing the problem and the fix I repeated my failing tests, this time fixing the data as needed in the debugger. Sure thing, no more signature errors.

Net result: our product is compliant with the WS-Security standard. I know what we need to do to make a particular configuration of the latest version of "the greatest interoperability toolkit of all time" accept our messages. Given the complexity of metro I have no idea if some other configuration would be OK with our messages.

I still have no idea why the "compliancy service" did not like our responses with new lines in them. Needless to say, I tried again, making sure there was no whitespace in the Body, but the error was still the same.

If you are thinking of using metro for your projects, do not. Even if you do not need WS-Security. If they manage to screw things up during parsing I do not want to think what they can do in more complicated cases.

If you are unfortunate enough to be using metro now, especially with WS-Security... Well, if you have metro on both sides, you will not be bitten by this bug, because normally metro generates the SOAP Body without whitespace characters.

If you have some interoperability issues with XML signatures using metro and some other toolkit, check the messages. Maybe you are lucky and all your issues are caused by whitespace in the SOAP Body.
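If you want to automate that check, something along these lines might do. It is a minimal Java sketch, not anything taken from metro or from our product: the class and method names are hypothetical, and it only looks for whitespace-only text nodes that are direct children of an element named Body directly under the root element.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class BodyWhitespaceCheck {
    // True if Body has at least one whitespace-only text node as a direct child.
    static boolean bodyHasWhitespaceChildren(String soapXml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder()
                         .parse(new ByteArrayInputStream(soapXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not parseable as XML", e);
        }
        // Walk the children of the root (Envelope) looking for Body, whatever its prefix.
        for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE || !"Body".equals(n.getLocalName())) {
                continue;
            }
            for (Node c = n.getFirstChild(); c != null; c = c.getNextSibling()) {
                if (c.getNodeType() == Node.TEXT_NODE && c.getNodeValue().trim().isEmpty()) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String ns = "http://schemas.xmlsoap.org/soap/envelope/";
        String pretty = "<s:Envelope xmlns:s=\"" + ns + "\"><s:Body>\n    <elementX/>\n</s:Body></s:Envelope>";
        String compact = "<s:Envelope xmlns:s=\"" + ns + "\"><s:Body><elementX/></s:Body></s:Envelope>";
        System.out.println("pretty:  " + bodyHasWhitespaceChildren(pretty));
        System.out.println("compact: " + bodyHasWhitespaceChildren(compact));
    }
}
```

Run it on the message exactly as it goes over the wire; a pretty-printed copy from a log viewer will tell you nothing.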

If you are a metro developer... let me say no more.
