Sunday, June 21, 2009

validating PunchOut cXML - pt 2.

Okay… so we covered a bit about POSRs in the last update.  And we kinda covered how they fit into an eCommerce scenario… but let’s go into that more.


As we said last time, the POSR (PunchOutSetupRequest) is sent from the Buyer to the Seller.  The Buyer authenticates by sending an Identity and a SharedSecret, along with a URL for the Seller to send a response back to.

Once the Buyer sends this authenticated message, the Supplier (Seller) reads it, authenticates the Buyer and then sends back a URL which points to the catalog of items for sale.

This message that is sent back is called a PunchOutSetupResponse.  The Response is usually nothing more than a status of “200” or “OK” and the URL for the catalog.  If the status is anything other than a “200” – that means an error occurred.  “400” series errors generally mean a problem with redirections or locating URLs, and “500” series errors are server errors – which include the ever popular (and usually the most common) error message “500 Authentication Error”.
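To make the response handling concrete, here is a small Python sketch that pulls the status and the catalog URL out of a PunchOutSetupResponse and buckets the status the way it is described above. The sample document, its URL and payloadID, and the function names are all made up for illustration; the element layout (Response, Status, StartPage, URL) is my reading of the cXML documentation, so check it against the DTD version you target.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body; the URL and payloadID are invented for this example.
SAMPLE_RESPONSE = """<cXML payloadID="456@supplier.example.com" timestamp="2009-06-21T10:00:00-08:00">
  <Response>
    <Status code="200" text="OK"/>
    <PunchOutSetupResponse>
      <StartPage>
        <URL>http://supplier.example.com/catalog?session=abc123</URL>
      </StartPage>
    </PunchOutSetupResponse>
  </Response>
</cXML>"""


def read_setup_response(xml_text):
    """Return (status_code, status_text, start_url) from a PunchOutSetupResponse."""
    root = ET.fromstring(xml_text)
    status = root.find("./Response/Status")
    if status is None:
        raise ValueError("no Response/Status element found")
    code = int(status.get("code"))
    text = status.get("text", "")
    url_node = root.find("./Response/PunchOutSetupResponse/StartPage/URL")
    has_url = url_node is not None and url_node.text
    start_url = url_node.text.strip() if has_url else None
    return code, text, start_url


def describe_status(code):
    """Bucket a status code into the series discussed in the post."""
    if 200 <= code < 300:
        return "success"
    if 400 <= code < 500:
        return "400-series error"
    if 500 <= code < 600:
        return "500-series error"
    return "unexpected status"


code, text, url = read_setup_response(SAMPLE_RESPONSE)
print(code, text, describe_status(code), url)
```

Anything other than the "success" bucket means there is no catalog URL worth redirecting the shopper to.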

A series 500 error can also mean that for whatever reason the cXML you sent didn’t process.  So it’s important that you have a valid cXML document to send and that’s where validation comes into things.

On the POSR, validation isn’t so difficult.  It’s basically just a From, a To, a Sender with Credentials, a BrowserFormPost URL – indicating where you want the Seller to respond to … and if you want to get fancy, some Extrinsic values.  (We’ll talk about Extrinsics some other time since there’s a ton you can do with them – and they’re almost a discussion all their own.)
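As a rough illustration of how few parts a POSR has, here is a Python sketch that assembles one with the standard library. The helper names (credential, build_posr), the UserAgent string and all sample values are hypothetical. The BuyerCookie element and the child order inside PunchOutSetupRequest come from my reading of the cXML DTD rather than from this post, so treat them as assumptions and verify against the DTD you validate with.

```python
import xml.etree.ElementTree as ET

# ElementTree does not write a DOCTYPE, so it is added by hand below.
DOCTYPE = '<!DOCTYPE cXML SYSTEM "http://xml.cXML.org/schemas/cXML/1.2.019/cXML.dtd">'


def credential(parent, domain, identity, shared_secret=None):
    """Append a Credential with an Identity and, optionally, a SharedSecret."""
    cred = ET.SubElement(parent, "Credential", domain=domain)
    ET.SubElement(cred, "Identity").text = identity
    if shared_secret is not None:
        ET.SubElement(cred, "SharedSecret").text = shared_secret
    return cred


def build_posr(payload_id, timestamp, buyer_id, supplier_duns, secret,
               buyer_cookie, post_url, extrinsics=None):
    """Build a minimal PunchOutSetupRequest document as a string."""
    root = ET.Element("cXML", payloadID=payload_id, timestamp=timestamp)
    header = ET.SubElement(root, "Header")
    credential(ET.SubElement(header, "From"), "AribaNetworkUserId", buyer_id)
    credential(ET.SubElement(header, "To"), "DUNS", supplier_duns)
    sender = ET.SubElement(header, "Sender")
    credential(sender, "AribaNetworkUserId", buyer_id, secret)
    ET.SubElement(sender, "UserAgent").text = "Example Buyer App"  # made-up agent name

    request = ET.SubElement(root, "Request")
    setup = ET.SubElement(request, "PunchOutSetupRequest", operation="create")
    ET.SubElement(setup, "BuyerCookie").text = buyer_cookie  # assumption: required by the DTD
    for name, value in (extrinsics or {}).items():
        ET.SubElement(setup, "Extrinsic", name=name).text = value
    form_post = ET.SubElement(setup, "BrowserFormPost")
    ET.SubElement(form_post, "URL").text = post_url

    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + DOCTYPE + "\n" + body


print(build_posr("1@buyer.example.com", "2009-06-21T10:00:00-08:00",
                 "admin@acme.com", "942888711", "abracadabra", "cookie-1",
                 "https://buyer.example.com/punchout/return",
                 {"CostCenter": "2323"}))
```

Note that only the Sender credential carries the SharedSecret; the From credential identifies the Buyer without it.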

Where validation is really important is on the OrderRequest, also called a PunchOutOrderRequest (POOR) document.  You see, in the flow of things you authenticate to the Seller and then the Seller sends you a response message, which has a URL to the catalog you’re going to be shopping at.

You do your shopping – all of this is recorded – and you send this information (your order – or PO) back to the Seller in the form of a PunchOutOrderRequest (POOR).  This has all the information needed to process the order. 

They look like this.  (Warning: a lot of scrolling involved here if you want to see it all … so be warned.)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE cXML SYSTEM "http://xml.cXML.org/schemas/cXML/1.2.019/cXML.dtd">
<cXML payloadID="3223232@ariba.acme.com"
      timestamp="1999-03-12T18:39:09-08:00" xml:lang="en-US">
 
Below is the Header, and this isn’t really going to be much different from the POSR we created.  It goes FROM the customer, shown below as “admin@acme.com”.  You’ll notice (and someone did email me about this) that the “domain” listed below is “AribaNetworkUserId”, and yes, you can change this.  It says “AribaNetworkUserId” because this is one of the actual examples from www.cxml.org.

Anyway – you’ll also note that there’s more than one Credential in the From field.  Yup – you can do that too.  When I have more time I’ll go into why this is a cool trick to use if you’re a Seller, but it’s one of those weird things that we don’t have time for today.  So – moving on to the To field and the Sender field – these are the same as the POSR.

   
<Header>
  <From>
    <Credential domain="AribaNetworkUserId">
      <Identity>admin@acme.com</Identity>
    </Credential>
    <Credential domain="AribaNetworkUserId" type="marketplace">
      <Identity>bigadmin@marketplace.org</Identity>
    </Credential>
  </From>
  <To>
    <Credential domain="DUNS">
      <Identity>942888711</Identity>
    </Credential>
  </To>
  <Sender>
    <Credential domain="AribaNetworkUserId">
      <Identity>admin@acme.com</Identity>
      <SharedSecret>abracadabra</SharedSecret>
    </Credential>
    <UserAgent>Ariba.com Network V1.0</UserAgent>
  </Sender>
</Header>

We then move to the Request tag, which says “Hey, this is a Request – it’s going to have tons of info in it.”  From there we get down to the OrderRequestHeader.  Notice the “orderID”?

That’s generally the actual P.O. ID from many of your Buyer software ERP systems (most of them, actually), and right next to it is the orderDate, also known in some software as the “createdDate”.

Next is the “type”, which can be “new”, “update” or “delete”.  And really that’s just the beginning: you have the ShipTo and the BillTo and the Contacts, and we haven’t even begun to talk about ItemDetails – which are the actual items.  A lot of these have rules to them, and there are some fields which are “optional but required”.  For example, the Zip Code field is not required by a lot of software that processes the cXML… unless of course the address has an isoCountryCode for the USA.  Then it becomes required.  Go ahead and take a look at this example and you can get some idea of how OrderRequests look.

This, btw, is a pretty short OrderRequest.  A real one may be many, many, many lines of text.

<Request deploymentMode="test">
  <OrderRequest>
    <OrderRequestHeader orderID="DO1234" orderDate="1999-03-12" type="new">
      <Total>
        <Money currency="USD">2.68</Money>
      </Total>
      <ShipTo>
        <Address>
          <Name xml:lang="en">Acme</Name>
          <PostalAddress name="default">
            <DeliverTo>Joe Smith</DeliverTo>
            <DeliverTo>Mailstop M-543</DeliverTo>
            <Street>123 Anystreet</Street>
            <City>Sunnyvale</City>
            <State>CA</State>
            <PostalCode>90489</PostalCode>
            <Country isoCountryCode="US">United States</Country>
          </PostalAddress>
        </Address>
      </ShipTo>
      <BillTo>
        <Address>
          <Name xml:lang="en">Acme</Name>
          <PostalAddress name="default">
            <Street>123 Anystreet</Street>
            <City>Sunnyvale</City>
            <State>CA</State>
            <PostalCode>90489</PostalCode>
            <Country isoCountryCode="US">United States</Country>
          </PostalAddress>
        </Address>
      </BillTo>
      <Tax>
        <Money currency="USD">0.19</Money>
        <Description xml:lang="en">CA Sales Tax</Description>
      </Tax>
      <Payment>
        <PCard number="1234567890123456" expiration="1999-03-12"/>
      </Payment>
      <Comments xml:lang="en-US">Anything well formed in XML can go here.</Comments>
    </OrderRequestHeader>
    <ItemOut quantity="2" requestedDeliveryDate="1999-03-12">
      <ItemID>
        <SupplierPartID>1233244</SupplierPartID>
      </ItemID>
      <ItemDetail>
        <UnitPrice>
          <Money currency="USD">1.34</Money>
        </UnitPrice>
        <Description xml:lang="en">hello</Description>
        <UnitOfMeasure>EA</UnitOfMeasure>
        <Classification domain="SPSC">12345</Classification>
        <ManufacturerPartID>234</ManufacturerPartID>
        <ManufacturerName>foobar</ManufacturerName>
        <URL>www.foo.com</URL>
      </ItemDetail>
      <Shipping trackingDomain="FedEx" trackingId="1234567890">
        <Money currency="USD">2.5</Money>
        <Description xml:lang="en-us">FedEx 2-day</Description>
      </Shipping>
      <Distribution>
        <Accounting name="DistributionCharge">
          <Segment type="G/L Account" id="23456" description="Entertainment"/>
          <Segment type="Cost Center" id="2323" description="Western Region Sales"/>
        </Accounting>
        <Charge>
          <Money currency="USD">.34</Money>
        </Charge>
      </Distribution>
      <Distribution>
        <Accounting name="DistributionCharge">
          <Segment type="G/L Account" id="456" description="Travel"/>
          <Segment type="Cost Center" id="23" description="Europe Implementation"/>
        </Accounting>
        <Charge>
          <Money currency="USD">1</Money>
        </Charge>
      </Distribution>
      <Comments xml:lang="en-US">Anything well formed in XML can go here.</Comments>
    </ItemOut>
  </OrderRequest>
</Request>
</cXML>

Sorry for the scrolling – but if you’re interested in learning about cXML you have to get your eyeballs dirty.  There’s no easy way to learn cXML Order Requests – other than looking at them.  The Order Request is the actual order – the items which are being purchased.  If you have to pick a place that you really need to make sure the cXML is valid this is one of the places you’re going to put at the top of the list. 
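The “optional but required” idea from earlier (a postal code that only becomes mandatory once the address says it is in the USA) is the kind of rule a DTD cannot express, so it has to be checked separately. Here is a minimal Python sketch of that one check. The function name and the two sample fragments are hypothetical, and the rule itself is the convention described in this post, not something taken from the cXML DTD.

```python
import xml.etree.ElementTree as ET

# Two made-up address fragments: one complete, one missing its PostalCode.
GOOD_ADDRESS = """<ShipTo><Address><PostalAddress name="default">
  <Street>123 Anystreet</Street><City>Sunnyvale</City><State>CA</State>
  <PostalCode>90489</PostalCode>
  <Country isoCountryCode="US">United States</Country>
</PostalAddress></Address></ShipTo>"""

BAD_ADDRESS = """<ShipTo><Address><PostalAddress name="default">
  <Street>123 Anystreet</Street><City>Sunnyvale</City><State>CA</State>
  <Country isoCountryCode="US">United States</Country>
</PostalAddress></Address></ShipTo>"""


def missing_postal_codes(xml_text):
    """Return the City of every US PostalAddress that has no PostalCode."""
    root = ET.fromstring(xml_text)
    problems = []
    for postal in root.iter("PostalAddress"):
        country = postal.find("Country")
        is_us = (country is not None
                 and country.get("isoCountryCode", "").upper() == "US")
        code = postal.find("PostalCode")
        has_code = code is not None and (code.text or "").strip() != ""
        if is_us and not has_code:
            problems.append(postal.findtext("City", default="(no city)"))
    return problems


print(missing_postal_codes(GOOD_ADDRESS))  # empty list: nothing to complain about
print(missing_postal_codes(BAD_ADDRESS))   # lists the city of the offending address
```

A check like this would run after DTD validation, since it assumes the document is already structurally sound.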


So how do we know all these values and what they mean and what the options are?  Well it’s actually pretty simple.  Remember that 2nd line of the OrderRequest?  The one identifying it as cXML? The one that if you were writing a parser for this in XML it kept failing on and you probably just deleted that line because you couldn’t get it to work right?  (Yeah … I know who you are. ;-))


<!DOCTYPE cXML SYSTEM "http://xml.cXML.org/schemas/cXML/1.2.019/cXML.dtd">


Well – that is probably the single most important line in your cXML.  It’s the one that tells computers where to get a copy of the DTD that you can validate the file against.  It is the line that identifies the cXML as cXML, and if you have deleted this line because your XML parser threw a hissy fit when it read it … you’ve just got well formed XML and it’s worthless to send to anyone as cXML.


I’m going to go on record right now and tell you that whatever your consultant that you paid a ton of money for tells you – you have to have this line.  Regardless of what your Supplier Network says or any other documentation you have says… if you are not including this line in what you send, then you are not sending anyone cXML, and there’s a good chance that you’re going to have errors when you connect to anyone who is cXML compliant, and that’s almost everyone who adheres to this standard.
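If you want a cheap guard against that line going missing before a document leaves your system, a few lines of Python will do as a pre-flight check. This is only a sketch: the function name and the regular expression are my own, and it checks that the DOCTYPE is present and names a DTD, nothing more. It is not a substitute for actually validating against that DTD.

```python
import re

# Matches a cXML DOCTYPE that points at a DTD through a SYSTEM identifier.
DOCTYPE_RE = re.compile(r"""<!DOCTYPE\s+cXML\s+SYSTEM\s+["']([^"']+\.dtd)["']\s*>""")


def find_cxml_dtd(xml_text):
    """Return the DTD URL named in the cXML DOCTYPE, or None if the line is missing."""
    match = DOCTYPE_RE.search(xml_text)
    return match.group(1) if match else None


with_doctype = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE cXML SYSTEM "http://xml.cXML.org/schemas/cXML/1.2.019/cXML.dtd">\n'
    '<cXML payloadID="1@example.com" timestamp="2009-06-21T10:00:00-08:00"/>'
)
without_doctype = '<?xml version="1.0" encoding="UTF-8"?>\n<cXML/>'

print(find_cxml_dtd(with_doctype))     # the DTD URL
print(find_cxml_dtd(without_doctype))  # None: well formed XML, but not cXML
```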


So … that is why I’m spending so much time on this subject and what a POSR is and what a POOR is and later on… what a POOM (PunchOutOrderMessage) is.


It’s also why I coded up a very short and very quick and dirty validator in the last blog update.  So if you don’t have it – go back and get that and see if you can get it running.


I included the source code so you can download a copy of Visual Studio VB Express (which is free from Microsoft at this link) and take a look at how this works.


Inside this little toy you’ll see a function I’ve borrowed from Microsoft’s MSDN samples on validation with a DTD.  I’ve named it cXML_validate – so it’s pretty easy to find.  Let’s look at it below:

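' Requires "Imports System.Xml" and "Imports System.Xml.Schema" at the top of the file.
Private Function cXML_validate(ByVal value As String) As Boolean
    ' Set the validation settings.
    Dim settings As XmlReaderSettings = New XmlReaderSettings()
    settings.ProhibitDtd = False ' On .NET 4.0 and later use: settings.DtdProcessing = DtdProcessing.Parse
    settings.ValidationType = ValidationType.DTD
    AddHandler settings.ValidationEventHandler, AddressOf ValidationCallBack
    cb_Success = True
    Try
        ' Create the XmlReader object; Using closes it when parsing is done.
        Using reader As XmlReader = XmlReader.Create(value, settings)
            ' Parse the file. Validation errors are reported through ValidationCallBack.
            While reader.Read()
            End While
        End Using
    Catch ex As Exception
        ' Anything that is not a validation error (missing file, malformed XML) lands here.
        cb_Success = False
        MsgBox(ex.Message)
    End Try

    Return cb_Success
End Function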

Pretty straightforward, right?  It creates an XmlReader, it sets the settings for the reader and it has a handler that handles the ValidationCallBack for when the validation fails.  The rest is very simple: get the file, read through the file, and if there’s an error (aside from the validation callback) toss it.


Now, what throws a lot of people (especially with .NET code) is that when Microsoft created the XML reader system, they assumed that it would be used for XML.  (Silly Microsoft – what’s next?  Are they going to create an operating system that’s used for … what it’s intended?)


Anyway… the reason why this is important is that XML and cXML are NOT the same, and that very lonely line we discussed earlier points to a DTD file.  That DTD file has all the cool things in it that you’ll need to validate the file.  Unfortunately, eons ago there were a bunch of discussions on standards.  Some people thought that you should use a DTD file to validate an XML file against a schema, and some people thought you should use an XSD file.


I’ll cut to the chase here and simplify this for you – the end result is the people that felt DTD files should be used lost.  So Microsoft being part of the group that felt that XSD files should be used kind of created that as the process around which their XML parsers should be used.  They did consider the use of DTD files, and because of this DTD files can be used to validate a document but you need to know how to turn that on or you’re going to get an error similar to this:


[screenshot of the error message]


So I’ll explain how you turn this feature on in the code.  If you look at the function above, there are two lines:


settings.ProhibitDtd = False
settings.ValidationType = ValidationType.DTD


Now the first one – settings.ProhibitDtd=False, sets the fact that a DTD will be used in the reader and it’s okay and don’t give us that silly looking error.  The second one of course sets the validation type to DTD file. 


In the sample project you’ll also find this property declared at the top of the Class:

Public Property ProhibitDtd() As Boolean
    Get
        Return False
    End Get
    Set(ByVal value As Boolean)
        ' Ignored on purpose: this sample always allows DTD processing.
    End Set
End Property


Strictly speaking, XmlReaderSettings already has its own ProhibitDtd property, so the reader never looks at this one and you can leave it out.  The line that does the work is settings.ProhibitDtd = False in cXML_validate.  Yup … it’s that simple.  This, believe it or not, is something that completely throws a lot of devs who try to use the XmlReader to read a cXML file and validate it.  It would, I might add, help if they explained that it’s necessary to do this, but apparently that slipped someone’s mind.


Getting back to validation… we need to do something else that you’ll find in the code, which is set a success flag to TRUE before we start reading, so the function returns FALSE if the validation callback handler is triggered.  We do this because the only time the handler is triggered is if there’s a validation error.


Since we’ll need to pass this between the Function and the Callback, we want to add the following as a class-level field so the value is accessible to both.  We’ll call the value cb_Success and make it Boolean, so it’s either TRUE or FALSE, success or failure.

Private cb_Success As Boolean

So we want the callback handler to hit, display the error, write that out for us, set itself to False and keep going.  Here’s the code for the validationcallback handler:

Private Sub ValidationCallBack(ByVal sender As Object, _
                               ByVal args As ValidationEventArgs)
    ' Display the validation error. This is only called on error.
    cb_Success = False
    If (args.Severity = XmlSeverityType.Warning) Then
        RichTextBox1.AppendText("No schema found to enforce validation.")
    ElseIf (args.Severity = XmlSeverityType.Error) Then
        RichTextBox1.AppendText("Validation error: ")
    End If

    If Not (args.Exception Is Nothing) Then ' schema validation error
        ' SourceUri tells us whether the error is in our file or in the DTD.
        RichTextBox1.AppendText("DTD Schema error (File: " + args.Exception.SourceUri + ")" + vbCrLf)
        RichTextBox1.AppendText(vbCrLf)
        RichTextBox1.AppendText(String.Format(vbLf & "Validation event ({0}, {1}):" & vbLf & "{2}", _
            args.Exception.LineNumber, args.Exception.LinePosition, args.Message) + vbCrLf)
        RichTextBox1.AppendText(args.Message.ToString + vbCrLf)
        Label1.Text = "Warning"
    End If
End Sub


Once again – it’s really straightforward.  We first set the cb_Success value to False, since we set it to TRUE before we started.  Next we check the severity level of the validation error.  If it’s a warning, it generally means that there’s no schema (DTD) found to enforce the validation.  If this is a full blown error – well – we want the message to be of use.


So we then print the URI of the file the error came from.  When the reader validates cXML it loads up the DTD file too, which means there is a good chance that we’ll see line numbers waaaaay past anything in our cXML file.  If we do, the error kicked back is something we can find in the DTD, which has been loaded already.  (To debug these errors you’ll want to download the DTD and look for the line number and position in it … which is why we have it pull the URI.)


Next we pull the line number and position of the validation error within the file (remember: some errors will be reported within the DTD), and finally we display the error message so we have that too.


Here are a few screenshots of actual errors you’ll see:


[screenshot of a validation error]
(Indicates that you have a malformed or incomplete DOCTYPE header tag – most likely someone removed the DTD or the entire DOCTYPE tag.)


[screenshot of a validation error]
This is kind of a confusing error message, but the gist of it is you have an Attribute or Element that is not conforming properly.  You can look up a specific tag in the DTD from the line numbers provided and see how it’s expected to be set up.


 


The best reason I’ve found for why the error messages from the validators aren’t more clear – is because of the flexibility of XML and of the actual specifications.  Microsoft often takes a bit of a bashing for error messages and code errors related to them but in fact – this is actually how the spec is designed.  Humorously enough, while researching for examples for this error I ran across this in a forum posting:


Here’s a discussion with a response from Microsoft on why you see the above error:







 Hello Michael,
This is not a bug in XmlValidatingReader nor it is a bug in IE6 or Firefox. The XML spec gives the parser a choice to error. See section 3.2.1 of the XML spec in Validity constraint: Proper Group/PE Nesting (http://www.w3.org/TR/REC-xml/#sec-element-content): "For interoperability, if a parameter-entity reference appears in a choice, seq, or Mixed construct, its replacement text SHOULD contain at least one non-blank character, and neither the first nor last non-blank character of the replacement text SHOULD be a connector (| or ,)."
Thanks,

-Helena Kupkova, Microsoft


And the reply:

Thank you for the reply.
As far as I can tell, the XML spec doesn't specify that it's an error to violate a constraint indicated by the keyword SHOULD, SHOULD NOT, RECOMMENDED, MAY, or OPTIONAL. Thus, isn't XmlValidatingReader nonconforming when it reports that test.xml is in error?

-Michael

 


As you can see – the original assumption was that the error came from the XmlValidatingReader code, when in fact it’s reporting exactly what it’s supposed to – and the spec tells us exactly what should go there for this error, namely:


"For interoperability, if a parameter-entity reference appears in a choice, seq, or Mixed construct, its replacement text SHOULD contain at least one non-blank character, and neither the first nor last non-blank character of the replacement text SHOULD be a connector (| or ,)."

And as you can see from Michael’s reply … a lot of confusion gets generated by who should conform to what and the use of words like “should” and “optional” … and this is why specs should be clear.  Because everyone has the right to their own opinion – and if given the chance they’ll use their opinion and do something silly with it just because they “think” it should be a certain way.


So – in this case Michael has an issue with the word “should”, and Michael “should” realize that when the spec says you “should” do something and gives the parser the choice to raise an error, then in practice yes… you have to do it – even if it’s inconvenient and makes you write more code.   Lesson:  Don’t shoot the messenger – or better yet – read the specs.


These are just a few of the error messages that will be generated, because we’re now working with the DTD instead of just treating this like some well formed XML with no rules or structure for processing behind it.  The downside to this is we now have to actually adhere to those rules (which we had to anyway once the files got processed) and can’t be blind and pretend they don’t exist.


The upside to this is – our files are far more likely to process without error, they are going to process with far more efficiency since they’re valid and don’t have to be scrubbed or massaged to get the data out of them, and they are far less likely to kick back an error and not get processed at all.


But a caveat or two about what you read here and the code provided:


   - First of all – this is very simplified.  You can write validators that are capable of some amazing things.  It also has errors in the code.  Well, let’s not say errors – let’s say this is the code I had to write because the code I wrote was too close to the code I wrote for someone else… and it was late when I did it.  So there are some oddities in it that you can clean up.  For example, just above the Label1.Text = "Warning" line in the validation callback handler, we don’t need the line that writes args.Message to RichTextBox1.  It’s already done in the line above it.


Like I said – it was late when I wrote that and it’s hardly my best work.  But it does work – so feel free to edit it change it and play with it. 


This is just to get you started and I want to stress that. 


- Second of all – I’m not the expert on this I wish I was.  I’m good with the cXML, but I’m not with cXML.org or any official representative of it.  I have no authority beyond being someone who works with it every day and sees a lot of problems caused by invalid cXML.  In fact, almost 80% of my workload is due to cXML which will not pass validation.  And these files come from some of the biggest names in the IT industry so – don’t feel bad if you don’t “get cXML” – apparently most people don’t.  I know I certainly don’t and I repair their files for them.


You can study the subject of eCommerce for years and still never know it all.  This post is not meant to be the final word on the subject by any means.  It is literally just the beginnings of this, and should not be used to quote, refer to or otherwise say, “In this column they said…”.


Everything here is backed by the actual cXML.org documentation or the W3, or XML.org.  Quoting me is silly when you have the big guns like those you can use. 


- Final Caveat:  One strange thing you will note…


You may get a chuckle out of the fact that the cXML examples on cXML.org will not pass a validation using their own DTD.  There are a couple of reasons why – but the basics are that they’re written as very generalized, generic examples and are not meant to pass validation.


Well I’m about ready to turn into a pumpkin… so I’m going to leave this off for now and we’ll cover more validation errors in a later post. 


Until then… take care.

6 comments:

Joshua Nelson said...

Interesting topic :) Thanks for the info Robert

r a jakobson said...

thanks - not sure how interesting it is. It's pretty dry for most people but hey - that's why I'm doing this. No one else is telling anyone how to, it seems. :-D
