17 XML Validation
The gSOAP XML parser applies basic rules to validate content. Constraints are not automatically verified unless explicitly set using flags. This helps to avoid interoperability problems with toolkits that do not strictly enforce validation rules. In addition, we cannot always use strict validation for SOAP RPC encoded messages, since SOAP RPC encoding adopts a very loose serialization format. Validation constraints are enabled with the SOAP_XML_STRICT input mode flag set, e.g. with soap_set_imode(soap, SOAP_XML_STRICT) or soap_new(SOAP_XML_STRICT), see Section 9.12 for the complete list of flags.
17.1 Occurrence Constraints
17.1.1 Default Values
Default values can be defined for optional elements and attributes. The default value is used when the element or attribute is not present in the parsed XML. See also Section 7.5.7 and the examples in the subsections below. Default values must be of a primitive type (integer, float, string, enumeration, and so on). They can be specified for struct and class members, as shown in the example below:
|
struct ns__MyRecord
{
  int n = 5;             // optional element with default value 5
  char *name = "none";   // optional element with default value "none"
  @enum ns__color { RED, WHITE, BLUE } color = RED; // optional attribute with default value RED
};
|
Upon deserialization of absent data, these members will be set accordingly.
When classes are instantiated with soap_new_ClassName the instance will
be initialized with default values.
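The effect of default values on deserialization can be pictured with a small stand-alone C++ sketch. It does not use the gSOAP API: the MyRecord type, the ParsedItems map and the from_parsed function are hypothetical names introduced only for illustration, and the map stands in for whatever the parser found in the XML.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical stand-in for a generated type: the member initializers play
// the role of the default values declared in the header file.
enum class Color { RED, WHITE, BLUE };

struct MyRecord {
    int n = 5;
    std::string name = "none";
    Color color = Color::RED;
};

// Toy parser result: name -> text value, only for items present in the XML.
typedef std::map<std::string, std::string> ParsedItems;

// Start from the defaults and overwrite only the members that were present.
MyRecord from_parsed(const ParsedItems& items) {
    MyRecord rec;  // all members hold their default values here
    ParsedItems::const_iterator it = items.find("n");
    if (it != items.end()) rec.n = std::atoi(it->second.c_str());
    it = items.find("name");
    if (it != items.end()) rec.name = it->second;
    it = items.find("color");
    if (it != items.end()) {
        if (it->second == "WHITE") rec.color = Color::WHITE;
        else if (it->second == "BLUE") rec.color = Color::BLUE;
        else rec.color = Color::RED;
    }
    return rec;
}
```

With only a name item present, n keeps the value 5 and color keeps RED, which mirrors the behavior described above for absent data.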
17.1.2 Elements with minOccurs and maxOccurs Restrictions
To force the validation of minOccurs and maxOccurs constraints, the SOAP_XML_STRICT
input mode flag must be set.
The minOccurs and maxOccurs constraints are specified for fields of a
struct and members of a class in a header file using the following
syntax:
| Type fieldname [minOccurs[:maxOccurs]] [= value] |
The minOccurs and maxOccurs values must be integer literals. A default
value can be provided. When minOccurs is not specified, it is assumed to
be zero.
For example:
|
struct ns__MyRecord
{
  int n = 5;                  // element with default value 5, minOccurs=0, maxOccurs=1
  int m 1;                    // element with minOccurs=1
  int __size 0:10;            // sequence <item> with minOccurs=0, maxOccurs=10
  int *item;
  std::vector<double> nums 2; // sequence <nums> with minOccurs=2, maxOccurs=unbounded
};
struct arrayOfint
{
  int *__ptr 1:100;           // minOccurs=1, maxOccurs=100
  int __size;                 // number of array elements pointed to by __ptr
};
|
Pointer-based struct fields and class members are allowed to be nillable when minOccurs is zero.
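As a rough illustration of what an occurrence check amounts to, the following stand-alone C++ sketch compares an element count with a minOccurs/maxOccurs pair. It is not gSOAP's implementation: the Occurs struct, the kUnbounded marker and the occurrence_ok function are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical marker that models maxOccurs="unbounded".
const long kUnbounded = -1;

// Hypothetical descriptor of an occurrence constraint.
struct Occurs {
    long min_occurs;
    long max_occurs;  // kUnbounded when there is no upper limit
};

// Returns true when `count` occurrences satisfy the constraint.
bool occurrence_ok(const Occurs& c, std::size_t count) {
    if (count < static_cast<std::size_t>(c.min_occurs))
        return false;
    if (c.max_occurs != kUnbounded &&
        count > static_cast<std::size_t>(c.max_occurs))
        return false;
    return true;
}
```

Under these assumptions, the `0:10` sequence above corresponds to `Occurs{0, 10}`, and `nums 2` corresponds to `Occurs{2, kUnbounded}`.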
17.1.3 Required and Prohibited Attributes
Similar to the minOccurs and maxOccurs annotations defined in the previous
section, attributes in a struct or class can be annotated with occurrence
constraints to make them optional (0), required (1), or prohibited (0:0).
Default values can be assigned to optional attributes.
For example:
|
struct ns__MyRecord
{
  @int m 1;    // required attribute (occurs at least once)
  @int n = 5;  // optional attribute with default value 5
  @int o 0;    // optional attribute (may or may not occur)
  @int p 0:0;  // prohibited attribute
};
|
Remember to set the SOAP_XML_STRICT input mode flag to
enable the validation of attribute occurrence constraints.
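The mapping from occurrence annotations to attribute use can be sketched in stand-alone C++ as follows. This is an illustrative model only, not gSOAP code: AttrUse, attr_use and attr_ok are hypothetical names, and an attribute without an explicit upper bound is assumed to have a maximum of 1.

```cpp
#include <cassert>

// Hypothetical classification of an attribute's use.
enum class AttrUse { Optional, Required, Prohibited };

// Map an annotation to a use:
//   "1"   (min 1, max 1) -> required
//   "0"   (min 0, max 1) -> optional
//   "0:0" (min 0, max 0) -> prohibited
AttrUse attr_use(int min_occurs, int max_occurs) {
    if (max_occurs == 0) return AttrUse::Prohibited;
    return min_occurs >= 1 ? AttrUse::Required : AttrUse::Optional;
}

// Returns true when the presence or absence of the attribute is acceptable.
bool attr_ok(AttrUse use, bool present) {
    switch (use) {
        case AttrUse::Required:   return present;
        case AttrUse::Prohibited: return !present;
        case AttrUse::Optional:   return true;
    }
    return true;
}
```

In this model a missing required attribute and a present prohibited attribute are the only two failing cases.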
17.2 Value Constraints
17.2.1 Data Length Restrictions
A schema simpleType is defined with a typedef that derives a new simpleType from a primitive base type. For example:
| typedef int time__seconds; |
This defines the following schema type in time.xsd:
|
<simpleType name="seconds">
  <restriction base="xsd:int"/>
</simpleType>
|
A complexType with simpleContent is defined with a wrapper struct/class:
|
struct time__date
{
  char *__item;  // some custom format date (restriction of string)
  @enum time__zone { EST, GMT, ... } zone;
};
|
This defines the following schema type in time.xsd:
|
<complexType name="date">
  <simpleContent>
    <extension base="xsd:string">
      <attribute name="zone" type="time:zone" use="optional"/>
    </extension>
  </simpleContent>
</complexType>
<simpleType name="zone">
  <restriction base="xsd:string">
    <enumeration value="EST"/>
    <enumeration value="GMT"/>
    ...
  </restriction>
</simpleType>
|
Data value length constraints of simpleTypes and complexTypes with simpleContent are defined as follows.
|
typedef char *ns__string256 0:256;  // simpleType restriction of string with max length 256 characters
typedef char *ns__string10 10:10;   // simpleType restriction of string with length of 10 characters
typedef std::string *ns__string8 8; // simpleType restriction of string with at least 8 characters
struct ns__data                     // simpleContent wrapper
{
  char *__item :256;                // simpleContent with at most 256 characters
  @char *name 1;                    // required name attribute
};
struct time__date                   // simpleContent wrapper
{
  char *__item :100;                // simpleContent with at most 100 characters
  @enum time__zone { EST, GMT, ... } zone = GMT;
};
|
Remember to set the SOAP_XML_STRICT input mode flag to
enable the validation of value length constraints.
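A length restriction boils down to counting characters and comparing the count with the bounds. The stand-alone C++ sketch below shows one way to do this. It assumes the string is UTF-8 encoded and that lengths are counted in characters rather than bytes; utf8_length, length_ok and kNoLimit are hypothetical names, and this is not how gSOAP itself is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical marker for "no bound given" (e.g. ":256" has no minimum).
const long kNoLimit = -1;

// Count characters in a UTF-8 string by skipping continuation bytes (10xxxxxx).
std::size_t utf8_length(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s)
        if ((c & 0xC0) != 0x80) ++n;
    return n;
}

// Returns true when the character count lies within [min_len, max_len].
bool length_ok(const std::string& value, long min_len, long max_len) {
    const std::size_t n = utf8_length(value);
    if (min_len != kNoLimit && n < static_cast<std::size_t>(min_len))
        return false;
    if (max_len != kNoLimit && n > static_cast<std::size_t>(max_len))
        return false;
    return true;
}
```

In this sketch `10:10` maps to `length_ok(v, 10, 10)`, `8` maps to `length_ok(v, 8, kNoLimit)`, and `:256` maps to `length_ok(v, kNoLimit, 256)`.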
17.2.2 Value Range Restrictions
Similar to data length constraints for string-based data, integer data value range
constraints of numeric simpleTypes and complexTypes with simpleContent are defined as
follows.
|
typedef int ns__int10 0:10;               // simpleType restriction of int 0..10
typedef LONG64 ns__long -1000000:1000000; // simpleType restriction of long64 -1000000..1000000
typedef float ns__float100 -100:100;      // simpleType restriction of float -100..100
struct ns__data                           // simpleContent wrapper
{
  int __item 0:10;                        // simpleContent range 0..10
  @char *name 1;                          // required name attribute
};
|
Currently the value bounds must be integer valued. Therefore, floating
point ranges are limited to integer bounds. This may change in future
releases.
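The consequence of integer-valued bounds for floating point data can be seen in a short stand-alone C++ sketch. The functions int_range_ok and float_range_ok are hypothetical helpers written for this illustration, not part of gSOAP. Both take inclusive integer bounds, matching the restriction described above.

```cpp
#include <cassert>

// Inclusive range check for integer values with integer bounds.
bool int_range_ok(long long value, long long lo, long long hi) {
    return value >= lo && value <= hi;
}

// Inclusive range check for floating point values. The bounds are still
// integers, so a range such as -0.5..0.5 cannot be expressed this way.
// NaN fails both comparisons and is therefore rejected.
bool float_range_ok(double value, long long lo, long long hi) {
    return value >= static_cast<double>(lo) &&
           value <= static_cast<double>(hi);
}
```

For the `ns__float100` example, 99.5 would pass and 100.5 would fail under this model.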
17.2.3 Pattern Restrictions
Patterns can be defined for simpleType content. However, patterns are
currently not enforced during validation, though this may change in
future releases.
To associate a pattern with a simpleType, you can define a simpleType with a typedef and a pattern string:
|
typedef int time__second "[1-5]?[0-9]|60";
|
This defines the following schema type in time.xsd:
|
<simpleType name="second">
  <restriction base="xsd:int">
    <pattern value="[1-5]?[0-9]|60"/>
  </restriction>
</simpleType>
|
The pattern string MUST contain a valid regular expression.
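Since patterns are not enforced by the validator, an application that needs them can check values itself. The stand-alone C++ sketch below uses std::regex from the standard library. It assumes that the ECMAScript grammar is close enough to the XML schema regular expression syntax for the pattern at hand, which does not hold for every schema pattern. The function matches_pattern and the pattern used in the usage note are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true when the whole value matches the pattern. Schema patterns are
// implicitly anchored at both ends, so regex_match (not regex_search) is used.
// Throws std::regex_error when the pattern is not a valid regular expression.
bool matches_pattern(const std::string& value, const std::string& pattern) {
    const std::regex re(pattern, std::regex::ECMAScript);
    return std::regex_match(value, re);
}
```

For instance, with the made-up pattern `[A-Z]{2}[0-9]{4}`, the value `AB1234` is accepted while `AB12345` and `xAB1234` are rejected, because the entire value has to match.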
17.3 Element and Attribute Qualified/Unqualified Forms
Struct, class, and union members represent elements and attributes that are
automatically qualified or unqualified depending on the schema element and
attribute default forms. The gSOAP engine always validates the prefixes of
elements and attributes. When a namespace mismatch occurs, the element or
attribute is not consumed, which can lead to a validation error (unless the
complexType is extensible or SOAP_XML_STRICT is turned off).
See Section 10.3 for details on
the struct/class/union member identifier translation rules.
Consider for example:
|
//gsoap ns schema elementForm: qualified
//gsoap ns schema attributeForm: unqualified
struct ns__record
{
  @char * type;
  char * name;
};
|
Here, the ns__record struct is serialized with qualified element name and unqualified attribute type:
|
<ns:record type="...">
  <ns:name>...</ns:name>
</ns:record>
|
The "colon notation" for struct/class/union member field names is used to
override element and attribute qualified or unqualified forms.
To override the form for individual members that represent elements and
attributes, use a namespace prefix and colon with the member name:
|
//gsoap ns schema elementForm: qualified
//gsoap ns schema attributeForm: unqualified
struct ns__record
{
  @char * ns:type;
  char * :name;
};
|
where name is unqualified and type is qualified:
|
<ns:record ns:type="...">
  <name>...</name>
</ns:record>
|
The colon notation is a syntactic notation used only in the gSOAP header file
syntax. It is not translated to the C/C++ output.
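The form-override rules can be summarized in a small stand-alone C++ sketch that derives the XML name from a header-file member name. It is a simplified model rather than what soapcpp2 does internally: xml_name is a hypothetical helper, and the double-underscore prefix convention is deliberately left out.

```cpp
#include <cassert>
#include <string>

// Derive the XML name of an element or attribute from its member name:
//   "ns:type" -> "ns:type"  (explicitly qualified by the colon notation)
//   ":name"   -> "name"     (explicitly unqualified by the colon notation)
//   "name"    -> the schema default form decides
std::string xml_name(const std::string& member,
                     const std::string& schema_prefix,
                     bool default_qualified) {
    const std::string::size_type colon = member.find(':');
    if (colon == std::string::npos)  // no override: apply the default form
        return default_qualified ? schema_prefix + ":" + member : member;
    if (colon == 0)                  // leading colon: force unqualified
        return member.substr(1);
    return member;                   // prefix given: force qualified
}
```

With elementForm qualified and attributeForm unqualified, this model yields `ns:name` and `type` for plain members, and `name` and `ns:type` for the members `:name` and `ns:type`, in line with the two XML fragments above.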
The colon notation does not avoid name clashes between members. For example:
|
struct x__record
{
  @char * name;
  char * x:name;
};
|
results in a redefinition error, since both members have the same name. To avoid name clashes, use an underscore suffix:
|
struct x__record
{
  @char * name;
  char * x:name_;
};
|
Note that the namespace prefix convention can be used instead:
|
struct x__record
{
  @char * name;
  char * x__name;
};
|
which avoids the name clash. However, the resulting schema is different
since the last example generates a global name element definition that is
referenced by the local element.
More specifically, the difference between the namespace prefix convention with double underscores
and colon notation is that the namespace prefix convention generates schema
element/attribute references to elements/attributes at the top level,
while the colon notation only affects the local element/attribute namespace
qualification by form overriding. This is best illustrated by an example:
|
struct x__record
{
  char * :name;
  char * x:phone;
  char * x__fax;
  char * y__zip;
};
|
which generates the following x.xsd schema:
|
<complexType name="record">
  <sequence>
    <element name="name" type="xsd:string" minOccurs="0" maxOccurs="1" nillable="true" form="unqualified"/>
    <element name="phone" type="xsd:string" minOccurs="0" maxOccurs="1" nillable="true" form="qualified"/>
    <element ref="x:fax" minOccurs="0" maxOccurs="1"/>
    <element ref="y:zip" minOccurs="0" maxOccurs="1"/>
  </sequence>
</complexType>
<element name="fax" type="xsd:string"/>
|
and the y.xsd schema contains:
| <element name="zip" type="xsd:string"/> |
