Apidoc XML Schema

This document describes the XML schema used by systems at UiB to describe their external APIs and the relationships between them.

For this schema we'll try to reuse parts of Docbook where it makes sense, but Docbook is quite complex, so we try to simplify things so that it will not be too unpleasant to maintain these documents by hand. If we later find value in using various Docbook tools to process this information, we should be able to set up transformations into "real" Docbook files. For an example of what Docbook looks like, see the docbook.org refentry example.

The overall structure of the apidoc XML document is as follows:

apidoc
    system*
        @refname
        @cmdb?
        abstract
        class*
            @name
        uses*
            @ref
            @dataflow?
            #text
        link*
            @href
            @role?
            #text
        description?
        api*
            @refname
            @dataflow?
            abstract
            synopsis?
                file*
                    @host = "localhost"
                    @path = api.refname
                    @content-type = "text/plain"
                    @charset = "UTF-8"
                    content | csv-file | xml-file
                httpservice*
                    @method = "GET"
                    @enctype = "application/x-www-form-urlencoded"
                    @content-type?
                    @action
                    httpheader*
                    content?
                    param*
                        @name
                        @type = "string"
                        @default?
                        @optional = "false"
                        #text
                    httpresponse*
                        @code = "200"
                        @message?
                        @content-type?
                        @label?
                        httpheader*
                        content?
                dbtable*
                    @database?
                    @viewname = api.refname
                    column*
                dbview*
                    @database?
                    @viewname = api.refname
                    column*
                command*
                    #text
            description?
            examples?
                @format = "markdown"
                #text | #xhtml

abstract
    #text

description
    @format = "markdown"
    #text | #xhtml

httpheader
    @name
    #text

column
    @name
    @type = "string"
    #text

content
    #text

csv-file
    @separator = ","
    @eol = "\n"
    @escape = '"'
    column*

xml-file
    @schema?
    #xhtml

Indentation shows parent/kid relationships. The root element is always apidoc. The '@'-prefixed lines show key attributes; the other lines represent elements. The suffix '?' is used for optional elements or attributes (0 or 1 occurrence). The suffix '*' is used for optional elements that might repeat (0 or more occurrences). The suffix '+' is used for elements that might repeat (1 or more occurrences). Attribute values with defaults are optional as well. The default value is specified after the '=' sign. The #text marker denotes that the parent element takes text content.
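
As an illustration of the structure above, here is a minimal sketch of an apidoc document. The refnames "sebra.sws" and "person" are borrowed from the link examples later in this document; the abstract texts and the URL are made up for the example, and the XML declaration is an assumption, since the schema does not say whether one is expected.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical content; only the element and attribute names come from the schema. -->
<apidoc>
  <system refname="sebra.sws">
    <abstract>Web services for looking up persons</abstract>
    <api refname="person">
      <abstract>Look up a single person</abstract>
      <synopsis>
        <httpservice action="http://www.example.com/user/{uid}">
          <param name="uid">The Unix user name of the given person</param>
          <httpresponse content-type="text/xml"/>
        </httpservice>
      </synopsis>
    </api>
  </system>
</apidoc>
```

All attributes with defaults are left out here, so the service uses method "GET", the parameter has type "string", and the response has code "200".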

The apidoc XML Elements

This section documents each of the element types that can be used in apidoc documents.

<abstract>

This element contains a short text that describes the purpose of a system or an api. No kids besides the text.

<api>

This element denotes a single api entry of the parent system. The attribute refname is required. Note that a dot in the api refname is not used to form hierarchies; it is just a regular character.

The attribute dataflow can be used to describe the direction that data mainly flows over the API. Its value can be one of "pull", "push", "both". The default is "pull".

Allowed kids are abstract, synopsis, description, and examples.
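
A sketch of an api entry with dataflow set to "push" instead of the default "pull". The refname and the texts are invented for the example. Note that the dot in "person.changes" does not make this entry a child of any "person" api.

```xml
<!-- Hypothetical api entry; goes inside a system element. -->
<api refname="person.changes" dataflow="push">
  <abstract>Notifications about changed person records</abstract>
  <description>One notification is produced per updated record.</description>
</api>
```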

<apidoc>

Root element. The kids must be system elements.

<class>

Kid of system and can be repeated. Required attribute is name. The name should be an identifier string without spaces.

This is used to tag or categorize systems in various ways. The classes "client", "server" and "external" have predefined meanings.

Systems with APIs or referenced by uses are implicitly of class "server". Systems with uses are implicitly of class "client".
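
A sketch of how class elements might be used. The system "payroll" and the tag "legacy" are hypothetical; "external" is one of the predefined classes.

```xml
<!-- Hypothetical system; "legacy" is a made-up tag. -->
<system refname="payroll">
  <abstract>Payroll system</abstract>
  <class name="external"/>
  <class name="legacy"/>
  <uses ref="sebra.sws">Fetches person data</uses>
</system>
```

Because this system has a uses element it is implicitly of class "client", and the referenced "sebra.sws" system is implicitly of class "server", so neither class needs to be spelled out.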

<content>

Textual content; no kid elements accepted. This element can be a kid of file, httpservice or httpresponse.

Since this is the text that shows in the synopsis section it should only show a short extract of the typical structure of the content. For complete examples use the examples section instead.

<description>

Element that contains text that more fully describes a system or an api.

The optional format attribute is used to denote the markup language used for the description text. Allowed values are "xhtml" and "markdown". The default is "markdown" if the description element only contains text and "xhtml" if there are sub-elements contained within.

When format is "xhtml" then xhtml elements can be freely mixed with text content. When format is "markdown" then there can be no kids besides the text content. In this case it's advisable to embed the text in <![CDATA[...]]> sections as any HTML fragments embedded in the markdown code would otherwise need to be "escaped".

In the text, strings of the form "apidoc:system-refname" or "apidoc:system-refname/api-refname" can be used to create links to other systems and apis. We basically hijack the "apidoc" URI scheme. Examples:

apidoc:sebra.sws
apidoc:sebra.sws/person

The first one is a link to the "sebra.sws" system. The second is a link to the "person" api within the "sebra.sws" system.
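
A sketch of a markdown description wrapped in a CDATA section, so that the embedded HTML fragment does not need escaping. The wording is invented for the example, and writing the "apidoc:" strings bare in the running text is an assumption about how the links are meant to appear.

```xml
<!-- Hypothetical description; goes inside a system or api element. -->
<description format="markdown"><![CDATA[
Returns basic information about a person.

This api is part of apidoc:sebra.sws; see also apidoc:sebra.sws/person.
Names are returned as plain text, so markup such as <b>bold</b> is
never part of the result.
]]></description>
```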

<examples>

Basically more description text but this is meant to be used to describe examples of how the API is used. Attributes and behaviour otherwise as for description.

<file>

This element occurs as a kid of synopsis and represents information exchanged via a file.

The attribute path is an absolute or relative path name of the file. Its value defaults to the refname of the corresponding api.

The attribute host is the name of the machine where the file resides. The default value is localhost.

The attribute content-type specifies the format of the file content. We use parameterless MIME-style content-type strings as values here. The default is text/csv if the <file> has a <csv-file> kid. The default is text/xml if the <file> has an <xml-file> kid. Otherwise the default is text/plain.

The attribute charset specifies what character set the file is encoded in. For text files the default is UTF-8.
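
A sketch of a file synopsis for a CSV export. The host name, path and column names are hypothetical, and using the column text as a short description of the column is an assumption, since the schema only says that column takes text.

```xml
<!-- Hypothetical file export; goes inside an api element. -->
<synopsis>
  <file host="files.example.com" path="/export/persons.csv">
    <!-- content-type defaults to text/csv because of the csv-file kid -->
    <csv-file separator=";">
      <column name="uid">Unix user name</column>
      <column name="name">Full name</column>
    </csv-file>
  </file>
</synopsis>
```

The charset attribute is left out, so the file is taken to be encoded in UTF-8.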

<httpheader>

Textual content; no kid elements accepted. Required attribute is name. Example:

<httpheader name="Content-Type">text/html</httpheader>

This is used to list significant headers in the request or response. Since the "Content-Type" header can also be defined as an attribute of the httpresponse it will be an error to specify it both ways.

<httpresponse>

Used to specify what kind of response to expect. This element should be a kid of httpservice, and it is repeated for each of the different kinds of responses the request can generate.

The attribute code specifies the HTTP status code of the response. The default value is "200".

The attribute message specifies the HTTP status line message of the response. The default is derived from the code. For instance, with a code of "200" this will be "OK" and with a code of "404" this will be "Not Found".

The attribute content-type specifies what kind of content will be found in the body of the message. There is no default.

The attribute label specifies a short textual string used to label this response relative to the other responses. Could be a short string like "on failure".

All attributes are optional.

When there is a need to specify other headers, use one or more httpheader kid elements, and optionally a content element to show the overall structure of the content.
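
A sketch of a service with several labelled responses, one of which carries an extra header. The endpoint, the labels other than "on failure", and the header value are hypothetical.

```xml
<!-- Hypothetical endpoint and responses. -->
<httpservice action="http://www.example.com/user/{uid}/photo">
  <param name="uid">The Unix user name of the given person</param>
  <httpresponse content-type="image/jpeg"/>
  <httpresponse code="302" label="when the photo is stored elsewhere">
    <httpheader name="Location">http://photos.example.com/...</httpheader>
  </httpresponse>
  <httpresponse code="404" label="on failure"/>
</httpservice>
```

The message attribute is left out everywhere, so each message is derived from the corresponding code.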

<httpservice>

This element can only occur as a kid of synopsis, and it specifies an HTTP service endpoint. In many ways this is similar to the <form> element of HTML. The method, action and enctype attributes are basically the same, and we use param kids instead of HTML's input.

The method attribute is optional; its value defaults to "GET". This specifies the HTTP method to be used for the request.

The action attribute is required. It specifies the URL endpoint. Its value uses a subset of the syntax specified by uritemplate. Basically, use {param} as placeholders in the URI string; we might extend this to the more complicated features of uritemplate when the need arises. If there is no {param} placeholder in this value, then any parameters will just fill in the query part of the URL, as for HTML forms, or the body content when the method is specified as "POST".

The enctype attribute is only used when method is "POST" and its meaning is the same as for the HTML attribute. The default value is "application/x-www-form-urlencoded" (same as the HTML default).

The element can contain repeated occurrences of httpheader, param and httpresponse elements. An optional content element is also allowed.

Normally you would specify one param for each of the placeholders in action. Placeholders without a corresponding param will be treated as if an empty param was present for them. param elements without a corresponding placeholder will be treated as query parameters as done for input elements in HTML forms.

Example:

<httpservice method="GET" action="http://www.example.com/user/{uid}">
   <param name="uid" optional="false">The Unix user name of the given person</param>
   <httpresponse content-type="text/xml">
      <content><![CDATA[<user><name>...</name>...</user>]]></content>
   </httpresponse>
   <httpresponse code="404"/>
</httpservice>
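
For comparison, a sketch of a POST service without placeholders in action. Following the rules above, both parameters would then be sent in the request body, encoded with the default enctype. The URL and the parameter names are invented for the example.

```xml
<!-- Hypothetical POST endpoint; no {param} placeholders in action. -->
<httpservice method="POST" action="http://www.example.com/user/search">
  <param name="name">Full or partial name to search for</param>
  <param name="limit" type="integer" default="10">Maximum number of hits</param>
  <httpresponse content-type="text/xml"/>
  <httpresponse code="400" label="on failure"/>
</httpservice>
```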

<link>

Relates the parent system to something else. The href attribute is a URL. It's possible to use "apidoc:" URLs to reference other systems or APIs. The role attribute qualifies the link. What values it can take is to be decided based on experience.

<param>

Used to declare a parameter. Kid of httpservice. The content should be a short text describing the purpose of the parameter.

Required attribute is name. There should be a corresponding parameter in the uritemplate of the action attribute, and the name should be unique within an httpservice.

The type attribute is optional; its value defaults to "string". Other possible values include "number", "integer", "date", "year", "month", "semester", "enum(foo,bar)".

The default attribute is optional; it specifies that leaving out this parameter has the same effect as providing the given value. This implies that this parameter is optional as well.

The optional attribute is optional; its value defaults to "false" iff a default isn't provided. The other possible value is "true" (surprisingly enough). It's an error to specify this attribute as "false" when a default is provided.
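
A sketch of how type, default and optional might combine. The endpoint and the parameter names are hypothetical; the type values are taken from the list above.

```xml
<!-- Hypothetical parameters illustrating type, default and optional. -->
<httpservice action="http://www.example.com/courses/{year}">
  <!-- Required: no default is given, so optional defaults to "false" -->
  <param name="year" type="year">The year to list courses for</param>
  <!-- Optional because a default is given; leaving it out means "xml" -->
  <param name="format" type="enum(xml,json)" default="xml">Response format</param>
  <!-- Optional without a default, so it has to be marked explicitly -->
  <param name="limit" type="integer" optional="true">Maximum number of courses</param>
</httpservice>
```

Adding optional="false" to the "format" parameter would be an error, since a default is provided for it.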

<synopsis>

This describes the syntax of the API in structured form. The kid elements used will differ based on the nature of the service. It will be one or more of the following elements: file, httpservice, dbtable, dbview, and command.

Only file and httpservice are described in detail currently. Other types of APIs will be specified on demand.

<system>

This element describes a system. The attribute refname is required, and it should be a short unique identifier string. Subsystem relationships are expressed by using dots in refnames. For instance, a system with a refname of "foo.bar" will be a subsystem of the system with refname "foo".

The attribute cmdb is optional and encodes the identifier of the corresponding service at the Issuetracker configDB.

Allowed kid elements are abstract, zero or more class elements, zero or more uses elements, zero or more link elements, description, and zero or more api elements.
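
A sketch of a system with a subsystem. The cmdb identifier, the link URL and the role value are hypothetical (the set of role values has not been decided), and the kid elements follow the order used in the structure overview, which may or may not be significant.

```xml
<!-- Hypothetical systems; "foo.bar" is a subsystem of "foo" because of the dot. -->
<apidoc>
  <system refname="foo" cmdb="1234">
    <abstract>The foo system</abstract>
    <link href="http://www.example.com/foo/docs" role="documentation">Documentation</link>
  </system>
  <system refname="foo.bar">
    <abstract>The bar part of foo</abstract>
    <uses ref="sebra.sws">Looks up persons</uses>
  </system>
</apidoc>
```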

<uses>

This element denotes the fact that the parent system uses some api or system. This describes a dependency on the other system or api. The parent element must be system.

The attribute ref is required. It should be the refname of some system, or the refname of some api qualified by its system.

The attribute dataflow is optional and overrides the dataflow specified or defaulted for the api. See api for what values can be specified. If ref references a system the dataflow defaults to "both".
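
A sketch of the two kinds of references. The system "timesheet" and the texts are hypothetical. Qualifying an api as "system-refname/api-refname" is an assumption made by analogy with the "apidoc:" link syntax; the exact form of a qualified ref is not spelled out here.

```xml
<!-- Hypothetical client system. -->
<system refname="timesheet">
  <abstract>Time registration system</abstract>
  <!-- Reference to a whole system; dataflow defaults to "both" -->
  <uses ref="sebra.sws">General person lookups</uses>
  <!-- Reference to a single api; the slash syntax is an assumption -->
  <uses ref="sebra.sws/person" dataflow="push">Person records</uses>
</system>
```

Without the dataflow attribute, the second reference would take the dataflow specified or defaulted for the "person" api itself.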