Webservices API
Latest revision as of 12:58, 8 May 2009
Contents
- 1 Webservices Documentation for Geeklog 1.5
- 1.1 Relation of the Webservices component with the rest of Geeklog
- 1.2 GL Directives
- 1.3 Standard Input Keys
- 1.4 Output Array
- 1.5 Service Messages
- 1.6 HTTP Responses and Return Codes
- 1.7 Atom Client Requirements
- 1.8 XML Namespaces
- 1.9 Atom Extensions
- 1.10 Categories
- 1.11 URI Details
- 1.12 Authentication
- 1.13 Implementation Details
- 1.14 Security Implications
- 1.15 Compliance
Webservices Documentation for Geeklog 1.5
The purpose of the Geeklog webservices module is to provide an application layer interface for Geeklog that can be used by standardized clients such as aggregators, desktop publishing software etc. to interact with the web-server, and create and update content programmatically.
The Atom Publishing Protocol is the protocol that has been implemented for this purpose. The protocol is now an official internet standard (RFC 5023).
For end-user documentation, please see Using the Webservices.
Relation of the Webservices component with the rest of Geeklog
In order to incorporate the webservice, the Geeklog code has been re-organized. The purely functional code has been shifted into 'service' methods, separating it from the code that manages the display and rendering of the plugin. For instance, there are now functions called 'service_post_story' and 'service_delete_story' that take an array of parameters as argument and perform the 'post' and 'delete' functions respectively. These methods can be called by the webservice and the HTML scripts alike. When called from any external functions, these functions must be called by using the PLG_invokeService method, like this:
PLG_invokeService($plugin, $verb, $input, $output, $svc_msg);
$plugin |
The plugin whose function needs to be invoked |
$verb |
The action to be performed ('post', 'delete', etc.) |
$input |
An array of input parameters. Some of these input parameters may be optional, depending on the plugin. The plugin MUST ignore unknown parameters. |
$output |
An array of output parameters passed by reference. The calling function may display these parameters however it sees fit. In the story and staticpages implementations, this variable is a string in a few cases, but it is STRONGLY RECOMMENDED that this variable be treated as a simple array of parameter-value pairs. On successful operation, the webservice script MAY display the content of this variable to the client in the form of XML. |
$svc_msg |
This array, short for "service messages", is used exclusively by the webservice to get certain types of control information from the plugin. |
The above invocation calls the function (provided it exists)
service_<verb>_<plugin>(<input>, <output>, <svc_msg>)
where <output> and <svc_msg> are passed by reference to the plugin.
The <output> and <svc_msg> arrays MUST be filled in by the plugin function, so that they can be acted upon by the calling function.
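The dispatch rule described above can be illustrated with a small, self-contained sketch. It is not Geeklog's code: demo_invokeService is a hypothetical stand-in for PLG_invokeService that only mirrors the service_<verb>_<plugin> naming rule, the "example" plugin is imaginary, and the PLG_RET_* values are placeholders (Geeklog defines the real constants).

```php
<?php
// Placeholder values: Geeklog defines the real PLG_RET_* constants itself.
if (!defined('PLG_RET_OK'))    define('PLG_RET_OK', 0);
if (!defined('PLG_RET_ERROR')) define('PLG_RET_ERROR', -1);

// Hypothetical service function for an imaginary plugin called "example".
function service_get_example($input, &$output, &$svc_msg)
{
    $output = array(
        'id'    => $input['id'],
        'title' => 'Hello from the example plugin',
    );
    $svc_msg['id'] = $input['id'];

    return PLG_RET_OK;
}

// Stand-in for PLG_invokeService (NOT Geeklog's implementation): it builds
// the function name from verb and plugin and calls it if it exists.
function demo_invokeService($plugin, $verb, $input, &$output, &$svc_msg)
{
    $function = 'service_' . $verb . '_' . $plugin;
    if (!function_exists($function)) {
        return PLG_RET_ERROR;
    }

    // $output and $svc_msg reach the plugin function by reference.
    return $function($input, $output, $svc_msg);
}

$output  = array();
$svc_msg = array();
$result  = demo_invokeService('example', 'get', array('id' => 'welcome'), $output, $svc_msg);
```

After the call, $output and $svc_msg hold whatever the plugin function filled in, which is why both are declared as reference parameters.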
Implications for writing a plugin
If a plugin is developed according to certain rules, it can automatically provide an Atom-enabled interface for the client. For this, the following verbs must be implemented:
submit |
This verb handles any kind of data posted to the server. In the context of the Atom protocol, this verb is used to create new items or update existing items on the server. (See: GL Directives) Successful completion results in the return of either a 'HTTP/1.1 201 Created' response or a 'HTTP/1.1 200 Ok' response to the Atom client. |
delete |
This verb deletes a specified resource on the server. |
get |
This verb handles the retrieval of information existing on the server. When accessed via the webservice, each item is serialized into XML, in the Atom Syndication format (See: RFC 4287). The plugin may return a single entry or multiple entries. The plugin MUST return a single item if the $id variable is set. (See: Service Messages) |
The Atom specifications have been extended to allow verbs other than the ones above. (See: Atom Extensions)
Enabling webservices for a plugin
In order to enable webservices for a plugin, the plugin must implement the plugin_wsEnabled_<plugin> function AND this function must return true. For example:
function plugin_wsEnabled_staticpages() { return true; }
For a plugin that supports webservices, webservices can be disabled by returning false in the function above.
The following function can be used in the rest of the GL code to check if a specific plugin supports webservices:
PLG_wsEnabled($type);
GL Directives
The $input array should contain all the information required for the successful processing of the requested action. Some keys in this array are, however, reserved for providing useful processing information to the plugin. These array keys MUST NOT be used to store user-provided information.
'gl_svc' |
If true, this indicates that the function has been invoked by the webservices component. Ideally, this should not matter, but for existing plugins, it eases the transition to an Atom-enabled server by allowing the plugin to differentiate between a webservice call and an invocation by the HTML component. |
'gl_edit' |
If true, this indicates that the 'submit' verb has been invoked in 'Edit' mode, which means an existing item is to be modified. On successful completion, the Atom client will receive a 'HTTP/1.1 200 Ok' response, rather than a 'HTTP/1.1 201 Created' response that is normally transmitted for new items. |
'gl_etag' |
If set, this variable contains the If-Match HTTP header (with the double-quotes stripped) sent by the client along with an update request. Unless it is empty, this variable MUST be compared to the 'updated' property of the existing item before the item is modified. This ensures that the item has not been modified in the interval between its retrieval by the client and the subsequent update. |
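As a rough illustration of how a 'submit' handler might honor these directives, consider the following sketch. Everything in it is hypothetical: the "demo" plugin, the DemoStore class (an in-memory stand-in for a database table), the plain string comparison of 'gl_etag' against 'updated', and the placeholder PLG_RET_* values.

```php
<?php
// Placeholder values; Geeklog supplies the real constants.
if (!defined('PLG_RET_OK'))                  define('PLG_RET_OK', 0);
if (!defined('PLG_RET_ERROR'))               define('PLG_RET_ERROR', -1);
if (!defined('PLG_RET_PRECONDITION_FAILED')) define('PLG_RET_PRECONDITION_FAILED', -4);

// Hypothetical in-memory store standing in for the plugin's database table.
final class DemoStore
{
    public static $items = array(
        'welcome' => array(
            'title'   => 'Welcome',
            'updated' => 'Fri, 08 May 2009 12:58:00 +0000',
        ),
    );
}

function service_submit_demo($input, &$output, &$svc_msg)
{
    $id = isset($input['id']) ? $input['id'] : '';

    if (!empty($input['gl_edit'])) {
        // Edit mode: the item must already exist.
        if (!isset(DemoStore::$items[$id])) {
            $svc_msg['error_desc'] = 'No such item';
            return PLG_RET_ERROR;
        }
        // A non-empty gl_etag must match the stored 'updated' value,
        // otherwise the item changed after the client retrieved it.
        if (!empty($input['gl_etag'])
                && $input['gl_etag'] !== DemoStore::$items[$id]['updated']) {
            return PLG_RET_PRECONDITION_FAILED;
        }
    }

    DemoStore::$items[$id] = array(
        'title'   => $input['title'],
        'updated' => $input['updated'],
    );
    $output        = DemoStore::$items[$id];
    $svc_msg['id'] = $id;

    return PLG_RET_OK;
}
```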
Standard Input Keys
Apart from the GL directives, there are some more array keys for the $input variable that have standard meanings. These include -
'id' |
The ID of the item that the client wants to refer to. |
'title' |
The title of the item under consideration. |
'author_name' |
The name of the author, as provided by the client. |
'category' |
An array of all the categories for the item, supplied by the user. (See: Categories) |
'updated' |
The date and time, as accurate as possible, when the item was last updated. Since the 'updated' value is used to determine if the item has been modified, it is STRONGLY RECOMMENDED that the value be updated on each modification of the item. This value is in the RFC 2822 format. The following keys are also updated with the local time, based on the value of $input['updated']: 'publish_month', 'publish_year', 'publish_day', 'publish_hour', 'publish_minute', 'publish_second' |
'summary' |
A summary of the content of the item. |
'content' |
The main content of the item. |
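To make the relationship between 'updated' and the 'publish_*' keys more concrete, here is a minimal sketch of how such keys could be derived from an RFC 2822 date. The function name demo_fillPublishKeys is hypothetical, and the zero-padded string format of the values is an assumption; the text above does not specify how Geeklog formats them.

```php
<?php
// Assumption: UTC stands in for the site's local timezone so that the
// example is reproducible.
date_default_timezone_set('UTC');

// Hypothetical helper: derives the publish_* keys from $input['updated'].
function demo_fillPublishKeys(array $input)
{
    $timestamp = strtotime($input['updated']);
    if ($timestamp === false) {
        return $input; // unparseable date: leave the array untouched
    }

    $input['publish_year']   = date('Y', $timestamp);
    $input['publish_month']  = date('m', $timestamp);
    $input['publish_day']    = date('d', $timestamp);
    $input['publish_hour']   = date('H', $timestamp);
    $input['publish_minute'] = date('i', $timestamp);
    $input['publish_second'] = date('s', $timestamp);

    return $input;
}

$input = demo_fillPublishKeys(array('updated' => 'Fri, 08 May 2009 12:58:00 +0000'));
```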
Output Array
The $output variable contains all the output generated by the plugin function. In error conditions, the $output variable MAY be a string rather than an array, since the webservice does not handle the $output variable under error conditions. However, this is NOT RECOMMENDED.
The items listed in Section Standard Input Keys MUST be filled in appropriately by the plugin function before returning.
Service Messages
The $svc_msg array is used to return specific messages to the webservice component. The following array keys are understood -
'id' |
The ID of the item under consideration. For POST requests, this ID forms a part of the URI returned in the Location header, as specified by the Atom protocol. For GET requests, this ID forms part of the URI that is inserted into each entry. |
'error_desc' |
When the plugin function returns an error code, the webservice looks at this value and returns it to the user if it is non-empty. This is particularly useful for making the 400 Bad Request errors more descriptive and plugin-specific. |
'gl_feed' |
The plugin should set this variable to true if the plugin is returning multiple items, rather than a single item. This means that $output is expected to be an array of arrays. |
'offset' |
This variable indicates the number of items (from the start) that the server would have to skip in order to present the next partial list of items of the collection. This value forms a part of the URI inserted into the Atom feed document. |
'output_fields' |
This array provides the list of keys of the $output variable that should be converted into XML and displayed to the user. This is primarily used because the plugin function may want to hide some of the output values in case of the webservice. This list MUST NOT include any of the standard Atom elements (See: Standard Input Keys). Those elements will be displayed to the user, even without being listed here. |
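The following sketch shows how a 'get' handler might fill in these service messages. It is an illustration under stated assumptions, not Geeklog code: the "demo" plugin, its hard-coded items, the page size of 2, and the placeholder PLG_RET_* values are all made up for the example.

```php
<?php
// Placeholder values; Geeklog supplies the real constants.
if (!defined('PLG_RET_OK'))    define('PLG_RET_OK', 0);
if (!defined('PLG_RET_ERROR')) define('PLG_RET_ERROR', -1);

function service_get_demo($input, &$output, &$svc_msg)
{
    // Hypothetical data; a real plugin would query its own table.
    $items = array(
        array('id' => 'a', 'title' => 'First',  'views' => 10, 'note' => 'internal'),
        array('id' => 'b', 'title' => 'Second', 'views' => 20, 'note' => 'internal'),
        array('id' => 'c', 'title' => 'Third',  'views' => 30, 'note' => 'internal'),
    );
    $pageSize = 2; // assumption: size of one partial list

    // Plugin-specific keys that may be shown to the client. Standard Atom
    // elements such as 'title' are not listed; 'note' stays hidden.
    $svc_msg['output_fields'] = array('views');

    if (!empty($input['id'])) {
        // A single item was requested.
        foreach ($items as $item) {
            if ($item['id'] === $input['id']) {
                $output        = $item;
                $svc_msg['id'] = $item['id'];
                return PLG_RET_OK;
            }
        }
        $svc_msg['error_desc'] = 'Unknown item ID';
        return PLG_RET_ERROR;
    }

    // No ID: return a partial list of the collection (an array of arrays).
    $offset = isset($input['offset']) ? (int) $input['offset'] : 0;
    $output = array_slice($items, $offset, $pageSize);

    $svc_msg['gl_feed'] = true;
    // Number of items to skip in order to reach the next partial list.
    $svc_msg['offset']  = $offset + count($output);

    return PLG_RET_OK;
}
```

Whether 'offset' should still be set once the collection is exhausted is not stated above; the sketch always sets it for simplicity.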
HTTP Responses and Return Codes
The service_<verb>_<plugin> functions MUST return one of the following codes:
PLG_RET_OK |
Everything is okay |
PLG_RET_AUTH_FAILED |
Credentials were supplied by the client, but authentication failed |
PLG_RET_PERMISSION_DENIED |
The client does not have access to the specified resource |
PLG_RET_PRECONDITION_FAILED |
The If-Match HTTP header condition provided by the client failed |
PLG_RET_ERROR |
An error apart from the ones above was encountered |
The Atom server returns one of the following responses on successful (PLG_RET_OK) or unsuccessful (all other return codes) completion of an operation:
Return code | HTTP response | Meaning |
PLG_RET_OK | 200 Ok | This is the usual response. |
PLG_RET_OK | 201 Created | This is the response returned when the HTTP method used is POST. |
PLG_RET_AUTH_FAILED | 401 Unauthorized | Authentication failed |
PLG_RET_PERMISSION_DENIED | 403 Forbidden | The supplied credentials are insufficient |
PLG_RET_PRECONDITION_FAILED | 412 Precondition Failed | A necessary condition failed |
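The mapping above could be expressed in code roughly as follows. This is a sketch only: demo_httpStatusFor is a hypothetical helper, the PLG_RET_* values are placeholders, and because the table does not list a response for PLG_RET_ERROR the function returns null for it rather than guessing.

```php
<?php
// Placeholder values; Geeklog supplies the real constants.
if (!defined('PLG_RET_OK'))                  define('PLG_RET_OK', 0);
if (!defined('PLG_RET_ERROR'))               define('PLG_RET_ERROR', -1);
if (!defined('PLG_RET_PERMISSION_DENIED'))   define('PLG_RET_PERMISSION_DENIED', -2);
if (!defined('PLG_RET_AUTH_FAILED'))         define('PLG_RET_AUTH_FAILED', -3);
if (!defined('PLG_RET_PRECONDITION_FAILED')) define('PLG_RET_PRECONDITION_FAILED', -4);

// Hypothetical helper: maps a plugin return code to the HTTP response
// listed in the table. $isPost is true when the HTTP method was POST.
function demo_httpStatusFor($code, $isPost = false)
{
    switch ($code) {
        case PLG_RET_OK:
            return $isPost ? '201 Created' : '200 Ok';
        case PLG_RET_AUTH_FAILED:
            return '401 Unauthorized';
        case PLG_RET_PERMISSION_DENIED:
            return '403 Forbidden';
        case PLG_RET_PRECONDITION_FAILED:
            return '412 Precondition Failed';
        default:
            // Not covered by the table (e.g. PLG_RET_ERROR).
            return null;
    }
}
```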
Atom Client Requirements
A standard Atom client can be used to post, edit, delete and retrieve items on the server using any plugin. A Geeklog-specific client would provide fine-grained control over the input data. (See: Atom Extensions)
XML Namespaces
The namespaces that are expected by the webservice are
- http://www.w3.org/2007/app - Atom publishing protocol
- http://www.w3.org/2005/Atom - standard Atom elements
- http://www.geeklog.net/xmlns/app/gl - Geeklog-specific elements
Atom Extensions
The webservice ignores all XML elements that do not belong to one of the above namespaces. Some elements belonging to the http://www.w3.org/2005/Atom namespace are interpreted as explained in Section: Standard Input Keys. All other elements belonging to either the http://www.w3.org/2005/Atom or http://www.geeklog.net/xmlns/app/gl namespaces are transformed thus:
$input[<name>] = <value>
where
- <name> is the local name of the node
- <value> is the value of the node's content (if the content is text-only) OR the text-values contained in all the child nodes, stored as an array
For example
<somename>John</somename>
becomes
$input['somename'] = 'John';
<somename><param>abcd</param></somename>
becomes
$input['somename'] = array ( 'abcd' );
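The transformation rule above can be sketched with PHP's DOM extension. This is an illustration, not the code in Geeklog: demo_entryToInput is a hypothetical function, it assumes the dom extension is available, and it ignores the special handling that the standard Atom elements receive.

```php
<?php
// Hypothetical helper: turns the child elements of an Atom entry into an
// $input-style array, following the <name>/<value> rule described above.
function demo_entryToInput($xml)
{
    $allowed = array(
        'http://www.w3.org/2005/Atom',
        'http://www.geeklog.net/xmlns/app/gl',
    );

    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return array();
    }

    $input = array();
    foreach ($doc->documentElement->childNodes as $node) {
        // Skip text nodes and elements from other namespaces.
        if ($node->nodeType !== XML_ELEMENT_NODE
                || !in_array($node->namespaceURI, $allowed, true)) {
            continue;
        }

        // Collect the text values of all child elements, if there are any.
        $children = array();
        foreach ($node->childNodes as $child) {
            if ($child->nodeType === XML_ELEMENT_NODE) {
                $children[] = $child->textContent;
            }
        }

        // Text-only content becomes a string, child elements an array.
        $input[$node->localName] = $children ? $children : $node->textContent;
    }

    return $input;
}

$xml = '<entry xmlns="http://www.w3.org/2005/Atom"'
     . ' xmlns:gl="http://www.geeklog.net/xmlns/app/gl">'
     . '<gl:somename>John</gl:somename>'
     . '<gl:other><gl:param>abcd</gl:param></gl:other>'
     . '<x:skipped xmlns:x="urn:example">ignored</x:skipped>'
     . '</entry>';

$input = demo_entryToInput($xml);
```

In this example, 'somename' ends up as a string and 'other' as an array, while the element from the urn:example namespace is dropped.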
To invoke a verb other than 'submit', 'delete' or 'get', the client should insert the following XML snippet as a child of the atom:entry node
<action xmlns="http://www.geeklog.net/xmlns/app/gl">$verb</action>
where $verb is the requested verb. The content should be submitted as a POST request. Successful operation returns a '201 Created' HTTP response.
Categories
Atom categories correspond to Topics in Geeklog. The Atom server has support for multiple categories. atom:category elements of the form -
<category xmlns="http://www.geeklog.net/xmlns/app/gl" term="sometopic"/>
<category xmlns="http://www.geeklog.net/xmlns/app/gl" term="someothertopic"/>
are converted into -
$input['category'] = array ( 'sometopic', 'someothertopic' );
If the plugin can support only one topic, then it MAY reject all of the categories provided by the user, or all but one of them.
The server provides the client a list of possible categories using the 'getTopicList' verb.
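A minimal sketch of the category conversion, assuming PHP's dom extension and the namespace used in the example above, might look like this. The function names are hypothetical, and keeping the first category for a single-topic plugin is just one of the choices the text allows.

```php
<?php
// Hypothetical helper: collects the 'term' attributes of all category
// elements in the Geeklog namespace, as in the example above.
function demo_categoriesFromEntry($xml)
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return array();
    }

    $terms = array();
    $nodes = $doc->getElementsByTagNameNS('http://www.geeklog.net/xmlns/app/gl', 'category');
    foreach ($nodes as $node) {
        if ($node->hasAttribute('term')) {
            $terms[] = $node->getAttribute('term');
        }
    }

    return $terms;
}

// Hypothetical helper for a plugin that supports only one topic:
// it keeps the first category and drops the rest.
function demo_singleTopic(array $categories)
{
    return $categories ? $categories[0] : null;
}

$xml = '<entry xmlns="http://www.w3.org/2005/Atom">'
     . '<category xmlns="http://www.geeklog.net/xmlns/app/gl" term="sometopic"/>'
     . '<category xmlns="http://www.geeklog.net/xmlns/app/gl" term="someothertopic"/>'
     . '</entry>';

$input = array('category' => demo_categoriesFromEntry($xml));
$topic = demo_singleTopic($input['category']);
```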
URI Details
The webservice follows the standard Atom discovery mechanism to let clients know the URIs of the available services. A webservice URI is of the form -
http://<domain>/webservices/atom/?plugin=<plugin_name>
http://<domain>/webservices/atom/?plugin=<plugin_name>&id=<object_id>
http://<domain>/webservices/atom/?plugin=<plugin_name>&offset=<offset_value>
In the absence of the <object_id> parameter value, the URI is assumed to point to the entire collection. In this case, the first <offset_value> items MAY be skipped on a GET request. If the plugin provides support for skipping elements, then the $svc_msg['offset'] value, on return, MUST contain the offset value for obtaining the next set of items in the collection.
If the <object_id> value is invalid but not empty, the plugin function must return an error response.
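For client authors, the three URI forms above can be assembled along these lines. The helper name demo_serviceUri and the domain www.example.com are placeholders chosen for the example.

```php
<?php
// Hypothetical helper: builds a webservice URI for a collection, a single
// item (when $id is given) or a partial list (when $offset is given).
function demo_serviceUri($base, $plugin, $id = null, $offset = null)
{
    $query = array('plugin' => $plugin);

    if ($id !== null && $id !== '') {
        $query['id'] = $id;                 // a single item
    } elseif ($offset !== null) {
        $query['offset'] = (int) $offset;   // skip the first $offset items
    }

    // Pass '&' explicitly so the result does not depend on php.ini settings.
    return rtrim($base, '/') . '/webservices/atom/?' . http_build_query($query, '', '&');
}

$collection = demo_serviceUri('http://www.example.com', 'staticpages');
$item       = demo_serviceUri('http://www.example.com', 'staticpages', 'about');
$nextPage   = demo_serviceUri('http://www.example.com', 'staticpages', null, 10);
```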
Authentication
The webservice uses the Basic authentication scheme(1). If the client receives a '401 Unauthorized' HTTP response, then it MAY send another request with suitable credentials. The authentication string passed by the client is expected to be -
base64 ( <username>:<password> )
Basic authentication is handled implicitly by the webserver in most cases.
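On the client side, the authentication string described above can be produced as in the following sketch. The function name demo_basicAuthHeader is hypothetical, and the credentials are the well-known sample pair from the HTTP Basic authentication specification.

```php
<?php
// Hypothetical helper: builds the Authorization header for Basic
// authentication, i.e. base64 ( <username>:<password> ).
function demo_basicAuthHeader($username, $password)
{
    return 'Authorization: Basic ' . base64_encode($username . ':' . $password);
}

$header = demo_basicAuthHeader('Aladdin', 'open sesame');
// $header is now 'Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=='
```

Since base64 is an encoding and not encryption, the credentials are only as safe as the connection they travel over.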
If PHP is installed as a CGI binary on your server, then authentication might fail because Apache may not pass on the authorization headers to PHP. In that case, update your .htaccess file to include the following lines:
RewriteEngine on
RewriteRule .* - [E=REMOTE_USER:%{HTTP:Authorization},L]
RewriteCond %{HTTP:Authorization} username=\"([^\"]+)\"
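With a rule like the one above, the raw Authorization header is copied into an environment variable, and the receiving script has to decode it itself. The sketch below shows one way to do that; it is not taken from Geeklog, the function name is hypothetical, and the exact $_SERVER key that carries the value (for instance REMOTE_USER or REDIRECT_REMOTE_USER) depends on the server setup.

```php
<?php
// Hypothetical helper: extracts username and password from a raw
// "Basic <base64>" header value. Returns null for anything else.
function demo_parseBasicCredentials($header)
{
    if (!preg_match('#^Basic\s+([A-Za-z0-9+/=]+)$#i', trim($header), $match)) {
        return null;
    }

    $decoded = base64_decode($match[1], true);
    if ($decoded === false || strpos($decoded, ':') === false) {
        return null;
    }

    // Split on the first colon only: the password may contain colons.
    list($username, $password) = explode(':', $decoded, 2);

    return array($username, $password);
}

$credentials = demo_parseBasicCredentials('Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==');
```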
(1) A functional implementation of WSSE authentication is included in system/lib-webservices.php but cannot be used since Geeklog does not have access to the user's unencrypted password and therefore can't perform the authentication ...
Implementation Details
Length of Entry IDs
Atompub clients will usually include an ID when creating a new entry. To ensure that this ID is unique and can be used, Geeklog will need to know the max. length of an ID as used by a plugin. For stories and static pages, that max. length is 40 characters and has now been hard-coded as constants.
Earlier versions of Geeklog did not use a hard-coded length, so it was easy to use longer IDs (e.g. for SEO purposes) by simply changing the database and the input fields in the story or static pages editor. If you did that, you will have to adjust the constants accordingly or you will not be able to modify your entries through the webservices API.
Slug
Some Atompub clients will send a Slug: header with the POST request when creating a new entry. This header contains a text string that the client suggests to be used in the ID for the new entry.
Geeklog will try to make use of the Slug: header if it decides that it needs to create a new ID for the entry. However, the content will be ignored if it contains %-encoded characters, since those are usually non-printable or Unicode characters that cannot be used in an entry ID in Geeklog.
The content of the Slug: header is also available as a 'slug' entry in the input array. Plugins can either use it directly or pass it to the WS_makeId function to create a new ID.
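To illustrate the two constraints discussed in this section (the maximum ID length and the handling of %-encoded slugs), here is a sketch of how a plugin might turn a slug into an ID on its own. It does not reproduce what WS_makeId does: the function demo_idFromSlug, the constant DEMO_MAX_ID_LENGTH and the set of allowed characters are all assumptions made for the example; only the 40-character limit comes from the text above.

```php
<?php
// Hypothetical constant name; the value 40 is the limit quoted above for
// stories and static pages.
define('DEMO_MAX_ID_LENGTH', 40);

// Hypothetical helper: derives an entry ID from the 'slug' input value.
// Returns null when the slug cannot be used.
function demo_idFromSlug($slug, $maxLength = DEMO_MAX_ID_LENGTH)
{
    // Slugs with %-encoded characters are ignored, as described above.
    if ($slug === '' || strpos($slug, '%') !== false) {
        return null;
    }

    // Assumption: IDs consist of lowercase letters, digits, '_', '.' and '-'.
    $id = strtolower(trim($slug));
    $id = preg_replace('/[^a-z0-9_.-]+/', '-', $id);
    $id = trim($id, '-');

    // Enforce the maximum ID length.
    $id = substr($id, 0, $maxLength);

    return ($id === '' || $id === false) ? null : $id;
}

$id = demo_idFromSlug('My First Post');
// $id is now 'my-first-post'
```

A real plugin would also have to check that the resulting ID is not already in use.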
Security Implications
Plugin developers should be aware that writing a function of the form -
service_<x>_<y> (...)
makes the function open to the public, in the sense that it can be called using the webservice, with appropriate parameters. Functions should not be named in this way, unless they are intended to be called independently.
An important corollary is that function calls of the type -
if (<security_check>) { PLG_invokeService(...); }
are BAD, because the same function can be called from the webservice WITHOUT the <security_check>.
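A safer pattern is to perform the check inside the service function itself, so that every caller passes through it. The following sketch shows the idea; the "demo" plugin, the DemoSession class (a stand-in for however the plugin determines the current user's rights) and the placeholder PLG_RET_* values are hypothetical.

```php
<?php
// Placeholder values; Geeklog supplies the real constants.
if (!defined('PLG_RET_OK'))                define('PLG_RET_OK', 0);
if (!defined('PLG_RET_PERMISSION_DENIED')) define('PLG_RET_PERMISSION_DENIED', -2);

// Hypothetical stand-in for the plugin's own permission lookup.
final class DemoSession
{
    public static $isAdmin = false;
}

function service_delete_demo($input, &$output, &$svc_msg)
{
    // The security check lives INSIDE the service function, so a call
    // through the webservice cannot bypass it.
    if (!DemoSession::$isAdmin) {
        $svc_msg['error_desc'] = 'You are not allowed to delete this item';
        return PLG_RET_PERMISSION_DENIED;
    }

    // ... the actual deletion would happen here ...
    $output = array('deleted' => $input['id']);

    return PLG_RET_OK;
}
```

Note that the permission is derived from the server-side session, never from a value in $input, since $input is supplied by the client.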
Compliance
To the best of our knowledge, the webservices / Atompub implementation in Geeklog complies with RFC 5023 (and RFC 4287, where applicable).
Atom Protocol Exerciser
The Atom Protocol Exerciser (aka The APE) by Tim Bray performs several operations against an Atompub service (such as the one implemented in Geeklog) and evaluates the responses, i.e. it is looking for expected results according to the RFCs.
At the time of this writing (January 2008) Geeklog's Atompub implementation for the Static Pages plugin passes all of the APE's tests.
Non-conformance of Stories Implementation
The Atompub implementation for stories fails one test, but that was a deliberate decision. In this test, the APE creates 3 stories, modifies the second one, and then expects the stories to show up in the order 2-1-3. However, since Geeklog does not have the concept of a "last-modified" date for stories, we would have to modify the creation date of the story to pass this test. That would mean that any change to a story through an Atompub client would cause the story to show up as "new" on the site, even if someone only fixed a typo in an old story.
Other than that detail, the Atompub implementation for stories is also fully compliant.