
I am a software architect working in the service hosting area. I am interested in and specialize in SaaS, cloud computing, and parallel processing. Ricky is a DZone MVB, is not an employee of DZone, and has posted 87 posts at DZone. You can read more from them at their website.

Common REST Design Pattern

06.04.2008

Built on the same architectural pattern as the web, REST is increasingly the dominant way to implement SOA (Service Oriented Architecture). This article discusses some basic design principles of REST.

SOAP : The Remote Procedure Call Model

Before REST became dominant, most SOA implementations were built around the WS* stack, which is fundamentally an RPC (Remote Procedure Call) model.

Under this model, a "Service" is structured as a set of "Procedures" exposed by the system. For example, WSDL defines the procedure call syntax (the procedure name, the parameters and their structure), and SOAP defines how to encode the procedure call as an XML string. Other WS* standards define higher-level protocols, such as how to pass security credentials around, how to make transactional procedure calls, how to discover the service location, etc.

Unfortunately, the WS* stack has become so complicated that it demands a steep learning curve before it can be used. At the same time, it has not achieved its original goal of interoperability (probably due to different interpretations of what the specs say). In the last 2 years, WS* technology development has slowed down and the momentum has shifted to another model: REST.

REST: The Resource Oriented Model

REST (REpresentational State Transfer) was introduced by Roy Fielding when he captured the basic architectural pattern that makes the web so successful. Following how web pages are organized and linked to each other, REST is modeled around a large number of "Resources" that "link" to each other. In contrast to WS*, REST raises the importance of "Resources" and their "Linkage" and plays down the importance of "Procedures".

Under the REST model, a "Service" in the SOA is organized as a large number of "Resources". Each resource has a URI that makes it globally identifiable. A resource is exposed through a "Representation" in some format, which is typically retrieved with an idempotent HTTP GET. The representation may embed other URIs which refer to other resources. This emulates HTML links between web pages and provides a powerful way for the client to discover other services by traversing links. It also makes building an SOA search engine possible.

On the other hand, REST downplays the "Procedure" aspect and defines a small number of "actions" based on existing HTTP methods. As discussed above, HTTP GET is used to get a representation of the resource. To modify a resource, REST uses HTTP PUT with the new representation embedded in the HTTP body. To delete a resource, REST uses HTTP DELETE. To get the metadata of a resource, REST uses HTTP HEAD. Notice that in all these cases, the HTTP body doesn't carry any information about the "Procedure". This is quite different from WS* SOAP, where the request is always made using HTTP POST.

At first glance, REST seems quite limited in the number of procedures it can support. It turns out this is not the case: REST allows any "Procedure" (which has a side effect) to use HTTP POST. Effectively, REST categorizes operations by their nature and associates well-defined semantics with these categories (i.e. GET for read-only, PUT for update, DELETE for remove, all of which are idempotent) while providing an extension mechanism for application-specific operations (i.e. POST for application procedures, which may be non-idempotent).
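The categories above can be written down as a small lookup table. The following Python sketch is only an illustration: the table layout and the helper name safe_to_retry are assumptions of this example, not part of REST or HTTP.

```python
# Semantics the article assigns to each HTTP method.
# "idempotent": False for POST means "not guaranteed to be idempotent".
METHOD_SEMANTICS = {
    "GET":    {"purpose": "read a representation",          "idempotent": True,  "changes_state": False},
    "HEAD":   {"purpose": "read metadata only",             "idempotent": True,  "changes_state": False},
    "PUT":    {"purpose": "update with new representation", "idempotent": True,  "changes_state": True},
    "DELETE": {"purpose": "remove the resource",            "idempotent": True,  "changes_state": True},
    "POST":   {"purpose": "application-specific procedure", "idempotent": False, "changes_state": True},
}


def safe_to_retry(method: str) -> bool:
    """Hypothetical helper: a client may blindly repeat a request
    only when the method is idempotent."""
    return METHOD_SEMANTICS[method.upper()]["idempotent"]
```

A client that loses the response to a PUT can simply send it again, while a lost POST response needs application-level care.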

URI Naming Convention

Since a resource is usually mapped to some state in the system, analyzing its lifecycle is an important step in deciding how the resource is created and how its URI should be structured.

Typically there are some eternal, singleton "Factory Resources" which create other resources. A factory resource typically represents the "type" of the resources it creates. It usually has a static, well-known URI that ends with the plural form of the resource type. Some examples are ...

http://xyz.com/books
http://xyz.com/users

A "Resource Instance", which is created by a "Factory Resource", usually represents an instance of that resource type. Resource instances typically have a limited life span. Their URIs typically contain a unique identifier so that the corresponding instance can be located. Some examples are ...

http://xyz.com/books/4545
http://xyz.com/users/123

A "Dependent Resource" is typically created and owned by an existing resource during part of its life cycle. A dependent resource therefore has an implicit life-cycle dependency on its owning parent: when a parent resource is deleted, all the dependent resources it owns are deleted automatically. A dependent resource uses a URI prefixed by its parent resource's URI. Some examples are ...

http://xyz.com/books/4545/tableofcontent
http://xyz.com/users/123/shopping_cart
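The three URI shapes can be sketched as a few Python helpers. This is a minimal illustration, assuming the xyz.com host from the examples; the function names are hypothetical.

```python
BASE = "http://xyz.com"  # host taken from the article's examples


def factory_uri(resource_type: str) -> str:
    """Static, well-known URI: the plural resource type, e.g. 'books'."""
    return f"{BASE}/{resource_type}"


def instance_uri(resource_type: str, resource_id) -> str:
    """Factory URI plus a unique identifier."""
    return f"{factory_uri(resource_type)}/{resource_id}"


def dependent_uri(parent_uri: str, name: str) -> str:
    """Dependent resources are prefixed by their owning parent's URI."""
    return f"{parent_uri}/{name}"


def is_owned_by(uri: str, parent_uri: str) -> bool:
    """True when uri sits underneath parent_uri in the path hierarchy."""
    return uri.startswith(parent_uri + "/")
```

Because ownership is encoded in the prefix, a server can find everything that depends on a parent from the URI alone.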

Creating Resource

To create a resource instance of a particular resource type, make an HTTP POST to the Factory Resource URI. If the creation is successful, the response will contain a URI of the resource that has been created.

To create a book ...

POST /books HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<book>
<title>...</title>
<author>Ricky Ho</author>
</book>

HTTP/1.1 201 Created
Content-Type: application/xml; charset=utf-8
Location: /books/4545

<ref>http://xyz.com/books/4545</ref>
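On the server side, the exchange above might be handled roughly as follows. This is an in-memory Python sketch, not a real HTTP server: the BookFactory class, its post method, and the starting id of 4545 (chosen to mirror the example) are all assumptions of this illustration.

```python
import itertools


class BookFactory:
    """In-memory stand-in for the /books factory resource."""

    def __init__(self, first_id: int = 4545):
        self._ids = itertools.count(first_id)
        self.books = {}  # path -> stored representation

    def post(self, representation: dict):
        """Create a new book and return (status, headers, body)."""
        path = f"/books/{next(self._ids)}"
        self.books[path] = dict(representation)
        # 201 Created plus a Location header tells the client the new URI.
        return 201, {"Location": path}, f"<ref>http://xyz.com{path}</ref>"


factory = BookFactory()
status, headers, body = factory.post({"title": "...", "author": "Ricky Ho"})
```

The client never chooses the identifier; it learns the new URI from the response.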

To create a dependent resource, make an HTTP POST to its owning resource's URI.

To upload the content of a book (using HTTP POST) ...

POST /books/4545 HTTP/1.1
Host: xyz.com
Content-Type: application/pdf
Content-Length: nnnn

{pdf data}

HTTP/1.1 201 Created
Content-Type: application/xml; charset=utf-8
Location: /books/4545/content

<ref>http://xyz.com/books/4545/content</ref>

HTTP POST is typically used to create a resource when its URI is unknown to the client before its creation. However, if the URI is known to the client, then an idempotent HTTP PUT should be used with the URI of the resource to be created.
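The difference shows up when a request is repeated. The following in-memory Python sketch assumes a hypothetical Store class; the status codes it returns follow common HTTP practice rather than anything stated in this article.

```python
class Store:
    """In-memory resource store contrasting POST and PUT creation."""

    def __init__(self):
        self.resources = {}
        self._next_id = 1

    def post(self, collection: str, representation):
        """Server picks the URI, so every call creates a new resource."""
        path = f"{collection}/{self._next_id}"
        self._next_id += 1
        self.resources[path] = representation
        return 201, path

    def put(self, path: str, representation):
        """Client names the URI, so repeating the call leaves the same state."""
        created = path not in self.resources
        self.resources[path] = representation
        return (201 if created else 200), path


store = Store()
store.put("/books/4545/content", b"pdf-bytes")
store.put("/books/4545/content", b"pdf-bytes")  # repeat: still one resource
store.post("/books", {"title": "A"})
store.post("/books", {"title": "A"})            # repeat: a second resource
```

After these four calls the store holds one content resource and two books, which is why a retried POST needs more care than a retried PUT.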

To upload the content of a book (using HTTP PUT) ...

PUT /books/4545/content HTTP/1.1
Host: xyz.com
Content-Type: application/pdf
Content-Length: nnnn

{pdf data}

HTTP/1.1 200 OK

Finding Resources

Make an HTTP GET to the factory resource URI, passing the search criteria as query parameters. (Note that it is up to the factory resource to interpret the query parameters.)

To search for books with a certain author ...

GET /books?author=Ricky HTTP/1.1
Host: xyz.com
Accept: application/xml

HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<books>
<book>
<ref>http://xyz.com/books/4545</ref>
<title>...</title>
<author>Ricky</author>
</book>
<book>
<ref>http://xyz.com/books/4546</ref>
<title>...</title>
<author>Ricky</author>
</book>
</books>

 

Another school of thought is to embed the criteria in the URI path, such as ... http://xyz.com/books/author/Ricky

I personally prefer the query parameter mechanism because it doesn't imply any ordering of the search criteria.
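The ordering point can be checked with Python's standard urllib.parse module. In this sketch the helper names and the year parameter are hypothetical; only author appears in the example above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_uri(factory_uri: str, **criteria) -> str:
    """Build a search URI by passing criteria as query parameters."""
    return f"{factory_uri}?{urlencode(sorted(criteria.items()))}"


def criteria_of(uri: str) -> dict:
    """Recover the criteria as a dict, which has no meaningful order."""
    return {key: values[0] for key, values in parse_qs(urlsplit(uri).query).items()}


# Two URIs that differ only in parameter order describe the same search.
a = "http://xyz.com/books?author=Ricky&year=2008"
b = "http://xyz.com/books?year=2008&author=Ricky"
same_search = criteria_of(a) == criteria_of(b)
```

With path-embedded criteria, /books/author/Ricky/year/2008 and /books/year/2008/author/Ricky would be two different URIs for the same search.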

Look up a particular resource

Make an HTTP GET to the resource object URI.

Look up a particular book ...

GET /books/4545 HTTP/1.1
Host: xyz.com
Accept: application/xml

HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<book>
<title>...</title>
<author>Ricky Ho</author>
</book>

If a resource has multiple representation formats, the client should specify the format she expects in the "Accept" HTTP header of her request.
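On the client side, such a request could be prepared with Python's standard urllib.request module. This sketch only builds the request object and does not send it; the helper name lookup_request is an assumption of this example.

```python
from urllib.request import Request


def lookup_request(uri: str, media_type: str) -> Request:
    """Prepare a GET that asks for one specific representation format."""
    return Request(uri, headers={"Accept": media_type}, method="GET")


# Ask for the XML representation of a book (nothing is sent yet).
req = lookup_request("http://xyz.com/books/4545", "application/xml")
```

Passing the prepared request to urllib.request.urlopen would perform the actual call against a live server.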

Look up a dependent resource

Make an HTTP GET to the dependent resource object URI.

Download the table of content of a particular book ...

GET /books/4545/tableofcontent HTTP/1.1
Host: xyz.com
Accept: application/pdf

HTTP/1.1 200 OK
Content-Type: application/pdf
Content-Length: nnn

{pdf data}

Modify a resource

Make an HTTP PUT to the resource object URI, passing the new object representation in the HTTP body.

Change the book title ...

PUT /books/4545 HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<book>
<title>Changed title</title>
<author>Ricky Ho</author>
</book>

HTTP/1.1 200 OK

Delete a resource

Make an HTTP DELETE to the resource object URI.

Delete a book ...

DELETE /books/4545 HTTP/1.1
Host: xyz.com

HTTP/1.1 200 OK
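Since dependent resources share the life cycle of their parent, a delete has to cascade. A minimal in-memory Python sketch follows; the function name and the 404 for an unknown path are assumptions of this example.

```python
def delete_resource(resources: dict, path: str) -> int:
    """Delete a resource and every dependent resource under its URI prefix.

    Returns an HTTP-style status code.
    """
    if path not in resources:
        return 404
    doomed = [p for p in resources if p == path or p.startswith(path + "/")]
    for p in doomed:
        del resources[p]
    return 200


resources = {
    "/books/4545": "<book>...</book>",
    "/books/4545/tableofcontent": "{pdf data}",
    "/books/4546": "<book>...</book>",
}
status = delete_resource(resources, "/books/4545")
```

Repeating the DELETE returns a different status in this sketch but leaves the store in the same state, which is what idempotent means here.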

Resource Reference

In some cases, we do not want to create a new resource; instead we want to add a "reference" to an existing resource. For example, consider a book being added to a shopping cart, which is another resource.

Add a book into the shopping cart ...

POST  /users/123/shopping_cart  HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<?xml version="1.0" ?>
<add>
<ref>http://xyz.com/books/4545</ref>
</add>

HTTP/1.1 200 OK

Show all items of the shopping cart ...

GET  /users/123/shopping_cart  HTTP/1.1
Host: xyz.com
Accept: application/xml

HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<?xml version="1.0" ?>
<shopping_cart>
<ref>http://xyz.com/books/4545</ref>
...
</shopping_cart>

Note that the shopping cart resource contains "resource references" which act as links to other resources (here, the books). Such linkage creates a web of resources that clients can discover and navigate.
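A client can follow these links by pulling the ref elements out of the representation. Here is a minimal Python sketch using the standard xml.etree.ElementTree module; the function name linked_resources and the second book in the sample cart are assumptions of this example.

```python
import xml.etree.ElementTree as ET

# Sample cart representation in the shape shown above.
CART = """<?xml version="1.0" ?>
<shopping_cart>
<ref>http://xyz.com/books/4545</ref>
<ref>http://xyz.com/books/4546</ref>
</shopping_cart>"""


def linked_resources(representation: str) -> list:
    """Return the URIs of all resources referenced by a representation."""
    root = ET.fromstring(representation)
    return [el.text.strip() for el in root.iter("ref") if el.text]


links = linked_resources(CART)
```

Each returned URI can then be fetched with an HTTP GET, which is how a client walks from the cart to the books it contains.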

Remove a book from the shopping cart ...

POST  /users/123/shopping_cart  HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<?xml version="1.0" ?>
<remove>
<ref>http://xyz.com/books/4545</ref>
</remove>

HTTP/1.1 200 OK

Note that we use HTTP POST rather than HTTP DELETE to remove a resource reference. This is because we are removing a link, not the actual resource itself. In this case, the book still exists after it is taken out of the shopping cart.
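The distinction between removing a link and removing a resource can be sketched in a few lines of Python. The data structures, the function name post_to_cart, and the 400 for an unknown action are assumptions of this illustration.

```python
# The book resources live independently of any cart.
books = {"http://xyz.com/books/4545": {"title": "...", "author": "Ricky Ho"}}

# State behind /users/123/shopping_cart: only references (URIs), not copies.
cart = []


def post_to_cart(cart: list, action: str, ref: str) -> int:
    """Handle the <add> and <remove> bodies POSTed to the cart resource."""
    if action == "add":
        if ref not in cart:
            cart.append(ref)
    elif action == "remove":
        if ref in cart:
            cart.remove(ref)  # drops the link only; the book is untouched
    else:
        return 400
    return 200
```

Removing a reference changes the cart's own state, which is why it is modeled as a POST to the cart rather than a DELETE of the book.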

Check out the shopping cart ...

POST  /orders  HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<?xml version="1.0" ?>
<ref>http://xyz.com/users/123/shopping_cart</ref>

HTTP/1.1 201 Created
Content-Type: application/xml; charset=utf-8
Location: /orders/2008/04/10/1001

<?xml version="1.0" ?>
<ref>http://xyz.com/orders/2008/04/10/1001</ref>

Note that checkout is implemented here by creating another resource, an "Order", which is used to keep track of the fulfillment of the purchase.

Transaction Resource

One common criticism of REST is that because it is so tied to HTTP (which doesn't support a client callback mechanism), asynchronous services or notifications are hard to do in REST. So how do we implement long-running transactions (which typically require asynchronicity and callback support) in REST?

The basic idea is to immediately create a "Transaction Resource" and return it to the client. While the actual processing happens asynchronously in the background, the client can poll the "Transaction Resource" at any time for the latest processing status. Let's look at an example of a request to print a book, which may take a long time to complete.

Print a book

POST  /books/123  HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<?xml version="1.0" ?>
<print>http://xyz.com/printers/abc</print>

HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8
Location: /transactions/1234

<?xml version="1.0" ?>
<ref>http://xyz.com/transactions/1234</ref>

Note that the response, which contains the URI of a transaction resource, is returned immediately, even before the print job has started. The client can poll the transaction resource to obtain the latest status of the print job.

Check the status of the print Job ...

GET /transactions/1234 HTTP/1.1
Host: xyz.com
Accept: application/xml

HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<transaction>
<type>PrintJob</type>
<status>In Progress</status>
</transaction>

It is also possible to cancel the transaction if it is not already completed.

Cancel the print job

POST  /transactions/1234  HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<?xml version="1.0" ?>
<cancel/>

HTTP/1.1 200 OK
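The whole life cycle of a transaction resource (create, poll, cancel) can be sketched in memory. In this Python illustration the Transactions class, the poll_until_done helper, the "Completed" and "Cancelled" status strings, and the 409 for a late cancel are all assumptions; only "In Progress" and the /transactions/1234 URI come from the examples above.

```python
import itertools


class Transactions:
    """In-memory stand-in for the /transactions resources."""

    def __init__(self, first_id: int = 1234):
        self._ids = itertools.count(first_id)
        self.state = {}

    def start(self, job_type: str) -> str:
        """Register the job and return its URI path right away."""
        path = f"/transactions/{next(self._ids)}"
        self.state[path] = {"type": job_type, "status": "In Progress"}
        return path

    def get(self, path: str) -> dict:
        """What a GET on the transaction resource would report."""
        return dict(self.state[path])

    def complete(self, path: str) -> None:
        """Called by the background worker when the job finishes."""
        if self.state[path]["status"] == "In Progress":
            self.state[path]["status"] = "Completed"

    def cancel(self, path: str) -> int:
        """Handle a POSTed <cancel/>; only unfinished jobs can be cancelled."""
        if self.state[path]["status"] != "In Progress":
            return 409
        self.state[path]["status"] = "Cancelled"
        return 200


def poll_until_done(transactions, path, max_polls, on_wait=lambda: None):
    """Poll the transaction resource until it leaves 'In Progress'."""
    for _ in range(max_polls):
        state = transactions.get(path)
        if state["status"] != "In Progress":
            return state
        on_wait()  # a real client would sleep here between polls
    return transactions.get(path)
```

A real client would put a delay in on_wait and issue an HTTP GET inside the loop; the polling structure stays the same.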

Conclusion

The Resource Oriented Model that REST advocates provides a more natural fit for our service web. Therefore, I suggest that SOA implementations take the REST model as the default approach.

Published at DZone with permission of Ricky Ho, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

steve hinz replied on Thu, 2008/06/05 - 11:05pm

You're forgetting that GET or POST means more than one thing.  GetFoo(), GetBar() in wsdl means something.  To your book example, I post "blah blah blah blah". Is that a table of contents, comment, new chapter.  If you get into REST you have to decompose soooooo much that for some services it's not worth it.  Not saying it doesn't make sense for some services (it makes perfect sense for some), but it's not the be all end all.

Ricky Ho replied on Fri, 2008/06/06 - 1:20am

Thanks for using concrete examples so we can do a technical comparison ...

This is one major difference between SOAP and REST.

1) SOAP Style has few URL endpoints, but each endpoint will have many operations.  REST Style has many endpoints, each endpoint has few "standard" operations defined by HTTP Verbs.

In SOAP, GetFoo() and GetBar() need to use POST even though they are supposed to be idempotent, and the response cannot be cached either.

In REST, there will be two separate URLs.  http://xyz.com/object/foo   and  http://xyz.com/object/bar

In my book example, if you post "blah, blah, blah" ...

1) to http://xyz.com/books/123/table_of_content, then it means table of content of book123

2)  to http://xyz.com/books/123/comments, then it means a new comment to book123

3) to http://xyz.com/books/123/chapters, then it means a new chapter.  (You can also use HTTP PUT to chapters/5)

It is true that some scenarios, such as a money transfer operation from accountA to accountB, may be better represented in RPC style.  First of all, these scenarios rarely occur.  Even when they occur, you can use a REST style to represent them ...

HTTP POST  http://xyz.com/fund_transfer_operation

<operation>

  <from>http://bank1.com/account/1</from>

  <to>http://bank2.com/account/3</to>

  <amount>400</amount>

</operation> 

You also have an XML Schema attached to the payload.  Now tell me what extra stuff that WSDL gives you that XML schema is missing. 

Rgds, Ricky

George Jempty replied on Fri, 2008/06/06 - 6:21am

>To modify a resource, REST use HTTP PUT

Either this is a mistake, or the author is not qualified to write this article. Creating a resource is done with PUT; modifying is done with POST.

Anyway, REST obviously means nothing other than the *full* HTTP spec, rather than just the subset (GET/POST and maybe HEAD) that is typically used. Why a bunch of supposedly bright people - web developers - needed another name (REST) for the full HTTP spec, in order for them to see its efficacy, lord only knows.

Mr Fielding didn't come up with anything but a meme, I learned all about the HTTP spec from OReilly's "Webmaster in a Nutshell" back in 2000, and I only had 2 years total IT experience at the time. Another decade on and I realize I may as well be flipping burgers, considering how many ignorant people I'm surrounded by in this industry.

Peter Wolfenden replied on Fri, 2008/06/06 - 6:25am

I've been thinking recently about the REST design pattern, and your article nicely summarizes some of the ideas which I've collected. Thanks for writing it.

But I find your section on "Finding Resources" to be unsatisfying. Here's why:

Suppose we're dealing with objects whose internal structure is like a tree and not at all like a list. One of the advantages of the design pattern which you describe is that attributes & sub-attributes of an object may be addressed via "dependent resources".  In other words, the tree of object attributes is "exposed" as a parallel tree of URIs.

Now, suppose we want to search for a particular set of objects which have a particular set of (sub)attributes. In order to pass the query (sub)attributes to the factory resource as (a) parameter(s) I'll need to stuff them into a JSON string (or XML, ugh). Wouldn't it be much nicer to be able to manage the tree of query details via another parallel tree of URIs? This suggests the need for a "query factory", whose instances are managed just like objects - although we need a way to list them, and "run" them to generate result sets.

Next, suppose we want to view only certain (sub)attributes of each object in the result set for a query. Again, we can pass the display (sub)attributes to the URI for an object, but again (in general) I'll need to stuff them into a structure. So again, it would be nicer to be able to manage the tree of display details via yet another parallel tree of URIs. This suggests a "view factory", whose instances are managed just like objects - although we need a way to list them, and apply them to result sets.

Finally, we need some way to combine "query" and "view" operations.

One way would be to simply pass URIs for both a "query" instance and a "view" instance to the object URI.
Another way would be to have "query" instances dump their results sets in "list" instances, and to have "view" instances read data from "list" instances. In this case we need a "list factory" (and a way to list the "list"s).

Peter Wolfenden replied on Fri, 2008/06/06 - 6:31am

George Jempty - The POST vs. PUT debate raged for a while, and there are many who still feel the way you do, but the consensus seems to be on Ricky's side (see for example: http://www.elharo.com/blog/software-development/web-development/2005/12/08/post-vs-put/)

Ricky Ho replied on Fri, 2008/06/06 - 3:08pm in response to: Peter Wolfenden

I am not seeing the issue of my query structure.  Can you give an example to highlight the issue ?

Peter Wolfenden replied on Sat, 2008/06/07 - 3:12am in response to: Ricky Ho

To illustrate the query issue which I described in my posting yesterday, consider the following tree:

myobject/
  attr1/
  attr2/
    subattrA/
    subattrB/
      subsubattrX/
      subsubattrY/
    subattrC/
      subsubattrZ/
  attr3/
    subattrA/
    subattrC/
      subsubattrZ/
  attr4/
  ...
  attr23/
    subattrW/
      subsubattrK/
        subsubsubattrL/
        subsubsubattrM/
        subsubsubattrN/

Suppose each "leaf" node in the tree represents an addressable attribute of "myobject".

We can read the current value of attr3/subattrC/subsubattrZ/ for myobject instance 123 via:

  GET myobjects/123/attr3/subattrC/subsubattrZ/

and we can assign a new value to this attribute via:

  PUT myobjects/123/attr3/subattrC/subsubattrZ/ "value"

But now suppose I want to find all instances of myobject which satisfy all of the following:

attr1/ = "vanilla"
attr2/subattrA/ = "strange"
attr2/subattrB/subsubattrY/ = "lowfat"
attr4/ > "0.3"
attr23/subattrW/subsubattrK/subsubsubattrL/ = "94110"
attr23/subattrW/subsubattrK/subsubsubattrM/ = "xyzzy"
attr23/subattrW/subsubattrK/subsubsubattrN/ = "Piazza San Marco"

I *could* pack all those path=>value pairs into a string (as seems to be suggested in your original article) and run my query via:

  GET query=attr1/="vanilla"&attr2/subattrB/="strange"...(you get the idea)...

Or I could use JSON or some other structure to represent the query structure more efficiently (in the above example this would let me avoid repeating attr23/subattrW/subsubattrK/).

But I think it would be best if I could load my query into a tree-like "query resource" and pass the associated URI to the resource which actually runs the query. I see two advantages to doing this:


1) It reduces the amount of information which the client must send to the server in situations where the client needs to run several "similar" queries one after the other.

2) It provides a useful mechanism for sharing query resources among client applications.

Essentially the same reasoning applies to justify the "view resource" idea.

Peter Wolfenden replied on Mon, 2008/06/09 - 10:53am in response to: Ricky Ho

Thanks for the reply. The idea of PUTing a query "template" is appealing, and can be applied equally well to "view" resources. The only downside I see to the additional power/flexibility provided by user-defined query templates is that it becomes more difficult to optimize query performance on the back end - even well-intentioned users will occasionally DOS a web service, and the full expressive power of parentheses, boolean operators, and numerical + string comparison operators seems likely to create some hot spots that I won't be able to fix by hacking the SQL. But I'll concede that the template query pattern is superior to the tree query pattern, and if I decide that I want to limit the expressive power of my queries then the burden is on me to figure out exactly which queries I want to block (and why).

But - back to the URIs for a moment. I hadn't figured out exactly how the URI for running the query should look, and how it should be different from the URIs which are used to examine the queries themselves. And it looks like there's the same problem lurking in your example: what's the URI difference between running and viewing a query template that has no parameters?

One way to avoid this would be to run queries by POSTing the URI for a "query resource" (+ parameters, if any) to the URI for a "view resource". What comes back could either be the query results themselves, or (if the query takes more than a few seconds to run) the URI for a "results resource" with a "status" attribute (which is set to DONE when all the data has been collected) and a "data" attribute (which serves the rendered/stored query results).

Ricky Ho replied on Mon, 2008/06/09 - 12:29pm in response to: Peter Wolfenden

To distinguish between getting the query template and the query execution ...

To read the query template 

 GET xyz.com/books/queries/q1

To execute the query

 GET xyz.com/books/queries/q1/result?param1=strange&param2=0.6

For a long-running query, the "result" resource is similar to the concept of what I call a "transaction resource", mentioned in the last paragraph of my original post.

 

I expect the query template is NOT "user defined".  In most cases, it is the application developers (who write the query processing code) decide what query and parameters the application will take.  So the HTTP PUT query is mostly called by the application deployment or bootstrap process.  Its purpose is to expose the query template structure so clients can discover them.

Therefore, the backend query optimization and tuning is well under control. 

I am very interested to know what your query URI looks like without referring to a pre-defined query template.  I have no idea how to represent a general query in a tree path syntax.

Rgds,

Ricky 

Peter Wolfenden replied on Mon, 2008/06/09 - 12:55pm

In the case of q1 (which has two parameters) we can distinguish between examining the query template (GET with no params) and running the query (GET with two params). In my last query I was pointing out that this distinction doesn't work if the query template doesn't have *any* params (suppose q2 is hard-coded as param1 == "strange", for example). Also, this approach doesn't allow you to define default values for your query parameters - which we *can* do if we use entirely different URIs for examining query templates and running them.

 As for a general tree query syntax - XQuery and XPath would be a good starting point, I think, even though the "leaves" of our tree are URIs rather than XML elements.

Ricky Ho replied on Mon, 2008/06/09 - 1:55pm

To read the query template 

 GET xyz.com/books/queries/q1

To execute the query

 GET xyz.com/books/queries/q1/result

 

If you use XPath, how would it be much different from ...

  GET query=attr1/="vanilla"&attr2/subattrB/="strange"

 

Yours will just be http://xyz.com/books/query?xpath={my_xpath_query_here} 

 

Rgds, 

Peter Wolfenden replied on Mon, 2008/06/09 - 4:52pm in response to: Ricky Ho

Aha - I missed the "result" addition.

I expect that all serializations of the query into a string that can be passed in a query parameter will bear some similarity to the anti-example which I described. XPath and XQuery offer some additional expressive power (in the ability to use patterns to describe groups of paths, for example), which could make the queries (and views) smaller and possibly more elegant. But that's not really how I was hoping the pattern would look.

Again, my original desire was to find a way to *avoid* serializing the query tree, and to do so by exposing the query tree (to end users) via a set of URIs. You pointed out that query expressions are more general than query trees, which is certainly true. And making query templates "read only" by end users would certainly make it possible to optimize the SQL queries that run on the back-end DB.

It would of course be *possible* to build up a query expression using generic "logical operator" URIs, but without the need to maintain a large (and well-factored) query library there doesn't seem to be any obvious practical benefit to this approach (better to simply PUT monolithic query templates).

I'll have to think some more about the "view" templates. 

Carla Brian replied on Sun, 2012/07/01 - 8:59am

It is nice to know about this one. I would like to study more on this. This is interesting. - Mercy Ministries
