an elaborate machine is indispensable

Wednesday, November 18, 2009

Webmachine 1.5: virtual host dispatching

We recently tagged and pushed the webmachine-1.5 release, which has a number of minor bugfixes and one major new feature: resource dispatching on Host as well as on URL. There was a healthy discussion on the webmachine mailing list about this, and I think that the compromise solution that was created is a good one. The description from the changeset is quite good documentation of the new feature in my opinion.


dispatch rules can now take two different forms:

The old form: {PathMatchSpec, Module, Parameters}
The new form: {HostMatchSpec, [{PathMatchSpec, Module, Parameters}]}

The former is equivalent to the latter with HostMatchSpec={['*'],'*'}

HostMatchSpec is matched against one of (in order of preference):
X-Forwarded-For, X-Forwarded-Host, X-Forwarded-Server, Host

HostMatchSpec can have two forms:
{[HostPart], PortSpec}
or
[HostPart]
The latter is equivalent to the former with PortSpec='*'

The list of host parts is matched against the hostname extracted from
a header in much the same way that PathMatchSpec is matched against
the path.

Examples:

{[], root_resource, [x]}.
{['*'], [{[], root_resource, [x]}]}.
{{['*'],'*'}, [{[], root_resource, [x]}]}.
Will each match the root path of any host.

{["example","com"], [{[], root_resource, [x]},
                     {["static"], static_resource, [y]}]}.
Will dispatch the root of example.com to root_resource and
example.com/static to static_resource.

{['*',"example","com"], [{[], root_resource, [x]},
                         {["static"], static_resource, [y]}]}.
Will do the same as above, but also for any subdomains of
example.com.

{{[host,"local"], 8000}, [{[], res_A, [x]}]}.
{{[host,"local"], 8001}, [{[], res_B, [x]}]}.
Will dispatch requests to ?.local:8000/ to res_A and requests
to ?.local:8001/ to res_B, binding the host part immediately
preceding ".local" to 'host', such that
wrq:get_path_info(host, ReqData) would return the matched string.


Some notable features of this approach include complete backward compatibility (allowing new host-specific rules to be added to old URL-only dispatch lists without rewriting the entire list) and bringing the same simple pattern-matching style of dispatch to the host portion of the problem.
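To make the backward-compatibility point concrete, here is a minimal sketch of a dispatch list that mixes both rule forms. The resource module names (admin_resource, api_resource, static_resource) and the hostnames are hypothetical, chosen only for illustration. I am assuming rules are tried in order with the first match winning, so the host-specific rules are listed ahead of the old catch-all rules.

```erlang
%% New-style, host-specific rules, added ahead of the existing list.
%% Module names and hostnames here are made up for illustration.
{["admin","example","com"], [{[], admin_resource, []}]}.
{{["api","example","com"], 8080}, [{["v1", '*'], api_resource, []}]}.

%% Old-style, URL-only rules, left exactly as they were.
%% Each behaves as if written with HostMatchSpec = {['*'],'*'}.
{[], root_resource, [x]}.
{["static", '*'], static_resource, [y]}.
```

Under that ordering assumption, a request for the root of admin.example.com reaches admin_resource, while the root of any other host still falls through to root_resource.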

You may also be interested to see a new site for Webmachine at http://webmachine.basho.com/.

That site is currently just a different structure on the documents from the bitbucket wiki, but it is likely to grow over time.

As of this release, we are doing away with private development branches and expect to work against the bitbucket tip by default. Making that same change for Riak has certainly paid off, and we hope that with Webmachine it will also allow people to more easily get involved and work against the active codebase.

Enjoy!

Wednesday, August 26, 2009

Webmachine as an application front-end

Webmachine 1.4 is pushed to bitbucket, and is providing the HTTP face for a few interesting new software systems.

The very cool guys at Collecta have built a different sort of search engine, great for watching the flow of social-network sorts of conversation occurring about whatever topics you are interested in. The REST API powering that engine is written in Webmachine, and takes advantage of some of Webmachine's more interesting features while interacting with other components. I'll leave more detailed explanation of their technology to their excellent team and just say that they've built something very deserving of attention.

We have also used Webmachine to provide the HTTP interface to our decentralized document store: Riak. Webmachine's ability to provide the full richness of HTTP's capabilities while not dictating anything else about the shape of your application makes it a natural fit for the front end of a system like Riak. The first incarnations of Riak and Webmachine were each built at about the same time: late 2007. As they grew up together there were some internal interface design decisions that were more obvious to us as a result. The structure of Riak's core operations maps nicely to the universal interface that includes GET, PUT, POST, and DELETE not just by name but by their essential properties such as idempotency, safety, and defined semantics.

One aspect of the Web that has been essential to its success as the biggest distributed system in the world is the notion of links. This is also one of the things that differentiates Riak from some other data storage systems. It's not exactly a graph database, but it adds some elements of a graph database on top of the great benefits that come with having a decentralized key/value store at the core. Documents in a Riak cluster can have links to other documents in that cluster. Riak itself can take great advantage of this internally, as the MapReduce programming model that we use is ideally suited to walking links in order to build up inputs for the next phase of computation.

These links ought to also be useful to clients, and in the context of HTTP this should be possible in a way that does not assume application knowledge on the part of a client. To that end, the newest release of Riak includes support for the Link header in HTTP responses. This allows clients to explore the link structure of a set of related documents without having to read or understand the body of those documents. Based on our past experience building applications atop Webmachine and Riak, we expect this to be an added bonus for rapid development.

Monday, June 15, 2009

Webmachine 1.3: streamed bodies, multipart forms, and efficiency

Easily the most requested feature for Webmachine since its release has been the ability to "stream" the request and/or response bodies, instead of having to receive or send them in one potentially-large hunk. As of the most recent version, this feature is now available. See the wiki page for details on the API.
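As a rough sketch of what a streamed response can look like: my understanding of the API is that a body-producing function may return {stream, StreamBody} in place of a complete body, where StreamBody is a pair of the next chunk and either the atom done or a zero-arity fun producing the next pair. The module name countdown_resource and its content are hypothetical; check the wiki page for the authoritative description.

```erlang
-module(countdown_resource).
-export([init/1, to_html/2]).

init([]) -> {ok, undefined}.

%% Instead of returning one large body, hand back the first chunk
%% plus a fun that produces the rest on demand.
to_html(ReqData, Ctx) ->
    {{stream, chunk(3)}, ReqData, Ctx}.

%% Each step yields {Data, Next}; Next is 'done' on the final chunk.
chunk(0) -> {<<"liftoff\n">>, done};
chunk(N) -> {[integer_to_list(N), "\n"], fun() -> chunk(N - 1) end}.
```

The point of the shape is that only one chunk needs to exist in memory at a time, since each fun is called only when the previous chunk has been sent.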

A number of other changes are also in, such as multipart form parsing, improved efficiency by changing a gen_server (per request) into a parameterized module, and so on... but I suspect that the streamed bodies are what people are really looking for most. Enjoy!

Monday, June 1, 2009

REST and HTTP services as a business advantage

The advantages of HTTP as an application protocol (not just a transport) as opposed to many other networked service models are not abstract, idealized technical advantages. They directly affect your -- and your partners' -- cost of doing business.

At Basho, our services integrate out of necessity with those of many kinds of partner companies, including CRM, Business Intelligence, Search, and more. We consider ourselves lucky in general when a company we'd like to partner with exposes any consistent and documented interface for this purpose.

However, when those interfaces are SOAP or another RPC-shaped system it means that each integration is a fairly major new project even when the resulting connections between applications are conceptually small. This is because you have to learn the programming model of that other service and work as though you were a developer of that service -- learning their calling conventions, naming schemes, error conditions, and so on.

We recently had the pleasure of integrating with Jigsaw's data service. While they don't quite match up to the ideals of REST just yet, their service is young and the interface is already far better than that of many other business-to-business integration APIs. Not only did they deliver a cleaner and easier service than expected, I suspect that they did so at lower cost than many others. How?

By using HTTP.

Even the coarsest approximation of the Web's uniform interface gives you a much better running start than is possible with, say, WSDL and SOAP. Jigsaw's Web interface isn't perfect (GET requests are idempotent but not safe, and a couple of status codes are incorrectly used) but it is simple and it isn't surprising. The fact that there is already a completely interoperable HTTP client in every major programming language means that, instead of using some WSDL to generate 10,000 lines of code and then building a client on top of that, we were able to just jump in and immediately write working client code. The resulting client code was also about 20% as long as the manually written portion of our client code in comparable services that use SOAP.

I'm not talking about ideal systems, and I'm not talking about idealistic academic goals. I'm just talking about the simple realities of how your technical choices affect the level of effort that your partners must apply in order to work with you. That simple reality has a direct and powerful effect on the bottom line.

Wednesday, May 27, 2009

Video Slideshow, Introducing Webmachine

The Webmachine talk at Bay Area Erlang Factory 2009 went quite well. I received useful feedback, and some very interesting and productive conversations spun off after the talk.

For anyone interested who wasn't there, I have recorded a voiceover with the slides and made that video available here. The slides are the same ones used at the conference, but I trimmed the speaking portion a bit. This version is a bit under half an hour; it leaves out a few minor topics but still covers all of the material needed to introduce Webmachine.

Enjoy!

Tuesday, May 26, 2009

Webmachine 1.2

There are a few changes in webmachine-1.2 that deserve mention.

We simplified the API to the dispatcher module so that it can be used easily in a standalone fashion. In cases where another application (such as CouchDB) wants to use Webmachine-style dispatching, it is now easy to just call webmachine_dispatcher:dispatch/2 and get a useful result without any of the rest of Webmachine running. A trivial example:

1> webmachine_dispatcher:dispatch("/a",[{["a"],some_resource,[]}]).
{some_resource,[],[],[],".",[]}

The other change that is most interesting from a feature point of view is that the request body is not read off the socket until the first time wrq:req_body/1 is called. This means that a resource can (for example) return an error response code without having to wait for the body to be pulled off the wire first.
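A minimal sketch of a resource that benefits from this, assuming the standard resource callbacks known_content_type/2 and process_post/2. The module name upload_resource and the accepted media type are hypothetical. The content-type check looks only at headers, so a rejected request gets its error response without wrq:req_body/1 ever being called.

```erlang
-module(upload_resource).
-export([init/1, allowed_methods/2, known_content_type/2, process_post/2]).

init([]) -> {ok, undefined}.

allowed_methods(ReqData, Ctx) ->
    {['POST'], ReqData, Ctx}.

%% Decided from headers alone. Returning false produces an error
%% response, and the body is never pulled off the socket.
%% (Exact string match for brevity; a real check would also accept
%% parameters such as "; charset=utf-8".)
known_content_type(ReqData, Ctx) ->
    Known = wrq:get_req_header("content-type", ReqData) =:= "text/plain",
    {Known, ReqData, Ctx}.

%% First (and only) place the body is requested, so this is where
%% the read from the socket actually happens.
process_post(ReqData, Ctx) ->
    Size = case wrq:req_body(ReqData) of
               Body when is_binary(Body) -> byte_size(Body);
               _ -> 0
           end,
    Reply = io_lib:format("received ~p bytes~n", [Size]),
    {true, wrq:set_resp_body(Reply, ReqData), Ctx}.
```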

There is also a change in the new_webmachine project creation script. Your list of dispatch terms will now by default be in a separate file ("priv/dispatch.conf") instead of directly in your application's _sup file.
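For illustration, a priv/dispatch.conf might look like the following. This is a sketch rather than the file the script generates; item_resource and the item_id binding are hypothetical names. Each dispatch term stands alone and ends with a period, which is the form Erlang's file:consult/1 reads, and I am assuming that is how the file is loaded.

```erlang
%% priv/dispatch.conf
%% One dispatch term per entry: {PathMatchSpec, Module, Parameters}.
{["a"], some_resource, []}.

%% An atom in the path spec binds that path segment to a name.
{["items", item_id], item_resource, []}.
```

Keeping the rules in a separate file means they can be edited without touching the supervisor module.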

This version is identified with the "webmachine-1.2" Mercurial tag.

In upcoming versions, we hope to add a few much-clamored-for features such as host-based dispatching and incremental request/response body reading and writing.

Tuesday, April 28, 2009

A Simple Webmachine Example

Bryan Fink (of BeerRiot fame and a colleague at Basho) recently posted a great example of how easy it is to make a useful and working Webmachine resource.

He then followed up with more examples, showing how easy it is to add support for PUT, for authorization, and for entity tags.

Yesterday's post wrapped up his short series by not only adding DELETE support but also reflecting on the nature of Webmachine and how it lets you improve the way you think about and use the power of HTTP in your applications.

Bryan knows both Erlang and Web programming well; his examples are worth the read.