Jay's blog

HATEOAS: I want to believe

There's a growing voice in the web dev community. I'm going to call them "web fundamentalists". They hold certain beliefs about web development.

I agree with most, if not all, of these beliefs, in principle. However, I find that many of these beliefs are often overly idealistic and can fall apart when theory meets practice.

Uphill both ways

Just like many forms of fundamentalism, this form is probably influenced by how quickly and drastically change has occurred. Web development looks completely different today than it did 10 years ago. In a rapidly changing environment, it's easy to fall prey to rose-tinted nostalgia.

I remember when a web application generated all of a page's HTML on the server side. Slowly but surely, more and more logic would find its way into the templates. This was bad because template logic was generally slower than non-template logic. I have worked on applications where more than half of a server request's time was spent rendering the HTML. It was a bad time. In the short term, people started pushing so-called "logic-less templates". Mustache templates became the new hotness.

Longer term, the prevailing wisdom was that we needed to stop treating web browsers like thin clients. Can you believe there was ever any doubt? Web browsers are the thickest of thick clients. The web browser is the new operating system. Look at Chromebooks! People gladly buy laptops that can only run a glorified web browser.

If more than half of your server's workload is building HTML, basically shuffling strings around, you're paying to do something your clients' devices are capable of doing on your behalf. Plus, there's so much redundant data going over the wire: you're sending your web app's navigation code with every page request. Ideally, you should amortize the templating and rendering by making your users' devices handle it. Your server doesn't have to do it, so you don't have to pay for the compute time or the data transfer.

I was there at the time and this made perfect sense. Your server only has to deal in pure data and your front-end code only needs to worry about displaying that data and performing server requests. This is what led to REST APIs and SPAs.

But modern solutions call for modern problems and now we have to care about the efficiency with which data is represented and queried. None of that mattered previously because it all happened on the server. Your browser requested one whole page and the server returned one whole page. Since accessing the data and applying it to a template was essentially a single task accomplished by a single program running on a single machine, tightly coupling them wasn't a big deal.

With SPAs, you're encouraged to think of your server and your front-end as two different programs that happen to talk to one another. Wouldn't it be nice if your data representation wasn't tied to any particular display format? Your data is its own platonic ideal of entities and relationships, not all that different from a database. In fact, your application server isn't much more than a thin wrapper around your database. The only extra stuff it needs to do is authentication, authorization, data serialization (into JSON, the web's data serialization format of choice), and some data validation that's more application-specific than you can probably represent in a database alone.

R-E-S-T, find out what it means to me

Representational state transfer, or REST, was the answer to the question of what protocol the web would standardize on for this new world of front-ends and back-ends. Almost.

REST, at its heart, is basically a remote procedure call interface that plays to the strengths of the web's existing technology. Specifically, URIs, HTTP verbs, HTTP response codes, and browser caches.
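
To make that concrete, here's a quick sketch of how those pieces usually map onto a resource in practice. The /authors endpoints are hypothetical, but the verb-to-operation pairing is the usual RESTful convention:

// Read: a GET against the URI that identifies the resource; responses are cacheable.
const author = await fetch("/authors/123").then((res) => res.json());

// Create: POST to the collection URI; a 201 Created response code signals success.
await fetch("/authors", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ firstName: "Stephen", lastName: "King" }),
});

// Update: PUT replaces the resource at that URI.
await fetch("/authors/123", {
  method: "PUT",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ firstName: "Stephen", lastName: "King", birthday: "1947-09-21" }),
});

// Delete: DELETE removes it; a later GET should come back with a 404.
await fetch("/authors/123", { method: "DELETE" });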

There's a problem, though, at least if you ask a REST purist. Nobody uses it "correctly".1 2 Your API technically isn't REST unless it uses HATEOAS.

Only love can conquer HATEOAS

Hypermedia as the engine of application state, or HATEOAS. The name that just rolls off the tongue.

I'm not going to explain it in extreme detail here, but the idea, roughly, is that your API returns JSON objects that look like this:

{
  "author": {
    "id": 123,
    "firstName": "Stephen",
    "lastName": "King",
    "birthday": "1947-09-21T05:00:00.000Z",
    "links": {
      "books": "/authors/123/books"
    } 
  }
}

A given model contains all the information you'd expect, plus a links (or _links in some implementations) object containing the URIs of various related models.

Honestly, at first blush, this seems like a really reasonable way to do things. It could benefit from some kind of library to parse these responses automatically, but it's simple enough that almost any developer could roll their own.
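
As a rough illustration, a hand-rolled client doesn't need much. This is just a sketch in TypeScript; the links shape and the /authors endpoints are the made-up ones from the example above, not any particular standard:

// A resource's links object maps relationship names to URIs.
type Links = Record<string, string>;

interface Author {
  id: number;
  firstName: string;
  lastName: string;
  birthday: string;
  links: Links;
}

async function getJson<T>(uri: string): Promise<T> {
  const res = await fetch(uri);
  if (!res.ok) throw new Error(`GET ${uri} failed with ${res.status}`);
  return res.json() as Promise<T>;
}

// Fetch the author, then follow the advertised "books" link instead of
// hard-coding the /authors/123/books URI on the client.
const { author } = await getJson<{ author: Author }>("/authors/123");
const books = await getJson<unknown>(author.links.books);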

However, there's at least one flaw that I find glaring where the rubber meets the road, and it shows up when you need to traverse more than one relationship in the graph.

Suppose, for example, my front-end is consuming an API like the one that would generate that Stephen King model above. I want to get a list of publishers who have published Stephen King's books. Imagine the book model looks like this:

{
  "book": {
    "id": 720,
    "title": "The Dead Zone",
    "pages": 428,
    "publicationDate": "1979-08-30T05:00:00.000Z",
    "isbn": "978-0-670-26077-5",
    "links": {
      "author": "/authors/123",
      "publisher": "/publishers/902"
    }
  }
}

What I can do is request all of King's books with GET /authors/123/books and then call a GET on the publisher link for each book.
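
Sketched out, the fan-out is easy to see. I'm assuming here that the books endpoint returns an array of book objects shaped like the example above:

interface Book {
  id: number;
  title: string;
  links: { author: string; publisher: string };
}

// One request for the list of books...
const { books } = (await fetch("/authors/123/books").then((res) => res.json())) as {
  books: Book[];
};

// ...then one more request per book to resolve its publisher.
// N books means N additional round trips, before even de-duplicating publishers.
const publishers = await Promise.all(
  books.map((book) => fetch(book.links.publisher).then((res) => res.json()))
);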

However, subjectively speaking, this sucks. I don't know if you've ever had to work with an app that works this way, but it's a bad time for users. Best case scenario, this is going to take a lot of time and a lot of bandwidth due to the overhead of making one API request per book, even with HTTP/3 or whatever, where HTTP is no longer one transaction per connection. And the browser is going to choke if you fire off dozens of AJAX3 requests at once.4

Okay, okay. I know that what I described is a ridiculous approach that I definitely have never seen in the wild, much less worked on an app that does it. What we'd actually do is add a new link to the author model that does the relational hop for us on the back-end and returns the results we want. Something like /authors/123/publishers.
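
For what it's worth, the boilerplate usually looks something like this. I'm sketching it as an Express route, and the query function is a stand-in for whatever your data layer actually is:

import express from "express";

const app = express();

// Stand-in for your ORM, query builder, or raw SQL. Conceptually it's:
//   SELECT DISTINCT publishers.* FROM publishers
//   JOIN books ON books.publisher_id = publishers.id
//   WHERE books.author_id = ?
async function publishersForAuthor(authorId: number): Promise<unknown[]> {
  return []; // placeholder; a real implementation would query using authorId
}

// The "relational hop" endpoint: the server does the join so the client
// only has to make one request.
app.get("/authors/:id/publishers", async (req, res) => {
  const publishers = await publishersForAuthor(Number(req.params.id));
  res.json({ publishers });
});

app.listen(3000);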

I don't know about you, but I hate adding boilerplate endpoints to apps. Seems like something metaprogramming would handle for you. There would need to be some guidelines from the programmer, though. If I just gave the program my database and told it to figure it out, I'd end up with technically valid but meaningless relationships.5 Not to mention the cycles.

What I find myself describing almost exists already, but it's not HATEOAS. It's GraphQL.

I know there are a lot of GraphQL haters out there. And I totally understand. It's not as simple to add GraphQL to an app as it is to incrementally add endpoints that speak JSON. Most GraphQL implementations, both front-end and back-end, are heavy-weight dependencies. And GraphQL itself is not a panacea for the problem of connecting front-end and back-end.

I've used GraphQL a few times in my career. Never for anything that's Serious Business™️. One time was in a Rails app. Another time was with Gatsby.

What impressed me most about using it was the quality of the developer experience. Once you set up your models, querying data was a breeze. Introspection, auto-completion, and type inference made writing queries a joy. Having your data query live in the same file as your display logic was the same kind of revelation as when React gave us JSX and our templates could live alongside our component logic. It's a code locality win.
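
Here's a rough sketch of what that locality looks like, assuming Apollo Client and a schema with the author, books, and publisher fields from earlier (the schema itself is my invention, not something from any of those projects):

import { gql, useQuery } from "@apollo/client";

// The query lives in the same file as the component that renders its results.
const AUTHOR_PUBLISHERS = gql`
  query AuthorPublishers($id: ID!) {
    author(id: $id) {
      books {
        title
        publisher {
          name
        }
      }
    }
  }
`;

function AuthorPublishers({ id }: { id: string }) {
  const { data, loading, error } = useQuery(AUTHOR_PUBLISHERS, { variables: { id } });
  if (loading) return <p>Loading...</p>;
  if (error) return <p>Something went wrong.</p>;
  return (
    <ul>
      {data.author.books.map((book: { title: string; publisher: { name: string } }) => (
        <li key={book.title}>
          {book.title} ({book.publisher.name})
        </li>
      ))}
    </ul>
  );
}

Note that the whole author-to-books-to-publishers hop from earlier happens in a single request here.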

Combine that with Prisma's tooling6 on the back-end and the need to write boilerplate endpoints virtually disappears.
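
As a taste of what that tooling buys you: assuming a Prisma schema with Author, Book, and Publisher models and the obvious relations, the generated client can already express the relational hop, and the result comes back fully typed.

import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

// No hand-written /authors/:id/publishers endpoint needed to express the hop;
// the nested include follows author -> books -> publisher.
const authorWithPublishers = await prisma.author.findUnique({
  where: { id: 123 },
  include: { books: { include: { publisher: true } } },
});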

GraphQL also solves the problem of data validation and sanitization, which HATEOAS doesn't address. Neither solves the problem of authorization, though. My point is that metaprogramming can't account for all the drudgery associated with a CRUD back-end.

The other HATEOAS

Let's get pedantic for a second. I talked about HATEOAS and I gave an example using JSON. The example I gave is an example of HATEOAS, but it's just one implementation. The most important word in HATEOAS is "hypermedia". The JSON examples I gave qualify as hypermedia because they have links.

You know what else qualifies as hypermedia? The OG, HTML! HyperText Markup Language.

There are some folks who have made various tools exploring this idea of HTML as HATEOAS. Their latest creation is HTMX.

I think it would be worth trying to create a web application using HTMX. The problem is, it bucks the trends of Big JavaScript™️.

You can still have the things that make development easier.

The catch is, you'll be going back to rendering your HTML on the server side. But at least you don't have to render entire pages per request. You can absolutely return page fragments that get inserted into an existing page.7
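
A sketch of what I have in mind, with the usual caveat that I haven't actually built this: an Express route that returns a fragment, and an element that uses HTMX's hx-get and hx-target attributes to swap it in.

import express from "express";

const app = express();

// Somewhere in the already-rendered page:
//   <div id="books"></div>
//   <button hx-get="/authors/123/books-fragment" hx-target="#books">Load books</button>
// HTMX issues the GET and swaps the returned fragment into the #books element.
app.get("/authors/:id/books-fragment", async (req, res) => {
  const books = await booksForAuthor(Number(req.params.id)); // stand-in data access
  const items = books.map((book) => `<li>${book.title}</li>`).join("");
  res.send(`<ul>${items}</ul>`); // an HTML fragment, not a full page
});

async function booksForAuthor(authorId: number): Promise<{ title: string }[]> {
  return []; // placeholder
}

app.listen(3000);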

Wrapping up

I don't mean to be overly critical of the web dev hippies out there. I see you. I love you. I agree with you most of the time.

I am critical of JSON HATEOAS implementations. Without some kind of tooling for metaprogramming, I've found they lead to boilerplate fatigue. I don't enjoy churning out CRUD endpoints. I also hate when adding a new endpoint leads to a large impact radius in the code. Adding one new model, for example, is one thing. But then I have to modify the endpoints of all the other models that have relationships to the new model, just to expose the links I want to use.

I don't think JSON HATEOAS implementations are inherently bad. How can they be? They're implementations. I don't think hypermedia is bad. How could I? The whole web is built on it. I do think that hypermedia is often an inappropriate medium to drive some of the applications we use it for. But, to its credit, hypermedia is adaptable enough that we can usually bend it into something resembling the shape we desire.

I have found that GraphQL, as a query language, is a lot more powerful than JSON HATEOAS implementations for quickly producing maintainable web applications. But that's mostly due to the quality of its assistive tools. If I had to write GraphQL queries by hand with nothing to help me, I would probably be complaining about it, too.


  1. REST is an architectural style, not a standard. Therefore, "correctness" is up to interpretation.↩

  2. "Correct" REST can be pretty controversial. For example, server-side sessions are technically disallowed, whether that means using a session ID in a cookie or as part of the URI. This is because sessions introduce state, whereas REST is supposed to be stateless. Also, sessions mean there is data which is opaque to the user, which is deemed a security and privacy risk. But what's the alternative? Sending credentials along with every request that needs it seems less secure to me (assuming sessions are implemented securely, which is a big assumption, I know). JSON Web Tokens are not opaque, for better or worse, but it's considered better practice to prefer a session token for systems where the application server is the authentication authority.↩

  3. It's 2023. Are we still calling this AJAX? This is just how the web works now, right? 🤷↩

  4. Ask me how I know.↩

  5. My library card catalog example is too simple to really show this, but anyone who has worked with a production database knows what I'm talking about.↩

  6. I'm not a fan of ORMs generally, but I've found Ecto and Prisma to be the least objectionable in my experience so far. Neither is perfect, but I'm particularly impressed by Prisma's tooling, which generates TypeScript types for your database models. We need more metaprogramming tools like this.↩

  7. I haven't tried this myself yet, but I'm very intrigued by the concept. The ultimate would be if all response bodies are complete HTML pages. I'm not sure how that would work. Iframes everywhere seems too heavy. It just feels weird to me to return HTML fragments from a back-end.↩