Creating a REST API
You've heard of REST before, but what is REST? How does it work? Can you build one from scratch? Does serverless make your REST life easier?
In this chapter, learn about REST best practices and finish with a small implementation you can try right now. I left mine running ✌️
What is REST
REST stands for REpresentational State Transfer. Coined in Roy Fielding's 2000 doctoral thesis, it now represents the standard approach to web APIs.
Fielding says REST is an architectural principle. It recommends how to build a scalable web system, but there's no official standard.
You may have noticed this in the wild. RESTful APIs follow similar guidelines and no two are alike.
These days any API that uses HTTP to transfer data and URLs to identify resources is called REST.
Here are the 6 architectural constraints that define a RESTful system 👇
- client-server architecture specifies a separation of concerns between the user interface and the storage of data. This simplifies both sides and lets them evolve independently.
- statelessness specifies that the protocol is stateless. Each request to the server contains all information necessary to process that request. Servers do not maintain client context.
- cacheability specifies that clients can cache any server response to improve performance on future requests. Servers have to annotate responses with appropriate caching policies via http headers
- layered system means that, like regular HTTP, a RESTful client shouldn't need to know whether it's talking to a server, a proxy, or load balancer. This improves scalability.
- code on demand (optional) says that servers can send executable code as part of their responses. You see this with web servers sending JavaScript.
- uniform interface is the most fundamental and means that clients can interact with a server purely from responses, without outside knowledge.
A uniform interface
Creating a uniform interface is the most important aspect of a RESTful API. The less clients know about your server, the better.
Each request identifies the resource it is requesting. Using the URL itself.
Responses send a representation of a resource rather than the resource itself. Like compiling a set of database objects into a JSON document.
All messages are self-descriptive meaning both client and server can understand a message without external information. Send everything you need.
A resource's representation includes everything needed to modify that resource. When clients get a response, it should contain everything necessary to modify or delete the underlying resources.
Academics say responses should list possible actions so clients can navigate a system without intimate knowledge of its API. You rarely see this in the wild.
Designing a RESTful API
The trickiest part of building a RESTful API is how it evolves over time. The second trickiest is keeping it consistent across your app.
As your system grows you're tempted to piggy-back on existing APIs and break resource constraints. You're likely to forget the exact wording and phrasing of different parts.
All that is natural. The important part is to start on the right foot and clean up when you can.
Engineers are human
When engineers can do something with your API, they will.
They're going to find every undocumented feature, discover every "alternative" way to get data and uncover any easter egg you didn't mean to include. They're good at their job.
Do yourself a favor and aim to keep everything consistent. The more consistent you are, the easier it will be to clean up.
That means
- all dates in the same format
- all responses in the same shape
- keep field types consistent, if an error is a string it's always a string
- include every field name and value in full
- when multiple endpoints include the same model, make it the same
Here are tips I've picked up over the past 14 years of building and using RESTful APIs.
URL schema
Your URL schema exists to solve one problem: Create a uniform way to identify resources and endpoints on your server.
Keep it consistent, otherwise don't sweat it.
Engineers like to get stuck on pointless details, but it's okay. Make sure your team agrees on what makes sense.
When designing a URL schema I aim to:
- make it guessable, which stems from consistency and predictability
- make it human readable, which makes it easier to use, debug, and memorize
- avoid query strings because they look messy and can make copy paste debugging harder
- avoid strange characters like spaces and emojis. It looks cute, but it's cumbersome to work with
- include IDs in URLs because it makes debugging and logging easier
The two schemas I like, go like this:
https://api.wonderfulservice.com/<namespace>/<model>/<id>
Using an api.*
subdomain helps with load balancing and using special servers for your API. Less relevant in the serverless world because providers create unique domains.
The optional <namespace>
helps you stay organized. As your app grows, you'll notice different areas use similar-sounding names.
You don't need a namespace for generic models like user
or subscription
.
Adding a <model>
that's named after the model on your backend helps you stay sane. Yes it breaks the idea of total separation between client and server, but it's useful to maintain consistency.
If everyone calls everything the same name, you never need to translate. 😉
The <id>
identifies the specific instance of a model that you're changing.
Sometimes it's useful to use this alternative schema:
https://api.wonderfulservice.com/<namespace>/<model>/<verb>/<id>
The verb specifies what you're doing to the model. More on that when we discuss HTTP verbs further down.
Data format
Use JSON. Both for sending data and for receiving data.
Great support in common programming languages, easy to use with JavaScript, simple to write by hand, readable by humans, not verbose. That makes JSON the perfect format for a modern API.
For timestamps I recommend the ISO 8601 standard for the same reason. Great tooling in all languages, readable by humans, writable by hand.
UNIX timestamps used to be popular and are falling out of favor. Hard to read for humans, don't work great with dates beyond 2038, bad at dates before 1970.
Stick to ISO time and JSON data 😊
Using HTTP verbs
Verbs specify what your request does with a resource.
Opinions on how to pass verbs from client to server vary. There's 3 camps:
- stick to HTTP verbs
- verbs belong in the URL
- verbs are part of the JSON payload
Everyone agrees that a GET
request is for getting data and should have no side-effects.
Other verbs belong to the POST
request. Or you can use HTTP verbs like PUT
, PATCH
, and DELETE
.
I like to use a combination.
GET
for getting data. POST
for posting data (both create and update). DELETE
for deleting ... on the rare occasion I let clients delete data.
PUT
and PATCH
can be frustrating to work with in client libraries.
Errors
Should you use HTTP error codes to communicate errors?
Opinions vary.
One camp says that HTTP errors are for HTTP-layer problems. Your server being down, a bad URL, invalid payload, etc. When your application processes a request and decides there's an error, it should return 200 with an error object.
The other camp says that's silly and we already have a great system for errors. Your application should use the full gamut of HTTP error codes and return an error object.
Always return a descriptive JSON error object. That's a given. Make sure your server doesn't return HTML when there's an error.
An error shape like:
{"status": "error","error": "This is what went wrong"}
is best.
I like to combine both approaches. 500
errors for my application failing to do its job, 404
for things that aren't found, 200
with an explanation for everything else. Helps avoid hunting for obscure error codes and sticks to the basics of HTTP.
Added bonus: You can read the error object. Will you remember what error 418 means?
Versioning
I have given up on versioning APIs.
Your best bet is to maintain eternal backwards compatibility. Make additive changes when needed, remove things never.
It leads to messy APIs, which is a shame, but better than crashing an app the user hasn't updated in 3 years.
When you need a breaking change, it's unlikely the underlying model fits your current URL schema. Use a new URL.
Build a simple REST
To show you how this works in practice, we're going to build a serverless REST API. Accepts and returns JSON blobs, stores them in DynamoDB.
You can see the full code on GitHub. I encourage you to play around and try deploying to your own AWS.
Code samples here are excerpts.
URL schema mapped to Lambdas
We start with a URL schema and lambda function definitions in serverless.yml
.
# serverless.ymlfunctions:getItems:handler: dist/manageItems.getItemevents:- http:path: item/{itemId}method: GETcors: trueupdateItems:handler: dist/manageItems.updateItemevents:- http:path: itemmethod: POSTcors: true- http:path: item/{itemId}method: POSTcors: truedeleteItems:handler: dist/manageItems.deleteItemevents:- http:path: item/{itemId}method: DELETEcors: true
This defines the 4 operations of a basic CRUD app:
getItem
to fetch itemsupdateItem
to create itemsupdateItem/id
to update itemsdeleteItem
to delete items
The {itemId}
syntax lets APIGateway parse the URL for us and pass identifiers to our code as parameters.
Mapping every operation to its own lambda means you don't have to write routing code. When a lambda gets called, it knows what to do.
We define lambdas in a manageItems.ts
file. Grouping operations from the same model helps keep your code organized.
getItem
// src/manageItems.ts// fetch using /item/IDexport const getItem = async (event: APIGatewayEvent): Promise<APIResponse> => {const itemId = event.pathParameters ? event.pathParameters.itemId : ""const item = await db.getItem({TableName: process.env.ITEM_TABLE!,Key: { itemId },})if (item.Item) {return response(200, {status: "success",item: item.Item,})} else {return response(404, {status: "error",error: "Item not found",})}}
The getItem
function is triggered from an APIGatewayEvent. It includes a pathParameters
object that contains an itemId
parsed from the URL.
We then use a DynamoDB getItem
wrapper to talk to the database and find the requested item.
If the item is found, we return a success response, otherwise an error. The response
method is a helper that makes responses easier to write.
// src/manageItems.tsfunction response(statusCode: number, body: any) {return {statusCode,body: JSON.stringify(body),}}
You can try it out here:
Try your function
Leave payload empty for GET requests, valid JSON for POST.
Or paste that URL into a browser.
updateItem
Updating an item is the most complex operation we've got.
It handles both creating and updating items, always assuming that the client knows best. You'll want to validate that in production.
Here we overwrite server data with the client payload.
// upsert an item// /item or /item/IDexport const updateItem = async (event: APIGatewayEvent): Promise<APIResponse> => {let itemId = event.pathParameters ? event.pathParameters.itemId : uuidv4()let createdAt = new Date().toISOString()if (event.pathParameters && event.pathParameters.itemId) {// find item if existsconst find = await db.getItem({TableName: process.env.ITEM_TABLE!,Key: { itemId },})if (find.Item) {// save createdAt so we don't overwrite on updatecreatedAt = find.Item.createdAt} else {return response(404, {status: "error",error: "Item not found",})}}if (!event.body) {return response(400, {status: "error",error: "Provide a JSON body",})}let body = JSON.parse(event.body)if (body.itemId) {// this will confuse DynamoDB, you can't update the keydelete body.itemId}const item = await db.updateItem({TableName: process.env.ITEM_TABLE!,Key: { itemId },UpdateExpression: `SET ${db.buildExpression(body)}, createdAt = :createdAt, lastUpdatedAt = :lastUpdatedAt`,ExpressionAttributeValues: {...db.buildAttributes(body),":createdAt": createdAt,":lastUpdatedAt": new Date().toISOString(),},ReturnValues: "ALL_NEW",})return response(200, {status: "success",item: item.Attributes,})}
The pattern we're following is:
- if no ID provided, generate one
- find item in DB
- update or create a
createdAt
timestamp - parse the JSON body
- create an update expression
- add a
lastUpdatedAt
timestamp - return the resulting object
Adding createdAt
and lastUpdatedAt
timestamps is helpful when debugging these systems. You want to know when something happened.
You can find the db.*
helper methods in my dynamodb.ts
file
You can try it out here:
Create an item 👇
Try your function
Leave payload empty for GET requests, valid JSON for POST.
Update an item, try using the ID from before 👇
Try your function
Leave payload empty for GET requests, valid JSON for POST.
deleteItem
Deleting is easy by comparison. Get the ID, delete the item. With no verification that you should or shouldn't be able to.
// src/manageItems.tsexport const deleteItem = async (event: APIGatewayEvent): Promise<APIResponse> => {const itemId = event.pathParameters ? event.pathParameters.itemId : nullif (!itemId) {return response(400, {status: "error",error: "Provide an itemId",})}// DynamoDB handles deleting already deleted values, no error :)const item = await db.deleteItem({TableName: process.env.ITEM_TABLE!,Key: { itemId },ReturnValues: "ALL_OLD",})return response(200, {status: "success",itemWas: item.Attributes,})}
We get itemId
from the URL props and call a deleteItem
method on DynamoDB. The API returns the item as it was before deletion.
PS: in a production system you'll want to use soft deletes – mark as deleted and keep the data
Fin ✌️
And that's 14 years of REST API experience condensed into 2000 words. My favorite is how much easier this is to implement using serverless than with Rails or Express.
I love that you can have different lambda functions (whole servers, really) for individual endpoints.
Next chapter, we look at GraphQL and why it represents an exciting future.