Name | SID | ITSC | GitHub |
---|---|---|---|
Ng Chi Him | 20420921 | chngax | chihimng |
Leung Lai Yung | 20422412 | lyleungad | Benker-Leung |
Lee Pan Yin | 20258366 | pyleeag | fredlee3107 |
Serverless architecture offers great scalability and flexibility to programmers: development effort can be focused on business logic, while deployment and server maintenance are offloaded to the framework. As both of our FYPs involve IoT devices, where the number of requests to the backend fluctuates and system availability is of paramount importance, a highly scalable serverless backend is greatly beneficial. It can also reduce maintenance cost, since server instances are created and set up automatically.
This section covers the process of developing web APIs as lambda functions with OpenFaaS from a backend developer's standpoint. We will explain the technology stack used, our API design and database schema, and finally the development workflow.
For our project, API endpoints were developed for client apps or IoT devices to access or change records stored in a document-based database. In particular, four lambda functions were built to handle CRUD operations on the `user` document.
In contrast to AWS Lambda, which accepts and runs handler scripts directly, OpenFaaS packages scripts as Docker images for execution. With this approach, lambda function code has to be built into a Docker image, pushed to a container registry, and finally pulled and run by worker pods on the server. To facilitate this deployment flow, OpenFaaS provides a CLI tool, `faas-cli`, for developers to execute each action from their machine.
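For instance, the three steps map directly onto `faas-cli` subcommands (the stack file name below follows the `stack_dev.yml` used later in this section):

```sh
# Build the function images described in the stack file
faas-cli build -f stack_dev.yml
# Push them to the container registry (Docker Hub in our case)
faas-cli push -f stack_dev.yml
# Deploy the pushed images onto the OpenFaaS instance
faas-cli deploy -f stack_dev.yml

# Or run all three steps at once
faas-cli up -f stack_dev.yml
```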
As shown in the diagram above, we used NodeJS to write the handler functions and Docker Hub to host the built images.
Considering that API and database schema design is not a focus of this course, we will not explain our design in detail; the information here is only to help you follow our code more easily.
As mentioned above, four endpoints are available for executing CRUD operations on `user` records. They accept requests and respond with a `success` boolean flag and the updated record content in JSON format, and use appropriate HTTP status codes to denote the result of execution.
Please refer to the API documentation here for endpoint URLs, sample requests and responses.
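For illustration only (the authoritative shapes are in the API documentation), a successful response might look like:

```json
{
  "success": true,
  "user": {
    "email": "test@test.com",
    "profile": { "firstName": "Hello", "lastName": "World" }
  }
}
```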
For data storage, as data in IoT systems are often hierarchical and related to a single user account, we used MongoDB, a document-based database, to store our data. The schema of a `user` object is as follows:
```json
{
  "user": {
    "email": "test@test.com",
    "devices": [],
    "groups": [],
    "locations": [],
    "passwordHash": "123456",
    "profile": {
      "firstName": "Hello",
      "lastName": "World"
    }
  }
}
```
Refer to the `lambda` folder in our code repo; please read `lambda/README.md` for a description of the directories and files.
This part describes the steps to replicate and test our work on your local machine.
Here, we expect that Docker Desktop is installed and logged in with a Docker Hub account, so that we can build Docker images and upload them to the Docker Hub repository.
Please also install Minikube, a tool for running Kubernetes locally, and k3sup, a tool for quick installation of Helm charts onto the local cluster.
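On macOS, installing these might look like the following (these are the tools' standard installation methods, not part of our scripts):

```sh
# Minikube via Homebrew
brew install minikube
# k3sup via its installer script
curl -sLS https://get.k3sup.dev | sh
sudo install k3sup /usr/local/bin/
```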
Refer to the bootstrap script (`lambda/bootstrap_dev_env`).
For easier development, we set up a local single-node Kubernetes cluster providing OpenFaaS and MongoDB services. Ports are forwarded to localhost for debugging purposes.
Our bootstrap script is written to be run on macOS, with minikube and k3sup installed. It might also work on other OSes such as Linux, as long as they are supported by the aforementioned tools.
After running the script, the `faas-cli` tool will be installed and authenticated for controlling the OpenFaaS instance at `127.0.0.1:8080`. The MongoDB instance can be accessed at `127.0.0.1:27017` with the root account `root:root`.
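The port forwarding the script sets up is roughly equivalent to the following (the namespace and service names are assumptions based on the default OpenFaaS and MongoDB charts):

```sh
# OpenFaaS gateway -> localhost:8080
kubectl port-forward -n openfaas svc/gateway 8080:8080 &
# MongoDB -> localhost:27017
kubectl port-forward -n mongodb svc/mongodb 27017:27017 &
```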
Please create an `iot_backend` database with a `users` collection, as well as a `lambda:lambda` account with `readWrite` access to that database. If other values are used, please update the `mongo_url` and `mongo_dbname` configurations in `stack_dev.yml`.
To do so from the command line:

```sh
# Open a shell on the MongoDB instance
kubectl exec -it -n mongodb svc/mongodb -- bash
# Execute the MongoDB CLI client
mongo
# Log in and create the new user
use admin
db.auth('root', 'root')
db.createUser({user:"lambda", pwd:"lambda", roles:[{role: "readWrite", db: "iot_backend"}]})
# Create the new database and collection
use iot_backend
db.createCollection("users")
```
Refer to `handler.js` in the four function folders.
To develop lambda functions with NodeJS, we make use of the community-maintained `node12` template, which provides an `event` object containing data of the incoming request and a `context` object for receiving the response to be sent back. The usage of the template is documented here.
To create a new function:

```sh
faas-cli new demo --lang node12 -a stack_dev.yml
```
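A read handler in this template might look like the sketch below. This is an illustration, not our exact code: the `mongo_url` / `mongo_dbname` environment variables match the configuration names in `stack_dev.yml`, while the query and response shape are assumptions.

```js
"use strict"

const { MongoClient } = require("mongodb")

let client // cached across invocations while the pod stays warm

module.exports = async (event, context) => {
  try {
    if (!client) {
      client = await MongoClient.connect(process.env.mongo_url)
    }
    const db = client.db(process.env.mongo_dbname)

    // event.body may already be parsed for JSON requests in this template
    const body = typeof event.body === "string" ? JSON.parse(event.body) : event.body
    const doc = await db.collection("users").findOne({ "user.email": body.email })

    if (!doc) {
      // respond via .succeed() even on errors, since .fail() overrides
      // the headers and content type we set (see the difficulties section below)
      return context.status(404).succeed({ success: false })
    }
    return context.status(200).succeed({ success: true, user: doc.user })
  } catch (err) {
    return context.status(500).succeed({ success: false, error: err.message })
  }
}
```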
Refer to the deployment script (`lambda/deploy.sh`).
Once the handler script is written, we build, push and deploy the Docker image with the `faas-cli` tool. We have created a shortcut script for easier re-deployment during debugging:
```sh
./deploy.sh dev
```
Before the Docker image can be uploaded, you need to update the `image` fields in `stack_dev.yml` so that your Docker Hub username is used instead of ours.
As OpenFaaS may not pull the `latest` tag every time, we need to specify a version number tag and increment it whenever changes are made, so that the new changes are always applied.
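A function entry in `stack_dev.yml` then looks roughly like this (the function name, image and connection string are placeholders following the values used above, not our exact file; adjust to however your stack file defines them):

```yaml
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080
functions:
  user-read:
    lang: node12
    handler: ./user-read
    # bump the tag on every change so workers pull the new image
    image: <your-dockerhub-username>/user-read:0.0.2
    environment:
      # authSource=admin because the lambda account was created in admin
      mongo_url: mongodb://lambda:lambda@mongodb.mongodb:27017/iot_backend?authSource=admin
      mongo_dbname: iot_backend
```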
After deployment, API endpoints will be available at `http://127.0.0.1:8080/function/{function name}`. Please refer to our API documentation for details if you wish to test them.
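For example, a quick smoke test with curl might look like this (the function name and payload are placeholders; see the API documentation for the real endpoints):

```sh
curl -s -X POST http://127.0.0.1:8080/function/user-read \
  -H "Content-Type: application/json" \
  -d '{"email": "test@test.com"}'
```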
Similarly, after we hosted a production environment as described in the next section, we created a `stack_prod.yml` based on the previous `stack_dev.yml`, with modifications such as updated OpenFaaS and MongoDB endpoints, as well as scaling configurations.
After logging into the production environment with `faas-cli login`, we can use the deployment script again:
```sh
./deploy.sh prod
```
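For reference, logging in looks roughly like the following (the gateway URL is a placeholder for your production endpoint):

```sh
echo $OPENFAAS_PASSWORD | faas-cli login \
  --gateway http://<production-gateway>:8080 \
  --username admin --password-stdin
```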
To host a production instance of OpenFaaS and MongoDB, we built a scalable cluster with Kubernetes (k8s). Several open-source tools were used, which are explained in detail in the sections below.
This is a network interface plug-in for the Kubernetes cluster that provides container networking, enabling containers to talk to each other. The implementation details are here.
This is an automatic persistent volume provisioner in the cluster, which provides permanent storage to containers that need it, such as database services. The implementation details are in our GitHub Page.
This is a "pod load balancer". It will automatically delete the pods in "heavy nodes" which has high CPU usage or contains a lot of pods in that node, and the kubernetes cluster itself will recreate the deleted pods under this case. The implementation details are in our GitHub-Page.
We used MongoDB as our database in this cluster. It natively provides high availability through MongoDB replica sets: when one MongoDB container goes down, another can take over its job, and additional MongoDB instances help load-balance read-only requests. The implementation details are in our GitHub Page.
Since we are building a k8s cluster directly on EC2 instances, this script helps to automatically create/delete nodes both in the k8s cluster and in EC2, according to the average CPU utilization of all existing worker nodes. The script can be found in our GitHub Page.
The most important part of a serverless system is auto scaling on demand. In our Kubernetes cluster, two components need to be scaled: (1) pods and (2) worker nodes. OpenFaaS already supports scaling pods according to the number of requests to a function. For worker nodes, we implemented a Python script that automatically starts EC2 instances via boto3; each newly started instance automatically joins the existing k8s cluster. We currently use the average CPU utilization of all nodes to measure workload: for example, with a threshold of 65, a node is created and joins the cluster whenever the average CPU utilization exceeds 65%. Similarly, to scale down, when the average CPU utilization falls below 20% we delete only those nodes that were created by the script, so that a minimal number of nodes is maintained even when the average CPU utilization is low.
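The decision logic of such a script might look like the sketch below. This is an illustration only: the AMI ID, instance type, tags and the cluster-join user data are placeholders, not our exact script; only the 65/20 thresholds follow the values above.

```python
# Hypothetical auto-scaler sketch using boto3; everything marked as a
# placeholder is an assumption, not our production configuration.
from datetime import datetime, timedelta

import boto3

ec2 = boto3.client("ec2")
cloudwatch = boto3.client("cloudwatch")

SCALE_UP_THRESHOLD = 65    # add a node above this average CPU %
SCALE_DOWN_THRESHOLD = 20  # remove a script-created node below this


def average_cpu(instance_ids):
    """Average CPUUtilization over the last 5 minutes across all nodes."""
    now = datetime.utcnow()
    samples = []
    for iid in instance_ids:
        stats = cloudwatch.get_metric_statistics(
            Namespace="AWS/EC2",
            MetricName="CPUUtilization",
            Dimensions=[{"Name": "InstanceId", "Value": iid}],
            StartTime=now - timedelta(minutes=5),
            EndTime=now,
            Period=300,
            Statistics=["Average"],
        )
        samples += [p["Average"] for p in stats["Datapoints"]]
    return sum(samples) / len(samples) if samples else 0.0


def scale_up():
    # the user data would run the k8s join command on first boot
    ec2.run_instances(
        ImageId="ami-xxxxxxxx",   # placeholder AMI with k8s preinstalled
        InstanceType="t3.medium", # placeholder instance type
        MinCount=1,
        MaxCount=1,
        UserData="#!/bin/bash\n<k8s join command here>",
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "created-by", "Value": "autoscaler"}],
        }],
    )


def scale_down(script_created_ids):
    # only terminate nodes the script itself created, one at a time
    if script_created_ids:
        ec2.terminate_instances(InstanceIds=[script_created_ids[0]])
```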
Our experience with OpenFaaS was not smooth. Although the `faas-cli` tool simplifies a lot of deployment work from a developer's standpoint, it has quite a steep learning curve due to unorganized documentation and CLI-centric tooling. For example, the Docker-based deployment flow is not explained clearly on their site, which makes it hard for developers with experience on other serverless platforms to switch over. In comparison, AWS Lambda provides a fully-fledged GUI web panel for management, as well as good documentation and sample projects to guide developers.
Moreover, the community-maintained templates are quite restrictive and sometimes behave unexpectedly. In particular, we had to use the `.succeed()` block to send error responses, because the `.fail()` block overrides the HTTP headers and content type we specified. The template is poorly documented, which increased the difficulty of solving such issues and made development even harder.
These complaints aside, the flexibility provided by the Docker-based design is tremendous: the ecosystem can support lambda functions written in virtually any language, as long as a template is developed by the community. While this approach complicates the deployment process and introduces overhead at both build time and runtime, these compromises could be good tradeoffs for a versatile serverless execution framework.
There are many concepts in k8s, such as pods, replica sets, services and deployments, that need to be clearly understood. We set up the Kubernetes cluster from scratch, which is the most complicated way. However, there exist tools such as kops which can help to set up the cluster on EC2, or, most simply, one can directly use a managed cluster provided by a cloud provider, such as AWS EKS.
First and foremost, with the lack of proper user authentication and access control mechanisms, our API is not production-ready at all. For authentication, we could make use of the API Gateway module of OpenFaaS, which provides OAuth2 support. For access control, middleware could be added before the execution of lambda functions to check whether a request to a data record comes from its owner or other approved users.
Secondly, our API endpoints do not follow RESTful guidelines and may be hard to work with. To improve the experience, we could use a proxy to map GET, POST, PUT and DELETE requests to the respective CRUD lambda functions, and accept a user ID selector in the request URL, instead of using the default endpoints provided by OpenFaaS.
Lastly, more lambda functions could be implemented to further ease the development of client applications, such as endpoints for updating the records of a particular IoT device. In addition, database triggers could be implemented so that lambda functions are invoked in reaction to data changes instantly, to provide features such as alerts or smart-home behaviors.
To demo our serverless IoT backend system, we created a scalable cluster on AWS EC2 instances and deployed our lambda functions to the OpenFaaS instance hosted on that cluster. The functions are configured to be scaled automatically by OpenFaaS. API requests were made to verify that our lambda functions execute correctly.
The picture shows all the pods running to support the four functions.