giusdp's blog

Enabling a FaaS platform to run functions on external servers

Oct 20, 2024 - 5 minute read

FaaS Functions

In FaaS platforms, functions are the core unit of execution. You write a snippet of code, upload it to the platform, and then some event triggers the execution of that code.

In the case of OpenWhisk, the platform I work on/with, a function can be a simple JavaScript snippet like this:

function main(args) {
  return { message: 'Hello, World!' };
}

Anatomy of an Invocation in OpenWhisk

With a very simplified view, OpenWhisk has two core components: the Controller and the Invoker. The Controller exposes the REST API to the user, manages the platform's entities and sends invocation requests to the Invoker (via Kafka). The Invoker receives these requests and interacts with the underlying container runtime to spawn a container, initialise it and execute the function.

The flow is fairly simple: Trigger Event -> Controller -> Invoker -> Container.

Let’s focus for a moment on the container part. In FaaS platforms that use containers as the execution environment, the container will have a “runtime” that needs to be initialised and that depends on the function code’s programming language. For example, for the above-mentioned JavaScript function, the runtime will probably use an installed Node.js to execute the code.

In OpenWhisk, a runtime is effectively a web server with two endpoints: /init and /run. The Invoker spawns a container running that server and then sends a POST request to the /init endpoint with the function code in the body. The runtime performs some initialisation (e.g. spawning a process ready to run the code) and then waits for input.
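Concretely, the /init body is a JSON document carrying the function source. Here is a rough sketch of what the Invoker might send for the JavaScript function above (field names follow the OpenWhisk action interface as I understand it; the exact set of fields can vary, and the action name is illustrative):

```javascript
// Sketch of an /init payload. "main" names the entry point, "code"
// carries the source as a string, and "binary" flags a base64-encoded
// archive instead of plain source.
const code = [
  "function main(args) {",
  "  return { message: 'Hello, World!' };",
  "}"
].join('\n');

const initPayload = {
  value: {
    name: 'hello',   // action name (illustrative)
    main: 'main',    // function to call inside the code
    code: code,      // the uploaded snippet
    binary: false    // plain source, not an archive
  }
};

// The Invoker POSTs this as JSON to the container's /init endpoint.
```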

When an invocation finally arrives, the Invoker sends another POST request, this time to the /run endpoint, with the input data (if any) in the body. The runtime then executes the code with the corresponding system (Node.js, Python, etc.) and returns the result.
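This init-then-run contract can be sketched in a few lines of JavaScript. This is not the real OpenWhisk runtime (which wraps the same idea in an HTTP server with proper sandboxing), just the core state machine: /init compiles the code once, /run calls the entry point with each invocation's arguments.

```javascript
// Toy sketch of a runtime's two-endpoint contract (no HTTP layer,
// no sandboxing — just the init-then-run state machine).
let entryPoint = null; // set once by init, reused by every run

// What the /init handler does: compile the source and keep the
// named entry point around for later invocations.
function init(payload) {
  const { code, main } = payload.value;
  // Evaluate the source in a function scope and pull out `main`.
  // A real runtime would sandbox this instead of eval-style tricks.
  entryPoint = new Function(`${code}; return ${main};`)();
  return { ok: true };
}

// What the /run handler does: call the initialised function with
// the invocation's input arguments and return its result.
function run(payload) {
  if (!entryPoint) throw new Error('runtime not initialised');
  return entryPoint(payload.value);
}

// Usage: the Invoker first hits /init, then /run for each invocation.
init({ value: { main: 'main', code:
  "function main(args) { return { message: 'Hello, ' + (args.name || 'World') + '!' }; }" } });
const result = run({ value: { name: 'OpenWhisk' } });
console.log(result.message); // → Hello, OpenWhisk!
```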

Conceptually it is simple, but it’s also powerful for scaling up and down. If there are a lot of invocations, the Invoker(s) can just spawn more and more containers to handle the load. Otherwise, they start removing containers that are not being used, all the way down to zero.

External Runtimes

It would be cool to be able to run functions on other servers, outside OpenWhisk itself. For example, if we had an existing server with a GPU, we could take advantage of it to run functions that use some LLM. From the outside, we just need to write the code that, given some input, interacts with the model to produce an answer, and upload it to OpenWhisk. OpenWhisk should then take care of sending the function to that external server to use the GPU.

This is a project I’ve actually been working on in my part-time job. We can extract the runtime functionality, enhance it and run it directly on the external server. Since this runtime system is not exactly the same as the standard one, we dubbed it “Server Mode”. From inside the platform, the Invoker can now deploy containers with the runtime in “Client Mode”, which acts as a forward proxy.
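In Client Mode the container no longer executes anything itself; it just re-targets each incoming /init or /run request at the external server. A rough sketch of that forwarding step (the server URL is an illustrative placeholder, not the project's actual configuration):

```javascript
// Sketch of the Client Mode forwarding logic: take a request the
// Invoker made to this container and re-target it at the external
// Server Mode runtime. SERVER_URL is an illustrative placeholder.
const SERVER_URL = 'http://gpu-server.example.com:8080';

function forwardRequest(path, body) {
  // path is '/init' or '/run'; body is the JSON the Invoker sent.
  // The proxy keeps the payload intact and only changes the target.
  return {
    url: SERVER_URL + path,
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body)
  };
}

// Usage: with global fetch (Node 18+) the proxy would do
//   const { url, ...opts } = forwardRequest('/run', payload);
//   const response = await fetch(url, opts);
// and relay the response back to the Invoker.
const req = forwardRequest('/run', { value: { name: 'World' } });
```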

The function invocation flow has only one extra step: Trigger Event -> Controller -> Invoker -> Client Mode Runtime Container -> Server Mode Runtime on the external server.

Some Current Limitations

This allows us to push functions outside the platform and take advantage of external resources. Well, it’s not so simple. Having divided the standard runtime into two parts, we now have to handle cleanup. When OpenWhisk decides to start scaling down, all it has to do is remove containers. In this scenario, though, OpenWhisk is responsible only for the Client Mode runtime; nobody would then clean up the Server Mode runtime on the external server.

One way to add runtime cleanup would be a combination of an eviction policy and a new endpoint (e.g. /stop). The Client Mode runtime can call this endpoint when it’s time to be removed, and if a function is left hanging on the external server anyway, the eviction policy will take care of it.
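That combination could look roughly like this on the Server Mode side (the /stop endpoint name and the timeout value are this post's proposal, not an existing OpenWhisk API): the runtime records when it was last used, tears down on an explicit stop, and a periodic eviction check catches the case where the stop call never arrived.

```javascript
// Sketch of cleanup on the Server Mode side: an explicit /stop
// handler plus a time-based eviction policy as a safety net.
// The endpoint name and timeout are proposals, not an existing API.
const IDLE_TIMEOUT_MS = 10 * 60 * 1000; // evict after 10 idle minutes

let state = { initialised: false, lastUsedAt: 0 };

function markUsed(now) {  // called on every /init and /run
  state.initialised = true;
  state.lastUsedAt = now;
}

function stop() {         // what the /stop handler would do
  state = { initialised: false, lastUsedAt: 0 };
}

// Eviction policy: if the Client Mode runtime died before calling
// /stop, a periodic check reclaims the hanging function anyway.
function evictIfIdle(now) {
  if (state.initialised && now - state.lastUsedAt > IDLE_TIMEOUT_MS) {
    stop();
    return true; // evicted
  }
  return false;
}

// Usage sketch with fake clock values instead of Date.now():
markUsed(0);
const evictedEarly = evictIfIdle(5000);               // still fresh
const evictedLate = evictIfIdle(IDLE_TIMEOUT_MS + 1); // idle too long
```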

Another aspect to consider is scalability. OpenWhisk Invokers can just spawn more containers to handle the load (so more and more Client Mode runtimes), but there is only one Server Mode runtime per external server. One possibility is to transform it into a multi-function runtime, so it can be initialised multiple times given proper identification of the functions (cleanup would then mean removing a single function, not the entire runtime).
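Such a multi-function runtime could keep a map from a function identifier to its initialised entry point, so init, run and cleanup all operate per function. A toy sketch (the identifiers and API shape are assumptions of this post, not implemented OpenWhisk behaviour):

```javascript
// Toy multi-function registry: each function is initialised under
// its own id, invoked by id, and cleaned up individually.
const functions = new Map(); // id -> compiled entry point

function initFunction(id, code, main = 'main') {
  // As before, a real runtime would sandbox this evaluation.
  functions.set(id, new Function(`${code}; return ${main};`)());
}

function runFunction(id, args) {
  const fn = functions.get(id);
  if (!fn) throw new Error(`function ${id} not initialised`);
  return fn(args);
}

// Cleanup now removes one function, not the whole runtime.
function removeFunction(id) {
  return functions.delete(id);
}

// Usage: two functions coexist in one Server Mode runtime.
initFunction('hello',
  "function main(args) { return { message: 'Hello, ' + args.name }; }");
initFunction('double',
  "function main(args) { return { result: args.x * 2 }; }");
const a = runFunction('hello', { name: 'World' }); // { message: 'Hello, World' }
const b = runFunction('double', { x: 21 });        // { result: 42 }
removeFunction('hello'); // 'double' keeps working
```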

Well, all of this is still a work in progress. Another approach I want to experiment with is to create a Load Balancer service for the Client Mode runtimes to talk to. The Load Balancer would take care of distributing the load across the Server Mode runtimes, and perhaps handle scaling and cleanup too.

Conclusion

It is a lot of fun enabling OpenWhisk to run functions on external servers with GPUs, and I think we have something innovative here. It makes it really easy to write Python code that uses AI models and just press send; behind the scenes, the platform can invoke the function on a machine with a GPU rented for cheap anywhere. I’m excited to see where this project will go and how it will evolve.