FaaSr logo

What is FaaSr?

FaaSr is a package that makes it easy for developers to create R functions and workflows that can run in the cloud, on-demand, based on triggers - such as timers, or repository commits. It is built for Function-as-a-Service (FaaS) cloud computing, and supports both widely-used commercial (GitHub Actions, AWS Lambda, IBM Cloud) and open-source platforms (OpenWhisk). It is also built for cloud storage, and supports the S3 standard also widely used in commercial (AWS S3), open-source (Minio) and research platforms (Open Storage Network)

With FaaSr, you can focus on developing the R functions, and leave dealing with the idiosyncrasies of different FaaS platforms and their APIs to the FaaSr package.

Introduction to FaaSr

What can I use FaaSr for?

Originally developed to support event-driven, on-demand automated execution workflows for forecasting, FaaSr can be also useful in applications such as automated data quality assurance/control, and in general for applications where:

  • You want to execute R applications in the cloud but don’t want to manage servers
  • Your application is triggered by events, such as a timer or a repository push
  • You want to develop your application once and be portable across multiple FaaS platforms

How do I use FaaSr?

In the typical case, to use FaaSr you:

  • Develop one or more R functions
  • Complement your functions with FaaSr calls to upload/download file inputs from an S3 cloud storage
  • Create a configuration for your workflow, describing the order in which your functions execute, their inputs and outputs, and credentials for the cloud computing and data services to use
  • Register your workflow with your cloud computing provider
  • Register triggers to start your workflow
  • Access data and logs from your deployed workflows in your cloud storage provider

What are the pre-requisites to use FaaSr?

To use FaaSr, you need:

  • A GitHub account and repository(ies) hosting your R functions
  • A cloud computing account with a supported FaaS provider (e.g. GitHub Actions, IBM Cloud OpenWhisk, or AWS Lambda)
  • A cloud storage account supporting S3 buckets (e.g. AWS S3, Open Storage Network, or a Minio service)

In a nutshell, how does FaaSr work?

FaaSr uses Docker and cloud computing/storage APIs (Application Programming Interfaces) “under the hood” to deploy your functions:

  • Each function in your workflow runs in a Docker container in the cloud, which is deployed by your FaaS provider of choice
  • The data to be used by your functions is stored persistently in a cloud S3 “bucket”
  • A Docker container’s execution is “ephemeral” - i.e. no data persists after the function ends. FaaSr implements functions to download/upload data to/from a Docker container while it executes
  • The FaaSr package takes care of using cloud provider specific APIs to 1) handle the passing of arguments to your function, 2) handle data transfers from/to S3 buckets, and 3) trigger the execution of “downstream” function(s) in your workflow

In other words, FaaSr deals with the specifics/complexities of multiple cloud APIs and exposes to you a simple interface that is consistent across providers.