These are the FaaSr functions that users invoke interactively in their desktop client, e.g. RStudio. They are used to configure, register, and invoke workflows.
Usage: faasr(json_path, env_path)
json_path is a required argument and specifies the path of a FaaSr workflow configuration file. env_path is optional and specifies the path of a file containing credentials for use by FaaSr.
This is the first function used to validate and configure a workflow. It loads the JSON-formatted workflow configuration file from the path specified by json_path and (optionally) credentials from the file at the path specified by env_path. It returns a list that is subsequently used to register and invoke the workflow with the functions list$register_workflow and list$invoke_workflow.
Example:
mylist <- faasr(json_path="workflow_file.json", env_path="credentials_file")
In the example above, the faasr function configures a workflow described in the workflow_file.json file and loads FaaS and S3 credentials from credentials_file. If successful, the function returns the mylist list.
Usage: $register_workflow()
This function registers a workflow with FaaS cloud provider(s). It requires a valid list returned from a previous execution of faasr(). The cloud providers are specified in ComputeServers and DataStores of the JSON file provided as input for faasr(). Their respective credentials are specified in the credentials file also provided as input for faasr().
Example:
mylist <- faasr(json_path="workflow_file.json", env_path="credentials_file")
mylist$register_workflow()
In the example above, the mylist list returned from the faasr() function is used to register a workflow.
Usage: $invoke_workflow()
This function immediately invokes a workflow that has been registered with a FaaS cloud provider. It requires a valid list returned from a previous execution of faasr() that has been registered with cloud providers using $register_workflow().
Example:
mylist <- faasr(json_path="workflow_file.json", env_path="credentials_file")
mylist$register_workflow()
mylist$invoke_workflow()
Usage: $set_workflow_timer(cron)
This function sets up a timer to invoke a workflow that has been registered with a FaaS cloud provider. It requires a valid list returned from a previous execution of faasr() that has been registered with cloud providers using $register_workflow(). The required argument cron is a string that follows the cron format.
Example:
mylist <- faasr(json_path="workflow_file.json", env_path="credentials_file")
mylist$register_workflow()
mylist$set_workflow_timer("*/10 * * * *")
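In the example above, the "*/10 * * * *" cron string configures the timer to invoke the workflow every 10 minutes.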
Usage: $unset_workflow_timer()
This function unsets a timer previously set with $set_workflow_timer. It requires a valid list returned from a previous execution of faasr() that has been registered with cloud providers using $register_workflow().
Example:
mylist <- faasr(json_path="workflow_file.json", env_path="credentials_file")
mylist$register_workflow()
mylist$set_workflow_timer("*/10 * * * *")
mylist$unset_workflow_timer()
These are the FaaSr functions that users add to their R functions. They are used to read and write from/to S3 buckets and to generate logs for debugging.
Usage:
faasr_get_file(server_name, remote_folder, remote_file, local_folder, local_file)
This function gets (i.e. downloads) a file from an S3 bucket to be used by the FaaSr function.
server_name is a string with the name of the S3 bucket to use; it must match a name declared in the workflow configuration JSON file. This is an optional argument; if not provided, the default S3 server specified as DefaultDataStore in the workflow configuration JSON file is used.
remote_folder is a string with the name of the remote folder that the file is to be downloaded from. This is an optional argument that defaults to "".
remote_file is a string with the name of the file to be downloaded from the S3 bucket. This is a required argument.
local_folder is a string with the name of the local folder where the downloaded file is to be stored. This is an optional argument that defaults to ".".
local_file is a string with the name for the file downloaded from the S3 bucket. This is a required argument.
Examples:
faasr_get_file(remote_folder="myfolder", remote_file="myinput1.csv", local_file="input1.csv")
faasr_get_file(server_name="My_Minio_Bucket", remote_file="myinput2.csv", local_file="input2.csv")
Usage:
faasr_put_file(server_name, local_folder, local_file, remote_folder, remote_file)
This function puts (i.e. uploads) a file from the local FaaSr function to an S3 bucket.
server_name is a string with the name of the S3 bucket to use; it must match a name declared in the workflow configuration JSON file. This is an optional argument; if not provided, the default S3 server specified as DefaultDataStore in the workflow configuration JSON file is used.
local_folder is a string with the name of the local folder where the file to be uploaded is stored. This is an optional argument that defaults to ".".
local_file is a string with the name of the local file to be uploaded to the S3 bucket. This is a required argument.
remote_folder is a string with the name of the remote folder that the file is to be uploaded to. This is an optional argument that defaults to "".
remote_file is a string with the name for the uploaded file in the S3 bucket. This is a required argument.
Examples:
faasr_put_file(local_file="output.csv", remote_folder="myfolder", remote_file="myoutput.csv")
faasr_put_file(server_name="My_Minio_Bucket", local_file="output.csv", remote_file="myoutput.csv")
Usage:
folderlist <- faasr_get_folder_list(server_name, faasr_prefix)
This function returns a list with the contents of a folder in the S3 bucket.
server_name is a string with the name of the S3 bucket to use; it must match a name declared in the workflow configuration JSON file. This is an optional argument; if not provided, the default S3 server specified as DefaultDataStore in the workflow configuration JSON file is used.
faasr_prefix is a string with the prefix of the folder in the S3 bucket. This is an optional argument that defaults to "".
Examples:
mylist1 <- faasr_get_folder_list(server_name="My_Minio_Bucket", faasr_prefix="myfolder")
mylist2 <- faasr_get_folder_list(server_name="My_Minio_Bucket", faasr_prefix="myfolder/mysubfolder")
Usage:
faasr_delete_file(server_name, remote_folder, remote_file)
This function deletes a file from the S3 bucket.
server_name is a string with the name of the S3 bucket to use; it must match a name declared in the workflow configuration JSON file. This is an optional argument; if not provided, the default S3 server specified as DefaultDataStore in the workflow configuration JSON file is used.
remote_folder is a string with the name of the remote folder that the file is to be deleted from. This is an optional argument that defaults to "".
remote_file is a string with the name of the file to be deleted from the S3 bucket. This is a required argument.
Examples:
faasr_delete_file(remote_folder="myfolder", remote_file="myoutput.csv")
faasr_delete_file(server_name="My_Minio_Bucket", remote_file="myoutput.csv")
Usage:
faasr_arrow_s3_bucket(server_name, faasr_prefix)
This function configures an S3 bucket to use with Apache Arrow.
server_name is a string with the name of the S3 bucket to use; it must match a name declared in the workflow configuration JSON file. This is an optional argument; if not provided, the default S3 server specified as DefaultDataStore in the workflow configuration JSON file is used.
faasr_prefix is a string with the prefix of the folder in the S3 bucket. This is an optional argument that defaults to "".
It returns a list that is subsequently used with the Arrow package.
Examples:
mys3 <- faasr_arrow_s3_bucket()
myothers3 <- faasr_arrow_s3_bucket(server_name="My_Minio_Bucket", faasr_prefix="myfolder")
frame_input1 <- arrow::read_csv_arrow(mys3$path(file.path(folder, input1)))
frame_input2 <- arrow::read_csv_arrow(mys3$path(file.path(folder, input2)))
arrow::write_csv_arrow(frame_output, mys3$path(file.path(folder, output)))
Usage: faasr_log(log_message)
This function writes a log message to a file in the S3 bucket, to help with debugging. The default S3 server for logs is DefaultDataStore as specified in the workflow configuration JSON file. This default can be overridden with LoggingDataStore in the workflow configuration JSON file.
log_message is a string with the message to be logged.
Example:
log_msg <- paste0('Function compute_sum finished; output written to ', folder, '/', output, ' in default S3 bucket')
faasr_log(log_msg)
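Putting these functions together, the sketch below shows how they might be combined inside a user's R function. It is illustrative only: the function name compute_sum and the folder, input1, input2, and output arguments follow the examples above, and the element-wise sum is an assumed placeholder computation.
compute_sum <- function(folder, input1, input2, output) {
  # Download the two input files from the default S3 bucket (DefaultDataStore)
  faasr_get_file(remote_folder=folder, remote_file=input1, local_file="input1.csv")
  faasr_get_file(remote_folder=folder, remote_file=input2, local_file="input2.csv")

  # Assumed placeholder computation: element-wise sum of the two data frames
  frame_input1 <- read.csv("input1.csv")
  frame_input2 <- read.csv("input2.csv")
  frame_output <- frame_input1 + frame_input2
  write.csv(frame_output, "output.csv", row.names=FALSE)

  # Upload the result to the default S3 bucket and log a message for debugging
  faasr_put_file(local_folder=".", local_file="output.csv", remote_folder=folder, remote_file=output)
  log_msg <- paste0('Function compute_sum finished; output written to ', folder, '/', output, ' in default S3 bucket')
  faasr_log(log_msg)
}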
The workflow JSON configuration file is described by the FaaSr JSON schema.
A recommended way to create and manage JSON configuration files is to use the FaaSr workflow builder Shiny app.
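For illustration only, the sketch below builds a minimal configuration list in R and writes it out as JSON with the jsonlite package. It includes only the fields referenced in this document (ComputeServers, DataStores, DefaultDataStore, LoggingDataStore); the keys inside each server entry are assumptions, and the authoritative structure is defined by the FaaSr JSON schema.
# Minimal, illustrative workflow configuration (not a complete schema-valid file)
workflow <- list(
  ComputeServers = list(
    My_GitHub_Account = list(FaaSType = "GitHubActions")  # inner keys are assumptions
  ),
  DataStores = list(
    My_Minio_Bucket = list(Endpoint = "REPLACE_WITH_S3_ENDPOINT",  # inner keys are assumptions
                           Bucket = "REPLACE_WITH_BUCKET_NAME")
  ),
  DefaultDataStore = "My_Minio_Bucket",
  LoggingDataStore = "My_Minio_Bucket"
)
jsonlite::write_json(workflow, "workflow_file.json", auto_unbox = TRUE, pretty = TRUE)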
The credentials file used by FaaSr has key-value string pairs, stored one per line in a text file, with the format:
"key"="value"
The key is the name of the credential (which must match the name of the cloud server in the workflow configuration file) and the value is the credential itself.
Example:
"My_GitHub_Account_TOKEN"="REPLACE_WITH_YOUR_GITHUB_TOKEN"
"My_Minio_Bucket_ACCESS_KEY"="REPLACE_WITH_ACCESS_KEY"
"My_Minio_Bucket_SECRET_KEY"="REPLACE_WITH_SECRET_KEY"
"My_OW_Account_API_KEY"="REPLACE_WITH_YOUR_OPENWHISK_ID:SECRET_KEY"
"My_Lambda_Account_ACCESS_KEY"="REPLACE_WITH_YOUR_AWS_LAMBDA_ACCESS_KEY"
"My_Lambda_Account_SECRET_KEY"="REPLACE_WITH_YOUR_AWS_LAMBDA_SECRET_KEY"
The example shows credentials for the following accounts:
My_GitHub_Account
My_Minio_Bucket
My_OW_Account
My_Lambda_Account