# BanyanONNXRunTime.jl
Wish you could combine your awesome PyTorch/TensorFlow models with the awesome Julia language for some awesome data processing/analytics? Now you can! BanyanONNXRunTime.jl is a drop-in replacement for ONNXRunTime.jl, which is a Julia API for Microsoft's ONNX Runtime.
## Getting Started
To get started with BanyanONNXRunTime.jl, follow the steps here to set up Banyan.jl.

Then, open the Julia REPL, press `]` to enter Pkg (package) mode, and run `add BanyanONNXRunTime`. (Ensure you have also added `Banyan` and `BanyanArrays` first.) Finally, exit package mode and start a session:
```julia
using Banyan, BanyanArrays, BanyanONNXRunTime

start_session(
    cluster_name="sales",
    nworkers=128,
    session_name="Weekly-Sales-Report",
    email_when_ready=true
)
```
Awesome! You can now use the functions described below for massively parallel data processing in this session with 128 workers.
## Compiling to ONNX
ONNX is the standard cross-platform format for machine learning models. Follow the steps in the official PyTorch and TensorFlow documentation to export your models to ONNX. Once you have a `.onnx` file, you may proceed!
## Reading in ONNX

### API
Use `load_inference` to read in a `.onnx` file. The file may be located either at an Internet-hosted location, with a path beginning with `http://` or `https://`, or at an S3 location, with a path beginning with `s3://`. Specify `dynamic_axis=true` if the model has a dynamic axis.
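As a sketch, the two supported path forms look like this (the URLs and bucket name below are hypothetical, and a session must already be started):

```julia
using Banyan, BanyanArrays, BanyanONNXRunTime

# Load a model hosted on the Internet (hypothetical URL)
model = load_inference("https://example.com/models/my_model.onnx")

# Load a model from S3 (hypothetical bucket; the cluster must have been
# created with access to this bucket)
model = load_inference("s3://my-bucket/models/my_model.onnx", dynamic_axis=true)
```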
### Notes
Please see Using Amazon S3 for instructions on setting up Amazon S3. In order to load an ONNX model, your cluster must have been created with access to the S3 bucket that contains the ONNX model you are working with.
If you want to load an ONNX model in an S3 bucket that your cluster was not created with access to, you may need to destroy your cluster and create a new cluster with access to the desired S3 bucket.
When loading an ONNX model, a sample must be collected (the whole model is downloaded for an exact sample). Find out how to collect a sample faster and how to preserve cached samples after writing.
## Preprocessing

To first do some preprocessing, check out `mapslices` and other related functions from BanyanArrays.jl.
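As a minimal sketch: BanyanArrays.jl mirrors the Base Julia `mapslices` API, so preprocessing such as per-image normalization might look like the following (the array shape and the normalization here are illustrative assumptions, not part of the library's documented examples):

```julia
using Banyan, BanyanArrays, Statistics

# Hypothetical 4-D batch of images: (batch, channel, height, width)
data = BanyanArrays.ones(Float32, 100, 3, 64, 64)

# Normalize each image (each slice along the first dimension) to zero mean
normalized = mapslices(
    img -> img .- mean(img),
    data;
    dims=(2, 3, 4)
)
```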
## Running a Model

### API
To get an `InferenceSession`, use `load_inference`. To run the `InferenceSession`, simply call it on a dictionary mapping a single input name to a Banyan array with the appropriate dimensions. The output is a dictionary mapping a single output name to a Banyan array.
Currently, we only support models that accept a single array as input and return a single array as output. We also expect that the model will be called on many slices of the array along the first dimension. If these constraints don't fit your use case, please send us an email at support@banyancomputing.com, contact us on the Banyan Users Slack, or create a GitHub issue so that we can meet your needs as soon as possible.
### Example
```julia
using Banyan, BanyanArrays, BanyanImages, BanyanONNXRunTime, IterTools

# Get model path
model_path = "https://github.com/banyan-team/banyan-julia/raw/cailinw/onnx-stress/BanyanONNXRunTime/test/res/image_compression_model.onnx"

# Load model
model = BanyanONNXRunTime.load_inference(model_path, dynamic_axis=true)

# Load data: 100 JPEG image URLs generated from a 10x10 grid of tile indices
files = (
    IterTools.product(1:10, 1:10),
    (i, j) -> "https://gibs.earthdata.nasa.gov/wmts/epsg4326/best/MODIS_Terra_CorrectedReflectance_TrueColor/default/2012-07-09/250m/6/$i/$j.jpg"
)
data = BanyanImages.read_jpg(files; add_channelview=true)  # `add_channelview` adds a dimension for the RGB channels

# Call model on data
res = model(Dict("input" => data))["output"]
```
See the notebook for a full example of PyTorch-based satellite image decoding with BanyanImages.jl and BanyanONNXRunTime.jl.

Note that you must `compute` or `write` a result for computation to happen.
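For instance, continuing from a model call that produced a result `res`, triggering computation might look like this (a sketch; `res` here is assumed to be a Banyan array produced by calling a model):

```julia
# Banyan evaluates lazily; nothing runs until a result is computed or written
res_local = compute(res)
```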