

Wish you could combine your awesome PyTorch/TensorFlow models with the awesome Julia language for some awesome data processing/analytics? Now you can! BanyanONNXRunTime.jl is a drop-in replacement for ONNXRunTime.jl, a Julia API for Microsoft's ONNX Runtime.

Getting Started

To get started with BanyanONNXRunTime.jl, follow the steps here to set up Banyan.jl.

Then, open the Julia REPL and press ] to enter "Pkg" (package) mode and run add BanyanONNXRunTime. (Ensure you have also added Banyan and BanyanArrays first.)

Finally, exit the package mode and start a session.

using Banyan, BanyanArrays, BanyanONNXRunTime


Awesome! You can now use the functions described below for massively parallel data processing in this session of 128 workers.

Compiling to ONNX

ONNX is the standard cross-platform format for machine learning models. You can follow the steps below to convert your PyTorch and TensorFlow models to ONNX:

Once you have a .onnx file, you may proceed!

Reading in ONNX


Use load_inference to read in a .onnx file. It may either be located at an Internet-hosted location with a path beginning with http:// or https:// or an S3 location with a path beginning with s3://.

Specify dynamic_axis=true if the model has a dynamic axis.
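
As a quick illustration of the path rules above, here is a minimal sketch in plain Julia that classifies a model path by its prefix before it is handed to load_inference. The helper name `model_location_kind` and the example bucket and host names are hypothetical; they are not part of BanyanONNXRunTime.jl.

```julia
# Hypothetical helper (not part of BanyanONNXRunTime.jl): classify a model
# path by the prefixes the text above says load_inference accepts.
function model_location_kind(path::AbstractString)
    if startswith(path, "s3://")
        return :s3
    elseif startswith(path, "http://") || startswith(path, "https://")
        return :internet
    else
        return :unsupported   # e.g. a local file path
    end
end

model_location_kind("s3://my-bucket/model.onnx")        # :s3
model_location_kind("https://example.com/model.onnx")   # :internet
model_location_kind("models/model.onnx")                # :unsupported
```

A check like this can catch a local path early, before a session spends time trying to load it.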


Please see Using Amazon S3 for instructions on setting up Amazon S3. In order to load an ONNX model, your cluster must have been created with access to the S3 bucket that contains the model.

If you want to load an ONNX model in an S3 bucket that your cluster was not created with access to, you may need to destroy your cluster and create a new cluster with access to the desired S3 bucket.

When loading an ONNX model, a sample must be collected (the whole model is downloaded for an exact sample). Find out how to collect a sample faster and how to preserve cached samples after writing.


To first do some preprocessing, check out mapslices and other related functions from BanyanArrays.jl.

Running a Model


To get an InferenceSession, use load_inference. To run the InferenceSession, simply call it on a dictionary mapping a single input name to a Banyan array with the appropriate dimensions. The output is a dictionary mapping a single output name to a Banyan array.

Currently, we only support models that accept a single array as input and return a single array as output. We also expect that the model will be called on many slices of the array along the first dimension. If your model does not fit these constraints, please send us an email, contact us on the Banyan Users Slack, or create a GitHub issue so that we can meet your needs as soon as possible.


# Get model path
model_path = ""

# Load model
model = BanyanONNXRunTime.load_inference(model_path, dynamic_axis=true)

# Load data
using IterTools, BanyanImages
files = (  # 100 files in total
    IterTools.product(1:10, 1:10),
    (i, j) -> "$i/$j.jpg"
)
data = BanyanImages.read_jpg(files; add_channelview=true)  # Specify `add_channelview` to add a dimension for the RGB channels

# Call model on data
res = model(Dict("input" => data))["output"]
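
To see which paths an iterator plus a path function would describe, here is a small sketch in plain Julia that needs no cluster. It assumes the function is applied to each (i, j) pair from the iterator; how BanyanImages actually calls the function is not shown in this text. The names `index_pairs`, `to_path`, and `paths` are only for illustration.

```julia
# The same two pieces as a `files` tuple: an iterator and a path function.
# Assumption: the function is applied to each (i, j) pair from the iterator.
index_pairs = Iterators.product(1:10, 1:10)   # Base equivalent of IterTools.product
to_path = ((i, j),) -> "$i/$j.jpg"            # destructure each (i, j) tuple

# Expand every pair into a path and flatten the 10x10 result into a vector.
paths = vec([to_path(p) for p in index_pairs])

length(paths)   # 100
paths[1]        # "1/1.jpg"
paths[2]        # "2/1.jpg" (the first range varies fastest)
```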

See the notebook for a full example on PyTorch-based satellite image decoding with BanyanImages.jl and BanyanONNXRunTime.jl.

Note that you must compute or write a result for computation to happen.
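
The single-input, single-output convention and the slicing along the first dimension can be tried locally with a stand-in. The sketch below uses plain Julia arrays and a hypothetical `toy_model` function, not an actual InferenceSession. It shows why a model that treats each slice along the first dimension independently can be run on chunks of the data and still give the same result.

```julia
# Hypothetical stand-in for a loaded model (NOT the BanyanONNXRunTime API):
# takes a Dict with one named input and returns a Dict with one named output.
# Each sample (a slice along the first dimension) is reduced to its mean.
function toy_model(inputs::Dict{String,<:AbstractArray})
    x = inputs["input"]
    n = size(x, 1)                                   # number of samples
    means = [sum(selectdim(x, 1, i)) / length(selectdim(x, 1, i)) for i in 1:n]
    return Dict("output" => reshape(means, n, 1))    # one row per sample
end

data = rand(Float32, 100, 3, 8, 8)                   # 100 samples of size 3x8x8

# Run on the whole array at once.
full = toy_model(Dict("input" => data))["output"]

# Run on two chunks split along the first dimension, then stack the results.
top = toy_model(Dict("input" => data[1:50, :, :, :]))["output"]
bottom = toy_model(Dict("input" => data[51:100, :, :, :]))["output"]

size(full)                          # (100, 1)
isapprox(full, vcat(top, bottom))   # true
```

Because every sample is processed independently, splitting along the first dimension does not change the output, which is the property that lets the work be spread across many workers.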