
Running Sessions

Once you have a running cluster, you can start sessions and run data processing code in them. Remember: a cluster is a group of servers (computers without displays, referred to as nodes in the context of a cluster) connected by a network and running remotely in the cloud (AWS data centers).

A session represents a period of work on a cluster during which workers are allocated to run your code. If no session is running, the cluster automatically scales down to a single node to keep costs down. Once a session is started, data processing code can be offloaded to the session, where it runs after the necessary nodes are spun up.

Starting a Session

You can start a session using the client library.

In the following code snippet, we start a session named MNIST-Classify on the cluster weekly-data-playground with 4 workers allocated. Providing a name is optional, but it makes the session easier to identify on the Banyan dashboard. See the full documentation for start_session for the remaining options.

session_id = start_session(
    cluster_name="weekly-data-playground",
    nworkers=4,
    session_name="MNIST-Classify",
    email_when_ready=true
)

Set email_when_ready=true (the default is false) if you want to receive an email notification when your session is ready and has workers running on your cluster. A session can take anywhere from 10 seconds to 30 minutes to start running.

Once the session is running, you can run your code in the session. See the docs for BanyanArrays.jl or BanyanDataFrames.jl for more details on what you can run. Note that the first piece of code that is executed after a session is started may take significantly longer, due to nodes spinning up for compute.

Managing Sessions

You can view information about all your sessions on the Sessions tab of the Banyan dashboard. Sessions are sorted in the order in which they were started, with more recently started sessions displayed near the top of the page.

Sessions table on Banyan dashboard

You can also use the client library to manage running sessions.

You can view all running sessions for a cluster using the following:

get_running_sessions("weekly-data-playground")

Once a session is created, it will continue running until the session is ended or a failure occurs. Note that if a failure occurs, the session will end, as it is not possible for further computation to run on the session. Otherwise, you should explicitly end your session using end_session().

To end a running session, use the following, specifying the session ID. If you do not remember the session ID, you can look it up by session name in the table on the Sessions tab of your dashboard.

end_session("2021-06-10-2240237175e6921d0c69f8a93aaa49ffd30287")
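
Because a session keeps running until it is ended, it can help to guarantee that end_session is called even when your code throws an error. The following is a minimal sketch using Julia's try/finally. The two stub functions and the `ended` list are stand-ins added only so the snippet runs without a cluster; they are not Banyan's implementation, and with the Banyan client library loaded you would remove them and call the real functions.

```julia
# Stand-in stubs so this sketch runs on its own. They are NOT Banyan's
# implementation; remove them when the Banyan client library is loaded.
const ended = String[]   # records ended sessions (illustration only)
start_session(; cluster_name, nworkers, session_name=nothing) = "example-session-id"
end_session(session_id) = push!(ended, session_id)

session_id = start_session(
    cluster_name="weekly-data-playground",
    nworkers=4,
    session_name="MNIST-Classify",
)

try
    # Your data processing code goes here
    println("running in session ", session_id)
finally
    # Runs whether the code above finished normally or threw an error
    end_session(session_id)
end
```

The finally clause runs on both the success and the failure path, so the session is not left running by accident.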

You can also run computations with a session and automatically end the session when it goes out of scope or when the program is terminated, using the following:

with_session(
    cluster_name="weekly-data-playground",
    nworkers=4,
    session_name="mnist_classify",
    store_logs_in_s3=true,
    return_logs_to_client=true
) do s
    # Your code here: see banyancomputing.com/banyan-arrays-jl-docs or
    # banyancomputing.com/banyan-data-frames-jl-docs for more details
end
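
The do-block form above is ordinary Julia syntax: the block is passed as the first positional argument to the function being called. The sketch below shows how a function of this shape can guarantee cleanup once the block finishes. All names in it (`with_resource`, `open_resource`, `close_resource`, `events`) are hypothetical and chosen for illustration; it is not Banyan's actual implementation of with_session.

```julia
# Hypothetical names throughout; this only illustrates the do-block pattern.
const events = String[]   # records the order of operations

open_resource(name) = (push!(events, "open $name"); name)
close_resource(r) = push!(events, "close $r")

# The body of the do-block arrives as the first positional argument `f`.
function with_resource(f, name)
    r = open_resource(name)
    try
        return f(r)
    finally
        close_resource(r)   # always runs, even if `f` throws
    end
end

value = with_resource("demo") do r
    push!(events, "using $r")
    42
end
```

After this runs, `events` holds the open, use, and close steps in that order, and `value` is whatever the block returned.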

Not currently supported

What's next?

Once you've started a session, you'll want to run some computation. Learn how to compute futures that have been produced by Banyan libraries such as BanyanArrays.jl or BanyanDataFrames.jl.