Polyaxon allows users to connect to one or multiple buckets on Google Cloud Storage (GCS) to store job outputs and experiment artifacts.
Start by creating a Google Cloud Storage bucket (e.g. `plx-outputs`) and assigning the appropriate permissions to it.
Google Cloud Storage provides an easy way to download a service account access key as a JSON file.
You should then create a Kubernetes secret from that JSON file, in the same namespace as the Polyaxon deployment:
```bash
kubectl create secret generic gcs-secret --from-file=gcs-key.json=path/to/gcs-key.json -n polyaxon
```
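Before creating the secret, it can help to sanity-check the downloaded key file locally. The snippet below is an illustrative standalone check (not part of Polyaxon or the GCS client) that verifies the JSON contains the fields service-account keys normally carry:

```python
import json

# Illustrative local sanity check (not part of Polyaxon): verify that the
# downloaded service-account key file contains the fields GCS clients expect.
REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email"}

def validate_key_file(path):
    with open(path) as f:
        key = json.load(f)
    missing = REQUIRED_FIELDS - key.keys()
    if missing:
        raise ValueError("key file is missing fields: %s" % sorted(missing))
    if key["type"] != "service_account":
        raise ValueError("expected a service_account key, got %r" % key["type"])
    return key

# Example usage before running kubectl:
# validate_key_file("path/to/gcs-key.json")
```

If this raises, the file is likely not a service-account key (for example, an OAuth client file downloaded by mistake), and the secret would not work.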
In your deployment config, declare the outputs store under `persistence`:

```yaml
persistence:
  outputs:
    [OUTPUTS-NAME-TO-USE]:
      store: gcs
      bucket: gs://[BUCKET-NAME]
      secret: [SECRET-NAME]
      secretKey: [SECRET-KEY]
```
For example:

```yaml
persistence:
  outputs:
    outputs:
      store: gcs
      bucket: gs://outputs-bucket
      secret: gcs-secret
      secretKey: gcs-key.json
```
You can use polyaxon-client to access the outputs in your jobs/experiments.
The Polyaxon client does not bundle the Google Cloud Storage requirements by default, in order to keep the client lightweight:
```bash
pip install polyaxon-client[gcs]
```
or, to have more control over the version of the google-cloud-storage package:

```bash
pip install polyaxon-client
pip install google-cloud-storage
```
In your experiment/job definition, you can add this step so that the client is available during the run:

```yaml
build:
  ...
  build_steps:
    ...
    - pip3 install polyaxon-client[gcs]
```
In your experiment/job, Polyaxon exposes the secret related to the outputs, as well as the outputs path scheduled for the run, as environment variables,
and provides an interface to get an authenticated client for each of these paths.
```python
from polyaxon_client.tracking import Experiment

experiment = Experiment()
...
experiment.outputs_store.upload_file(file_path)
experiment.outputs_store.upload_dir(dir_path)
```
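To illustrate what an upload like this does with the scheduled outputs path, the sketch below shows how a local file name might be mapped onto the run's `gs://` outputs path. `build_blob_key` is a hypothetical helper for illustration only, not part of polyaxon-client, and the example path is invented:

```python
import posixpath

# Minimal sketch (not the polyaxon-client implementation): how an outputs
# store could map a local file onto the run's scheduled outputs path.
def build_blob_key(outputs_path, file_path):
    # outputs_path is the gs:// path scheduled for the run,
    # e.g. "gs://outputs-bucket/project/1" (example value, not a real run).
    if not outputs_path.startswith("gs://"):
        raise ValueError("expected a gs:// outputs path")
    return posixpath.join(outputs_path, posixpath.basename(file_path))

print(build_blob_key("gs://outputs-bucket/project/1", "/tmp/model.h5"))
```

The real client resolves the authenticated storage client and destination from the environment variables Polyaxon injects, so user code only needs to pass local paths as shown above.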
If you are using TensorFlow, no further configuration is needed: TensorFlow can read and write GCS paths natively, and Polyaxon automatically sets the environment variables required for TensorFlow to use the bucket.