REST APIs for /platform/rest/deeplearning/v1

Deep Learning Impact RESTful APIs.

Version: v1

BasePath:/platform/rest/deeplearning/v1

http://apache.org/licenses/LICENSE-2.0.html

Methods

Configuration

get /conf

Retrieves all dlpd configuration parameters (getDeepLearningConf)

Returns all properties that are defined in $EGO_CONFDIR/../../dli/conf/dlpd/dlpd.conf.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Return type

StringMap

Example data

Content-Type: application/json

{
  "example_key" : "example_key"
}

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successful response that contains a dictionary of key-value string pairs that are defined in the dlpd.conf file. StringMap

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

500

An unexpected error occurred.
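
The following is a minimal sketch of how a client might build this request with the Python standard library. The host name, port, and bearer-token header are assumptions for illustration only; the source does not state how requests are authenticated. The request is constructed but not sent.

```python
import urllib.request

BASE_PATH = "/platform/rest/deeplearning/v1"  # BasePath stated in this reference


def build_conf_request(host: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for /conf.

    `host` and the bearer-token header are hypothetical; adapt them to
    the authentication scheme used by your cluster.
    """
    url = f"https://{host}{BASE_PATH}/conf"
    headers = {
        "Accept": "application/json",
        "Authorization": f"Bearer {token}",  # assumed auth scheme
    }
    return urllib.request.Request(url, headers=headers, method="GET")


req = build_conf_request("dli.example.com:9243", "my-token")
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would send it; a 200 response body can then be decoded with json.loads into the StringMap dictionary.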

Execute

delete /execs

Deletes all tasks started by the current user. (execsDelete)

Delete all tasks

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

204

Ok. Successfully deleted all tasks.

401

Authentication error. The request was denied.

500

An unexpected error occurred.

delete /execs/{execId}

Deletes a task started through Execute (execsExecIdDelete)

Deletes a task

Path parameters

execId (required)

Path Parameter — ID of task

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

204

Ok. Successfully deleted the task.

400

Cannot find task with given execId.

401

Authentication error. The request was denied.

500

An unexpected error occurred.
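
Several Execute endpoints share the /execs/{execId} prefix. The sketch below shows one way to build those paths while percent-encoding the ID. The helper name `exec_path` and the sample ID are hypothetical, and the sub-resource names are the ones listed in this reference.

```python
from urllib.parse import quote

BASE_PATH = "/platform/rest/deeplearning/v1"  # BasePath stated in this reference


def exec_path(exec_id: str, sub_resource: str = "") -> str:
    """Return the path for /execs/{execId}, optionally with a sub-resource.

    The ID is percent-encoded so that characters such as "/" cannot
    change the shape of the URL.
    """
    path = f"{BASE_PATH}/execs/{quote(exec_id, safe='')}"
    return f"{path}/{sub_resource}" if sub_resource else path


print(exec_path("task-123"))          # hypothetical ID
print(exec_path("task-123", "log"))   # sub-resource from this reference
```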

post /execs/{execId}/deploy

Deploy the trained model to Elastic Distributed Inference. (execsExecIdDeployPost)

Deploy the trained model to Elastic Distributed Inference.

Path parameters

execId (required)

Path Parameter — Execution ID of the training task.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Request body

modelDescription EDIModelDescription (required)

Body Parameter — Model configuration to deploy.

Return type

CreationResponse

Example data

Content-Type: application/json

{
  "uid" : "uid",
  "href" : "href"
}

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successfully deployed. CreationResponse

400

Invalid input parameters.

401

Authentication error. The request was denied.

500

An unexpected error occurred.

get /execs/{execId}

Retrieves a task started through Execute (execsExecIdGet)

Retrieves a task started through Execute. The returned value 'submissionId' can be used to make other Conductor REST calls to get additional task details.

Path parameters

execId (required)

Path Parameter — ID of task

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Return type

Batch

Example data

Content-Type: application/json

{
  "args" : "args",
  "creator" : "creator",
  "submissionId" : "submissionId",
  "appName" : "appName",
  "schedulerUrl" : "schedulerUrl",
  "appId" : "appId",
  "id" : "id",
  "workDir" : "workDir",
  "state" : "state",
  "events" : "events"
}

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successful response that contains the task. Batch

400

Cannot find task with given execId.

401

Authentication error. The request was denied.

500

An unexpected error occurred.
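
As a rough illustration of handling this response, the sketch below maps the documented status codes to errors and extracts 'submissionId' from a Batch object. The function name is hypothetical, and the sample body simply reuses the placeholder values from the example data.

```python
import json

# Error responses documented for GET /execs/{execId}
ERRORS = {
    400: "Cannot find task with given execId.",
    401: "Authentication error. The request was denied.",
    500: "An unexpected error occurred.",
}


def submission_id_from_response(status: int, body: str) -> str:
    """Return the task's submissionId, or raise on a non-200 status."""
    if status != 200:
        raise RuntimeError(ERRORS.get(status, f"Unexpected status {status}"))
    task = json.loads(body)  # a Batch object
    return task["submissionId"]


sample = '{"id": "id", "state": "state", "submissionId": "submissionId"}'
print(submission_id_from_response(200, sample))
```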

get /execs/{execId}/log

Retrieve logs of the training task by execution ID. (execsExecIdLogGet)

Retrieve logs of the training task by execution ID.

Path parameters

execId (required)

Path Parameter — Execution ID of the training task.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Return type

String

Example data

Content-Type: application/json

""

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successful response that contains the logs of this training task. String

400

Cannot find task with the given execId.

401

Authentication error. The request was denied.

500

An unexpected error occurred.

get /execs/{execId}/result

Retrieve the result of the training task using an execution ID. (execsExecIdResultGet)

Retrieve the result of the training task using an execution ID.

Path parameters

execId (required)

Path Parameter — Execution ID of the training task.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Return type

String

Example data

Content-Type: application/json

""

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successful response that contains the trained model of the training task. Returned as a zip file. String

400

Cannot find task with the given execId.

401

Authentication error. The request was denied.

500

An unexpected error occurred.
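
The 200 response is described as a zip file, although the declared return type is String. The sketch below assumes the client has already obtained the payload as raw bytes (how the bytes are encoded on the wire is not specified here) and checks that it is a readable zip archive before listing its contents. The archive used in the example is a stand-in built in memory.

```python
import io
import zipfile


def list_model_files(payload: bytes) -> list:
    """Return the file names inside a zip payload, or raise ValueError."""
    buffer = io.BytesIO(payload)
    if not zipfile.is_zipfile(buffer):
        raise ValueError("response body is not a zip archive")
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        return archive.namelist()


# Stand-in archive; a real payload would come from the API response.
fake = io.BytesIO()
with zipfile.ZipFile(fake, "w") as archive:
    archive.writestr("model/weights.bin", b"\x00\x01")

names = list_model_files(fake.getvalue())
print(names)
```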

post /execs/{execId}/stop

Stop the training task by execution ID. (execsExecIdStopPost)

Stop the training task by execution ID.

Path parameters

execId (required)

Path Parameter — Execution ID of the training task.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

204

Successfully stopped the task.

400

Cannot find task with the given execId.

401

Authentication error. The request was denied.

500

An unexpected error occurred.

get /execs/frameworks

Retrieves all deep learning framework plugins (execsFrameworksGet)

Retrieves all deep learning framework plugins

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Return type

array[DLFramework]

Example data

Content-Type: application/json

[ {
  "distributeStrategy" : "MultiWorkerMirroredStrategy",
  "frameworkVersion" : "frameworkVersion",
  "name" : "name",
  "description" : "description",
  "numPs" : 0,
  "desc" : [ "desc", "desc" ]
}, {
  "distributeStrategy" : "MultiWorkerMirroredStrategy",
  "frameworkVersion" : "frameworkVersion",
  "name" : "name",
  "description" : "description",
  "numPs" : 0,
  "desc" : [ "desc", "desc" ]
} ]

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successful response that contains all deep learning framework plugins. Framework plugin names are used to start a task.

401

Authentication error. The request was denied.

500

An unexpected error occurred.
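
Because framework plugin names are used to start a task, a client may want to pull just the names out of this response. The sketch below does that for a body shaped like the example data. The sample names are taken from the `--exec-start` examples elsewhere in this reference and may not match the plugins installed on a given cluster.

```python
import json


def framework_names(body: str) -> list:
    """Return the sorted, de-duplicated plugin names from a DLFramework array."""
    return sorted({framework["name"] for framework in json.loads(body)})


sample = json.dumps([
    {"name": "tensorflow", "frameworkVersion": "frameworkVersion", "numPs": 0},
    {"name": "PyTorch", "frameworkVersion": "frameworkVersion", "numPs": 0},
])
names = framework_names(sample)
print(names)
```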

get /execs

Retrieves all tasks started through Execute (execsGet)

Retrieves all tasks started through Execute

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Query parameters

projectid (optional)

Query Parameter — projectid

Return type

array[Batch]

Example data

Content-Type: application/json

[ {
  "args" : "args",
  "creator" : "creator",
  "submissionId" : "submissionId",
  "appName" : "appName",
  "schedulerUrl" : "schedulerUrl",
  "appId" : "appId",
  "id" : "id",
  "workDir" : "workDir",
  "state" : "state",
  "events" : "events"
}, {
  "args" : "args",
  "creator" : "creator",
  "submissionId" : "submissionId",
  "appName" : "appName",
  "schedulerUrl" : "schedulerUrl",
  "appId" : "appId",
  "id" : "id",
  "workDir" : "workDir",
  "state" : "state",
  "events" : "events"
} ]

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successful response that contains the tasks.

401

Authentication error. The request was denied.

500

An unexpected error occurred.

post /execs

Starts a task through Execute (execsPost)

Starts a task through Execute. It can have a data parameter to specify the task arguments and data sources.

data specification:

{
    'args': 'Arguments to the task. It has the same format as the `args` in the request parameters, except that the `--cs-datastore-meta` options can be overridden by the `dataSource` configuration below.',
    'projectId': 'project Id',
    # hardwareSpec defines the hardware specification for workers and drivers. If hardwareSpec is specified, the hardware specification in args is ignored.
    # If the id or name of hardwareSpec is defined, the existing hardware specification is used; otherwise, the hardware specification entity defined in nodes is used.
    'hardwareSpec': {
        'id': 'id of hardware specification',
        'name': 'name of hardware specification',
        'nodes': {
            'cpu': {
                'units': 'number of worker cpu units',
            },
            'mem': {
                'size': 'worker memory size',
            },
            'gpu': {
                'num_gpu': 'number of worker gpu',
                'gpu_profile': 'gpu type, one of: generic, full, slice',
                'mig_profile': 'MIG profile for slice type gpu, e.g. 1g.5gb, 2g.10gb',
            },
            'num_nodes': 'number of workers',
            'num_drivers': 'number of drivers',
            'drivers': {
                'cpu': {
                    'units': 'number of driver cpu units',
                },
                'mem': {
                    'size': 'driver memory size',
                }
            }
        },
        # override values of the hardware specification
        'asset_params':[
            {
                'path' : 'path of params begin from /nodes, e.g. /nodes/num_nodes',
                'value':  'new value of the params',
            },
        ]
    },
    'dataSource': [
    {
        'type': 'Type of the data source, it can be `fs`, `connection` or `data_asset`',
        'asset': {
            'asset_id': 'CP4D asset id for `connection` or `data_asset` asset.',
            'project_id': 'CP4D project id where the asset is located',
            'catalog_id': 'CP4D catalog id where the asset is located',
            'space_id': 'CP4D space id where the asset is located',
        },
        'location': {
            # for `connection` or `data_asset` type data source, configure data connection interaction properties for the asset.
            # for `fs` type data source, below configurations are allowed.
            'paths': 'string, optional, relative data path in wmla data pvc.',
            'volume': 'string, optional, CP4D storage volume name'
        },
        'parameters': {
            'read_to_file': 'bool, optional, whether to download the data source as a file instead of reading it into memory, default is False',
            'save_root_path': 'string, optional, only valid if read_to_file=True, indicates where the file is saved. Check more about it after this.',
            'asset_name': "string, optional, when read_to_file=True this asset_name is used as the name of the saved file; when read_to_file=False, it is used as the key in the dict output of WMLADataManager.create_from_data_source().read_pandas() to distinguish the result",
            'batch_size': 'int, optional, flight service parameter that sets the batch size of each chunk to read, default 10000 when read_to_file=False, 1000 when read_to_file=True.',
            'num_partitions': 'int, optional, flight service parameter that sets how the data source is partitioned when reading it, default 4.',
        }
    }]
}

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Query parameters

args (required)

Query Parameter — Arguments to the task. These arguments can be found in the command line interface. They can be model specific arguments. Examples are "--exec-start tensorflow --model-main TF_mnist.py", "--exec-start PyTorch --model-main PyTorch_mnist.py --batch-size 200"

Form parameters

file (required)

Form Parameter — If the model consists of one file then specify that file. If the model consists of a directory, then it's the tar of the directory with suffix ".modelDir.tar"

data (optional)

Form Parameter — Python dict or json format, convert to string when calling REST.

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successful task creation

401

Authentication error. The request was denied.

500

An unexpected error occurred.
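
The sketch below shows one way to prepare the `args` query parameter and the `data` form parameter for this call. It assumes the form parameters are sent as multipart form data, which this reference does not state explicitly. The project ID and data path are placeholders. The required `file` form field is left out; it would be attached with the HTTP client's file-upload support.

```python
import json
from urllib.parse import urlencode

BASE_PATH = "/platform/rest/deeplearning/v1"  # BasePath stated in this reference


def build_exec_submission(args: str, data: dict) -> tuple:
    """Return (path with the `args` query, form fields) for POST /execs.

    The `data` dict is converted to a JSON string, as the parameter
    description asks. The `file` form field is not handled here.
    """
    path = f"{BASE_PATH}/execs?{urlencode({'args': args})}"
    form_fields = {"data": json.dumps(data)}
    return path, form_fields


data = {
    "projectId": "my-project",  # placeholder value
    "dataSource": [
        {"type": "fs", "location": {"paths": "datasets/mnist"}},  # placeholder path
    ],
}
path, fields = build_exec_submission(
    "--exec-start tensorflow --model-main TF_mnist.py", data
)
print(path)
print(fields["data"])
```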

HyperSearch

post /hypersearch/algorithm/debug

Debug algorithm (debugAlgorithm)

Generate and download a fake task_attr.pb for local algorithm debugging.

Pass a simulated hpo task submission request in the request body, in JSON format as follows:

data specification:

{
   'hpoName': 'optional, string, name/id for the hpo task; one is generated if none is specified here.',
   'modelSpec':
   {
       'args': 'required, string, same as BYOF training'
   },
   'algoDef':
   {
       'algorithm': 'required, string, it can be a built-in algorithm such as Random, Tpe, Hyperband or ExperimentGridSearch, or a user-installed algorithm',
       'maxRunTime': 'optional, int, max running time of the hpo task in minutes, default -1 (unlimited)',
       'maxJobNum': 'optional, int, max number of training jobs to be submitted for the hpo task, default -1 (unlimited)',
       'maxParalleJob': 'optional, int, max number of training jobs to run in parallel, default 1',
       'objectiveMetric': 'required, string, name of the metric to optimize, the same one as in val_dict_list.json',
       'objective': 'required, string, optimization policy, one of minimize, maximize',
       'algoParams': 'optional, list like [{'name':'', 'value':''}], additional algorithm parameters, which can differ for each algorithm and are described below'
   },
   },
   'hyperParams':
   [
       {
           'name': 'required, string, hyperparameter name, the same name is used in config.json so that the user model can load it',
           'type': 'required, string, one of Range, Discrete',
           'dataType': 'required, string, one of int, double, str',
           'minDbVal': 'double, required if type=Range and dataType=double',
           'maxDbVal': 'double, required if type=Range and dataType=double',
           'minIntVal': 'int, required if type=Range and dataType=int',
           'maxIntVal': 'int, required if type=Range and dataType=int',
           'discreteDbVal': 'double, list like [0.1, 0.2], required if type=Discrete and dataType=double',
           'discreteIntVal': 'int, list like [1, 2], required if type=Discrete and dataType=int',
           'discreateStrVal': 'string, list like ['1', '2'], required if type=Discrete and dataType=str',
           'power': 'a number value in string format, the base value for power calculation. ONLY valid when type is Range',
           'step': 'a number value in string format, step size to split the Range space. ONLY valid when type is Range'
       }
   ]
}

A new hpo task request should use hyperParams in the request body.

For Random, algoParams can be provided as follows:

'algoParams':
[
    {
        'name': 'RandomSeed',
        'value': 'Optional, string, the random seed used to propose hyperparameter combinations.'
    }
]

For Hyperband, algoParams can be provided as follows:

'algoParams':
[
    {
        'name': 'RandomSeed',
        'value': 'Optional, string, the random seed used by Hyperband to propose hyperparameter combinations in the first rung of brackets.'
    },
    {
        'name': 'eta',
        'value': 'Optional, string, the reduction factor that controls the proportion of configurations discarded in each Hyperband bracket. Default 3.'
    },
    {
        'name': 'ResourceName',
        'value': 'Required, string, the parameter name that will be taken as resource in Hyperband, normally training epochs or iterations. User can get this parameter from config.json just like other hyper-parameters.'
    },
    {
        'name': 'ResourceValue',
        'value': 'Required, int value in string format, the upper limit for the ResourceName.'
    }
]

For Tpe, algoParams can be provided as follows:

'algoParams':
[
    {
        'name': 'RandomSeed',
        'value': 'Optional, string, the random seed used for the initial warm up hyperparameter combinations and the random generator of Gaussian Mixture Model.'
    },
    {
        'name': 'WarmUp',
        'value': 'Optional, string, the number of initial warm up hyperparameter combinations. It should be greater than 2. If maxJobNum is smaller than this value, maxJobNum will be taken as the value. Default 20.'
    },
    {
        'name': 'EICandidate',
        'value': 'Optional, string, the number of candidate hyperparameter combinations proposed each round, from which Expected Improvement selects the final combination. It should be greater than 1. Default 24.'
    },
    {
        'name': 'GoodRatio',
        'value': 'Optional, string, the fraction of hyperparameter combinations from previously completed experiment training to treat as good when building the good Gaussian Mixture Model. It should be greater than 0. Default 0.25.'
    },
    {
        'name': 'GoodMax',
        'value': 'Optional, string, the max number of good hyperparameter combinations from previously completed experiment training used to build the good Gaussian Mixture Model. It should be greater than 1. Default 25.'
    }
]
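
To make the conditional requirements in the data specification concrete, the sketch below assembles a request body for the Random algorithm and checks each hyperParams entry for the keys that the specification marks as required. The validator is a hypothetical client-side helper, not part of the API. The metric name, seed, and hyperparameter values are placeholders, and the combination type=Range with dataType=str is left unchecked because the specification does not describe it.

```python
# Keys required for each (type, dataType) pair, spelled as in the specification.
REQUIRED_KEYS = {
    ("Range", "double"): ["minDbVal", "maxDbVal"],
    ("Range", "int"): ["minIntVal", "maxIntVal"],
    ("Discrete", "double"): ["discreteDbVal"],
    ("Discrete", "int"): ["discreteIntVal"],
    ("Discrete", "str"): ["discreateStrVal"],  # key spelling follows the spec
}


def missing_keys(param: dict) -> list:
    """Return the required keys that one hyperParams entry lacks."""
    missing = [key for key in ("name", "type", "dataType") if key not in param]
    if missing:
        return missing
    required = REQUIRED_KEYS.get((param["type"], param["dataType"]), [])
    return [key for key in required if key not in param]


body = {
    "modelSpec": {"args": "--exec-start tensorflow --model-main TF_mnist.py"},
    "algoDef": {
        "algorithm": "Random",
        "objectiveMetric": "loss",  # placeholder metric name
        "objective": "minimize",
        "algoParams": [{"name": "RandomSeed", "value": "42"}],
    },
    "hyperParams": [
        {"name": "lr", "type": "Range", "dataType": "double",
         "minDbVal": 0.001, "maxDbVal": 0.1},
        # Deliberately incomplete entry to show what the check reports.
        {"name": "batch", "type": "Discrete", "dataType": "int"},
    ],
}

problems = {p["name"]: missing_keys(p) for p in body["hyperParams"]}
print(problems)
```

Serializing `body` with json.dumps would produce a string suitable for the request body once every entry reports no missing keys.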

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Request body

hpoTaskOpts HpoTaskInput (required)

Body Parameter — The simulated hpo task submit input.

Return type

String

Example data

Content-Type: application/json

""

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successfully generated and downloaded a fake task_attr.pb for algorithm debugging. String

401

Authentication error. The request was denied.

500

An unexpected error occurred.

delete /hypersearch

Delete all hpo tasks (deleteAllHPO)

Delete all hpo tasks.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

204

Successfully deleted all the hpo tasks.

400

The request format is invalid.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

404

The requested resource is not found.

409

Conflict. The requested resource cannot be deleted because it is in use.

500

An unexpected error occurred.

delete /hypersearch/{hpoName}

Delete an hpo task (deleteOneHPO)

Delete an hpo task.

Path parameters

hpoName (required)

Path Parameter — The hpo task name.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

204

Successfully deleted the hpo task.

400

The request format is invalid.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

404

The requested resource is not found.

409

Conflict. The requested resource cannot be deleted because it is in use.

500

An unexpected error occurred.

delete /hypersearch/algorithm/{algoName}

Delete an hpo plugin algorithm (deleteOneHPOALGORITHM)

Delete an hpo plugin algorithm.

Path parameters

algoName (required)

Path Parameter — The hpo plugin algorithm name.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

204

Successfully deleted the hpo plugin algorithm.

400

The request format is invalid.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

404

The requested resource is not found.

500

An unexpected error occurred.

get /hypersearch

Retrieve all hpo tasks. (getAllHPO)

Get all the hpo tasks that the logged-in user can access.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Query parameters

projectid (optional)

Query Parameter — projectid

Return type

array[HpoTaskState]

Example data

Content-Type: application/json

[ {
  "running" : 0,
  "duration" : "duration",
  "creator" : "creator",
  "createtime" : "createtime",
  "hpoName" : "hpoName",
  "experiments" : [ {
    "maxiteration" : 3,
    "appId" : "appId",
    "metricVal" : 5.637376656633329,
    "startTime" : "startTime",
    "id" : 5,
    "state" : "state",
    "metrics" : [ {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    }, {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    } ],
    "endTime" : "endTime",
    "hyperParams" : [ {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    }, {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    } ]
  }, {
    "maxiteration" : 3,
    "appId" : "appId",
    "metricVal" : 5.637376656633329,
    "startTime" : "startTime",
    "id" : 5,
    "state" : "state",
    "metrics" : [ {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    }, {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    } ],
    "endTime" : "endTime",
    "hyperParams" : [ {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    }, {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    } ]
  } ],
  "progress" : "progress",
  "best" : {
    "maxiteration" : 3,
    "appId" : "appId",
    "metricVal" : 5.637376656633329,
    "startTime" : "startTime",
    "id" : 5,
    "state" : "state",
    "metrics" : [ {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    }, {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    } ],
    "endTime" : "endTime",
    "hyperParams" : [ {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    }, {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    } ]
  },
  "state" : "state",
  "failed" : 1,
  "complete" : 6
}, {
  "running" : 0,
  "duration" : "duration",
  "creator" : "creator",
  "createtime" : "createtime",
  "hpoName" : "hpoName",
  "experiments" : [ {
    "maxiteration" : 3,
    "appId" : "appId",
    "metricVal" : 5.637376656633329,
    "startTime" : "startTime",
    "id" : 5,
    "state" : "state",
    "metrics" : [ {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    }, {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    } ],
    "endTime" : "endTime",
    "hyperParams" : [ {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    }, {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    } ]
  }, {
    "maxiteration" : 3,
    "appId" : "appId",
    "metricVal" : 5.637376656633329,
    "startTime" : "startTime",
    "id" : 5,
    "state" : "state",
    "metrics" : [ {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    }, {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    } ],
    "endTime" : "endTime",
    "hyperParams" : [ {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    }, {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    } ]
  } ],
  "progress" : "progress",
  "best" : {
    "maxiteration" : 3,
    "appId" : "appId",
    "metricVal" : 5.637376656633329,
    "startTime" : "startTime",
    "id" : 5,
    "state" : "state",
    "metrics" : [ {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    }, {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    } ],
    "endTime" : "endTime",
    "hyperParams" : [ {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    }, {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    } ]
  },
  "state" : "state",
  "failed" : 1,
  "complete" : 6
} ]

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successful response that contains the resulting hpo tasks.

400

Bad request. The request was not formatted correctly.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

500

An unexpected error occurred.
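
The HpoTaskState objects returned here are deeply nested. As an illustration, the sketch below reduces each one to its name, state, a count of finished experiments, and the metric value of the best experiment. The field names come from the example data; treating complete plus failed as "finished" is an assumption about what those counters mean.

```python
import json


def summarize_hpo_tasks(body: str) -> list:
    """Reduce each HpoTaskState to a small summary dictionary."""
    summary = []
    for task in json.loads(body):
        best = task.get("best") or {}  # may be absent before any experiment ends
        summary.append({
            "hpoName": task["hpoName"],
            "state": task["state"],
            # Assumption: complete + failed counts experiments that have ended.
            "finished": task.get("complete", 0) + task.get("failed", 0),
            "bestMetric": best.get("metricVal"),
        })
    return summary


sample = json.dumps([{
    "hpoName": "hpoName", "state": "state",
    "running": 0, "failed": 1, "complete": 6,
    "best": {"id": 5, "metricVal": 5.637376656633329},
}])
result = summarize_hpo_tasks(sample)
print(result)
```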

get /hypersearch/algorithm

Retrieve all hpo algorithms by algorithm type. (getAllHPOAlgorithm)

Get all the hpo algorithms, optionally filtered by algorithm type.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Query parameters

type (optional)

Query Parameter — The algorithm type, either BUILD_IN or USER_PLUGIN. If not specified, all algorithms are queried.

Return type

array[HpoAlgorithmDesc]

Example data

Content-Type: application/json

[ {
  "path" : "path",
  "creator" : "creator",
  "createtime" : "createtime",
  "logLevel" : "logLevel",
  "condaEnv" : "condaEnv",
  "name" : "name",
  "condaHome" : "condaHome",
  "type" : "type",
  "remoteExec" : true
}, {
  "path" : "path",
  "creator" : "creator",
  "createtime" : "createtime",
  "logLevel" : "logLevel",
  "condaEnv" : "condaEnv",
  "name" : "name",
  "condaHome" : "condaHome",
  "type" : "type",
  "remoteExec" : true
} ]

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successful response that contains the resulting hpo algorithms.

400

Bad request. The request was not formatted correctly.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

500

An unexpected error occurred.
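
A minimal sketch of building the URL for this call. The host `https://wmla.example.com:9243` and the helper name `algorithm_list_url` are hypothetical, and authentication is left out because this section does not describe it.

```python
from urllib.parse import urlencode

BASE_PATH = "/platform/rest/deeplearning/v1"
ALGORITHM_TYPES = ("BUILD_IN", "USER_PLUGIN")


def algorithm_list_url(host, algo_type=None):
    """Return the URL for GET /hypersearch/algorithm, optionally filtered by type."""
    url = f"{host.rstrip('/')}{BASE_PATH}/hypersearch/algorithm"
    if algo_type is None:
        return url  # no filter: all algorithms are queried
    if algo_type not in ALGORITHM_TYPES:
        raise ValueError(f"type must be one of {ALGORITHM_TYPES}")
    return f"{url}?{urlencode({'type': algo_type})}"


# Hypothetical host, for illustration only.
print(algorithm_list_url("https://wmla.example.com:9243", "USER_PLUGIN"))
```

Rejecting unknown type values on the client side is a design choice of this sketch; the server may answer such requests with a 400 response instead.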

get /hypersearch/algorithm/{algoName}

Retrieve the hpo algorithm detail (getOneAlgorithm)

Retrieve the hpo algorithm detail with the specified name in URL.

Path parameters

algoName (required)

Path Parameter — The hpo algorithm name.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Return type

HpoAlgorithmDesc

Example data

Content-Type: application/json

{
  "path" : "path",
  "creator" : "creator",
  "createtime" : "createtime",
  "logLevel" : "logLevel",
  "condaEnv" : "condaEnv",
  "name" : "name",
  "condaHome" : "condaHome",
  "type" : "type",
  "remoteExec" : true
}

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successful response that contains the resulting hpo algorithm. HpoAlgorithmDesc

400

Bad request. The request was not formatted correctly.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

500

An unexpected error occurred.

get /hypersearch/{hpoName}

Retrieve the hpo task detail (getOneHPO)

Retrieve the hpo task detail with the specified hpo task name in URL.

Path parameters

hpoName (required)

Path Parameter — The hpo task name.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Return type

HpoTaskState

Example data

Content-Type: application/json

{
  "running" : 0,
  "duration" : "duration",
  "creator" : "creator",
  "createtime" : "createtime",
  "hpoName" : "hpoName",
  "experiments" : [ {
    "maxiteration" : 3,
    "appId" : "appId",
    "metricVal" : 5.637376656633329,
    "startTime" : "startTime",
    "id" : 5,
    "state" : "state",
    "metrics" : [ {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    }, {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    } ],
    "endTime" : "endTime",
    "hyperParams" : [ {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    }, {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    } ]
  }, {
    "maxiteration" : 3,
    "appId" : "appId",
    "metricVal" : 5.637376656633329,
    "startTime" : "startTime",
    "id" : 5,
    "state" : "state",
    "metrics" : [ {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    }, {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    } ],
    "endTime" : "endTime",
    "hyperParams" : [ {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    }, {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    } ]
  } ],
  "progress" : "progress",
  "best" : {
    "maxiteration" : 3,
    "appId" : "appId",
    "metricVal" : 5.637376656633329,
    "startTime" : "startTime",
    "id" : 5,
    "state" : "state",
    "metrics" : [ {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    }, {
      "min" : 2.3021358869347655,
      "max" : 7.061401241503109,
      "name" : "name",
      "latest" : 9.301444243932576
    } ],
    "endTime" : "endTime",
    "hyperParams" : [ {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    }, {
      "maxIntVal" : 1,
      "fixedVal" : "fixedVal",
      "minIntVal" : 7,
      "dataType" : "int",
      "userDefined" : true,
      "discreateStrVal" : [ "discreateStrVal", "discreateStrVal" ],
      "type" : "range",
      "discreteIntVal" : [ 1, 1 ],
      "maxDbVal" : 4.145608029883936,
      "name" : "name",
      "step" : "step",
      "power" : "power",
      "discreteDbVal" : [ 1.0246457001441578, 1.0246457001441578 ],
      "minDbVal" : 2.027123023002322
    } ]
  },
  "state" : "state",
  "failed" : 1,
  "complete" : 6
}

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successful response that contains the resulting hpo task. HpoTaskState

400

Bad request. The request was not formatted correctly.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

500

An unexpected error occurred.
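
As an illustration of reading the HpoTaskState structure shown in the example data, the sketch below condenses a decoded response into a one-line status. The helper name `summarize_hpo_task` and the sample values are hypothetical; only field names that appear in the example data are used.

```python
def summarize_hpo_task(task):
    """Condense an HpoTaskState dictionary into a one-line status string."""
    counts = {key: task.get(key, 0) for key in ("running", "complete", "failed")}
    best = task.get("best") or {}
    line = (
        f"{task.get('hpoName')}: state={task.get('state')}, "
        f"running={counts['running']}, complete={counts['complete']}, "
        f"failed={counts['failed']}"
    )
    if "metricVal" in best:
        # "best" is an experiment entry with its own id and metricVal.
        line += f", best metricVal={best['metricVal']} (experiment {best.get('id')})"
    return line


# Sample values are made up for illustration.
sample_task = {
    "hpoName": "demo",
    "state": "state",
    "running": 0,
    "complete": 6,
    "failed": 1,
    "best": {"id": 5, "metricVal": 0.5},
}
print(summarize_hpo_task(sample_task))
```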

post /hypersearch/algorithm/install

Install a new hpo plugin algorithm (installHPOAlgorithm)

Install a new hpo plugin algorithm by providing algorithm scripts as well as other required parameters.

To install a new hpo plugin algorithm, pass the input parameters as a string in Python dict or JSON format, as shown below:

data specification:

{
   'name': 'required, string, name/id for the plugin algorithm, should be unique.',
   'path': 'optional, string, the path for plugin algorithm scripts on server, required for local installation mode.',
   'condaHome': 'optional, string, the CONDA_HOME to run the algorithm scripts, it will use the DLI_CONDA_HOME if not specified.',
   'condaEnv': 'optional, string, the conda environment to run the algorithm scripts, it will use the DLI default conda environment if not specified.',
   'remoteExec': 'optional, boolean, whether to deploy algorithm execution remotely, the default value is false.',
   'logLevel': 'optional, string, the log level of the plugin algorithm, the default value is INFO.'
}
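
A minimal sketch of producing the `data` string from the specification above. The helper name `build_install_data` and the sample plugin name and path are hypothetical; unset optional keys are left out so that the defaults described in the specification apply.

```python
import json


def build_install_data(name, path=None, conda_home=None, conda_env=None,
                       remote_exec=None, log_level=None):
    """Serialize the `data` form field for POST /hypersearch/algorithm/install."""
    if not name:
        raise ValueError("name is required and should be unique")
    optional = {
        "path": path,
        "condaHome": conda_home,
        "condaEnv": conda_env,
        "remoteExec": remote_exec,
        "logLevel": log_level,
    }
    data = {"name": name}
    # Leave out unset keys so the documented defaults apply.
    data.update({key: value for key, value in optional.items() if value is not None})
    return json.dumps(data)


# Hypothetical plugin name and server-side path.
print(build_install_data("my_plugin", path="/opt/plugins/my_plugin", log_level="DEBUG"))
```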

Consumes

This API call consumes the following media types via the Content-Type request header:

multipart/form-data
application/x-www-form-urlencoded

Form parameters

file (optional)

Form Parameter — A tar archive of the plugin algorithm directory with the suffix ".tar". Required if using the upload installation mode.

data (required)

Form Parameter — Python dict or json format, convert to string when calling REST.

Return type

CreationResponse

Example data

Content-Type: application/json

{
  "uid" : "uid",
  "href" : "href"
}

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successfully installed the hpo plugin algorithm. CreationResponse

400

The request format is invalid.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

500

An unexpected error occurred.

put /hypersearch/{hpoName}/restart

Restarts a hpo task (restartOneHPO)

Restarts a hpo task.

Path parameters

hpoName (required)

Path Parameter — The hpo task name.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

204

Successfully restarted the hpo task.

400

Bad request. The request was not formatted correctly.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

404

The requested resource was not found.

500

An unexpected error occurred.

post /hypersearch

Start a new hpo task (startHPO)

Start a new hpo task by providing required parameters.

To start a hpo task, pass the input parameters as a string in Python dict or JSON format, as shown below:

data specification:

{
   'hpoName': 'optional, string, name/id for the hpo task, will generate one if none specified here.',
   'modelSpec': 
   {
       'args': 'required, string, same as BYOF training',
       # hardwareSpec defines the hardware specification for the workers and driver. If hardwareSpec is specified, the hardware specification in args is ignored.
       # If the id or name of hardwareSpec is defined, the existing hardware specification is used; otherwise, the hardware specification entity defined in nodes is used.
       'hardwareSpec': {
           'id': 'id of hardware specification',
           'name': 'name of hardware specification',
           'nodes': {
               'cpu': {
                   'units': 'number of worker cpu units',
               },
               'mem': {
                   'size': 'worker memory size',
               },
               'gpu': {
                   'num_gpu': 'number of worker gpu',
                   'gpu_profile': 'gpu type, one of: generic, full, slice',
                   'mig_profile': 'MIG profile for slice type gpu, e.g. 1g.5gb, 2g.10gb',
               },
               'num_nodes': 'number of workers',
               'num_drivers': 'number of drivers',
               'drivers': {
                   'cpu': {
                       'units': 'number of driver cpu units',
                   },
                   'mem': {
                       'size': 'driver memory size',
                   }
               }
           }
           # override values of the hardware specification
           'asset_params':[
               {
                   'path' : 'path of params begin from /nodes, e.g. /nodes/num_nodes',
                   'value':  'new value of the params',
               },
           ]
       }
        'dataSource':  [
        {
            'type': 'Type of the data source, it can be `fs`, `connection` or `data_asset`',
            'asset': {
                'asset_id': 'CP4D asset id for `connection` or `data_asset` asset.',
                'project_id': 'CP4D project id where the asset is located',
                'catalog_id': 'CP4D catalog id where the asset is located',
                'space_id': 'CP4D space id where the asset is located',
            },
            'location': {
                # for `connection` or `data_asset` type data source, configure data connection interaction properties for the asset.
                # for `fs` type data source, below configurations are allowed.
                'paths': 'string, optional, relative data path in wmla data pvc.',
                'volume': 'string, optional, CP4D storage volume name'
            },
            'parameters': {
                'read_to_file': 'bool, optional, indicates whether to read the data source into memory or download it as a file, default is False',
                'save_root_path': 'string, optional, only valid if read_to_file=True, indicates where the file will be saved. Check more about it after this.',
                'asset_name': "string, optional, when read_to_file=True this asset_name will be used as the file name to save, when read_to_file=False, it will be used as the key in the dict output of WMLADataManager.create_from_data_source().read_pandas() to distinguish the result",
                'batch_size': 'int, optional, flight service parameter to decide the batch size of each chunk to read, default 10000 when read_to_file=False, 1000 when read_to_file=True.',
                'num_partitions': 'int, optional, flight service parameter to decide how to partition the data source when reading it, default 4.',
            }
        }]
   },
   'algoDef':
   { 
       'algorithm': 'required, string, either a built-in algorithm such as Random, Tpe, Hyperband, or ExperimentGridSearch, or a user-installed algorithm',
       'maxRunTime': 'optional, int, max running time of the hpo task in minutes, default -1(unlimited)',
       'maxJobNum': 'optional, int, max number of training jobs to submit for the hpo task, default -1(unlimited)',
       'maxParalleJob': 'optional, int, max number of training jobs to run in parallel, default 1',
       'objectiveMetric': 'required, string, name of the metric to optimize, the same one as in val_dict_list.json',
       'objective': 'required, string, optimization policy, one of minimize, maximize',
       'additionalMetrics': 'optional, dict like {'metric_name': 'metric strategy'}, where the metric strategy can be one of minimize, maximize, latest. latest is used as the strategy if a name other than those three is specified.',
       'algoParams': 'optional, list like [{'name':'', value:''}], additional algorithm parameters, which can differ for each algorithm and are covered later'
   },
   'hyperParams':
   [
       {
           'name': 'required, string, hyperparameter name, the same name will be used in the config.json so user model can load it',
           'type': 'required, string, one of Range, Discrete',
           'dataType': 'required, string, one of int, double, str',
           'minDbVal': 'double, required if type=Range and dataType=double',
           'maxDbVal': 'double, required if type=Range and dataType=double',
           'minIntVal': 'int, required if type=Range and dataType=int',
           'maxIntVal': 'int, required if type=Range and dataType=int',
           'discreteDbVal': 'double, list like [0.1, 0.2], required if type=Discrete and dataType=double',
           'discreteIntVal': 'int, list like [1, 2], required if type=Discrete and dataType=int',
           'discreateStrVal': 'string, list like ['1', '2'], required if type=Discrete and dataType=str',
           'power': 'a number value in string format, the base value for power calculation. ONLY valid when type is Range',
           'step': 'a number value in string format, step size to split the Range space. ONLY valid when type is Range'
       }
   ],
   'experiments':
   [
       {
          'id': 'required, int, hyperparameter experiment id',
          'hyperParams':
          [
              {
                  'name': 'required, string, hyperparameter name, the same name will be used in the config.json so user model can load it',
                  'dataType': 'required, string, one of int, double, str',
                  'fixedVal': 'required, same type as the specified dataType; for example, if dataType=double, fixedVal must be a double'
              }
          ]
       }
    ]
}
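
The key names in `hyperParams` depend on both `type` and `dataType`. The sketch below builds entries that follow the specification above; the helper names `range_param` and `discrete_param` are hypothetical, and the key `discreateStrVal` is spelled as the specification spells it.

```python
# Key suffix for each dataType, taken from the specification above.
_SUFFIX = {"int": "IntVal", "double": "DbVal"}


def range_param(name, data_type, low, high, step=None, power=None):
    """Build a Range entry; the specification defines Range bounds for int and double only."""
    if data_type not in _SUFFIX:
        raise ValueError("Range supports dataType int or double")
    entry = {
        "name": name,
        "type": "Range",
        "dataType": data_type,
        f"min{_SUFFIX[data_type]}": low,
        f"max{_SUFFIX[data_type]}": high,
    }
    # step and power are numbers passed in string format.
    if step is not None:
        entry["step"] = str(step)
    if power is not None:
        entry["power"] = str(power)
    return entry


def discrete_param(name, data_type, values):
    """Build a Discrete entry with the value list stored under the matching key."""
    keys = {"int": "discreteIntVal", "double": "discreteDbVal", "str": "discreateStrVal"}
    if data_type not in keys:
        raise ValueError("dataType must be int, double, or str")
    return {"name": name, "type": "Discrete", "dataType": data_type,
            keys[data_type]: list(values)}


# Hypothetical hyperparameter names, for illustration only.
print(range_param("learning_rate", "double", 0.001, 0.1))
print(discrete_param("optimizer", "str", ["sgd", "adam"]))
```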

Each new hpo task request can specify either hyperParams or experiments, but not both. The search algorithm ExperimentGridSearch supports only experiments; all other algorithms support only hyperParams.
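
This rule can be checked on the client before a request is submitted. The sketch below is one possible check; the helper name `check_search_space` is hypothetical, and the server may apply further validation that is not reproduced here.

```python
def check_search_space(request):
    """Raise ValueError if hyperParams/experiments do not match the chosen algorithm."""
    algorithm = request["algoDef"]["algorithm"]
    has_hyper = bool(request.get("hyperParams"))
    has_experiments = bool(request.get("experiments"))
    if has_hyper == has_experiments:
        raise ValueError("provide exactly one of hyperParams or experiments")
    if algorithm == "ExperimentGridSearch" and not has_experiments:
        raise ValueError("ExperimentGridSearch supports only experiments")
    if algorithm != "ExperimentGridSearch" and not has_hyper:
        raise ValueError(f"{algorithm} supports only hyperParams")


# A request that satisfies the rule: a non-grid algorithm with hyperParams.
check_search_space({
    "algoDef": {"algorithm": "Random"},
    "hyperParams": [{"name": "learning_rate"}],
})
```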

For Random, algoParams can be provided as follows:

'algoParams':
[
    {
        'name': 'RandomSeed',
        'value': 'Optional, string, the random seed used to propose hyperparameter combinations.'
    }
]

For Hyperband, algoParams can be provided as follows:

'algoParams':
[
    {
        'name': 'RandomSeed',
        'value': 'Optional, string, the random seed used by Hyperband to propose hyperparameter combinations in the first rung of brackets.'
    },
    {
        'name': 'eta',
        'value': 'Optional, string, the reduction factor to control the proportion of configurations discarded in each Hyperband bracket. Default 3.'
    },
    {
        'name': 'ResourceName',
        'value': 'Required, string, the parameter name that will be taken as resource in Hyperband, normally training epochs or iterations. User can get this parameter from config.json just like other hyper-parameters.'
    },
    {
        'name': 'ResourceValue',
        'value': 'Required, int value in string format, the upper limit value for the ResourceName.'
    }
]
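
A short sketch of assembling the Hyperband parameters, with the two required entries enforced and all values converted to strings. The helper name `hyperband_algo_params` and the resource name `epochs` are hypothetical.

```python
def hyperband_algo_params(resource_name, resource_value, eta=None, random_seed=None):
    """Build the algoParams list for Hyperband; ResourceName and ResourceValue are required."""
    if not resource_name:
        raise ValueError("ResourceName is required")
    params = [
        {"name": "ResourceName", "value": resource_name},
        # ResourceValue is an int value in string format.
        {"name": "ResourceValue", "value": str(int(resource_value))},
    ]
    if eta is not None:
        params.append({"name": "eta", "value": str(eta)})
    if random_seed is not None:
        params.append({"name": "RandomSeed", "value": str(random_seed)})
    return params


print(hyperband_algo_params("epochs", 81, eta=3))
```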

For Tpe, algoParams can be provided as follows:

'algoParams':
[
    {
        'name': 'RandomSeed',
        'value': 'Optional, string, the random seed used for the initial warm up hyperparameter combinations and the random generator of Gaussian Mixture Model.'
    },
    {
        'name': 'WarmUp',
        'value': 'Optional, string, the number of initial warm up hyperparameter combinations. It should be bigger than 2. If maxJobNum is smaller than this value, maxJobNum will be taken as the value. Default 20.'
    },
    {
        'name': 'EICandidate',
        'value': 'Optional, string, the number of hyperparameter combinations proposed each round as the candidates for Expected Improvement to propose the final one hyperparameter combination. It should be bigger than 1. Default 24.'
    },
    {
        'name': 'GoodRatio',
        'value': 'Optional, string, the fraction to use as good hyperparameter combinations from previous completed experiment training to build the good Gaussian Mixture Model. It should be bigger than 0. Default 0.25.'
    },
    {
        'name': 'GoodMax',
        'value': 'Optional, string, the max number of good hyperparameter combinations from previous completed experiment training to build the good Gaussian Mixture Model. It should be bigger than 1. Default 25.'
    }
]
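
The lower bounds stated for the Tpe parameters can be checked while building the list. This is a sketch only: the helper name `tpe_algo_params` is hypothetical, and "bigger than" is read as a strict inequality.

```python
def tpe_algo_params(random_seed=None, warm_up=None, ei_candidate=None,
                    good_ratio=None, good_max=None):
    """Build the algoParams list for Tpe, enforcing the lower bounds stated above."""
    # Each tuple is (name, value, exclusive lower bound or None).
    candidates = [
        ("RandomSeed", random_seed, None),
        ("WarmUp", warm_up, 2),
        ("EICandidate", ei_candidate, 1),
        ("GoodRatio", good_ratio, 0),
        ("GoodMax", good_max, 1),
    ]
    params = []
    for name, value, lower in candidates:
        if value is None:
            continue  # omitted parameters fall back to the documented defaults
        if lower is not None and not value > lower:
            raise ValueError(f"{name} should be bigger than {lower}")
        params.append({"name": name, "value": str(value)})  # values are sent as strings
    return params


print(tpe_algo_params(random_seed=7, warm_up=10, good_ratio=0.25))
```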

Consumes

This API call consumes the following media types via the Content-Type request header:

multipart/form-data
application/x-www-form-urlencoded

Form parameters

file (required)

Form Parameter — If the model consists of one file, specify that file. If the model consists of a directory, specify a tar archive of the directory with the suffix '.modelDir.tar'.

data (required)

Form Parameter — Python dict or json format, convert to string when calling REST.
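
A minimal sketch of preparing the `file` parameter for a model that consists of a directory, using Python's standard `tarfile` module. The helper name `pack_model_dir` is hypothetical, and storing the directory as the top-level entry of the archive is an assumption, since the expected archive layout is not described here.

```python
import os
import tarfile
import tempfile


def pack_model_dir(model_dir, out_dir):
    """Tar a model directory into <name>.modelDir.tar and return the archive path."""
    model_dir = os.path.abspath(model_dir)
    name = os.path.basename(model_dir)
    archive = os.path.join(out_dir, f"{name}.modelDir.tar")
    with tarfile.open(archive, "w") as tar:  # plain tar, no compression
        # Assumption: the directory itself is the top-level archive entry.
        tar.add(model_dir, arcname=name)
    return archive


# Demonstration with a throwaway directory.
with tempfile.TemporaryDirectory() as work:
    model = os.path.join(work, "mymodel")
    os.mkdir(model)
    open(os.path.join(model, "main.py"), "w").close()
    archive_path = pack_model_dir(model, work)
    archive_name = os.path.basename(archive_path)
    with tarfile.open(archive_path) as tar:
        members = sorted(tar.getnames())

print(archive_name, members)
```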

Return type

CreationResponse

Example data

Content-Type: application/json

{
  "uid" : "uid",
  "href" : "href"
}

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successfully started hpo task. CreationResponse

400

The request format is invalid.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

409

Conflict. The requested resource already exists.

500

An unexpected error occurred.

put /hypersearch/{hpoName}

Stops a hpo task (stopOneHPO)

Stops a running hpo task.

Path parameters

hpoName (required)

Path Parameter — The HPO task name.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

204

Successfully stopped the hpo task.

400

Bad request. The request was not formatted correctly.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

404

The requested resource was not found.

500

An unexpected error occurred.

put /hypersearch/{hpoName}/force

Forcibly stops a hpo task (stopOneHPOForce)

Forcibly stops a running hpo task.

Path parameters

hpoName (required)

Path Parameter — The hpo task name.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

204

Successfully force-stopped the hpo task.

400

Bad request. The request was not formatted correctly.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

404

The requested resource was not found.

500

An unexpected error occurred.
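
The stop, force-stop, and restart calls differ only in their path suffix. The sketch below builds the corresponding PUT requests without sending them. The host, the task name, and the helper name `hpo_control_request` are hypothetical, and authentication headers are left out.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_PATH = "/platform/rest/deeplearning/v1"
# Path suffix for each PUT action on /hypersearch/{hpoName}.
_ACTIONS = {"stop": "", "force_stop": "/force", "restart": "/restart"}


def hpo_control_request(host, hpo_name, action):
    """Return an unsent PUT request that stops, force-stops, or restarts an hpo task."""
    if action not in _ACTIONS:
        raise ValueError(f"action must be one of {sorted(_ACTIONS)}")
    url = (f"{host.rstrip('/')}{BASE_PATH}/hypersearch/"
           f"{quote(hpo_name, safe='')}{_ACTIONS[action]}")
    return Request(url, method="PUT", headers={"Content-Type": "application/json"})


# Hypothetical host and task name.
req = hpo_control_request("https://wmla.example.com:9243", "my-hpo", "restart")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; a 204 status indicates success for all three actions.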

ResourcePlans

post /resplans/resplan

Creates a resource plan. (createResplan)

Creates a new resource plan.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Request body

resplan Resplan (required)

Body Parameter — The information of the resource plan.

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

201

The resource plan is created successfully.

400

The request format is invalid.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

409

Conflict. The requested resource already exists.

500

An unexpected error occurred.

delete /resplans/resplan/{resplan_name}

Deletes a resource plan. (delResplanByName)

Deletes a resource plan.

Path parameters

resplan_name (required)

Path Parameter — The resource plan name.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successfully deleted the resource plan.

400

The request format is invalid.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

404

The requested resource is not found.

409

Conflict. The requested resource cannot be deleted because it is in use.

500

An unexpected error occurred.

get /resplans/resplan/{resplan_name}

Retrieves a resource plan by its name (getResplan)

Retrieves a resource plan by its name.

Path parameters

resplan_name (required)

Path Parameter — The resource plan name.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Return type

V1Resplan

Example data

Content-Type: application/json

{
  "path" : "path",
  "usedgpu" : "usedgpu",
  "requestgpu" : "requestgpu",
  "usedcpu" : "usedcpu",
  "requestcpu" : "requestcpu"
}

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successful response that contains the resource plan. V1Resplan

400

Bad request. The request was not formatted correctly.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

404

The requested resource was not found.

500

An unexpected error occurred.

get /resplans/resplantree

Retrieves the tree of the resource plan (getResplanTree)

Returns the resource plan tree for the customer view.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Return type

TreeDto

Example data

Content-Type: application/json

{
  "path" : "path",
  "isParent" : "isParent",
  "childTreeDto" : [ null, null ],
  "V1Resplan" : {
    "path" : "path",
    "usedgpu" : "usedgpu",
    "requestgpu" : "requestgpu",
    "usedcpu" : "usedcpu",
    "requestcpu" : "requestcpu"
  },
  "name" : "name",
  "pid" : "pid",
  "id" : "id"
}

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successful response that contains the resource plan tree. TreeDto

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

500

An unexpected error occurred.
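
A sketch of walking the returned tree, assuming that each entry of `childTreeDto` is itself a TreeDto and may be null, as the example data suggests. The helper name `iter_resplan_paths` and the sample paths are hypothetical.

```python
def iter_resplan_paths(node):
    """Yield the path of every node in a TreeDto dictionary, depth first."""
    if not node:
        return  # the example data shows null children
    yield node.get("path")
    for child in node.get("childTreeDto") or []:
        yield from iter_resplan_paths(child)


# Sample tree with made-up paths.
tree = {
    "path": "/root",
    "name": "root",
    "childTreeDto": [
        {"path": "/root/a", "name": "a", "childTreeDto": [None]},
        {"path": "/root/b", "name": "b"},
    ],
}
print(list(iter_resplan_paths(tree)))
```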

put /resplans/resplan

Update a resource plan. (updateResplanByName)

Update a resource plan.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Request body

resplan Resplan (required)

Body Parameter — The information of the resource plan.

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

204

Successfully updated the resource plan.

400

Bad request. The request was not formatted correctly.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

404

The requested resource was not found.

500

An unexpected error occurred.

Scheduler

get /scheduler/applications/{appid}/driver/logs/{type}/download

Downloads the driver log for a WMLA elastic distributed training application. (downloadDriverLogFile)

Downloads the driver log for a WMLA elastic distributed training application.

Path parameters

appid (required)

Path Parameter — The WMLA elastic distributed training application ID.

type (required)

Path Parameter — The type of the log to retrieve, which is one of 'stdout', 'stderr', or 'launcherlog'.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/octet-stream

Responses

200

Successful response, with the log returned.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

404

The requested instance group or application ID is not found.

500

An unexpected error occurred.

get /scheduler/applications/{appid}/executor/{executorid}/logs/{type}/download

Download the executor log for a MSD application. (downloadExecutorLogFile)

Download the executor log for a MSD application.

Path parameters

appid (required)

Path Parameter — The MSD application ID.

executorid (required)

Path Parameter — The executor ID for the MSD application.

type (required)

Path Parameter — The type of the log to retrieve, which is one of 'stdout', 'stderr', or 'launcherlog'.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/octet-stream

Responses

200

Successful response, with the log returned.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

404

The requested instance group or application ID is not found.

500

An unexpected error occurred.
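
The driver and executor download paths share one shape, which the sketch below captures in a single helper. The host, the application ID, and the helper name `log_download_url` are hypothetical; the log type is checked against the three documented values.

```python
from urllib.parse import quote

BASE_PATH = "/platform/rest/deeplearning/v1"
LOG_TYPES = ("stdout", "stderr", "launcherlog")


def log_download_url(host, app_id, log_type, executor_id=None):
    """Return the download URL for a driver log, or an executor log if executor_id is given."""
    if log_type not in LOG_TYPES:
        raise ValueError(f"type must be one of {LOG_TYPES}")
    if executor_id is None:
        source = "driver"
    else:
        source = f"executor/{quote(str(executor_id), safe='')}"
    return (f"{host.rstrip('/')}{BASE_PATH}/scheduler/applications/"
            f"{quote(app_id, safe='')}/{source}/logs/{log_type}/download")


# Hypothetical host and application ID.
print(log_download_url("https://wmla.example.com:9243", "app-1", "stderr"))
```

The response is produced as application/octet-stream, so a client would typically write the body to a file in binary mode.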

get /scheduler/applicationStatistic

Retrieves deep learning application statistics. (getApplicationStatistic)

Retrieves deep learning application statistics.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Query parameters

applicationid (optional)

Query Parameter — The ID of the application.

applicationname (optional)

Query Parameter — The name of the application.

consumer (optional)

Query Parameter — The consumer of the application.

start (optional)

Query Parameter — The start job index.

length (optional)

Query Parameter — The length of the list.

state (optional)

Query Parameter — The state of the applications.

Return type

array[ApplicationStatistic]

Example data

Content-Type: application/json

[ {
  "gpuUsed" : 5.962133916683182,
  "gpuReq" : 1.4658129805029452,
  "cpuUsed" : 6.027456183070403,
  "cpuReq" : 0.8008281904610115,
  "jobPending" : 5,
  "jobRunning" : 2,
  "username" : "username"
}, {
  "gpuUsed" : 5.962133916683182,
  "gpuReq" : 1.4658129805029452,
  "cpuUsed" : 6.027456183070403,
  "cpuReq" : 0.8008281904610115,
  "jobPending" : 5,
  "jobRunning" : 2,
  "username" : "username"
} ]

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successful response that contains a list of deep learning application statistics.

400

Bad request. The request was not formatted correctly.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

416

An index is out of range.

500

An unexpected error occurred.
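
As an illustration of consuming the returned list, the sketch below sums the numeric fields across all ApplicationStatistic entries. The helper name `cluster_totals` and the sample values are hypothetical, and treating each entry as the statistics of one user is an assumption based on the `username` field.

```python
def cluster_totals(stats):
    """Sum the numeric fields across a list of ApplicationStatistic dictionaries."""
    fields = ("gpuUsed", "gpuReq", "cpuUsed", "cpuReq", "jobPending", "jobRunning")
    return {field: sum(entry.get(field, 0) for entry in stats) for field in fields}


# Made-up sample entries.
sample_stats = [
    {"username": "alice", "gpuUsed": 2.0, "gpuReq": 4.0, "cpuUsed": 8.0,
     "cpuReq": 8.0, "jobPending": 1, "jobRunning": 2},
    {"username": "bob", "gpuUsed": 1.0, "gpuReq": 1.0, "cpuUsed": 4.0,
     "cpuReq": 6.0, "jobPending": 0, "jobRunning": 1},
]
print(cluster_totals(sample_stats))
```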

get /scheduler/applications

Retrieves deep learning applications (getApplications)

Retrieves deep learning applications.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Query parameters

applicationid (optional)

Query Parameter — The ID of the application.

applicationname (optional)

Query Parameter — The name of the application.

driverid (optional)

Query Parameter — The ID of the application driver.

search (optional)

Query Parameter — search

sort (optional)

Query Parameter — The field name to sort the response by. Only one field name can be specified as the sort type. Prefix the field name with "-" to sort in descending order.

order (optional)

Query Parameter — order

start (optional)

Query Parameter — The start job index.

length (optional)

Query Parameter — The length of the list.

state (optional)

Query Parameter — The state of the applications.

projectid (optional)

Query Parameter — projectid

Return type

array[ApplicationDetail]

Example data

Content-Type: application/json

[ {
  "memused" : 5.637376656633329,
  "hosts" : 1,
  "schedulerUrl" : "schedulerUrl",
  "endtime" : 3,
  "starttime" : 9,
  "type" : "BATCH",
  "applicationname" : "applicationname",
  "dltype" : "Caffe",
  "tunningname" : "tunningname",
  "slots" : 0,
  "submittedtime" : 7,
  "appReason" : "appReason",
  "appFailureDetail" : "appFailureDetail",
  "demandslots" : 6,
  "coresused" : 5.962133916683182,
  "apprunduration" : 2.027123023002322,
  "model" : "model",
  "state" : "state",
  "applicationid" : "applicationid",
  "dataset" : "dataset",
  "username" : "username",
  "timestamp" : 2
}, {
  "memused" : 5.637376656633329,
  "hosts" : 1,
  "schedulerUrl" : "schedulerUrl",
  "endtime" : 3,
  "starttime" : 9,
  "type" : "BATCH",
  "applicationname" : "applicationname",
  "dltype" : "Caffe",
  "tunningname" : "tunningname",
  "slots" : 0,
  "submittedtime" : 7,
  "appReason" : "appReason",
  "appFailureDetail" : "appFailureDetail",
  "demandslots" : 6,
  "coresused" : 5.962133916683182,
  "apprunduration" : 2.027123023002322,
  "model" : "model",
  "state" : "state",
  "applicationid" : "applicationid",
  "dataset" : "dataset",
  "username" : "username",
  "timestamp" : 2
} ]

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

application/json

Responses

200

Successful response that contains a list of deep learning applications.

400

Bad request. The request was not formatted correctly.

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

416

An index is out of range.

500

An unexpected error occurred.
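As a rough illustration of how the query parameters above combine, the following Python sketch builds a request URL for this endpoint. The host name is a placeholder and the helper name is hypothetical. Sorting on submittedtime assumes that ApplicationDetail field names are accepted as sort fields, which this reference does not state explicitly.

```python
from urllib.parse import urlencode

# Base path from this reference; the scheme and host are placeholders.
BASE_URL = "https://wmla.example.com/platform/rest/deeplearning/v1"

# Query parameters documented for GET /scheduler/applications.
ALLOWED = {
    "applicationid", "applicationname", "driverid", "search", "sort",
    "order", "start", "length", "state", "projectid",
}


def applications_url(base_url=BASE_URL, **params):
    """Build a GET /scheduler/applications URL, skipping unset parameters."""
    unknown = set(params) - ALLOWED
    if unknown:
        raise ValueError(f"unsupported query parameters: {sorted(unknown)}")
    # Keep only parameters that were given a value.
    query = {k: v for k, v in params.items() if v is not None}
    url = f"{base_url}/scheduler/applications"
    return f"{url}?{urlencode(query)}" if query else url


# Sort in descending order by prefixing the field name with "-".
print(applications_url(sort="-submittedtime", start=0, length=20))
```

The sketch only builds the URL; sending the request and authenticating are left to whichever HTTP client is in use.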

get /scheduler/applications/{appid}/driver/logs/{type}

Retrieves the latest lines of the driver log for a WMLA elastic distributed training application. (getDriverLog)

Retrieves the latest lines of the driver log for a WMLA elastic distributed training application.

Path parameters

appid (required)

Path Parameter — The WMLA elastic distributed training application ID.

type (required)

Path Parameter — The type of the log to retrieve, which is one of 'stdout', 'stderr', or 'launcherlog'.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Query parameters

lastlines (optional)

Query Parameter — The number of lines to retrieve from the end of the log. Specify a positive integer. The default value is 10.

Return type

String

Example data

Content-Type:

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

text/plain

Responses

200

Successful response, with the log returned. String

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

404

The requested instance group or application ID is not found.

500

An unexpected error occurred.
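The path and query parameters for this endpoint can be validated before a request is sent. The following Python sketch is one way to do that; the function name is hypothetical, and percent-encoding the application ID is a defensive assumption rather than a documented requirement.

```python
from urllib.parse import quote

# Log types listed for the {type} path parameter.
LOG_TYPES = ("stdout", "stderr", "launcherlog")


def driver_log_path(appid, log_type, lastlines=None):
    """Return the relative path (plus query string) for a driver log request."""
    if log_type not in LOG_TYPES:
        raise ValueError(f"type must be one of {LOG_TYPES}")
    path = f"/scheduler/applications/{quote(appid, safe='')}/driver/logs/{log_type}"
    if lastlines is None:
        return path  # the documented default of 10 lines applies
    if not isinstance(lastlines, int) or lastlines <= 0:
        raise ValueError("lastlines must be a positive integer")
    return f"{path}?lastlines={lastlines}"


# "my-app-id" is a made-up application ID.
print(driver_log_path("my-app-id", "stderr", lastlines=50))
```

The returned path is relative to the base path /platform/rest/deeplearning/v1.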

get /scheduler/applications/{appid}/executor/{executorid}/logs/{type}

Retrieves the latest lines of the executor log for an MSD application. (getExecutorLog)

Retrieves the latest lines of the executor log for an MSD application.

Path parameters

appid (required)

Path Parameter — The MSD application ID.

executorid (required)

Path Parameter — The executor ID for the MSD application.

type (required)

Path Parameter — The type of the log to retrieve, which is one of 'stdout', 'stderr', or 'launcherlog'.

Consumes

This API call consumes the following media types via the Content-Type request header:

application/json

Query parameters

lastlines (optional)

Query Parameter — The number of lines to retrieve from the end of the log. Specify a positive integer. The default value is 10.

Return type

String

Example data

Content-Type:

Produces

This API call produces the following media types according to the request header; the media type will be conveyed by the Content-Type response header.

text/plain

Responses

200

Successful response, with the log returned. String

401

Authentication error. The request was denied.

403

Forbidden. The request was denied.

404

The requested instance group or application ID is not found.

500

An unexpected error occurred.
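Because the log endpoints return plain text on success and a status code on failure, a client typically separates the two cases. The following Python sketch shows one possible approach. The exception class and function name are hypothetical, and the messages are copied from the response list above.

```python
# Status codes and meanings as listed for the log endpoints in this reference.
LOG_RESPONSES = {
    200: "Successful response, with the log returned.",
    401: "Authentication error. The request was denied.",
    403: "Forbidden. The request was denied.",
    404: "The requested instance group or application ID is not found.",
    500: "An unexpected error occurred.",
}


class LogRequestError(Exception):
    """Hypothetical exception type raised for non-200 responses."""


def log_text(status, body):
    """Return the log text for a 200 response; raise for anything else."""
    if status == 200:
        return body
    reason = LOG_RESPONSES.get(status, "Undocumented status code.")
    raise LogRequestError(f"{status}: {reason}")
```

The status and body arguments are whatever the HTTP client in use reports for the response.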

Models

[ Jump to Methods ]

ApplicationDetail -
ApplicationStatistic -
Attr -
Batch -
Children -
CreationResponse -
DLFramework -
EDIModelDescription -
Envs -
EventData -
EventDataCreateParam -
HpoAlgorithmDesc -
HpoExperiment -
HpoHyperParameter -
HpoMetric -
HpoTaskDetail -
HpoTaskInput -
HpoTaskState -
Labels_Map -
Resplan -
Spec -
StringMap -
TreeDto -
V1Resplan -
algoDef -
algoParams -
fixedHyperParam -
resDef -
searchExperiment -
searchGrid -

`ApplicationDetail` - Up

applicationid (optional)

String The application ID.

applicationname (optional)

String The application name.

type (optional)

String The application type.

Enum:

BATCH

NOTEBOOK

state (optional)

String The application state.

slots (optional)

Integer The number of slots that are used by the application.

demandslots (optional)

Integer The number of slots that the application demands.

hosts (optional)

Integer The number of hosts on which the application runs.

coresused (optional)

Double The number of CPU cores that are allocated to the application. format: double

memused (optional)

Double The amount of memory, in MB, that is used by the application. format: double

username (optional)

String The user that executes the application.

timestamp (optional)

Long The timestamp at which the application was updated. format: int64

submittedtime (optional)

Long The application submitted time. format: int64

schedulerUrl (optional)

String The scheduler URL.

starttime (optional)

Long The application start time. format: int64

endtime (optional)

Long The application finish time. format: int64

apprunduration (optional)

Double The application run duration. format: double

model (optional)

String The deep learning model name.

dataset (optional)

String The deep learning dataset name.

dltype (optional)

String The deep learning framework name (e.g., Caffe or TensorFlow).

tunningname (optional)

String The deep learning tuning name.

appReason (optional)

String The application reason.

appFailureDetail (optional)

String The application failure detail.
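For clients that prefer typed objects over raw dictionaries, an ApplicationDetail response can be mapped onto a small data class. The following Python sketch models only a subset of the fields listed above and ignores the rest. The class layout and helper names are illustrative, not part of the API.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ApplicationDetail:
    # A subset of the documented fields; every field is optional.
    applicationid: Optional[str] = None
    applicationname: Optional[str] = None
    type: Optional[str] = None
    state: Optional[str] = None
    slots: Optional[int] = None
    demandslots: Optional[int] = None
    memused: Optional[float] = None
    username: Optional[str] = None

    @classmethod
    def from_dict(cls, data):
        # Ignore keys that this sketch does not model.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})


def parse_applications(body):
    """Parse a JSON array of ApplicationDetail objects."""
    return [ApplicationDetail.from_dict(item) for item in json.loads(body)]
```

Fields that are absent from a response are left as None, which matches the optional status of every field in the model.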

`ApplicationStatistic` - Up

username (optional)

String The user that executes the application.

cpuReq (optional)

Double The requested CPU. format: double

cpuUsed (optional)

Double The used CPU. format: double

gpuReq (optional)

Double The requested GPU. format: double

gpuUsed (optional)

Double The used GPU. format: double

jobPending (optional)

Integer The number of pending jobs.

jobRunning (optional)

Integer The number of running jobs.
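Each ApplicationStatistic entry describes one user, so a cluster-wide view requires totalling the entries. The following Python sketch sums a few of the fields. The function name is hypothetical, and the reference does not state the units of the CPU figures, so the totals are in whatever unit the service reports.

```python
def summarize(stats):
    """Total the job counts and CPU figures in a list of ApplicationStatistic dicts."""
    totals = {"jobPending": 0, "jobRunning": 0, "cpuReq": 0.0, "cpuUsed": 0.0}
    for entry in stats:
        for key in totals:
            # Every field is optional, so treat missing or null values as zero.
            totals[key] += entry.get(key) or 0
    return totals
```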

`Attr` - Up

name

String The name.

value

String The value.

`Batch` - Up

creator (optional)

String Name of the user who started the task.

id (optional)

String ID of the batch.

args (optional)

String Arguments to the task.

submissionId (optional)

String ID of the task.

workDir (optional)

String The work directory of the task.

appName (optional)

String The application name of the task.

events (optional)

String The events of the task.

appId (optional)

String ID of the application.

schedulerUrl (optional)

String URL of the scheduler.

state (optional)

String The batch state.

`Children` - Up

name (optional)

String The children names.

usedcpu (optional)

String The used CPU.

usedgpu (optional)

String The used GPU.

requestcpu (optional)

String The requested CPU.

requestgpu (optional)

String The requested GPU.

`CreationResponse` - Up

uid (optional)

String Unique ID of the object.

href (optional)

String Relative endpoint URL of the corresponding object.

`DLFramework` - Up

name (optional)

String Name of the deep learning framework.

desc (optional)

array[String] Description of the deep learning framework.

frameworkVersion (optional)

String The framework version.

distributeStrategy (optional)

String The distributed strategies.

Enum:

MultiWorkerMirroredStrategy

ParameterServerStrategy

description (optional)

String Description.

numPs (optional)

Integer Number of parameter server workers.

`EDIModelDescription` - Up

name (optional)

String The name of the Elastic Distributed Inference model.

tag (optional)

String The information tag for the model.

runtime (optional)

String The runtime to load and run the model.

weight (optional)

String The relative weight path within the training model directory.

kernel (optional)

String The relative kernel file path within the training model directory.

attributes (optional)

array[Attr] Key-value attributes that can be accessed in the model kernel.

environments (optional)

array[Envs] Additional environment variables for running the model kernel.

`Envs` - Up

name

String The name.

value

String The value.

`EventData` - Up

id (optional)

String The event id.

eventTimeGMT (optional)

Long The time that the event was logged. format: int64

eventName (optional)

String The event name.

eventLevel (optional)

String Event level.

Enum:

Info

Warning

Error

eventDetails (optional)

String The details of the event.

entityType (optional)

String The entity type.

entityName (optional)

String The entity name.

entityJobId (optional)

String Job ID related to the event.

entityOwner (optional)

String Owner of the event entity.

entityConsumer (optional)

String Consumer of the event entity.

eventSrc (optional)

String Source of the event.

`EventDataCreateParam` - Up

eventName (optional)

String The event name.

eventLevel (optional)

String Event level.

Enum:

Info

Warning

Error

eventDetails (optional)

String The details of the event.

entityType (optional)

String The entity type.

entityName (optional)

String The entity name.

entityJobId (optional)

String Job ID related to the event.

eventSrc (optional)

String Source of the event.
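A request body that follows EventDataCreateParam can be assembled and checked against the eventLevel enum before it is sent. The following Python sketch shows one way to do this. The function name and all sample values are made up for illustration.

```python
import json

# Enum values and optional field names from the model above.
EVENT_LEVELS = ("Info", "Warning", "Error")
OPTIONAL_FIELDS = ("eventDetails", "entityType", "entityName", "entityJobId", "eventSrc")


def event_payload(event_name, event_level, **optional):
    """Build an EventDataCreateParam dict, checking eventLevel against the enum."""
    if event_level not in EVENT_LEVELS:
        raise ValueError(f"eventLevel must be one of {EVENT_LEVELS}")
    unknown = set(optional) - set(OPTIONAL_FIELDS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return {"eventName": event_name, "eventLevel": event_level, **optional}


# Hypothetical event; none of these values come from the reference.
payload = event_payload("TrainingStalled", "Warning", entityJobId="job-42")
print(json.dumps(payload))
```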

`HpoAlgorithmDesc` - Up

name (optional)

String The HPO algorithm name.

path (optional)

String The HPO algorithm installation path (plugin algorithms only).

condaHome (optional)

String The CONDA_HOME used to run the HPO algorithm (plugin algorithms only).

condaEnv (optional)

String The conda environment used to run the HPO algorithm (plugin algorithms only).

creator (optional)

String The creator of the HPO algorithm (plugin algorithms only).

createtime (optional)

String The creation time of the HPO algorithm (plugin algorithms only).

type (optional)

String The type of the HPO algorithm.

remoteExec (optional)

Boolean Whether the plugin algorithm runs in remote execution mode (plugin algorithms only).

logLevel (optional)

String The log level for the plugin algorithm.

`HpoExperiment` - Up

id (optional)

Integer Experiment ID.

state (optional)

String Experiment state.

metricVal (optional)

Double Metric value. format: double

metrics (optional)

array[HpoMetric]

maxiteration (optional)

Integer Maximum iteration.

hyperParams (optional)

array[HpoHyperParameter]

appId (optional)

String Application ID.

startTime (optional)

String The start time of experiment training.

endTime (optional)

String The end time of experiment training.

`HpoHyperParameter` - Up

name

String Hyperparameter name. The same name is used in config.json so that the user model can load it.

type

String One of range, discrete, fix.

Enum:

range

discrete

fix

dataType

String One of int, double, str.

Enum:

int

double

str

minDbVal (optional)

Double Minimum double value if type=range and dataType=double. format: double

maxDbVal (optional)

Double Maximum double value if type=range and dataType=double. format: double

minIntVal (optional)

Integer Minimum int value if type=range and dataType=int.

maxIntVal (optional)

Integer Maximum int value if type=range and dataType=int.

discreteDbVal (optional)

array[Double] Double list like [0.1,0.2] if type=discrete and dataType=double. format: double

discreteIntVal (optional)

array[Integer] Int list like [1,2] if type=discrete and dataType=int.

discreateStrVal (optional)

array[String] String list like ['1','2'] if type=discrete and dataType=str.

userDefined (optional)

Boolean Whether the parameter is user defined.

fixedVal (optional)

String Fixed hyperparameter value.

step (optional)

String A numeric value in string format: the step size used to split the range space. Valid only when type is range.

power (optional)

String A numeric value in string format: the base value for the power calculation. Valid only when type is range.
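Which value fields apply depends on the combination of type and dataType, so small constructor helpers can keep the field names straight. The following Python sketch covers the range and discrete types. The helper names are hypothetical, and restricting range parameters to int and double is inferred from the available min/max fields.

```python
# Field that carries the value list for each dataType when type=discrete.
DISCRETE_KEYS = {
    "int": "discreteIntVal",
    "double": "discreteDbVal",
    "str": "discreateStrVal",  # spelled as in the reference
}


def range_param(name, data_type, low, high, step=None):
    """Build a type=range HpoHyperParameter dict for int or double data."""
    if data_type == "int":
        bounds = {"minIntVal": low, "maxIntVal": high}
    elif data_type == "double":
        bounds = {"minDbVal": low, "maxDbVal": high}
    else:
        raise ValueError("range parameters need dataType int or double")
    if low > high:
        raise ValueError("low must not exceed high")
    param = {"name": name, "type": "range", "dataType": data_type, **bounds}
    if step is not None:
        param["step"] = str(step)  # step is a number in string format
    return param


def discrete_param(name, data_type, values):
    """Build a type=discrete HpoHyperParameter dict."""
    if data_type not in DISCRETE_KEYS:
        raise ValueError("dataType must be one of int, double, str")
    key = DISCRETE_KEYS[data_type]
    return {"name": name, "type": "discrete", "dataType": data_type, key: list(values)}
```

For example, range_param("learning_rate", "double", 0.001, 0.1) and discrete_param("optimizer", "str", ["sgd", "adam"]) produce two entries for a hyperParams array; both parameter names are made up.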

`HpoMetric` - Up

name (optional)

String Metric name.

min (optional)

Double Minimal value of the metric. format: double

max (optional)

Double Maximal value of the metric. format: double

latest (optional)

Double Latest value of the metric. format: double

`HpoTaskDetail` - Up

input (optional)

HpoTaskInput

searchGrid (optional)

searchGrid

state (optional)

String The tuning status.

creator (optional)

String The user who created the tuning task.

createtime (optional)

String The time the tuning task was created.

`HpoTaskInput` - Up

modelName

String The deep learning model name.

hpoName

String The deep learning tuning name.

hyperParams (optional)

array[HpoHyperParameter] The deep learning tuning hyperparameters.

resDef

algoDef

experiments (optional)

array[searchExperiment] Valid only for the ExperimentGridSearch algorithm, which submits training jobs with this list of experiments.
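In this model, the fields that are not marked optional (modelName, hpoName, resDef, and algoDef) are required. The following Python sketch checks a candidate request body for them before submission. The helper name and every sample value are hypothetical.

```python
import json

# Fields of HpoTaskInput that are not marked "(optional)" in this reference.
REQUIRED_FIELDS = ("modelName", "hpoName", "resDef", "algoDef")


def missing_fields(task_input):
    """Return the required HpoTaskInput fields that are absent from the dict."""
    return [name for name in REQUIRED_FIELDS if name not in task_input]


task_input = {
    "modelName": "my-model",      # made-up model name
    "hpoName": "my-tuning-task",  # made-up tuning name
    "resDef": {"framework": "PyTorch", "maxiteration": "100", "batchsize": 32},
    "algoDef": {"algorithm": "Random"},
}

if not missing_fields(task_input):
    print(json.dumps(task_input, indent=2))
```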

`HpoTaskState` - Up

hpoName (optional)

String The name of the hyperparameter optimization (HPO) task.

state (optional)

String The state of the HPO task.

running (optional)

Integer The number of running tasks of HPO.

complete (optional)

Integer The number of all complete tasks of HPO.

failed (optional)

Integer The number of failed tasks of HPO.

progress (optional)

String The progress of the HPO task.

duration (optional)

String The duration of the HPO task.

creator (optional)

String The creator of the HPO task.

createtime (optional)

String The creation time of the HPO task.

best (optional)

HpoExperiment The best hyperparameters of the HPO task.

experiments (optional)

array[HpoExperiment] All experiments of HPO.

`Labels_Map` - Up

key (optional)

String The key.

value (optional)

String The value.

`Resplan` - Up

name (optional)

String The resource plan name.

parent (optional)

String The parent resource plan.

labels (optional)

array[Labels_Map] The map of labels.

`Spec` - Up

parent (optional)

String The parent of the resource plan.

parentNamespace (optional)

String The parent namespace.

children (optional)

array[Children] The children.

`StringMap` - Up

example_key (optional)

String

`TreeDto` - Up

id (optional)

String The resource plan ID.

name (optional)

String Resource plan name.

pid (optional)

String The parent ID of the resource plan.

path (optional)

String The resource plan path.

isParent (optional)

String Whether the resource plan is a parent.

V1Resplan (optional)

V1Resplan

childTreeDto (optional)

array[TreeDto] The children trees of the resource plan.
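Because childTreeDto nests TreeDto objects inside one another, a resource plan tree is most easily processed recursively. The following Python sketch walks a TreeDto dictionary depth-first. The function name is hypothetical, and the sketch assumes that a leaf either omits childTreeDto or sets it to null or an empty list.

```python
def iter_tree(node, depth=0):
    """Yield (depth, name, path) for a TreeDto dict and all of its descendants."""
    yield depth, node.get("name"), node.get("path")
    # Treat a missing or null childTreeDto as "no children".
    for child in node.get("childTreeDto") or []:
        yield from iter_tree(child, depth + 1)
```

Indenting each name by its depth gives a quick text rendering of the resource plan hierarchy.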

`V1Resplan` - Up

usedcpu (optional)

String The used CPU.

usedgpu (optional)

String The used GPU.

requestcpu (optional)

String The requested CPU.

requestgpu (optional)

String The requested GPU.

path (optional)

String Used to submit WMLA jobs and to show the consumer.

`algoDef` - Up

Algorithm definition.

algorithm

String The tuning algorithm. It can be a built-in algorithm, such as Random, Tpe, Hyperband, or ExperimentGridSearch, or a user-installed algorithm.

maxRunTime (optional)

Integer Maximum running time of the HPO task in minutes. The default is -1 (unlimited).

maxJobNum (optional)

Integer Maximum number of training jobs to submit for the HPO task. The default is -1 (unlimited).

maxParalleJobNum (optional)

Integer Maximum number of training jobs to run in parallel. The default is 1.

hyperbandEta (optional)

Double Hyperband eta value. format: double

objective (optional)

String Optimization policy, one of Maximize or Minimize.

Enum:

Maximize

Minimize

algoParams (optional)

array[algoParams] Additional algorithm parameters, which can differ for each algorithm.
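An algoDef object can be built with the documented defaults filled in explicitly. The following Python sketch is one way to do that. The function and argument names are hypothetical, and converting algoParams values to strings follows the String type given for the value field.

```python
# Allowed values of the objective enum.
OBJECTIVES = ("Maximize", "Minimize")


def algo_def(algorithm, objective=None, max_run_time=-1, max_job_num=-1,
             max_parallel=1, params=None):
    """Build an algoDef dict; the defaults mirror the documented ones."""
    if objective is not None and objective not in OBJECTIVES:
        raise ValueError(f"objective must be one of {OBJECTIVES}")
    definition = {
        "algorithm": algorithm,
        "maxRunTime": max_run_time,        # minutes; -1 means unlimited
        "maxJobNum": max_job_num,          # -1 means unlimited
        "maxParalleJobNum": max_parallel,  # spelled as in the reference
    }
    if objective is not None:
        definition["objective"] = objective
    if params:
        # Each algoParams entry is a name/value pair with a string value.
        definition["algoParams"] = [
            {"name": name, "value": str(value)} for name, value in params.items()
        ]
    return definition
```

For example, algo_def("Hyperband", objective="Minimize", max_parallel=4) describes a Hyperband search that runs up to four training jobs at a time.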

`algoParams` - Up

name

String Name of the search algorithm parameter.

value

String Value of the corresponding algorithm parameter.

`fixedHyperParam` - Up

name

String Hyperparameter name. The same name is used in config.json so that the user model can load it.

dataType

String One of int, double, str.

fixedVal

String The fixed value, which must match the specified dataType. For example, if dataType=double, fixedVal must be a double.

`resDef` - Up

The deep learning tuning resource definition.

framework

String The deep learning framework.

Enum:

Caffe

TensorFlow

PyTorch

PyTorchOnElastic

initWeightPath (optional)

String Weight file path.

maxiteration

String The maximum iteration count.

miniteration (optional)

String The minimum iteration count. Optional for Hyperband; the default value is 1.

batchsize

Integer The batch size tuning parameter.

gpuNum (optional)

Integer The number of GPUs.

workerNum (optional)

Integer The number of workers in the cluster.

distribute (optional)

Boolean Whether to use distributed mode.

syncMode (optional)

String The gradient synchronization mode in elastic distributed training. This parameter specifies whether the training is synchronous or asynchronous.

Enum:

SYNC

ASYNC

resourceInstanceId (optional)

String Instance group ID.
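The required fields and enum values of resDef lend themselves to a simple client-side check. The following Python sketch reports the problems it finds in a resDef dictionary. The function name is hypothetical, and the check covers only the constraints that this reference states.

```python
# Enum values listed for the framework and syncMode fields.
FRAMEWORKS = ("Caffe", "TensorFlow", "PyTorch", "PyTorchOnElastic")
SYNC_MODES = ("SYNC", "ASYNC")


def check_res_def(res_def):
    """Return a list of problems found in a resDef dict (empty if none)."""
    problems = []
    # framework, maxiteration, and batchsize are not marked optional.
    for name in ("framework", "maxiteration", "batchsize"):
        if name not in res_def:
            problems.append(f"missing required field: {name}")
    if "framework" in res_def and res_def["framework"] not in FRAMEWORKS:
        problems.append(f"unknown framework: {res_def['framework']}")
    if "syncMode" in res_def and res_def["syncMode"] not in SYNC_MODES:
        problems.append(f"unknown syncMode: {res_def['syncMode']}")
    return problems
```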

`searchExperiment` - Up

Integer Experiment ID.

fixedHyperParams

array[fixedHyperParam] List of hyperparameters used in this experiment training.
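For the ExperimentGridSearch algorithm, the experiments list can be generated as the Cartesian product of candidate values. The following Python sketch illustrates this under two assumptions that the reference does not confirm: the experiment identifier field is named id (the model above gives only its type and description), and fixedVal is serialized as a string because its type is listed as String. The function name and the grid layout are hypothetical.

```python
from itertools import product


def grid_experiments(grid):
    """Expand {name: (dataType, [values])} into a list of searchExperiment dicts."""
    names = list(grid)
    value_lists = [grid[name][1] for name in names]
    experiments = []
    # product() yields one tuple per combination, in input order.
    for index, combo in enumerate(product(*value_lists)):
        fixed = [
            {"name": name, "dataType": grid[name][0], "fixedVal": str(value)}
            for name, value in zip(names, combo)
        ]
        # "id" is an assumed field name, not confirmed by the reference.
        experiments.append({"id": index, "fixedHyperParams": fixed})
    return experiments
```

A grid with two learning rates and two batch sizes, for example, expands to four experiments.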

`searchGrid` - Up

experiments (optional)

array[HpoExperiment]

best (optional)

HpoExperiment

running (optional)

Integer The total number of parallel running jobs.

complete (optional)

Integer The number of completed jobs.

failed (optional)

Integer The number of failed jobs.

progress (optional)

String The progress.

duration (optional)

String Run duration of this task.
