Examples of customizations

The following examples show how to add custom libraries through conda or pip, using the provided templates for Python and R, when you create an environment template.

Mamba is available only if you are running on Cloud Pak for Data 4.5.1 or later.

You can use mamba in place of conda in the conda examples that follow. If you add channels or packages from mamba to an existing environment template, remember to select the checkbox to install from mamba.

Adding conda packages

To get the latest version of pandas-profiling:

dependencies:
  - pandas-profiling

This is equivalent to running conda install pandas-profiling in a notebook.

Adding pip packages

You can also customize an environment using pip if a particular package is not available in conda channels:

dependencies:
  - pip:
    - ibm-watson-machine-learning

This is equivalent to running pip install ibm-watson-machine-learning in a notebook.

The customization does more than install the specified pip package. By default, conda also looks for a new version of pip itself and installs it. Checking all the implicit dependencies in conda often takes several minutes and gigabytes of memory. The following customization skips the installation of pip:

channels:
  - empty
  - nodefaults

dependencies:
  - pip:
    - ibm-watson-machine-learning

The conda channel empty does not provide any packages and, in particular, no pip package. conda therefore doesn't try to install pip and uses the pre-installed version instead. Note that the keyword nodefaults needs at least one other channel in the list of channels. Otherwise, conda silently ignores the keyword and uses the default channels.
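The rule about nodefaults is easy to overlook because conda gives no warning. The following is a minimal sketch of a check you could run on a channel list before saving a customization. The function name check_channels is hypothetical and is not part of conda or Cloud Pak for Data.

```python
def check_channels(channels):
    """Return a list of warnings for the channel list of a customization."""
    warnings = []
    # Channels other than the nodefaults keyword itself
    others = [c for c in channels if c != "nodefaults"]
    if "nodefaults" in channels and not others:
        warnings.append(
            "nodefaults is the only entry; conda ignores it "
            "and uses the default channels"
        )
    return warnings


# The first list follows the rule, the second one does not
print(check_channels(["empty", "nodefaults"]))
print(check_channels(["nodefaults"]))
```

The check only inspects the list itself. It does not verify that the other channels exist or are reachable.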

Combining conda and pip packages

You can list multiple packages with one package per line. A single customization can have both conda packages and pip packages.

dependencies:
  - pandas-profiling
  - scikit-learn=0.20
  - pip:
    - watson-machine-learning-client-V4
    - sklearn-pandas==1.8.0

Note that the required template notation is sensitive to leading spaces. Each item in the list of conda packages must have two leading spaces. Each item in the list of pip packages must have four leading spaces. The version of a conda package must be specified using a single equals symbol (=), while the version of a pip package must be added using two equals symbols (==).
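If you generate customizations from a script, a small helper can apply the spacing and version rules for you. The sketch below assumes a hypothetical function, render_customization, that takes lists of (name, version) tuples. It is an illustration, not an official tool.

```python
def render_customization(conda_packages, pip_packages):
    """Build the dependencies section of a customization as text.

    Each argument is a list of (name, version) tuples; version may be None.
    """
    lines = ["dependencies:"]
    for name, version in conda_packages:
        # conda entries: two leading spaces, single "=" before the version
        spec = name if version is None else f"{name}={version}"
        lines.append(f"  - {spec}")
    if pip_packages:
        lines.append("  - pip:")
        for name, version in pip_packages:
            # pip entries: four leading spaces, "==" before the version
            spec = name if version is None else f"{name}=={version}"
            lines.append(f"    - {spec}")
    return "\n".join(lines)


text = render_customization(
    [("pandas-profiling", None), ("scikit-learn", "0.20")],
    [("watson-machine-learning-client-V4", None), ("sklearn-pandas", "1.8.0")],
)
print(text)
```

With these arguments, the printed text reproduces the combined example shown earlier in this section.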

Customize with pip-installed dependencies in an air-gapped system

If you want to customize an environment in an air-gapped system that has no access to a repository server either locally or on the internet, you can store the pip package in the project and specify the dependency using the prefix file:/. The custom channels: configuration can point to an empty local channel to avoid conda trying to fetch pip from an external repository.

channels:
  - file:///project_data/data_asset/empty_conda_channel
  - nodefaults

dependencies:
  - pip:
    - file:///project_data/data_asset/your-package-0.1.zip

An empty conda channel can be set up ad-hoc as:

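# Run in a Python notebook; lines that start with ! are shell commands
channel_dir = "/project_data/data_asset/empty_conda_channel"
!mkdir -p $channel_dir/noarch
# Write channel metadata that lists no packages
with open(channel_dir + "/noarch/repodata.json", "w") as f:
    f.write('{ "channeldata_version": 1, "packages": {}, "subdirs": ["noarch"] }')
# Keep the original file and add a compressed copy (repodata.json.bz2)
!bzip2 -k $channel_dir/noarch/repodata.json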

For more comprehensive descriptions of setting up local channels, see Creating custom channels and Configuring conda to use a file channel.
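If shell commands are not available in your notebook, the same empty channel can be created with the Python standard library alone. This is a minimal sketch that assumes the same repodata.json content as the notebook commands; the function name create_empty_channel is hypothetical.

```python
import bz2
import json
from pathlib import Path


def create_empty_channel(channel_dir):
    """Create a conda channel directory that offers no packages."""
    noarch = Path(channel_dir) / "noarch"
    noarch.mkdir(parents=True, exist_ok=True)
    # Channel metadata with an empty package list
    repodata = {"channeldata_version": 1, "packages": {}, "subdirs": ["noarch"]}
    raw = json.dumps(repodata).encode("utf-8")
    (noarch / "repodata.json").write_bytes(raw)
    # Like bzip2 -k: keep the original and add a compressed copy beside it
    (noarch / "repodata.json.bz2").write_bytes(bz2.compress(raw))
    return noarch
```

To match the customization in this section, you would call create_empty_channel("/project_data/data_asset/empty_conda_channel").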

Adding complex packages with internal dependencies

When you add many packages or a complex package with many internal dependencies, the conda installation might take a long time or might even stop without showing an error message. One way to avoid this is to use a specific channel instead of the default conda channels.

Example of a customization that doesn't use the default conda channels:

# get latest version of the prophet package from the conda-forge channel
channels:
  - conda-forge
  - nodefaults

dependencies:
  - prophet

This customization corresponds to the following command in a notebook:

!conda install -c conda-forge --override-channels prophet -y

Adding conda packages for R notebooks

The following example shows you how to create a customization that adds conda packages to use in an R notebook:

channels:
  - defaults

dependencies:
  - r-plotly

This customization corresponds to the following command in a notebook:

print(system("conda install r-plotly", intern=TRUE))

The names of R packages in conda generally start with the prefix r-. If you use just plotly in your customization, the installation succeeds, but the Python package is installed instead of the R package. Loading the package in your R code with library(plotly) then returns an error.

Best practices

To avoid problems that can arise finding packages or resolving conflicting dependencies, start by installing the packages you need manually through a notebook in a test environment. This enables you to check interactively if packages can be installed without errors. After you have verified that the packages were all correctly installed, create a customization for your development or production environment and add the packages to the customization template.
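After the manual installation succeeds in the test environment, you might want to record the versions that were installed so that the customization reproduces them. The following sketch shows one possible way to do this for pip packages. Pinning versions is an assumption of this example rather than a documented requirement, and the function name pinned_pip_lines is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_pip_lines(package_names):
    """Return pip entries for a customization, pinned to installed versions."""
    lines = []
    for name in package_names:
        try:
            # Four leading spaces and "==" as required for pip entries
            lines.append(f"    - {name}=={version(name)}")
        except PackageNotFoundError:
            # Not installed in this environment: leave the entry unpinned
            lines.append(f"    - {name}")
    return lines


# Print entries that can be pasted below "- pip:" in the template
print("\n".join(pinned_pip_lines(["ibm-watson-machine-learning"])))
```

The output depends on what is installed in the environment where you run it, so review the lines before you copy them into the template.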
