Analytics projects (Watson Studio and Watson Knowledge Catalog)

An analytics project is how you organize your resources to work with data. If you have the Watson Studio service, you can prepare data, analyze data, and build models. The tools that are available to you depend on which additional services that supplement Watson Studio, such as Watson Machine Learning, are installed.

If you don’t have the Watson Studio service, you can prepare data with the Watson Knowledge Catalog service, stream data with the Streams flow service, or create dashboards with the Cognos Dashboards service.

Your project can include these types of resources:

You can customize projects to suit your goals. You can change the contents of your project and almost all of its properties at any time. However, you must make these choices when you create the project because you can’t change them later:

Collaboration in projects

As a project creator, you can add other collaborators and assign them roles that control which actions they can take. You automatically have the Admin role in the project, and if you give other collaborators the Admin role, they can add collaborators too. See Adding collaborators and Project collaborator roles.

Collaboration on assets

For all tools in projects except the JupyterLab IDE and RStudio with Git integration, assets are locked during editing to prevent conflicts between changes made by different collaborators.

All collaborators work with the same copy of each asset. Only one collaborator can edit an asset at a time. While a collaborator is editing an asset in a tool, that asset is locked. Other collaborators can view a locked asset, but not edit it. See Managing assets.
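
The locking behavior described above can be pictured with a small model. This is only a conceptual sketch, not a Watson Studio API: the `Asset` class, the function names, and the collaborator names are all hypothetical and exist purely to illustrate that one collaborator edits at a time while others can still view.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Asset:
    """Hypothetical stand-in for a project asset; not a Watson Studio API."""
    name: str
    content: str = ""
    locked_by: Optional[str] = None  # collaborator who is editing, if any


def start_editing(asset: Asset, user: str) -> bool:
    """Try to lock the asset for `user`; return True if `user` holds the lock."""
    if asset.locked_by is None or asset.locked_by == user:
        asset.locked_by = user
        return True
    return False


def view(asset: Asset) -> str:
    """Viewing never requires the lock."""
    return asset.content


def save_and_close(asset: Asset, user: str, new_content: str) -> None:
    """Write changes and release the lock; only the lock holder may do this."""
    if asset.locked_by != user:
        raise PermissionError(f"{asset.name} is not locked by {user}")
    asset.content = new_content
    asset.locked_by = None


notebook = Asset("churn-analysis")
alice_can_edit = start_editing(notebook, "alice")    # alice acquires the lock
bob_can_edit = start_editing(notebook, "bob")        # refused while alice edits
bob_view = view(notebook)                            # viewing is still allowed
save_and_close(notebook, "alice", "updated cells")   # releases the lock
bob_can_edit_later = start_editing(notebook, "bob")  # now succeeds
```

Because every collaborator works with the same copy of the asset, the lock lives on the asset itself in this sketch rather than on a per-user copy.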

Collaboration in the JupyterLab IDE

The JupyterLab IDE uses the version control features of a Git repository instead of locking. When you create a project, you have options to synchronize the project with a Git repository and enable collaborators to use JupyterLab. When you select the JupyterLab option, project collaborators can edit notebooks only in JupyterLab and the standard Jupyter notebook editor is disabled.

The project shows the contents of the branch that you specified when you created the project. Each collaborator must clone the repository to work on notebooks, scripts, or other files independently and simultaneously. To view or schedule jobs for updated assets in the project, collaborators must push their changes to the project branch and then pull the updated assets into the project. Collaborators can use the Git functionality in JupyterLab to work with different branches and handle any merge conflicts. They can push their changes to the project branch either directly from JupyterLab or through Git, for example, by creating a pull request.
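
As a rough illustration of the Git side of this workflow, the following sketch lists the standard Git commands that one round of independent work might involve when a collaborator works on a separate branch. The repository URL, branch names, and commit message are placeholders, and the helper function `git_workflow` is hypothetical. The sketch only builds and prints the commands; it does not run them.

```python
import shlex
from typing import List


def git_workflow(repo_url: str, project_branch: str, work_branch: str,
                 message: str) -> List[List[str]]:
    """Return the git commands, in order, for one round of independent work.

    Every command after `git clone` is meant to be run inside the cloned
    repository directory.
    """
    return [
        # Get a personal working copy of the repository.
        ["git", "clone", repo_url],
        # Start a work branch from the branch that the project shows.
        ["git", "checkout", "-b", work_branch, f"origin/{project_branch}"],
        # Stage and commit the edited notebooks, scripts, or other files.
        ["git", "add", "--all"],
        ["git", "commit", "-m", message],
        # Publish the work branch so that it can be merged into the
        # project branch, for example through a pull request.
        ["git", "push", "--set-upstream", "origin", work_branch],
    ]


commands = git_workflow(
    repo_url="https://example.com/team/project.git",  # placeholder URL
    project_branch="main",                            # assumed branch name
    work_branch="feature/clean-data",                 # assumed branch name
    message="Clean input data",
)
for command in commands:
    print(shlex.join(command))
```

After the work branch is merged into the project branch, the updated assets still have to be pulled into the project before they can be viewed there or scheduled in jobs.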

See JupyterLab.

Data assets

You can add these types of data assets to projects:

See Adding data. For some formats of relational or tabular data, you can preview and profile the data when you open the asset.

Operational assets

Operational assets are how you work with data in tools that prepare data, analyze data, or build models. Most types of operational assets have a specific tool that you use to create and edit them. For notebooks, you have a choice of editors.

With Watson Studio, you can create these types of operational assets without additional services:

With Watson Studio, some operational assets require extra services. If your administrator installed the services, you can add these assets:

If you do not have Watson Studio, the operational assets you can create depend on the service:

Environments

Environments control your compute resources. An environment definition specifies hardware and software resources to instantiate the environment runtimes that run your operational assets in tools.

Some types of operational assets have an automatically selected environment definition. However, for some types of operational assets, you can choose between multiple environments when you create an asset and when you run it. Watson Studio includes a set of default environment definitions that vary by coding language, tool, and compute engine type. You can also create custom environment definitions or add services that provide environment definitions. For example, your administrator can install the IBM Analytics Engine powered by Apache Spark service to provide Spark environments.

See Environments.

Jobs

A job is a single run of an operational asset with a specified environment runtime. You can schedule a job to run once or repeatedly, and you can monitor, edit, stop, or cancel jobs. See Jobs.

Project storage

The project file storage uses the storage class associated with Watson Studio.

When you delete a project, its file storage is also deleted.

Additional services

Cloud Pak for Data administrators can install more services to add tools or compute environments to Watson Studio.

Integrations with external tools

Integrations provide a method to interact with tools that are external to the project.

You can integrate with a Git repository to export the project, to work with documents and notebooks in JupyterLab, or to back up the project for source code management purposes.

Project documentation and notifications

While you create a project, you can add a short description to document the purpose or goal of the project. You can edit the description later, on the project’s Settings page.

You can mark the project as sensitive. When users open a project that is marked as sensitive, a notification is displayed stating that no data assets can be downloaded or exported from the project.

You can choose to log all project activities. Logging all project activities tracks detailed project activity and creates a full activities log, which you can download to view.

You can change these settings at any time for the project by changing the state of the toggle button on the Settings page.

All collaborators in a project are notified when a collaborator changes an asset or adds a comment to a notebook.

Catalog integration

A catalog is a central repository for assets where you can easily find and share data and other assets. Before you can access a catalog, a catalog administrator must add you as a catalog collaborator. A catalog has the same types of roles as a project. With any catalog role, you can copy assets from the catalog into a project to use them. With the Editor or Admin role in the catalog, you can create assets in a project and then publish them into the catalog.
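
The role rules above amount to a simple permission check, sketched here as a conceptual model rather than a Watson Knowledge Catalog API. The `CatalogRole` enum and both functions are hypothetical. The text names only the Admin and Editor roles, so the Viewer role is an assumption that stands in for "any other catalog role".

```python
from enum import Enum


class CatalogRole(Enum):
    """Hypothetical catalog roles; only Admin and Editor are named in the text."""
    ADMIN = "Admin"
    EDITOR = "Editor"
    VIEWER = "Viewer"  # assumption: represents any other catalog role


def can_copy_to_project(role: CatalogRole) -> bool:
    """Any catalog role can copy assets from the catalog into a project."""
    return isinstance(role, CatalogRole)


def can_publish_to_catalog(role: CatalogRole) -> bool:
    """Only the Editor and Admin roles can publish project assets to the catalog."""
    return role in {CatalogRole.ADMIN, CatalogRole.EDITOR}


permissions = {
    role.value: (can_copy_to_project(role), can_publish_to_catalog(role))
    for role in CatalogRole
}
for name, (copy_ok, publish_ok) in permissions.items():
    print(f"{name}: copy={copy_ok}, publish={publish_ok}")
```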

Learn more