Analyzing with Python
You can run Python scripts that retrieve data from and write data to a Db2® database.
Such scripts can be powerful tools to help you analyze your data. For example, you can use them to generate statistical models based on data in your database, and to plot the results of these models. Scripts can be run directly from within a Python execution environment, or from within a Jupyter notebook.
In-database analytics using Python
To process data, most native Python functions require that the data first is extracted from a database to working memory. Such a function is called an in-application function. A different type of function, called an in-database function, operates directly on data in a database, without requiring the data to be extracted. Consequently, you can use an in-database function to analyze large amounts of data that would be impractical or impossible to extract.
In-database functions can use performance-enhancing features of the underlying database management system, such as columnar technology. Using in-database functions also avoids security issues that are associated with extracting data and ensures that the data that is being analyzed is as current as possible. Some in-database functions additionally use lazy loading to load only those parts of the data that are actually required, to further increase efficiency.
Both in-application and in-database functions are equally easy to use. An in-application function operates on a construct called a data frame, which is a container that holds, in memory, a copy of the data to be processed. An in-database function operates on a similar construct called an IDA data frame. An IDA data frame does not hold any data directly. Instead, it holds a reference to a table or view in the database or to a selection of rows and columns within that table or view. When a function or method is applied to an IDA data frame, it is usually not run in the application, but is translated into an SQL query. The query is then run against the database, and the result is translated into a Python object.