Dremio with IBM Cloud Data Lake

IBM Cloud provides a cloud-native data lake platform that is unique in its commitment to a serverless consumption model.

You can ingest, store, prepare, optimize and analyze your data in a serverless manner that scales resources transparently and offers a fair pay-as-you-go cost model end to end. We gave an introduction at the SubSurface WINTER 2021 Cloud Data Lake conference a few weeks ago.

Dremio connector to IBM Cloud Data Lake

Dremio is a very popular and open data lake engine to interactively explore, curate and consume data lake data, so it makes sense to use Dremio with IBM’s Cloud Data Lake platform.

We are happy to announce the availability of a new Dremio connector to IBM Cloud Data Lake. It enables Dremio to connect to IBM Cloud Data Lake services, push down SQL operations to them and retrieve the results for further processing in the Dremio engine. In keeping with our commitment to open stacks, we have also made the connector itself available as open source.
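To make the idea of SQL pushdown concrete, here is a minimal sketch in Python. It is a toy illustration only and not the connector's implementation: an in-memory SQLite database stands in for the remote data lake service, and the table `orders` and the functions `make_source`, `without_pushdown` and `with_pushdown` are hypothetical names invented for this example. It compares fetching all rows and aggregating in the engine with letting the source run the filter and aggregation.

```python
import sqlite3


def make_source():
    """Stand-in for a remote SQL service (assumption: in-memory SQLite)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("EU", 10.0), ("EU", 5.0), ("US", 7.5), ("US", 2.5), ("APAC", 4.0)],
    )
    return conn


def without_pushdown(conn):
    """Transfer every row, then filter and aggregate on the engine side."""
    rows = conn.execute("SELECT region, amount FROM orders").fetchall()
    totals = {}
    for region, amount in rows:
        if region != "APAC":
            totals[region] = totals.get(region, 0.0) + amount
    return len(rows), totals


def with_pushdown(conn):
    """Let the source filter and aggregate; transfer only the result rows."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM orders "
        "WHERE region <> 'APAC' GROUP BY region"
    ).fetchall()
    return len(rows), dict(rows)


if __name__ == "__main__":
    source = make_source()
    print("rows transferred, totals:", without_pushdown(source))
    print("rows transferred, totals:", with_pushdown(source))
```

Both functions return the same totals, but the pushdown variant transfers only the aggregated rows instead of the full table. That reduction in data movement is the general motivation for pushing SQL operations down to the source.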

In the connector's documentation, you’ll find a few simple steps showing how you can optionally deploy and run Dremio itself in IBM Cloud on top of the IBM Cloud Kubernetes Service.

This way, you can build data lake solutions with an open, interactive user experience on top of a fully serverless data lake foundation that scales seamlessly and fairly along with your workload and data demands.

Learn more about IBM Cloud SQL Query.

Torsten Steinbach

Distinguished Engineer & CTO, Big Data in Cloud
