Question & Answer
Question
With the pandas to_parquet() method, we can easily convert a data set from CSV format to Parquet format. However, when I tried to use save_data() from the ibm_watson_studio_lib library, I got the following error:
RuntimeError: Argument "data" must be a bytes-like object. Use str.encode() to convert character data.
![image-20230311211836-1](/support/pages/system/files/inline-images/image-20230311211836-1.png)
Could you tell me how to store the converted Parquet file as a CP4D project asset?
Cause
With the following code, you created a binary Parquet file in the current working directory:
df_data_1.to_parquet("cars.parque",compression='gzip')
The error occurred because save_data() in the ibm_watson_studio_lib library expects a bytes-like object for the "data" argument. Because to_parquet() had already written the Parquet content to a file on disk, save_data() did not receive that content as in-memory bytes.
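To illustrate what "bytes-like" means in Python, here is a minimal sketch. The helper is_bytes_like is hypothetical and is not part of ibm_watson_studio_lib; it is only an assumption that save_data() performs a comparable check internally.

```python
def is_bytes_like(obj):
    """Return True if obj supports the buffer protocol (bytes, bytearray, memoryview, ...)."""
    try:
        memoryview(obj)
    except TypeError:
        return False
    return True


# A file name is character data (str), not a bytes-like object.
print(is_bytes_like("cars.parque"))
# DataFrame.to_parquet(path, ...) writes to disk and returns None, which is not bytes-like either.
print(is_bytes_like(None))
# Raw bytes, such as the content of a file opened in binary mode, are bytes-like.
print(is_bytes_like(b"PAR1"))
```

The first two calls print False and the last one prints True, which matches the kind of argument the error message asks for.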
Answer
Because the binary Parquet file already exists in the current working directory, use upload_file() from the ibm_watson_studio_lib library instead of save_data(). The following sample code converts a data set from CSV format to Parquet format and stores it as a project asset.
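import os
from ibm_watson_studio_lib import access_project_or_space
# Change to the directory where the Parquet file will be written
os.chdir("/project_data/data_asset/")
wslib = access_project_or_space()
# Write the DataFrame to a gzip-compressed Parquet file on disk
df_data_1.to_parquet("cars.parque", compression='gzip')
# Upload the file that now exists on disk as a project asset
wslib.upload_file("cars.parque")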
![image-20230311213633-1](/support/pages/system/files/inline-images/image-20230311213633-1.png)
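If you would rather keep using save_data(), one possible alternative is to serialize the DataFrame to bytes in memory instead of to a file. The sketch below is not from the original answer. It assumes pandas 1.2 or later, where to_parquet() called without a path returns bytes, and it assumes save_data() accepts the asset name, the data, and an overwrite keyword. The function name save_parquet_in_memory is hypothetical.

```python
def save_parquet_in_memory(df, wslib, asset_name="cars.parque"):
    """Serialize df to Parquet bytes and store them as a project asset.

    Assumptions: df is a pandas DataFrame (pandas 1.2 or later) and wslib is
    the object returned by access_project_or_space().
    """
    # With no path argument, to_parquet() returns the file content as bytes.
    parquet_bytes = df.to_parquet(compression="gzip")
    # Pass the bytes-like object to save_data(); no temporary file is needed.
    return wslib.save_data(asset_name, parquet_bytes, overwrite=True)


# Usage in a notebook (assumes df_data_1 already exists):
# from ibm_watson_studio_lib import access_project_or_space
# wslib = access_project_or_space()
# save_parquet_in_memory(df_data_1, wslib)
```

Passing wslib in as a parameter keeps the helper independent of how the project connection was created.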
Document Information
Modified date:
15 March 2023
UID
ibm16962995