Author: Dataiku (Alex Bourret)
License: Apache Software License
How to set up
There are two methods for accessing your Google Drive documents: via sharing with a service account, or direct access using Google’s Single Sign On.
From the Google service account tab, create a new Google Cloud Platform project:
- Select a project > New Project and fill in the required information.
- Enable the Google Drive API for your project
Using a service account
If you choose the service account method for sharing your data, you now have to create a service account in your GCP project:
- In the service accounts tab, click Create service account
- Fill in the details and skip step 2 if you want
- On step 3, create a key using the JSON format
- Save the key and open it with a text editor. You can then press Done
- Write down the service account’s email address for later use
- Go to the plugin’s settings page (DSS > Plugins > Googledrive > Settings > Google Drive Token) and add a preset
- In the `Access credentials` box, paste the JSON key copied from your text editor
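Before pasting the key, it can help to confirm that the saved file really is a service account key and to read the email address from it. The following is a minimal sketch, not part of the plugin: it assumes the standard fields of a Google service account JSON key (`type`, `client_email`, `private_key`), and the function name and sample values are hypothetical.

```python
import json

# Fields assumed to be present in a Google service account JSON key.
REQUIRED_FIELDS = ("type", "client_email", "private_key")


def service_account_email(key_text: str) -> str:
    """Validate the pasted JSON key text and return the service account email."""
    key = json.loads(key_text)
    if not isinstance(key, dict):
        raise ValueError("The key must be a JSON object")
    missing = [field for field in REQUIRED_FIELDS if field not in key]
    if missing:
        raise ValueError(f"Missing fields in key: {', '.join(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"Unexpected key type: {key['type']!r}")
    return key["client_email"]


# Hypothetical key content, shortened for illustration only.
sample_key = json.dumps({
    "type": "service_account",
    "client_email": "my-service-account@my-project.iam.gserviceaccount.com",
    "private_key": "-----BEGIN PRIVATE KEY-----\n(placeholder)\n-----END PRIVATE KEY-----\n",
})

print(service_account_email(sample_key))
```

The returned address is the one to share your Google Drive documents or directories with later on.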
Using Google’s Single Sign On
- In the Credentials tab of your GCP project, create a new OAuth Client ID with the Desktop app application type
- Copy the client ID / client secret codes
- Create an OAuth consent screen. You will need to save the form, but the screen does not need to be verified at this stage.
- Go to the plugin’s settings page (DSS > Plugins > Googledrive > Settings > Google Single Sign On) and add a preset
- Paste the client ID and client secret copied in the second step
To use Single Sign On, each DSS user first has to enable it for their own account by going to their profile > Credentials > Name of the Google SSO preset > Edit icon. This redirects the user to Google’s SSO screen, where they can log in and grant DSS access to their data. Keep in mind that warning messages will appear for as long as the OAuth consent screen has not been verified.
How to use
- Share the Google Drive document or directory with the service account email address you took note of
- Take note of the directory ID, which is the part of the URL after the last slash character
- Create a new dataset in Dataiku DSS by selecting Dataset > Googledrive
- Click Googledrive Filesystem
- Pick your preset for the connection, and in ID of root directory, paste the previously copied directory ID
- Edit the `DATADIR/config/dip.properties` file and add the following key:
- Share an empty directory with the Google Drive service account. It is important that it does not contain data you want to keep: the entire structure contained inside this directory can be deleted by the plugin. Also for this reason, the plugin will refuse to write directly to the root directory.
- In the flow, first create your target Google Drive dataset, by selecting the Googledrive plugin in the dataset list.
- Browse to your target directory, name this new dataset and press create
- If the message `An invalid argument has been encountered : Missing parameters for CSV` appears, go to the dataset Settings > Format / Preview and set Quoting style
- Pick the source dataset and create a sync recipe from it. Select Use existing dataset and pick your target Google Drive dataset.
- Finally, click on Create recipe.
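The directory ID used in the ID of root directory field is described above as the part of the URL after the last slash. As a minimal sketch of that step, the Python function below extracts it from a folder URL. It assumes a URL of the form `https://drive.google.com/drive/folders/<ID>`, and it also drops any query string such as `?usp=sharing`, which goes slightly beyond the original wording. The function name and the example ID are hypothetical.

```python
from urllib.parse import urlparse


def directory_id_from_url(url: str) -> str:
    """Return the last path segment of a Google Drive folder URL."""
    # Keep only the path, so a query string like "?usp=sharing" is ignored.
    path = urlparse(url).path.rstrip("/")
    directory_id = path.rsplit("/", 1)[-1]
    if not directory_id:
        raise ValueError(f"No directory ID found in URL: {url!r}")
    return directory_id


# Hypothetical folder URL; the ID is made up for illustration.
example_url = "https://drive.google.com/drive/folders/1AbCdEfGhIjKlMnOpQrStUv?usp=sharing"
print(directory_id_from_url(example_url))
```

The printed value is what would be pasted into ID of root directory when configuring the dataset.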