You can read data as a pandas dataframe, or use an iterator to stream it row by row. Similarly, you can write a complete dataframe at once, or push one row at a time. See how to process data in memory.
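The read and write patterns above can be sketched with the internal `dataiku` package. The dataset names are hypothetical, and `dataiku` is only importable inside a DSS code environment, so each helper defers its import:

```python
def read_in_memory(dataset_name):
    """Load an entire DSS dataset into a pandas dataframe."""
    import dataiku  # only available inside a DSS code environment
    return dataiku.Dataset(dataset_name).get_dataframe()

def stream_rows(dataset_name):
    """Iterate over rows without loading the whole dataset in memory."""
    import dataiku
    for row in dataiku.Dataset(dataset_name).iter_rows():
        yield row

def write_whole_dataframe(dataset_name, df):
    """Write a complete dataframe, setting the dataset's schema from it."""
    import dataiku
    dataiku.Dataset(dataset_name).write_with_schema(df)

def write_row_by_row(dataset_name, rows):
    """Push rows one at a time through a writer."""
    import dataiku
    with dataiku.Dataset(dataset_name).get_writer() as writer:
        for row in rows:
            writer.write_row_dict(row)
```

Streaming is preferable when a dataset is too large to hold in memory; the dataframe form is simpler when it fits.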
You can generate complex SQL queries in Python and execute them, retrieving the results as a pandas dataframe if needed. See how to use SQL from Python.
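A minimal sketch of this pattern, assuming a hypothetical connection name and query; `dataiku` is only importable inside DSS, so the import is deferred:

```python
def query_to_dataframe(connection_name, query):
    """Run a SQL query on a DSS connection and return a pandas dataframe."""
    from dataiku import SQLExecutor2  # only available inside DSS
    executor = SQLExecutor2(connection=connection_name)
    # Execute the query on the database and fetch results as a dataframe
    return executor.query_to_df(query)
```

Because the query is an ordinary Python string, you can build it programmatically before executing it.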
To process your data on Hadoop, you can use Spark through Python. Take a look at how to code within a PySpark recipe. Note that it is also possible to use PySpark in a notebook.
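The shape of a typical PySpark recipe can be sketched as follows. The input and output dataset names and the transformation are placeholders, and both `dataiku` and `pyspark` are only available inside a DSS Spark environment, so imports are deferred:

```python
def pyspark_recipe(input_name, output_name):
    """Read a dataset as a Spark dataframe, transform it, write it back."""
    import dataiku
    import dataiku.spark as dkuspark
    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext.getOrCreate()
    sql_context = SQLContext(sc)

    # Read the DSS input dataset as a Spark dataframe
    input_dataset = dataiku.Dataset(input_name)
    df = dkuspark.get_dataframe(sql_context, input_dataset)

    # Placeholder transformation: deduplicate rows
    result = df.distinct()

    # Write the Spark dataframe back to the DSS output dataset
    output_dataset = dataiku.Dataset(output_name)
    dkuspark.write_with_schema(output_dataset, result)
```

The transformation runs on the cluster; only the schema and bookkeeping pass through the Python driver.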
Dataiku provides many code snippets to help you get started.
Read the full internal Python API documentation
The DSS public API lets you interact with DSS from any external system. You can perform a wide variety of administration and maintenance operations, as well as access datasets and other data managed by DSS.
For example, you can administer Dataiku DSS using the Python client.
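A minimal sketch of connecting from an external system with the `dataikuapi` client. The host URL and API key are placeholders, and the call requires a running DSS instance, so the import is deferred and nothing connects at load time:

```python
def list_projects(host, api_key):
    """Connect to a DSS instance and return its project keys."""
    import dataikuapi  # pip-installable client for the DSS public API
    client = dataikuapi.DSSClient(host, api_key)
    return client.list_project_keys()
```

The same client object exposes handles for administration tasks such as managing users, connections, and projects.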