OpenWeatherMap

This plugin provides a read connector to import historical and forecast weather data from OpenWeatherMap.

Plugin information

Version 1.0.1
Author Dataiku (Henri Chabert)
Released 2020-06-01
Last updated 2020-06-01
License Apache Software License
Source code GitHub
Reporting issues GitHub

How to set up

General set up

  1. Create an OpenWeatherMap account.
  2. Log in to the platform and go to the API keys tab.
  3. Use the Default key or create a new one. Copy the key.
  4. In DSS, go to App > Plugins > Installed > OpenWeatherMap > Settings > OpenWeatherMap API configuration
  5. Add a new preset, name it and fill in the details:
    • System of units: The default system of units you want to use (it can be overridden when running the recipe).
    • Language: Language of the text describing the weather (it can be overridden when running the recipe).
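
To make the preset fields concrete, here is a minimal sketch of how an API key, a system of units and a language typically end up in an OpenWeatherMap request. It assumes the public One Call endpoint and its `appid`, `units` and `lang` query parameters; the endpoint the plugin actually calls is not stated in this page, and `build_request_url` is a hypothetical helper, not part of the plugin.

```python
from urllib.parse import urlencode

# Assumption: OpenWeatherMap's public One Call endpoint. This page does not
# say which endpoint the plugin uses internally.
ONE_CALL_URL = "https://api.openweathermap.org/data/2.5/onecall"


def build_request_url(api_key, lat, lon, units="metric", lang="en"):
    """Hypothetical helper: build a request URL from preset-like values."""
    params = {
        "lat": lat,
        "lon": lon,
        "appid": api_key,  # the key copied from the API keys tab
        "units": units,    # "standard", "metric" or "imperial"
        "lang": lang,      # language of the weather description text
    }
    return ONE_CALL_URL + "?" + urlencode(params)


# "YOUR_API_KEY" is a placeholder, not a working key.
url = build_request_url("YOUR_API_KEY", 48.8566, 2.3522, units="imperial", lang="fr")
print(url)
```

The sketch only builds the URL and does not send it, so it can be run without consuming any API calls.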

Cache set up

The OpenWeatherMap plugin uses a cache to store data locally, so that identical queries are not sent to the API more than once. You can change the cache preferences by following these steps:

  1. In DSS, go to App > Plugins > Installed > OpenWeatherMap > Settings > Parameters
  2. Choose the cache storage location from the following options:
    1. User $HOME directory (Default): The cache will be stored under the folder $HOME/.cache/dss/plugins/open_weather_map
    2. Custom: You can choose a custom location. Enter the absolute path in the input field below (Example: /Users/johnsnow/Documents/dss_cache). Be cautious when using this with a UIF instance, as permission errors could occur.
    3. None: Do not use a cache
  3. Choose the cache parameters:
    • Cache size (in megabytes): The maximum size of the cache file (default is 1 GB)
    • Cache eviction policy: How data is deleted once the maximum size is reached. You can choose between four modes:
      • Least Recently Stored: Delete the oldest cache records first
      • Least Recently Used: Delete the cache records that have gone unused for the longest time
      • Least Frequently Used: Delete the cache records that are used least often
      • No eviction: Ignore the cache size limit; the cache will grow without bounds
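
The difference between the eviction policies is easiest to see on a toy example. The sketch below is not the plugin's cache: `TinyCache` is a hypothetical in-memory class bounded by number of entries rather than megabytes, and it leaves out Least Frequently Used for brevity. It only illustrates which record is dropped first under each policy.

```python
from collections import OrderedDict


class TinyCache:
    """Toy cache bounded by entry count, for illustration only."""

    def __init__(self, max_entries, policy="least-recently-stored"):
        self.max_entries = max_entries
        self.policy = policy
        self._data = OrderedDict()  # front of the dict = next eviction candidate

    def get(self, key):
        value = self._data.get(key)
        if key in self._data and self.policy == "least-recently-used":
            self._data.move_to_end(key)  # a read refreshes the record
        return value

    def set(self, key, value):
        if key in self._data:
            del self._data[key]
        self._data[key] = value  # newly stored records go to the back
        # With policy "none" the limit is ignored and nothing is evicted.
        while self.policy != "none" and len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict from the front

    def keys(self):
        return list(self._data)


for policy in ("least-recently-stored", "least-recently-used", "none"):
    cache = TinyCache(max_entries=2, policy=policy)
    cache.set("paris", 1)
    cache.set("london", 2)
    cache.get("paris")     # read the oldest record
    cache.set("tokyo", 3)  # exceeds the limit of 2 entries
    print(policy, cache.keys())
```

With Least Recently Stored, "paris" is evicted because it was stored first; with Least Recently Used, "london" is evicted because "paris" was read more recently; with no eviction, all three records are kept.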

How to use

The plugin is made of two main components:

  • A connector that retrieves data directly from the OpenWeatherMap API and puts it in a new dataset.
  • A recipe that adds weather information to your data, which must contain latitude/longitude columns and, optionally, a date column.

OpenWeatherMap connector

  1. Go to your DSS flow
  2. Select OpenWeatherMap in the plugin section of the dataset menu
  3. Click on OpenWeatherMap weather generating
  4. Pick a preset of parameters
  5. Enter the latitude and longitude of the location for which you want the weather
  6. Choose the desired granularity
    • Daily information is available 5 days in the past and 7 days in the future. The output dataset is 12 rows long
    • Hourly information is available 5 days in the past and 2 days in the future. The output dataset is 168 rows long
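
The row counts above follow directly from the time windows: 5 + 7 = 12 daily rows and (5 + 2) × 24 = 168 hourly rows. The sketch below reproduces that arithmetic. `expected_timestamps` is a hypothetical helper, and the half-open window (start included, end excluded) is an assumption chosen to match the stated row counts, not a description of the plugin's internals.

```python
from datetime import datetime, timedelta


def expected_timestamps(now, granularity):
    """Hypothetical helper: list the timestamps one output dataset would cover."""
    if granularity == "daily":
        step, days_ahead = timedelta(days=1), 7
    elif granularity == "hourly":
        step, days_ahead = timedelta(hours=1), 2
    else:
        raise ValueError(f"unknown granularity: {granularity!r}")

    start = now - timedelta(days=5)        # 5 days in the past
    end = now + timedelta(days=days_ahead)  # 7 or 2 days in the future
    stamps = []
    current = start
    while current < end:  # assumption: end of the window is excluded
        stamps.append(current)
        current += step
    return stamps


now = datetime(2020, 6, 1)
print(len(expected_timestamps(now, "daily")))   # 12
print(len(expected_timestamps(now, "hourly")))  # 168
```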

Advanced mode

You can configure more settings by checking “Advanced mode”. The available options are:

  • Data type: Choose whether you want historical data, forecast data or both (default is both)
  • System of units: Overrides the preset's setting for this specific job
  • Language: Overrides the preset's setting for this specific job
  • Use cache: Overrides the configured cache setting for this specific job
  • Parse output JSON: Check this if you prefer to get a single column containing the entire response in JSON format
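
To illustrate the two output shapes that the "Parse output JSON" option switches between, the sketch below turns one response into either a single JSON column or one column per field. The `sample` payload is made up for the example and is not real API output; `flatten` and the dotted column names are hypothetical and may differ from what the plugin produces.

```python
import json


def flatten(obj, prefix=""):
    """Hypothetical helper: turn nested dicts into flat 'a.b' column names."""
    columns = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            columns.update(flatten(value, prefix=name + "."))
        else:
            columns[name] = value
    return columns


# Made-up payload for illustration only.
sample = {"dt": 1591012800, "temp": {"day": 21.4, "night": 14.2}, "humidity": 56}

raw_row = {"response": json.dumps(sample)}  # one column holding the whole response
parsed_row = flatten(sample)                # one column per field

print(list(raw_row))
print(list(parsed_row))
```

A single JSON column keeps the schema stable when the set of returned fields varies between runs, while flattened columns are easier to use directly in downstream recipes.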

OpenWeatherMap recipe

  1. Go to your DSS flow
  2. Select OpenWeatherMap in the plugin section of the recipe menu
  3. Click on OpenWeatherMap Weather mapping
  4. Select the input dataset and the output dataset, and then click on “CREATE”
  5. Fill in the latitude and longitude columns
  6. Select whether you need the current weather or the weather at a date provided by a column (in date format)
  7. Run the recipe

In addition to the weather data, a column named “error” is added to the output dataset. If something went wrong when retrieving the data for a specific location/date pair, this column tells you what happened. Errors usually occur because the date is outside the available range [today - 5 days ; today + 7 days] or because you have reached your API call limit.
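
Since out-of-range dates are a common source of errors, it can help to check dates before running the recipe. The sketch below tests a date against the [today - 5 days ; today + 7 days] range described above. `date_error` and its message text are hypothetical; the plugin's own messages in the “error” column may be worded differently.

```python
from datetime import date, timedelta


def date_error(target, today):
    """Hypothetical check: return a message if target is out of range, else None."""
    earliest = today - timedelta(days=5)
    latest = today + timedelta(days=7)
    if earliest <= target <= latest:
        return None
    return (
        f"{target.isoformat()} is outside the available range "
        f"[{earliest.isoformat()} ; {latest.isoformat()}]"
    )


today = date(2020, 6, 1)
print(date_error(date(2020, 6, 3), today))   # in range: None
print(date_error(date(2020, 5, 20), today))  # too far in the past: a message
```

Filtering out such rows upstream also avoids spending API calls on requests that cannot succeed.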

Advanced mode

You can configure more settings by checking “Advanced mode”. The available options are:

  • System of units: Overrides the preset's setting for this specific job
  • Language: Overrides the preset's setting for this specific job
  • Use cache: Overrides the configured cache setting for this specific job
  • Parse output JSON: Check this if you prefer to get a single column containing the entire response in JSON format. You should check this option if you use the option “Append instead of overwrite”
