Skip to content

4. Downloading data

Melanie Jans-Singh edited this page May 18, 2021 · 8 revisions

This page explains how to download data from the database.

Requirements

  • You need to be given access to the database by sharing your IP address with the platform administrators (Find out how here).

  • You need the following packages installed on your machine:

    • psycopg2 (version 2.8.6, installed with pip)
    • datetime
    • os
    • pandas

Accessing the data

The sample code for downloading data is available in the GitHub folder. The passwords are hidden from public view, so please contact the administrators for access to the usernames and passwords.

The code corresponds to the following steps:

  1. Activate the environment in which you have all the packages required in the section above. Say this environment is called myenv, enter the following in the command prompt/terminal:
activate myenv
jupyter notebook
  1. Connect to the database. For this, you will need a username and password.
conn = psycopg2.connect(
    host=crop_host,     port=crop_port,    dbname=crop_dbname, user=crop_user, password=crop_password)

cur = conn.cursor()
  1. Select the period for which you would like to access the data available
dt_to = datetime.now()
dt_from = dt_to + timedelta(days=-365*2)

Note: the data period available will depend on the chosen data.

  1. Make a query for the columns and time period desired

Example for accessing the energy data:

sql_command = """SELECT * FROM energy_data WHERE energy_data.timestamp >= '%s' AND energy_data.timestamp < '%s'""" % (dt_from, dt_to)


df_energy = pd.read_sql(sql_command, conn)
df_energy.head()
df_energy.tail()
  1. Save the data as a csv file.

Change directory to "your_folder_location" (in our case LatestData folder in the OneDrive)

os.chdir("your_folder_location")
df_energy.to_csv(data_folder +"\Energy_raw.csv")

You should now have a saved file on your computer with the latest data.

Tables available

The database is organised by data tables. This section summarises the different tables available and the data they contain:

Table Name columns first_timestamp last_timestamp Sensor_id Type Contains data?
User id, username, email, password, time_created, time_updated NA NA NA infra yes
advanticsys_data id, sensor_id, timestamp, temperature, humidity, co2, time_created, time_updated 26/05/2017 01:05 28/04/2020 09:59 e live data yes
air_velocity_data id, sensor_id, timestamp, temperature, velocity, time_created, time_updated NA NA live data no
crop_growth_data id, crop, progation_date, location_id, trays, m2, date_underlight, surplus_waste_trays, mass_harvested, waste_explanation NA NA live data no
energy_data id, sensor_id, timestamp, electricity_consumption, time_created, time_updated 27/04/2020 00:30 17/05/2021 01:00 16,17 live data yes
environmental_data id, sensor_id, logger_timestamp, device_timestamp, uptime, battery, validity, ch0, ch1, ch2, ch3, opt, co2, temperature, humidity, tempds, time_created, time_updated NA NA NA live data no
infrastructure_data id, tank_time, tank_no, tank_ec, tank_clox, tank_water_temp NA NA live data no
iweather id, temperature, rain_probability, rain, relative_humidity, wind_speed, wind_direction, air_pressure, radiation, timestamp, time_created, time_updated, sensor_id 12/05/2020 10:00 14/05/2021 00:00 47 live data yes
locations locations, id, zone, aisle, column, shelf NA NA infra yes
sensor_location id, sensor_id, location_id, installation_date, time_created, time_updated NA NA infra yes
sensor_types id, sensor_type, source, origin, frequenccy, data, description, time_created, time_updated NA NA infra yes
sensor_upload_log id, type_id, filename, status, log, time_created, time_updated infra yes
sensors id, type_id, device_id, time_created, time_updated, name, last_updated infra yes
tinytag_data id, sensor_id, timestamp, temperature, time_created, time_updated 08/06/2016 11:34 07/12/2019 11:51 28, 34 live data yes
utc_energy_data id, sensor_id, timestamp, electricity_consumption, time_created, time_updated 27/04/2020 00:30 10/05/2021 00:00 16,17 live data yes
weather id, temperature, rainfall, humidity, wind_speed, wind_direction, weather_type, forecast, time_created, time_updated, air_pressure NA NA live data no
zensie_trh_data id, sensor_id, timestamp, temperature, humidity, time_created, time_updated 28/03/2020 11:30 17/05/2021 13:35 18-27 live data yes

Next page: Querying data with SQL