Fetch list of data products from datahub #2984

Closed
PriyaBasker23 opened this issue Jan 17, 2024 · 6 comments

PriyaBasker23 commented Jan 17, 2024

User Story

As a developer, I want to compile a comprehensive list of tables within the frontend for presentation to users.

Value / Purpose

The primary objective is to display categorised tables, organised by domains and data products.

Hypothesis

Implementing this functionality will facilitate the delivery of the frontend to users and support iterative development based on feedback.

Proposal

  1. Implement the functionality to list data products by directly querying the datahub API for initial results.
  2. Optionally, explore the use of a Python library for a more streamlined integration into the FastAPI layer at a later stage.
  3. Note: The following steps are not mandatory but are considered for future enhancements.
    a. Develop a dedicated endpoint in FastAPI to handle the list of data products functionality.
    b. Invoke the Python function from the created endpoint to ensure seamless integration.
  4. Maintain flexibility in the approach to accommodate iterative improvements and adjustments as necessary.
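
Step 1 of the proposal could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the project's implementation: the `/api/graphql` path, the bearer-token header, the placeholder URL, and the names `build_request` and `run_query` are all illustrative. Only the standard library is used.

```python
import json
import urllib.request

# Hypothetical placeholder; the real URL and token belong in config.
GRAPHQL_URL = "https://datahub.example.invalid/api/graphql"


def build_request(query: str, token: str, url: str = GRAPHQL_URL) -> urllib.request.Request:
    """Build a POST request carrying a GraphQL query as a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",  # assumed auth scheme
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def run_query(query: str, token: str, url: str = GRAPHQL_URL,
              opener=urllib.request.urlopen) -> dict:
    """Send the query and return the `data` part of the response.

    `opener` is injectable so the function can be exercised without a network.
    """
    with opener(build_request(query, token, url)) as response:
        payload = json.loads(response.read().decode("utf-8"))
    if payload.get("errors"):
        raise RuntimeError(f"GraphQL errors: {payload['errors']}")
    return payload["data"]
```

Keeping the call in a plain function like this would let a FastAPI endpoint (step 3) invoke it later without changing the query logic.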

Definition of Done

The frontend should successfully display a categorised list of data products, organised by both domains and data products.

@PriyaBasker23 PriyaBasker23 converted this from a draft issue Jan 17, 2024
@murdo-moj murdo-moj self-assigned this Jan 19, 2024
@murdo-moj murdo-moj moved this from Todo to In Progress in Data Catalogue Jan 19, 2024

murdo-moj commented Jan 22, 2024

GraphQL query below. Both the docs on the data product entity and the interactive query builder on the DataHub frontend are useful:
https://datahubproject.io/docs/generated/metamodel/entities/dataproduct/
https://datahub.apps-tools.development.data-platform.service.justice.gov.uk/api/graphiql

{
  search(input: {type: DATA_PRODUCT, query: "*", count: 500, start: 0}) {
    total
    searchResults {
      entity {
        ... on DataProduct {
          urn
          tags {
            tags {
              tag {
                urn
                properties {
                  name
                }
              }
            }
          }
          properties {
            name
            description
            numAssets
          }
          relationships(input: {direction: OUTGOING, types: "DataProductContains", count:500, start: 0}) {
            relationships {
              entity { ... on Dataset {properties { name } } }
            }
          }
          domain {
            domain {
              properties {
                name
              }
            }
          }
        }
      }
    }
  }
}
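
The query above hard-codes `count: 500` and `start: 0`, so it returns at most the first 500 data products. If the catalogue grows past that, the `total` field can drive paging. A minimal sketch, assuming a hypothetical `fetch_page(start, count)` callable that runs the search with those values and returns the `search` object of the response:

```python
from typing import Callable, Iterator


def iter_search_results(fetch_page: Callable[[int, int], dict],
                        page_size: int = 500) -> Iterator[dict]:
    """Yield every search result, requesting pages until `total` is reached."""
    start = 0
    while True:
        page = fetch_page(start, page_size)
        results = page.get("searchResults", [])
        yield from results
        start += len(results)
        # Stop once everything is fetched, or on an empty page so a
        # wrong `total` cannot cause an endless loop.
        if not results or start >= page.get("total", 0):
            break
```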


murdo-moj commented Jan 22, 2024

This one is way faster. The subquery that resolves the asset (dataset) names is what's taking the time, so this version only asks for each asset's urn.

{
  search(input: {type: DATA_PRODUCT, query: "*", count: 500, start: 0}) {
    total
    searchResults {
      entity {
        ... on DataProduct {
          urn
          tags {
            tags {
              tag {
                urn
                properties {
                  name
                }
              }
            }
          }
          properties {
            name
            description
            numAssets
          }
          relationships(input: {direction: OUTGOING, types: "DataProductContains", count:500, start: 0}) {
            relationships {
              entity { urn }
            }
          }
          domain {
            domain {
              properties {
                name
              }
            }
          }
        }
      }
    }
  }
}
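
To get from this response to the categorised list in the Definition of Done, the search results could be grouped by domain name. The sketch below assumes the response shape implied by the query above; the function name `group_by_domain`, the output keys, and the `"(no domain)"` fallback label are illustrative choices, not part of DataHub.

```python
from collections import defaultdict


def group_by_domain(search_results: list) -> dict:
    """Map each domain name to a list of data product summaries."""
    grouped = defaultdict(list)
    for result in search_results:
        entity = result.get("entity") or {}
        properties = entity.get("properties") or {}
        # `domain` may be null when a data product has no domain set.
        domain = (entity.get("domain") or {}).get("domain") or {}
        domain_name = (domain.get("properties") or {}).get("name") or "(no domain)"
        relationships = (entity.get("relationships") or {}).get("relationships") or []
        grouped[domain_name].append({
            "urn": entity.get("urn"),
            "name": properties.get("name"),
            "description": properties.get("description"),
            # Only urns are available here; names need a separate lookup.
            "dataset_urns": [r["entity"]["urn"] for r in relationships if r.get("entity")],
        })
    return dict(grouped)
```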

@murdo-moj

I've got this working in Python; I just have to integrate the response into the frontend and put e.g. the API URL in the right place in the config.
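
One way to keep the API URL out of the code is to read it from the environment at startup. This is only a sketch: the variable names `CATALOGUE_URL` and `CATALOGUE_TOKEN` and the `CatalogueSettings` class are hypothetical, and the project's actual config mechanism may differ.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogueSettings:
    graphql_url: str
    token: str


def load_settings(env=None) -> CatalogueSettings:
    """Read the settings from `env` (defaults to the process environment)."""
    env = os.environ if env is None else env
    try:
        return CatalogueSettings(
            graphql_url=env["CATALOGUE_URL"],  # hypothetical variable name
            token=env["CATALOGUE_TOKEN"],      # hypothetical variable name
        )
    except KeyError as missing:
        # Fail early with a readable message instead of at first request.
        raise RuntimeError(f"Missing required setting: {missing.args[0]}") from None
```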

@murdo-moj murdo-moj moved this from In Progress to Review in Data Catalogue Jan 23, 2024
@murdo-moj murdo-moj moved this from Review to In Progress in Data Catalogue Jan 23, 2024
@tom-webber tom-webber added this to the Search and filtering working in custom DataHub front-end milestone Jan 25, 2024
@murdo-moj murdo-moj moved this from In Progress to Done in Data Catalogue Jan 30, 2024
This issue is being marked as stale because it has been open for 60 days with no activity. Remove stale label or comment to keep the issue open.

@github-actions github-actions bot added the stale label Mar 26, 2024
github-actions bot commented Apr 2, 2024

This issue is being closed because it has been open for a further 7 days with no activity. If this is still a valid issue, please reopen it. Thank you!

@github-actions github-actions bot closed this as not planned Apr 2, 2024