[dagster-tableau] Exploring embedded data sources #27218

VenkyRules · 2025-01-20T11:30:56Z

Summary & Motivation

Previously, the integration fetched only limited metadata from Tableau (ids and names). This PR fetches additional fields, such as upstreamTables and database details, among others.
Previously, only published data sources were shown and embedded data sources were ignored. With this change, embedded data sources are shown when no published data sources are present.

How I Tested These Changes

Tested locally using Docker Desktop.


vercel · 2025-01-20T11:31:01Z


Name	Status	Preview	Comments	Updated (UTC)
dagster-docs-legacy	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Jan 20, 2025 11:33am

maximearmstrong

Thanks for this contribution @VenkyRules!

I've added some comments. Also, could we add more tests to test these changes?

maximearmstrong · 2025-01-22T18:42:32Z

python_modules/libraries/dagster-tableau/dagster_tableau/resources.py

-                        ):
-                            data_source_id = published_data_source_data["luid"]
+                        published_data_source_list = embedded_data_source_data.get("parentPublishedDatasources", [])
+                        if len(published_data_source_list) > 0:


I believe you don't need this condition. If published_data_source_list is empty, nothing will be added to data_source.

Yes, I removed this if condition.

maximearmstrong · 2025-01-22T18:46:44Z

python_modules/libraries/dagster-tableau/dagster_tableau/resources.py

+                                            properties=published_data_source_data,
+                                        )
+                                    )
+                        else:


Here you could test if data_source, which explicitly states that we fall back to the embedded data sources because no published data source was added.

I have added alternative logic for this. Please review it.

maximearmstrong · 2025-01-22T18:47:36Z

python_modules/libraries/dagster-tableau/dagster_tableau/resources.py

                            if data_source_id and data_source_id not in data_source_ids:
                                data_source_ids.add(data_source_id)
+                                embedded_data_source_data["luid"] = data_source_id


Could we add a comment here explaining why we use luid?

Below is the code snippet from translator.py, which is called by resources.py. There, luid appears to be mandatory, and it was missing for embedded data sources. Hence we create a luid for each embedded data source using its id.

Method: from_content_data()

    data_sources_by_id={
        data_source.properties["luid"]: data_source
        for data_source in content_data
        if data_source.content_type == TableauContentType.DATA_SOURCE
    },
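To illustrate why the key matters, here is a minimal sketch. `ContentData`, `index_by_luid`, and the sample payload are hypothetical stand-ins, not the actual dagster-tableau classes; only the `properties["luid"]` lookup and the idea of copying the embedded data source's `id` into `luid` come from the discussion above.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ContentData:
    """Hypothetical stand-in for the content data objects built in resources.py."""

    content_type: str
    properties: Dict[str, Any]


def index_by_luid(content_data: List[ContentData]) -> Dict[str, ContentData]:
    # Same shape as the comprehension quoted above: a missing "luid"
    # key raises KeyError.
    return {
        data.properties["luid"]: data
        for data in content_data
        if data.content_type == "data_source"
    }


# An embedded data source without a luid (illustrative payload only).
embedded = {"id": "embedded-123", "name": "Orders (embedded)"}

try:
    index_by_luid([ContentData("data_source", dict(embedded))])
    missing_luid_fails = False
except KeyError:
    missing_luid_fails = True

# Backfill luid from id before building the content data.
embedded["luid"] = embedded["id"]
indexed = index_by_luid([ContentData("data_source", embedded)])
```

With the backfill in place, embedded data sources can be indexed the same way as published ones.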

maximearmstrong · 2025-01-22T18:49:04Z

python_modules/libraries/dagster-tableau/dagster_tableau/translator.py

+
+        data_source_ids = []
+        for embedded_data_source in sheet_embedded_data_sources:
+            embedded_data_source_list = embedded_data_source.get("parentPublishedDatasources", [])


Suggested change

- embedded_data_source_list = embedded_data_source.get("parentPublishedDatasources", [])
+ published_data_source_list = embedded_data_source.get("parentPublishedDatasources", [])

Renamed to published_data_source_list; it makes more sense.

maximearmstrong · 2025-01-22T18:52:06Z

python_modules/libraries/dagster-tableau/dagster_tableau/translator.py

-            for published_data_source in embedded_data_source.get("parentPublishedDatasources", [])
-        }
+
+        data_source_ids = []


data_source_ids used to be a set - could we still use a set here?

Yes, I have updated the implementation. Please review again.
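A short sketch of why a set fits here: several embedded data sources can share the same parent published data source, and a set deduplicates those ids automatically. The function name and sample data are hypothetical, and reading `"luid"` from each published data source is an assumption based on the resources.py diff above.

```python
from typing import Any, Dict, List, Set


def collect_published_ids(embedded_data_sources: List[Dict[str, Any]]) -> Set[str]:
    """Collect the ids of all parent published data sources, without duplicates."""
    data_source_ids: Set[str] = set()
    for embedded in embedded_data_sources:
        # Iterating over the default empty list adds nothing, so no guard is needed.
        for published in embedded.get("parentPublishedDatasources", []):
            data_source_ids.add(published["luid"])
    return data_source_ids


# Two embedded data sources sharing the parent "p1" (illustrative data).
sheet_embedded_data_sources = [
    {"id": "e1", "parentPublishedDatasources": [{"luid": "p1"}]},
    {"id": "e2", "parentPublishedDatasources": [{"luid": "p1"}, {"luid": "p2"}]},
]
published_ids = collect_published_ids(sheet_embedded_data_sources)
```

With a list and `append`, "p1" would appear twice in the result.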

maximearmstrong · 2025-01-22T18:54:35Z

python_modules/libraries/dagster-tableau/dagster_tableau/translator.py

+            embedded_data_source_list = embedded_data_source.get("parentPublishedDatasources", [])
+            if not embedded_data_source_list:
+                data_source_ids.append(embedded_data_source["id"])
+            else:


You don't need the else statement here; iterating over an empty list won't add anything to data_source_ids in the code below. Adding a comment before the if not embedded_data_source_list condition and here would be useful.

Yes, I removed this if statement and now iterate over the list directly, so ids are added only if the list is not empty.
I also added a flag that records whether any published data sources were added. If the flag is unchanged, we go on to add the embedded data source's id to the list. Please let me know if this logic makes sense to you. I have also added similar comments to these methods.
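The flag-based fallback described here might look roughly like the sketch below. This is not the code from the PR: the function name and sample data are hypothetical, the `"luid"` key on published data sources is assumed from the resources.py diff, and the flag is assumed to be reset for each embedded data source.

```python
from typing import Any, Dict, List, Set


def collect_data_source_ids(
    sheet_embedded_data_sources: List[Dict[str, Any]],
) -> Set[str]:
    data_source_ids: Set[str] = set()
    for embedded in sheet_embedded_data_sources:
        # Flag: did this embedded data source contribute any published parent?
        has_published = False
        for published in embedded.get("parentPublishedDatasources", []):
            has_published = True
            data_source_ids.add(published["luid"])
        # Fall back to the embedded data source itself when it has no
        # published parent.
        if not has_published:
            data_source_ids.add(embedded["id"])
    return data_source_ids


# Illustrative data: one embedded source with a parent, two without.
sources = [
    {"id": "e1", "parentPublishedDatasources": [{"luid": "p1"}]},
    {"id": "e2", "parentPublishedDatasources": []},
    {"id": "e3"},
]
ids = collect_data_source_ids(sources)
```

Keeping the flag inside the outer loop means each embedded data source decides independently whether to fall back to its own id.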

VenkyRules · 2025-01-23T16:19:04Z

@maximearmstrong - Thanks for your review and comments. Yes, I will try to add some tests for these latest changes.

VenkyRules added 3 commits January 19, 2025 23:46

Adding code to fetch more metadata from tableau and showing embedded …

d175277

…data sources as well in connections

removed comments

c61dc05

removed extra lines

dec71c3

VenkyRules marked this pull request as draft January 20, 2025 11:31

vercel bot deployed to Preview January 20, 2025 11:33 View deployment

mlarose requested a review from maximearmstrong January 20, 2025 14:43

maximearmstrong requested changes Jan 22, 2025


Addressed Review Comments

a7dd78b

VenkyRules requested a review from maximearmstrong January 23, 2025 16:14

VenkyRules changed the title ~~Feat/exploring embedded data sources~~ [dagster-tableau] Exploring embedded data sources Jan 24, 2025

VenkyRules marked this pull request as ready for review January 24, 2025 11:23
