Feat: Add experimental support for low-code source execution via manifest YAML #175
@natikgadzhi, @erohmensing, @bindipankhudi, @alafanechere, @bnchrch - This is ready for your review. All tests pass except the Python 3.11 tests, which will be resolved soon via @natikgadzhi's CDK update here (just merged, pending release to PyPI): airbytehq/airbyte#38846
/fix-pr
Walkthrough
The recent updates to the Airbyte module introduce new entities and functionalities, enhance existing modules, and add support for declarative YAML source testing. Key changes include adding the
Changes
Sequence Diagram(s) (Beta)
sequenceDiagram
participant User
participant Airbyte
participant DeclarativeExecutor
participant Source
User->>Airbyte: Run declarative manifest source
Airbyte->>DeclarativeExecutor: Initialize with manifest
DeclarativeExecutor->>Source: Execute source with manifest
Source-->>DeclarativeExecutor: Return data
DeclarativeExecutor-->>Airbyte: Processed data
Airbyte-->>User: Display data
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (1)
- `poetry.lock` is excluded by `!**/*.lock`
Files selected for processing (14)
- airbyte/__init__.py (1 hunks)
- airbyte/_processors/sql/__init__.py (1 hunks)
- airbyte/caches/__init__.py (1 hunks)
- airbyte/sources/declarative.py (1 hunks)
- airbyte/sources/registry.py (5 hunks)
- airbyte/sources/util.py (4 hunks)
- examples/run_declarative_manifest_source.py (1 hunks)
- examples/run_downloadable_yaml_source.py (1 hunks)
- pyproject.toml (2 hunks)
- tests/conftest.py (2 hunks)
- tests/integration_tests/test_lowcode_connectors.py (1 hunks)
- tests/integration_tests/test_source_test_fixture.py (3 hunks)
- tests/unit_tests/test_anonymous_usage_stats.py (2 hunks)
- tests/unit_tests/test_lowcode_connectors.py (1 hunks)
Files not reviewed due to errors (4)
- airbyte/sources/registry.py (no review received)
- tests/conftest.py (no review received)
- pyproject.toml (no review received)
- airbyte/sources/util.py (no review received)
Files skipped from review due to trivial changes (2)
- airbyte/caches/__init__.py
- examples/run_declarative_manifest_source.py
Additional comments not posted (16)
airbyte/_processors/sql/__init__.py (2)
- 6-6: The import of `snowflakecortex` aligns with the PR's enhancements to SQL processing capabilities.
- Line range hint 6-12: The updated export list correctly exposes the new `SnowflakeCortexSqlProcessor` and `SnowflakeCortexTypeConverter`, ensuring they are accessible as intended.

tests/unit_tests/test_lowcode_connectors.py (2)
- 13-31: The parameterized test setup for low-code connectors is well-implemented, ensuring comprehensive testing across different configurations.
- 20-31: The test execution logic is correctly implemented, using the `get_source` function with the `source_manifest` parameter to handle YAML sources as intended in the PR.

airbyte/__init__.py (1)
- 19-19: The addition of `records` to the module's exports is appropriate and aligns with the PR's enhancements to the module's capabilities.

examples/run_downloadable_yaml_source.py (2)
- 15-53: The example script effectively demonstrates the retrieval and usage of YAML connectors, aligning with the PR's objective to support declarative sources.
- 21-37: The error handling in the script is robust, effectively capturing and reporting failures during the installation of YAML connectors, which enhances the script's reliability.

tests/integration_tests/test_lowcode_connectors.py (2)
- 20-38: The test setup for connector initialization is comprehensive and well-implemented, ensuring each connector's ability to initialize is thoroughly tested.
- 41-78: The test setup for handling expected failures is well-structured and effectively uses parameterization to test different failure scenarios, enhancing the robustness of the testing process.

airbyte/sources/declarative.py (2)
- 22-69: The `DeclarativeExecutor` class is well-designed, effectively handling different types of manifest inputs and providing clear error messages, which enhances its usability and robustness.
- 72-104: The `DeclarativeSource` class is appropriately implemented, providing detailed documentation and examples for usage, which aids in understanding and utilizing the class effectively.

tests/unit_tests/test_anonymous_usage_stats.py (3)
- 15-18: LGTM! Proper use of fixture scope and cleanup.
- 21-21: LGTM! Telemetry tracking functionality is correctly tested.
- 75-79: LGTM! Correct handling of the DO_NOT_TRACK environment variable.

tests/integration_tests/test_source_test_fixture.py (2)
- 221-222: The addition of `language` and `install_types` parameters enhances the function's flexibility and aligns with the new features introduced in the PR.
- 30-33: Ensure the `autouse_source_test_registry` fixture is correctly utilized.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- airbyte/sources/util.py (4 hunks)
Files skipped from review as they are similar to previous changes (1)
- airbyte/sources/util.py
This adds the ability to run (in theory) 130 declarative YAML sources in PyAirbyte, without any need for additional virtual environment isolation. The `manifest.yml` file content can be provided by the user or auto-downloaded from the `master` branch of `airbytehq/airbyte`.

Thanks to @bnchrch and @lmossman for helping figure out the logic.
The `get_source()` implementation in `airbyte.experimental` includes a new `source_manifest` input argument. The argument can be any of these types:
- `Path` - A path to a local YAML file.
- `dict` - An already-parsed YAML manifest.
- `str` - A URL path to a YAML manifest.
- `True` - Indicates that PyAirbyte should find the YAML manifest at the default location, e.g.:

The YAML-runnable connectors can be found using `ab.get_available_connectors(install_type="yaml")` or `ab.get_available_connectors(InstallType.YAML)`.
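To make the four accepted `source_manifest` types concrete, here is a minimal sketch of how such an argument could be dispatched. It is not the PR's implementation: `classify_manifest_arg` and its `default_location` callback are hypothetical names introduced only for illustration, and the sketch classifies the input without downloading or parsing anything.

```python
from pathlib import Path
from typing import Callable, Tuple, Union

# The four input shapes described above: a local path, a parsed manifest,
# a URL string, or the literal True.
ManifestArg = Union[Path, dict, str, bool]


def classify_manifest_arg(
    source_manifest: ManifestArg,
    default_location: Callable[[], str],
) -> Tuple[str, object]:
    """Hypothetical helper: return a (kind, value) pair for the manifest input.

    `default_location` is an assumption of this sketch. It stands in for
    whatever logic computes the default manifest location when True is passed.
    """
    # Handle True first so it is never confused with the other cases.
    if source_manifest is True:
        return ("default_location", default_location())
    if isinstance(source_manifest, dict):
        return ("parsed", source_manifest)
    if isinstance(source_manifest, Path):
        return ("local_file", source_manifest)
    if isinstance(source_manifest, str):
        return ("url", source_manifest)
    # Anything else (including False or None) is not a supported input.
    raise TypeError(f"Unsupported source_manifest value: {source_manifest!r}")
```

Keeping the `True` case as an explicit identity check means `False` falls through to the error branch instead of being silently accepted.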
This PR also adds hard-coded exclusions for connectors in three categories:
Usage example
See the 2 new scripts in the `examples` directory for more examples, but the simplest usage is just:

In the above example, the source `manifest.yml` is automatically located from the `master` branch of `airbytehq/airbyte`, and the only change from the user perspective is to add the arg `source_manifest=True`.

Note
Included Connectors
This is the result of calling `get_available_connectors("yaml")`:

Hard-coded exclusions have been removed from this list, for instance, those low-code connectors that require one or more Python code files.
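The filtering described here (select connectors by install type, then drop hard-coded exclusions) can be sketched as follows. Everything in this block is a stand-in written for illustration: the `InstallType` enum mirrors only the `YAML` member named in this PR (the `PYTHON` member is an assumption), and `ConnectorInfo`, `available_connectors`, and the connector names are hypothetical, not PyAirbyte's actual registry code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Iterable, List, Union


class InstallType(str, Enum):
    """Stand-in enum; only YAML is named in the PR, PYTHON is illustrative."""

    YAML = "yaml"
    PYTHON = "python"


@dataclass(frozen=True)
class ConnectorInfo:
    """Hypothetical, simplified registry entry."""

    name: str
    install_types: FrozenSet[InstallType]


def available_connectors(
    registry: Iterable[ConnectorInfo],
    install_type: Union[str, InstallType],
    excluded: Iterable[str] = (),
) -> List[str]:
    """Return sorted connector names that support `install_type` and are not excluded."""
    wanted = InstallType(install_type)  # accepts "yaml" or InstallType.YAML
    skip = set(excluded)
    return sorted(
        entry.name
        for entry in registry
        if wanted in entry.install_types and entry.name not in skip
    )


# Usage with made-up connector names:
registry = [
    ConnectorInfo("source-alpha", frozenset({InstallType.YAML})),
    ConnectorInfo("source-beta", frozenset({InstallType.YAML, InstallType.PYTHON})),
    ConnectorInfo("source-gamma", frozenset({InstallType.PYTHON})),
]
yaml_connectors = available_connectors(registry, "yaml", excluded={"source-beta"})
```

Accepting either the string or the enum member mirrors the two call styles shown in the description, `install_type="yaml"` and `InstallType.YAML`.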
Summary by CodeRabbit
New Features
Enhancements
- `ConnectorMetadata` to include language and installation types.
- `get_available_connectors` to handle different installation types.
Dependencies
- `airbyte-cdk` to `^1.2.1`.
- `airbyte-source-faker` to `^6.1.2`.
Tests