
Commit

delete unused spec option
edgao committed Mar 21, 2024
1 parent 728c92c commit 855c4a4
Showing 4 changed files with 3 additions and 83 deletions.
@@ -223,14 +223,6 @@ public SerializedAirbyteMessageConsumer getSerializedMessageConsumer(final JsonN
config.has(UPLOADING_METHOD) ? EncryptionConfig.fromJson(config.get(UPLOADING_METHOD).get(JdbcUtils.ENCRYPTION_KEY)) : new NoEncryption();
final JsonNode s3Options = findS3Options(config);
final S3DestinationConfig s3Config = getS3DestinationConfig(s3Options);
- final int numberOfFileBuffers = getNumberOfFileBuffers(s3Options);
- if (numberOfFileBuffers > FileBuffer.SOFT_CAP_CONCURRENT_STREAM_IN_BUFFER) {
- LOGGER.warn("""
- Increasing the number of file buffers past {} can lead to increased performance but
- leads to increased memory usage. If the number of file buffers exceeds the number
- of streams {} this will create more buffers than necessary, leading to nonexistent gains
- """, FileBuffer.SOFT_CAP_CONCURRENT_STREAM_IN_BUFFER, catalog.getStreams().size());
- }

final String defaultNamespace = config.get("schema").asText();
for (final ConfiguredAirbyteStream stream : catalog.getStreams()) {
@@ -285,26 +277,6 @@ public SerializedAirbyteMessageConsumer getSerializedMessageConsumer(final JsonN
.createAsync();
}

- /**
- * Retrieves user configured file buffer amount so as long it doesn't exceed the maximum number of
- * file buffers and sets the minimum number to the default
- * <p>
- * NOTE: If Out Of Memory Exceptions (OOME) occur, this can be a likely cause as this hard limit has
- * not been thoroughly load tested across all instance sizes
- *
- * @param config user configurations
- * @return number of file buffers if configured otherwise default
- */
- @VisibleForTesting
- public int getNumberOfFileBuffers(final JsonNode config) {
- int numOfFileBuffers = FileBuffer.DEFAULT_MAX_CONCURRENT_STREAM_IN_BUFFER;
- if (config.has(FileBuffer.FILE_BUFFER_COUNT_KEY)) {
- numOfFileBuffers = Math.min(config.get(FileBuffer.FILE_BUFFER_COUNT_KEY).asInt(), FileBuffer.MAX_CONCURRENT_STREAM_IN_BUFFER);
- }
- // Only allows for values 10 <= numOfFileBuffers <= 50
- return Math.max(numOfFileBuffers, FileBuffer.DEFAULT_MAX_CONCURRENT_STREAM_IN_BUFFER);
- }

private boolean isPurgeStagingData(final JsonNode config) {
return !config.has("purge_staging_data") || config.get("purge_staging_data").asBoolean();
}
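For reference, the clamping rule the deleted `getNumberOfFileBuffers` method applied can be reproduced in isolation. The sketch below is a minimal standalone illustration, assuming the `FileBuffer` constants resolved to 10 (default) and 50 (max), which matches the `"minimum": 10` / `"maximum": 50` bounds in the spec change further down; the class name and constants here are stand-ins, not the connector's API.

```java
// Standalone sketch of the removed clamp: cap at MAX, then floor at DEFAULT.
public final class FileBufferClampSketch {

  // Assumed values, taken from the spec's minimum/maximum/default of 10/50/10.
  private static final int DEFAULT_MAX_CONCURRENT_STREAM_IN_BUFFER = 10;
  private static final int MAX_CONCURRENT_STREAM_IN_BUFFER = 50;

  // Mirrors the deleted logic: a missing key yields the default,
  // a present key is capped at MAX and then floored at DEFAULT.
  static int clampFileBufferCount(final Integer configured) {
    int numOfFileBuffers = DEFAULT_MAX_CONCURRENT_STREAM_IN_BUFFER;
    if (configured != null) {
      numOfFileBuffers = Math.min(configured, MAX_CONCURRENT_STREAM_IN_BUFFER);
    }
    // Net effect: only values 10 <= n <= 50 ever come back.
    return Math.max(numOfFileBuffers, DEFAULT_MAX_CONCURRENT_STREAM_IN_BUFFER);
  }

  public static void main(final String[] args) {
    System.out.println(clampFileBufferCount(null)); // 10: key absent, default
    System.out.println(clampFileBufferCount(5));    // 10: floored at the default
    System.out.println(clampFileBufferCount(30));   // 30: in range, kept
    System.out.println(clampFileBufferCount(80));   // 50: capped at the max
  }
}
```

Note the floor: values below 10 were silently raised to the default, so only values in the 10 to 50 range ever reached the rest of the connector, exactly as the deleted `// Only allows for values 10 <= numOfFileBuffers <= 50` comment stated.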
@@ -224,15 +224,6 @@
}
],
"order": 7
},
"file_buffer_count": {
"title": "File Buffer Count",
"type": "integer",
"minimum": 10,
"maximum": 50,
"default": 10,
"description": "Number of file buffers allocated for writing data. Increasing this number is beneficial for connections using Change Data Capture (CDC) and up to the number of streams within a connection. Increasing the number of file buffers past the maximum number of streams has deteriorating effects",
"examples": ["10"]
}
}
},
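For context on what the removed spec block enabled, here is a hypothetical destination config that sets the option, together with the `has()`/`asInt()` read pattern the deleted Java method used. Only the `file_buffer_count` key, its default of 10, and the read pattern come from the diff above; the other config fields and all names are illustrative.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public final class FileBufferConfigSketch {

  public static void main(final String[] args) throws Exception {
    // Hypothetical pre-change destination config; only file_buffer_count matters here.
    final JsonNode config = new ObjectMapper().readTree("""
        {
          "schema": "public",
          "purge_staging_data": true,
          "file_buffer_count": 25
        }
        """);

    // The same presence-check-then-asInt pattern as the deleted method.
    final int fileBuffers = config.has("file_buffer_count")
        ? config.get("file_buffer_count").asInt()
        : 10; // the spec's default
    System.out.println(fileBuffers); // 25
  }
}
```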

This file was deleted.

8 changes: 3 additions & 5 deletions docs/integrations/destinations/redshift.md
@@ -68,8 +68,6 @@ Optional parameters:
`bucketPath/namespace/streamName/syncDate_epochMillis_randomUuid.csv` containing three columns
(`ab_id`, `data`, `emitted_at`). Normally these files are deleted after the `COPY` command
completes; if you want to keep them for other purposes, set `purge_staging_data` to `false`.
- - **File Buffer Count**
-   - Number of file buffers allocated for writing data. Increasing this number is beneficial for connections using Change Data Capture (CDC) and up to the number of streams within a connection. Increasing the number of file buffers past the maximum number of streams has deteriorating effects.

NOTE: S3 staging does not use the SSH Tunnel option for copying data, if configured. SSH Tunnel supports the SQL
connection only. S3 is secured through public HTTPS access only. Subsequent typing and deduping queries on final table
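The staging-file naming pattern quoted in the hunk above can be made concrete. A hypothetical sketch: only the `bucketPath/namespace/streamName/syncDate_epochMillis_randomUuid.csv` shape comes from the docs; the date format and every value below are illustrative, not taken from the connector.

```java
import java.util.UUID;

public final class StagingKeySketch {

  public static void main(final String[] args) {
    // All values hypothetical; only the overall path shape comes from the docs.
    final String bucketPath = "airbyte-staging";
    final String namespace = "public";
    final String streamName = "users";
    final String syncDate = "2024_03_21";    // illustrative date format
    final long epochMillis = 1711036800000L; // illustrative timestamp
    final UUID randomUuid = UUID.randomUUID();

    final String key = String.join("/", bucketPath, namespace, streamName,
        syncDate + "_" + epochMillis + "_" + randomUuid + ".csv");
    // e.g. airbyte-staging/public/users/2024_03_21_1711036800000_<uuid>.csv
    System.out.println(key);
  }
}
```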
@@ -187,10 +185,10 @@ characters.
### Data Size Limitations

Redshift specifies a maximum limit of 16MB (and 65535 bytes for any VARCHAR fields within the JSON
- record) to store the raw JSON record data. Thus, when a row is too big to fit, the destination connector will
- do one of the following.
+ record) to store the raw JSON record data. Thus, when a row is too big to fit, the destination connector will
+ do one of the following.
1. Null the value if the varchar size > 65535, The corresponding key information is added to `_airbyte_meta`.
- 2. Null the whole record while trying to preserve the Primary Keys and cursor field declared as part of your stream configuration, if the total record size is > 16MB.
+ 2. Null the whole record while trying to preserve the Primary Keys and cursor field declared as part of your stream configuration, if the total record size is > 16MB.
* For DEDUPE sync mode, if we do not find Primary key(s), we fail the sync.
* For OVERWRITE and APPEND mode, syncs will succeed with empty records emitted, if we fail to find Primary key(s).

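The Data Size Limitations rules above reduce to a small decision rule. The sketch below encodes it as standalone logic rather than the connector's actual implementation; the two thresholds (65535 bytes per VARCHAR field, 16MB per record) come from the docs, while the class, enum, and method names are stand-ins.

```java
// Standalone encoding of the documented size-limit handling.
public final class RedshiftSizeLimitsSketch {

  private static final int VARCHAR_LIMIT_BYTES = 65_535;            // per-field limit
  private static final long RECORD_LIMIT_BYTES = 16L * 1024 * 1024; // 16MB record limit

  enum Action { KEEP, NULL_FIELD_AND_NOTE_IN_META, NULL_RECORD_KEEP_PK_AND_CURSOR }

  static Action classify(final long recordBytes, final int largestFieldBytes) {
    if (recordBytes > RECORD_LIMIT_BYTES) {
      // Whole record is nulled; Primary Keys and the cursor field are preserved.
      return Action.NULL_RECORD_KEEP_PK_AND_CURSOR;
    }
    if (largestFieldBytes > VARCHAR_LIMIT_BYTES) {
      // Only the oversized value is nulled; the key is reported in _airbyte_meta.
      return Action.NULL_FIELD_AND_NOTE_IN_META;
    }
    return Action.KEEP;
  }

  public static void main(final String[] args) {
    System.out.println(classify(70_000, 70_000));         // NULL_FIELD_AND_NOTE_IN_META
    System.out.println(classify(20_000_000, 500_000));    // NULL_RECORD_KEEP_PK_AND_CURSOR
    System.out.println(classify(1_000, 500));             // KEEP
  }
}
```

Per the bullets above, the record-nulling branch is also where sync-mode behavior diverges: DEDUPE fails the sync if no Primary Key is found, while OVERWRITE and APPEND succeed with an empty record emitted.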
