Remove deprecated regex functions from libcudf #13067

Merged (2 commits), Apr 10, 2023
90 changes: 0 additions & 90 deletions cpp/include/cudf/strings/contains.hpp
@@ -60,36 +60,6 @@ std::unique_ptr<column> contains_re(
regex_program const& prog,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
* @brief Returns a boolean column identifying rows which
* match the given regex pattern.
*
* @code{.pseudo}
* Example:
* s = ["abc","123","def456"]
* r = contains_re(s,"\\d+")
* r is now [false, true, true]
* @endcode
*
* Any null string entries return corresponding null output column entries.
*
* See the @ref md_regex "Regex Features" page for details on patterns supported by this API.
*
* @deprecated Use @link contains_re contains_re(strings_column_view const&,
* regex_program const&, rmm::mr::device_memory_resource*) @endlink
*
* @param strings Strings instance for this operation.
* @param pattern Regex pattern to match to each string.
* @param flags Regex flags for interpreting special characters in the pattern.
* @param mr Device memory resource used to allocate the returned column's device memory.
* @return New column of boolean results for each string.
*/
[[deprecated]] std::unique_ptr<column> contains_re(
strings_column_view const& strings,
std::string_view pattern,
regex_flags const flags = regex_flags::DEFAULT,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
* @brief Returns a boolean column identifying rows which
* matching the given regex_program object but only at the beginning the string.
@@ -116,36 +86,6 @@ std::unique_ptr<column> matches_re(
regex_program const& prog,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
* @brief Returns a boolean column identifying rows which
* matching the given regex pattern but only at the beginning the string.
*
* @code{.pseudo}
* Example:
* s = ["abc","123","def456"]
* r = matches_re(s,"\\d+")
* r is now [false, true, false]
* @endcode
*
* Any null string entries return corresponding null output column entries.
*
* See the @ref md_regex "Regex Features" page for details on patterns supported by this API.
*
* @deprecated Use @link matches_re matches_re(strings_column_view const&,
* regex_program const&, rmm::mr::device_memory_resource*) @endlink
*
* @param strings Strings instance for this operation.
* @param pattern Regex pattern to match to each string.
* @param flags Regex flags for interpreting special characters in the pattern.
* @param mr Device memory resource used to allocate the returned column's device memory.
* @return New column of boolean results for each string.
*/
[[deprecated]] std::unique_ptr<column> matches_re(
strings_column_view const& strings,
std::string_view pattern,
regex_flags const flags = regex_flags::DEFAULT,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
* @brief Returns the number of times the given regex_program's pattern
* matches in each string
@@ -172,36 +112,6 @@ std::unique_ptr<column> count_re(
regex_program const& prog,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
* @brief Returns the number of times the given regex pattern
* matches in each string.
*
* @code{.pseudo}
* Example:
* s = ["abc","123","def45"]
* r = count_re(s,"\\d")
* r is now [0, 3, 2]
* @endcode
*
* Any null string entries return corresponding null output column entries.
*
* See the @ref md_regex "Regex Features" page for details on patterns supported by this API.
*
* @deprecated Use @link count_re count_re(strings_column_view const&,
* regex_program const&, rmm::mr::device_memory_resource*) @endlink
*
* @param strings Strings instance for this operation.
* @param pattern Regex pattern to match within each string.
* @param flags Regex flags for interpreting special characters in the pattern.
* @param mr Device memory resource used to allocate the returned column's device memory.
* @return New INT32 column with counts for each string.
*/
[[deprecated]] std::unique_ptr<column> count_re(
strings_column_view const& strings,
std::string_view pattern,
regex_flags const flags = regex_flags::DEFAULT,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
* @brief Returns a boolean column identifying rows which
* match the given like pattern.
73 changes: 0 additions & 73 deletions cpp/include/cudf/strings/extract.hpp
@@ -63,41 +63,6 @@ std::unique_ptr<table> extract(
regex_program const& prog,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
* @brief Returns a table of strings columns where each column corresponds to the matching
* group specified in the given regular expression pattern.
*
* All the strings for the first group will go in the first output column; the second group
* go in the second column and so on. Null entries are added to the columns in row `i` if
* the string at row `i` does not match.
*
* Any null string entries return corresponding null output column entries.
*
* @code{.pseudo}
* Example:
* s = ["a1", "b2", "c3"]
* r = extract(s, "([ab])(\\d)")
* r is now [ ["a", "b", null],
* ["1", "2", null] ]
* @endcode
*
* See the @ref md_regex "Regex Features" page for details on patterns supported by this API.
*
* @deprecated Use @link extract extract(strings_column_view const&,
* regex_program const&, rmm::mr::device_memory_resource*) @endlink
*
* @param strings Strings instance for this operation.
* @param pattern The regular expression pattern with group indicators.
* @param flags Regex flags for interpreting special characters in the pattern.
* @param mr Device memory resource used to allocate the returned table's device memory.
* @return Columns of strings extracted from the input column.
*/
[[deprecated]] std::unique_ptr<table> extract(
strings_column_view const& strings,
std::string_view pattern,
regex_flags const flags = regex_flags::DEFAULT,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
* @brief Returns a lists column of strings where each string column row corresponds to the
* matching group specified in the given regex_program object
@@ -132,44 +97,6 @@ std::unique_ptr<column> extract_all_record(
regex_program const& prog,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
* @brief Returns a lists column of strings where each string column row corresponds to the
* matching group specified in the given regular expression pattern.
*
* All the matching groups for the first row will go in the first row output column; the second
* row results will go into the second row output column and so on.
*
* A null output row will result if the corresponding input string row does not match or
* that input row is null.
*
* @code{.pseudo}
* Example:
* s = ["a1 b4", "b2", "c3 a5", "b", null]
* r = extract_all_record(s,"([ab])(\\d)")
* r is now [ ["a", "1", "b", "4"],
* ["b", "2"],
* ["a", "5"],
* null,
* null ]
* @endcode
*
* See the @ref md_regex "Regex Features" page for details on patterns supported by this API.
*
* @deprecated Use @link extract_all_record extract_all_record(strings_column_view const&,
* regex_program const&, rmm::mr::device_memory_resource*) @endlink
*
* @param strings Strings instance for this operation.
* @param pattern The regular expression pattern with group indicators.
* @param flags Regex flags for interpreting special characters in the pattern.
* @param mr Device memory resource used to allocate any returned device memory.
* @return Lists column containing strings extracted from the input column.
*/
[[deprecated]] std::unique_ptr<column> extract_all_record(
strings_column_view const& strings,
std::string_view pattern,
regex_flags const flags = regex_flags::DEFAULT,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/** @} */ // end of doxygen group
} // namespace strings
} // namespace cudf
37 changes: 0 additions & 37 deletions cpp/include/cudf/strings/findall.hpp
@@ -65,43 +65,6 @@ std::unique_ptr<column> findall(
regex_program const& prog,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
* @brief Returns a lists column of strings for each matching occurrence of the
* regex pattern within each string.
*
* Each output row includes all the substrings within the corresponding input row
* that match the given pattern. If no matches are found, the output row is empty.
*
* @code{.pseudo}
* Example:
* s = ["bunny", "rabbit", "hare", "dog"]
* r = findall(s, "[ab]")
* r is now a lists column like:
* [ ["b"]
* ["a","b","b"]
* ["a"]
* [] ]
* @endcode
*
* A null output row occurs if the corresponding input row is null.
*
* See the @ref md_regex "Regex Features" page for details on patterns supported by this API.
*
* @deprecated Use @link findall findall(strings_column_view const&,
* regex_program const&, rmm::mr::device_memory_resource*) @endlink
*
* @param input Strings instance for this operation.
* @param pattern Regex pattern to match within each string.
* @param flags Regex flags for interpreting special characters in the pattern.
* @param mr Device memory resource used to allocate the returned column's device memory.
* @return New lists column of strings.
*/
[[deprecated]] std::unique_ptr<column> findall(
strings_column_view const& input,
std::string_view pattern,
regex_flags const flags = regex_flags::DEFAULT,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/** @} */ // end of doxygen group
} // namespace strings
} // namespace cudf
57 changes: 0 additions & 57 deletions cpp/include/cudf/strings/replace_re.hpp
@@ -59,35 +59,6 @@ std::unique_ptr<column> replace_re(
std::optional<size_type> max_replace_count = std::nullopt,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
* @brief For each string, replaces any character sequence matching the given pattern
* with the provided replacement string.
*
* Any null string entries return corresponding null output column entries.
*
* See the @ref md_regex "Regex Features" page for details on patterns supported by this API.
*
* @deprecated Use @link replace_re replace_re(strings_column_view const&, regex_program const&,
* string_scalar const&, std::optional<size_type>, rmm::mr::device_memory_resource*) @endlink
*
* @param strings Strings instance for this operation.
* @param pattern The regular expression pattern to search within each string.
* @param replacement The string used to replace the matched sequence in each string.
* Default is an empty string.
* @param max_replace_count The maximum number of times to replace the matched pattern
* within each string. Default replaces every substring that is matched.
* @param flags Regex flags for interpreting special characters in the pattern.
* @param mr Device memory resource used to allocate the returned column's device memory.
* @return New strings column.
*/
[[deprecated]] std::unique_ptr<column> replace_re(
strings_column_view const& strings,
std::string_view pattern,
string_scalar const& replacement = string_scalar(""),
std::optional<size_type> max_replace_count = std::nullopt,
regex_flags const flags = regex_flags::DEFAULT,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
* @brief For each string, replaces any character sequence matching the given patterns
* with the corresponding string in the `replacements` column.
@@ -133,33 +104,5 @@ std::unique_ptr<column> replace_with_backrefs(
std::string_view replacement,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
* @brief For each string, replaces any character sequence matching the given pattern
* using the replacement template for back-references.
*
* Any null string entries return corresponding null output column entries.
*
* See the @ref md_regex "Regex Features" page for details on patterns supported by this API.
*
* @deprecated Use @link replace_with_backrefs replace_with_backrefs(strings_column_view const&,
* regex_program const&, string_view, rmm::mr::device_memory_resource*) @endlink
*
* @throw cudf::logic_error if capture index values in `replacement` are not in range 0-99, and also
* if the index exceeds the group count specified in the pattern
*
* @param strings Strings instance for this operation.
* @param pattern The regular expression patterns to search within each string.
* @param replacement The replacement template for creating the output string.
* @param flags Regex flags for interpreting special characters in the pattern.
* @param mr Device memory resource used to allocate the returned column's device memory.
* @return New strings column.
*/
[[deprecated]] std::unique_ptr<column> replace_with_backrefs(
strings_column_view const& strings,
std::string_view pattern,
std::string_view replacement,
regex_flags const flags = regex_flags::DEFAULT,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

} // namespace strings
} // namespace cudf
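
The doc comments removed from contains.hpp described the per-row behaviour of `contains_re`, `matches_re`, and `count_re` with pseudo-code examples. Those semantics carry over to the `regex_program` overloads that remain. The sketch below reproduces the three documented examples on the CPU with `std::regex`, as a rough illustration only. It does not use libcudf: the helper names (`contains_rows`, `matches_rows`, `count_rows`, `check_examples`) are hypothetical, `std::regex` stands in for a pattern compiled once and reused, and the ECMAScript grammar may differ from the libcudf regex engine for other patterns.

```cpp
#include <cassert>
#include <iterator>
#include <regex>
#include <string>
#include <vector>

// Hypothetical CPU-side stand-ins for the documented per-row semantics.
// They use std::regex, not the libcudf regex engine or its column types.

// Like contains_re: true if the pattern matches anywhere in the row.
std::vector<bool> contains_rows(std::vector<std::string> const& rows, std::regex const& prog)
{
  std::vector<bool> out;
  for (auto const& s : rows) { out.push_back(std::regex_search(s, prog)); }
  return out;
}

// Like matches_re: true only if a match starts at the beginning of the row.
std::vector<bool> matches_rows(std::vector<std::string> const& rows, std::regex const& prog)
{
  std::vector<bool> out;
  for (auto const& s : rows) {
    out.push_back(std::regex_search(s, prog, std::regex_constants::match_continuous));
  }
  return out;
}

// Like count_re: number of non-overlapping matches in each row.
std::vector<int> count_rows(std::vector<std::string> const& rows, std::regex const& prog)
{
  std::vector<int> out;
  for (auto const& s : rows) {
    auto const first = std::sregex_iterator(s.begin(), s.end(), prog);
    out.push_back(static_cast<int>(std::distance(first, std::sregex_iterator())));
  }
  return out;
}

// Replays the examples from the removed doc comments.
void check_examples()
{
  std::vector<std::string> const s{"abc", "123", "def456"};
  std::regex const digits("\\d+");  // compiled once, reused for both calls
  assert((contains_rows(s, digits) == std::vector<bool>{false, true, true}));
  assert((matches_rows(s, digits) == std::vector<bool>{false, true, false}));

  std::vector<std::string> const t{"abc", "123", "def45"};
  std::regex const digit("\\d");
  assert((count_rows(t, digit) == std::vector<int>{0, 3, 2}));
}
```

Compiling the pattern once and passing it by reference mirrors the shape of the remaining API, where callers hand a `regex_program const&` to each function instead of a pattern string plus flags.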