Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(query): Procedure (Part1) #16348

Merged
merged 12 commits into from
Sep 5, 2024
Merged

feat(query): Procedure (Part1) #16348

merged 12 commits into from
Sep 5, 2024

Conversation

TCeason
Copy link
Collaborator

@TCeason TCeason commented Aug 29, 2024

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Now Databend support execute script.

We can use Procedure store the script in meta.

It's experimental now, If wants to use it should set global enable_experimental_procedure=1;

root@localhost:8001/default> CREATE PROCEDURE p1() RETURNS int not null LANGUAGE SQL COMMENT='test' AS $$
BEGIN
    LET x := -1;
    LET sum := 0;
    FOR x IN x TO x + 3 DO
        sum := sum + x;
    END FOR;
    RETURN sum;
END;
$$;

0 row written in 0.088 sec. Processed 0 row, 0B (0 row/s, 0B/s)

root@localhost:8001/default> show procedures;

SHOW procedures

-[ RECORD 1 ]-----------------------------------
        name: p1
procedure_id: 7467
   arguments: p1 RETURN (Int32)
     comment: test
 description: user-defined procedure
  created_on: 2024-08-29 07:51:51.049318

1 row read in 0.095 sec. Processed 1 row, 125B (10.56 rows/s, 1.29 KiB/s)

root@localhost:8001/default> call procedure p1();

call PROCEDURE p1()

-[ RECORD 1 ]-----------------------------------
Result: 2

1 row read in 0.295 sec. Processed 0 row, 0B (0 row/s, 0B/s)

fixes: Part of #14904

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

This change is Reviewable

@TCeason TCeason requested a review from drmingdrmer as a code owner August 29, 2024 08:05
@github-actions github-actions bot added the pr-feature this PR introduces a new feature to the codebase label Aug 29, 2024
@TCeason TCeason force-pushed the procedure branch 5 times, most recently from 05bc53a to 7aa130f Compare August 29, 2024 10:14
Copy link
Member

@drmingdrmer drmingdrmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the procedure need a two level mapping name->id->value? A simple name->value seems to be just enough.

Reviewed 7 of 45 files at r1, all commit messages.
Reviewable status: 7 of 45 files reviewed, 4 unresolved discussions (waiting on @andylokandy, @sundy-li, and @TCeason)


src/meta/api/src/util.rs line 461 at r1 (raw file):

        Err(KVAppError::AppError(AppError::UnknownProcedure(
            UnknownProcedure::new(

you can just use a generic UnknownError, for example
https://github.com/datafuselabs/databend/blob/5a2f8412239146d1a190dffc1469e2e0dacee391/src/meta/app/src/app_error.rs#L1114

Also the ExistError is also available.


src/meta/app/src/app_error.rs line 72 at r1 (raw file):

#[derive(Debug, Clone, PartialEq, Eq, thiserror::Error)]
#[error("ProcedureAlreadyExists: `{procedure_name}` while `{context}`")]
pub struct ProcedureAlreadyExists {

replace this error with ExistError:
https://github.com/datafuselabs/databend/blob/af5fe4c3384b220555a282a0c2e33b564d75e43c/src/meta/app/src/tenant_key/errors.rs#L25


src/meta/app/src/app_error.rs line 408 at r1 (raw file):

#[derive(thiserror::Error, Debug, Clone, PartialEq, Eq)]
#[error("UnknownProcedure: `{procedure_name}` while `{context}`")]
pub struct UnknownProcedure {

Replace this with UnknownError:
https://github.com/datafuselabs/databend/blob/af5fe4c3384b220555a282a0c2e33b564d75e43c/src/meta/app/src/tenant_key/errors.rs#L106


src/meta/app/src/principal/procedur_name_ident.rs line 50 at r1 (raw file):

        const TYPE: &'static str = "ProcedureNameIdent";
        const HAS_TENANT: bool = true;
        type ValueType = Id<ProcedureId>;

Define ProcedureId with DataId, which is more easy to use, for example:
https://github.com/datafuselabs/databend/blob/5a2f8412239146d1a190dffc1469e2e0dacee391/src/meta/app/src/sch
ema/catalog_id_ident.rs#L20

@TCeason
Copy link
Collaborator Author

TCeason commented Aug 29, 2024

Does the procedure need a two level mapping name->id->value? A simple name->value seems to be just enough.

Reviewed 7 of 45 files at r1, all commit messages.
Reviewable status: 7 of 45 files reviewed, 4 unresolved discussions (waiting on @andylokandy, @sundy-li, and @TCeason)


src/meta/api/src/util.rs line 461 at r1 (raw file):

        Err(KVAppError::AppError(AppError::UnknownProcedure(
            UnknownProcedure::new(

you can just use a generic UnknownError, for example
https://github.com/datafuselabs/databend/blob/5a2f8412239146d1a190dffc1469e2e0dacee391/src/meta/app/src/app_error.rs#L1114

Also the ExistError is also available.


src/meta/app/src/app_error.rs line 72 at r1 (raw file):

#[derive(Debug, Clone, PartialEq, Eq, thiserror::Error)]
#[error("ProcedureAlreadyExists: `{procedure_name}` while `{context}`")]
pub struct ProcedureAlreadyExists {

replace this error with ExistError:
https://github.com/datafuselabs/databend/blob/af5fe4c3384b220555a282a0c2e33b564d75e43c/src/meta/app/src/tenant_key/errors.rs#L25


src/meta/app/src/app_error.rs line 408 at r1 (raw file):

#[derive(thiserror::Error, Debug, Clone, PartialEq, Eq)]
#[error("UnknownProcedure: `{procedure_name}` while `{context}`")]
pub struct UnknownProcedure {

Replace this with UnknownError:
https://github.com/datafuselabs/databend/blob/af5fe4c3384b220555a282a0c2e33b564d75e43c/src/meta/app/src/tenant_key/errors.rs#L106


src/meta/app/src/principal/procedur_name_ident.rs line 50 at r1 (raw file):

        const TYPE: &'static str = "ProcedureNameIdent";
        const HAS_TENANT: bool = true;
        type ValueType = Id<ProcedureId>;

Define ProcedureId with DataId, which is more easy to use, for example:
https://github.com/datafuselabs/databend/blob/5a2f8412239146d1a190dffc1469e2e0dacee391/src/meta/app/src/sch
ema/catalog_id_ident.rs#L20

In future it will support rename and privilege assess. So I must store it Like db id or table id.

@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Sep 2, 2024
@drmingdrmer drmingdrmer requested a review from sundy-li September 2, 2024 04:06
Copy link
Member

@drmingdrmer drmingdrmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 7 of 45 files at r1, 5 of 12 files at r2, 1 of 12 files at r4.
Reviewable status: 10 of 46 files reviewed, 14 unresolved discussions (waiting on @andylokandy, @sundy-li, and @TCeason)


src/meta/app/src/principal/procedure.rs line 39 at r2 (raw file):

#[derive(serde::Serialize, serde::Deserialize, Clone, Debug, Default, Eq, PartialEq)]
pub struct ProcedureIdent {

This struct does not need serde AFAIK

Code quote:

#[derive(serde::Serialize, serde::Deserialize, Clone, Debug, Default, Eq, PartialEq)]
pub struct ProcedureIdent {

src/meta/app/src/principal/procedure.rs line 81 at r2 (raw file):

pub struct ProcedureIdList {
    pub id_list: Vec<u64>,
}

This struct does not need serde AFAIK, either

Code quote:

#[derive(serde::Serialize, serde::Deserialize, Clone, Debug, Eq, Default, PartialEq)]
pub struct ProcedureIdList {
    pub id_list: Vec<u64>,
}

src/meta/app/src/principal/procedure.rs line 151 at r2 (raw file):

                self.meta
            ),
        }

Such implementation is redundant. Reduce it by building a type name first:

Or just implement Display for CreateOption would be more elegant.

Suggestion:

impl Display for CreateProcedureReq {
    fn fmt(&self, f: &mut Formatter) -> std::fmt::Result {
        let typ = match self.create_option {
            CreateOption::Create => "create_procedure",
            CreateOption::CreateIfNotExists => "create_procedure_if_not_exists",
            CreateOption::CreateOrReplace => "create_or_replace_procedure",
        };
        write!(
                f,
                "{}:{}/{}={:?}",
                typ,
                self.name_ident.tenant_name(),
                self.name_ident.procedure_name(),
                self.meta
            ),

src/meta/app/src/principal/procedure.rs line 158 at r2 (raw file):

pub struct CreateProcedureReply {
    pub procedure_id: u64,
}

No serde is required.

Code quote:

#[derive(serde::Serialize, serde::Deserialize, Clone, Debug, Eq, PartialEq)]
pub struct CreateProcedureReply {
    pub procedure_id: u64,
}

src/meta/app/src/principal/procedure.rs line 180 at r2 (raw file):

#[derive(serde::Serialize, serde::Deserialize, Clone, Debug, PartialEq, Eq)]
pub struct RenameProcedureReply {}

remove empty reply and replace with ()

Code quote:

#[derive(serde::Serialize, serde::Deserialize, Clone, Debug, PartialEq, Eq)]
pub struct RenameProcedureReply {}

src/meta/app/src/principal/procedure.rs line 203 at r2 (raw file):

pub struct DropProcedureReply {
    pub procedure_id: u64,
}

no serde is required.

Code quote:

#[derive(serde::Serialize, serde::Deserialize, Clone, Debug, PartialEq, Eq)]
pub struct DropProcedureReply {
    pub procedure_id: u64,
}

src/meta/app/src/principal/procedure.rs line 264 at r2 (raw file):

    // include all dropped procedures
    IncludeDropped,
}

remove serde

Code quote:

#[derive(serde::Serialize, serde::Deserialize, Clone, Debug, PartialEq, Eq)]
pub enum ProcedureInfoFilter {
    // include all dropped procedures
    IncludeDropped,
}

src/meta/app/src/principal/procedure_id_to_name.rs line 75 at r2 (raw file):

    #[test]
    fn test_background_job_id_ident() {

not background_job anymore


src/meta/proto-conv/src/procedure_from_to_protobuf_impl.rs line 40 at r2 (raw file):

        reader_check_msg(p.ver, p.min_reader_ver)?;

        reader_check_msg(p.ver, p.min_reader_ver)?;

duplicated code

Code quote:

    fn from_pb(p: pb::ProcedureMeta) -> Result<Self, Incompatible> {
        reader_check_msg(p.ver, p.min_reader_ver)?;

        reader_check_msg(p.ver, p.min_reader_ver)?;

@TCeason
Copy link
Collaborator Author

TCeason commented Sep 2, 2024

    reader_check_msg(p.ver, p.min_reader_ver)?;


    
      
    

      
    

    
  

Done

Copy link
Member

@drmingdrmer drmingdrmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 45 files at r1, 1 of 12 files at r4, 12 of 18 files at r5, all commit messages.
Reviewable status: 21 of 46 files reviewed, 6 unresolved discussions (waiting on @andylokandy, @sundy-li, and @TCeason)

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Sep 2, 2024
Copy link
Member

@drmingdrmer drmingdrmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 18 files at r5, all commit messages.
Reviewable status: 22 of 46 files reviewed, 7 unresolved discussions (waiting on @andylokandy, @sundy-li, and @TCeason)


src/meta/app/src/principal/procedure.rs line 148 at r6 (raw file):

#[derive(serde::Serialize, serde::Deserialize, Clone, Debug, PartialEq, Eq)]
pub struct DropProcedureReply {

Needless serde

Code quote:

#[derive(serde::Serialize, serde::Deserialize, Clone, Debug, PartialEq, Eq)]
pub struct DropProcedureReply {

@TCeason TCeason requested a review from drmingdrmer September 3, 2024 01:23
@TCeason
Copy link
Collaborator Author

TCeason commented Sep 3, 2024

Reviewed 1 of 18 files at r5, all commit messages.
Reviewable status: 22 of 46 files reviewed, 7 unresolved discussions (waiting on @andylokandy, @sundy-li, and @TCeason)

src/meta/app/src/principal/procedure.rs line 148 at r6 (raw file):

#[derive(serde::Serialize, serde::Deserialize, Clone, Debug, PartialEq, Eq)]
pub struct DropProcedureReply {

Needless serde

Code quote:

#[derive(serde::Serialize, serde::Deserialize, Clone, Debug, PartialEq, Eq)]
pub struct DropProcedureReply {

remove Drop/Update's serde

@TCeason TCeason force-pushed the procedure branch 2 times, most recently from a824f9b to cf9931a Compare September 3, 2024 03:49
@drmingdrmer drmingdrmer requested a review from sundy-li September 3, 2024 04:18
Copy link
Member

@drmingdrmer drmingdrmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Just some nits. :)

Reviewed 13 of 45 files at r1, 1 of 12 files at r4, 4 of 18 files at r5, 1 of 1 files at r6, 7 of 7 files at r7, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @andylokandy, @sundy-li, and @TCeason)


src/query/sql/src/planner/binder/ddl/procedure.rs line 142 at r7 (raw file):

            return_types,
            created_on: Utc::now(),
            updated_on: Utc::now(),

Create time just once to avoid confusing different create/update time:

Suggestion:

        let t = Utc::now();
        Ok(ProcedureMeta {
            return_types,
            created_on: t,
            updated_on: t,

src/query/users/src/user_procedure.rs line 55 at r7 (raw file):

        Ok(())
    }
}

These wrapper functions seem unnecessary: UserApiProvider::procedure_api()::get/add/drop() looks good enough

Code quote:

impl UserApiProvider {
    // Add a new Procedure.
    #[async_backtrace::framed]
    pub async fn add_procedure(&self, tenant: &Tenant, req: CreateProcedureReq) -> Result<()> {
        let procedure_api = self.procedure_api(tenant);
        let _ = procedure_api.create_procedure(req).await?;
        Ok(())
    }

    #[async_backtrace::framed]
    pub async fn get_procedure(
        &self,
        tenant: &Tenant,
        req: GetProcedureReq,
    ) -> Result<GetProcedureReply> {
        let procedure_api = self.procedure_api(tenant);
        let procedure = procedure_api
            .get_procedure(&req)
            .await?
            .ok_or_else(|| AppError::from(req.inner.unknown_error("get_procedure")))?;
        Ok(procedure)
    }

    // Drop a Procedure by name.
    #[async_backtrace::framed]
    pub async fn drop_procedure(&self, tenant: &Tenant, req: DropProcedureReq) -> Result<()> {
        let _ = self.procedure_api(tenant).drop_procedure(req).await?;
        Ok(())
    }
}

Copy link
Member

@drmingdrmer drmingdrmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 20 of 20 files at r8, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @andylokandy, @sundy-li, and @TCeason)


src/meta/app/src/principal/procedure_identity.rs line 88 at r8 (raw file):

impl KeyCodec for ProcedureIdentity {
    fn encode_key(&self, b: KeyBuilder) -> KeyBuilder {
        b.push_str(&self.encode())

It would be better with b.push_str(&self.name).push_str(&self.args). push_str() escapes special chars. self.encode() is not required.

@dosubot dosubot bot removed the lgtm This PR has been approved by a maintainer label Sep 5, 2024
@TCeason
Copy link
Collaborator Author

TCeason commented Sep 5, 2024

Reviewed 20 of 20 files at r8, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @andylokandy, @sundy-li, and @TCeason)

src/meta/app/src/principal/procedure_identity.rs line 88 at r8 (raw file):

impl KeyCodec for ProcedureIdentity {
    fn encode_key(&self, b: KeyBuilder) -> KeyBuilder {
        b.push_str(&self.encode())

It would be better with b.push_str(&self.name).push_str(&self.args). push_str() escapes special chars. self.encode() is not required.

If two procedure first is named p1 with int arg and ,second is named p1int with none args:

p1(int)
p1int()

We need to split them. b.push_str(&self.name).push_str(&self.args) can do this?

pub fn push_raw(mut self, s: &str) -> Self {
        if !self.buf.is_empty() {
            // `/`
            self.buf.push(0x2f);
        }

        self.buf.extend_from_slice(s.as_bytes());
        self
    }

Oh, push_raw can push a /, maybe I will do a test on local.

@drmingdrmer
Copy link
Member

Reviewed 20 of 20 files at r8, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @andylokandy, @sundy-li, and @TCeason)

src/meta/app/src/principal/procedure_identity.rs line 88 at r8 (raw file):

impl KeyCodec for ProcedureIdentity {
    fn encode_key(&self, b: KeyBuilder) -> KeyBuilder {
        b.push_str(&self.encode())

It would be better with b.push_str(&self.name).push_str(&self.args). push_str() escapes special chars. self.encode() is not required.

If two procedure first is named p1 with int arg and ,second is named p1int with none args:

p1(int)
p1int()

We need to split them. b.push_str(&self.name).push_str(&self.args) can do this?

yes.

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Sep 5, 2024
Copy link
Member

@drmingdrmer drmingdrmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 25 of 25 files at r9, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @andylokandy, @sundy-li, and @TCeason)

Copy link
Member

@drmingdrmer drmingdrmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r10, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @andylokandy, @sundy-li, and @TCeason)

@BohuTANG BohuTANG merged commit 8b19e5e into databendlabs:main Sep 5, 2024
74 of 75 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm This PR has been approved by a maintainer pr-feature this PR introduces a new feature to the codebase size:XXL This PR changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants