Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Postgres replication #7

Merged
merged 2 commits into from
Jan 3, 2024
Merged

Postgres replication #7

merged 2 commits into from
Jan 3, 2024

Conversation

panosfol
Copy link
Collaborator

This PR contains the following:

  • pg_connect: Connection to the origin and the target database.
  • replicate: Utilize the pg_dump and pg_restore commands to replicate the database.
  • Utilizing the log.h library.

* `PATH=/path/to/pg_command` therefore we also need 5 characters for
* `PATH=` + 1 for '\0'.
*/
char *pg_bin, prefixed_command_path[261] = "PATH=";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need PATH= in this variable?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Postgres needs PATH env variable to find the commands, and according to the docs it needs to be in that format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If i remove it I get an error from the execve command:
could not find a "pg_dump" to execute.

The process will still run though but the only way to stop the error from appearing is to have the path with PATH= prefix.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the execve docs, the env variables should be of the format key=values, maybe thats why the error appears.

Copy link
Contributor

@charmitro charmitro Dec 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, other database driver might also use *ve due to their variables. Let's keep this consistent.

Do you think a wrapper for using execve is wise? It might help engineers use run their executable and avoid bloated code, such as

ret = execute_command(command, args, env_vars, stdout, stderr .....)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would a wrapper work in this case? Is it to make it more contribution-friendly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid rewriting these again and again.

cnc/src/postgres.c

Lines 206 to 230 in c324eec

p_dump = fork();
if (p_dump < 0) {
if (get_verbose()) {
pr_info("`pg_dump` process failed to start: %s. Replication process will be terminated.\n",
strerror(errno));
} else {
printf("%s\n", strerror(errno));
}
free(dump_pass);
free(restore_pass);
return -2;
} else if (p_dump == 0) {
execve(strncat(command_path, "pg_dump", strlen("pg_dump") + 1),
dump_args, dump_envp);
pr_info("`pg_dump` error: %s\n", strerror(errno));
exit(-1);
}
wait(&wstatus);
// Check the status of the forked process, anything other than 0 means failure.
if (wstatus != 0) {
pr_info("`pg_dump` was unsuccessful. Replication process will be terminated.\n");
free(dump_pass);
free(restore_pass);
return -1;
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok got it!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do it as a follow up. Proceed with the PR as it is.

@charmitro
Copy link
Contributor

valgrind --leak-check=full --show-leak-kinds=all ./cnc -f test.yaml -v
==1745997== Memcheck, a memory error detector
==1745997== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==1745997== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==1745997== Command: ./cnc -f test.yaml -v
==1745997== 
==1745997== Conditional jump or move depends on uninitialised value(s)
==1745997==    at 0x484663C: strncat (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1745997==    by 0x10A033: replicate (postgres.c:195)
==1745997==    by 0x109581: execute_db_operations (db.c:34)
==1745997==    by 0x10968A: main (main.c:45)
==1745997== 
==1745997== Conditional jump or move depends on uninitialised value(s)
==1745997==    at 0x4846647: strncat (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1745997==    by 0x10A033: replicate (postgres.c:195)
==1745997==    by 0x109581: execute_db_operations (db.c:34)
==1745997==    by 0x10968A: main (main.c:45)
==1745997== 
==1746000== Syscall param execve(envp[i]) points to uninitialised byte(s)
==1746000==    at 0x4998917: execve (syscall-template.S:120)
==1746000==    by 0x10A1D1: replicate (postgres.c:220)
==1746000==    by 0x109581: execute_db_operations (db.c:34)
==1746000==    by 0x10968A: main (main.c:45)
==1746000==  Address 0x1fff000040 is on thread 1's stack
==1746000==  in frame #1, created by replicate (postgres.c:109)
==1746000== 
==1746002== Syscall param execve(envp[i]) points to uninitialised byte(s)
==1746002==    at 0x4998917: execve (syscall-template.S:120)
==1746002==    by 0x10A41B: replicate (postgres.c:250)
==1746002==    by 0x109581: execute_db_operations (db.c:34)
==1746002==    by 0x10968A: main (main.c:45)
==1746002==  Address 0x1fff000040 is on thread 1's stack
==1746002==  in frame #1, created by replicate (postgres.c:109)
==1746002== 
test.yaml
construct
VERBOSE:connect_pg	Origin-database connection: Success!
VERBOSE:connect_pg	Target-database connection: Success!
VERBOSE:replicate	Checking if $PG_BIN environmental variable.
VERBOSE:replicate	$PG_BIN was not found, default path /usr/bin/ will be used.
VERBOSE:replicate	Starting `pg_dump` process.
VERBOSE:replicate	Starting `pg_restore` process.
VERBOSE:replicate	Database Replication was successful!
==1745997== 
==1745997== HEAP SUMMARY:
==1745997==     in use at exit: 233 bytes in 5 blocks
==1745997==   total heap usage: 59,760 allocs, 59,755 frees, 4,888,693 bytes allocated
==1745997== 
==1745997== 8 bytes in 1 blocks are still reachable in loss record 1 of 5
==1745997==    at 0x48407B4: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1745997==    by 0x50192C9: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50193A1: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50194E1: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50179EC: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50184CF: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50049D1: gss_add_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5004FC8: gss_acquire_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5005265: gss_acquire_cred (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x48961CF: ??? (in /usr/lib/x86_64-linux-gnu/libpq.so.5.15)
==1745997==    by 0x488005B: PQconnectPoll (in /usr/lib/x86_64-linux-gnu/libpq.so.5.15)
==1745997==    by 0x48809ED: ??? (in /usr/lib/x86_64-linux-gnu/libpq.so.5.15)
==1745997== 
==1745997== 24 bytes in 1 blocks are still reachable in loss record 2 of 5
==1745997==    at 0x48407B4: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1745997==    by 0x5019273: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50193A1: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50194E1: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50179EC: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50184CF: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50049D1: gss_add_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5004FC8: gss_acquire_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5005265: gss_acquire_cred (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x48961CF: ??? (in /usr/lib/x86_64-linux-gnu/libpq.so.5.15)
==1745997==    by 0x488005B: PQconnectPoll (in /usr/lib/x86_64-linux-gnu/libpq.so.5.15)
==1745997==    by 0x48809ED: ??? (in /usr/lib/x86_64-linux-gnu/libpq.so.5.15)
==1745997== 
==1745997== 48 bytes in 1 blocks are still reachable in loss record 3 of 5
==1745997==    at 0x48407B4: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1745997==    by 0x51B7F0C: krb5int_setspecific (in /usr/lib/x86_64-linux-gnu/libkrb5support.so.0.1)
==1745997==    by 0x50192E5: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50193A1: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50194E1: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50179EC: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50184CF: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50049D1: gss_add_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5004FC8: gss_acquire_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5005265: gss_acquire_cred (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x48961CF: ??? (in /usr/lib/x86_64-linux-gnu/libpq.so.5.15)
==1745997==    by 0x488005B: PQconnectPoll (in /usr/lib/x86_64-linux-gnu/libpq.so.5.15)
==1745997== 
==1745997== 73 bytes in 1 blocks are still reachable in loss record 4 of 5
==1745997==    at 0x48407B4: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1745997==    by 0x49627F9: strdup (strdup.c:42)
==1745997==    by 0x501938F: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50194E1: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50179EC: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50184CF: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50049D1: gss_add_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5004FC8: gss_acquire_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x502D0B1: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5031D33: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50049D1: gss_add_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5004FC8: gss_acquire_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997== 
==1745997== 80 bytes in 1 blocks are still reachable in loss record 5 of 5
==1745997==    at 0x48455EF: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1745997==    by 0x109489: execute_db_operations (db.c:17)
==1745997==    by 0x10968A: main (main.c:45)
==1745997== 
==1745997== LEAK SUMMARY:
==1745997==    definitely lost: 0 bytes in 0 blocks
==1745997==    indirectly lost: 0 bytes in 0 blocks
==1745997==      possibly lost: 0 bytes in 0 blocks
==1745997==    still reachable: 233 bytes in 5 blocks
==1745997==         suppressed: 0 bytes in 0 blocks
==1745997== 
==1745997== Use --track-origins=yes to see where uninitialised values come from
==1745997== For lists of detected and suppressed errors, rerun with: -s
==1745997== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

@charmitro
Copy link
Contributor

valgrind --leak-check=full --show-leak-kinds=all ./cnc -f test.yaml -v
==1745997== Memcheck, a memory error detector
==1745997== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==1745997== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==1745997== Command: ./cnc -f test.yaml -v
==1745997== 
==1745997== Conditional jump or move depends on uninitialised value(s)
==1745997==    at 0x484663C: strncat (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1745997==    by 0x10A033: replicate (postgres.c:195)
==1745997==    by 0x109581: execute_db_operations (db.c:34)
==1745997==    by 0x10968A: main (main.c:45)
==1745997== 
==1745997== Conditional jump or move depends on uninitialised value(s)
==1745997==    at 0x4846647: strncat (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1745997==    by 0x10A033: replicate (postgres.c:195)
==1745997==    by 0x109581: execute_db_operations (db.c:34)
==1745997==    by 0x10968A: main (main.c:45)
==1745997== 
==1746000== Syscall param execve(envp[i]) points to uninitialised byte(s)
==1746000==    at 0x4998917: execve (syscall-template.S:120)
==1746000==    by 0x10A1D1: replicate (postgres.c:220)
==1746000==    by 0x109581: execute_db_operations (db.c:34)
==1746000==    by 0x10968A: main (main.c:45)
==1746000==  Address 0x1fff000040 is on thread 1's stack
==1746000==  in frame #1, created by replicate (postgres.c:109)
==1746000== 
==1746002== Syscall param execve(envp[i]) points to uninitialised byte(s)
==1746002==    at 0x4998917: execve (syscall-template.S:120)
==1746002==    by 0x10A41B: replicate (postgres.c:250)
==1746002==    by 0x109581: execute_db_operations (db.c:34)
==1746002==    by 0x10968A: main (main.c:45)
==1746002==  Address 0x1fff000040 is on thread 1's stack
==1746002==  in frame #1, created by replicate (postgres.c:109)
==1746002== 
test.yaml
construct
VERBOSE:connect_pg	Origin-database connection: Success!
VERBOSE:connect_pg	Target-database connection: Success!
VERBOSE:replicate	Checking if $PG_BIN environmental variable.
VERBOSE:replicate	$PG_BIN was not found, default path /usr/bin/ will be used.
VERBOSE:replicate	Starting `pg_dump` process.
VERBOSE:replicate	Starting `pg_restore` process.
VERBOSE:replicate	Database Replication was successful!
==1745997== 
==1745997== HEAP SUMMARY:
==1745997==     in use at exit: 233 bytes in 5 blocks
==1745997==   total heap usage: 59,760 allocs, 59,755 frees, 4,888,693 bytes allocated
==1745997== 
==1745997== 8 bytes in 1 blocks are still reachable in loss record 1 of 5
==1745997==    at 0x48407B4: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1745997==    by 0x50192C9: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50193A1: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50194E1: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50179EC: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50184CF: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50049D1: gss_add_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5004FC8: gss_acquire_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5005265: gss_acquire_cred (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x48961CF: ??? (in /usr/lib/x86_64-linux-gnu/libpq.so.5.15)
==1745997==    by 0x488005B: PQconnectPoll (in /usr/lib/x86_64-linux-gnu/libpq.so.5.15)
==1745997==    by 0x48809ED: ??? (in /usr/lib/x86_64-linux-gnu/libpq.so.5.15)
==1745997== 
==1745997== 24 bytes in 1 blocks are still reachable in loss record 2 of 5
==1745997==    at 0x48407B4: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1745997==    by 0x5019273: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50193A1: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50194E1: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50179EC: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50184CF: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50049D1: gss_add_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5004FC8: gss_acquire_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5005265: gss_acquire_cred (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x48961CF: ??? (in /usr/lib/x86_64-linux-gnu/libpq.so.5.15)
==1745997==    by 0x488005B: PQconnectPoll (in /usr/lib/x86_64-linux-gnu/libpq.so.5.15)
==1745997==    by 0x48809ED: ??? (in /usr/lib/x86_64-linux-gnu/libpq.so.5.15)
==1745997== 
==1745997== 48 bytes in 1 blocks are still reachable in loss record 3 of 5
==1745997==    at 0x48407B4: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1745997==    by 0x51B7F0C: krb5int_setspecific (in /usr/lib/x86_64-linux-gnu/libkrb5support.so.0.1)
==1745997==    by 0x50192E5: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50193A1: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50194E1: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50179EC: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50184CF: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50049D1: gss_add_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5004FC8: gss_acquire_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5005265: gss_acquire_cred (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x48961CF: ??? (in /usr/lib/x86_64-linux-gnu/libpq.so.5.15)
==1745997==    by 0x488005B: PQconnectPoll (in /usr/lib/x86_64-linux-gnu/libpq.so.5.15)
==1745997== 
==1745997== 73 bytes in 1 blocks are still reachable in loss record 4 of 5
==1745997==    at 0x48407B4: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1745997==    by 0x49627F9: strdup (strdup.c:42)
==1745997==    by 0x501938F: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50194E1: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50179EC: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50184CF: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50049D1: gss_add_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5004FC8: gss_acquire_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x502D0B1: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5031D33: ??? (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x50049D1: gss_add_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997==    by 0x5004FC8: gss_acquire_cred_from (in /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2)
==1745997== 
==1745997== 80 bytes in 1 blocks are still reachable in loss record 5 of 5
==1745997==    at 0x48455EF: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1745997==    by 0x109489: execute_db_operations (db.c:17)
==1745997==    by 0x10968A: main (main.c:45)
==1745997== 
==1745997== LEAK SUMMARY:
==1745997==    definitely lost: 0 bytes in 0 blocks
==1745997==    indirectly lost: 0 bytes in 0 blocks
==1745997==      possibly lost: 0 bytes in 0 blocks
==1745997==    still reachable: 233 bytes in 5 blocks
==1745997==         suppressed: 0 bytes in 0 blocks
==1745997== 
==1745997== Use --track-origins=yes to see where uninitialised values come from
==1745997== For lists of detected and suppressed errors, rerun with: -s
==1745997== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

Nevermind this, was testing the wrong patch.

Copy link
Contributor

@charmitro charmitro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should rename pr_info to pr_debug or pr_warn to avoid confusing it as the programs main STDOUT.

pr_info is intended for verbose output only. Please introduce regular printf's as users will probably (or always) not run with -v enabled.

src/postgres.c Outdated
* 0 Success
* -1 Failure to connect
*/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove newline.

free(pg_db_t->pg_conf);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add newline.

src/postgres.c Outdated
Comment on lines 267 to 268
}
void construct_pg_password(char *env_pass, const char *config_pass)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add newline.

src/postgres.c Outdated
Comment on lines 142 to 146
/*
* This will be used as the path for the output of `pg_dump` and the input
* of the `pg_restore`, and it will be deleted before we return from `replicate()`
*/
char dump_path[] = "/tmp/backup.dump";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two notes about this.

  1. We should keep it after the procedure is done for obvious reasons (debugging, examining etc.)
  2. Don't use /tmp/ if possible.

src/postgres.c Outdated
free(restore_pass);
return -2;
} else if (p_dump == 0) {
execve(strncat(command_path, "pg_dump", strlen("pg_dump") + 1),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid calling functions inside other functions arguments.

src/postgres.c Outdated
Comment on lines 205 to 241
pr_info("Starting `pg_dump` process.\n");
p_dump = fork();
if (p_dump < 0) {
if (get_verbose()) {
pr_info("`pg_dump` process failed to start: %s. Replication process will be terminated.\n",
strerror(errno));
} else {
printf("%s\n", strerror(errno));
}
free(dump_pass);
free(restore_pass);
return -2;
} else if (p_dump == 0) {
execve(strncat(command_path, "pg_dump", strlen("pg_dump") + 1),
dump_args, dump_envp);
pr_info("`pg_dump` error: %s\n", strerror(errno));
exit(-1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use pipe1 & dup22 for this instead of just exiting with an error code. Using these will allow us to see the actual output(if any) from the children process.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we dont exit there, the child process will continue running in parallel with the parent process, and we need to avoid that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not if you kill the children process with its PID.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ok i see thanks!

Copy link
Collaborator Author

@panosfol panosfol Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im confused in a couple of things with this.
First using pipe and dup2 I can actually read the output of the command even if the command fails., so it is ok if we exit(-1) if we fail.

Second, i don't see the reason using pipes since execve will either succeed, therefore they will be no output, or it will fail, and we can see the error with pr_info("`pg_dump` error: %s\n", strerror(errno));.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exiting there is wrong, we need to do stuff prior to exiting our program. If you remember our design we would have logs being kept in order to save previous runs status and also send them by email.

Plus, all your pg_* commands should run with verbose on (i.e. pg_dump -v ...).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we don't exit the program there we just exit the child process, we can still do all the things we need in this block after:

	wait(&wstatus);
	// Check the status of the forked process, anything other than 0 means failure.
	if (wstatus != 0) {
		if (verbose) {
			pr_debug(
				"`pg_dump` was unsuccessful. Replication process will be terminated.\n");
		} else {
			printf("`pg_dump`: failed\n");
		}
		free(dump_pass);
		free(restore_pass);
		return -1;
	}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless of this, we need to hold ALL STDOUT and STDERR in order to send it somewhere.

By "exiting" I mean losing the process information.

src/postgres.c Outdated
Comment on lines 268 to 352
void construct_pg_password(char *env_pass, const char *config_pass)
{
strncpy(env_pass, "PGPASSWORD=", strlen("PGPASSWORD=") + 1);
strncat(env_pass, config_pass, strlen(config_pass));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a parameter len to the function to avoid multiple invocations of strlen here.

This commit also contains some minor memory fixes.

Signed-off-by: Panagiotis Foliadis <pfoliadis@hotmail.com>
src/postgres.c Outdated
execve(strncat(command_path, "pg_dump", strlen("pg_dump") + 1),
dump_args, dump_envp);
pr_info("`pg_dump` error: %s\n", strerror(errno));
exit(-1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why exit here? According to our design there is a "emailing" step you need to take into account. Exiting will not work.

src/postgres.c Outdated
Comment on lines 208 to 210
if (get_verbose()) {
pr_info("`pg_dump` process failed to start: %s. Replication process will be terminated.\n",
strerror(errno));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pr_info already does the verbose check.

cnc/include/log.h

Lines 7 to 16 in a91f31b

/* Only if `verbose` is enabled (throught "-v" flag)
* this print is going to print to STDOUT. Name
* "pr_..." is inspired by pr_{info|warn|error} of
* the Linux Kernel.
*/
#define pr_debug(...) \
if (get_verbose()) { \
fprintf(stdout, "VERBOSE:%s\t", __func__); \
fprintf(stdout, __VA_ARGS__); \
}

@charmitro
Copy link
Contributor

I did not review it, but with a very quick look I believe the pipes of the second command (pg_restore) are not redirecting due to them being closed at that time.

@panosfol
Copy link
Collaborator Author

I did not review it, but with a very quick look I believe the pipes of the second command (pg_restore) are not redirecting due to them being closed at that time.

I tested it and it seems to work, not sure if there is an edge case that might fail.
I get both the verbose output of pg_dump and pg_restore.

@charmitro
Copy link
Contributor

charmitro commented Dec 20, 2023 via email

@panosfol
Copy link
Collaborator Author

Panagiotis Foliadis @.***> writes:
I did not review it, but with a very quick look I believe the pipes of the second command (pg_restore) are not redirecting due to them being closed at that time. I tested it and it seems to work, not sure if there is an edge case that might fail. I get both the verbose output of pg_dump and pg_restore.
"read_buffer" is where the output should come from, right? If that is the case, you can test it by adding a debug word in the "printf" of the function. I can see it for pg_dump but not for pg_restore. c printf("%s %s", __func__, buffer);

Ok I see what happened, you are right. The output of pg_restore was being printed but not through the buffer.
Shoud I use another pipe?

@charmitro
Copy link
Contributor

charmitro commented Dec 20, 2023 via email

src/postgres.c Outdated
Comment on lines 223 to 246
p_dump = fork();
if (p_dump == -1) {
pr_debug(
"`pg_dump` process failed to start: %s.\n Replication process will be terminated.\n",
strerror(errno));
printf("%s\n", strerror(errno));
free(dump_pass);
free(restore_pass);
return -2;
} else if (p_dump == 0) {
char *pg_dump_path =
strncat(command_path, "pg_dump", strlen("pg_dump") + 1);
dup2(pipefd[1], STDERR_FILENO);
close(pipefd[0]);
close(pipefd[1]);
execve(pg_dump_path, dump_args, dump_envp);
printf("Error executing `pg_dump`\n");
pr_debug("`pg_dump` error: %s\n", strerror(errno));
exit(-1);
}
close(pipefd[1]);
read_buffer(pipefd);

wait(&wstatus);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#define READ_END 0
#define WRITE_END 1


        p_dump = fork();
        if (p_dump == 0) {
                char *pg_dump_path =
                        strncat(command_path, "pg_dump", strlen("pg_dump") + 1);
                close(pipefd[READ_END]);
                dup2(pipefd[1], STDOUT_FILENO);
                dup2(pipefd[1], STDERR_FILENO);

                close(pipefd[READ_END]);
                close(pipefd[WRITE_END]);
                execve(pg_dump_path, dump_args, dump_envp);
                printf("Error executing `pg_dump`\n");
                pr_debug("`pg_dump` error: %s\n", strerror(errno));
                exit(-1);
        } else {
                        pr_debug(
                                "`pg_dump` process failed to start: %s.\n Replication process will be terminated.\n",
                                strerror(errno));
                        printf("%s\n", strerror(errno));
                        return <return code>
        }

        close(pipefd[WRITE_END]);
        read_buffer(pipefd);

        wait(&wstatus);

@charmitro charmitro marked this pull request as draft December 26, 2023 13:46
@panosfol panosfol marked this pull request as ready for review December 30, 2023 09:12
@charmitro
Copy link
Contributor

charmitro commented Dec 31, 2023 via email

@panosfol panosfol force-pushed the pg-connect branch 2 times, most recently from ccea3d3 to 1c00955 Compare December 31, 2023 19:08
src/postgres.c Outdated
Comment on lines 124 to 126
char *pg_pass =
(char *)calloc(prefixed_command_pass_size,
prefixed_command_pass_size * sizeof(char));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
char *pg_pass =
(char *)calloc(prefixed_command_pass_size,
prefixed_command_pass_size * sizeof(char));
char *pg_pass = calloc(prefixed_command_pass_size, sizeof(char));

This should be enough. Casting and multiplying size with sizeof(char) is not needed also.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I run valgrind after making these changes, and I noticed that if i dont cast it as char * I get one more heap alloc, and one more free. Not sure if we care about this but I found it interesting. Should I cast it after all?

Copy link
Collaborator Author

@panosfol panosfol Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok that was a mistake from my side. If i DONT cast i get these results so I will for sure not cast.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I can't seem to figure it out, I still get one random alloc more irregardless if i cast as char * or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you are referring to. This is valgrind outputs (diffed) before and after.

--- #<buffer val2>
+++ #<buffer val>
@@ -1,9 +1,9 @@
 valgrind --leak-check=full ./cnc -f test.yaml
-==610146== Memcheck, a memory error detector
-==610146== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
-==610146== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
-==610146== Command: ./cnc -f test.yaml
-==610146== 
+==610191== Memcheck, a memory error detector
+==610191== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
+==610191== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
+==610191== Command: ./cnc -f test.yaml
+==610191== 
 test.yaml
 pg_dump: last built-in OID is 16383
 pg_dump: reading extensions
@@ -57,19 +57,19 @@
 pg_restore: connecting to database for restore
 pg_restore: implied data-only restore
 Database Replication was successful!
-==610146== 
-==610146== HEAP SUMMARY:
-==610146==     in use at exit: 153 bytes in 4 blocks
-==610146==   total heap usage: 59,762 allocs, 59,758 frees, 4,888,620 bytes allocated
-==610146== 
-==610146== LEAK SUMMARY:
-==610146==    definitely lost: 0 bytes in 0 blocks
-==610146==    indirectly lost: 0 bytes in 0 blocks
-==610146==      possibly lost: 0 bytes in 0 blocks
-==610146==    still reachable: 153 bytes in 4 blocks
-==610146==         suppressed: 0 bytes in 0 blocks
-==610146== Reachable blocks (those to which a pointer was found) are not shown.
-==610146== To see them, rerun with: --leak-check=full --show-leak-kinds=all
-==610146== 
-==610146== For lists of detected and suppressed errors, rerun with: -s
-==610146== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
+==610191== 
+==610191== HEAP SUMMARY:
+==610191==     in use at exit: 153 bytes in 4 blocks
+==610191==   total heap usage: 59,762 allocs, 59,758 frees, 4,888,240 bytes allocated
+==610191== 
+==610191== LEAK SUMMARY:
+==610191==    definitely lost: 0 bytes in 0 blocks
+==610191==    indirectly lost: 0 bytes in 0 blocks
+==610191==      possibly lost: 0 bytes in 0 blocks
+==610191==    still reachable: 153 bytes in 4 blocks
+==610191==         suppressed: 0 bytes in 0 blocks
+==610191== Reachable blocks (those to which a pointer was found) are not shown.
+==610191== To see them, rerun with: --leak-check=full --show-leak-kinds=all
+==610191== 
+==610191== For lists of detected and suppressed errors, rerun with: -s
+==610191== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I run valgrind 2 times consecutively without changing anything. First output:

==610288== 
==610288== HEAP SUMMARY:
==610288==     in use at exit: 153 bytes in 4 blocks
==610288==   total heap usage: 59,762 allocs, 59,758 frees, 4,885,168 bytes allocated
==610288== 
==610288== LEAK SUMMARY:
==610288==    definitely lost: 0 bytes in 0 blocks
==610288==    indirectly lost: 0 bytes in 0 blocks
==610288==      possibly lost: 0 bytes in 0 blocks
==610288==    still reachable: 153 bytes in 4 blocks
==610288==         suppressed: 0 bytes in 0 blocks
==610288== Reachable blocks (those to which a pointer was found) are not shown.
==610288== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==610288== 
==610288== For lists of detected and suppressed errors, rerun with: -s
==610288== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Second output:

==610297== 
==610297== HEAP SUMMARY:
==610297==     in use at exit: 153 bytes in 4 blocks
==610297==   total heap usage: 59,763 allocs, 59,759 frees, 4,917,984 bytes allocated
==610297== 
==610297== LEAK SUMMARY:
==610297==    definitely lost: 0 bytes in 0 blocks
==610297==    indirectly lost: 0 bytes in 0 blocks
==610297==      possibly lost: 0 bytes in 0 blocks
==610297==    still reachable: 153 bytes in 4 blocks
==610297==         suppressed: 0 bytes in 0 blocks
==610297== Reachable blocks (those to which a pointer was found) are not shown.
==610297== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==610297== 
==610297== For lists of detected and suppressed errors, rerun with: -s
==610297== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Copy link
Contributor

@charmitro charmitro Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really care about these allocations , we should care about total byte allocations, which is much lower with the change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read wrong probably. Can you diff your code and paste it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree but why do I get 2 different outcomes without changing anything? Is it an OS thing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know which gcc and valgrind versions you use. GCC can have differences in the assembly generations between versions.

src/postgres.c Outdated
Comment on lines 115 to 116
int command_pass_size = strlen(pg_db_t->pg_conf->origin_password) + 1;
int prefixed_command_pass_size = PG_PASS_PREFIX + command_pass_size;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue that command_pass is not the correct variable name for these. Maybe pg_pass_size or pass_env_var_size or whatever?

src/postgres.c Outdated
int command_pass_size = strlen(pg_db_t->pg_conf->origin_password) + 1;
int prefixed_command_pass_size = PG_PASS_PREFIX + command_pass_size;
int pg_command_size = strlen(PG_DUMP_COMMAND);
int prefix_pass_size = strlen("PGPASSWORD=");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this variable? Isn't PG_PASS_PREFIX the same?

src/postgres.c Outdated
Comment on lines 299 to 307
/*
* execve_binary
*
* 0 Success
* -1 Failure of`pg_dump` or `pg_restore`
* -2 Failure of creation of fork() or pipe()
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe some documentation here?

src/postgres.c Outdated
Comment on lines 100 to 116
/*
* replicate
*
* 0 Success
* -1 Failure of`pg_dump` or `pg_restore`
* -2 Failure of creation of fork() or pipe()
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe some documentation here?

The main logic of `replicate` is achieved by using `fork`
to create two new processes, one for `pg_dump` and one for
`pg_restore`. The commit also contains utilization of `log.h`
library and some fixes for better error handling.

Signed-off-by: Panagiotis Foliadis <pfoliadis@hotmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants