Changing xcat from SQLite to postgres threw away node switch information. #2037

Closed
ralphbellofatto opened this issue Oct 26, 2016 · 9 comments

Comments

@ralphbellofatto

While closing out this issue we found a problem with xcat:
https://gitlabhost.rtp.raleigh.ibm.com/4A8491897/research-92-cluster/issues/141

We migrated our xcat db from SQLite to postgres by following the directions here:

http://xcat-docs.readthedocs.io/en/stable/advanced/hierarchy/databases/postgres_configure.html

specifically we executed:

1) yum install postgresql*
yum install perl-DBD-Pg

2) pgsqlsetup -i -V  (-V will spit out verbose output, looking at trimming this down some..) 

That should automatically migrate the existing SQLite DB to PostgreSQL 

1:59:21 PM: Will look like this: 

[root@fs4 log]# lsxcatd -a
Version 2.12.4 (git commit 2bf96ee97df3e3a04ef72c42f2e756af22ecad5a, built Tue Oct 25 11:15:48 EDT 2016)
This is a Management Node
cfgloc=Pg:dbname=xcatdb;host=192.168.3.25|xcatadm
dbengine=Pg
dbname=xcatdb
dbhost=192.168.3.25
dbadmin=xcatadm

[root@fs4 log]# cat /etc/xcat/cfgloc 
Pg:dbname=xcatdb;host=192.168.3.25|xcatadm|XXXXX
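
For anyone checking the result of the migration from a script, here is a minimal sketch that splits a cfgloc line into the same fields `lsxcatd -a` prints. The layout (`engine:key=value;...|admin|password`) is only inferred from the output above, and `parse_cfgloc` is a hypothetical helper, not part of xCAT.

```python
def parse_cfgloc(line: str) -> dict:
    """Split a cfgloc line into its parts; the password is deliberately dropped."""
    # Assumed layout, inferred from the example: <engine>:<k=v;k=v>|<admin>|<password>
    dsn, admin, *_password = line.strip().split("|")
    engine, _, params = dsn.partition(":")
    fields = dict(p.split("=", 1) for p in params.split(";") if p)
    return {
        "dbengine": engine,
        "dbname": fields.get("dbname"),
        "dbhost": fields.get("host"),
        "dbadmin": admin,
    }


info = parse_cfgloc("Pg:dbname=xcatdb;host=192.168.3.25|xcatadm|XXXXX")
print(info)
```

The password field is optional in this sketch, so the shorter form shown by `lsxcatd -a` parses the same way.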

When we were done, we compared the output of the lsdef -l command with its output from before the migration and found that the following two fields are missing on every node:

switch=c460tors01
switchport=3

The following attachment contains the before(orig) and after(new) node definitions extracted by xcat.
xcat-postgres-migrate.zip
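
The before/after comparison can be sketched roughly as follows. This assumes the `Object name:` / indented `attr=value` stanza layout that lsdef prints; the sample stanzas are illustrative (the `groups=all` line is made up), and `parse_lsdef` and `missing_attrs` are hypothetical helper names.

```python
def parse_lsdef(text: str) -> dict:
    """Parse lsdef-style stanzas into {object name: {attr: value}}."""
    objects, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Object name:"):
            current = objects.setdefault(line.split(":", 1)[1].strip(), {})
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key] = value
    return objects


def missing_attrs(before: dict, after: dict) -> dict:
    """Return attributes that had a value before but are absent or empty after."""
    lost = {}
    for name, attrs in before.items():
        new = after.get(name, {})
        gone = {k: v for k, v in attrs.items() if v and not new.get(k)}
        if gone:
            lost[name] = gone
    return lost


# Illustrative data only; the real definitions are in the attached zip.
orig = """Object name: c460c001
    groups=all
    switch=c460tors01
    switchport=3
"""
new = """Object name: c460c001
    groups=all
"""
lost = missing_attrs(parse_lsdef(orig), parse_lsdef(new))
print(lost)
```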

@ralphbellofatto
Author

NOTE:
We attempted to restore the above configuration with
cat node.lsdef | chdef -z

And it failed to update the switch ports...

We extracted the data by hand from the lsdef file and then constructed a script as follows:

chdef c460c001 switch=c460tors01 switchport=3
chdef c460c002 switch=c460tors01 switchport=4
chdef c460c003 switch=c460tors01 switchport=5
chdef c460c004 switch=c460tors01 switchport=6
chdef c460c005 switch=c460tors01 switchport=7

And then the data stuck...
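
Rather than extracting the values by hand, the chdef lines could be generated from the saved lsdef output. A minimal sketch, assuming the same stanza layout as above; `chdef_commands` is a hypothetical helper and the sample text is a cut-down illustration of node.lsdef.

```python
import re


def chdef_commands(lsdef_text: str, attrs=("switch", "switchport")):
    """Yield one 'chdef <node> attr=value ...' line per node that has all attrs."""
    node, found = None, {}
    # The trailing sentinel stanza flushes the last real node.
    for line in lsdef_text.splitlines() + ["Object name: "]:
        m = re.match(r"\s*Object name:\s*(\S*)", line)
        if m:
            if node and all(a in found for a in attrs):
                yield "chdef " + node + " " + " ".join(f"{a}={found[a]}" for a in attrs)
            node, found = m.group(1), {}
        elif "=" in line:
            key, value = line.strip().split("=", 1)
            if key in attrs and value:
                found[key] = value


sample = """Object name: c460c001
    switch=c460tors01
    switchport=3
Object name: c460c002
    switch=c460tors01
    switchport=4
"""
commands = list(chdef_commands(sample))
print("\n".join(commands))
```

The values are emitted unquoted, which is fine for simple names like these but would need shell quoting for anything containing spaces or special characters.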

@ralphbellofatto
Author

reopening...

@whowutwut
Member

To add to this, if we create a group with switch/switchport defined in the group, the value cannot be stored in the postgres DB. The chdef command does not report an error.

[root@fs4 fs4]# chdef -t group -o disco switchport='|\D+(\d+)\D+(\d+)|($2*1)|'
1 object definitions have been created or modified.
[root@fs4 fs4]# lsdef -t group  -o disco -i switchport
Object name: disco
    switchport=
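
For readers unfamiliar with the regex form of the attribute, here is a rough emulation of what the expression above would produce for a node name. It assumes the pattern is matched against the node name and the parenthesized arithmetic is then evaluated; only the simple `($N*k)` form is handled, and `expand_regex_attr` is a hypothetical helper, not xCAT code.

```python
import re


def expand_regex_attr(node: str, spec: str) -> str:
    """Roughly emulate a '|pattern|($N*k)|' attribute for one node name."""
    _, pattern, expr, _ = spec.split("|")
    match = re.search(pattern, node)
    if match is None:
        return ""
    # Only the "($<group>*<integer>)" form from the example is supported here.
    parsed = re.fullmatch(r"\(\$(\d+)\*(\d+)\)", expr)
    if parsed is None:
        raise ValueError(f"unsupported expression: {expr}")
    group, factor = parsed.groups()
    return str(int(match.group(int(group))) * int(factor))


port = expand_regex_attr("c460c001", r"|\D+(\d+)\D+(\d+)|($2*1)|")
print(port)
```

Under these assumptions, `c460c001` splits into `460` and `001`, so the second group times one gives `1`. The empty `switchport=` in the output above is therefore a storage problem, not a sign that the expression failed to match.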

@tingtli Can you review the FVT scenarios for PostGresDB and the test for attributes against the DB?

@whowutwut
Member

whowutwut commented Oct 26, 2016

This one must be resolved in 2.12.4, adding high priority. Looks like a duplicate of #2007

@immarvin
Contributor

hi @chenglch , would you please take a look at this?

@chenglch
Contributor

chenglch commented Oct 27, 2016

Looking at the table structure first: the columns node, switch and port are defined as a composite primary key. On SQLite and MySQL, a null value in one of the fields of the composite primary key is accepted. According to the SQL standard, however, none of the fields within a composite primary key is allowed to be null, and this is what we ran into on PostgreSQL.

switch => {
cols => [qw(node switch port vlan interface comments disable)],
keys => [qw(node switch port)],

We may run into more problems, not only with the switch table, when we switch to PostgreSQL. The root cause is a bad programming habit: missing error handling ...
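
The SQLite side of this can be reproduced with Python's built-in sqlite3 module. This is a minimal sketch: the table names are made up, the columns are trimmed from the excerpt above, and leaving `port` NULL is only one illustrative case. The second table spells out NOT NULL, which is the behaviour a primary key implies under the SQL standard.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Loose key: same key columns as the schema excerpt, no explicit NOT NULL.
conn.execute(
    "CREATE TABLE switch_loose (node TEXT, switch TEXT, port TEXT, vlan TEXT,"
    " PRIMARY KEY (node, switch, port))"
)
# Strict key: NOT NULL spelled out on every key column.
conn.execute(
    "CREATE TABLE switch_strict (node TEXT NOT NULL, switch TEXT NOT NULL,"
    " port TEXT NOT NULL, vlan TEXT, PRIMARY KEY (node, switch, port))"
)

row = ("c460c001", "c460tors01", None, None)  # one key column left NULL

# An ordinary SQLite table accepts the NULL key column.
conn.execute("INSERT INTO switch_loose VALUES (?, ?, ?, ?)", row)
loose_rows = conn.execute("SELECT COUNT(*) FROM switch_loose").fetchone()[0]

# With NOT NULL enforced, the same insert is rejected.
try:
    conn.execute("INSERT INTO switch_strict VALUES (?, ?, ?, ?)", row)
    strict_rejected = False
except sqlite3.IntegrityError:
    strict_rejected = True

print(loose_rows, strict_rejected)
```

The loose table keeps the row while the strict one raises an integrity error. If the caller ignores that error, the write simply disappears, which would match the silent loss reported above.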

chenglch added a commit to chenglch/xcat-core that referenced this issue Oct 27, 2016
For historical reasons, a null value is always set within the composite primary
keys; this patch is just a workaround for the issue encountered on postgres.
The correct fix is to report the error to the client side, but too much error
handling is missing in the xcat code

close-issue: xcat2#2037
close-issue: xcat2#2007
@daniceexi
Contributor

@chenglch Thanks for the digging.

@immarvin is it possible to add a check in the mkdef command to detect the case where part of a composite key is missing?
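
Such a check might look roughly like this. It is only a sketch of the idea in Python, not the xCAT implementation; the key list is copied from the schema excerpt earlier in the thread, and `missing_keys` and `check_row` are hypothetical names.

```python
SWITCH_KEYS = ("node", "switch", "port")  # key columns from the schema excerpt


def missing_keys(row: dict, keys=SWITCH_KEYS) -> list:
    """Return the key columns that are absent, None, or empty in the row."""
    return [k for k in keys if row.get(k) in (None, "")]


def check_row(row: dict, keys=SWITCH_KEYS) -> None:
    """Raise ValueError instead of silently writing an incomplete key."""
    absent = missing_keys(row, keys)
    if absent:
        raise ValueError("missing composite key column(s): " + ", ".join(absent))


complete = {"node": "c460c001", "switch": "c460tors01", "port": "3"}
partial = {"node": "disco", "port": "3"}  # no switch given

check_row(complete)  # passes silently
try:
    check_row(partial)
    error = None
except ValueError as exc:
    error = str(exc)
print(error)
```

Running the check before the write lets the client see an explicit error whichever database backend is in use.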

chenglch added a commit to chenglch/xcat-core that referenced this issue Oct 31, 2016
immarvin pushed a commit that referenced this issue Nov 1, 2016
…2045)

@whowutwut
Member

This one should have been resolved in the 2.12.4 release. @tingtli Can you assign FVT team to help verify and close out?

@whowutwut
Member

I'm going to close this one, please re-open if issues persist
