Executing 'config load_minigraph' got the session stuck forever #145

Closed
Anandaraj-Maharajan opened this issue Jan 20, 2018 · 14 comments

Comments

@Anandaraj-Maharajan

Anandaraj-Maharajan commented Jan 20, 2018

root@switch1:/home/admin# show version
SONiC Software Version: SONiC.HEAD.466-8cfa223
Distribution: Debian 8.10
Kernel: 3.16.0-4-amd64
Build commit: 8cfa223
Build date: Fri Jan 19 07:46:47 UTC 2018
Built by: johnar@jenkins-worker-4

Docker images:
REPOSITORY TAG IMAGE ID SIZE
docker-syncd-brcm HEAD.466-8cfa223 ba4f612f2f21 317.3 MB
docker-syncd-brcm latest ba4f612f2f21 317.3 MB
docker-orchagent-brcm HEAD.466-8cfa223 00b26f843707 258.3 MB
docker-orchagent-brcm latest 00b26f843707 258.3 MB
docker-lldp-sv2 HEAD.466-8cfa223 463421931d7f 255.4 MB
docker-lldp-sv2 latest 463421931d7f 255.4 MB
docker-dhcp-relay HEAD.466-8cfa223 33afa8dbdafd 252 MB
docker-dhcp-relay latest 33afa8dbdafd 252 MB
docker-database HEAD.466-8cfa223 12d8072fd2be 250.7 MB
docker-database latest 12d8072fd2be 250.7 MB
docker-teamd HEAD.466-8cfa223 a02ee624f33a 255.4 MB
docker-teamd latest a02ee624f33a 255.4 MB
docker-snmp-sv2 HEAD.466-8cfa223 388d5998f08d 290.5 MB
docker-snmp-sv2 latest 388d5998f08d 290.5 MB
docker-router-advertiser HEAD.466-8cfa223 636ac6b4cf17 248.3 MB
docker-router-advertiser latest 636ac6b4cf17 248.3 MB
docker-platform-monitor HEAD.466-8cfa223 4f5b3285c8c5 270 MB
docker-platform-monitor latest 4f5b3285c8c5 270 MB
docker-fpm-quagga HEAD.466-8cfa223 86070843448c 261.9 MB
docker-fpm-quagga latest 86070843448c 261.9 MB

root@switch1:/home/admin#

On a vanilla box, I tried changing the hostname in the minigraph. I changed it under the Device tag and ran config load_minigraph:

  <Device i:type="LeafRouter">
    <Hostname>Z9100-17</Hostname>          <<<<<<
    <HwSku>Force10-Z9100</HwSku>
  </Device>

root@switch1:/etc/sonic# config load_minigraph
Reload config from minigraph? [y/N]: y
Running command: sonic-cfggen -m -j /etc/sonic/init_cfg.json --write-to-db
Traceback (most recent call last):
  File "/usr/local/bin/sonic-cfggen", line 220, in <module>
    main()
  File "/usr/local/bin/sonic-cfggen", line 165, in main
    deep_update(data, parse_xml(minigraph, data['platform']))
  File "/usr/local/lib/python2.7/dist-packages/minigraph.py", line 389, in parse_xml
    'type': devices[hostname]['type']
KeyError: 'switch1'
root@switch1:/etc/sonic#

After the error, I changed the hostname in a second place, near the end of the file, and ran the command again, but now the command hangs forever. Only Ctrl+C gets out of it.

root@switch1:/etc/sonic# vi minigraph.xml

  <Hostname>Z9100-17</Hostname>          <<<<
  <HwSku>Force10-Z9100</HwSku>

root@switch1:/etc/sonic# config load_minigraph
Reload config from minigraph? [y/N]: y

^C
Aborted!
root@switch1:/etc/sonic# vi minigraph.xml
root@switch1:/etc/sonic# config load_minigraph
Reload config from minigraph? [y/N]: y
^C
Aborted!
root@switch1:/etc/sonic#

@prsunny
Contributor

prsunny commented Jan 20, 2018

Can you upload the minigraph file?

@Anandaraj-Maharajan
Author

This is the default minigraph file that was there when I installed SONiC on my Z9100. I changed the hostname in line 1042 and then in line 1077. I have highlighted both in the attached file as well.

minigraph.txt

@jleveque
Contributor

jleveque commented Jan 20, 2018

@Anandaraj-Maharajan: It sounds like the first call to config load_minigraph that failed left the ConfigDB in a bad state (possibly empty), which then caused subsequent calls to config load_minigraph to hang.

Could you please run the following command and post the output: redis-cli -n 4 keys *

@prsunny
Contributor

prsunny commented Jan 20, 2018

I just did config load_minigraph with the uploaded file and it works on my Z9100 unit with a different image.

admin@str-z9100-acs-2:~$ sudo config load_minigraph 
Reload config from minigraph? [y/N]: y
Running command: sonic-cfggen -m -j /etc/sonic/init_cfg.json --write-to-db
Running command: echo switch1 > /etc/hostname
Running command: hostname -F /etc/hostname
Running command: sed -i "/\sstr-z9100-acs-2$/d" /etc/hosts
Running command: echo "127.0.0.1 switch1" >> /etc/hosts
Running command: service interfaces-config restart

admin@switch1:~$ 
admin@switch1:~$ hostname
switch1

Please provide the output requested by Joe.

@Anandaraj-Maharajan
Author

Anandaraj-Maharajan commented Jan 20, 2018

@prsunny switch1 is the default hostname. Change it to some other name in line 1042 and run config load_minigraph; you will see the error. Subsequent changes to the minigraph followed by config load_minigraph will then fail.

@jleveque here is redis-cli output before and after getting the error

root@switch1:~# redis-cli -n 4 keys *

  1) "BGP_NEIGHBOR|10.0.0.39"
  2) "PORT|Ethernet48"
  3) "MIRROR_SESSION|everflow0"
  4) "PORT|Ethernet44"
  5) "PORT|Ethernet28"
...
132) "PORT|Ethernet4"
133) "DEVICE_NEIGHBOR|Ethernet56"
134) "PORT|Ethernet124"
135) "BGP_NEIGHBOR|10.0.0.43"
136) "INTERFACE|Ethernet80|10.0.0.40/31"
root@switch1:~# cd /etc/sonic/
root@switch1:/etc/sonic# ls
config_db.json deployment_id_asn_map.yml init_cfg.json minigraph.xml minigraph.xml.bak snmp.yml sonic_version.yml updategraph.conf
root@switch1:/etc/sonic# vi minigraph.xml
root@switch1:/etc/sonic# config load_minigraph
Reload config from minigraph? [y/N]: y
Running command: sonic-cfggen -m -j /etc/sonic/init_cfg.json --write-to-db
Traceback (most recent call last):
  File "/usr/local/bin/sonic-cfggen", line 220, in <module>
    main()
  File "/usr/local/bin/sonic-cfggen", line 165, in main
    deep_update(data, parse_xml(minigraph, data['platform']))
  File "/usr/local/lib/python2.7/dist-packages/minigraph.py", line 389, in parse_xml
    'type': devices[hostname]['type']
KeyError: 'switch1'
root@switch1:/etc/sonic# redis-cli -n 4 keys *
(error) ERR wrong number of arguments for 'keys' command
root@switch1:/etc/sonic#

@jleveque
Contributor

@Anandaraj-Maharajan: Thanks for the reply. Unfortunately, I didn't get the information I was looking for due to the "ERR wrong number of arguments for 'keys' command" error.

Please run the following command (note the addition of single quotes around the asterisk) and post the output: redis-cli -n 4 keys '*'
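Why the quotes matter: an unquoted asterisk is expanded by the shell into the file names in the current directory before redis-cli ever sees it, so `keys` receives several arguments instead of one pattern. The sketch below is a rough Python imitation of that expansion, not how any shell is actually implemented; `expand_word` and `expand_command` are hypothetical helpers, and the file list is copied from the `ls` output earlier in this thread.

```python
import fnmatch


def expand_word(word, names):
    """Roughly mimic how a POSIX shell treats one word of a command line."""
    # A word wrapped in single quotes is passed through literally.
    if len(word) >= 2 and word[0] == word[-1] == "'":
        return [word[1:-1]]
    # Words without glob characters are left alone.
    if not any(ch in word for ch in "*?["):
        return [word]
    # A leading "*" does not match hidden files.
    visible = [n for n in names if not n.startswith(".")]
    matches = sorted(fnmatch.filter(visible, word))
    # If nothing matches, the pattern is left as-is (the usual shell default).
    return matches or [word]


def expand_command(command, names):
    """Expand every whitespace-separated word of a command line."""
    args = []
    for word in command.split():
        args.extend(expand_word(word, names))
    return args


# File names taken from the `ls` of /etc/sonic shown above.
etc_sonic = [
    "config_db.json", "deployment_id_asn_map.yml", "init_cfg.json",
    "minigraph.xml", "minigraph.xml.bak", "snmp.yml",
    "sonic_version.yml", "updategraph.conf",
]

unquoted = expand_command("redis-cli -n 4 keys *", etc_sonic)
quoted = expand_command("redis-cli -n 4 keys '*'", etc_sonic)
empty_dir = expand_command("redis-cli -n 4 keys *", [])

print(len(unquoted) - 4, "arguments after 'keys' when unquoted")
print(quoted[4:], "when quoted")
print(empty_dir[4:], "when unquoted in a directory with no matches")
```

The last case would be consistent with the unquoted command having worked from the home directory earlier, presumably because nothing there matched the pattern.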

@Anandaraj-Maharajan
Author

Retried the command with single quotes:

root@switch1:/etc/sonic# config load_minigraph
Reload config from minigraph? [y/N]: y
Running command: sonic-cfggen -m -j /etc/sonic/init_cfg.json --write-to-db
Traceback (most recent call last):
  File "/usr/local/bin/sonic-cfggen", line 220, in <module>
    main()
  File "/usr/local/bin/sonic-cfggen", line 165, in main
    deep_update(data, parse_xml(minigraph, data['platform']))
  File "/usr/local/lib/python2.7/dist-packages/minigraph.py", line 389, in parse_xml
    'type': devices[hostname]['type']
KeyError: 'switch1'
root@switch1:/etc/sonic# redis-cli -n 4 keys '*'
(empty list or set)
root@switch1:/etc/sonic#

@jleveque
Contributor

The database is empty after the first call to config load_minigraph crashed, as I suspected.

Try running the following command: redis-cli -n 4 SET CONFIG_DB_INITIALIZED true

After running this command, config load_minigraph should work again (unless it ever crashes again, in which case you may need to repeat this process).

I will submit an issue for this, as the ConfigDB should never get into a broken state like this, even if config load_minigraph should happen to crash.
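One way to avoid this kind of broken state is to parse the new configuration completely before touching the database, so that a parsing error leaves the old contents in place. The following is a minimal sketch of that idea under stated assumptions, not the actual sonic-cfggen or config implementation: a plain dict stands in for ConfigDB, and `reload_config` and `broken_parse` are hypothetical names. Only the `CONFIG_DB_INITIALIZED` key and the failing lookup are taken from this thread.

```python
def reload_config(db, parse_minigraph):
    """Replace the contents of db only if parsing succeeds."""
    # Parse first: if this raises, db has not been touched yet.
    new_config = parse_minigraph()
    db.clear()
    db.update(new_config)
    # Mark the database as usable again once it is fully populated.
    db["CONFIG_DB_INITIALIZED"] = "true"


def broken_parse():
    """Stand-in parser that fails the same way as the traceback above."""
    devices = {"Z9100-17": {"type": "LeafRouter"}}
    hostname = "switch1"
    return {"type": devices[hostname]["type"]}  # raises KeyError: 'switch1'


# A dict standing in for ConfigDB (assumption for illustration only).
db = {"PORT|Ethernet48": {}, "CONFIG_DB_INITIALIZED": "true"}

try:
    reload_config(db, broken_parse)
except KeyError as err:
    print("parse failed, database left as it was:", err)

print(sorted(db))
```

With this ordering a failed load would leave the previous keys, including `CONFIG_DB_INITIALIZED`, untouched, so a later attempt would not start from an empty database.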

@Anandaraj-Maharajan
Author

After that, config load_minigraph no longer hangs, but it crashes again and leaves the database empty, just like the first time.

root@switch1:/etc/sonic# redis-cli -n 4 SET CONFIG_DB_INITIALIZED true
OK
root@switch1:/etc/sonic# config load_minigraph
Reload config from minigraph? [y/N]: y
Running command: sonic-cfggen -m -j /etc/sonic/init_cfg.json --write-to-db
Traceback (most recent call last):
  File "/usr/local/bin/sonic-cfggen", line 220, in <module>
    main()
  File "/usr/local/bin/sonic-cfggen", line 165, in main
    deep_update(data, parse_xml(minigraph, data['platform']))
  File "/usr/local/lib/python2.7/dist-packages/minigraph.py", line 389, in parse_xml
    'type': devices[hostname]['type']
KeyError: 'switch1'
root@switch1:/etc/sonic# config load_minigraph
Reload config from minigraph? [y/N]: y
^C
Aborted!
root@switch1:/etc/sonic# redis-cli -n 4 keys '*'
(empty list or set)
root@switch1:/etc/sonic# redis-cli -n 4 SET CONFIG_DB_INITIALIZED true
OK
root@switch1:/etc/sonic# config load_minigraph
Reload config from minigraph? [y/N]: y
Running command: sonic-cfggen -m -j /etc/sonic/init_cfg.json --write-to-db
Traceback (most recent call last):
  File "/usr/local/bin/sonic-cfggen", line 220, in <module>
    main()
  File "/usr/local/bin/sonic-cfggen", line 165, in main
    deep_update(data, parse_xml(minigraph, data['platform']))
  File "/usr/local/lib/python2.7/dist-packages/minigraph.py", line 389, in parse_xml
    'type': devices[hostname]['type']
KeyError: 'switch1'
root@switch1:/etc/sonic# redis-cli -n 4 keys '*'
(empty list or set)
root@switch1:/etc/sonic# redis-cli -n 4 SET CONFIG_DB_INITIALIZED true
OK
root@switch1:/etc/sonic# redis-cli -n 4 keys '*'

1) "CONFIG_DB_INITIALIZED"
root@switch1:/etc/sonic# config load_minigraph
Reload config from minigraph? [y/N]: y
Running command: sonic-cfggen -m -j /etc/sonic/init_cfg.json --write-to-db
Traceback (most recent call last):
  File "/usr/local/bin/sonic-cfggen", line 220, in <module>
    main()
  File "/usr/local/bin/sonic-cfggen", line 165, in main
    deep_update(data, parse_xml(minigraph, data['platform']))
  File "/usr/local/lib/python2.7/dist-packages/minigraph.py", line 389, in parse_xml
    'type': devices[hostname]['type']
KeyError: 'switch1'
root@switch1:/etc/sonic# redis-cli -n 4 keys '*'
(empty list or set)
root@switch1:/etc/sonic#

@jleveque
Contributor

The crash appears to be due to a problem parsing your minigraph:

File "/usr/local/lib/python2.7/dist-packages/minigraph.py", line 389, in parse_xml
'type': devices[hostname]['type']
KeyError: 'switch1'

Double-check the minigraph is formatted correctly.
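The `KeyError: 'switch1'` on `devices[hostname]` suggests the parser looks up one hostname in a table built from the Device entries, so renaming the host in only one of the two places would leave the lookup without a match. Here is a small sketch of a consistency check you could run before loading. It assumes a heavily simplified minigraph layout (no XML namespaces, a plain `type` attribute instead of `i:type`, and a wrapper element chosen only for illustration), and `check_hostname` is a hypothetical helper, not part of minigraph.py.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for a minigraph: one Device entry plus the
# top-level Hostname near the end of the file (layout is an assumption).
SAMPLE = """
<DeviceMiniGraph>
  <Devices>
    <Device type="LeafRouter">
      <Hostname>Z9100-17</Hostname>
      <HwSku>Force10-Z9100</HwSku>
    </Device>
  </Devices>
  <Hostname>switch1</Hostname>
  <HwSku>Force10-Z9100</HwSku>
</DeviceMiniGraph>
"""


def check_hostname(xml_text):
    """Return an error message if the top-level hostname has no Device entry."""
    root = ET.fromstring(xml_text)
    # Table of devices keyed by hostname, as the traceback implies.
    devices = {
        dev.findtext("Hostname"): {"type": dev.get("type")}
        for dev in root.iter("Device")
    }
    # Hostname that is a direct child of the root element.
    hostname = root.findtext("Hostname")
    if hostname not in devices:
        return "hostname %r has no matching Device entry (known: %s)" % (
            hostname, sorted(devices))
    return None


print(check_hostname(SAMPLE))
print(check_hostname(SAMPLE.replace("switch1", "Z9100-17")))
```

The first call reports the mismatch; the second, with both places renamed, returns None.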

@Anandaraj-Maharajan
Author

As I mentioned earlier, I changed only the hostname key in one place and landed in this state. I went ahead and replaced the default hostname 'switch1' with a custom name 'Z9100-17' in all the places, reinitialized the DB, and ran load_minigraph. Now the command is stuck at a different point.

root@switch1:/etc/sonic# vi minigraph.xml
root@switch1:/etc/sonic# redis-cli -n 4 SET CONFIG_DB_INITIALIZED true
OK
root@switch1:/etc/sonic# config load_minigraph
Reload config from minigraph? [y/N]: y
Running command: sonic-cfggen -m -j /etc/sonic/init_cfg.json --write-to-db
Running command: service hostname-config restart
Running command: service interfaces-config restart

Stuck here

@jleveque
Contributor

Well, now it appears to have loaded the minigraph properly. If service interfaces-config restart is hanging, maybe you can reboot the device and try again.

@Anandaraj-Maharajan
Author

I rebooted the device, and now config load_minigraph doesn't hang anymore. The hostname has changed as well. Thanks.

@jleveque
Contributor

You're welcome. Glad to hear everything's working!
