Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[systemd]: Add platform-init and role-config services; Deprecate rc.local #427

Open
wants to merge 11 commits into
base: master
Choose a base branch
from
Open

Conversation

jleveque
Copy link
Contributor

@jleveque jleveque commented Mar 23, 2017

  • platform-init.service

    • Takes over the responsibilities formerly managed by rc.local. Use of rc.local is discouraged in the modern era of system/service managers, so I replaced rc.local with an aptly-named service, which also makes it easier to understand what its function is.
    • platform-init is started in the same relative order as rc.local was previously.
    • Like rc.local was, it is configured to run only once at first boot, not on subsequent boots.
    • Additionally, rc.local would get started simultaneously with updategraph.service, which allowed for a potential race condition. It was possible (albeit rare) for updategraph to download the proper minigraph only to have it overwritten by the default minigraph in rc.local. To fix this, I have added an explicit dependency between platform-init and updategraph services.
  • role-config.service

    • Handles dynamically enabling/disabling services at boot based on the device's role as determined by the minigraph.
    • Runs on every boot
    • Explicitly runs after updategraph.service to ensure it reads the appropriate minigraph
    • Writes or deletes files to/from /etc/sonic/role/ that are used as the condition to start certain services (currently teamd and dhcp_relay).


if [ $DEVICE_ROLE == "ToRRouter" ]; then
echo "Device requires DHCP relay service. Ensuring start file exists..."
touch $DHCP_RELAY_SERVICE_START_FILE
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do systemctl disable dhcp-relay.service instead of using additional files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was my original intent because it's very clean, and I tested it first. Unfortunately, when systemd starts up, it loads all of the unit files into memory, so calling systemctl disable has no effect on the current boot - the service you disabled will still be started (or vice-versa, if you enable a disabled service, it will not be started). The changes made will not take effect until the next boot.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also feel using systemctl disable/enable is cleaner. can we by default disable those two service (teamd/dhcprelay) and only enabled them ondemand?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid that @jleveque is right and using systemd enable/disable will require starting services manually or restarting device for changes to take effect.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean disable them by default, and then if we need to enable it, we can do start first and enable for the next reboot.

Copy link
Contributor Author

@jleveque jleveque Mar 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then the question becomes "When do we start the role-specific services?" Doing what you describe (starting the services manually) defeats the dependency mechanism in systemd. The only safe solution would be to start these services last. How would we ensure this? Also, what if in the future we have dependencies between role-specific services? That would be nearly impossible to manage manually.

Also, what happens if someone changes the minigraph, and role-specific services are no longer needed? We would have to do the inverse: stop an already-started service, which feels quite sloppy to me.

marian-pritsak
marian-pritsak previously approved these changes Mar 24, 2017
Type=oneshot
ExecStartPre=/bin/echo "First boot detected. Initializing platform..."
ExecStart=/usr/bin/platform-init.sh
ExecStartPost=/bin/rm -f /host/platform/firsttime
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/host/platform/firsttime [](start = 25, length = 24)

Is it possible to disable the service, instead of manipulating a file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it would be possible to disable the service after it completes. I hadn't considered the possibility, as the 'firsttime' file method is currently used, and it just seems a little strange to have a service disable itself.

Anyone else have an opinion on this?

stcheng
stcheng previously approved these changes Apr 8, 2017
jleveque added 11 commits April 10, 2017 23:04
 - device-config service will run only once on the first boot
 - This service installs platform drivers (what rc.local was doing), and
   also disables services that are not needed by that device based on
   the device's configuration (currently dhcp_relay and teamd)
 - rc.local use is discouraged in the modern era of system/service
   managers
…) and device-config (which runs on every boot)
 - Move role-config service and script in to role/ directory
 - role-config now writes or deletes files to/from /etc/sonic/role/
   directory
 - Services which are role-dependent use 'ConditionPathExists=' to
   determine whether or not to start
@jleveque
Copy link
Contributor Author

One other thing I would like to get feedback on: I'm not convinced that "role-config" is the best name for this service. Currently, the DHCP relay agent service is the only role-specific service. Teamd, however, is not specific to any particular role, but instead to any configuration utilizing LAG. Thus, I think this service should have a more appropriate name for posterity's sake. Any suggestions?

yxieca added a commit to yxieca/sonic-buildimage that referenced this pull request Mar 19, 2019
Submodule src/sonic-utilities f95da07..2fe01fe:
  > neighbor advertiser script (sonic-net#469)
  > [aclshow] restore PRIO column and sort entries by priority (sonic-net#476)
  > Update watermark default polling interval to 10s (sonic-net#470)
  > show interface status <interface-name> throws error (fixes sonic-net#427) (sonic-net#440)

Submodule src/sonic-swss 90eb25d..91171b6:
  > fix a unstable swss egress acl test (sonic-net#776)
  > [aclorch] Remove  L4 port range support limitation on egress ACL table and add new SWSS virtual test. (sonic-net#741)
  > Fix orchagent SEGV when PortConfigDone not set (sonic-net#803)

Submodule src/sonic-swss-common 2592b0c..5f4abd9:
  > Force only supported commands on consumer table (sonic-net#261)
  > Add multiple fields hdel support (sonic-net#267)


Signed-off-by: Ying Xie <ying.xie@microsoft.com>
yxieca added a commit that referenced this pull request Mar 19, 2019
…#2679)

Submodule src/sonic-utilities f95da07..2fe01fe:
  > neighbor advertiser script (#469)
  > [aclshow] restore PRIO column and sort entries by priority (#476)
  > Update watermark default polling interval to 10s (#470)
  > show interface status <interface-name> throws error (fixes #427) (#440)

Submodule src/sonic-swss 90eb25d..91171b6:
  > fix a unstable swss egress acl test (#776)
  > [aclorch] Remove  L4 port range support limitation on egress ACL table and add new SWSS virtual test. (#741)
  > Fix orchagent SEGV when PortConfigDone not set (#803)

Submodule src/sonic-swss-common 2592b0c..5f4abd9:
  > Force only supported commands on consumer table (#261)
  > Add multiple fields hdel support (#267)


Signed-off-by: Ying Xie <ying.xie@microsoft.com>
tiantianlv pushed a commit to SONIC-DEV/sonic-buildimage that referenced this pull request Apr 10, 2019
…sonic-net#2679)

Submodule src/sonic-utilities f95da07..2fe01fe:
  > neighbor advertiser script (sonic-net#469)
  > [aclshow] restore PRIO column and sort entries by priority (sonic-net#476)
  > Update watermark default polling interval to 10s (sonic-net#470)
  > show interface status <interface-name> throws error (fixes sonic-net#427) (sonic-net#440)

Submodule src/sonic-swss 90eb25d..91171b6:
  > fix a unstable swss egress acl test (sonic-net#776)
  > [aclorch] Remove  L4 port range support limitation on egress ACL table and add new SWSS virtual test. (sonic-net#741)
  > Fix orchagent SEGV when PortConfigDone not set (sonic-net#803)

Submodule src/sonic-swss-common 2592b0c..5f4abd9:
  > Force only supported commands on consumer table (#261)
  > Add multiple fields hdel support (#267)


Signed-off-by: Ying Xie <ying.xie@microsoft.com>
tiantianlv pushed a commit to SONIC-DEV/sonic-buildimage that referenced this pull request Apr 10, 2019
…sonic-net#2679)

Submodule src/sonic-utilities f95da07..2fe01fe:
  > neighbor advertiser script (sonic-net#469)
  > [aclshow] restore PRIO column and sort entries by priority (sonic-net#476)
  > Update watermark default polling interval to 10s (sonic-net#470)
  > show interface status <interface-name> throws error (fixes sonic-net#427) (sonic-net#440)

Submodule src/sonic-swss 90eb25d..91171b6:
  > fix a unstable swss egress acl test (sonic-net#776)
  > [aclorch] Remove  L4 port range support limitation on egress ACL table and add new SWSS virtual test. (sonic-net#741)
  > Fix orchagent SEGV when PortConfigDone not set (sonic-net#803)

Submodule src/sonic-swss-common 2592b0c..5f4abd9:
  > Force only supported commands on consumer table (#261)
  > Add multiple fields hdel support (#267)


Signed-off-by: Ying Xie <ying.xie@microsoft.com>
tiantianlv pushed a commit to SONIC-DEV/sonic-buildimage that referenced this pull request Apr 10, 2019
…sonic-net#2679)

Submodule src/sonic-utilities f95da07..2fe01fe:
  > neighbor advertiser script (sonic-net#469)
  > [aclshow] restore PRIO column and sort entries by priority (sonic-net#476)
  > Update watermark default polling interval to 10s (sonic-net#470)
  > show interface status <interface-name> throws error (fixes sonic-net#427) (sonic-net#440)

Submodule src/sonic-swss 90eb25d..91171b6:
  > fix a unstable swss egress acl test (sonic-net#776)
  > [aclorch] Remove  L4 port range support limitation on egress ACL table and add new SWSS virtual test. (sonic-net#741)
  > Fix orchagent SEGV when PortConfigDone not set (sonic-net#803)

Submodule src/sonic-swss-common 2592b0c..5f4abd9:
  > Force only supported commands on consumer table (#261)
  > Add multiple fields hdel support (#267)


Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Kalimuthu-Velappan pushed a commit to Kalimuthu-Velappan/sonic-buildimage that referenced this pull request Sep 12, 2019
) (sonic-net#440)

* sonic-utilities: Bug fix while showing interface status for a specific interface.

* While getting the keys from port table for a specific interface, wrong db is passed, fixed the bug

Signed-off-by: kiran.kella@broadcom.com
seiferteric pushed a commit to project-arlo/sonic-buildimage that referenced this pull request Oct 14, 2019
* Update src/sonic-sairedis from branch 'broadcom_sonic'
  to 9de2a0cd3371723c925341091c3ee9bc6df88527
  - Merge 201904 branch to broadcom_sonic branch on Mon Jul  1 13:57:56 PDT 2019
    
    Change-Id: Ibc8bd38f6819bda73ba52fff6358b794f7366d46
    
  - Fix typo  (sonic-net#467)
    
    * Rename to fix typo
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
    * Correct shared_ptr creation parameters
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
  - Modify sai_create_port to breakout a port for virtual switch (sonic-net#454)
    
    For breakout a port, it needs to support create port and remove port.
    When create a port, it needs to init some attributes.
    
    Signed-off-by: chiourung_huang <chiourung_huang@edge-core.com>
    
  - Add synchronous clear_stats operation path (sonic-net#463)
    
    * Send clear_stats op from orchagent to syncd via Redis pipeline
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
    * Handle clear_stats op in syncd
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
    * Receive clear_stats op status response from sycnd in orchagent context
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
    * Shift clear_stats to get synchronous response from ASIC
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
    * Fix compilation error
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
    * Fix log message output
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
    * Remove debugging symbols
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
    * Remove debugging symbols
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
    * Change the validation order of KeyOpFieldsValuesTuple responded from
    syncd
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
    * Expand status log utility to include op type as argument
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
    * Address comments: check if object id is present in local db
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
    * Leverage newly merged infrastructure to check if object id is present in
    the local db
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
    * Fix compile error
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
  - Full sleep wait change for PFC watchdog (sonic-net#465)
    
    * Sleep wait change for pfc watchdog, which still has plugins in place so
    isEmpty() always returns true.
    
    When issuing 'pfcwd stop' cli to stop PFC watchdog, it only clears the counter id list for physical ports under the hood, leaving lua script plugins still installed and ieEmpty() returns false.
    
    So even when PFC watchdog is stopped, the flex counter thread would wake up every 200 ms.
    
    Jun 4 17:58:51.273413 str-a7050-acs-1 ERR syncd#syncd: :- flexCounterThread: End of flex counter thread FC PFC_WD, took 33 ms
    Jun 4 17:58:51.471288 str-a7050-acs-1 ERR syncd#syncd: :- flexCounterThread: End of flex counter thread FC PFC_WD, took 30 ms
    Jun 4 17:58:51.672930 str-a7050-acs-1 ERR syncd#syncd: :- flexCounterThread: End of flex counter thread FC PFC_WD, took 31 ms
    Jun 4 17:58:51.882972 str-a7050-acs-1 ERR syncd#syncd: :- flexCounterThread: End of flex counter thread FC PFC_WD, took 40 ms
    Jun 4 17:58:52.074862 str-a7050-acs-1 ERR syncd#syncd: :- flexCounterThread: End of flex counter thread FC PFC_WD, took 31 ms
    
    This PR is to address the above observation:
    
    Fine granularize the isEmpty() to two functions---isIdsEmpty() and isPluginsEmpty(), and use the emptiness check of the counter id list, isIdsEmpty(), as the criteria for full sleep wait. This approach is generally valid because current flex counter use cases would have the presence of non-empty counter id list to be meaningful, followed by an optional execution of the Lua script plugins if they are installed.
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
  - Check if port VID exists in db on flex counter query (sonic-net#464)
    
    
  - add support for SAI_ATTR_VALUE_TYPE_ACL_CAPABILITY (sonic-net#460)
    
    Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
  - Full sleep wait flex counter polling thread when POLL_COUNTER_STATUS is disable (sonic-net#462)
    
    Signed-off-by: Wenda Ni <wenni@microsoft.com>
    
  - [debian] increment debian compatibility to 10 to enable parallel package build (sonic-net#461)
    
    * fix parallel build
    
    Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
    
    * [debian] increment debian compatibility to 10 to enable parallel package
    build
    
    From debhelper man pages:
    
    "If neither option is specified, debhelper currently defaults to
    --parallel in compat 10 (or later) and --no-parallel otherwise."
    
    Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
    
    * make tests run serial
    
    Signed-off-by: Stepan Blyschak <stepanb@mellanox.com>
    
  - [debian]: Change build order in target binary (sonic-net#452)
    
    Make binary-syncd the last target, so the non-rpc requisites will be available after the first build
    It will allow doing incremental builds later on for the main binary type
  - Add buffer pool stat to flex counter architecture (sonic-net#451)
    
    
  -  Watermark: test SAI clear api if the stats mode is READ_AND_CLEAR (sonic-net#427)
    
    * Test SAI clear api if the stats mode is READ_AND_CLEAR
    
    Signed-off-by: Wenda <wenni@microsoft.com>
    
    * Address compile error
madhanmellanox pushed a commit to madhanmellanox/sonic-buildimage that referenced this pull request Mar 23, 2020
* Add python test for setting RO attribute

* Use redis-py to obtain switch real object id

* Remove RedisClient connection
dmytroxshevchuk pushed a commit to dmytroxshevchuk/sonic-buildimage that referenced this pull request Aug 31, 2020
…onic-net#427)

* Test SAI clear api if the stats mode is READ_AND_CLEAR

Signed-off-by: Wenda <wenni@microsoft.com>

* Address compile error
qiluo-msft added a commit that referenced this pull request Dec 13, 2020
Including commits in sonic-swss-common repo:
```
40e695d 2020-12-12 | [pyext] Implement DBConnector::pubsub() to emulate python Redis.pubsub() (#427) [Qi Luo]
9a2bf6c 2020-12-12 | Add fdb_flush.v2.lua script to makefile (#425) [Kamil Cudnik]
7d60e7e 2020-12-07 | Handle exception in Select (#424) [Qi Luo]
```
@liat-grozovik liat-grozovik dismissed stale reviews from stcheng and marian-pritsak via f510a65 April 13, 2023 11:18
mssonicbld added a commit that referenced this pull request Jan 18, 2024
… automatically (#17822)

#### Why I did it
src/sonic-platform-common
```
* 65e3cc3 - (HEAD -> master, origin/master, origin/HEAD) Fix memory map parsing issue (#427) (18 minutes ago) [Stephen Sun]
```
#### How I did it
#### How to verify it
#### Description for the changelog
yxieca pushed a commit that referenced this pull request Jan 22, 2024
… automatically (#17845)

src/sonic-platform-common

* 570bb3f - (HEAD -> 202311, origin/202311) Fix memory map parsing issue (#427) (3 days ago) [Stephen Sun]
mssonicbld added a commit that referenced this pull request Feb 8, 2024
… automatically (#18069)

#### Why I did it
src/sonic-platform-common
```
* b6f8a8d - (HEAD -> 202305, origin/202305) Fix memory map parsing issue (#427) (22 hours ago) [Stephen Sun]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants