funtoo forums
walterw

rc-status reports service as stopped even though it is running


I updated my systems, and my recent builds have this issue on both desktops and servers.  When I run rc-status, some services are reported as stopped even though they are running.  When I then try service <service name> start or service <service name> restart, it reports that the service is already running.

 

I am only starting my services through the /usr/sbin/service command.  The only thing I could think of was checking the permissions on /run, and those seem fine.  What else / where else should I look to sort this out?

 

 

Thanks in advance,

Walter


I would definitely restrict your troubleshooting to a single service at a time. Look at 'ps axu' and 'ps axu --forest' to be extra sure that the daemon is not running. You should be able to work around this with the 'service foo zap' command (or /etc/init.d/scriptname zap), which resets the recorded status from 'running' to 'not running' in case the daemon has died for some reason.
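
As a rough sketch of that check (not an OpenRC feature): the helpers below, whose names are made up for illustration, scan /proc for a process with the daemon's name and report zap as safe only when none is left. This assumes a Linux /proc filesystem; note that the kernel truncates the name in /proc/PID/comm to 15 characters.

```shell
# Hypothetical helpers; neither function is part of OpenRC.

# Print the PID of every process whose command name equals $1.
# Reads /proc directly, so it assumes Linux; 'ps axu --forest' shows
# the same processes together with their parent/child tree.
pids_by_name() {
    for pbn_comm in /proc/[0-9]*/comm; do
        # A process may exit while we scan, so skip unreadable entries.
        pbn_name=$(cat "$pbn_comm" 2>/dev/null) || continue
        if [ "$pbn_name" = "$1" ]; then
            pbn_pid=${pbn_comm#/proc/}
            printf '%s\n' "${pbn_pid%/comm}"
        fi
    done
}

# Succeed only if no process named $1 is running, i.e. 'zap' would
# not leave a live daemon behind.
safe_to_zap() {
    [ -z "$(pids_by_name "$1")" ]
}
```

With these defined, something like `safe_to_zap privoxy && service privoxy zap` would avoid zapping a service whose daemon is still alive.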

For extended troubleshooting, I recommend trying to start the daemon from the command-line with very similar options to how it is started from the script, but run it in debug mode so you get more output (you can often also run the daemon so it doesn't detach and will output the debug info to your console.)
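
For illustration only, here is what foreground/debug invocations could look like for three of the daemons that come up later in this thread. debug_command is a hypothetical helper, and the config file paths are assumptions; the real options should be copied from the init script in question.

```shell
# Hypothetical helper: print a foreground/debug command line for a daemon.
# The config paths below are assumptions, not taken from any init script.
debug_command() {
    case $1 in
        # -dd: stay in the foreground and log to stderr; -vv: more verbose
        unbound) echo "unbound -dd -vv -c /etc/unbound/unbound.conf" ;;
        # --no-daemon: do not detach, log to the terminal
        privoxy) echo "privoxy --no-daemon /etc/privoxy/config" ;;
        # -d: debug mode, no fork, debug output on stderr
        sshd)    echo "/usr/sbin/sshd -d" ;;
        *)       echo "no known debug invocation for: $1" >&2
                 return 1 ;;
    esac
}
```

Stop the service first (or expect an "address already in use" error), then run the printed command by hand and watch the output.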

Good luck, and post more detailed info if you need more help.


Thanks - I am still debugging and am guessing I need to dig into /sbin/openrc-run.

  1. /var/log/messages captures the output from running service <service name> start and shows that the service started
  2. The pidfile for that service exists (and has the correct PID for that service)
  3. The output of ps aux --forest | grep <service name> shows that PID and the same cmdline that the init script has
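
The checks above could be scripted roughly like this. pidfile_status is a hypothetical helper name, and it assumes a Linux /proc filesystem; it prints the PID and its command line so they can be compared by eye with what the init script would start.

```shell
# Hypothetical helper, not part of OpenRC.
# Succeeds if pidfile $1 exists, holds a numeric PID, and that PID is
# alive; prints "PID: command line" on success.
pidfile_status() {
    pf_path=$1
    if [ ! -r "$pf_path" ]; then
        echo "pidfile missing or unreadable: $pf_path" >&2
        return 1
    fi
    # Take the first line and strip whitespace.
    pf_pid=$(head -n 1 "$pf_path" | tr -d '[:space:]')
    case $pf_pid in
        ''|*[!0-9]*)
            echo "pidfile does not contain a PID: $pf_path" >&2
            return 1 ;;
    esac
    if [ ! -d "/proc/$pf_pid" ]; then
        echo "stale pidfile: PID $pf_pid is not running" >&2
        return 1
    fi
    # cmdline is NUL-separated; turn the NULs into spaces for reading.
    printf '%s: ' "$pf_pid"
    tr '\0' ' ' < "/proc/$pf_pid/cmdline"
    echo
}
```

For example, `pidfile_status /run/unbound.pid` should print the unbound PID and command line when the pidfile is healthy.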

 

I am starting these services through a NetworkManager dispatcher script that executes through the at daemon (since it takes some time to complete).  No, zapping does not work because the process is still running.  When issuing a restart, I get the same problem.  I must kill that PID and then restart the service.  At that point, rc-status and service <service name> work fine.

 

I tested this for privoxy, but was previously also having the same problem for sshd, shorewall, and unbound.


For OpenRC to manage a service through an initscript, generally OpenRC (the initscript) needs to be the thing that actually starts the service. If you are having NetworkManager start the daemons, then generally it is the one that manages them. Not sure if that is the issue but OpenRC scripts are generally only going to stop what they have started. There is no hard rule for this but it is just how the system is designed to work.


Thanks for the reply - NetworkManager is merely acting as a catalyst to tell OpenRC to start the service.  The process is like this:

 

1. NetworkManager starts (part of the default runlevel)

2. NetworkManager joins a specific network and fires an event, invokes dispatcher script

3. The NetworkManager dispatcher script calls a bunch of stuff which ends up calling service <service name> restart.  I call restart because the service may already be running and was just reconfigured for a given network.  These services are not part of the default or boot runlevels.

 

OpenRC is still starting/stopping the services.  They're just invoked via NetworkManager.
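
To make the flow concrete, here is a minimal sketch of such a dispatcher script. It is not the actual script from this setup: the service list and function names are made up for illustration. It relies on NetworkManager passing the interface name as the first argument and the action (for example "up") as the second, and on at reading its job from standard input.

```shell
#!/bin/sh
# Sketch of a script placed under /etc/NetworkManager/dispatcher.d/.
# SERVICES is an assumed example list, not the real configuration.
SERVICES="privoxy unbound"

# Print the restart commands for one event ($1 = interface, $2 = action).
# Prints nothing for actions other than "up".
restart_commands() {
    [ "$2" = "up" ] || return 0
    for rc_svc in $SERVICES; do
        printf '/usr/sbin/service %s restart\n' "$rc_svc"
    done
}

# Hand the commands to at(1) so the dispatcher script returns quickly.
queue_restarts() {
    qr_cmds=$(restart_commands "$1" "$2")
    [ -n "$qr_cmds" ] || return 0
    printf '%s\n' "$qr_cmds" | at now
}

# When installed as a dispatcher script, the last line would be:
#   queue_restarts "$1" "$2"
```

The point of the sketch is that the daemon is still started by its OpenRC init script; NetworkManager and at only decide when that happens.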


Thanks - I'm still thinking about how exactly I could debug this.  I think it is one of two things: something in my configuration or something in OpenRC.  Since many people use OpenRC and haven't reported such an issue, I suspect it is my configuration.


I am still debugging this.

So, I went through my logs and found that I have this issue both with init scripts that I wrote and with ones that were prepackaged with Gentoo/Funtoo.  That said, I am investigating one that I wrote for argus.  One issue is that the pidfile was not created; however, even after I created it manually, start-stop-daemon doesn't seem to recognize that the service is started.

 

I'm going through the source for openrc to try to piece together what's going on: https://github.com/OpenRC/openrc/blob/90d9ea656ff7c6b5d618df4e4261ebfa4033f1a8/src/rc/start-stop-daemon.c

 

Line 690 appears to be where it detects that a PID already exists.

 

Lastly, I did something more interesting.  Unbound was running and I could restart it with no issues.  I did the following:

1. renamed the pid file from /run/unbound.pid to /run/unbound.pid.bak

2. issued service unbound restart

3. got what I expected: socket already in use, failed to start unbound

4. renamed the pid file from /run/unbound.pid.bak back to /run/unbound.pid

5. issued service unbound restart

6. got what I did NOT expect: the same issue I described above.

 

When I purposely screw up the state of the daemon, the files under /run/openrc/daemons disappear.  These files appear to be critical in determining the state of the daemon EVEN IF I PUT THE PIDFILE BACK.  So, I started unbound and got it running, then backed up those files under openrc:

openrc/daemons/unbound
openrc/daemons/unbound/001
openrc/started/unbound
 

 

I then screwed up the state of it by moving the pidfile.  At this point, I could not restart unbound as predicted.  So, I restored the state files under /run/openrc and voila, I was able to start/stop unbound again.
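
The backup and restore I did by hand could be sketched like this. The function names are made up, and this is a diagnostic aid for reproducing the experiment, not something OpenRC documents or supports. OPENRC_STATE defaults to /run/openrc and can be pointed at a scratch directory for testing.

```shell
# Hypothetical helpers mirroring the manual experiment above.
# OPENRC_STATE defaults to /run/openrc; override it to test elsewhere.
: "${OPENRC_STATE:=/run/openrc}"

# Copy daemons/<svc> and started/<svc> for service $1 into directory $2.
save_service_state() {
    ss_svc=$1 ss_dest=$2
    for ss_sub in daemons started; do
        ss_src=$OPENRC_STATE/$ss_sub/$ss_svc
        # -L also catches symlinks whose target is missing.
        [ -e "$ss_src" ] || [ -L "$ss_src" ] || continue
        mkdir -p "$ss_dest/$ss_sub" &&
            cp -a "$ss_src" "$ss_dest/$ss_sub/" || return 1
    done
}

# Put the saved entries for service $1 back from directory $2.
restore_service_state() {
    rs_svc=$1 rs_from=$2
    for rs_sub in daemons started; do
        rs_src=$rs_from/$rs_sub/$rs_svc
        [ -e "$rs_src" ] || [ -L "$rs_src" ] || continue
        mkdir -p "$OPENRC_STATE/$rs_sub" &&
            rm -rf "$OPENRC_STATE/$rs_sub/$rs_svc" &&
            cp -a "$rs_src" "$OPENRC_STATE/$rs_sub/" || return 1
    done
}
```

For example: `save_service_state unbound /root/openrc-backup` while unbound is healthy, then `restore_service_state unbound /root/openrc-backup` after the state has been lost (the backup path is just an example). cp -a is used so that permissions and any symlinks are kept as they were.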

 

Hmm, so with all of that, it seems that if the pidfile fails to be created for whatever reason, openrc is unable to correctly determine the state of the daemon.

 

So, why is the pidfile failing to be created if I get a successful status when this stuff starts up?  Lastly, I still need to check this, but if the pidfile fails to be created, openrc should return an error if it isn't already doing so.
