[Lcdproc] LCDd not loading properly

Stewart W. Putnam stewartputnam@comcast.net
Wed Apr 25 16:36:02 2007


To see if the CFontz driver has completed init properly so that 
wave_to_parent should be called next, look for lines like:
 Date: time localhost LCDd: CFontz: init() done
 Date: time localhost LCDd: Driver [CFontz] loaded

To see if the child is actually calling the wave_to_parent function:
 Date: time localhost LCDd: wave_to_parent( parent_pid=2700 )

To see if the parent wait() returned:
 Date: time localhost LCDd: child_ok_func( signal=10 )
 Date: time localhost LCDd: Got OK signal from child.

At least one of these reports will only appear if --enable-debug is 
passed to configure and ReportLevel=5 is set.

I see that in server/main.c the calls
    kill( parent_pid, SIGUSR1 );
    wait( &child_status );
have no error checking, so if either of them is failing, it is not 
currently reported.  That is the best place to add a little debug code.  
There are also comments in several places suggesting some previous 
difficulty with this:

        /* Exit now !    because of bug? in wait() */
        _exit( 0 ); /* Parent exits normally. */

        /* Install handler at parent for child's signal */
        /* sigaction should be more portable than signal, but it does not
         * work for some reason. */

                wait( &child_status );
                /* BUG? According to the man page wait() should also return
                 * when a signal comes in that is caught. Instead it
                 * continues to wait. */
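
A minimal sketch of the kind of error checking I mean, not a patch 
against the real source.  The helper names checked_wave_to_parent and 
checked_wait are hypothetical (they are not in server/main.c), and I 
print to stderr here only to keep the example self-contained; LCDd 
itself would presumably use its own report facility.  Note that where 
signal() installs handlers with restart semantics (as glibc does by 
default), wait() is restarted after the handler runs instead of 
returning with EINTR, which may be what the "BUG?" comment describes.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: child side of the handshake.
 * Sends SIGUSR1 to the parent and reports a failure instead of
 * silently ignoring it.  Returns 0 on success, -1 on failure. */
static int checked_wave_to_parent(pid_t parent_pid)
{
    if (kill(parent_pid, SIGUSR1) == -1) {
        int saved_errno = errno;
        fprintf(stderr, "wave_to_parent: kill(%ld, SIGUSR1) failed: %s\n",
                (long)parent_pid, strerror(saved_errno));
        errno = saved_errno;   /* keep errno intact for the caller */
        return -1;
    }
    return 0;
}

/* Hypothetical helper: parent side of the handshake.
 * Calls wait() and reports why it failed, if it did.
 * Returns whatever wait() returned (a child pid, or -1 on error). */
static pid_t checked_wait(int *child_status)
{
    pid_t pid = wait(child_status);

    if (pid == -1) {
        int saved_errno = errno;
        if (saved_errno == EINTR)
            fprintf(stderr, "wait() was interrupted by a caught signal\n");
        else
            fprintf(stderr, "wait() failed: %s\n", strerror(saved_errno));
        errno = saved_errno;   /* keep errno intact for the caller */
    }
    return pid;
}
```

The idea would be to call checked_wave_to_parent( parent_pid ) where the 
child calls kill() now, and checked_wait( &child_status ) where the 
parent calls wait(), then watch the output for either message.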


Peter McCurdy wrote:

> On 4/24/07, brian <turbo@talstar.com> wrote:
>
>> Peter McCurdy wrote:
>> > It sounds like there's a problem in having the child process signal
>> > the parent process that it's up and running.  When the original LCDd
>> > process forks the child to run in the background, it sits in a wait(2)
>> > system call for the child to either die (so the parent can exit
>> > abnormally) or send signal SIGUSR1 saying that it started OK (so the
>> > parent can exit normally).
>> >
>> > What happens if you 'kill -USR1' the parent LCDd process while it's
>> > stuck?
>>
>>    Okay, I just tried that.  Opened two ssh sessions into my MythTV 
>> box, and shut down the LCDd service.  In session one, I then ran
>> "service LCDd start" and it 'hung' as usual.
>>
>>    In the second session, I first ran "ps ax | grep LCDd" and saw:
>>
>>      [root@myth ~]# ps ax | grep LCDd
>>       8500 pts/2    S+     0:00 /bin/sh /sbin/service LCDd start
>>       8503 pts/2    S+     0:00 /bin/sh /etc/init.d/LCDd start
>>       8506 pts/2    S+     0:00 /bin/bash -c ulimit -S -c 0 >/dev/null 2>&1 ; /usr/local/sbin/LCDd -c /usr/local/etc/LCDd.conf
>>       8507 pts/2    S+     0:00 /usr/local/sbin/LCDd -c /usr/local/etc/LCDd.conf
>>       8508 ?        Ss     0:00 /usr/local/sbin/LCDd -c /usr/local/etc/LCDd.conf
>>       8510 pts/3    R+     0:00 grep LCDd
>>
>>    So, then I typed "kill -USR1 8507" in the second session, and in 
>> the first session, saw:
>>
>>      [root@myth ~]# service LCDd start
>>      Starting up LCDd:                                          [  OK  ]
>>      [root@myth ~]#
>>
>>    ...in other words, *that* worked, and the service is running as 
>> expected:
>>
>>      [root@myth ~]# ps ax | grep LCDd
>>       8508 ?        Ss     0:00 /usr/local/sbin/LCDd -c /usr/local/etc/LCDd.conf
>>       8515 pts/3    R+     0:00 grep LCDd
>>
>>   So, now the question is:  Why isn't the child process sending the 
>> SIGUSR1 signal?
>
>
> That's where I get lost.  Looking at the code, there doesn't seem to
> be any plausible way for the child process to keep running (as it
> does) but not send SIGUSR1.  I don't know many details about your
> MythTV system; I gather it's based on Redhat, but what version?  Does
> it do strange things with signals elsewhere?  Do you have a debugger
> on there that can attach to the process when it starts?  If so, put a
> breakpoint on the "wave_to_parent" function and see if it even tries
> to send the signal.
>
> It's particularly strange since this same code, including the init
> script, works just fine for me on CentOS 4 and 5 (aka Red Hat
> Enterprise Linux) without any fiddling.  So there's got to be
> something odd with the system, either at build time or at runtime.
>
> Peter.