[wish-info] Something obvious?

Discussion:

Dave Close

2004-05-30 20:15:01 UTC

I'm trying to get a PowerLinc USB working. I tried for a while to get
WiSH 2 working on a 2.6 kernel, but no luck. Now that I understand
the difficulties and have read Scott's statement, I've given that up
and returned to 2.4. That ought to be easy, right? Lots of folks are
using it successfully...

Starting with a vanilla WiSH 1.6.9 tar file and a customized 2.4.19
kernel, with all kernel source in the right place, I had no trouble
compiling or installing the program. Of course, the install script puts
the modules in /usr/local, so I moved them to /lib/modules where my
kernel can find them. (I don't see any build dependencies on that.) The
example init script does not include rmmod hid, but I've added that. I
also modified the init script to replace x10_pl with x10_plusb in both
places. Here's what the kernel logs when I start it.

usb.c: deregistering driver hiddev
usb.c: deregistering driver hid
x10: X10 Transceiver module v1.6.9 (***@sprintmail.com)
x10: $Id: x10_core.c,v 1.47 2004/02/08 19:40:54 whiles Exp $
x10: $Id: x10_ldisc_plusb.c,v 1.17 2003/05/30 03:19:42 whiles Exp whiles $
x10: $Id: x10_xcvr_plusb.c,v 1.14 2003/05/27 05:49:09 whiles Exp $
usb.c: registered new driver PowerLincUSB
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c0115e6a
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0010:[<c0115e6a>] Not tainted
EFLAGS: 00010092
eax: e09c6610 ebx: 00000000 ecx: d6e53e14 edx: d6e53e0c
esi: 00000202 edi: 00000008 ebp: 00000000 esp: d6e53e00
ds: 0018 es: 0018 ss: 0018
Process insmod (pid: 2112, stackpage=d6e53000)
Stack: e09c6604 d6e52000 c010773f 00000001 d6e52000 e09c6610 00000000 d6e53e78
d6e53e78 00000008 00000000 c010780b e09c6604 d6e53e78 e09c15c3 dfb63200
00000000 e0858d22 dfb63200 e0858c39 dfcc14c0 d6e53e78 e09c4ac0 e09c6040
Call Trace: [<e09c6604>] [<c010773f>] [<e09c6610>] [<c010780b>] [<e09c6604>]
[<e09c15c3>] [<e0858d22>] [<e0858c39>] [<e09c4ac0>] [<e09c6040>] [<e09bf08e>]
[<e09bbb40>] [<e09bb9e9>] [<e09c4ac0>] [<e09c6040>] [<e09c4ac0>] [<e09bef1f>]
[<e09c6040>] [<c0117a80>] [<e09c4ac0>] [<e09bb32f>] [<e09c17be>] [<e09c4a60>]
[<c012e601>] [<c0118755>] [<e09c38b4>] [<e09bb060>] [<c010891b>]

Code: 89 0b 56 9d 5b 5e c3 eb 0d 90 90 90 90 90 90 90 90 90 90 90

A search of the mailing list archive doesn't show any previous reports
of this problem. And, as I said, I haven't modified the code in any way,
just compiled it straight "out of the box". Somewhere there is a symbol
that seems to be unresolved. What am I missing?

--
Dave Close, Compata, Costa Mesa CA +1 714 434 7359
***@compata.com ***@alumni.caltech.edu
"..the last seven decades of the twentieth century will be characterized
in history as the dark ages of theoretical physics." -- Carver Mead

Scott Hiles

2004-05-30 21:39:15 UTC

Under some kernels, when x10_plusb.o is placed in the kernel path, the
kernel will load both hid.o and x10_plusb.o. x10_plusb.o will report that it
loaded but will not be able to get the USB device, since hid.o is holding
onto it. When you then attempt to remove hid.o, the kernel will crash. This
is why x10_plusb.o is installed in /usr/local/lib/modules...: it allows you
to remove hid.o before you attempt to load x10_plusb.o.

Scott

_______________________________________________
wish-info mailing list
https://lists.sourceforge.net/lists/listinfo/wish-info

Dave Close

2004-05-30 22:13:23 UTC

Post by Scott Hiles
Under some kernels, when x10_plusb.o is placed in the kernel path, the
kernel will load both hid.o and x10_plusb.o. x10_plusb.o will report that it
loaded but will not be able to get the USB device, since hid.o is holding
onto it. When you then attempt to remove hid.o, the kernel will crash. This
is why x10_plusb.o is installed in /usr/local/lib/modules...: it allows you
to remove hid.o before you attempt to load x10_plusb.o.

Thanks. But that doesn't seem to be the case for my kernel. The log
(which I posted) doesn't show it trying to reload hid. Here's the lsmod
output after the problem. There are some other USB modules loaded.

Module                  Size  Used by    Not tainted
x10_plusb              46620   1 (initializing)
soundcore               6628   0 (autoclean)
binfmt_aout             5008   0
parport_pc             18244   1 (autoclean)
lp                      8224   0 (autoclean)
parport                34368   1 (autoclean) [parport_pc lp]
autofs                 11204   0 (autoclean) (unused)
nfs                    80700   4 (autoclean)
lockd                  55456   1 (autoclean) [nfs]
sunrpc                 73780   1 (autoclean) [nfs lockd]
tulip                  40864   1
microcode               4444   0 (autoclean)
ide-scsi                9792   0
ide-cd                 35264   0
cdrom                  34176   0 [ide-cd]
printer                 8512   0
usb-storage            40036   1
mousedev                5088   0 (unused)
keybdev                 2464   0 (unused)
input                   5664   0 [mousedev keybdev]
ehci-hcd               24992   0 (unused)
usb-ohci               20768   0 (unused)
usbcore                78784   1 [x10_plusb printer usb-storage ehci-hcd usb-ohci]
rtc                     7836   0 (autoclean)
ext3                   58672   4
jbd                    36960   4 [ext3]
aic7xxx               118576   0 (unused)
sd_mod                 10288   2
scsi_mod               83408   4 [ide-scsi usb-storage aic7xxx sd_mod]
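
For anyone who wants to repeat this check without reading the listing by eye, here is a minimal sketch in C. It assumes the input is text in the style of lsmod or /proc/modules output, one module per line with the module name as the first field. The function name module_listed is made up for this example and is not part of WiSH or the kernel.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if `name` is the first field of any line in `listing`
 * (text in the style of lsmod or /proc/modules output), 0 otherwise.
 * Only the first whitespace-delimited field of each line is compared,
 * so "hid" does not match "hiddev" or a name inside a [dependency] list.
 * module_listed is a hypothetical helper name used for illustration.
 */
int module_listed(const char *listing, const char *name)
{
    size_t name_len = strlen(name);
    const char *line = listing;

    while (line != NULL && *line != '\0') {
        /* Length of the first field on this line. */
        size_t field_len = strcspn(line, " \t\n");

        if (field_len == name_len && strncmp(line, name, name_len) == 0)
            return 1;

        /* Advance to the start of the next line, if there is one. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return 0;
}
```

Run against the listing above, module_listed(listing, "hid") returns 0 and module_listed(listing, "x10_plusb") returns 1, which matches what the table shows: hid is not loaded at the time of the crash.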

--
Dave Close, Compata, Costa Mesa CA +1 714 434 7359
***@compata.com ***@alumni.caltech.edu
"The Edsel is here to stay." -- Henry Ford II, 1957

Dave Close

2004-05-31 19:58:15 UTC

More information. I edited the code to turn on debugging, then found
this output below in syslog. It appears that the code is entering the
x10_plusb_write() routine but not getting much further.

usb.c: deregistering driver hiddev
usb.c: deregistering driver hid
x10: X10 Transceiver module v1.6.9 (***@sprintmail.com)
x10: $Id: x10_core.c,v 1.47 2004/02/08 19:40:54 whiles Exp $
x10: $Id: x10_ldisc_plusb.c,v 1.17 2003/05/30 03:19:42 whiles Exp whiles $
x10: $Id: x10_xcvr_plusb.c,v 1.14 2003/05/27 05:49:09 whiles Exp $
x10_core.c/271/x10_init(): starting initialization
x10_core.c/324/x10_init(): DEVFS not active, using standard mode
x10_core.c/326/x10_init(): registering data_major device 120 for x10d
x10_core.c/340/x10_init(): registering control_major unit 121 for x10c
x10_core.c/354/x10_init(): X10 /dev devices registered
x10_ldisc_plusb.c/244/ldisc_init(): PowerLinc USB line discipline loading
x10_ldisc_plusb.c/187/usb_data_init() called
usb.c: registered new driver PowerLincUSB
x10_ldisc_plusb.c/378/plusb_probe() called
x10_ldisc_plusb.c/382/plusb_probe(): vendor=0x00ff product=0x0022 (no match)
x10_ldisc_plusb.c/204/plusb_delete() called
x10_ldisc_plusb.c/255/ldisc_init(): PowerLinc USB line discipline loaded (result=0)
x10_core.c/521/x10io_register_comm() called
x10_ldisc_plusb.c/283/x10_plusb_connect() called
x10_ldisc_plusb.c/286/x10_plusb_connect(): PowerLinc USB line discipline loaded
x10_xcvr_plusb.c/132/xcvr_init() called
x10_core.c/431/x10io_register_device() called
x10_xcvr_plusb.c/159/xcvr_connect() called
x10_ldisc_plusb.c/295/x10_plusb_write() called
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c0115e6a
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0010:[<c0115e6a>] Not tainted
EFLAGS: 00010096
eax: e09c6690 ebx: 00000000 ecx: d679de10 edx: d679de08
esi: 00000202 edi: 00000008 ebp: 00000000 esp: d679ddfc
ds: 0018 es: 0018 ss: 0018
Process insmod (pid: 2272, stackpage=d679d000)
Stack: e09c6684 d679c000 c010773f 00000001 d679c000 e09c6690 00000000 d679de78
d679de78 00000008 00000000 c010780b e09c6684 00000001 e09c1613 e09c1f3b
e09c1e53 00000128 e09c1ede e09c1f51 e09c6504 00000000 d679de78 e09c4b40
Call Trace: [<e09c6684>] [<c010773f>] [<e09c6690>] [<c010780b>] [<e09c6684>]
[<e09c1613>] [<e09c1f3b>] [<e09c1e53>] [<e09c1ede>] [<e09c1f51>] [<e09c6504>]
[<e09c4b40>] [<e09c60c0>] [<e09bf08e>] [<e09c1d34>] [<e09bbb40>] [<e09bb9e9>]
[<e09c4b40>] [<e09c60c0>] [<e09c4b40>] [<e09bef1f>] [<e09c60c0>] [<e09c1cf3>]
[<e09c1d04>] [<c0117a80>] [<e09c4b40>] [<e09bb32f>] [<e09c17ea>] [<e09c17bb>]
[<e09c17c6>] [<e09c17fd>] [<e09c181e>] [<c0118755>] [<e09c3934>] [<e09bb060>]
[<c010891b>]

Code: 89 0b 56 9d 5b 5e c3 eb 0d 90 90 90 90 90 90 90 90 90 90 90

With debugging turned on, the routine should log a hex dump soon after it is
entered. That dump does not appear in the log, so I would guess something is
going wrong in the two statements before it.

    ANNOUNCE;
    if (!test_bit(0,&connected)) {
        dbg("%s","write called before connected");
        return -ENODEV;
    }
    if (down_interruptible(&sem_transmit))
        return -ERESTARTSYS;
    dbg("%s",dumphex(buf,len));

I've inserted a few additional debug statements, but I still don't see the
problem. Both connected and sem_transmit have non-zero values. len is 8.
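
A note on reading the trace above, offered as a sketch rather than a diagnosis: on i386, the number after "Oops:" is the page-fault error code. Bit 0 distinguishes a protection fault from a not-present page, bit 1 a write from a read, and bit 2 user mode from kernel mode. The helper below decodes that value. Its name, describe_oops_code, and the PF_* macro names are made up for this example and are not taken from WiSH or the kernel source.

```c
#include <stdio.h>

/* Bit layout of the i386 page-fault error code printed after "Oops:". */
#define PF_PROTECTION 0x1u  /* 0: page not present, 1: protection fault */
#define PF_WRITE      0x2u  /* 0: read access,      1: write access     */
#define PF_USER       0x4u  /* 0: kernel mode,      1: user mode        */

/*
 * Write a short description of `code` into `out` (at most `size` bytes,
 * always NUL-terminated when size > 0). Returns snprintf's result, i.e.
 * the length the full description needs.
 * describe_oops_code is a hypothetical helper name used for illustration.
 */
int describe_oops_code(unsigned int code, char *out, size_t size)
{
    return snprintf(out, size, "%s access to a %s page in %s mode",
                    (code & PF_WRITE) ? "write" : "read",
                    (code & PF_PROTECTION) ? "protected" : "not-present",
                    (code & PF_USER) ? "user" : "kernel");
}
```

For the value in this log, describe_oops_code(0x0002, buf, sizeof buf) fills buf with "write access to a not-present page in kernel mode". That fits the "NULL pointer dereference at virtual address 00000000" message and the ebx value of 00000000 in the register dump, but it does not by itself say which pointer was NULL.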

--
Dave Close, Compata, Costa Mesa CA
***@compata.com, +1 714 434 7359
***@alumni.caltech.edu
"You don't fight wars by blowing rose water through corn stalks."
-- Abraham Lincoln

Scott Hiles

2004-05-31 20:09:34 UTC

I don't remember having to do any tricks with 2.4.19, and it is bombing
before it gets to write to the USB interface. My guess is that something
gets messed up somewhere before this call and doesn't surface until this
point. I currently run version 1.6.10 on kernel 2.4.24.

The reason I attempted version 2.0 was the USB problems in the 2.4 series
of kernels and the difficulty of debugging kernel issues like this one. But
I ran out of time for developing code and never got 2.0 to work reliably on
anything other than a Red Hat 9 system.

If possible, try kernel 2.4.24. Also, I am assuming that you built the
kernel yourself and that its source tree is referenced at /usr/src/linux or
/usr/src/linux-2.4.19, so that when you built the drivers they picked up the
.config from the correct source tree.
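
One way to sanity-check the source tree is to compare the release string it would build with what `uname -r` reports for the running kernel. The sketch below assumes the top-level kernel Makefile uses the usual "VERSION = 2", "PATCHLEVEL = 4", "SUBLEVEL = 19", "EXTRAVERSION = ..." lines; the helper names makefile_value and tree_release are made up for this example. A matching version does not prove the .config matches, so treat this as a first check only.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy the value of `key` from Makefile-style `text` (a "KEY = value"
 * assignment at the start of a line) into `out`, which must have room
 * for at least one byte. Returns 1 if the key was found, 0 otherwise.
 * Hypothetical helper for illustration only.
 */
static int makefile_value(const char *text, const char *key,
                          char *out, size_t size)
{
    size_t key_len = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, key, key_len) == 0) {
            const char *p = line + key_len;

            p += strspn(p, " \t");          /* skip blanks before '=' */
            if (*p == '=') {
                size_t len;

                p++;
                p += strspn(p, " \t");      /* skip blanks after '='  */
                len = strcspn(p, " \t\r\n");
                if (len >= size)
                    len = size - 1;         /* truncate to fit `out`  */
                memcpy(out, p, len);
                out[len] = '\0';
                return 1;
            }
        }
        line = strchr(line, '\n');          /* move to the next line  */
        if (line != NULL)
            line++;
    }
    return 0;
}

/*
 * Build "VERSION.PATCHLEVEL.SUBLEVEL" followed by EXTRAVERSION from the
 * Makefile text. Returns 1 on success, 0 if a required key is missing.
 */
int tree_release(const char *makefile_text, char *out, size_t size)
{
    char v[16], p[16], s[16], e[64];

    if (!makefile_value(makefile_text, "VERSION", v, sizeof v) ||
        !makefile_value(makefile_text, "PATCHLEVEL", p, sizeof p) ||
        !makefile_value(makefile_text, "SUBLEVEL", s, sizeof s))
        return 0;
    if (!makefile_value(makefile_text, "EXTRAVERSION", e, sizeof e))
        e[0] = '\0';                        /* EXTRAVERSION is optional */
    snprintf(out, size, "%s.%s.%s%s", v, p, s, e);
    return 1;
}
```

The resulting string can then be compared with the output of `uname -r`; if they differ, the drivers were probably built against a different tree than the kernel that is running.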

Scott

Dave Close

2004-06-01 03:52:48 UTC

Post by Scott Hiles
I don't remember having to do any tricks with 2.4.19, and it is bombing
before it gets to write to the USB interface. My guess is that something
gets messed up somewhere before this call and doesn't surface until this
point. I currently run version 1.6.10 on kernel 2.4.24.

I think I said I was trying to run version 1.6.9. I was misled by the
message in syslog; I am actually trying to use 1.6.10, just like you.

On the chance that the problem was not really WiSH related, but had
something to do with other processes on the machine, I've now installed
the program on a Red Hat 9 system with the official 2.4.20-8 kernel and
it is now working perfectly. The first machine I tried to use has other
USB devices, including a USB disk drive, that are active. I suspect that
removing hid is not sufficient in that case, and I can't remove other
modules as they are in active use.

So my problems are resolved and I have a working system, though not on the
machine I would have preferred. And your concerns about the USB driver
seem even better founded. Thanks for the support.

--
Dave Close, Compata, Costa Mesa CA +1 714 434 7359
***@compata.com ***@alumni.caltech.edu
"Everything that can be invented has been invented."
-- Charles H. Duell, Commissioner, US Patent Office, 1899