Friday, June 10, 2005

Solaris mdb, crash, adb Command Examples

The Solaris mdb(1) utility replaces crash(1M) and adb(1). In fact, beginning
with Solaris 9, crash and its man page are completely gone. This note lists
some usage examples of crash and adb that I have collected over the years. The
commands are all applicable in mdb. Using mdb (or crash or adb) usually
involves dealing with hexadecimal numbers, picking the correct one out of
many in the output, and applying the cryptic adb macros to them (only a
small number of macros, listed in the Appendix, need no address or symbol).
Without enough examples, it is hard to learn how to use these powerful tools.
This note is not an introduction to them; for that, read the man pages and
the references [Panic] (Chapters 12, 13) and [Garden]. If you remember nothing
else, at least remember the commands $q (quit) and $c (stack trace).

Examples in [Filesystems]

[p.111] crash -> proc | grep myprogram -> user [proc slot, 1st column of proc
output] -> file [number following F under OPEN FILES...], which shows file ref
cnt, fs type, proc offset in file, flags=read

[p.138] crash, vnode -l [value under crash file cmd ADDR column]

[p.139] crash, od -x [value under crash file cmd ADDR column]. Output is:
hexnum1: hexnum2 hexnum3 hexnum4 hexnum5
hexnum6: hexnum7 hexnum8 hexnum9 hexnum10
vfsp stream pages type

[p.139] adb -k
[pages, hexnum9 above]$<pages shows file ops offset 0 meaning process at file
beginning.

[p.144-5] crash, as -f [proc slot] -> [number under DATA column on a row of
segvn_ops]$<segvn shows file ops offset and vp (vnode pointer) -> [vp from
above]$<vnode -> [data from above]$<inode shows number, which is inode plus
directory offset.

[p.151] p -f [proc slot] shows as (address space) -> [as from above]$<as ->
[segs from above]$<seglist -> [data from above]$<segvn -> [vp from
above]$<vnode -> [data from above]$<inode shows inode number

Examples in [Panic]

[p.39-40] adb -k -w /dev/ksyms /dev/mem -> rootdir/W 0 -> ls /
Never do that unless you need a system panic!

[p.55] adb, $<threadlist

[p.81] adb, $<utsname, hw_provider/s, architecture/s, srpc_domain/s

[p.83] adb -k /dev/ksyms /dev/mem, time/Y, lbolt/X -> [time output]-([lbolt
output]%0t100)=Y shows the boot time (% is integer division in adb, and 0t100
is decimal 100, the clock ticks per second). Can also do: time/D, lbolt/D ->
0t[time output]-(0t[lbolt output]%0t100)=Y
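
The arithmetic behind the adb expression above can be sketched in shell. This
is only an illustration: the function name boot_time and the sample numbers
are made up, and the clock rate of 100 ticks per second is an assumption taken
from the 0t100 divisor.

```shell
#!/bin/sh
# boot_time TIME LBOLT [HZ]: print the boot time in seconds since the epoch.
#   TIME  = current time in seconds since the epoch (kernel variable "time")
#   LBOLT = clock ticks since boot (kernel variable "lbolt")
#   HZ    = ticks per second; 100 is assumed, matching the 0t100 divisor
boot_time() {
    hz=${3:-100}
    # Integer division, like the % operator in adb
    echo $(( $1 - $2 / hz ))
}

# Hypothetical sample values, not taken from a real system:
# 8640000 ticks / 100 = 86400 seconds, i.e. one day of uptime.
boot_time 1118400000 8640000   # prints 1118313600
```

The printed number is still seconds since the epoch; the =Y format in adb is
what turns it into a readable date.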

[p.84] adb, *panicstr/s

[p.85] adb, $<msgbuf

[p.112] adb, rootfs$<bootobj, swapfile$<bootobj, dumpfile$<bootobj

[p.137,142] adb, $<proconcpu (a macro written by authors showing which process
on which CPU)

[p.271] adb, [ADDR from ps -l]$<proc -> [pidp (used to be a typo pipd) from
above]$<pid

[p.272-6] adb, [ADDR from ps -l]$<proc2u shows u area. At least on Solaris 10
x86, it doesn't show ofile (open files). -> [ofile from above]$<file -> [vnode
from above]$<vnode -> [stream from above]$<stdata -> [wrq from above]$<queue ->
[next from above]$<queue, [qinfo from above]$<qinit

[p.279] [qinfo from above]/X shows struct for ldterm.

[p.280] [first from above $<queue output]$<mblk, [rptr from
above],[wptr-rptr]/C shows streams module output.

[p.317] [2nd arg of _trap() if shown in $c]$<regs -> [pc from above]?i (/i if
not from core file)

[p.346] adb, $<traceall (Need to verify)

[p.352] adb, <sp$<stacktrace (Need to verify)

[p.390] adb, $<modules shows more module info than modinfo(1M).

[p.402] adb, $<cpus

[p.423-4] adb, [owner of addr$<inode]$<proc -> [uarea from above]$<u

Examples in [Garden]

[p.566] crash, dis
[I'll update this section later...]

Other Examples

adb, $<cpus -> [lwp from above]$<lwp, [thread from above]$<thread

[Rodney,http://groups.google.com/groups?selm=37422E37.F4260028%40microworld.com]
ps -elp, adb -k /dev/ksyms /dev/mem -> [ADDR from above]$<proc -> [tlist from
above]$<thread -> [sp from above]$c, [WCHAN from ps output]$<mutex

REFERENCES

[Panic] Chris Drake, Kimberly Brown, "Panic! UNIX System Crash Dump Analysis",
1995, Prentice Hall PTR.
[Garden] Berny Goodheart, James Cox, "The Magic Garden Explained: the Internals
of UNIX System V Release 4", 1994, Prentice Hall.
[Filesystems] Steve Pate, "UNIX Filesystems", 2003, Wiley.

APPENDIX

$ cat noaddrmacros.ksh
#!/usr/dt/bin/dtksh
# noaddrmacros: prints the names of all adb macros that don't need an address,
# i.e. macros called as $<macro instead of addr$<macro

cnttotal=0
cntnoaddr=0
cd /usr/lib/adb || exit 1
for i in *; do
    # Count only files that file(1) reports as ascii text
    if [[ $(file "$i" | awk '$2~/ascii/{print "ismacro"}') == "ismacro" ]]; then
        cnttotal=$((cnttotal + 1))
        # A macro whose first line does not start with "." needs no address
        if [[ $(head -1 "$i" | awk '$1!~/^\./ {print "noaddr"}') == "noaddr" ]]; then
            cntnoaddr=$((cntnoaddr + 1))
            echo "$i"
        fi
    fi
done
print -r "Total macros: $cnttotal; No-address macros: $cntnoaddr"
$ ./noaddrmacros.ksh #On Solaris 10 x86
audiotrace
buflist
buflist.nxt
buflistiter.nxt
callouts
ce_rxbufhist.nxt
ce_rxcomphist.nxt
ce_txhist.nxt
ce_txhist.nxt1
cglist
cglist.nxt
cglistchk.nxt
cglistiter.nxt
cpu_dptbl.nxt
cpus
dispqtrace.list
dispqtrace.nxt
ill_g_heads
inodelist
inodelist.nxt
inodelistiter.nxt
kmastat
major2snode.nxt
modules
modules.brief
modules.brief.nxt
modules.nxt
mount
msgbuf
nca_conntrace
nca_doortrace
nca_nodetrace
panicbuf
phyint_list
setproc.nop
sleepq.nxt
slpqtrace
slpqtrace.list
slpqtrace.nxt
svcpool_list
systemdump
threadlist
traceall.nxt
u.sizeof
utsname
v
v_call
vfslist
v_proc
Total macros: 911; No-address macros: 49

So about 95% of the macros need an address or symbol as the starting address.
[Panic p.110-1] tells us a trick to find the symbol. Suppose you're debugging
the kernel and want to use the bootobj macro. Since this macro needs a starting
address or symbol, let's look for the symbol in the kernel:
nm /dev/ksyms | grep -i boot | grep OBJT
That doesn't find any object named *boot*. Then
cd /usr/include/sys
find . -exec grep -i bootobj {} /dev/null \;
That finds the definition of struct bootobj in bootconf.h, which also shows
that rootfs, dumpfile and swapfile are of this type. These three variables also
appear in the output of `nm /dev/ksyms`, so you can use them as symbols for the
bootobj macro, e.g. rootfs$<bootobj, dumpfile$<bootobj.
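
The last check (is a given variable really a data object in the kernel symbol
table?) can be wrapped in a small helper. This is a rough sketch, not something
from [Panic]: the function name is_objt is made up, the sample lines are
invented, and it assumes nm prints "|"-separated fields with the symbol name
last.

```shell
#!/bin/sh
# is_objt NAME: succeed if NAME is listed as an OBJT (data object) symbol in
# the nm output read from stdin. Assumes "|"-separated nm lines with the
# symbol name in the last field.
is_objt() {
    # Keep OBJT lines, cut everything up to the last "|", strip padding,
    # then require an exact, literal match on the name.
    grep OBJT |
        sed 's/.*|//; s/[[:space:]]//g' |
        grep -F -x -- "$1" >/dev/null
}

# Hypothetical sample lines, made up for illustration (not real addresses):
sample_nm='[1]|0xfec00000|0x00000040|OBJT |GLOB |0    |ABS    |rootfs
[2]|0xfec00040|0x00000040|OBJT |GLOB |0    |ABS    |swapfile
[3]|0xfe800000|0x00000120|FUNC |GLOB |0    |ABS    |rootconf'

for sym in rootfs swapfile bootobj; do
    if printf '%s\n' "$sample_nm" | is_objt "$sym"; then
        echo "$sym: OBJT symbol found"
    else
        echo "$sym: not found as OBJT"
    fi
done
```

On a live system the input would come from nm itself, e.g.
nm /dev/ksyms | is_objt rootfs && echo found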

The 5% of macros that don't need an address won't necessarily work on core
files. For example, $<threadlist works on a system crash dump or a live system
(adb -k /dev/ksyms /dev/mem), but it doesn't work on a process core (adb
[path/executable] core).
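
The test that noaddrmacros.ksh applies to each macro (does the first line
start with "."?) can be tried on its own. A minimal sketch in plain sh: the
function name needs_addr and the two macro files are made up for illustration,
and their contents are not copied from /usr/lib/adb.

```shell
#!/bin/sh
# needs_addr FILE: succeed if the first line of the macro file begins with ".",
# i.e. the macro starts from the current address and must be called as
# addr$<macro. This mirrors the head/awk test in noaddrmacros.ksh.
needs_addr() {
    head -n 1 "$1" | grep '^[[:space:]]*\.' >/dev/null
}

# Two made-up macro files (hypothetical contents)
dir=$(mktemp -d)
printf './"field"16t"value"n\n' > "$dir/withaddr"
printf 'somesymbol/"field"16t"value"n\n' > "$dir/noaddr"

for m in "$dir"/*; do
    if needs_addr "$m"; then
        echo "$(basename "$m"): call as addr\$<macro"
    else
        echo "$(basename "$m"): call as \$<macro"
    fi
done
rm -rf "$dir"
```

Like the original script, this is only a heuristic based on the first line; it
says nothing about whether the macro works on a given kind of core file.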
