Friday, November 18, 2005

Backups Under Solaris

Other Backup Utilities

In addition to the basic Unix backup utilities, Solaris offers ufsdump
and ufsrestore. These two commands function as a pair. ufsrestore can
only restore from tapes created by ufsdump. Both are called from the
command line and follow the syntax:

ufsdump options arguments filenames

ufsrestore options argument filenames

ufsdump only copies data from a raw disk slice. It does not copy free
blocks. If a directory contains a symbolic link that points to a file
on another disk slice, the link itself is copied. When ufsdump is used
with the u option, the /etc/dumpdates file is updated. This file keeps
a record of when filesystems were last backed up, including the level
of the last backup, day, date and time. ufsdump can be used to back up
individual files and directories as well as entire filesystems.

If ufsdump is run without options or arguments the following defaults
are assumed:

ufsdump 9uf /dev/rmt/0 filenames

This means that ufsdump will create a level 9 incremental backup of the
specified files, update /etc/dumpdates, and dump the files to
/dev/rmt/0.

ufsdump also supports an S option to estimate the amount of space, in
bytes, that the backup will require before actually creating it. This
number can then be divided by the capacity of the tape to determine
how many tapes the backup will need.
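As a quick sketch of that arithmetic (the numbers below are made up, not taken from any real dump — on a real system the estimate would come from something like `ufsdump 0S /dev/rdsk/c0t0d0s7`):

```shell
# Hypothetical example: turn the byte estimate from ufsdump's S option
# into a tape count. Values are hard-coded samples, expressed in MB.
EST_MB=5000     # sample estimate (would come from ufsdump 0S ...)
TAPE_MB=4096    # assumed capacity of one cartridge
TAPES=$(( (EST_MB + TAPE_MB - 1) / TAPE_MB ))   # divide, rounding up
echo "Backup needs $TAPES tape(s)"
```

Rounding up matters: a 5000 MB dump onto 4096 MB tapes needs two tapes, not one.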

A series of tape characteristics and options can also be specified.

* c=cartridge
* d=density
* s=size
* t=number of tracks

These options can be given in any order, as long as the arguments that
follow are in the same order, e.g.:

ufsdump cdst 1000 425 9

This specifies a cartridge tape with a density of 1000, 425MB, and
nine tracks. In terms of tape options the syntax is as follows:

ufsdump 9uf /dev/rmt/XAbn filenames


* X=the number of the drive beginning with 0.
* A=optional density.
o l=low
o m=medium
o h=high
o u=ultra
o c=compressed
* b=specifies Berkeley SunOS 4.X compatibility.
* n=no rewind option, which allows other files to be appended to the tape.

ufsrestore has an interactive mode which can be used to select
individual files and directories for restoration. It also supports an
option to read the table of contents from the archive file instead of
the backup media.

Limits of ufsdump:

* It does not automatically calculate the number of tapes needed
to back up a filesystem.
* It does not have a built-in error-checking mechanism.
* It does not enable the backing up of files that are remotely
mounted from a server.

Solaris also supplies volcopy, a utility to make an image or literal
copy of a file system.
Tips and Quirks

The Solaris version of tar includes extra options. The -I option reads
the list of files and directories to be backed up from a text file. The
-X option specifies an exclusion file that lists the names of files and
directories that should be excluded from the archive.

The Solaris version of mt supports an asf subcommand, which positions
the tape at the nth file, n being the absolute number of the file on
the tape.

Friday, November 04, 2005



Hi all.
Sorry for my English.
I have an E4800 server, a T3 array, and a SAN. My server has 2 FC
adapters: one for the T3, the other for the SAN (HP EVA5000, QLA2300).
After reboot I see the qlc driver trying to attach to both FC adapters,
but it cannot attach to the QLA2300. I installed the qla2300 driver,
but it is not initialized. The qlc driver always tries to attach to the
QLA2300. What do I need to do?


Maybe you need to look at and configure qla.conf.
The best thing is to look here for answers:


The qlc driver cannot attach to the QLA2300 card because the QLA2300
card does not work with the qlc driver. This is normal.

However, you should be able to install the qla driver and run both
drivers at the same time.


The first thing to check is whose HBA each one is, by looking at the PCI identifiers.

Look at the output of the "prtconf -vpD" command.

Each HBA will have two lines that are important. The first to appear
is "compatible:" and the second is "name:"
compatible: 'pci1077,2.1077.2.5' + 'pci1077,2.1077.2' + 'pci1077
,2' + 'pci1077,2200.5' + 'pci1077,2200' + 'pciclass,10000' + 'pciclass,0000'
name: 'QLGC,qla'

The OS will then look for a "name" listed in the
/etc/driver_aliases to match a driver to the HBA. If a "name" is not
found the OS starts using each of the compatible entries and will
match drivers to those entries.
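The matching order described above can be sketched as a toy script. The alias lines and compatible entries below are sample data, not taken from a real system; the real inputs would be /etc/driver_aliases and the prtconf output:

```shell
# Toy illustration: walk the "compatible" entries in sequence and take
# the first one that appears in the aliases file.
ALIASES='qlc "pci1077,2200"
qlc "pci1077,2300"
qla2300 "pci1077,106"'

for entry in 'pci1077,2300.1077.106.1' 'pci1077,106' 'pci1077,2300'; do
    driver=$(printf '%s\n' "$ALIASES" | grep -F "\"$entry\"" | awk '{print $1}')
    if [ -n "$driver" ]; then
        echo "bound by: $driver (via $entry)"
        break
    fi
done
```

Note how an alias for an entry that appears *earlier* in the compatible list wins, even if a later entry also has a driver.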

What you could do is run the following commands and post their output
here and I'll tell you what's wrong:

prtconf -vpD | grep 1077
grep ql /etc/driver_aliases



Did you ever get a resolution for this? I have almost the exact same
situation and have been unable to get the QLA driver to attach to the
card.

402) root@cohuxfs27:/etc/cfg/fp> prtconf -vpD | grep 1077
compatible: 'pci1077,2300.1077.106.1' + 'pci1077,2300.1077.106' +
'pci1077,106' + 'pci1077,2300.1' + 'pci1077,2300' + 'pciclass,0c0400'
+ 'pciclass,0c04'
subsystem-vendor-id: 00001077
vendor-id: 00001077
compatible: 'pci1077,2200.1077.4082.5' + 'pci1077,2200.1077.4082' +
'pci1077,4082' + 'pci1077,2200.5' + 'pci1077,2200' + 'pciclass,010000'
+ 'pciclass,0100'
subsystem-vendor-id: 00001077
vendor-id: 00001077
compatible: 'pci1077,2200.1077.4082.5' + 'pci1077,2200.1077.4082' +
'pci1077,4082' + 'pci1077,2200.5' + 'pci1077,2200' + 'pciclass,010000'
+ 'pciclass,0100'
subsystem-vendor-id: 00001077
vendor-id: 00001077
compatible: 'pci1077,2200.5' + 'pci1077,2200' + 'pciclass,010000' +
vendor-id: 00001077
(403) root@cohuxfs27:/etc/cfg/fp> grep ql /etc/driver_aliases
qlc "pci1077,2200"
qlc "pci1077,2300"
qlc "pci1077,2312"
qla2300 "pci1077,9"
qla2300 "pci1077,100"
qla2300 "pci1077,101"
qla2300 "pci1077,102"
qla2300 "pci1077,103"
qla2300 "pci1077,104"
qla2300 "pci1077,105"
qla2300 "pci1077,109"
qla2300 "pci1077,116"
qla2300 "pci1077,115"


I extracted the file from the QLogic SCLI utility.
It has an index of the identities of the QLogic cards. That first one
of 1077,106 is a Sun Amber-2 X6767, not an HP card as you thought.

The two 1077,4082 cards are Sun Amber HBA ports.

Lastly, the 1077,2200.5 is probably either a generic QLogic card or
on-board fibre, if this is a SB100/280R...
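For completeness, a hedged sketch of the kind of fix this diagnosis points toward: the earlier grep showed no qla2300 alias for "pci1077,106", so nothing claims that identifier. The alias data below is a stand-in, and the update_drv line is left commented out; verify against QLogic's install notes before running anything like it on a real system:

```shell
# Sketch: detect the missing alias and show the hypothetical fix.
# ALIASES stands in for /etc/driver_aliases.
ALIASES='qlc "pci1077,2300"
qla2300 "pci1077,105"'
if ! printf '%s\n' "$ALIASES" | grep -qF '"pci1077,106"'; then
    echo 'alias missing for pci1077,106'
    # update_drv -a -i '"pci1077,106"' qla2300   # then a reconfiguration boot
fi
```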

how many file descriptors does the Xnewt process have open?

> When this happens, how many file descriptors does the Xnewt
> process have open? ('ls -l /proc/<pid>/fd' or something
> similar.)

With ls -l /proc/<pid>/fd | wc -l

227 for a fresh GNOME session and one gnome-terminal window open
222 for a fresh XFCE4 session and one xterm window open

(different PIDs for each of these checks)

A given process is:

root 10091 9074 2 12:09 ? 00:00:00 /usr/X11R6/bin/Xnewt :26
-auth /var/lib/wdm/authdir/authfiles/A:26-AY24Hr -dpms

This is interesting, because here is some of the 'ls -l /proc/10091/fd' output:

lrwx------ 1 root root 64 Mar 16 12:11 10 ->
/var/lib/wdm/authdir/authfiles/A:12-d8mxzX (deleted)
lrwx------ 1 root root 64 Mar 16 12:11 100 ->
/var/lib/wdm/authdir/authfiles/A:11-QHkbcU (deleted)
lrwx------ 1 root root 64 Mar 16 12:11 101 ->
/var/lib/wdm/authdir/authfiles/A:11-bbVcHq (deleted)
lrwx------ 1 root root 64 Mar 16 12:11 102 ->
/var/lib/wdm/authdir/authfiles/A:27-FVil3v (deleted)
lrwx------ 1 root root 64 Mar 16 12:11 103 ->
/var/lib/wdm/authdir/authfiles/A:9-1iqtPQ (deleted)
lrwx------ 1 root root 64 Mar 16 12:11 104 ->
/var/lib/wdm/authdir/authfiles/A:28-ZcO3OP (deleted)
lrwx------ 1 root root 64 Mar 16 12:11 105 ->
/var/lib/wdm/authdir/authfiles/A:11-65HZQD (deleted)
lrwx------ 1 root root 64 Mar 16 12:11 106 ->
/var/lib/wdm/authdir/authfiles/A:16-J7pYyu (deleted)
lrwx------ 1 root root 64 Mar 16 12:11 107 ->
/var/lib/wdm/authdir/authfiles/A:11-RYJlqJ (deleted)

Friday, October 28, 2005

httpd processes each with a size of 148M... is the top "size" display directly related to memory

> When I run top on this box, I can see 6 httpd processes each with a
> size of 148M... is the top "size" display directly related to memory?

Yes, it's the amount of virtual memory used by this process. You should
see a similar number (with much greater detail) by doing pmap -x on the
process.

> If it is, how can I possibly be running 6x148M processes just on apache
> alone?

Every page used by the process is not necessarily private to that
process. Read-only portions of the Apache binary may be shared among
all 6, and system libraries (like libc) may be shared by many programs.

The 'pmap -x <pid>' output shows that more explicitly.
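As a rough sketch, the per-mapping numbers can be totaled with awk. The sample lines below are illustrative, not real pmap output, and the column layout varies between Solaris releases (here the size in KB is field 2 and the resident set is field 3), so check the header line on your own system first:

```shell
# Sum the Kbytes and RSS columns of (sample) `pmap -x <pid>` output.
PMAP='00010000     640     640 r-x--  httpd
00200000    1024     512 rw---    [ heap ]
FF280000     896     768 r-x--  libc.so.1'
printf '%s\n' "$PMAP" | awk '{kb += $2; rss += $3}
    END {printf "SIZE=%dK RSS=%dK\n", kb, rss}'
```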

Then you might also want to know that there are a lot more tools
that offer detailed information regarding system state and performance:
sar, vmstat, iostat, trapstat, cpustat, mpstat, cputrack, busstat, kstat

Darren already gave a good answer, but I wanted to elaborate a little.
(Or, after looking back on this after I've written the whole post,
apparently more than a little...)

On a simple computer, there is just a certain amount of RAM available
and every address that a program uses (in a pointer, in an address
register, or whatever) simply corresponds to part of that RAM. And
every program executes in the same address space, which means a given
address refers to the same thing whether it's in the context of one
process or another.

But on Solaris, there is virtual memory, and every process has its
own address space. Using the MMU hardware, the system maps several
different ranges of the process's address space to different things.
Some of the address ranges are private areas that only the process
itself has access to. Any memory you allocate with malloc() will be
in a private address range. Some address ranges correspond to regions
of files on disk. (In Solaris, executables are loaded by setting up
address ranges in the process's address space that correspond to
parts of the executable file. And the same thing is done when an
executable runs against a shared library.) Some address ranges
correspond to other things (sometimes even things like address ranges
that are used to communicate with hardware other than RAM).

Now to make things even a bit more complicated, just the existence
of an address range within a process's address space does not imply
that any RAM is used for that range. For example, if an address
range corresponds to a region of a file and if you've never either
read from or written to any addresses in that range, Solaris doesn't
need to waste time or memory putting that data into RAM. And to
make things yet more complicated, even if Solaris does need to use
RAM for (part of) an address range, if two processes are using the
same region of the same file, Solaris can use the same RAM for
both processes, even if the addresses that would be used within
the processes to access that data aren't the same addresses. And
to make things even yet more complicated, if a process has an
address range that contains private data and that process does a
fork(), then the twin processes that result can both use the same
RAM (or swap space) for that data until the time when one of them
tries to change the data (at which point a copy must be made so
they have their own separate copies).

So, when you run top, and you see the "SIZE" of the process, what
you're seeing is the size of all the address ranges that have been
created for that process. Most (but not all) of these address
ranges could correspond to data in RAM, but even if they do, it
might be data that is shared with another process. So when you
see 148M for an Apache process, that just means that there are 148M
worth of addresses that the Apache process could theoretically
access if it wanted to.

The "RES" column of top's output is a lot closer to the RAM usage
of a process, but it's still not exactly the same thing. The "RES"
column just tells you, of all the addresses that a process could
potentially access, how many of them are currently connected to a
particular spot in physical RAM. That is, how many of those pages
are resident in physical memory. It doesn't say how many of those
are shared with other processes. It's tempting to say that the
"RES" column tells you how much memory the process is using, but
that's not entirely accurate. It's entirely possible for a process
to be totally dormant and not running, yet have its resident size increase.
This could happen if the dormant process's address space refers to
some region of a file that some *other* process just accessed and
thus forced into memory.

And in fact, this type of thing partly accounts for why you see such
high numbers, even in the "RES" column. You could have a process
which only accesses a tiny portion of some file that's mapped into
its address space, but if a bunch of other processes also access
tiny portions of the same file, that will increase the resident
size of all the processes that map the (whole) file. That
may seem a little unlikely, but actually it is quite common because
things like libc (the C shared library) are used by a bunch of
processes, and even though each process might only use a few functions,
still when you count the total number of functions that are actively
used from that library, a significant portion of the library ends up
being resident, and that means that it inflates the resident size
numbers of all processes that use it.

> So I learned yesterday that the native stat tool for Solaris is
> prstat.. So I'm guessing from your posting Logan that in prstat, the
> SIZE column is the total amount of memory that each process can use and
> the RSS is the actual amount that's used..

"can use" and "that's used" as a description of memory seem incorrect to me.

The difference is between virtual memory pages that are actually
resident in RAM (RSS) and those that are allocated, whether currently in
RAM or not (SIZE). I don't think that "in use" and "in RAM" are quite
the same thing...

> "In a virtual memory(1) system, a process' resident set is that part of
> a process' address space which is currently in main memory. If this
> does not include all of the process' working set, the system may
> thrash."
> That makes sense to me except for one thing... If the SIZE is the total
> amount, how come it fluctuates?

Processes allocate memory while they run. Most programs will grow, but
not shrink, but both are possible.

> Whoah.. I just looked at pmap.... thats insane.

Don't try to interpret everything. Most bits are just mappings from
other shared objects. In an average program, the most common place for
it to consume memory directly is in the [heap].

However the breakdown of shared/private/RAM/total can be useful.

> If I can't map SIZE and RSS to what's available and what's in use, how
> can I tell when the memory needs to be upgraded?

I was making the distinction between "in use" (which is a little fuzzy
for me when talking about pages) and "in RAM" (which is well-defined).

You can map SIZE and RSS to what pages are in RAM at any one time, which
will probably be all you need.

What "in use" means, and whether or not that has anything to do with RAM
residency is a different question.


The top part of the display of top shows memory usage. The 'swap -s'
command shows how much swap is in use. I think both of these include
swap backed by the executables and shared images on disk as well as
swap backed by the swap file. The 'swap -l' command will show how much
of the actual swap space is in use. A rule of thumb is that when the
amount of swap in use starts to be as big as your memory, you need to
add memory.
If you have a lot of mostly inactive programs in use, you can allow more
swap to be used without hurting performance.
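A toy version of that rule of thumb, with made-up numbers (on a real box they would come from `swap -s` and `prtconf`); the 90% threshold is an assumption, not from the original post:

```shell
# Flag when swap in use approaches the size of physical RAM.
RAM_MB=2048
SWAP_USED_MB=1900
if [ "$SWAP_USED_MB" -ge $(( RAM_MB * 9 / 10 )) ]; then
    echo "swap in use is close to RAM size: consider adding memory"
else
    echo "swap usage looks comfortable"
fi
```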

One way is to look at the pi and po (page in and page out) columns
in the vmstat output. Assuming you're not starting lots of new programs,
high values here could indicate low memory. You should check out
Adrian Cockcroft's book, Solaris Performance Tuning (aka the Porsche
book).

Frequent complaints from users about poor performance can also be an
indicator of too much paging. :-)
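The vmstat check above can be automated as a sketch. The sample row and the threshold of 50 are both made up, and the po column position ($9 here) should be confirmed against the header line on your release:

```shell
# Flag high page-out rates in a (sample) vmstat data row.
VMSTAT='0 0 0 401408 12288 3 10 120 95 0 0 0 1 1 0 466 312 289 5 3 92'
printf '%s\n' "$VMSTAT" | awk '$9 > 50 {print "high page-out rate: po=" $9}'
```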

Subject: V210 BGE0@1000FDX


When connecting a server to a Gig interface you need to enable autoneg
on the server. It will pick up the correct speed automatically. It
becomes a problem when trying to force it, especially if you are using
Cisco switches.
Normally I run the following script

ndd -set /dev/bge0 adv_autoneg_cap 1
ndd -set /dev/bge0 adv_1000fdx_cap 1
ndd -set /dev/bge0 adv_1000hdx_cap 0
ndd -set /dev/bge0 adv_100fdx_cap 0
ndd -set /dev/bge0 adv_100hdx_cap 0
ndd -set /dev/bge0 adv_10fdx_cap 0
ndd -set /dev/bge0 adv_10hdx_cap 0

Hope that it resolves your problem.


Friday, October 21, 2005

SUMMARY: Tracking down system calls on Solaris 9

Subject: SUMMARY: Tracking down system calls on Solaris 9

Hi all,

Many thanks to everyone who responded - Aleksander Pavic, francisco, and
Darren Dunham. My original email is attached below, along with the
replies I got - but to summarise : I was seeing a very high sysload on a
Solaris 9 web server, and vmstat confirmed that a large number of system
calls were being generated. I wanted to track these down and find out
what was being called, but couldn't use Dtrace. Yet another argument for
moving to Solaris 10 :)

As Darren said in his response: "The limitations on existing tools like
'truss' are part of what drove
dtrace, so I don't know that there's any magic out there."

He then went on to suggest I analyse all the Apache processes with
truss, send the output to a file and then analyse that. This was also
the path suggested by Aleksander, who quite correctly pointed out that
truss can be made to follow any child processes generated via forking,
so I could therefore truss the main Apache process and follow all its
children. He also suggested I send the output to a file, and
post-process it with awk or perl. Francisco also suggested the useful
lsof tool to see what files are open, as my original hypothesis was that
there were a large number of file handles being opened and closed.

In the end, I trussed every "httpd" process, and generated a summary
using "truss -c". I let this run for 20 seconds, and saw that there were
a very large number of resolvepath() and open() calls being generated;
roughly half of these calls returned an error.

I then narrowed my search down, and examined what was actually being
passed as arguments to these calls. This is easily done with "truss -t
open,resolvepath". It turns out that a huge number of the
resolvepath()'s and open()'s were being generated by PHP scripts running
under Apache. They were using an inefficient include_path, and so when
most files were being included, PHP generated many resolvepath() and
open() calls which returned in error before finally finding the correct
location of the file.
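The post-processing step suggested above (save the truss output, then analyse it with awk) can be sketched like this. The log lines are fabricated to mimic `truss -f -t open,resolvepath` output; on a real run you would feed in the saved file instead:

```shell
# Count failed syscalls per call name in a (sample) truss log.
LOG='open("/www/inc/a.php", O_RDONLY) Err#2 ENOENT
resolvepath("/www/inc/a.php", ...) Err#2 ENOENT
open("/www/lib/a.php", O_RDONLY) = 5'
printf '%s\n' "$LOG" | awk -F'(' '/Err#/ {fail[$1]++}
    END {for (c in fail) print c ": " fail[c] " failed"}'
```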

We fixed the PHP include_path and also modified some of the scripts to
use an absolute path in include() or require() functions, and as
expected, the number of syscalls being generated was halved.

There were a number of other code-related problems on that server as
well, but these were unrelated to my original request for help.

Once again, many thanks to those that responded. Problem resolved !


Configuring Qlogic HBA card

Subject: SUMMARY: Configuring Qlogic HBA card

I found the solution. Thanks to those who replied!


ok> show-devs
<QLGC entry>
ok> select <QLGC entry>
ok> show-children
<Lists info about card such as WWN, LoopId Lun>
ok> show-connection-mode
Current HBA connection mode: 2 - Loop preferred,
otherwise point-to-point

(Possible connection mode choices:
0 - Loop Only
1 - Point-to-point only
2 - Loop preferred, otherwise point-to-point)

ok> set-connection-mode (0, 1, or 2)
ok> show-data-rate
Current HBA data rate: One Gigabit rate

Possible data rate choices:
0 - One Gigabit rate
1 - Two Gigabit rate
2 - Auto-negotiated rate

ok> set-data-rate (0, 1, or 2)
To set the data rate.

More info can be found at


Original Question:

I have a QLogic 2300 card in a SF V440. I am trying to
install SecurePath but the card is not seen by Solaris
8. (It is seen at OBP level)
At my last job I had a similar problem, which I fixed
by setting the speed at the OBP level. (The card is
set to autoneg but I need to force it to 1 gig.)
Unfortunately I could not keep my notes from that job
and cannot for the life of me remember how to set the
speed. Has anyone got notes on how to do it? I have
spent 2 hours scouring Google with no luck.

How do I install Sun Explorer

How do I install Sun Explorer?

After downloading the SunExplorer.tar.Z file from

cp SunExplorer.tar.Z /var/tmp

cd to /var/tmp

uncompress SunExplorer.tar.Z

tar xvf SunExplorer.tar

or, if you have gzip installed,

zcat SunExplorer.tar.Z | tar xvf -

This will extract the contents of the archive into two directories,
SUNWexplo and SUNWexplu, located in the current directory.

As superuser, type the following command:

# pkgadd -d . SUNWexplo SUNWexplu

Ethernet card woes

Subject: SUMMARY: Ethernet card woes

Thanks to everyone who replied!
It turns out that /etc/inet/ipnodes had the old IP address
set - thanks to Dale Hirchert for pointing that out.
Also, if I had run sys-unconfig it would have caught it as well -
thanks Dominic Clark

So for future reference, either change the following files:

or run sys-unconfig.

Thanks again!


read/write performance on the volumes

Hi Greg,

I can't offer any suggestions, but I am interested in knowing how you
are measuring the read/write performance on the volumes. An HDS tool,
or something more common?

Thanks, V

> Hello,

> Just a quick info gathering. I am at a customer site installing a new
> HDS 9990. The high level config overview:

> HDS 9990 (Open V 40GB LUNS)
> HDS 9200 (Legacy Array)
> Sun Fire v880
> Brocade 4100's (2 Fabrics)
> QLogic 2GB Cards (375-3102) to new SAN
> JNI FCE2-6412 cards to old HDS 9200 Array
> MPXIO enabled and configured for Round Robin
> VxVM 4.1
> Oracle 9i

> During this phased implementation, we are in the data migration stage.
> We are mirroring the old storage, which is from a HDS 9200 to the new
> LUNS on the TS (9990).
> Once the mirroring is complete, we will break off the plexes from the
> old array and be fully migrated to the new Hitachi.

> The customer decided not to break the mirrors yet. We have noticed a
> decrease in write and read performance on all the volumes on the host.
> I would expect a slight decrease in write performance, however, we are
> seeing up to a 1/5 millisecond increase in read time as well on each of
> the volumes. My assumption is that because of the double writes to two
> different (types) of LUNS, that is impacting our reads.

> Suggestions?



I am using vxstat -g 'diskgroup' -i 1 (the -i is the interval I am
polling, in this case every one second). This output is giving me a
format like this:


                      OPERATIONS          BLOCKS       AVG TIME(ms)
TYP NAME         READ    WRITE      READ    WRITE    READ   WRITE
vol ora00       39364      388   4931856     6208     1.8     0.1
vol ora00       39585      389   4950704     6224     1.9     0.0
vol ora00       39571      391   4954960     6256     1.8     0.1

As for Solaris LUN metrics, I generally use iostat -xnpz 1, which gives
me the disk & tape I/O data and excludes any zeros. It's a
lot of information, so what I do is grep out what I am looking for, for
example, iostat -xnpz 1 | grep c5t0d0.



cannot create /etc/foo: Operation not applicable

Subject: SUMMARY: cannot create /etc/foo: Operation not applicable

Original question:

> On a Solaris 8 system running fine for two months, I suddenly get this:
> # touch /etc/foo
> touch: cannot create /etc/foo: Operation not applicable
> Truss says:
> creat("/etc/foo", 0666) Err#89 ENOSYS
> I also noted truncated files in /etc.
> There is nothing interesting in the system log. System is a V210 running
> Solaris 8 with recommended patches from feb. 28 2005. Root filesystem is
> mirrored using SVM.

The responses I received include:

- Are you out of disk space
- Are you out of inodes
- Do you have the same problem on other partitions like /var or /opt
- Are you running the automounter
- Are the permissions wrong on /etc
- Is the "touch" command malfunctioning
- Is the root filesystem mounted read-only
- Are you also unable to modify files in /etc
- Does your metastat output show weird things
- Do you already have a file name "foo" in /etc

The answer is "no" to all these points. So I requested downtime with the
customer to bring the system into single-user mode to do a filesystem check.
As expected, many errors showed up, but it was able to repair the root
filesystem and the system is running fine now.

I also logged a case with Sun Support about this issue. They sent me two
documents from SunSolve that describe common reasons for filesystem
corruption. Since the call is closed now I cannot retrieve the document
ID's, sorry for that. The only two reasons that remain after reading these
documents are:

- Applications use the unlink(2) system call without checking if the
directory is empty. This is a classical UNIX problem.
- Bugs in the O.S.

I have no idea how to check if some of the running processes are misusing
unlink(2). Maybe dtrace can do this, but this is a Solaris 8 system. As for
bugs in the OS, I haven't found applicable ones on SunSolve.

Thanks to all who replied.


Question about Sun patch: How to find out what patches I've just installed

Subject: SUMMARY: Question about Sun patch: How to find out what
patches I've just installed

Many thanks to everybody who replied. You know who
you are :-)

Original question:
I need to install a bunch of Sun patches into a
Solaris 8 system. How do I find out the list of Sun
patches I just installed (using patchadd patch#)

1. #ls -ltr /var/sadm/patch|tail -xx
where xx is the number of patches I've just installed
(i.e., if I installed 20 patches, then xx will be 20)

2. #showrev -p|egrep 'patch #1|patch #2|patch #3'
where patch #1, #2, #3 are the 3 patches I've just
installed.
"showrev -p" alone won't cut it (but otherwise is
technically correct) because the output
includes too many previous patches. It will be kind
of hard to verify which one.

logical volume problem

logical volume problem

Hi all,
I am using veritas VM 3.5. When I want to create a raid 5 volume by the
following command,

bash-2.05# vxassist -g diskgroup make vol-1 10m layout=raid5
vxvm:vxassist: ERROR: Too few disks for striping; at least 4 disks are
required
bash-2.05# vxdisk -g diskgroup list
c2t1d0s2 sliced diskgro02 diskgroup online
c2t2d0s2 sliced diskgro03 diskgroup online
c2t3d0s2 sliced diskgro01 diskgroup online

I get the error "Too few disks for striping; at least 4 disks are
required". RAID 5 only needs 3 disks. Why?


Because there are some interesting failure modes where a crash can occur
in the middle of a write, leaving you not knowing if parity is right or
wrong. Combined with a disk error, you can have problems.

To get around that failure mode, the default for a raid5 construction
with vxassist is to create an additional log device which is not on any
disk shared with the raid5 data. It's small, but must be on a separate
disk. There might be a way of using the same disk but replicating it.

If you don't want the log disk, you can specify 'nolog' or 'noraid5log',
then it will only need 3 columns.

vxassist -g diskgroup make vol-1 10m layout=raid5,nolog

log, nolog
Creates (or does not create) dirty region logs
(for mirrored volumes) or log plexes (for RAID-5
volumes) when creating a new volume. This attribute
can be specified independently for mirrored
and RAID-5 volumes with the raid5log and regionlog
layout specifications. The current implementation
does not support the creation of DCM logs in the
layout specification.

raid5log, noraid5log
Creates (default) or does not create log plexes
for RAID-5 volumes.

restored filesystem - comparison to original

restored filesystem - comparison to original

Having devised and operated a backup scheme and schedule since the start of the
month, I'd quite like to perform a restoration in order to test it.

I will restore the file system to a separate disk from the original.

But what's the "best" way to compare the two, so I can be sure not
only that the scheme I have devised is capable of backing up properly,
but also that my proposed restore mechanism restores properly?

The fs in question is only 1GB at present, so any suggested comparison
can afford to be time-consuming...

I'd obviously like to check for missing files/directories and errors with
ownerships, permissions, ACLs, and timestamps...

How do I go about this?


> Rob

Running tripwire on the original and the copy comes to mind;
then compare the output tripwire databases.

> with ownerships, permissions, ACLs, and timestamps...

You could try the filesync tool with the "-n" option, which will make
it just find the differences and not attempt to make changes. If you
back up /foo and restore it into /restore/foo, then the filesync command
would be something like this:

filesync -n -ame -fsrc -s / -d /restore foo

The "-n" means not to make any changes, the "-ame" means to check
ACLs and modification times and flag everything found (even if it
can't be changed), the "-fsrc" means to consider the source directory
to be the authoritative one, "-s" specifies the directory that
CONTAINS the source thingy to be synchronized, "-d" specifies the
directory that contains the destination thing to be synced, and
"foo" is the thing to be synced.

If you wanted to compare all of "/" against something contained
in "/" (such as "/restore"), you could type this in ksh or bash:

cd /
filesync -n -ame -fsrc -s / -d /restore ./

Then when the cursor is at the end of the line, do ESC then "*" in
vi mode or Meta-"*" in emacs mode, and it will expand the list of
files, at which point you can delete "restore" from the list. (If
you don't delete "restore" from the list, it will think everything
in "restore" should be in "restore/restore", which will cause
the output to be filled with extraneous stuff.)

- Logan

> I will restore the file system to a separate disk from the original,

The best way is to use "star" to compare both filesystems
as it is the only known program that is able to compare _all_
file properties and meta-data (except for Extended attribute files).

As I currently know of nobody who uses extended attribute files, I am
sure that this will fit your needs.


star -c -diff -vv -dump -acl -sparse diffopts=!atime,ctime,lmtime -C
fromdir . todir

BTW: This is also the fastest known method and if you like to copy
a filesystem, a similar method will copy the fs very fast.

Also have a look at star when doing incremental dumps.
It might be more interesting for you than ufsdump/ufsrestore.

How to free virtual memory

Re: How to free virtual memory

It looks like some process is leaking memory continuously. The process
needs to be fixed; use an open source tool like valgrind to find out
which process is leaking memory.


We deployed a SeeBeyond project on a Unix machine, and I have a problem
with exceeding the virtual memory: the virtual memory keeps increasing,
and at some point it locks up the Unix server.

Is there any way to free the virtual memory faster, and can we stop the
virtual memory from increasing?

I need help urgently.


Tuning the I/O Subsystem

Tuning the I/O Subsystem

I/O is probably one of the most common problems facing Oracle users.
In many cases, the performance of the system is entirely limited by
disk I/O. In some cases, the system actually becomes idle waiting for
disk requests to complete. We say that these systems are I/O bound or
disk bound.

As you see in Chapter 14, "Advanced Disk I/O Concepts," disks have
certain inherent limitations that cannot be overcome. Therefore, the
way to deal with disk I/O issues is to understand the limitations of
the disks and design your system with these limitations in mind.
Knowing the performance characteristics of your disks can help you in
the design stage.

Optimizing your system for I/O should happen during the design stage.
As you see in Part III, "Configuring the System," different types of
systems have different I/O patterns and require different I/O designs.
Once the system is built, you should first tune for memory and then
tune for disk I/O. The reason you tune in this order is to make sure
that you are not dealing with excessive cache misses, which cause
additional I/Os.

The strategy for tuning disk I/O is to keep all drives within their
physical limits. Doing so reduces queuing time and thus increases
performance. In your system, you may find that some disks process many
more I/Os per second than other disks. These disks are called "hot
spots." Try to reduce hot spots whenever possible. Hot spots occur
whenever there is a lot of contention on a single disk or set of disks.

Understanding Disk Contention

Disk contention occurs whenever the physical limitations of a disk
drive are reached and other processes have to wait. Disk drives are
mechanical and have a physical limitation on both disk seeks per
second and throughput. If you exceed these limitations, you have no
choice but to wait.

You can find out if you are exceeding these limits both through
Oracle's file I/O statistics and through operating system statistics.
This chapter looks at the Oracle statistics; Chapter 12, "Operating
System-Specific Tuning," looks at the operating system statistics for
some popular systems.

Although the Oracle statistics give you an accurate picture of how
many I/Os have taken place for a particular data file, they may not
accurately represent the entire disk because other activity outside of
Oracle may be incurring disk I/Os. Remember that you must correlate
the Oracle data file to the physical disk on which it resides.

debugging RC scripts. solaris9

debugging RC scripts. solaris9

I can't remember how to debug the startup scripts. Can someone help me
out here? I just want Solaris to report which startup script it is
currently executing. I thought it was as simple as adding a "+" to the
/etc/rc* script, but that didn't work.

There's no really simple way to do this. You may be thinking of adding
set -x to /etc/rc?, but that gets overly verbose for me.

Often I've made a small edit to /etc/rc?. There's a startup loop in
there where it runs /bin/sh $f start (or similar) for each of the
scripts. Just add an "echo starting $f" and an "echo done starting $f"
inside the "if" and outside the "case" statements (and make a backup
first!). Then you can tell what it's trying to do and where it hangs.
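A sketch of what that edit looks like; the loop below is a simplified stand-in for the real one in /etc/rc2 (the demo directory stands in for /etc/rc2.d, and the sample script is made up for illustration):

```shell
#!/bin/sh
# Simplified stand-in for the startup loop in /etc/rc2, with the two
# suggested echo lines added. The real loop iterates over /etc/rc2.d/S*;
# here a throwaway directory with one demo script is used instead.
RCDIR=$(mktemp -d)
printf '#!/bin/sh\necho "demo daemon started"\n' > "$RCDIR/S99demo"

for f in "$RCDIR"/S*; do
    if [ -s "$f" ]; then
        case "$f" in
            *.sh) . "$f" ;;                 # *.sh files are sourced
            *)    echo "starting $f"        # added: announce the script
                  /bin/sh "$f" start
                  echo "done starting $f"   # added: confirm it returned
                  ;;
        esac
    fi
done
```

With the two echo lines in place, the console shows exactly which script is running when the boot hangs.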

Once there, you can make that one script more verbose.

If you were running Solaris 10, you could use 'boot -m verbose'.


statvfs / df bug?

statvfs / df bug?

Hi all,
I am trying to get the filesystem information using statvfs/df. I have
an automounted partition mounted on /mntauto. I am running SunOS 5.8

If I say 'df -k /mntauto', I get the following output:
bash-2.03# df -k /mntauto
Filesystem kbytes used avail capacity Mounted on
cpsupsun1:/mntauto 482455 10 434200 1% /mntauto

I have written two different programs using 'statvfs' to print the
filesystem information. The following is the output from these two
different programs:

Program 1:
#include <stdio.h>
#include <sys/statvfs.h>

int main(void) {
    struct statvfs info;

    if (-1 == statvfs("/", &info))
        perror("statvfs() error");
    else {
        puts("statvfs() returned the following information");
        puts("about the ('/mntauto') file system:");
        printf(" f_bsize : %u\n", info.f_bsize);
        printf(" f_blocks : %u\n", info.f_blocks);
        printf(" f_bfree : %u\n", info.f_bfree);
        printf(" f_bavail : %u\n", info.f_bavail);
        printf(" f_files : %u\n", info.f_files);
        printf(" f_ffree : %u\n", info.f_ffree);
        printf(" f_fsid : %u\n", info.f_fsid);
        printf(" f_flag : %X\n", info.f_flag);
        printf(" f_namemax : %u\n", info.f_namemax);
        printf(" f_basetype : %s\n", info.f_basetype);
        printf(" f_fstr : %s\n", info.f_fstr);
    }
    return 0;
}



statvfs() returned the following information
about the ('/mntauto') file system:
f_bsize : 8192
f_blocks : 4129290
f_bfree : 3026760
f_bavail : 2985468
f_files : 512512
f_ffree : 507103
f_fsid : 8388608
f_flag : 4
f_namemax : 255
f_basetype : ufs
f_fstr :

Program 2:

#include <stdio.h>
#include <sys/stat.h>
#include <sys/statvfs.h>

int main(void) {
    struct statvfs info;
    struct stat sb;

    if (stat("/mntauto/log", &sb) < 0)
        printf("stat failed\n");

    if (!S_ISDIR(sb.st_mode))
        printf("Not a dir\n");

    if (-1 == statvfs("/mntauto", &info))
        perror("statvfs() error");
    else {
        puts("statvfs() returned the following information");
        puts("about the ('/mntauto') file system:");
        printf(" f_bsize : %u\n", info.f_bsize);
        printf(" f_blocks : %u\n", info.f_blocks);
        printf(" f_bfree : %u\n", info.f_bfree);
        printf(" f_bavail : %u\n", info.f_bavail);
        printf(" f_files : %u\n", info.f_files);
        printf(" f_ffree : %u\n", info.f_ffree);
        printf(" f_fsid : %u\n", info.f_fsid);
        printf(" f_flag : %X\n", info.f_flag);
        printf(" f_namemax : %u\n", info.f_namemax);
        printf(" f_basetype : %s\n", info.f_basetype);
        printf(" f_fstr : %s\n", info.f_fstr);
    }
    return 0;
}


statvfs() returned the following information
about the ('/mntauto') file system:
f_bsize : 8192
f_blocks : 964910
f_bfree : 964890
f_bavail : 868400
f_files : 247296
f_ffree : 247290
f_fsid : 80740364
f_flag : 0
f_namemax : 4294967295
f_basetype : nfs
f_fstr :

Could anyone please explain the differences between the above
outputs? I assume all of the above should print the same answer... Is
this a known bug in statvfs?

Thanks in advance,


My apologies.
I got the error!
It was a typo...

I would say everything is working okay.
Program 1 does a statvfs of "/" and program 2 calls statvfs for "/mntauto".

use telnet command in a shell script

Re: use telnet command in a shell script

I would use ssh instead of telnet. Set up a connection with keys so that
the connection does not require a password (man ssh to find out how),
then call it as follows:

ssh -l username hostname "command" | tee -a output.txt


> Our system is backed up to tape each night. Unfortunately our Sys Eng
> is on holidays and I do not know how to recover from tape.

> Could you walk me through it please.

There are lots of ways to do it. Without knowing which your system
admin chose, it's really hard to give you any useful information.

The most likely thing is that you need to use something like
"mt -f /dev/rmt/0n asf 2" to move to file #2 on the tape (the
number 2 is just a random number picked; you'll have to determine
where the backup of the filesystem you need is located on the
tape and use that number instead). Then you'd change to some
directory (like /tmp or some place with lots of space) and do
a "ufsrestore ivf /dev/rmt/0n". Then use "cd", "ls", and "pwd"
to navigate, "add" and "delete" to select which files to extract,
and "extract" (whose prompt you should answer with "1") to extract
them from the tape. Oh, and then "mt -f /dev/rmt/0n offline" to
rewind and eject the tape.

Of course, this assumes that the administrator chose to use ufsdump
to back up the files, which is definitely not a given. Also, it is
quite possible that the administrator chose to do incremental
backups, so if that is the case, you may need to restore from a full
backup tape *and* an incremental backup tapem, which makes things
even more complicated. It's really hard to know what the right
thing to do is without knowing what backup scheme the administrator
chose for that system.

if I do a metadb -a /dev/dsk/c0t0d0s4

i.e. if I do a metadb -a /dev/dsk/c0t0d0s4 will it just add another
> database replica into the slice?

No. You'll have to delete all the replicas in one slice, then create
all the replicas at one time.

metadb -d /dev/dsk/c0t0d0s4
metadb -a -c 2 /dev/dsk/c0t0d0s4

Tape Control -the mt Command:

Tape Control -the mt Command:

These examples assume that the device is at the 0 address.

Show whether the device is valid, whether a tape is loaded, and the
status of the tape:

mt -f /dev/rmt/0 status

Rewind the tape to the start:

mt -f /dev/rmt/0 rewind

Show the table of contents of an archive (if tar tvf produces an
error, there are no more records on the tape):

tar tvf /dev/rmt/0

Advance to the next archive on the tape:

mt -f /dev/rmt/0 fsf

Move the tape to the end of the last archive it can detect:

mt -f /dev/rmt/0 eom

Erase the tape. Use with care:

mt -f /dev/rmt/0 erase

Eject the tape, if the device supports that option:

mt -f /dev/rmt/0 offline

To extract lengthy archives even if you plan to log out, use the nohup
command as follows:

nohup tar xvf /dev/rmt/0 &

Identify the tape device

dmesg | grep st

Check the status of the tape drive

mt -f /dev/rmt/0 status

Tarring files to a tape

tar cvf /dev/rmt/0 *
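The same tar syntax can be rehearsed without a tape drive by substituting an ordinary file for the device node (all paths below are illustrative):

```shell
#!/bin/sh
# Rehearse the tape commands using a plain file standing in for /dev/rmt/0.
mkdir -p /tmp/tardemo/src
echo hello > /tmp/tardemo/src/a.txt
cd /tmp/tardemo/src
tar cvf /tmp/tardemo/backup.tar *   # same syntax as: tar cvf /dev/rmt/0 *
tar tvf /tmp/tardemo/backup.tar     # table of contents, as on a tape
```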

Cpioing files to a tape

find . -print | cpio -ovcB > /dev/rmt/0

Viewing cpio files on a tape

cpio -ivtB < /dev/rmt/0

Restoring a cpio

cpio -ivcB < /dev/rmt/0

To compress a file

compress -v some.file

To uncompress a file

uncompress some.file.Z

To encode a file

uuencode some.file.Z some.file.Z > some.file.Z.uu

To decode a file

uudecode some.file.Z.uu

To dump a disk slice using ufsdump

ufsdump 0cvf /dev/rmt/0 /dev/rdsk/c0t0d0s0
ufsdump 0cvf /dev/rmt/0 /export/home

To restore a dump with ufsrestore

ufsrestore rvf /dev/rmt/0

To duplicate a disk slice directly

ufsdump 0f - /dev/rdsk/c0t0d0s7 |(cd /home;ufsrestore xf -)

Mirror Removal

How To: Mirror Removal

To remove a mirror from a volume (i.e., to remove one of the plexes
that belongs to the volume), run the following command:

vxplex -o rm dis plex_name

Any associated subdisks will then become available for other uses. To
remove the disk from Volume Manager control entirely, run the
following command:

vxdisk rm disk_name

For example, "vxdisk rm c1t1d0s2".

How To: Mirror Backup

The following technique can be used to back up a mirrored volume by
temporarily taking one of its mirrors offline and then reattaching the
mirror to the volume once the backup has been run.

1. Disassociate one of the mirrors from the volume to be backed up:

vxplex dis plex_name

2. Create a new, temporary volume using the disassociated plex:

vxmake -g diskgroup -U gen vol tempvol plex=plex_name

3. Start the new volume:

vxvol start tempvol

4. Clean the new volume before mounting:

fsck -y /dev/vx/rdsk/diskgroup/tempvol

5. Mount the new volume and perform the backup.

6. Unmount the new volume.

7. Stop the new volume:

vxvol stop tempvol

8. Disassociate the plex from the new volume:

vxplex dis plex_name

9. Reattach the plex to the original volume:

vxplex att volume_name plex_name

10. Delete the temporary volume:

vxedit rm tempvol

To display the current Veritas configuration, use the following command:

vxprint -ht
To monitor the progress of tasks, use the following command:

vxtask -l list

To display information related to plexes, run the following command:

vxprint -lp

Fixing Corrupted Files and wtmpx Errors

Fixing Corrupted Files and wtmpx Errors

Unfortunately, system accounting is not foolproof. Occasionally, a
file becomes corrupted or lost. Some of the files can simply be
ignored or restored from backup. However, certain files must be fixed
to maintain the integrity of system accounting.

The wtmpx files seem to cause the most problems in the daily operation
of the system accounting. When the date is changed manually and the
system is in multiuser mode, a set of date change records is written
into the /var/adm/wtmpx file. The wtmpfix utility is designed to
adjust the time stamps in the wtmp records when a date change is
encountered. However, some combinations of date changes and reboots
slip through the wtmpfix utility and cause the acctcon program to fail.

How to Fix a Corrupted wtmpx File


Become superuser.

Change to the /var/adm directory.

Convert the wtmpx file from binary to ASCII format.

# /usr/lib/acct/fwtmp < wtmpx > wtmpx.ascii


Edit wtmpx.ascii to delete the corrupted records.

Convert the wtmpx.ascii file back to a binary file.

# /usr/lib/acct/fwtmp -ic < wtmpx.ascii > wtmpx

See fwtmp(1M) for more information.

Is there a way to determine the PID associated with a socket ?



> Is there a way to determine the PID associated with a socket ?

An alternative, using native commands without lsof, is pfiles.

cd /proc
pfiles * > /tmp/pfiles.out

Search through pfiles.out for the process that has open the socket you
are interested in. For example, there will be entries such as:

3771: /export/home/archiver/bin/myprocess
Current rlimit: 256 file descriptors
0: S_IFCHR mode:0666 dev:85,0 ino:191320 uid:0 gid:3 rdev:13,2
1: S_IFCHR mode:0666 dev:85,0 ino:191397 uid:0 gid:0 rdev:24,2
2: S_IFREG mode:0644 dev:85,5 ino:17 uid:104 gid:1 size:139436
3: S_IFDOOR mode:0444 dev:293,0 ino:58 uid:0 gid:0 size:0
4: S_IFSOCK mode:0666 dev:287,0 ino:19574 uid:0 gid:0 size:0
sockname: AF_INET port: 9001
peername: AF_INET port: 9001

pid 3771 has port 9001 open locally.
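The scan over pfiles.out can be scripted; here is a sketch that picks the process header line for a given port (the sample file below mimics the output format shown above, and the port and paths are just the ones from the example):

```shell
#!/bin/sh
# Sketch: scan a saved pfiles dump for a port and print the owning PID line.
# A tiny sample standing in for /tmp/pfiles.out, in the format shown above:
cat > /tmp/pfiles.sample <<'EOF'
3771: /export/home/archiver/bin/myprocess
0: S_IFCHR mode:0666 dev:85,0 ino:191320 uid:0 gid:3 rdev:13,2
4: S_IFSOCK mode:0666 dev:287,0 ino:19574 uid:0 gid:0 size:0
sockname: AF_INET port: 9001
EOF
# Remember each "PID: /path/to/command" header; print the most recent one
# when a matching "port: N" line appears.
awk -v port=9001 '
    /^[0-9]+: \// { hdr = $0 }
    $0 ~ ("port: " port "$") { print hdr; exit }
' /tmp/pfiles.sample
# prints: 3771: /export/home/archiver/bin/myprocess
```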

snapshot error: File system could not be write locked

Thomas wrote:
> Is there anyway to fssnap the root file system? I would like to use a
> snapshot for backup, but when I try to do that I get the error:
> snapshot error: File system could not be write locked

Are you running ntp/xntp? By default that program runs in the realtime
processing class, and has a current directory in the root filesystem.
You can't write lock the filesystem while that's true.

>> Yes, we are running ntp. I will try killing that and see what happens.
>> I am thinking that it might be better to repartition the disk rather than
>> to take ntp down and up for every backup.
> Why? Simply create a script that stops xntpd, creates a snapshot, starts
> xntpd and perform backup. No need to repartition.

But that's not kind to ntp, which wants to keep running to remain
synchronized.

The only problem here is that its working directory is in root. I see
two possible workarounds.

#1 Have it run in a non-RT class. You can use priocntl for that. I
don't think it'll have a dramatic effect on the timekeeping, but you
might not want to do this if you need very accurate time on this
system.

$ pgrep ntp
302
$ /usr/bin/ps -o class,pid -p 302
RT 302
$ priocntl -s -c TS 302
$ /usr/bin/ps -o class,pid -p 302
TS 302

Thus taking it from the realtime class to the timesharing class. (I
suppose I should have tried an fssnap at that point, but didn't...)

#2 Run it with the working directory not in root. I don't see any
reason it couldn't run in /tmp or /var/run, unless you wanted to
retain any core files that might be generated. I'm not certain how
best to achieve that, but I saw a post that suggested someone had
good luck using chroot.

Friday, October 14, 2005

Problems with port forwarding using SSH

The port forwarding feature of SSH is failing.
Resolution:

Confirm these configuration settings in /etc/ssh/sshd_config file:

AllowTcpForwarding yes
GatewayPorts yes

Then execute:

# ssh -g -L 8080:webserver:80 webserver

In this example, systems connecting to http://webserver:8080 will be
forwarded to the web server daemon httpd listening on TCP port 80 on
the host webserver.

If you allow root access with this setting in /etc/ssh/sshd_config:

PermitRootLogin yes

you can use privileged or reserved ports (range 1-1023) with the above
command.

SSH Frequently Asked Questions

1.1. General troubleshooting hints


In order for us to help you, the problem has to be repeatable.

When reporting problems, always send the output of

$ ssh -v -l <user> <destination>


If you have a root account on the destination host, please run

# /usr/sue/etc/sshd -d -p 222

or (on Linux)

# /usr/sbin/sshd -d -p 222

as root there and connect using

$ ssh -v -p 222 -l <user> <destination>

(the sshd server will exit after each connection, and you may
run into trouble with a local firewall that prevents you from
connecting from a different machine. Same-machine connections to
"localhost" should work)

If you do not have root access on the server, you can generate
your own "server" key pair and run on an unprivileged port:

$ ssh-keygen -P "" -f /tmp/sshtest
$ pagsh -c "/usr/sue/etc/sshd -d -p 2222 -h /tmp/sshtest"

or (on Linux)

$ pagsh -c "/usr/sbin/sshd -d -p 2222 -h /tmp/sshtest"

Then connect using

$ ssh -v -p 2222 -l <user> <destination>

1.5. log in using RSA keys

Do you really want to do this? Using RSA for login means you will not
get an AFS token, so you cannot access most of your home directory on
the public servers. There is no way to "translate" between RSA key and
AFS tokens.

If you want to give it a try, check the following common errors:


the UNIX permissions must be correct: 0600 for
~/.ssh/authorized_keys, 0755 for ~/.ssh (and AFS read access for
everybody!), home directory not writable by anybody but you.


Please make sure that your private key is somewhere safe (e.g.
in ~/private, with a symlink to ~/.ssh), and encrypted using a good
pass phrase.

in ~/.ssh/authorized_keys, there has to be one key per line (no
linebreaks allowed)

The debugging tips at the beginning of this chapter (running the
server in debug mode) should point out the reason for failure pretty
quickly.
1.6. New warning messages

OpenSSH stores both the host name and the IP number together with the
host key. This leads to some new messages:

Warning: Permanently added 'lxplus001,' (RSA) to
the list of known hosts.
Warning: Permanently added the RSA host key for IP address
'' to the list of known hosts.

If these annoy you, use "CheckHostIP no" in your $HOME/.ssh/config
file. However, please be aware that you are turning off an intentional
security feature of ssh.

A warning may appear while connecting to the PLUS servers under their
common DNS name (e.g. RSPLUS, HPPLUS); this is due to the fact that,
for load-balancing purposes, these servers' DNS entry is constantly
changing. This is detected and reported by ssh (as it should be).

The RSA host key for rsplus has changed,
and the key for the according IP address
is unknown. This could either mean that
DNS SPOOFING is happening or the IP address for the host
and its host key have changed at the same time
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the RSA host key has just been changed.
Please contact your system administrator.
Add correct host key in /afs/
to get rid of this message.
Offending key in /afs/
Password authentication is disabled to avoid Trojan horses.
Agent forwarding is disabled to avoid Trojan horses.

To avoid these, use qualified hostnames like rsplus01, hpplus01 etc..
(LXPLUS and SUNDEV are not prone to this problem, since a common host
key is used on all the servers in the cluster)

An alternative is to (manually) insert into $HOME/.ssh/known_hosts the
PLUS name after each qualified machine name that belongs to this PLUS
cluster:
rsplus01,rsplus 1024 37 15457042575...
rsplus02,rsplus 1024 37 10734479336...

To remove the above error message, simply edit the file
~/.ssh/known_hosts (or ~/.ssh/known_hosts2 for the SSH-2 protocol) and
remove the line (which should start with the hostname and/or IP
address). Be careful not to break the long lines, it has to have one
line per host/key. Next time you connect, ssh should ask you whether
you actually want to connect, etc..
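Removing such an entry can also be scripted; a sketch, using a made-up stand-in file for ~/.ssh/known_hosts and a hypothetical hostname (each entry must remain exactly one line):

```shell
#!/bin/sh
# Sketch: drop one host's entry from a known_hosts-style file.
KH=/tmp/known_hosts.demo
printf '%s\n' \
  'rsplus01,rsplus 1024 37 15457042575' \
  'rsplus02,rsplus 1024 37 10734479336' > "$KH"
# Keep every line whose comma-separated host field does not name rsplus01.
awk -v h=rsplus01 '$1 !~ ("(^|,)" h "(,|$)")' "$KH" > "$KH.new" \
  && mv "$KH.new" "$KH"
cat "$KH"
```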
1.7. Statistics options for scp

OpenSSH scp does not support a few of the command line options from
ssh-1.2.26. Besides, the statistics output is different. The
environment variables controlling statistics output (SSH_SCP_STATS and
friends) are not supported, either. The changed options are:

ssh-1.2.26 option   meaning                                     OpenSSH option
-a                  turn on statistics display for each file    (on by default)
-A                  turn off statistics display for each file;
                    appears to be a no-op for ssh-1.2.26        n.a.; use -q to turn off all statistics
-L                  use a non-privileged port                   -o UsePrivilegedPort=no
-Q                  turn on statistics display                  (on by default)

Sample statistics output from OpenSSH scp (no explicit options)

junk 100% |*****************************| 22867 00:00
zeroes 100% |*****************************| 512 KB 00:00

and output from ssh-1.2.26 scp:

junk | 22 KB | 22.3 kB/s | ETA:
00:00:00 | 100%
zeroes | 512 KB | 512.0 kB/s | ETA:
00:00:00 | 100%

If you actually parse this output in scripts, you would have to change them.
1.8. Errors on exit regarding X11 applications

Since the ssh client does forwarding for the X11 traffic from the
remote host, it won't exit until the last X11 application has been
closed. It appears that this mechanism sometimes fails, and the ssh
program will report errors like below even if all remote X11
applications are done:

Waiting for forwarded connections to terminate...
The following connections are open:
X11 connection from port 2352

The session will appear to hang. It can be closed by typing "~."
(without the quotes), and this should return you to your previous
shell. You could use "~&" as well to leave the current connection as a
background process.

If you are sure that there are no X11 windows or icons from the remote
server around, and if you can reproduce the problem, please contact
us.
A current suspicion is that the regular network scanning mechanism
plays a role in this: by opening a connection to the remote X11 port,
but failing to connect through the forwarded channel, this could mess
up the internal bookkeeping done by ssh. To be confirmed.

Problem replacing disk in StorEdge T3

Problem replacing disk in StorEdge T3

At work we have a T3 where all disks are configured for RAID5. One of
the disks has failed, which means that accessing the data on the T3 is
really slow.

When I entered the replacement disk, it seemed to be taken into use
automatically (proc list showed some progress), but then it failed with
a 0D status (see vol stat, fru stat etc. below).

I noticed that the disk is not exactly the same as the others; could this
be the reason? It is a proper replacement disk bought from Sun with the
proper bracket and everything, so it should work, or what? What can I do
to fix this?

- Erlend Leganger

T300 Release 1.17b 2001/05/31 17:47:22
Copyright (C) 1997-2001 Sun Microsystems, Inc.
All Rights Reserved.

bigdaddy:/:<1>vol stat

v0 u1d1 u1d2 u1d3 u1d4 u1d5 u1d6 u1d7 u1d8
mounted 0D 0 0 0 0 0 0 0 0

bigdaddy:/:<2>fru list
------ ----------------- ----------- ----------- -------- --------
u1ctr controller card SLR-MI 375-0084-02- 0210 022813
u1d1 disk drive SEAGATE ST336605FSUN A338 3FP0H63D
u1d2 disk drive SEAGATE ST336704FSUN A42D 3CD0VFBL
u1d3 disk drive SEAGATE ST336704FSUN A42D 3CD0T89W
u1d4 disk drive SEAGATE ST336704FSUN A42D 3CD0VCZ4
u1d5 disk drive SEAGATE ST336704FSUN A42D 3CD0VF5L
u1d6 disk drive SEAGATE ST336704FSUN A42D 3CD0TG33
u1d7 disk drive SEAGATE ST336704FSUN A42D 3CD0TT8G
u1d8 disk drive SEAGATE ST336704FSUN A42D 3CD0VD4T
u1d9 disk drive SEAGATE ST336704FSUN A42D 3CD0TXQF
u1l1 loop card SLR-MI 375-0085-01- 5.02 Flash 033179
u1l2 loop card SLR-MI 375-0085-01- 5.02 Flash 030038
u1pcu1 power/cooling unit TECTROL-CAN 300-1454-01( 0000 028800
u1pcu2 power/cooling unit TECTROL-CAN 300-1454-01( 0000 028799
u1mpn mid plane SLR-MI 370-3990-01- 0000 021282
bigdaddy:/:<3>fru stat
------ ------- ---------- ---------- ------- ----
u1ctr ready enabled master - 30.5

------ ------- ---------- ---------- --------- --------- ---- ------
u1d1 ready disabled data disk ready ready 30 v0
u1d2 ready enabled data disk ready ready 33 v0
u1d3 ready enabled data disk ready ready 34 v0
u1d4 ready enabled data disk ready ready 32 v0
u1d5 ready enabled data disk ready ready 33 v0
u1d6 ready enabled data disk ready ready 33 v0
u1d7 ready enabled data disk ready ready 36 v0
u1d8 ready enabled data disk ready ready 32 v0
u1d9 ready enabled data disk ready ready 32 v0

------ ------- ---------- ------- --------- --------- ----
u1l1 ready enabled master - - 27.0
u1l2 ready enabled slave - - 27.5

------ ------- --------- ------ ------ ------- ------ ------ ------
u1pcu1 ready enabled line normal fault normal normal
u1pcu2 ready enabled line normal fault normal normal
Connection closed by foreign host.


> to fix this?

I'd say, complain @ Sun.
Searching google, I found the documentation from Seagate. Among other
things, it lists this:
ST336605: 29,549 cyl / 4 heads / 71,687,371 data blocks
ST336704: 14,100 cyl / 12 heads / 71,687,369 data blocks
I don't know whether these differences are a problem in this case. Sun
should be able to tell...
Maybe the issue can be fixed with a firmware update on the new drive (or
on all the old ones)?

You need to take a look at the syslog file right after the rebuild
fails. There should be more information in there. I have had this
happen before where the rebuild fails because of a read error on
another disk...


1) Your boot firmware is very old.
2) Your disk firmware is way out of date.
3) Both the batteries in your PCUs are expired.

The latest boot firmware is 1.18.04 and you're at 1.17b.
That's at least 3 years out-of-date!

The latest disk firmware for the ST336605FSUN is A838
The latest disk firmware for the ST336704FSUN is AE26

If you're lucky you'll be able to recover. The 'proc list' command will show
whether the new disk is being reconstructed. Otherwise, hopefully you have a
way to back up the data. If so, you can get the batteries replaced, upgrade
all the firmware, reinitialize the volume, and restore the data.


> another disk...

Thanks for the tip. I have now learnt that the disk should be OK, so I
will try this again tomorrow and watch the syslog as you suggest. I will
be back with the result.

- Erlend Leganger


> all the firmware and reinitialize the volume and restore the data.

I guess this is what happens when you have a device that works OK, you
just forget about it... The batteries have been replaced though, we had
ordered them in.

I was able to copy the data from the T3 to other disk areas on the
server, so I'm OK with the files (I also have a backup on tape made
before it failed). I haven't RTFM yet, but are there any tips I should
be aware of when upgrading boot and disk firmware? What to do first?
Where do I get hold of the firmware updates?

> ordered them in.

You have to do more than just replace the batteries or the T3 won't know
anything has changed. Commands need to be run to reset the dates back to
zero so the errors will go away.

This InfoDoc should explain the procedures:

Also the batteries should now last 3 years instead of 2 years per Sun.

In the same patch you would use to upgrade the boot and disk firmware,
there is a T3extender program that will run commands to set the battery
expiration life to 36 months instead of 24 months.

> another disk...

You were 100% correct. The warning light was lit on disk u1d1, so this
disk was replaced and a rebuild attempted. The rebuild failed after a
while, with a note of multiple disk errors in the syslog; it seems
u1d4 has a problem as well. I was fooled by vol stat only showing an
error on u1d1. I will check the syslog more carefully in the future.

- Erlend Leganger


Excellent, thank you. I need to wait for my second replacement disk, but
after reading up on the patch installation method, it doesn't seem too
difficult to do.

> there is a T3extender program that will run commands to set the battery
> expiration life to 36 months instead of 24 months.

I had a look at the T3extender program code and decided that using
this patch is extreme overkill (creating a long perl script and even
including perl itself in the patch) to do a small job: I just ran two
".id write blife <pcu> 36" commands, which seems to do the trick (see
below). Of course, if you have a room full of racks fully populated with
T3s, the script would be handy...

- Erlend Leganger

bigdaddy:/:<48>id read u1pcu1
Revision : 0000
Manufacture Week : 00442000
Battery Install Week : 00412005
Battery Life Used : 0 days, 2 hours
Battery Life Span : 730 days, 12 hours
Serial Number : 028800
Battery Warranty Date: 20051010082149
Battery Internal Flag: 0x00000000
Model ID : 300-1454-01(50)
bigdaddy:/:<49>id read u1pcu2
Revision : 0000
Manufacture Week : 00442000
Battery Install Week : 00412005
Battery Life Used : 0 days, 2 hours
Battery Life Span : 730 days, 12 hours
Serial Number : 028799
Battery Warranty Date: 20051010082152
Battery Internal Flag: 0x00000000
Model ID : 300-1454-01(50)
bigdaddy:/:<50>.id write blife u1pcu1 36
bigdaddy:/:<51>.id write blife u1pcu2 36
bigdaddy:/:<52>id read u1pcu1
Revision : 0000
Manufacture Week : 00442000
Battery Install Week : 00412005
Battery Life Used : 0 days, 2 hours
Battery Life Span : 1095 days, 18 hours
Serial Number : 028800
Battery Warranty Date: 20051010082149
Battery Internal Flag: 0x00000000
Model ID : 300-1454-01(50)
bigdaddy:/:<53>id read u1pcu2
Revision : 0000
Manufacture Week : 00442000
Battery Install Week : 00412005
Battery Life Used : 0 days, 2 hours
Battery Life Span : 1095 days, 18 hours
Serial Number : 028799
Battery Warranty Date: 20051010082152
Battery Internal Flag: 0x00000000
Model ID : 300-1454-01(50)

Secure remote tasks with ssh and keys

Secure remote tasks with ssh and keys

If you want to set up another administrator on your server or execute
remote tasks securely, learn to use ssh with keys. Vincent Danen tells
you how in this Linux tip.

Often, if you're administering a server, you'll find you need to
execute some small task on the server, or you want to delegate a task
to another administrator, but you don't want to give them full access.
Perhaps you want to execute a remote backup or status test. This can
all be accomplished using ssh with keys so that it can be unattended,
but still secure.

The first step is to create the ssh key using the ssh-keygen utility.
This is extremely straightforward. If you plan to have the task run
unattended, be sure not to give the key a passphrase. To increase
security, make a special account to execute the task; make sure it
can't log in interactively, and make sure that the ssh public key is
used only on a particular server or set of servers.

On the remote server, copy the user's ssh public key into
~/.ssh/authorized_keys. You will need to make some modifications to
the line in authorized_keys. To begin, you should set a "command"
keyword to ensure that only one particular command can be executed by
that key. The syntax looks like:


command="" KEY


where command could be something as simple as "/usr/bin/rsync" or
"/usr/local/bin/". To enhance and secure this further, add the
following options to authorized_keys:


no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty KEY


This ensures that anyone connecting cannot do any port forwarding, X11
forwarding, or agent forwarding, and that ssh doesn't allocate a
pseudo-TTY, which prevents the issuing of commands through an
interactive session.

If the client system is adequately secured to protect the
password-less key, and the availability of commands is restricted on
the server, using SSH to execute remote commands is a breeze.
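Putting the pieces together, a complete authorized_keys line might look like the following (the rsync command, path, account name, and truncated key material are all hypothetical):

```
command="/usr/bin/rsync --server --sender /data/",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAAB3NzaC1yc2E...truncated... backup@client
```

When a client authenticates with this key, sshd runs only the quoted command, regardless of what the client asked to execute.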

plumb and unplumb

> Basically it seems it's unplumbed but still existing in the running system,
> so commands like
> arp -a
> netstat -i
> should show it.

No. Unplumbed devices do *NOT* appear in either of those two lists.
Unplumbed devices are simply unknown to IP and ARP.

> If you don't want to reboot, try plumbing and unplumbing it; it might do
> the trick.

Very likely not.

Driver loading and operation is only indirectly related to plumbing.
"Plumb" means that IP opens the driver (triggering it to load into
memory if necessary) and begins using it.

"Unplumb" means only that IP closes the driver stream. If the driver
itself is still in memory (and a driver that manages multiple
instances and has one instance still plumbed, as in the original
poster's stated configuration, is certainly in that state), then,
depending on how the driver itself is designed, it may still be
fielding interrupts from the underlying hardware.

Plumbing and unplumbing IP will do nothing in that case.

Subject: directio


I have a running process which does io:

last pid: 22838; load averages: 0.94, 0.91, 0.82
103 processes: 100 sleeping, 1 zombie, 2 on cpu
CPU states: % idle, % user, % kernel, % iowait, %
Memory: 4096M real, 256M free, 7188M swap in use, 3392M swap free

7335 root 2 20 0 1223M 1164M cpu/2 24.7H 49.53%

truss -p 7335

/1: read(15, "020301\0\0\0\0\0\0\0\0\0".., 8192) = 8192
/1: lseek(15, 0x8F48CA00, SEEK_SET) = 0x8F48CA00
/1: read(15, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192) = 8192
/1: lseek(15, 0x8F48CB20, SEEK_SET) = 0x8F48CB20
/1: read(15, "020301\0\0\0\0\0\0\0\0\0".., 8192) = 8192
/1: lseek(15, 0x8F48CA00, SEEK_SET) = 0x8F48CA00
/1: read(15, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192) = 8192
/1: lseek(15, 0x8F48CB20, SEEK_SET) = 0x8F48CB20
/1: read(15, "020301\0\0\0\0\0\0\0\0\0".., 8192) = 8192
/1: lseek(15, 0x8F48CA00, SEEK_SET) = 0x8F48CA00
/1: read(15, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192) = 8192

When I monitor the system, both for directio and normal (cached) I/O,
I see the following patterns. I would appreciate it if someone could
comment in order to explain this:

The program reads data from the /data filesystem. This is UFS on an
EMC disk array (I have one fibre channel HBA).

mount -o remount,noforcedirectio /data

# sar 30 10000

SunOS verdenfs1 5.9 Generic_117171-12 sun4u 10/05/2005

17:53:42 %usr %sys %wio %idle
17:54:12 14 37 0 49
17:54:42 13 38 0 48
17:55:12 14 39 0 47
17:55:42 13 38 0 49

extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1.4 0.1 1206.1 0.4 0.0 0.1 0.1 44.8 0 7 c2t16d65
0.1 1.2 1.1 9.5 0.0 0.0 0.0 9.2 0 1 c1t1d0
0.0 1.0 0.0 7.8 0.0 0.0 15.2 35.4 0 0 c1t0d0
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1.2 0.1 1183.2 0.4 0.0 0.1 0.1 48.7 0 6 c2t16d65
0.3 1.2 2.7 9.5 0.0 0.0 0.0 7.7 0 1 c1t1d0
0.0 2.4 0.0 19.3 0.1 0.1 29.6 26.9 0 1 c1t0d0
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
7.2 0.1 1227.7 0.4 0.0 0.1 0.0 11.1 0 8 c2t16d65
0.1 1.2 1.1 9.5 0.0 0.0 0.0 7.4 0 0 c1t1d0
0.0 0.4 0.0 2.3 0.0 0.0 0.0 11.1 0 0 c1t0d0
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1.3 0.1 1264.0 0.4 0.0 0.1 0.1 44.9 0 6 c2t16d65
0.3 1.1 2.7 9.5 0.0 0.0 0.0 8.0 0 1 c1t1d0
0.0 0.2 0.0 1.6 0.0 0.0 0.0 11.4 0 0 c1t0d0

verdenfs1@root/tmp #vmstat -p 30
memory page executable anonymous
swap free re mf fr de sr epi epo epf api apo apf fpi
fpo fpf
7402800 1789416 73 89 10 0 1 1 0 0 0 1 1 2626
11 9
3474096 262832 158 0 0 0 0 0 0 0 0 0 0 1207
0 0
3474120 262520 187 249 0 0 0 0 0 0 0 0 0 1216
0 0
3474072 262304 166 33 1 0 0 0 0 0 0 0 0 1188
1 1
3474208 262144 159 0 0 0 0 0 0 0 0 0 0 1269
0 0

mount -o remount,forcedirectio /data

# sar 30 10000

SunOS verdenfs1 5.9 Generic_117171-12 sun4u 10/05/2005

%usr %sys %wio %idle
17:57:12 8 24 23 46
17:57:42 2 10 39 49

extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
257.8 0.0 62434.5 0.0 0.0 0.8 0.0 3.1 1 77 c2t16d65
0.1 1.2 1.1 9.4 0.0 0.0 0.0 8.3 0 1 c1t1d0
0.4 0.4 3.2 3.2 0.0 0.0 0.0 8.8 0 0 c1t0d0
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
244.4 0.0 64447.9 0.0 0.0 0.8 0.0 3.3 1 78 c2t16d65
0.3 1.2 2.7 9.2 0.0 0.0 0.0 7.7 0 1 c1t1d0
0.7 2.7 4.8 15.6 0.0 0.1 7.1 15.8 0 2 c1t0d0

verdenfs1@root/tmp #vmstat -p 30
memory page executable anonymous
swap free re mf fr de sr epi epo epf api apo apf fpi
fpo fpf
3474064 263456 31 15 0 0 0 0 0 0 2 0 0 61924
0 0
3474192 263440 25 8 0 0 0 0 0 0 1 0 0 63965
0 0
3474064 263088 43 88 0 0 0 5 0 0 0 0 0 63966
0 0

1. When I use noforcedirectio, sar reports no wio but 38% sys; on the
other hand, iostat shows about 1MB read per second with a 44.9 msec
service time. This I/O utilized the disk only 6 to 8 percent. And
vmstat -p shows that the system does paging for file I/O of only about
1MB.

2. But when I force directio, I have 39% wio, the I/O rate grows
significantly (65MB per second), the disks are utilized at 77 percent
and the service time is 3.1 msec. At the same time, I see 65MB of fpi.

1. Why does the low I/O rate in 1 (noforcedirectio) produce a 44.9 msec
service time while the high I/O rate produces only 3.1 msec? Isn't it
logical to expect a higher service time with more I/O?

2. Since option 2 uses forcedirectio, how can I explain the large
fpi value? (If directio is in use, why does the operating system cache
file data?)

3. Comparing the 38% sys with the 39% wio, which one of them is better?

Kind Regards,
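Question 1 above can be partly answered with arithmetic on the iostat output: dividing kr/s by r/s gives the average size of each physical transfer, and larger transfers naturally take longer to service. A rough sketch, with values copied from the c2t16d65 lines above (Python is used only for the arithmetic):

```python
def avg_io_kb(kr_per_s, r_per_s):
    """Average size of one physical read in KB: throughput / read rate."""
    return kr_per_s / r_per_s

# noforcedirectio: few, large clustered reads -> long per-read service time
cached = avg_io_kb(1206.1, 1.4)     # roughly 860 KB per physical read

# forcedirectio: many smaller reads -> short per-read service time
direct = avg_io_kb(62434.5, 257.8)  # roughly 240 KB per physical read

print(cached, direct)
```

This suggests the 44.9 msec service time reflects the size of each clustered read-ahead transfer, not a slower disk; the same array serving smaller transfers completes each one faster.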
I think the output is related to the type of I/O your application does.

If the I/O is done through the "dd" command, the output is just what
you would expect; that is to say: 1) using the "noforcedirectio"
option, I/O uses more CPU power and is cached (fpi), and I/O throughput
is higher; 2) using the "forcedirectio" option, the opposite of the
former. The "dd" command uses "read/write" system calls.

Is there any progress about it?

To confirm whether the file cache is being used or not, you can install
the tool bundle "memtool" to assist you. One memtool command,
"memps -m", can tell you which files are in the cache and how much
cache each file occupies.

PS, another tool, "directiostat", could also be helpful.

>Is there any progress about it?

I don't understand; if the cache is turned off, physical disk I/O
increases. This is natural, and the performance problems are to be
found elsewhere.

Applications such as Oracle, which have their own file I/O management
system, will benefit from directio (file cache disabled); but the
performance of normal file system I/O will take a hit.

graphics monitor and serial console on V440

Subject: Re: graphics monitor and serial console on V440
> We have a V440 running Solaris 9. The system has an XVR-100 graphics
> card and CRT monitor attached. Currently the CRT monitor is acting as
> the console. Is there any way to continue to utilize the CRT monitor
> for logging in and windowing (i.e. continue to see the dtgreet login
> screen and login to CDE) while having another device attached to the
> serial management port act as the console for the system?

Yes, that's pretty common.

Force the console to the device you want via 'input-device' and
'output-device' in the eeprom.
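As a sketch, the eeprom settings might look like this (ttya is an example device alias; on a V440 the serial management port may use a different alias, so verify the current values with `eeprom` before changing anything):

```shell
# Point the OpenBoot console at the serial port (example alias: ttya).
# Run as root; the change takes effect at the next reset.
eeprom input-device=ttya
eeprom output-device=ttya

# Equivalent from the ok prompt:
#   ok setenv input-device ttya
#   ok setenv output-device ttya
```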

Then cp /usr/dt/config/Xservers to /etc/dt/config (if you don't have one
there already). Read the examples at the top for the "if no character
device is associated" example and use that at the bottom instead of the
existing line. It'll probably look something like this when you're done:

:0 Local local_uid@none root /usr/openwin/bin/Xsun :0 -nobanner

The monitor should come alive when dtlogin launches.


Sun warns against putting raw data on s2

It should be noted though, that Sun warns against putting raw data on s2,
since block zero contains the disk label, and labelling will overwrite the
beginning of your raw data.

Finally, I'll mention that I saw an interesting case many years ago, where
a disk with s0 consisting of the entire disk got unmounted rather abruptly
to say the least. fsck on s0 complained about a bad superblock, even with
any of the alternate superblock locations. However, I was able to fsck s2
with success, and then mount cleanly. Don't know enough about the guts of
the disk layout to know why this worked, but it did.

On the other hand, changing the length of s2 to something other than the
entire disk is an explicit no-no, according to Sun.

Understanding Data Link Errors

Understanding Data Link Errors
Many performance issues with NICs can be related to data link errors.
Excessive errors
usually indicate a problem. When operating at half-duplex setting,
some data link errors
such as Frame Check Sequence (FCS), alignment, runts, and collisions are normal.
Generally, a one percent ratio of errors to total traffic is
acceptable for half-duplex
connections. If the ratio of errors to input packets is greater than
two or three percent,
performance degradation may be noticed.
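Those rule-of-thumb thresholds are easy to apply mechanically when post-processing counter output. A small illustrative helper (the 1% and 2-3% figures are the ones quoted above; the function name is made up):

```python
def link_error_check(errors, input_packets, half_duplex=True):
    """Classify a port's error ratio using the rule-of-thumb thresholds:
    up to 1% of traffic as errors is acceptable on half-duplex; above
    roughly 3% of input packets, expect noticeable degradation."""
    ratio = errors / input_packets
    if ratio > 0.03:
        return "degraded"
    if half_duplex and ratio <= 0.01:
        return "normal"
    return "investigate"

print(link_error_check(5, 1000))     # 0.5% on half-duplex: within the norm
print(link_error_check(400, 10000))  # 4%: expect degradation
```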
In half-duplex environments, it is possible for both the switch and
the connected device to sense the wire and transmit at exactly the
same time, resulting in a collision.

Collisions can cause runts, FCS, and alignment errors due to the frame
not being completely copied to the wire, which results in fragmented
frames.
When operating at full-duplex, FCS, Cyclic Redundancy Checks (CRC), alignment
errors, and runt counters should be minimal. If the link is operating
at full-duplex, the
collision counter is not active. If the FCS, CRC, alignment, or runt
counters are
incrementing, check for a duplex mismatch. Duplex mismatch is a
situation where the
switch is operating at full-duplex and the connected device is
operating at half-duplex, or
vice versa. The result of a duplex mismatch will be extremely slow performance,
intermittent connectivity, and loss of connection. Other possible
causes of data link errors
at full-duplex are bad cables, faulty switch port, or NIC
software/hardware issues.

Explanation of Port Errors

Troubleshooting NIC Compatibility Issues on ISUnet

Alignment Errors
Alignment errors are a count of the number of frames received that don't end
with an even number of octets and have a bad CRC.

FCS(Frame Check Sequence)
FCS error count is the number of frames that were transmitted/received
with a bad
checksum (CRC value) in the Ethernet frame. These frames are dropped and not
propagated onto other ports.

Xmit-Err
This is an indication that the internal transmit buffer is full.

Rcv-Err
This is an indication that the receive buffer is full.

UnderSize
These are frames which are smaller than 64 bytes (including FCS) and
have a good FCS value.

Single Collisions
Single collisions are the number of times the transmitting port had
one collision
before successfully transmitting the frame to the media.

Multiple Collisions
Multiple collisions are the number of times the transmitting port had more than
one collision before successfully transmitting the frame to the media.

Late Collisions
A late collision occurs when two devices transmit at the same time and
neither side of the connection detects a collision. This happens
because the time to propagate the signal from one end of the network to
the other is longer than the time to put the entire packet on the
network. The two devices that cause the late collision never see that
the other is sending until after each puts its entire packet on the
network. Late collisions are detected by the transmitter after the
first "slot time" of 64 byte times, so they are only detected during
transmissions of packets longer than 64 bytes. Detection is exactly the
same as for a normal collision; it just happens late compared to a
normal collision.
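The slot-time arithmetic behind this can be sketched numerically (valid for 10/100 Mbps Ethernet, where the slot time is 512 bit times, i.e. 64 byte times; the helper names are illustrative):

```python
def slot_time_us(mbps):
    """Slot time (512 bit times) in microseconds for 10/100 Mbps Ethernet."""
    return 512 * (1.0 / mbps)  # one bit time is 1/mbps microseconds

def is_late_collision(bytes_sent_before_collision):
    """A collision first seen after 64 bytes (one slot time) is 'late'."""
    return bytes_sent_before_collision > 64

print(slot_time_us(10))        # 51.2 us on 10 Mbps Ethernet
print(is_late_collision(32))   # a normal collision, inside the slot time
print(is_late_collision(512))  # a late collision
```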

Excessive Collisions
Excessive collisions are the number of frames that are dropped after 16
unsuccessful attempts to send the packet, i.e. after 16 collisions.

Carrier Sense
Carrier sense occurs every time an Ethernet controller wants to send
data; the counter is incremented when there is an error in the process.

Runts
These are frames smaller than 64 bytes with a bad FCS value.

Giants
These are frames that are greater than 1518 bytes and have a bad FCS value.

Possible Causes for Incrementing Port Errors
Counter Possible Cause

Alignment Errors
These are the result of collisions at half-duplex, duplex mismatch, bad
hardware (NIC, cable, or port), or a connected device generating frames
that do not end on an octet boundary and have a bad FCS.

FCS (Frame Check Sequence)
These are the result of collisions at half-duplex, duplex mismatch, bad hardware
(NIC, cable, or port), or connected device generating frames with bad FCS.

Xmit-Err
This is an indication of excessive input rates of traffic, and also an
indication of the transmit buffer being full. The counter should only
increment in situations where the switch is unable to forward traffic
out the port at the desired rate. Situations such as excessive
collisions and 10 megabit ports will cause the transmit buffer to
become full. Increasing the speed and moving the link partner to
full-duplex should minimize this occurrence.

Rcv-Err
This is an indication of excessive output rates of traffic, and also an
indication of the receive buffer being full. This counter should be
zero unless there is excessive traffic through the switch. In some
switches, the outlost counter has a direct correlation to Rcv-Err.

UnderSize
This is an indication of a bad frame generated by the connected device.

Single Collisions
This is an indication of a half-duplex configuration.

Multiple Collisions
This is an indication of a half-duplex configuration.

Late Collisions
This is an indication of faulty hardware (NIC, cable, or switch port)
or a duplex mismatch.

Excessive Collisions
This is an indication of over-utilization of the switch port at
half-duplex, or a duplex mismatch.

Carrier Sense
This is an indication of faulty hardware (NIC, cable, or switch port).

Runts
This is an indication of the result of collisions, duplex mismatch, or
a dot1q or ISL configuration issue.

Giants
This is an indication of faulty hardware, or a dot1q or ISL
configuration issue.

Additional Troubleshooting for 1000BaseX NICs
Gigabit Auto-Negotiation (No Link to Connected Device)
Gigabit Ethernet has an auto-negotiation procedure that is more
extensive than what is used for 10/100 Mbps Ethernet (Gigabit
auto-negotiation spec: IEEE Std 802.3z).
Gigabit auto-negotiation negotiates flow control, duplex mode, and
remote fault information. You must either enable or disable link
negotiation on both ends of the link.
Both ends of the link must be set to the same value or the link will
not connect.
If either device does not support Gigabit auto-negotiation, disabling
Gigabit auto-negotiation will force the link up. Disabling
auto-negotiation "hides" link drops and other physical layer problems.
Only disable auto-negotiation to end devices such as older Gigabit NICs
that do not support Gigabit auto-negotiation. Do not disable
auto-negotiation between switches unless absolutely required, as
physical layer problems may go undetected and result in spanning-tree
loops. The alternative to disabling auto-negotiation is contacting the
vendor for a software/hardware upgrade for IEEE Std 802.3z Gigabit
auto-negotiation support.