To compile a custom kernel for the same release you already have installed, you only need the kernel source set, syssrc.tgz. For a given release this is held in the gzipped tarfile 'source/sets/syssrc.tgz' under the main directory for that release. For example, the NetBSD 2.0 kernel source is in the file /pub/NetBSD/NetBSD-2.0/source/sets/syssrc.tgz. If you have a NetBSD CD-ROM, the 'source/syssrc.tgz' file should be included. The source can be extracted anywhere, though the traditional location is inside /usr/src. To extract, use "cd / ; tar xvzpf <FILENAME>".
The latest kernel sources are available from ftp.NetBSD.org or one of the mirrors in the directory /pub/NetBSD/NetBSD-current/src/sys/. To compile a kernel you should download the following from /pub/NetBSD/NetBSD-current/tar_files/src:
You should first build and install the 'config' program, in case it has changed since the version you are running. Since -current is on the active edge of NetBSD development, there can be problems compiling a -current kernel. We recommend using source from an official release until you are familiar with the configuration process.
You might need to do this if you have installed a snapshot on your machine and need to rebuild the kernel (and the -current kernel is too recent). Follow the directions on how to Track NetBSD-current with anoncvs.
Install the compiler (comp.tgz) set that came with your base system.
"cd /usr/src && ./build.sh tools" builds the tools needed to compile the kernel.
"cd /usr/src/sys/arch/<ARCH>/conf", where <ARCH> is your machine's architecture, such as 'i386', 'sparc', or 'mac68k'.
"cp GENERIC <MYCONF>", where <MYCONF> is your name for this configuration. You could use your hostname, the machine type, or even your first name. Keep to letters, numbers, and _ characters.
Edit <MYCONF> and remove the entries for devices you do not have. To see which devices your machine has, use "dmesg" or "dmesg | grep ' at '". For every line containing '<XXX> at <YYY>' you need to keep the entries for both <XXX> and <YYY>. You should also read options(4) for information on the different kernel configuration options.
"cd /usr/src && ./build.sh kernel=<MYCONF>" builds the kernel in a single step. If you use this approach, you can skip the next 4 steps.
"config <MYCONF>" will generate the kernel build directory for <MYCONF>.
"cd ../compile/<MYCONF>" changes to the kernel build directory.
"make depend" generates a '.depend' file that enables the make program to see what needs to be rebuilt (at this point it will be everything!).
"make" will compile the kernel. If all goes well you will be left with a 'netbsd' kernel. This may take some significant time if you are on a VAX, some time on a big Alpha, and somewhere in-between for the rest of us.
"mv /netbsd /netbsd.old ; mv /usr/src/sys/arch/<ARCH>/compile/<MYCONF>/netbsd /" saves your current kernel (very important), and moves the new kernel into place, ready to be booted.
"reboot" should reboot using your new kernel. The boot messages should contain a line of the form: 'NetBSD <VERSION> (<MYCONF>) #0: <COMPILE_DATE>'
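When deciding which entries to keep, it can help to reduce the dmesg output to a plain list of driver names. The following is a minimal sketch, not part of the NetBSD tools: devices_in_use is a hypothetical helper that reads dmesg-style text on standard input and prints each driver name found in a '<XXX> at <YYY>' line once, with the unit number stripped. It assumes the first and third words of such lines are device names.

```sh
# devices_in_use: hypothetical helper, reads dmesg-style text on stdin.
devices_in_use() {
    awk '$2 == "at" {
        for (i = 1; i <= 3; i += 2) {
            name = $i
            sub(/[:,]$/, "", name)     # drop trailing punctuation: mainbus0: -> mainbus0
            sub(/[0-9]+$/, "", name)   # drop the unit number: wd0 -> wd
            if (name != "") print name
        }
    }' | sort -u
}

# Typical use on a running system:
#   dmesg | devices_in_use
```

Each name printed is a candidate entry to keep in <MYCONF>; the list is only a starting point and does not replace reading the configuration file.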
If your new kernel does not boot, you can return to the old one:
Press SPACE when the first NetBSD message appears.
"boot netbsd.old -s"
"fsck /"
"mount /"
"mv netbsd.old netbsd"
"exit"
The term GENERIC refers to a kernel that is configured to run on just about any machine supported by the machine architecture. The term originated from a line in the kernel configuration file which specified that the root device was "generic" as well as a configuration option. This option and that format of the configuration line are no longer used, but the name will probably stick for a while.
Since these kernels tend to include support for all the available device drivers and many models of machines that you are not using, you are encouraged to compile your own custom kernel.
What does "mclpool limit reached: increase NMBCLUSTERS" mean?
This means the kernel has run out of space to map mbuf clusters. mbuf clusters are used by the network code to store packets and other network related data.
The default setting for NMBCLUSTERS is 1024 (256 in NetBSD 1.5 and earlier), so if you have this problem, try doubling the value until the error message disappears. To display the current value of NMBCLUSTERS you can use sysctl(8) as follows:
# sysctl kern.mbuf.nmbclusters
Alternatively, try
# echo 'print nmbclusters' | gdb -q /netbsd
See also options(4) for more details on kernel configuration options. To change the value, add
options NMBCLUSTERS=2048
to your kernel configuration, or patch the binary:
# gdb --write /netbsd
(gdb) set nmbclusters=2048
(gdb) quit
Note that if you patch the binary only, you will need to reboot for the change to take effect. If you're on a platform which supports it, you can set the value with:
# sysctl -w kern.mbuf.nmbclusters=2048
This will work, but the setting will be lost on the next reboot. Combining this with patching the binary means you need neither build a new kernel nor reboot.
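The "try doubling the value" advice can be sketched as a small helper. This is an illustration only: nmbclusters_option is a hypothetical function name, and it simply prints a configuration line with the value you give it doubled.

```sh
# nmbclusters_option: hypothetical helper; prints a kernel config line
# with the given NMBCLUSTERS value doubled.
nmbclusters_option() {
    current=$1
    echo "options NMBCLUSTERS=$((current * 2))"
}

# On a system that supports the sysctl variable:
#   nmbclusters_option "$(sysctl -n kern.mbuf.nmbclusters)"
```

The printed line can be pasted into your kernel configuration file. If the message still appears after rebuilding, repeat with the new value.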
What does "WARNING: SPL NOT LOWERED ON SYSCALL EXIT" mean?
This kernel message means that there is a bug in the kernel where a syscall did "int x = splfoo();" and did not call "splx(x);" before it returned. The splx(x); function in this example would restore the system priority level to the one encoded in x, which was a value previously returned by one of the other spl functions (in this case, the made up example of splfoo();).
If you get this kernel message you should be dropped into ddb(4), the in-kernel debugger. A stack trace in ddb, obtained by typing 't', might show you the offending syscall. It is probably a good idea to send-pr(1) the output of the trace command (as well as any other relevant information), since you should under no circumstances be getting this kernel message.
See also spl(9) for more information on spl functions.
What does "Stray interrupt on IRQ 7" mean?
The "Stray interrupt on IRQ 7" kernel message means that the interrupt controller reported an unmasked interrupt on IRQ 7, but no driver attached to that IRQ 'claimed' it.
There are two reasons this can happen:
In anything other than a PC, it would almost always mean that there is a driver attached to the IRQ (otherwise it would be masked), but it is the wrong driver.
In a PC, there is the more obscure issue of 'default IR7's. That is, when a device asserts an IRQ, but the IRQ is deasserted after the PIC latches the interrupt and before the CPU acknowledges it, the PIC just flat out lies about which IRQ it was.
There is a scheme for recognizing 'default IR7's, but it turns out that it fails badly on some older systems, and in general it's better to fix drivers to not generate them in the first place. In some cases it's difficult to completely prevent them when using edge-triggered interrupts though.
You should only get this kernel message running a kernel with the DEBUG option defined.
-msoft-float
When a process makes a system call the kernel needs to save the processor state, so that it can later switch back to the process. Floating point registers tend to be large and relatively plentiful, making saving and restoring them an expensive operation. If the FPU is in the middle of an operation the CPU will additionally be forced to sit and wait for it to finish before it can then copy the registers.
Avoiding floating point registers in the kernel gives a significant performance win for system calls. Some processors, such as sparc, can also use lazy FP context switching to sometimes avoid having to save and restore FP registers even when switching between processes.
On some architectures the compiler can use floating point registers to speed up certain operations (such as block memory copies), breaking the above, so '-msoft-float' is required.
By default NetBSD installs a GENERIC kernel which includes drivers for almost every supported item of hardware, network protocol, and filesystem. While this allows one kernel to run on virtually every machine for a given port, it does result in it using more space than is really needed, particularly on a small memory machine. This is compounded by kernel compiles using the -O2 optimisation level which tells the compiler to use extra memory and time to make the kernel as fast as possible.
One option when building your own kernel is to use "make COPTS=-O", which instructs the compiler to perform only the most effective optimisations. This will result in a fractionally slower kernel, but it will take less time to compile.
If you are intending to take several 'compile and reboot into new kernel' passes while customising a kernel on a low memory machine, it may make sense to make the first few passes using "make COPTS=-O", and then switch to "make" for the final pass.
Of course, generally the fastest way to compile a kernel on a low memory machine is to use another machine, or temporarily add some more memory!
The first point to note is you should subscribe to the current-users mail list. Tracking -current without reading current-users is akin to driving in the dark without any lights. You have been warned :).
It is always worth downloading the latest config.tar.gz, compiling and installing it, and rerunning it on your config file, as config changes reasonably frequently between releases.
Sometimes, binaries and/or libraries need to be updated before you will be able to build -current on a release. In these situations, it may be simpler to install from a binary snapshot, and then build -current. Snapshots of -current for i386 (for example) can be found in /pub/NetBSD/arch/i386/snapshot/. The src/UPDATING file contains information about these important changes which you should be aware of when attempting to build -current, or a -current kernel.
Build a kernel with DEBUG, and 'makeoptions DEBUG="-g"' enabled in the config file.
Run "gdb netbsd.gdb" (in kernel compile directory).
Enter "target kcore /var/crash/netbsd.0.core" at the gdb prompt.
You can use the usual gdb(1) commands, such as 'bt' to get a backtrace.
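If you examine crash dumps often, the gdb commands above can be kept in a command file. The sketch below is only an illustration: kcore_gdb_script is a hypothetical helper that prints the two commands for a given dump number, and the /tmp/kcore.gdb path in the usage comment is an arbitrary choice.

```sh
# kcore_gdb_script: hypothetical helper; prints gdb commands that load
# crash dump number N (default 0) and produce a backtrace.
kcore_gdb_script() {
    n=${1:-0}
    printf 'target kcore /var/crash/netbsd.%s.core\n' "$n"
    printf 'bt\n'
}

# Possible use, from the kernel compile directory:
#   kcore_gdb_script 0 > /tmp/kcore.gdb
#   gdb -x /tmp/kcore.gdb netbsd.gdb
```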
You can get backtraces of an arbitrary process from gdb when debugging a kernel crash dump with two easy steps:
Run "ps -ax -O paddr -M netbsd.x.core" to find the paddr of the process.
Enter "proc 0x<addr>" at the gdb prompt.
DDB is the optional in-kernel debugger. It is usually entered via one of three methods:
Booting the kernel with the -d flag ("boot netbsd -d").
Some of the more useful commands are:
trace - Produce a stack trace. Very useful when submitting a PR on a kernel panic.
reboot - Reboot the system.
sync - Produce a crashdump and reboot.
Usually the kernel will automatically generate a crashdump on panic, which is then picked up by savecore(8) on reboot. However, you can force a crashdump in ddb(4) by using sync (or reboot 0x100).
If the kernel panics or hangs while attempting to sync the buffer cache, you can use reboot 0x104, which will bypass the sync.
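The numbers given to reboot are bit flags. As a sketch, assuming the values RB_NOSYNC = 0x004 and RB_DUMP = 0x100 from sys/reboot.h (check your own source tree), 0x104 is the two flags combined. The ddb_reboot_arg function is a hypothetical helper used only to show the arithmetic.

```sh
# Flag values assumed from sys/reboot.h; verify against your source tree.
RB_NOSYNC=0x004   # do not sync the buffer cache
RB_DUMP=0x100     # produce a crashdump before rebooting

# ddb_reboot_arg: hypothetical helper; ORs the given flags together and
# prints the result in the hexadecimal form used at the ddb prompt.
ddb_reboot_arg() {
    flags=0
    for f in "$@"; do
        flags=$((flags | $f))
    done
    printf '0x%x\n' "$flags"
}

# ddb_reboot_arg $RB_DUMP             prints 0x100
# ddb_reboot_arg $RB_DUMP $RB_NOSYNC  prints 0x104
```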
Some ports are already set up to build a boot floppy with "cd /usr/src/distrib/<ARCH>/floppies ; make". (You may need to build the INSTALL kernel manually before running this.) If you have an existing boot.fs image you can replace the kernel with:
vnconfig -c vnd0 boot.fs
mount /dev/vnd0a /mnt
gzip -c -9 < netbsd > /mnt/netbsd.gz
umount /mnt
vnconfig -u vnd0
This assumes you have "pseudo-device vnd" in your kernel config file, and a ready to use kernel.
By default, SCSI devices under NetBSD are numbered starting from 0 in the order of SCSI ID number. In other words, your lowest-numbered SCSI device will be /dev/sd0, the next device will be /dev/sd1, etc. Notice that this is the assignment that they are given during the boot process.
If you compile your own kernels, you can set the SCSI devices to point to any SCSI ID number you want with a kernel configuration line like:
sd0 at scsibus0 target 4 lun 0
sd* at scsibus? target ? lun ?
The above lines will make device sd0 point to the disk at SCSI ID#4 and the rest of the devices will be assigned as described above. This is often referred to as "hardwiring" your SCSI devices, and is recommended if you are using RAIDframe or ccd, so as to avoid the device IDs being changed out from under the configuration if one device is powered off or broken.
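If you hardwire several disks, the configuration lines all follow the same pattern. The following sketch uses a hypothetical helper, hardwire_sd, that prints one such line for a given unit and SCSI ID; the bus number defaults to 0 and the lun is assumed to be 0, as in the example above.

```sh
# hardwire_sd: hypothetical helper; prints a config line that ties
# sd<unit> to SCSI ID <target> on scsibus<bus> (bus defaults to 0).
hardwire_sd() {
    unit=$1
    target=$2
    bus=${3:-0}
    printf 'sd%s at scsibus%s target %s lun 0\n' "$unit" "$bus" "$target"
}

# Example: hardwire two disks, then let the rest be assigned as usual.
#   hardwire_sd 0 4
#   hardwire_sd 1 2
#   echo 'sd* at scsibus? target ? lun ?'
```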
What does "device not configured" mean?
If this message appears during the autoconfiguration output of system boot, it means that the kernel discovered a piece of hardware in your system that it doesn't have a device driver for. This means that either the device driver exists and has not been compiled into the kernel you booted, or the device driver doesn't exist at all (in which case, it's time to contact a friendly developer and offer him testing hardware in exchange for writing a device driver).
Since GENERIC kernels are used for basic installation, it is important that they be stable and known to work; as such, device drivers that are not yet stable are not compiled into GENERIC kernels by default. Examination of a GENERIC kernel configuration file for your system may reveal experimental device drivers for your device which are "commented out." If you compile a kernel of your own (please don't call it GENERIC), you can try experimental device drivers.
If this message appears when you try to access a device node in /dev (e.g. a SCSI disk), this means that the driver can't find the specific device unit you tried to access, e.g. accessing a SCSI disk that isn't there.
Often, this happens when the device nodes in /etc/fstab don't match what the kernel found during autoconfiguration at boot time, and the "mount" command in /etc/rc tries to mount all the filesystems. You should double check that the devices you're trying to use were actually found by the kernel at boot time, by examining /var/run/dmesg.boot (a saved copy of the boot time autoconfiguration output).
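The check described above can be sketched as a small script. This is an illustration under stated assumptions, not a NetBSD tool: missing_fstab_devices is a hypothetical helper, and it assumes device nodes are named <driver><unit><partition letter> (for example /dev/sd0a) and that a detected device appears in the boot messages as a line starting with "<driver><unit> at ".

```sh
# missing_fstab_devices: hypothetical helper; prints each /dev node from
# the fstab file whose device was not reported in the saved boot messages.
missing_fstab_devices() {
    fstab=${1:-/etc/fstab}
    dmesg=${2:-/var/run/dmesg.boot}
    awk '$1 ~ /^\/dev\// { print $1 }' "$fstab" | while read -r node; do
        unit=${node#/dev/}     # /dev/sd0a -> sd0a
        unit=${unit%[a-p]}     # sd0a -> sd0 (drop the partition letter)
        grep -q "^$unit at " "$dmesg" || echo "$node"
    done
}

# With no arguments it reads /etc/fstab and /var/run/dmesg.boot:
#   missing_fstab_devices
```

Any node it prints refers to a device the kernel did not find at boot time, which is a likely source of the message.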
If a kernel is compiled with WDCDEBUG defined, then gdb can be used to patch wdcdebug_atapi_mask and wdcdebug_mask. Setting the appropriate bits in these variables will cause the kernel to output verbose information about ATAPI and ATA operations. (Currently NetBSD defaults to WDCDEBUG enabled.)
For the maximum level of output, use:
# gdb --write /netbsd
(gdb) set wdcdebug_atapi_mask=0xff
(gdb) set wdcdebug_mask=0xff
(gdb) quit
Note: This will produce an extremely large quantity of output. To select individual options, look for the list of bitflags directly above:
wdcdebug_atapi_mask in /usr/src/sys/dev/scsipi/atapi_wdc.c
wdcdebug_mask in /usr/src/sys/dev/ic/wdc.c
If you have problems with USB devices, you can enable verbose messages in the usb driver:
Build a kernel with USB_DEBUG and DDB defined.
Boot with -d to enter ddb(4).
Resume booting with continue.
This assumes the device is of a generic type which is already supported, but the device ID is not recognised. Adding a device that performs differently includes writing code.
It should be reported in the boot messages as 'not configured'. Note the device ID (in this case USR3031):
isapnp0: <U.S. Robotics 56K FAX INT, USR3031, , > port 0x3e8/8 irq 5 not configured
Add an appropriate entry to /usr/src/sys/dev/isapnp/isapnpdevs:
devlogic com USR3031 USR 56k Faxmodem
Regenerate isapnpdevs.{c,h} using 'make -f Makefile.isapnpdevs'.
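To pick the device ID out of the boot messages, a filter along the following lines may help. It is a sketch only: isapnp_unconfigured_ids is a hypothetical helper, and it assumes the ID is the second comma-separated field inside the angle brackets and has the usual form of three uppercase letters followed by four hexadecimal digits.

```sh
# isapnp_unconfigured_ids: hypothetical helper; reads boot messages on
# stdin and prints the device ID of each unconfigured isapnp device.
isapnp_unconfigured_ids() {
    sed -n 's/^isapnp[0-9]*: <[^,]*, *\([A-Z]\{3\}[0-9A-Fa-f]\{4\}\),.*not configured.*/\1/p'
}

# Typical use:
#   dmesg | isapnp_unconfigured_ids
```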
This assumes the device is of a generic type which is already supported, but the device ID is not recognised. Adding a device that performs differently includes writing code.
Build a kernel with the PCMCIAVERBOSE option.
The device should be reported in the boot messages as 'not configured'. Note the Manufacturer and product codes (in this case 0x143 and 0x201):
pcmcia0: CIS version PCMCIA 2.0 or 2.1
pcmcia0: CIS info: Grey Cell, GCS2000, Gold II, 1
pcmcia0: Manufacturer code 0x143, product 0x201
pcmcia0: function 0: network adapter, ccr addr 3f8 mask 1
Add vendor and product entries to /usr/src/sys/dev/pcmcia/pcmciadevs.
Regenerate pcmciadevs.h and pcmciadevs_data.h using 'make -f Makefile.pcmciadevs'.
Add an entry to the appropriate driver in /usr/src/sys/dev/pcmcia/, for example an ne2000 compatible card would use /usr/src/sys/dev/pcmcia/if_ne_pcmcia.c.
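The manufacturer and product codes can be extracted from the verbose boot messages with a short filter. This is a sketch with a hypothetical helper name, pcmcia_codes; it assumes the message format shown in the example above.

```sh
# pcmcia_codes: hypothetical helper; reads boot messages on stdin and
# prints "<manufacturer> <product>" for each "Manufacturer code" line.
pcmcia_codes() {
    sed -n 's/.*Manufacturer code \(0x[0-9a-fA-F]*\), product \(0x[0-9a-fA-F]*\).*/\1 \2/p'
}

# Typical use:
#   dmesg | pcmcia_codes
```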
There are patches from Martin Husemann which add PLIP support to NetBSD/i386, submitted as PR 1278. The patches at the end of the PR should apply to the NetBSD 1.3.3 source tree.
UBC stands for the Unified Buffer Cache project. It was written by Chuck Silvers, and has been integrated into NetBSD since 1.5L (Nov 2000). When upgrading from a non-UBC setup, you'll need to rerun config(8), but before you do, you'll want to remove any settings for "BUFCACHE", "NBUF" or "BUFPAGES", and let the size of the buffer cache go back to the default. Under UBC, the traditional buffer cache is no longer used for storing regular file data, only metadata, so you want to allow the VM system to manage most of your physical memory. The default buffer cache size will be fine for most people, regardless of the amount of memory in the machine. The amount of memory in the boot messages about "using X buffers containing Y memory" no longer indicates the amount of memory available for caching file data, so don't worry if those numbers don't change.
The important difference is that more memory will be available for caching regular file data, so filesystem i/o will be faster since there will be more times when the data you're accessing is already in memory. How much faster depends on what you're doing, but you'll probably notice the difference.
See also: UBC: An Efficient Unified I/O and Memory Caching Subsystem for NetBSD by Chuck Silvers.
Jochen Kunz' "NetBSD Device Driver Writing Guide" is available as gzipped PDF and gzipped PS, in English and German.