AMD64 work starting

Matt Dillon is planning to work on AMD64 support in February. He listed these steps:

“* build support and cross compilation work
* kernel build
* boot 64-bit kernel almost to single user
* 32 bit userland support
* boot kernel to single user
* basic device driver and filesystem testing
* boot kernel to multi user (fully working system at this point)
* everyone w/ 64 bit boxes start banging on it, fixing additional device drivers, get 64 bit buildworlds working, and so forth.”

Systimer stuff

Matt Dillon has been working on this patch, which he describes as follows:

“These are dyamic[sic] interrupt-driven timers. They replace the old fixed periodic ‘hardclock’ interrupt that exists now and allow per-cpu multiple periodic or one-shot timer interrupts to be registered with the system. Systimers operate outside the MP lock, so any code developed to use it has to be MP safe. Systimers are intended to be able to make use of per-cpu timers (e.g. LAPIC), when available, and will eventually be augmented to use them.”

It also has the added bonus of making nanosleep() very accurate.
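One way to see what "accurate" means here is to compare the interval you ask nanosleep() for with the interval that actually passes. The following is a minimal sketch, not part of the systimer patch; the helper name measured_sleep_ns is made up for illustration, and it assumes a system that provides CLOCK_MONOTONIC.

```c
#define _POSIX_C_SOURCE 200112L
#include <time.h>

/*
 * Hypothetical helper: sleep for request_ns nanoseconds and return the
 * number of nanoseconds that actually elapsed on CLOCK_MONOTONIC.
 * Returns -1 if a clock read fails or the sleep is interrupted.
 */
long long measured_sleep_ns(long request_ns)
{
    struct timespec req;
    struct timespec start, end;

    req.tv_sec = request_ns / 1000000000L;
    req.tv_nsec = request_ns % 1000000000L;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;
    /* nanosleep() returns non-zero if a signal cut the sleep short */
    if (nanosleep(&req, NULL) != 0)
        return -1;
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;

    return (long long)(end.tv_sec - start.tv_sec) * 1000000000LL
         + (long long)(end.tv_nsec - start.tv_nsec);
}
```

The overshoot is measured_sleep_ns(n) minus n. With a fixed periodic clock tick, a short sleep tends to get rounded up to the next tick, so the smaller the overshoot, the finer the timer resolution.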

gcc 3 going on

Joerg Sonnenberger has added gcc 3.3 to the base system. You can set ‘CCVER=gcc3’ to use it, even to do a buildworld/buildkernel, though that is “not recommended”. Andreas Hauser has already reported a successful build and boot with it, though.

xl expanded

Matt Dillon has made some changes to the xl driver that apparently solve a mysterious bug; I’m quoting from his changelog message below:

“Turn off hardware assisted transmit checksums by default. In buildworld loop tests this has been conclusively shown to corrupt transmit packets about one out of every million packets. The receive will not know the the packet is bad because hardware assist also apples the correct checksum to the corrupted packet. The result are random failures or corruption of network data in certain situations. On DragonFly, for some reason, doing a ‘resident /usr/bin/*’ seems to bring the problem out every few buildworlds with (primarily) mkdep’s cpp complaining about odd errors trying to open non-existant header files (during a header file search), such as EPROTONOSUPPORT. A tcpdump on both NFS client and server showed the client transmitting an access RPC and the server seeing a corrupted access RPC on its end, and then responding with EPROTONOSUPPORT. Other uncaught errors are also almost certainly occuring. mkdep is more likely to catch them because it actually checks the errno of a failed open() and does a huge number of open()’s (and as an NFS client this generates a huge amount of packet traffic).”
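The reason the receiver cannot notice the damage is easier to see with a small model. The sketch below uses the standard 16-bit ones' complement Internet checksum (RFC 1071 style). It is not code from the xl driver, and the helper names inet_checksum and receiver_accepts are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit ones' complement checksum over len bytes (RFC 1071 style). */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    /* add the data as big-endian 16-bit words */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    /* an odd trailing byte is padded with a zero byte */
    if (i < len)
        sum += (uint32_t)data[i] << 8;
    /* fold the carries back into the low 16 bits */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}

/*
 * Hypothetical receiver-side check: accept the packet only if the
 * checksum it carries matches the one recomputed over the payload.
 */
int receiver_accepts(const uint8_t *payload, size_t len, uint16_t carried)
{
    return inet_checksum(payload, len) == carried;
}
```

If the checksum is computed in software before the payload gets damaged, the carried value no longer matches and receiver_accepts() returns 0. If the checksum is computed after the damage, which is the situation the changelog describes for hardware assist, the carried value matches the corrupted bytes and the packet is accepted.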

Debugging options

Matt Dillon noted that:

makeoptions DEBUG=-g
options DDB

are the only options you need in your kernel configuration to build with debugging on.

Faster file descriptor allocation

Skip Ford ported Tim Robbins’ FreeBSD port of Niels Provos’ NetBSD file descriptor allocation code. Normally I don’t post about code until it gets committed, but he posted some numbers on how well it improves things, as benchmarked by Niels Provos’ test program that opens/closes files repeatedly. The numbers seem to indicate a 50% speedup:

File Descriptors    Unpatched    Patched
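For a feel of what such a test exercises, here is a minimal sketch of one open/close cycle. This is not Niels Provos’ test program; the function name open_close_cycle and the use of /dev/null are assumptions made for illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical benchmark step: open up to count descriptors on
 * /dev/null, storing them in fds, then close them all again.
 * Returns how many descriptors were successfully opened.
 */
int open_close_cycle(int *fds, int count)
{
    int opened = 0;
    int i;

    for (opened = 0; opened < count; opened++) {
        fds[opened] = open("/dev/null", O_RDONLY);
        if (fds[opened] < 0)
            break;              /* most likely the per-process limit */
    }
    for (i = 0; i < opened; i++)
        close(fds[i]);

    return opened;
}
```

Each open() has to find the lowest free descriptor number, so calling this in a loop with a large count and timing it puts the allocation code under load.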

Sound support superior

Thanks to Emiel Kollof (sorry I screwed up attribution!), AC97 support has been synced up with FreeBSD, plus support for the following devices has been added by Jeroen Ruigrok:

Diamond Technology Monster (DT0398)
Intersil (Harris) HMP9701
Integrated Technology Express (ITE) ITE2226E and ITE2646E
Texas Instruments TLC320AD90
Winbond W83971D
Asahi Kasei AK4544A and AK4545
Realtek ALC850
Wolfson WM9711L, WM9712L, and WM9709
Texas Instruments TLV320AIC27
Conexant SmartDAA HSD11246

Resident Good

Matt Dillon is bringing in resident executable support. This speeds loading of dynamically linked programs by saving a copy of their vmspace with vmspace_fork() in the kernel, and using that when executed instead of going through the regular, slower startup. This will replace prebinding.

Matt Dillon posted these preliminary numbers running a test program with Perl:

dynamic: 2.860u 2.668s 0:05.61 98.3% 87+231k 0+0io 0pf+0w
prebinding: 1.821u 2.095s 0:03.90 100.2% 34+202k 0+0io 1pf+0w
resident: 1.239u 1.846s 0:03.08 99.6% 137+280k 0+0io 0pf+0w
statically linked: 0.418u 0.867s 0:01.28 99.2% 808+616k 0+0io 0pf+0w

The plan is to make dynamic loading as fast as static. For those of you not familiar with the output of time, the first column is CPU time spent in the program itself (user), the second is CPU time spent in the kernel on its behalf (system), and the third is total elapsed time. The remaining columns are CPU utilization, average shared+unshared memory use in kilobytes, input+output operations, and page faults plus swaps.
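To compare the four runs, it helps to turn the elapsed field (the third one, in m:ss.ss form) into seconds and take ratios. A minimal sketch of that arithmetic follows; the helper names elapsed_seconds and speedup are made up for illustration.

```c
#include <stdio.h>

/*
 * Parse an elapsed-time field such as "0:05.61" (minutes:seconds)
 * into seconds. Returns -1.0 if the field does not match that form.
 */
double elapsed_seconds(const char *field)
{
    int minutes;
    double seconds;

    if (sscanf(field, "%d:%lf", &minutes, &seconds) != 2)
        return -1.0;
    return minutes * 60.0 + seconds;
}

/* How many times faster the candidate run is than the baseline run. */
double speedup(const char *baseline, const char *candidate)
{
    return elapsed_seconds(baseline) / elapsed_seconds(candidate);
}
```

With the numbers above, speedup("0:05.61", "0:03.08") is about 1.82 for resident over plain dynamic loading, compared with about 1.44 for prebinding and about 4.38 for the statically linked binary.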