The reason I started using NetBSD was that I became intensely frustrated with Linux on the Digital Alpha, in particular with the lack of release engineering: you can't just download kernel sources from the usual master sites and expect them to work, because even on the ``stable'' branch, Linus builds the release on his PeeCee and ships it. ``Releases'' aren't even tested on anything except Linus's PeeCee. For notPeeCee's, you have to track down specific kernel versions and then go hunting for all these ``patch kits''. Even for an XBox, which is only just barely not a PeeCee, you have to do this. RedHat and IBM have employed most of the serious Linux developers for many years, and in many ways this is great. They do so with a level of transparency that gives Sun something to strive towards, and if honestly examined, outs Apple for the double-talking scheming proprietary worms of the OS world that they are. But it also means RedHat and IBM overwhelm the goals of the project with raw productivity. It's a wonderful thing about the open source world that writing working code is mostly enough to give you control over the direction of a project, and I don't wish to change that, but the logical consequence is that every open-source project, like it or not, is nakedly for sale, cheaply, to any corporation willing to throw developers at it.
RedHat and IBM have bought Linux and defined the Linux culture, which is a bit different from the Unix culture. It's a PeeCee-based culture:
One piece of that culture needs more explaining. In Linux, there's only one kernel branch: The Future. RedHat, ironically, makes money mostly by selling you a fix for the broken culture they nurtured, by getting you closer (maybe not close enough) to the release engineering of Solaris, HP-UX, or AIX. There are three kinds of change: fixes for old bugs, support for new hardware, and freshly-introduced new bugs.
With a new Linux kernel, you get a mix of all three, and not a mix of your choosing. Will there be more bugs or fewer in the new version? Who's to say? One trick is to keep upgrading until you find something that's not broken, then hold onto it as tightly as possible, knowing that eventually security problems, or replacements for your broken hardware, will force you back onto the latest, again-broken, system.
RedHat, IBM, and the other Linux distribution vendors will, optimistically, filter these changes so you get the first two kinds with less of the brokenness introduced by the third. But support for new hardware routinely breaks old hardware.
In practice, RedHat and IBM have their business priorities. Their cost-conscious customers are not too interested in support for hardware more than two years old, and definitely not in anything non-i386. They break old hardware all the time, and aren't interested in fixing it. That's what drove me off Linux onto NetBSD.
A side-effect of this voting-with-lines-of-code system is that system integration is supposed to happen magically. Would you like to do something moderately minority, like run a big-endian powerpc machine, an XBox, or some brand-new SATA controller?
In that case, the majority will break your shit faster than you can fix it. You won't spend your time working on the thing that interests you---the different thing you're doing. You'll spend it learning how Gnome works so you can fix some arcane endian bug that shows up on big-endian powerpc machines but not on i386, untangling OpenOffice's build system because they've started assuming you have the GNU version of some Unix tool, or forward-porting XBox or SATA patches that no longer apply to the latest kernel. None of that is related to what interests you. It's related to what other people are touching and breaking.
Back then, there actually was a development branch for Linux---when the digits after the first decimal were an odd number, that was a development version. The RedHat guys would actually do all their work on the stable kernel, without even committing it to the development branch at all, and they had no trouble pushing changes through. Non-RedHat developers had to do the work of forward-porting RedHat's additions to Linux 1.2.x into the 1.3.x development branch, so a lot of the time people using development kernels actually lacked features. Now, Linux seems to have given up on development branches entirely---instead, developers work against the stable kernel, and they keep their work inside ``patch files'' that they perfect and push, push, push until someone gobbles them into the so-called stable kernel. My Grub menu says 2.6.15.1+libata1 because I've added Jeff Garzik's SATA patchset.
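For anyone who hasn't lived this ritual, it looks something like the sketch below. The patch file name is made up, because the real name changes with every kernel release, and half the sport is finding a version of it that still applies.

    $ cd /usr/src/linux-2.6.15.1
    $ patch -p1 --dry-run < ../2.6.15.1-libata1.patch    # see whether it still applies
    $ patch -p1 < ../2.6.15.1-libata1.patch
    $ make oldconfig && make && make modules_install install

When the stable kernel moves out from under the patch, the forward-porting is left to you.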
It was (is) a total disaster, and it was preventing me from getting any work done. All I did was keep trying patches and then rolling them back, hoping I'd find one that actually stayed up more than a day and supported my AIC7xxx SCSI chip. After deciding to use BSD on my Alpha, I used NetBSD because I liked their web page more than OpenBSD's. And it was great.
I think the main reason I kept using NetBSD instead of another BSD is that I have a long-term goal of becoming a good programmer. I admit to not making rapid progress on that, but I think reading NetBSD code and watching NetBSD design decisions helps me more than reading and watching Linux, and I believe also more than FreeBSD or OpenBSD.
However, lately it seems like NetBSD, and Unix in general, is in a real crisis. Problems that were introduced one or two years ago are not getting fixed---problems like these (as of 2005-10-30):
Threads don't work. [XXX---this is getting fixed. I'm very optimistic that it will work, not quite so optimistic that the debuggers will be bug-free.]
There's no Java. [XXX---actually this is not so bad now. I think all three BSD's have relatively recent Java, on i386 only, right now? Of course Linux has it on PowerPC also, but it's still an improvement! ISTR this got better a few months before Sun made their Java partial-GPL announcement. I think small attitude changes and tiny amounts of work by the right highly-competent, grossly-underpaid-at-any-price people make a big difference.]
In the BSD/OS days, c. 1997, we had LFS, but no one can convince it to work again. NetBSD is still trying, but there is some weakness in their debugging system now---often one cannot get kernel coredumps, because some locking problem causes a second panic when you ask for a coredump after a problem in the filesystem or disk layers. I don't know if this is a big hindrance to the LFS geeks or not, but it does mean that, as a non-developer, one can't help much to flush the bugs out of LFS.
We also don't have an equivalent to Linux's JFFS2, which makes NetBSD not very interesting on cheap devices with tiny NOR FLASH chips.
I figured I wasn't really comfortable slamming gdb without trying it first, so I'd better do a quick check: attach gdb to BIND 9, since I'd heard that uses threads, and I didn't want to start up any other threaded programs, because last time that made my whole machine panic.
    (gdb) attach 316
    Attaching to process 316
    /export/src/gnu/dist/gdb/gdb/solib-svr4.c:1282: gdb-internal-error: legacy_fetch_link_map_offsets called without legacy link_map support enabled.
    An internal GDB error was detected.  This may make further debugging unreliable.
    Quit this debugging session? (y or n)
so....um....it looks like gdb does not work at all, with or without thread support. Here, let me try again:
    castrovalva:~$ uname -a
    NetBSD castrovalva 3.0_BETA NetBSD 3.0_BETA (CASTROVALVA-$Revision: 1.20 $) #1: Tue Sep 20 18:06:04 EDT 2005  carton@castrovalva:/export/src/sys/arch/alpha/compile/CASTROVALVA alpha
    castrovalva:~$ cat > t0.c
    #include <stdlib.h>
    #include <stdio.h>

    int
    main(int argc, char **argv)
    {
            printf("Hello, world.\n");
            return EXIT_SUCCESS;
    }
    ^D
    castrovalva:~$ gcc -pthread -g -o t0 t0.c
    castrovalva:~$ ./t0
    Hello, world.
    castrovalva:~$ gdb t0
    GNU gdb 5.3nb1
    Copyright 2002 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB.  Type "show warranty" for details.
    This GDB was configured as "alpha--netbsd"...
    (gdb) run
    Starting program: /a/castrovalva/export/home/carton/t0
    thread_resume_suspend_cb: td_thr_suspend(0x1202d4f80): generic error.
    [Switching to LWP 1]
    Stopped due to shared library event
    (gdb)
I just can't get any work done this way. Let's try FreeBSD:
    # uname -a
    FreeBSD lucette 5.4-RELEASE FreeBSD 5.4-RELEASE #0: Tue Sep 20 23:05:38 UTC 2005  carton@lucette:/usr/src/sys/sparc64/compile/LUCETTE sparc64
    # cat > t0.c
    #include <stdlib.h>
    #include <stdio.h>

    int
    main(int argc, char **argv)
    {
            printf("Hello, world.\n");
            return EXIT_SUCCESS;
    }
    ^D
    # gcc -pthread -g -o t0 t0.c
    # ./t0
    Hello, world.
    # gdb t0
    GNU gdb 6.1.1 [FreeBSD]
    Copyright 2004 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB.  Type "show warranty" for details.
    This GDB was configured as "sparc64-marcel-freebsd"...
    (gdb) run
    Starting program: /root/t0
    Hello, world.

    Program exited normally.
    (gdb)
It works now. But on the mailing list as of 2008-01-01 there are some reports of problems using gdb to debug a threaded KDE library. How about Solaris:
    bash-3.00$ uname -a
    SunOS amber 5.10 Generic_118822-20 sun4u sparc SUNW,Ultra-5_10
    bash-3.00$ cat > t0.c
    #include <stdlib.h>
    #include <stdio.h>

    int
    main(int argc, char **argv)
    {
            printf("Hello, world.\n");
            return EXIT_SUCCESS;
    }
    ^D
    bash-3.00$ /opt/SUNWspro/bin/cc -mt -g -o t0 t0.c -lpthread
    bash-3.00$ ./t0
    Hello, world.
    bash-3.00$ gdb t0
    GNU gdb 6.2.1
    Copyright 2004 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB.  Type "show warranty" for details.
    This GDB was configured as "sparc-sun-solaris2.10"...
    (gdb) run
    Starting program: /home/carton/t0
    warning: Lowest section in /lib/libpthread.so.1 is .dynamic at 00000074
    warning: Lowest section in /lib/libthread.so.1 is .dynamic at 00000074
    Hello, world.

    Program exited normally.
    (gdb)
I don't know for sure whether FreeBSD and Solaris can actually debug threaded programs helpfully, but NetBSD can't even keep the debugger from crashing. But, see, that's okay, because what people do is debug their threaded program under Linux or Windows, and then port it to NetBSD or FreeBSD. If there are any further bugs, they're with the OS's thread support, not the program itself, so gdb won't do you much good anyway. What you need then is something to go through your kernel core dump, now that your machine has panicked (assuming you were lucky enough to get a dump, which you often aren't). I don't know what else they could be thinking. I know people are working on thread support, yet clearly gdb is useless.
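Note, by the way, that my t0.c above never even calls pthread_create. If you want to reproduce any of this with a program that genuinely has a second thread in it, a minimal test looks something like this sketch (build and run it the same way as t0):

    $ cat > t1.c
    #include <stdlib.h>
    #include <stdio.h>
    #include <pthread.h>

    /* a trivial second thread, so the debugger has something to suspend */
    static void *
    hello(void *arg)
    {
            printf("Hello from thread.\n");
            return NULL;
    }

    int
    main(void)
    {
            pthread_t t;

            if (pthread_create(&t, NULL, hello, NULL) != 0)
                    abort();
            pthread_join(t, NULL);
            return EXIT_SUCCESS;
    }
    ^D
    $ gcc -pthread -g -o t1 t1.c
    $ gdb t1

I won't fake a transcript for it; try it on your own kernel and see whether you get any further than I did.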
Anyway, back to gcc. i386 makes it easy on you, because most of the compiler is in silicon. I've heard something like half the chip's transistors and power draw are devoted to translating from i386 bytecode into some reasonable, modern machine language. This is perfect for people writing compilers at the level of competence of gcc and Microsoft.
I used to complain all the time about Intel's backward instruction set, but maybe it is time to stop sparing the people feeding their hegemony: how many people do you know who learned x86 assembler, and are very proud of it, but refuse to learn any other machine language? In fact, they call the i386 instruction set simply ``machine language'' without bothering to specify which machine, and they print on T-shirts jokes about ``EIP'' as if everyone knew what they were talking about, and as if it weren't a huge embarrassment to have all your register names start with E because of the 16-bit legacy. If that's the level of motivation and perspective of the braintrust you have to work with, perhaps it makes sense to maintain an instruction set that was considered mediocre even twenty-seven years ago, when it was first released. If most of your CPUs are going to have Microsoft code running on them built with Microsoft compilers, maybe you'll get more competitive performance-per-dollar by building a chip with a built-in recompiler, so that people won't think your chips are slower than the other guy's because Microsoft's compilers suck.
Unfortunately, gcc seems to be right there at Microsoft's level, performance-wise. On any truly modern chip with a challenging instruction set that expects considerable intelligence from the compiler (sparc, alpha, mips, ppc), you'll find gcc is a notorious dog, producing code sometimes even half the speed of the payware alternative: SUNpro, MIPSpro, Tru64, CodeWarrior, whatever. We used to say gcc just needed time to catch up on these less popular targets, but the entire life cycle from development to prominence to obscurity, for multiple generations of each of these CPUs, has passed before us, and now people are running gcc on sort-of-functioning museum relics of these old machines, and gcc's performance still sucks.
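You don't have to take my word for the factor of two; the experiment is cheap. A sketch, where loop.c stands in for whatever compute-bound inner loop you care about, and the compiler paths are whatever your system actually has:

    $ gcc -O2 -o loop.gcc loop.c
    $ /opt/SUNWspro/bin/cc -fast -o loop.sunpro loop.c
    $ time ./loop.gcc
    $ time ./loop.sunpro

(Mind the benchmark clause in whatever license your vendor compiler came under before you publish the numbers.)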
Look at what Sun has done to gcc. I believe it works like this:
You have to download their modified gcc in two pieces: (1) the gcc piece, with source, licensed under the GPL, and (2) the so-called ``Code Generator'' under a Free-as-in-Free-Beer license with typically onerous provisions, so that Sun can pull it off their web site at any time and stop redistribution, and:
    (d) Unless enforcement is prohibited by applicable law, you may not decompile, or reverse engineer Software.

    (f) You may not publish or provide the results of any benchmark or comparison tests run on Software to any third party without the prior written consent of Sun.
The SPARC architecture, and the demand for cleverer compiler optimizations that comes with it, has been popular since the late 80's. The default architecture targeted by these tools, UltraSPARC II, was released in 1997, and gcc still doesn't have a reasonable back-end for it. So here is Sun, dragging the obstinate open-source world kicking and screaming into the future, in this rather misleading and deceitful way: you're encouraged to think you're getting the Freedom of the gcc brand along with the Performance of Sun, but you're not.
I believe Sun is motivated to do this gcc work because they want to sell UltraSPARC T1 systems. Open-source developers usually test only on Linux/gcc, so packages that build under Solaris with gcc often won't build under Solaris with the SunStudio compilers, because of differences in C dialects or compiler command-line options. Their gcc hack makes a bunch of Linux-centric packages suddenly build with SunStudio-quality optimization, so their T2000 runs them twice as fast. This was less important to them when they were fighting with IBM and HP over who could run Oracle fastest, but now that they're flogging the T2000 at the crappy-Linux-webapp LAMP market, making things that only build with gcc run fast becomes important.
I can imagine what the open-source zealot will say: ``A little speed isn't worth your Freedom. If the version of gcc that gives you Freedom runs too slowly on Sun CPU's, buy from their competitor.'' Yeah, you've got it all figured out, haven't you: a fast answer ready for everything. But this doesn't wash with me, not even on software-freedom ideological grounds, much less practical ones. I have two reasons.
First, the praxis of free software zealotry strengthens so-called ``natural'' monopolies. Back when we had to pay for Unixes and C compilers, we had Digital/SGI's MIPS, Digital's Alpha, IBM's POWER, Sun's SPARC, and HP's PA-RISC, and each company employed developers to create a performant C compiler and math library, because that compiler would be used to build Oracle and the other benchmarks by which shoppers compared their systems' speeds. In the free software world, gcc has world-class performance on i386, but nobody can be bothered to get it more than barely-working on any other CPU. ``Freedom'' means I have to buy my CPU from one specific company? I see this argument, but this Freedom... it doesn't feel like Freedom. To me it feels the same as being forced to use Windows. And I don't think it's a good thing for the industry that the rise of free software zealotry naturally promotes monopolistic hegemony among CPU companies.
Second, it's a case of the Atheros Problem. Free software zealots, myself included, get their panties in a bunch because Linux, FreeBSD, and NetBSD all expect you to use a bug-ridden binary ath_hal.o module released by this ass-pirate Sam Leffler if you want to use an Atheros wireless card. [update 2008-01-01: this is finally changing! I can hardly believe it. Wake me up when Reyk's HAL is in OpenWRT! and when 802.11a works, like, at all!] ``It's a binary driver!'' they scream. ``The old PRISM2 chip had a completely open-source driver. I know Atheros is a better chip, but at what cost? At the cost of your FREEDOM?! NO WAI! PRISM2 4 LYFEE!!'' Now, there's the problem, in that second part: PRISM2 isn't any more free than Atheros. If you look at the pieces of the driver in functional terms, the free-software bit of the Atheros driver does all the work of the corresponding entirely-free-software PRISM2 driver, plus more. The work done inside this offensive ath_hal.o is proprietary and closed-source on PRISM2 as well: the functions are inside Intersil's proprietary closed-source firmware, running on a CPU inside the wireless card. If we have so much Freedom with PRISM2, why don't we get the source code for THAT software? ``Because it's FIRMware, not software. The Code of Zealotry doesn't apply to FIRMware.'' What's the fucking difference? We both know there is a directory full of .c files and probably even a Linux toolchain for building that FIRMware, sitting at Intersil headquarters behind an IP wall. I'd be totally unsurprised even to discover that some access point customers sign NDA's and get access to that source, just like they do for Atheros.
The problem with Atheros over PRISM, according to OpenBSD, is that it's easier to exploit your system through an unauditable bug in ath_hal.o than through an unauditable bug in the PRISM2 firmware. This may be true given the current fashion and skillset in the penetration world, but I'm not sure how fundamentally true it is.
The problem with Atheros over PRISM, according to me, is that it's impossible to run the same version of ath_hal.o that the Windows people are using. With PRISM2, Intersil's card has to function under Windows using the same binary firmware that they provide to free software people---Intersil doesn't have much choice here. With Atheros, Leffler checks his bug-ridden beta versions in to all the free software projects and gets free testing for his employer Atheros, with minimal risk to the part of Atheros's reputation that they care about. Meanwhile, Atheros gives higher-quality, better-tested ath_hal.o's to their Windows driver team and to their customers making proprietary high-end access points.
The Intersil firmware<->driver API is also a little more stable, so that, even without source, one can choose which Intersil firmware to FLASH into one's wireless card, and pick a version that's stable. And actually it's not the most recent 1.8.x version that I preferred---it was 1.7.4. I don't think I can do that with ath_hal.o, though I suppose we could have tried harder if we were paying attention and not just letting Sam muck it up while sucking him off and telling him what a great guy he is for his ``contribution.'' I don't know why we have to act like beggars all the time. I gladly and frequently pay extra for hardware with good drivers---it's not like I don't contribute any money to this kind of software just because I value freedom.
Anyway, the gcc/Intel vs. SunStudio/SPARC problem is the same as the open-driver/PRISM2+firmware vs. ath_hal.o/Atheros problem. Intel has built a clever optimizing compiler for a secret machine language buried inside their modern i386 chips. This compiler translates from i386 opcodes into the secret machine language, performing optimizations along the way. Without the extreme cleverness of these proprietary optimizations, Intel's chips would perform half as fast, or worse. The compiler is burned into the chip. I argue that the i386 opcodes are analogous to the gcc MI state that Sun is dumping into /tmp files with their frankengcc.
The two differences I see between gcc/Intel and frankengcc/SPARC are: (1) with Sun, you get to download the compiler, while with Intel it's burned into your chip, and (2) with Sun, you get to see the real machine language---in fact you can even get free (as in Freedom) Verilog and VHDL source code for a chip that executes the real machine language performantly. With Intel, the true machine language that the chip runs internally is a trade secret---your OS hands off to the chip in i386 opcodes, which function as a crufty interpreted language, sort of like Java bytecode. That's an extra layer of proprietariness: you couldn't reimplement an open version of the proprietary piece even if you were able to do a good job of it (although, so far, it seems we aren't).
I'm not saying Sun deserves no ire for this sneaky bullshit they pulled with gcc. I'm just much more disappointed with gcc. If the gcc zealots really care about freedom, they should fucking know better than to blow all their effort on the crappy i386 target, where we will never be truly free, and better than to continue shipping these slapdash joke backends on all the other CPU architectures.
The typical Unix kernel book just translates /usr/include/sys/*.h into chatty English! In some places its /* */ comments are overtly duplicated or paraphrased.

A student, IMHO, needs mostly two things. First, a discussion of design motivations. Not, what are the pieces shaped like and how do they talk to each other, but why does this piece exist at all. Without presenting architecture along with its motivation, I think geeks who are too close for too long to big, arcane systems will accidentally leave out too much detail that the student needs. Information needs to be organized before one can cram it into a brain, and one cheap, easy way to accomplish this is to reorganize it into the same structure it was in when it got crammed into the inventors' brains. That is the motivation-based structure.
Second, he or she needs exercises. The student needs to write some code that's tested against reality. The exercises must be clever because at the beginning they must keep lots of stuff behind a curtain, then they must slowly lift curtains one-by-one until the student learns how to lift the remaining curtains herself. Most books make me want to strangle someone because they skip the second step. It's so frustrating to feel stupid.
I have some more petty grievances with documentation regressions, where things that used to be documented have missing or bitrotted documentation. Working systems need to be delivered with all the documentation that comes in the distribution tarball. In the case of NetPBM, it needs to be delivered with more documentation than comes in the distribution tarball, because the imbecile who took over maintenance of this ancient package just stopped shipping documentation one day. He no longer revision-controls his documentation at all! This dizzy-headed AOLeet poser just refers everyone to the web page, where he purports to document all versions of NetPBM throughout history simultaneously. He also documents extensively things he hasn't written yet, then says ``just kidding!'' after you learn how he means for the thing to someday work. You're choosing to read my rants right now (you can stop, you know). Imagine if I'd inflicted all this garbage on you when all you want to know is the command line arguments to pnmscale, because you already tried ImageMagick convert and it didn't work.
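For the record, the kind of thing one is trying to look up is about this simple (a sketch from memory, so check it against whatever manual you can actually find):

    $ pnmscale 0.5 photo.pnm > smaller.pnm           # scale both axes by 0.5
    $ pnmscale -xsize=800 photo.pnm > smaller.pnm    # scale to 800 pixels wide

That's it. That's what the shipped man page used to tell you.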
All the new X extension stuff seems to be missing documentation: XKB, FreeType, UTF-8. Even where the docs exist, the platform-dependent piece of X is not documented. They pretend all their docs are platform-neutral, but you can't remap your keyboard without understanding how USB keyboards work on $RANDOM_WEIRD_UNIX. In an old-style vendor Unix, the X server's keyboard documentation would include a brief essay on USB HIDs and references to other man pages for USB tuning and enumeration programs. These days, that's an absurd fantasy!
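The X-level half of the job still fits in a few commands, sketched below with the classic caps-to-control recipe. It's everything underneath (which keycode your USB keyboard's key turns into on $RANDOM_WEIRD_UNIX) that nobody will write down.

    $ xev                                   # press the key; read its keycode from the output
    $ xmodmap -e 'remove Lock = Caps_Lock'  # detach the Lock modifier first
    $ xmodmap -e 'keycode 66 = Control_L'   # 66 is whatever xev reported; it varies by platform
    $ xmodmap -e 'add Control = Control_L'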
The BIND and NTP documentation that used to ship with NetBSD has disappeared. I think BIND got added back after one formal release, but not NTP.
The most recent thing that showed up to annoy me was telnet.
    castrovalva:~$ telnet localhost
    Trying ::1...
    Connected to localhost.
    Escape character is '^]'.
    Trying SRA secure login:
    User (carton):
wtf is ``SRA secure login''? It turns out this is not a NetBSD omission. Rather, all this garbage was added to telnet decades ago, and no one updated the man page at all.
Christos pointed out that there's no need to get upset because I can just google for it. But this is telnet. It's old. Why is it changing at all, much less silently?
Of course, the man page already fails to describe any of the other telnetd AUTHENTICATION options, which from the source code seem to be: SPX, KRB5, KRB4, KRB4_ENCPWD, RSA_ENCPWD, SRA. What are all those? SPX as in IPX/SPX, Novell's TCP-ish protocol, over which remote console programs ran? I hope not. I knew about Kerberos, but three kinds of Kerberos? wait, wtf is SPX? what IS IT?! Many of them have an AUTH_HOW_MUTUAL and an AUTH_HOW_ONEWAY variant, which may have something to do with man-in-the-middle safety, but I have no idea. I thought I was safe by invoking telnet with '-s' and hoping it would only accept Kerberos or S/Key logins, but likely it was willing all along to do other weird KerberRSAENCPWDweirD logins that have who knows what man-in-the-middle problems. So, I don't know any more. I just don't think it's appropriate to say ``see Google for documentation,'' even though upon further inspection it's not a new problem and, taken in context, it's a fair response. It's just that telnet, over the years, has been accumulating this highly questionable garbage, and now there are five to ten authentication methods lurking inside it with zero documentation---the existence and enumeration of the methods isn't even mentioned outside the source code, much less a short description of the character and requirements of each.
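As it stands, the only way to enumerate these methods is to dig them out of the headers and sources yourself, something like the following (the paths are NetBSD-ish guesses; adjust for wherever your tree lives):

    $ grep AUTHTYPE /usr/include/arpa/telnet.h
    $ grep -r AUTH_HOW /usr/src/lib/libtelnet/

That a grep through /usr/include is the state of the documentation is exactly my point.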
Would I be crazy to ask for a runtime warning that I'm using one of them, a warning I could correlate with the manpage for the command I just typed? In this world, yes. Yes, that's now a crazy thing to ask.
The crisis isn't so much that these problems exist. It's that they've existed for several years, and they aren't getting fixed. As time passes, BSD and other free Unix accumulate more of these problems, like fish taking on mercury or birds accumulating DDT. The projects are turning into lurching jalopies. I see a future in which these systems all have to develop cross-build architectures because they aren't stable enough to self-host, like Mach/4.4Lites.
For architectures other than i386, that day is basically already here, but I'm saying pretty soon we're going to be cross-building BSD from under Linux or Windows, running it only inside emulators, and using it only in classrooms and computing museums. No decent filesystems, no self-hosting debugging ability, no ability to provide the API (threads! Javur!) needed by programs one can't live without, like Firefox, Postgres, OpenOffice---how long before our kernel can't even handle running gcc? It's already unable to handle gdb!
But, who knows. Maybe we just need three years to get scheduler activations working instead of two.