Hi Brian, thanks for your talk. Honestly I think it was more
interesting to me than Ian's for what I'm doing right now. :) I am
actually very interested in how Sun plans to make money, both as a
general business-strategy thing and as an
I-hope-Sun-does-well-so-I-have-some-choices thing, and I expect a big
part of this will be selling access to certain branches of a
complicated revision tree. This brings in revision control, which is
also an interesting topic to me. But since the Indiana project is
just starting and the audience was uninitiated, I guess the talk was
necessarily short on technical details.
Anyway, I didn't really take Nexenta seriously before because I
thought it took away the best parts of both Linux and Solaris. From
Solaris it lost the careful integration, release engineering, i18n,
and support offering, replaced with ``at least what the Nexenta people
have used themselves does work, and for everything else it's `open
source' so fix it yourself instead of complaining.'' From Linux it
lost the real freedom of having almost all the source available. With
Solaris(/Nexenta) only a very tiny amount is available: huge chunks of
the base system, the toolchain, and even the kernel are missing to an
extent that no Linux zealot would _dream_ of tolerating, and new
binary-only drivers are constantly committed even for hardware that
Sun sells themselves, while people still go around saying things like
``didn't even know Solaris was open-source now.'' But the demos kind
of shocked me by making Nexenta look polished, practical, and
attractive, and I was thinking ``maybe I wouldn't feel so far behind
if I were using this,'' so I think it was a good idea to show them and
talk about them.
I've been thinking about the differences among a bunch of different
package systems:
* BSD systems which are all pkgsrc-like. Within these there are some
weird things going on like:
* NetBSD's ``bulk builds''
* NetBSD's 'pkgviews' (this one is potentially important to Solaris
because it can actually solve the ``glibc version'' problem.
Someone in the back mentioned this example as a more insidious
incarnation of the IMHO-mostly-solved multiple-JRE-versions
problem Mark brought up.)
* FreeBSD's 'portupgrade' (I don't fully understand this tool yet.
I don't know if it's Gentoo-ish or NetBSD-ish.)
* Gentoo, which differs from BSD in a few interesting ways, like
* the lack of a ``base'' system
* the weaker way build dependencies are treated which makes
bootstrapping possible without a base system and makes small
library upgrades very fast, but which also makes it possible to
``corrupt'' your system and creates all these checking ``etools''
like 'revdep-rebuild'.
* the cleaner-than-BSD way that multiple installable revisions of
the same package are handled. You still aren't allowed to
install two versions of the same package the way pkgviews lets
you, but in Gentoo at least you can _find_ the version you want
to install more easily, and the KEYWORDS mask (stable 'x86' vs
unstable '~x86') automates this in a way that BSD can't.
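The KEYWORDS idea can be sketched in a few lines. This is a toy model, not Portage's actual resolver: the `Ebuild` class, the `pick_version` helper, and the version data are all made up for illustration, and versions are compared as plain tuples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ebuild:
    """Hypothetical stand-in for one installable revision of a package."""
    version: tuple        # e.g. (1, 2, 9)
    keywords: frozenset   # e.g. {"x86"} (stable) or {"~x86"} (unstable)


def pick_version(ebuilds, arch, accept_unstable=False):
    """Return the newest ebuild visible for `arch`, or None.

    A stable user only accepts the bare keyword ("x86"); a user who
    opts in to testing also accepts the tilde form ("~x86").
    """
    accepted = {arch}
    if accept_unstable:
        accepted.add("~" + arch)
    visible = [e for e in ebuilds if e.keywords & accepted]
    return max(visible, key=lambda e: e.version, default=None)


# Made-up data: 1.2.8 is stable on x86, 1.2.9 is still in testing.
libpng = [
    Ebuild((1, 2, 8), frozenset({"x86"})),
    Ebuild((1, 2, 9), frozenset({"~x86"})),
]
```

With this data, `pick_version(libpng, "x86")` settles on 1.2.8, and passing `accept_unstable=True` moves the same user to 1.2.9 without anyone hunting for the right revision by hand.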
* The Linux binary package systems
* RedHat's lesson of charging more to let you keep _older_
packages, and fixing the security bugs in them. Cheaper/free
package systems follow the same revision tree as their upstream
projects, which means you have to upgrade to an actual new
version to fix security bugs more often than a CentOS user or
RedHat contract-holder does.
* all the horrible grief Linux binary systems get from letting you
or forcing you to download non-revision-controlled packages
directly from vendors outside the distribution project.
* The difference between source dependencies and binary
dependencies. The Linux binary distributions constantly have to
stress and wring their hands over the
shared-library-version-problem. When should we increment the
library's version number and force a rebuild/upgrade of all the
dependent packages? When should we simply release a new version
of the shared-library binary package, but leave the version
number marked on the .so inside unchanged so you can keep using
all your old dependent packages?
BSD has this library version number problem only for the base
system. And it's only on the base system that they make binary
ABI commitments on formal releases. In the package system,
shared library version numbers aren't really used at all unless
someone's carefully set up two packages by hand to permit two
copies of the same library to be installed at once. They're not
used for dependencies. In general the version is just
libfoo.so.0.0, and the package dependencies enforce that
libraries are only installed on the same system with compatible
packages (``compatible'' meaning packages that were built from
source against that exact library). Make _any_ change to the
shared library, and you have to rebuild all the dependent
packages. This means they only care about source dependencies.
Suppose upgrading from libpng-1.2.8 to libpng-1.2.9 requires
rebuilding Mozilla because the ABI of libpng (some .h files) has
changed, but no changes to the Mozilla source code are required.
So PNG 1.2.8 and 1.2.9 are source-compatible, but not
binary-compatible.
        |         dpkg/rpm          |          pkgsrc
        | old         | new         | old         | new
--------+-------------+-------------+-------------+------------
library | 1.2.8-1     | 1.2.9-0     | 1.2.8-nb1   | 1.2.9
package |             |             |             |
--------+-------------+-------------+-------------+------------
soname  | libpng.so.2 | libpng.so.3 | libpng.so.0 | libpng.so.0
--------+-------------+-------------+-------------+------------
Mozilla | 2.0-1       | 2.0-2       | 2.0-nb1     | 2.0-nb1
package |             |             |             |
Because some .h files changed, both Linux and BSD need to rebuild
Mozilla. There's no way around that. But in BSD, the dependency
of Mozilla on libpng, in the _installed_ system although not in
the installable pkgsrc tree, is tracked abstractly, as one object
on another object, not by version number. If you touch libpng at
all, whether you change libpng's version number or not, pkgsrc
will want to rebuild Mozilla. The ``new'' Mozilla 2.0-nb1 on the
disk, and the binary package you'd make from it, isn't the same
as the old one. Yet the two have the same version number because
nothing has changed in the source package, nothing in the Mozilla
code nor the build instructions for Mozilla itself.
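A rough sketch of that ``one object on another object'' tracking, in Python. This is not how pkgsrc is implemented; the `DEPENDS_ON` table and the `rebuild_set` helper are hypothetical, and the only point is that version numbers never enter into the decision.

```python
from collections import deque

# Hypothetical installed system: package -> packages it was built against.
DEPENDS_ON = {
    "mozilla": ["libpng", "libpango"],
    "libpango": ["glib"],
    "libpng": [],
    "glib": [],
}


def rebuild_set(depends_on, touched):
    """Return every installed package that depends, directly or
    indirectly, on `touched`. Version numbers are never consulted."""
    # Invert the edges: library -> packages built against it.
    dependents = {pkg: [] for pkg in depends_on}
    for pkg, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(pkg)

    # Breadth-first walk up the dependency tree from the touched package.
    to_rebuild, queue = set(), deque([touched])
    while queue:
        current = queue.popleft()
        for pkg in dependents.get(current, []):
            if pkg not in to_rebuild:
                to_rebuild.add(pkg)
                queue.append(pkg)
    return to_rebuild
```

Touching `"libpng"` here marks `"mozilla"` for rebuild whether or not libpng's version string changed, and touching `"glib"` drags in both `"libpango"` and `"mozilla"`.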
Binary dependencies aren't tracked at all. And while you can
have ``binary'' packages in pkgsrc, the package version number
marked on the binary package doesn't uniquely identify it like it
does in Linux binary systems. A binary package in pkgsrc is
identified by a tuple of { the version of the NetBSD base system
used to build the package, the date/CVS branch of the /usr/pkgsrc
tree used to build the package, the package name and version }.
In practice people get a little sloppy and slightly Linuxy with
this, and the package system doesn't actually record the CVS
date/branch inside the binary package (maybe it should) but if
you want an absolutely guaranteed-to-work system you have to
install one big bag of binary packages all at once, all built
from source at the same time. The pkgsrc guys run continuous
``bulk builds'' to produce these bags of consistent binary
packages. The alternative is, don't use binary packages---use
/usr/pkgsrc---then you can upgrade individual things and rebuild
dependencies as needed.
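To make the tuple concrete, here is a small sketch. The `BinaryPackageId` type and the `consistent_bag` check are my own illustration, not anything pkgsrc provides, and the release and branch strings are just example values.

```python
from typing import NamedTuple


class BinaryPackageId(NamedTuple):
    """Hypothetical identity of a pkgsrc binary package."""
    base_system: str    # NetBSD release the package was built on
    pkgsrc_tree: str    # date or CVS branch of /usr/pkgsrc used for the build
    name_version: str   # package name and version


def consistent_bag(packages):
    """True if every package was built on the same base system from
    the same pkgsrc tree, i.e. it is one ``bag'' of binary packages."""
    origins = {(p.base_system, p.pkgsrc_tree) for p in packages}
    return len(origins) <= 1


# Example values only.
moz_q2 = BinaryPackageId("NetBSD 4.0", "pkgsrc-2007Q2", "mozilla-2.0nb1")
moz_q1 = BinaryPackageId("NetBSD 4.0", "pkgsrc-2007Q1", "mozilla-2.0nb1")
png_q2 = BinaryPackageId("NetBSD 4.0", "pkgsrc-2007Q2", "libpng-1.2.9")
```

`moz_q2` and `moz_q1` carry the same name and version but compare unequal, and mixing `moz_q1` with `png_q2` fails the `consistent_bag` check, which is the sloppy, slightly Linuxy situation described above.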
This is actually a bit paradoxical, because on one hand it
reduces the amount of regression testing you need to do. On
Linux, the burden of one of these libpng.so.2 -> libpng.so.3
changes, having to release, download, and install all those
Mozilla 2.0-1 -> 2.0-2 binary packages, is enormous, so as much
as possible they will change libpng but keep the same soname
libpng.so.2 in both the old and new packages, so Mozilla doesn't
need to be rebuilt.
        |         dpkg/rpm          |          pkgsrc
        | old         | new         | old         | new
--------+-------------+-------------+-------------+------------
library | 1.2.9-0     | 1.3.0-0     | 1.2.9       | 1.3.0
package |             |             |             |
--------+-------------+-------------+-------------+------------
soname  | libpng.so.3 | libpng.so.3 | libpng.so.0 | libpng.so.0
--------+-------------+-------------+-------------+------------
Mozilla | 2.0-2 *     | 2.0-2 *     | 2.0-nb1 +   | 2.0-nb1 +
package |             |             |             |
* these two are the same mozilla.bin
+ although the package version number is the same, mozilla is
rebuilt between these two. mozilla.bin may have a different
checksum.
Sometimes the Linux guys will make a mistake, and give
incompatible versions of a library the same soname. This can be
a disaster, because it causes problems for only a small subset of
customers that you can identify only by the exact version numbers
of every single package on their system. It's hard to detect and
hard to reproduce: a really nasty bug in your release. And the
instructions to customers on how to avoid the bug can be
complicated: ``don't use this version of libpng! but it's ok to
keep using it if you have this version of this, and that version
of that, and ...'' So I like the BSD way better.
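The soname decision, and the mistake that makes it dangerous, can be sketched like this. It is a deliberately crude model: the ABI is just a dict from exported symbol name to a signature string, the `soname_verdict` helper and the `foo_*` symbols are invented, and real ABI compatibility also involves struct layouts and other things this ignores.

```python
def soname_verdict(old_soname, new_soname, old_abi, new_abi):
    """Classify a library upgrade under a toy ABI model.

    `old_abi` and `new_abi` map exported symbol names to a signature
    string. The new library is treated as compatible if every old
    symbol still exists with an unchanged signature.
    """
    compatible = all(new_abi.get(sym) == sig for sym, sig in old_abi.items())
    same_soname = old_soname == new_soname
    if same_soname and compatible:
        return "keep"           # dependents keep working, no rebuild
    if same_soname and not compatible:
        return "silent-break"   # the nasty bug: old binaries load the new lib
    if compatible:
        return "needless-bump"  # safe, but forces rebuilds for nothing
    return "bump"               # correct: dependents must be rebuilt


# Made-up library interface for illustration.
OLD = {"foo_open": "int(const char*)", "foo_read": "int(int, void*, int)"}
NEW_ADDED = dict(OLD, foo_close="int(int)")            # only adds a symbol
NEW_CHANGED = dict(OLD, foo_read="int(int, void*, long)")  # changes one
```

Under this model `soname_verdict("libfoo.so.2", "libfoo.so.2", OLD, NEW_CHANGED)` is the ``silent-break'' case, which is the one that only bites the subset of customers with the wrong combination of packages.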
On the other hand, if you do not use binary packages and build
from source using /usr/pkgsrc, the packages high up in the
dependency tree will tend to have binaries with a different
checksum on each customer's system. One guy rebuilt libpng for a
security flaw, but kept the old libpango. Another guy used a
later version of /usr/pkgsrc, so he got new libpng, and also new
libpango which was upgraded for some non-security-related thing.
Both have the exact same revision Mozilla package, but different
Mozilla binaries. A package with lots of dependencies could have
hundreds of different mozilla.bin's, all supposedly correct and
working, but if there is a bug somewhere, again, it's hard to track
down. The official BSD answer to this is, I think, ``If you
don't like that, (1) use our quarterly stable branches like
pkgsrc 2007Q2, and (2) always upgrade everything when you upgrade
anything.''
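Here is a small sketch of why the same package revision ends up as many different binaries. The `build_fingerprint` function is hypothetical and only a stand-in for the real checksum of mozilla.bin: it hashes the package version together with the exact versions of what it was built against. The libpango version strings are made up.

```python
import hashlib


def build_fingerprint(package, dep_versions):
    """Hash a package revision plus the versions of its build-time
    dependencies. A stand-in for the checksum of the resulting binary."""
    parts = [package] + [
        f"{name}-{ver}" for name, ver in sorted(dep_versions.items())
    ]
    return hashlib.sha256("\n".join(parts).encode()).hexdigest()


# Two customers, same Mozilla package revision, different library mix.
guy_one = build_fingerprint(
    "mozilla-2.0nb1", {"libpng": "1.2.9", "libpango": "1.16.0"}
)
guy_two = build_fingerprint(
    "mozilla-2.0nb1", {"libpng": "1.2.9", "libpango": "1.18.0"}
)
```

`guy_one` and `guy_two` differ even though both would report mozilla-2.0nb1, and the number of possible fingerprints multiplies with every dependency. Building everything from one quarterly branch pins `dep_versions`, which collapses them back to a single value.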
I like the BSD package system. I think eliminating classes of
bugs is Good, and I think rebuilding and downloading things is
relatively cheap now. But if you have no base system, as with
Gentoo, it's really not good, because it's hard or impossible for
closed-source vendors to release consistently-working software
that depends on libraries inside something like pkgsrc or Gentoo.
I think open-source and closed-source software each work better
with a fundamentally different package system architecture, and I
kind of like the status quo where I get a consistent-ABI platform
from Sun, and then I add on all this open source stuff with
pkgsrc. However, I'm not sure this is going to fly with the way
Solaris sysadmins like to do security upgrades. And I'm sure it
won't fly with Sun's evangelical push to ``be more like Linux,''
which makes me sad, the way GNOME trying to be more like Windows
makes me sad.
* There was a ``package'' filesystem in QNX Neutrino, which they have
abandoned I think. I believe it was a sort of dual-booting thing,
where you could have multiple simultaneous collections, each of
hundreds of packages, some packages shared between two collections,
and a mesh of dependencies. Then, you can choose at boot time, ``I
want Package Set A'', and the system will make things appear as
though only files mentioned in Set A, not Set B, are installed. I
don't know how it was used.
I could imagine it being an alternative to FLASHing the ROM of an
embedded system. High-end systems usually have ``Bank A'' and
``Bank B'', or they have a FLASH filesystem like Cisco where each
boot image is a single big file, and a ROM monitor lets you choose
among them. With this ``package filesystem,'' the vendor could say
``install this tiny patch package on your system'', and if it
didn't work, you'd be able to roll back the install with the
Package Filesystem.
This sounds eerily like the ``package management service'' Ian
hinted at, but it could be completely different. I suppose I
should read about it instead of speculating.
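Since I am speculating anyway, here is a toy model of what I imagine such a package filesystem doing. None of this is taken from QNX: the package names, the `PACKAGE_SETS` table, and the rule that later packages in a set shadow earlier ones are all my assumptions.

```python
# Hypothetical data: which files each package provides.
PACKAGE_FILES = {
    "base-1.0": {"/bin/sh", "/lib/libc.so"},
    "app-1.0": {"/bin/app"},
    "app-1.0-patch1": {"/bin/app"},   # tiny patch that replaces /bin/app
}

# Two collections sharing most of their packages.
PACKAGE_SETS = {
    "A": ["base-1.0", "app-1.0"],
    "B": ["base-1.0", "app-1.0", "app-1.0-patch1"],
}


def visible_files(set_name):
    """Map each visible path to the package providing it.

    Only packages in the chosen set contribute files; later packages
    in the set shadow earlier ones (an assumption of this sketch).
    """
    view = {}
    for pkg in PACKAGE_SETS[set_name]:
        for path in PACKAGE_FILES[pkg]:
            view[path] = pkg
    return view
```

Booting with set B shows the patched `/bin/app`, and rolling back is just booting with set A again, since nothing was overwritten on disk.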
Ian focused on the weirdest things, like whether the tool downloads
packages for you, or you do it by hand. Who gives a shit? Package
systems are really complicated and interesting. I'd like to see what
Sun comes up with, though I think I will miss the freedom and the
readily-available easy-to-tweak source code I get with pkgsrc.