Lazy Reading for 2020/06/14

Wide topic range, again.

Lazy Reading for 2015/10/18

Accidental topic this week: very, very old computers.

Your comics link of the week: Cartozia Tales #1, with more added.  I subscribed to this series long ago, and it’s a lot of fun.

Lazy Reading for 2014/11/23

Lots to read this week.

Your unrelated link of the week: Snowpocalypse 2014.  I grew up there and now live not too far away.  That’s really not that much snow for the area; it’s just that it fell so quickly.

Google Code-In 2013 and Summer of Code 2014 announced

Google has a post up about the 10th anniversary of Summer of Code, with some changes coming in next year's version of the event: an increase in the number of student slots, a larger student stipend, and more events.  I'm planning to apply for DragonFly for 2014.

Google is also running Code-In again, for 13-to-17-year-old students.  DragonFly participated in the first year (the only BSD to do so), but sat out last year.  I'm not currently anticipating DragonFly being involved for 2013.  (It's a lot of work!)

Trying out deduplication

I moved to DragonFly 2.10 over the past few days and tried out deduplication to see what kind of results I would get.  The procedure is outlined below; I'm using /home as the example, just to reduce the amount of text pasted in.
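The usage line below looks like df output in 1K blocks, judging by the ~966GB total on a 1TB disk; if you want to reproduce it, the command would be something like this (my assumption, not the original invocation):

# df /home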

/pfs/@@-1:00004     966000640 566434576 399566064    59%    /home

Move my various Hammer pseudo-file systems to version 5, which supports deduplication.

# hammer version-upgrade /home 5
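If you're not sure what version a given filesystem is already at, hammer(8) can report it; a quick check, using /home again:

# hammer version /home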

Issue a deduplication simulate command to see what it estimates the savings will be:

# hammer dedup-simulate /home
Dedup-simulate /home: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 4
Dedup-simulate /home succeeded
Simulated dedup ratio = 1.22

That ratio turned out to be pretty accurate for the actual deduplication; judging from the output below, it's referenced space divided by allocated space.  I didn't time the run, unfortunately, and I don't know whether the time taken is proportional to the amount of duplicated data or to the total volume of data, though I suspect the latter.
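If you do want the timing, prefixing the run with time(1) should be all it takes; a sketch:

# time hammer dedup /home

Here's the actual run: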

# hammer dedup /home
Dedup /home: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 4
Dedup /home succeeded
Dedup ratio = 1.22
462 GB referenced
378 GB allocated
14 MB skipped
6869 CRC collisions
0 SHA collisions
0 bigblock underflows
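As a sanity check, those referenced/allocated numbers suggest the ratio really is just referenced space over allocated space; bc agrees:

# echo "scale=2; 462 / 378" | bc
1.22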

The end result?

/pfs/@@-1:00004     966000640 505887504 460113136    52%    /home

That data space is shared across all the file systems on the disk, and it's a 1TB disk, so the savings come to about 7% of capacity, or roughly 70GB.  I was hoping for more, but I don't have any obviously duplicated data (no local mail store, no on-disk backups), so perhaps this is normal.  70GB I didn't have before is no bad thing, though.

Incidentally, I was able to upgrade my installed software from pkgsrc-2009Q4 to pkgsrc-2011Q1 entirely with pkg_radd -u <pkgname>.  It was remarkably quick and painless, though pkgin may have been able to do it even faster, since it pulls from the same place.
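For the record, each upgrade was a single command per package; a sketch, with "firefox" standing in for whatever package you're updating:

# pkg_radd -u firefox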

Hammer and the future

Matthew Dillon’s been thinking about Hammer, and how to implement clustering well enough to work as a sort of RAID replacement.  He’s written up a document describing his plans.  Some highlights:

  • writable history snapshots
  • quotas and accounting
  • live rebuilds of data from mirrors
  • and the same history, mirroring, and snapshots as before (see the snapshot example below).
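For reference on that last point, snapshots in today's Hammer are already a one-line operation; a sketch using the snap directive from hammer(8) (assuming a recent version that has it), with /home as the example:

# hammer snap /home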

It’s going to be a while before this “Hammer 2” becomes a finished product, though, so don’t count on it for the next release.

Google Code-In, and sysctls too

Ed Smith was thinking of working on sysctl documentation, but as it turns out, much of it has already been done via Google Code-In; Samuel Greear recently committed the results.  (There's more sysctl work possible, though.)

While on that topic, Samuel Greear also posted a lengthy summary of all the Code-In work done so far.  We need more code-related tasks!  The existing ones have been so popular that they're all getting done quickly.