Tomas Bodzar asked about RAM usage with Hammer and deduplication, pointing at this example that shows ZFS requiring… I’m not sure. Lots? Anyway, Matthew Dillon noted that offline deduplication in Hammer would use available RAM/swap for CRCs on all files, but only a limited subset for ‘live’ dedup. For a real-world example, Venkatesh Srinivas described deduplicating about 600G down to 400G, with a machine having only 256M of RAM. Yes, only 256M.
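To make the offline-versus-live difference concrete, here is a toy model in Python. It is not Hammer's code: the function name `count_dedup_hits`, the `max_entries` parameter, and the evict-the-oldest policy are all assumptions made for illustration. It only shows why a pass that can remember every CRC finds more duplicates than one limited to a small table.

```python
import zlib
from collections import OrderedDict


def count_dedup_hits(blocks, max_entries=None):
    """Count blocks recognised as duplicates using a CRC table of limited size.

    max_entries=None models an offline pass that can remember every CRC;
    a small max_entries models a live pass that remembers only recent ones.
    (Hypothetical model, not Hammer's actual data structure.)
    """
    table = OrderedDict()  # CRC32 -> block contents, oldest entry first
    hits = 0
    for block in blocks:
        crc = zlib.crc32(block)
        # A CRC match alone is only a hint, so compare the bytes as well.
        if table.get(crc) == block:
            hits += 1
            table.move_to_end(crc)  # a matched entry counts as recently used
            continue
        table[crc] = block
        table.move_to_end(crc)
        if max_entries is not None and len(table) > max_entries:
            table.popitem(last=False)  # forget the oldest CRC
    return hits


if __name__ == "__main__":
    blocks = [b"a", b"b", b"c", b"a", b"b", b"c"]
    print("unbounded table:", count_dedup_hits(blocks))
    print("two-entry table:", count_dedup_hits(blocks, max_entries=2))
```

With the sample input, the unbounded table catches all three repeats, while the two-entry table has already forgotten each block by the time it comes around again.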
Enabling the vfs.hammer.double_buffer=1 sysctl will greatly improve Hammer performance when you’ve exceeded your memory cache (at a possible slight penalty when you have not) and also speed things up when using live deduplication.
Update: Venkatesh Srinivas says:
“double_buffer makes sense when: 1) you want all CRCs to be checked on reads. 2) you’re running live dedup and care about dedup performance rather than say read-heavy performance; 3) you have swapcache but are often running into the vnode limit in what you can cache.”
So, not always useful.
The default Hammer version in DragonFly is now version 5, which is the one that includes deduplication. Enjoy, bleeding-edge users! Otherwise, wait for the next release.
Version 6 is there, but don’t upgrade to it yet; there aren’t significant user-visible changes, and the usual disclaimers for new versions apply.
A Phoronix test of DragonFly’s Hammer filesystem turned up, via Siju George. It’s not really a benchmark as much as it is a speed test, and it’s not a realistic comparison, but it’s interesting to see numbers.
They need a graph that shows how much historical data can be recovered by each file system, or how long fsck takes after a crash.
Update: Matthew Dillon points out the many ways these tests are wrong.
Ilya Dryomov’s work on deduplication for Hammer has been committed to the tree in an early test form. I guess I need to pay up as part of the code bounty. If you’re wondering how much space it will save, but don’t want to try non-production code yet, there’s a ‘hammer dedup-simulate’ command that will estimate the saving ratio.
This is great news – deduplication is so valuable it adds an extra zero onto the price of any storage device that can do it.
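For a rough idea of what a "saving ratio" means, here is a minimal Python sketch that splits data into fixed-size blocks and divides the total block count by the number of distinct blocks. This is not how ‘hammer dedup-simulate’ is implemented; the 4096-byte block size and the function name `estimate_dedup_ratio` are assumptions for illustration only.

```python
import zlib

BLOCK_SIZE = 4096  # assumed block size for this toy model


def estimate_dedup_ratio(data: bytes, block_size: int = BLOCK_SIZE) -> float:
    """Return total blocks divided by unique blocks for a byte string."""
    seen = {}  # CRC32 -> list of distinct blocks sharing that CRC
    total = 0
    unique = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        total += 1
        candidates = seen.setdefault(zlib.crc32(block), [])
        # A CRC match is only a hint; compare bytes before calling it a duplicate.
        if block not in candidates:
            candidates.append(block)
            unique += 1
    return total / unique if unique else 1.0


if __name__ == "__main__":
    # Three identical blocks plus one different block: 4 total, 2 unique.
    sample = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
    print(estimate_dedup_ratio(sample))  # prints 2.0
```

A ratio of 2.0 here means the data would need half the blocks after deduplication; a ratio of 1.0 means nothing would be saved.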
A smaller set of links, but still the same volume of reading material.
- Samuel Greear linked to this lengthy writeup on how to have both the consistency of ACID and the scaling of NoSQL. Astute observers may notice the similarities between the plan described and the way HAMMER works.
- Joerg Sonnenberger pointed out to me, after my words on The BSD Show!, that MOSIX is an open source single-system-image implementation, though it appears to be designed for specialized high-speed networks rather than the more general case of DragonFly.
- This seems bizarre. (via)
Matthew Dillon posted a summary of recent bugfixes in HAMMER and kqueue, which means if you are running a bleeding-edge version of DragonFly built in the last few weeks, you should update.
He also mentions a “significant improvement in performance” in disk encryption. How significant? Over three times as fast.
Matthew Dillon reports that DragonFly now has a catastrophic recovery tool for HAMMER filesystems, with pertinent details.
Matthew Dillon has provided some details about recent kernel work, along with a release forecast.
You have probably seen reports declaring the demise of OpenSolaris by now, many taking a less than conservative approach in reporting the news one way or the other. So what do you make of the news? By all accounts, the source code (including future changes) for things such as ZFS will continue to be published under the CDDL. Will Oracle closing up development make it impossible for operating systems like FreeBSD to maintain ZFS without forking it? What do you think the ramifications will be for DragonFly’s HAMMER and DragonFly in general?
I’ve been NAS-shopping, and I’ve found that deduplication ability seems to add an extra zero on the end of a device’s price tag. It would be very nice for HAMMER.
I apologize; I’ve been missing. Here’s some misc links while I get back in gear:
- A very good reason to be interested in Hammer over ZFS: nobody will threaten lawsuits over Hammer.
- 10 tricks for admins. I’m posting it cause I can never remember that thing with tunneling ssh out. (via)
- This Gaming Life, as a free download. An excellent book that is in physical form on my shelf right now. Yes, unrelated.
If you have a Hammer filesystem, and you want to roll the entire thing back to a previous snapshot – all files, everywhere – it can be accomplished with one command.
A note, in part for my own benefit: the @reboot crontab entry is all you need to get a HAMMER mirror-stream going again after a reboot/shutdown.
Matthew Dillon went into detail on just how Hammer snapshots could be shared out via Samba.
Siju George is making a Hammer volume’s snapshots available through Samba, with the results that some Windows-using developers get historical snapshots for free.
Michael Neumann has fixed the ability to stream Hammer data between 32 and 64 bit systems. However, this is a change to 64-bit systems that requires them to match; make sure that you are not mixing 64-bit systems built before and after this commit on the 21st.
I can’t find the commit message in the mail archive, so I’ll quote it here:
Pulled from a larger conversation: a description of the settings for a HAMMER filesystem, and what they mean. I can tell from experience that extremely active disks will need extra cleanup time…
I can’t keep up with all the things to post. I desperately want to clear my inbox, so here’s a week’s worth of posts all smushed together. Enjoy!
- Naoya Sugioka’s tmpfs work is almost ready to go.
- Francois Tigeot is looking to find supported RAID hardware for DragonFly; the LSI1068e isn’t useable. Freddie Cash listed a number of different and fully supported cards, and Francois listed some other potential choices.
- While talking about hardware, Steve O’Hara-Smith reported excellent results with a particular Atom 330-based board and DragonFly.
- Stathis Kamperis has added to ‘hammer snapls’ output; an example is in his submit@ message.
- The 2.6 release of DragonFly, scheduled for March, will have version 4 of HAMMER. 2.4 has version 2. Upgrading from version 2 to 4 can happen in place, live, and only needs to happen once per volume, not per PFS. That’s about as easy as it gets. More details are available.
- The default sshd config has been updated; this shouldn’t affect your normal operations unless you’re using one of the mentioned options.
- Oliver Fromme linked to more discussion of SSD durability.
- Also, Matthew Dillon posted more notes and benchmark numbers for his swapcache work. There’s been some side benefits too. A man page for swapcache is now available.
- Aggelos Economopoulos’s libevtr has been added, for event tracing. He’s posted some additional notes on this work-in-progress.
- We now have /var/log/daemon, too.
- Notes on prepping for Google Summer of Code 2010 from the GSOC Discussion list; I don’t know if that link is readable for nonsubscribers.
- The Definitive Guide to PC-BSD is out at the end of this short month. Dru writes good books.
- Did you know FreeCiv (a Civilization clone, of sorts) is playable in a web browser? Goodbye free time! Details are available at my favoritest game site.
If you’re running DragonFly 2.5 and updated in the past week or so, and have UFS disks, there’s some instability introduced by Matthew Dillon’s recent work. It ought to be better by next week.
Users of Hammer, or of UFS only as /boot, don’t have anything to worry about.
That didn’t take long: Matthew Dillon has an update on his REDO work; he’s about halfway there. His summary includes instructions on how to test this new work, including ways to change how Hammer syncs to disk.
Thanks to Michael Neumann, it’s now possible to remove a drive from a Hammer volume. It’s experimental, so all the standard warnings apply.
This can’t be done on a root volume, for hopefully obvious reasons.
Did you know a Hammer volume can span multiple disks? And that you can add extra disks later on? There are no RAID-like features – it’s just a straight multiple-disk volume, but it works. The Hammer command to do it is now “hammer volume-add”.
Some of the ikiwiki configuration files on dragonflybsd.org were accidentally overwritten during a software upgrade. Normally this would mean some work to locate and replace them from backups, but since it was a Hammer volume, a quick look in /var/hammer/usr/… found them for me.
I want to point out what Hammer does, here. Restoring from backup isn’t new – it is in fact probably one of the most basic and necessary of system administration duties. However, Hammer makes it so easy that the incremental work of using it falls to almost nothing. There’s no extra preparation or syntax to learn for retrieval, which is wonderful. Hammer’s easy fix has helped me out several times now. Any other backup system would probably have recovered the files too, but I would have spent that time just restoring things back to normal.
Matthew Dillon has made version 4 of Hammer the default; the upgrade is a relatively painless ‘hammer version-upgrade’ command. This new version cuts out a chunk of the disk syncs needed, speeding up Hammer disk operations.
I like linkblogging, especially because there’s been a lot of good stuff floating about:
- Matthew Dillon detailed some of the problems he had using hardlinks to create backups – problems Hammer solves.
- The History of the Internet in a Nutshell: pretty good, though it says Unix “influenced” Linux and FreeBSD. Influenced is right for Linux, but there’s parts of the different BSDs that are from UNIX directly.
- From O’Reilly: The War for the Web. The walled garden that failed in the long run for Compuserve and AOL and so on is being resurrected. (via)
- Along the same lines: The Death of the URL.
Thomas Nikolajsen came up with some ideas for making the configuration files for a given Hammer volume accessible, even when that volume is being presented over NFS. He’s looking for more ideas.
If anyone wants a project, there’s apparently a small undo bug that I’ve encountered. It is a small fix in terms of changes, for any takers.
There’s a status report from Matthew Dillon about his work on version 4 of Hammer, including the always enjoyable stories of tests that involve yanking the SATA cable from the drive.
‘mike’ made this interesting csh script that allows autocompletion of Hammer sub-commands. e.g. type ‘hammer’ and then cycle through the available hammer commands as you would through file names.
This description of a Hammer bug makes for interesting reading, since it delves into the sequence of events where data is actually laid down on disk. Interesting reading for a geek, admittedly…
Version 3 of Hammer is now available in bleeding-edge DragonFly, though it’s still experimental. The biggest reason for this version bump is to move the /snapshots folder to /var for all Hammer filesystems. This means an accidental ‘rm -rf’ won’t destroy snapshots, as I’ve done. The saved data is still on the original partition, as just the metadata is saved to /var. More explication is available.
Jan Lentfer performed some Postgres benchmarks on DragonFly. It’s elaborate enough that it’s in the form of a PDF attached to the message I’ve linked. There’s some additional variations that haven’t been tried yet.
Vigorous file system activity seemed to lower performance in the long term on Hammer, which is certainly something to investigate. More testing please!
If you back up the pseudo-file-systems (PFS) on your Hammer volume to a non-Hammer disk, and then need to restore them to a new Hammer volume, and then realize you forgot to write down the shared-uuid, well, then, YONETANI Tomokazu has a patch for you. I haven’t seen this committed yet, but it appears valuable.
Matthew Dillon’s made some improvements to Hammer’s read and write processes. To quantify this, he’s tested Hammer and UFS with blogbench and written up the results. The tl;dr summary: UFS performs well until the system cache runs out, and then it halts. Hammer has some overhead from saving all history, but doesn’t stop working under a much heavier load.
Dear universe, including DragonFly people: stop doing so much stuff. It’s hard to keep up.
- Git in One Hour, an O’Reilly webcast. You need to register (free) and so on, but what the heck. O’Reilly doesn’t show crap.
- Poul-Henning Kamp is suing to recover the cost of Vista on his Lenovo laptop. (He’s installing FreeBSD.) I hope it comes out in his favor, though it will have little legal effect here in the U.S. (via)
- I didn’t realize this until I chimed in on the mailing lists, but one of the best books about file systems is freely available as a PDF.
- Another benefit of Hammer: you can’t run out of inodes, nor is it possible to have too many hardlinks.
- Some notes on pf usage in DragonFly. I know some parts have been mentioned before, but it’s good to sum up.
Matthew Dillon has a new version of Hammer, which speeds up listings from programs like ‘ls -la’ and ‘find’. This is only in 2.3.1.x code right now, so don’t force an upgrade via hammer version-upgrade if you’re still on DragonFly 2.2. His post includes some benchmarks.
On a side note: sili(4) tests look good.
Matthew Dillon’s made some small changes to Hammer; it should result in a small speedup when copying data.
I recently did a bulk build of pkgsrc on two similar machines; the only significant differences are extra CPU work on one system and Hammer snapshots on the other. However, they’re diverging in speed over time, which is interesting but not yet conclusive. Read my post about it for more details.
A good benchmarking project would be testing Hammer with snapshots on and with snapshots off.
Matthew Dillon is trying to track down a Hammer bug where directory entries (files, usually) are missed, whether it’s with ls or find or similar. Has this happened to you? It’s apparently very hard to duplicate, so please speak up if it has.
Hammer’s ‘undo’ now has the ability to index and automatically diff historical versions of files for you, thanks to a patch from Joel K. Pettersson. (He’s got more ideas, too.)
A bunch of links, cause that’s the easiest way to get this all out: