Joshua Coombs noted he would like to work on a new firewall strategy for DragonFly, and pasted in some of his notes. They are complex enough that it’s better to paste than to sum up. (I sure could use the multiple routes he talks about.)
Pasted material follows:
“Essentially, Cisco routers have a concept of more than one route for a given destination. I’d like to pull that logic into *BSD.
----- Be warned, ugly paste from notes -----
The routing table should be expanded to include the following:
A route table:
* One entry per route, consisting of a src & netmask, dst & netmask, next hop address, hops, metric, TTL, and state flag.
* Multiple entries for the same src & netmask/dst & netmask are allowed (i.e., you can have multiple ‘default routes’ defined).
* Interfaces generate route entries for their directly connected networks; alias entries on interfaces do not.
* When an interface is shut down, all route entries in the route cache and route table that use it are removed.
* For route table entries with a TTL other than -1 (infinite), a route table maintenance thread periodically decrements the TTL of entries with a dynamic state flag, and of any entries whose media has gone down. (A state flag of -1 indicates a static route; any other number indicates a dynamic route, and the number is the maximum TTL for the route.) When the TTL expires, the route is removed, along with any matching route cache table entries. If the state flag is set to permanent, the TTL is incremented on each pass while the media is up. This prevents flapping interfaces from thrashing the route table, and gives established routes a chance to continue should the interface stabilize.
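(Not part of Joshua’s notes: a minimal Python sketch of one pass of the maintenance thread described above. The names Route, maintenance_pass, and media_up are hypothetical. It assumes that "permanent" means the static state flag of -1, and it adds a static_ttl_cap so an incremented TTL cannot grow forever; the notes specify neither.)

```python
from dataclasses import dataclass

INFINITE = -1  # TTL value meaning "never expires"
STATIC = -1    # state flag value meaning "static route"


@dataclass
class Route:
    src: str       # source network, CIDR form
    dst: str       # destination network, CIDR form
    next_hop: str
    hops: int
    metric: int
    ttl: int       # INFINITE, or remaining life in maintenance passes
    state: int     # STATIC, or the maximum TTL of a dynamic route
    iface: str     # interface the route goes out of (assumed field)


def maintenance_pass(routes, media_up, static_ttl_cap=60):
    """Run one maintenance pass and return the routes that survive it.

    media_up maps interface name -> bool. static_ttl_cap is an assumption.
    """
    kept = []
    for r in routes:
        if r.ttl == INFINITE:
            kept.append(r)          # infinite routes are never aged
            continue
        up = media_up.get(r.iface, False)
        if r.state != STATIC or not up:
            r.ttl -= 1              # dynamic route, or media is down
        elif r.ttl < static_ttl_cap:
            r.ttl += 1              # static route recovering while media is up
        if r.ttl > 0:
            kept.append(r)          # expired routes are dropped
    return kept
```

A caller would also purge matching route cache entries for every route that this pass drops.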
A route cache table:
* Each entry contains a src & netmask, dst & netmask, next hop address, and TTL.
* This table is built dynamically by the routing thread as it operates, similar to a state table. This keeps traffic between two IPs from constantly shifting between interfaces.
* As the table fills, a separate route cache management thread decrements the TTL, eliminates entries when their TTL expires, and aggregates entries when the src & netmask entries are adjacent and share a dst & netmask. (Adjacency is based on CIDR rules for netblocks.)
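(Not part of Joshua’s notes: a rough sketch of the aggregation step, using Python’s standard ipaddress module to apply the CIDR adjacency rule. The function name aggregate and the tuple layout are hypothetical. Two things are assumed that the notes do not state: entries are merged only when they also share a next hop, and the merged entry keeps the largest TTL of its members.)

```python
import ipaddress
from collections import defaultdict


def aggregate(entries):
    """Merge cache entries whose source networks are CIDR-adjacent.

    entries: list of (src, dst, next_hop, ttl), with src and dst as CIDR
    strings of the same IP version.
    """
    groups = defaultdict(list)
    for src, dst, hop, ttl in entries:
        groups[(dst, hop)].append((ipaddress.ip_network(src), ttl))

    merged = []
    for (dst, hop), members in groups.items():
        # collapse_addresses joins networks only where CIDR alignment allows.
        for net in ipaddress.collapse_addresses(n for n, _ in members):
            ttl = max(t for n, t in members if n.subnet_of(net))
            merged.append((str(net), dst, hop, ttl))
    return merged
```

For example, 10.0.0.0/32 and 10.0.0.1/32 collapse into 10.0.0.0/31, while 10.0.0.1/32 and 10.0.0.2/32 stay separate because they do not fall on a common /31 boundary.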
Routing decisions are made through the following process:
1) Examine the route cache. If the src & netmask/dst & netmask pair is covered by one of its entries, route via the specified address and reset the TTL for the route cache entry.
2) Using the src & netmask/dst & netmask pair, a list of all applicable routes is built.
3) If there is only one applicable route, jump to step 10.
4) Perform a congestion check and eliminate any routes that fail. If all routes fail, they are considered equal and all remain candidates.
5) Compare scopes; the most specific route wins, and you move on to step 10.
6) Compare hops; the lowest wins, and you move on to step 10.
7) Compare metrics; the lowest wins, and you move on to step 10.
8) Compare TTLs; the one with the most life left wins, and you move on to step 10.
9) The first entry in the candidate route table wins.
10) Send the packet out the applicable route.
11) Add a route cache table entry for the source and destination, using /32 netmasks.
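(Not part of Joshua’s notes: the eleven steps above could be sketched in Python roughly as follows. Route, pick_route, the congested set, and cache_ttl are hypothetical names. The sketch assumes that a tie at one comparison falls through to the next, that "scope" means prefix length with the destination compared before the source, and that a TTL of -1 counts as the longest life.)

```python
import ipaddress
from collections import namedtuple

Route = namedtuple("Route", "src dst next_hop hops metric ttl iface")


def _net(cidr):
    return ipaddress.ip_network(cidr)


def _narrow(cands, key):
    """Keep only the candidates with the best (lowest) key value."""
    best = min(key(r) for r in cands)
    return [r for r in cands if key(r) == best]


# Steps 5-8, in order; each key is arranged so that lower is better.
RANKING = (
    lambda r: (-_net(r.dst).prefixlen, -_net(r.src).prefixlen),  # scope
    lambda r: r.hops,
    lambda r: r.metric,
    lambda r: -(float("inf") if r.ttl == -1 else r.ttl),         # most life
)


def pick_route(src_ip, dst_ip, table, cache, congested=frozenset(), cache_ttl=300):
    src, dst = ipaddress.ip_address(src_ip), ipaddress.ip_address(dst_ip)
    # Step 1: a cache hit short-circuits everything and refreshes the TTL.
    for entry in cache:
        if src in entry["src"] and dst in entry["dst"]:
            entry["ttl"] = cache_ttl
            return entry["next_hop"]
    # Step 2: collect every route that covers this src/dst pair.
    cands = [r for r in table if src in _net(r.src) and dst in _net(r.dst)]
    if not cands:
        return None
    # Step 4: drop congested routes, unless that would drop all of them.
    cands = [r for r in cands if r.iface not in congested] or cands
    # Steps 3 and 5-8: a single candidate passes through unchanged.
    for key in RANKING:
        cands = _narrow(cands, key)
    winner = cands[0]  # step 9: first remaining entry wins
    # Step 11: remember the decision with /32 masks on both sides.
    cache.append({"src": _net(src_ip), "dst": _net(dst_ip),
                  "next_hop": winner.next_hop, "ttl": cache_ttl})
    return winner.next_hop  # step 10: caller sends the packet this way
```

Because the cache is consulted before the congestion check, an established src/dst pair keeps its next hop even after its interface becomes congested.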
The congestion check is pretty straightforward. If an interface has a bandwidth and a threshold set, the current bandwidth over a 5 min window is checked against the threshold value. If it matches or exceeds the threshold, the congestion check fails for that interface. In a multihomed environment, say a cable modem on one Ethernet adapter and DSL on another, this allows for two default routes, one out each. New sessions will be established based on the normal rules, pushing all sessions out one interface until it fails the congestion check. (Without BGP or the like, hops, metrics, and TTL will be defaults unless specifically tweaked by hand.) New routes will then go over the second interface until it too fails the congestion check, at which point the normal rules apply again. Because of the route cache, routes already established over a given interface persist even when that interface fails a congestion check, allowing, say, an FTP session to continue over a link that is suddenly congested, since it never has to do a congestion check.
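(Not part of Joshua’s notes: one way the congestion check might look, as a Python sketch. InterfaceMeter and its methods are hypothetical, and the threshold is assumed to be an absolute rate in bits per second, averaged over the whole 5 minute window; the notes do not say whether it is absolute or a fraction of the configured bandwidth.)

```python
from collections import deque

WINDOW = 300  # seconds; the 5 minute window from the notes


class InterfaceMeter:
    """Tracks bytes sent on one interface and answers the congestion check."""

    def __init__(self, threshold_bps=None):
        self.threshold_bps = threshold_bps  # None means no threshold is set
        self.samples = deque()              # (timestamp, byte count) pairs

    def record(self, now, nbytes):
        self.samples.append((now, nbytes))

    def passes(self, now):
        if self.threshold_bps is None:
            return True                     # nothing configured, never fails
        # Discard samples that have aged out of the window.
        while self.samples and self.samples[0][0] <= now - WINDOW:
            self.samples.popleft()
        rate_bps = sum(n for _, n in self.samples) * 8 / WINDOW
        # "Matches or exceeds" fails, so only a strictly lower rate passes.
        return rate_bps < self.threshold_bps
```

The routing thread would call passes() only for new src/dst pairs, since cached pairs skip the check entirely.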
The hops, metric, and TTL values in the route table can be set by OSPF, BGP, RIP, and similar daemons, allowing routing to make more intelligent decisions. In an environment where traditional routing exchange protocols are not an option, it would be feasible to have a userland daemon watch the route cache table, do a ping or traceroute to the destination, and add an updated route to the route table with a user-customized TTL. That would let one get the best routes from two or more interfaces without running traditional routing exchange protocols with each provider.
This system introduces A LOT of memory and CPU bloat for routing into the system, but I think the gains outweigh the loss, especially when considering the typical specs of current systems.”