
Created by three guys who love BSD, we cover the latest news and have an extensive series of tutorials, as well as interviews with various people from all areas of the BSD community. The show also serves as a platform for support and questions. We love and advocate FreeBSD, OpenBSD, NetBSD, DragonFlyBSD and TrueOS. Our show aims to be helpful and informative for new users who want to learn about them, but still be entertaining for the people who are already pros. The show airs on Wednesdays at 2:00PM (US Eastern time) and the edited version is usually up the following day.
Similar Podcasts

Elixir Outlaws
Elixir Outlaws is an informal discussion about interesting things happening in Elixir. Our goal is to capture the spirit of a conference hallway discussion in a podcast.

The Cynical Developer
A UK-based Technology and Software Developer Podcast that helps you to improve your development knowledge and career, through explaining the latest and greatest in development technology and providing you with what you need to succeed as a developer.

Programming Throwdown
Programming Throwdown educates Computer Scientists and Software Engineers on a cavalcade of programming and tech topics. Every show will cover a new programming language, so listeners will be able to speak intelligently about any programming language.
224: The Bus Factor
We try to answer what happens to an open source project after a developer's death, we tell you about the last bootstrapped tech company in Silicon Valley, we have an update to the NetBSD Thread Sanitizer, and we show how to use cabal on OpenBSD.

Headlines

Life after death, for code (https://www.wired.com/story/giving-open-source-projects-life-after-a-developers-death/)

You've probably never heard of the late Jim Weirich or his software. But you've almost certainly used apps built on his work. Weirich helped create several key tools for Ruby, the popular programming language used to write the code for sites like Hulu, Kickstarter, Twitter, and countless others. His code was open source, meaning that anyone could use it and modify it. "He was a seminal member of the western world's Ruby community," says Justin Searls, a Ruby developer and co-founder of the software company Test Double. When Weirich died in 2014, Searls noticed that no one was maintaining one of Weirich's software-testing tools. That meant there would be no one to approve changes if other developers submitted bug fixes, security patches, or other improvements. Any tests that relied on the tool would eventually fail, as the code became outdated and incompatible with newer tech.

The incident highlights a growing concern in the open-source software community: what happens to code after its programmer passes away? Much has been written about what happens to social-media accounts after users die, but it has been less of an issue among programmers. In part, that's because most companies and governments have relied on commercial software maintained by teams of people. But today, more programs rely on obscure but crucial software like Weirich's. Some open-source projects are well known, such as the Linux operating system or Google's artificial-intelligence framework TensorFlow. But each of these projects depends on smaller libraries of open-source code. And those libraries depend on other libraries. The result is a complex, but largely hidden, web of software dependencies.

That can create big problems, as in 2014 when a security vulnerability known as "Heartbleed" was found in OpenSSL, an open-source program used by nearly every website that processes credit- or debit-card payments. The software comes bundled with most versions of Linux, but was maintained by a small team of volunteers who didn't have the time or resources to do extensive security audits. Shortly after the Heartbleed fiasco, a security issue was discovered in another common open-source application called Bash that left countless web servers and other devices vulnerable to attack. There are surely more undiscovered vulnerabilities. Libraries.io, a group that analyzes connections between software projects, has identified more than 2,400 open-source libraries that are used in at least 1,000 other programs but have received little attention from the open-source community.

Security problems are only one part of the issue. If software libraries aren't kept up to date, they may stop working with newer software. That means an application that depends on an outdated library may not work after a user updates other software. When a developer dies or abandons a project, everyone who depends on that software can be affected. Last year when programmer Azer Koçulu deleted a tiny library called Leftpad from the internet, it created ripple effects that reportedly caused headaches at Facebook, Netflix, and elsewhere.
The Bus Factor

The fewer people with ownership of a piece of software, the greater the risk that it could be orphaned. Developers even have a morbid name for this: the bus factor, meaning the number of people who would have to be hit by a bus before there's no one left to maintain the project. Libraries.io has identified about 3,000 open-source libraries that are used in many other programs but have only a handful of contributors. Orphaned projects are a risk of using open-source software, though commercial software makers can leave users in a similar bind when they stop supporting or updating older programs.

In some cases, motivated programmers adopt orphaned open-source code. That's what Searls did with one of Weirich's projects. Weirich's most popular projects had co-managers by the time of his death, but Searls noticed that one, the testing tool Rspec-Given, hadn't been handed off, and he wanted to take responsibility for updating it. He ran into a few snags along the way. Rspec-Given's code was hosted on the popular code-hosting and collaboration site GitHub, home to 67 million codebases. Weirich's Rspec-Given page on GitHub was the main place for people to report bugs or to volunteer to help improve the code. But GitHub wouldn't give Searls control of the page, because Weirich had not named him before he died. So Searls had to create a new copy of the code and host it elsewhere. He also had to convince the operators of Ruby Gems, a "package-management system" for distributing code, to use his version of Rspec-Given instead of Weirich's, so that all users would have access to Searls' changes. GitHub declined to discuss its policies around transferring control of projects.

That solved the potential problems with Rspec-Given, but it opened Searls' eyes to the many things that could go wrong. "It's easy to see open source as a purely technical phenomenon," Searls says. "But once something takes off and is depended on by hundreds of other people, it becomes a social phenomenon as well." The maintainers of most package-management systems have at least an ad-hoc process for transferring control over a library, but that process usually depends on someone noticing that a project has been orphaned and then volunteering to adopt it. "We don't have an official policy mostly because it hasn't come up all that often," says Evan Phoenix of the Ruby Gems project. "We do have an adviser council that is used to decide these types of things case by case." Some package managers now monitor their libraries and flag widely used projects that haven't been updated in a long time. Neil Bowers, who helps maintain a package manager for the programming language Perl, says he sometimes seeks out volunteers to take over orphan projects. Bowers says his group vets claims that a project has been abandoned, as well as the people proposing to take it over.

A 'Dead-Man's Switch'

Taking over Rspec-Given inspired Searls, who was only 30 at the time, to make a will and a succession plan for his own open-source projects. There are other things developers can do to help future-proof their work. They can, for example, transfer the copyrights to a foundation, such as the Apache Foundation. But many open-source projects essentially start as hobbies, so programmers may not think to transfer ownership until it is too late.
Searls suggests that GitHub and package managers such as Gems could add something like a "dead man's switch" to their platforms, which would allow programmers to automatically transfer ownership of a project or an account to someone else if the creator doesn't log in or make changes after a set period of time. But a transition plan means more than just giving people access to the code. Michael Droettboom, who took over a popular mathematics library called Matplotlib after its creator John Hunter died in 2012, points out that successors also need to understand the code. "Sometimes there are parts of the code that only one person understands," he says. "The knowledge exists only in one person's head." That means getting people involved in a project early, ideally as soon as it is used by people other than the original developer. That has another advantage, Searls points out: distributing the work of maintaining a project helps prevent developer burnout.

The Last Bootstrapped Tech Company In Silicon Valley (https://www.forbes.com/sites/forbestechcouncil/2017/12/12/the-last-bootstrapped-tech-company-in-silicon-valley/2/#4d53d50f1e4d)

My business partner, Matt Olander, and I were intimately familiar with the ups and downs of the Silicon Valley tech industry when we acquired the remnants of our then-employer BSDi's enterprise computer business in 2002 and assumed the roles of CEO and CTO. Fast-forward to today, and we still work in the same buildings where BSDi started in 1996, though you'd hardly recognize them now. As the business grew from a startup to a global brand, our success came from always ensuring we ran a profitable business. While that may sound obvious, keep in mind that we are in the heart of Silicon Valley, where venture capitalists hunt for the unicorn company that will skyrocket to a billion-dollar valuation. Unicorns like Facebook and Twitter unquestionably exist, but they are the exception.

Live By The VC, Die By The VC

After careful consideration, Matt and I decided to bootstrap our company rather than seek funding. The first dot-com bubble had recently burst, and we were seeing close friends lose their jobs right and left at VC-funded companies based on dubious business plans. While we did not have much cash on hand, we did have a customer base, and we treasured those customers as our greatest asset. We concluded that meeting their needs was the surest path to meeting ours, and that the rest would simply be details to address individually. This strategy ended up working so well that we have many of the same customers to this day. After deciding to bootstrap, we made a decision on a matter that has left egg on the face of many of our competitors: we seated sales next to support under one roof at our manufacturing facility in Silicon Valley. Dell's decision to outsource some of its support overseas in the early 2000s was the greatest gift it could have given us. Some of our sales and support staff have worked with the same clients for over a decade, and we concluded that no amount of funding could buy that mutual loyalty. While accepting venture capital or an acquisition may make you rich, it does not guarantee that your customers, employees or even your business will be taken care of. Our motto is, "Treat your customers like friends and employees like family," and we have an incredibly low employee turnover to show for it.
Thanks to these principles, iXsystems has remained employee-owned, debt-free and profitable from the day we took it over -- all without VC funding, which is why we call ourselves the "last bootstrapped tech company in Silicon Valley." As a result, we now provide enterprise servers to thousands of customers, including top Fortune 500 companies, research and educational institutions, all branches of the military, and numerous government entities. Over time, however, we realized that we were selling more and more third-party data storage systems with every order. We saw this as a new opportunity. We had partnered with several storage vendors to meet our customers' needs, but every time we did, we opened a can of worms with regard to supporting our customers to our standards. Given a choice between risking being dragged down by our partners or outmaneuvered by competitors with their own storage portfolios, we made a conscious decision to develop a line of storage products that would not only complement our enterprise servers but tightly integrate with them. To accelerate this effort, we adopted the FreeNAS open-source software-defined storage project in 2009 and haven't looked back. The move enabled us to focus on storage, fully leveraging our experience with enterprise hardware and our open source heritage in equal measure. We saw many storage startups appear every quarter, struggling to establish their niche in a sea of competitors. We wondered how they'd instantly master hardware to avoid the partnering mistakes that we had made years earlier, given that storage hardware and software are truly inseparable at the enterprise level. We entered the storage market with the required hardware expertise, capacity and, most importantly, revenue, allowing us to develop our storage line at our own pace.

Grow Up, But On Your Own Terms

By not having the external pressure from VCs or shareholders that your competitors have, you're free to set your own priorities and charge fair prices for your products. Our customers consistently tell us how refreshing our sales and marketing approaches are. We consider honesty, transparency and responsible marketing the only viable strategy when you're bootstrapped. Your reputation with your customers and vendors should mean everything to you, and we can honestly say that the loyalty we have developed is priceless. So how can your startup venture down a similar path? Here's our advice for playing the long game:

- Relate your experiences to each fad: Our industry is a firehose of fads and buzzwords, and it can be difficult to distinguish the genuine trends from the flops. Analyze every new buzzword in terms of your own products, services and experiences, and monitor customer trends even more carefully. Some buzzwords will even formalize things you have been doing for years.
- Value personal relationships: Companies come and go, but you will maintain many clients and colleagues for decades, regardless of the hat they currently wear. Encourage relationship building at every level of your company, because you may encounter someone again.
- Trust your instincts and your colleagues: No contractual terms or credit rating system can beat the instincts you will develop over time for judging the ability of individuals and companies to deliver. You know your business, employees and customers best.

Looking back, I don't think I'd change a thing.
We need to be in Silicon Valley for the prime customers, vendors and talent, and it's a point of pride that our customers recognize how different we are from the norm. Free of a venture capital "runway" and driven by these principles, we look forward to the next 20 years in this highly competitive industry.

Creating an AS for fun and profit (http://blog.thelifeofkenneth.com/2017/11/creating-autonomous-system-for-fun-and.html)

At its core, the Internet is an interconnected fabric of separate networks. Each network which makes up the Internet is operated independently and only interconnects with other networks in clearly defined places. For smaller networks like your home, the interaction between your network and the rest of the Internet is usually pretty simple: you buy an Internet service plan from an ISP (Internet Service Provider), they give you some kind of hand-off through something like a DSL or cable modem, and they give you access to "the entire Internet". Your router (which is likely also a WiFi access point and Ethernet switch) then only needs to know about two things: your local computers and devices are on one side, and the ENTIRE Internet is on the other side of the network link given to you by your ISP. For most people, that's the extent of what needs to be understood about how the Internet works. Pick the best ISP, buy a connection from them, and attach computers needing access to the Internet.

And that's fine, as long as you're happy with only having one Internet connection from one vendor, who will lend you some arbitrary IP address(es) for the extent of your service agreement. But that starts not being good enough when you don't want to be beholden to a single ISP or a single connection for your connectivity to the Internet. It also isn't good enough if you are an Internet Service Provider, because then you are literally a part of the Internet: you can't assume that the entire Internet is in one direction when half of the Internet is actually in the other direction. This is when you really have to start thinking about and treating the Internet as a very large mesh of independently connected organizations instead of an abstract cloud icon on the edge of your local network map.

Which is pretty much never, for most of us. Almost no one needs to consider the Internet at this level. The long flight of steps from DSL for your apartment up to being an integral part of the Internet means that, pretty much regardless of what level of Internet service you need for your projects, you can probably pay someone else to provide it, and you don't need to sit down and learn how BGP works and what an Autonomous System is. But let's ignore that for one second, and talk about how to become your own ISP. To become your own Internet Service Provider with customers who pay you to access the Internet, or your own web hosting provider with customers who pay you to be accessible from the Internet, or your own transit provider with customers who pay you to move their customers' packets to other people's customers, you need a few things (the third of which is sketched in the config below):

- Your own public IP address space, allocated to you by an Internet numbering organization
- Your own Autonomous System Number (ASN), to identify your network as separate from everyone else's networks
- At least one router connected to a different autonomous system, speaking the Border Gateway Protocol, to tell the rest of the Internet that your address space is accessible from your autonomous system
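On OpenBSD that last requirement can be as small as one config file, since bgpd(8) ships in the base system. As a rough sketch (the ASN, prefixes and neighbor address here are documentation/example values, not the author's real ones), announcing your space to a single upstream might look like this:

```
# /etc/bgpd.conf -- hypothetical minimal OpenBGPD announcement
AS 64496                      # your assigned ASN (example value)
router-id 192.0.2.1

network 203.0.113.0/24        # your IPv4 prefix
network 2001:db8::/32         # your IPv6 prefix

neighbor 192.0.2.254 {        # the upstream's router
	remote-as 64511       # the upstream's ASN (example value)
	descr "transit upstream"
}
```

The author's actual setup used a Cisco Sup720-BXL holding the full ~700k-route table; any BGP speaker that can carry the full table and announce your two routes fills the same role.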
So... I recently set up my own autonomous system... and I don't really have a fantastic justification for it... My motivation was twofold. First, one of my friends and I sat down and figured out that splitting the cost of a rack in Hurricane Electric's FMT2 data center marginally lowered our monthly hosting expenses vs. all the paid services we were using scattered across the Internet, which could all be condensed into this one rack. This first reason on its own is a perfectly valid justification for paying for co-location space at a data center like Hurricane Electric's, but isn't actually a valid reason for running it as an autonomous system, because Hurricane Electric will gladly let you use their address space for your servers hosted in their building. That's usually part of the deal when you pay for space in a data center: power, cooling, Internet connectivity, and your own IP addresses. Second, another one of my friends challenged me to do it as an Autonomous System.

So admittedly, my justification for going through the additional trouble to set up this single rack of servers as an AS is a little more tenuous. I will readily admit that, more than anything else, this was a "hold my beer" sort of engineering moment, and not something that was at all needed to achieve what we actually needed (a rack to park all our servers in). But what the hell; I've figured out how to do it, so I figured it would make an entertaining blog post. So here's how I set up a multi-homed autonomous system on a shoestring budget:

Step 1. Found a Company
Step 2. Get Yourself Public Address Space
Step 3. Find Yourself Multiple Other Autonomous Systems to Peer With
Step 4. Apply for an Autonomous System Number
Step 5. Source a Router Capable of Handling the Entire Internet Routing Table
Step 6. Turn it All On and Pray

And we're off to the races. At this point, Hurricane Electric is feeding us all ~700k routes for the Internet, we're feeding them our two routes for our local IPv4 and IPv6 subnets, and all that's left to do is order all our cross-connects to other ASes in the building willing to peer with us (mostly for fun) and load in all our servers to build our own personal corner of the Internet. The only major goof so far has been accidentally feeding the full IPv6 table to the first other peer that we turned on, but thankfully he has a much more powerful supervisor than the Sup720-BXL, so he just sent me an email telling me to knock that off; a little fiddling with my BGP egress policies, and we were all set. In the end, setting up my own autonomous system wasn't exactly simple, and it was definitely not justified, but sometimes in life you just need to take the more difficult path. And there's a certain amount of pride in being able to claim that I'm part of the actual Internet. That's pretty neat. And of course, thanks to all of my friends who variously contributed parts, pieces, resources, and know-how to this ongoing project. I had to pull in a lot of favors to pull this off, and I appreciate it.

News Roundup

One year checkpoint and Thread Sanitizer update (https://blog.netbsd.org/tnf/entry/one_year_checkpoint_and_thread)

The past year started with bugfixes and the development of regression tests for ptrace(2) and related kernel features, as well as the continuation of bringing LLDB support and LLVM sanitizers (ASan + UBSan and partial TSan + MSan) to NetBSD.
My plan for the next year is to finish implementing TSan and MSan support, followed by a long run of bug fixes for LLDB, ptrace(2), and other related kernel subsystems.

TSan

In the past month, I've developed Thread Sanitizer far enough to have a subset of its tests pass on NetBSD, starting with breakage related to the memory layout of processes. The reason for this breakage was narrowed down to the current implementation of ASLR, which was too aggressive and didn't allow enough space to be mapped for shadow memory. The fix was to force the disabling of ASLR, either per-process or globally on the system. The same will certainly be needed for MSan executables. After some other corrections, I got TSan to work for the first time ever on October 14th. This was a big achievement, so I've made a snapshot available. Getting the execution below under GDB was pure chance:

```
$ gdb ./a.out
GNU gdb (GDB) 7.12
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64--netbsd".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./a.out...done.
(gdb) r
Starting program: /public/llvm-build/a.out
[New LWP 2]

WARNING: ThreadSanitizer: data race (pid=1621)
  Write of size 4 at 0x000001475d70 by thread T1:
    #0 Thread1 /public/llvm-build/tsan.c:4:10 (a.out+0x46bf71)

  Previous write of size 4 at 0x000001475d70 by main thread:
    #0 main /public/llvm-build/tsan.c:10:10 (a.out+0x46bfe6)

  Location is global 'Global' of size 4 at 0x000001475d70 (a.out+0x000001475d70)

  Thread T1 (tid=2, running) created by main thread at:
    #0 pthread_create /public/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:930:3 (a.out+0x412120)
    #1 main /public/llvm-build/tsan.c:9:3 (a.out+0x46bfd1)

SUMMARY: ThreadSanitizer: data race /public/llvm-build/tsan.c:4:10 in Thread1

Thread 2 received signal SIGSEGV, Segmentation fault.
```

I was able to get the above execution results around 10% of the time (being under a tracer had no positive effect on the frequency of successful executions).
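For context, the tsan.c being run there is the classic two-thread race. The line numbers in the report (writes at tsan.c:4 and tsan.c:10, pthread_create at tsan.c:9) suggest something very close to the following sketch, though this is a reconstruction rather than Kamil's exact file:

```c
/* tsan.c -- minimal data race; build with a TSan-enabled clang:
 *   clang -fsanitize=thread -g tsan.c -o a.out
 */
#include <pthread.h>

int Global;                   /* the "global 'Global' of size 4" */

void *Thread1(void *arg) {
    Global = 42;              /* write #1, from thread T1 */
    return arg;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, Thread1, NULL);
    Global = 43;              /* write #2, from the main thread */
    pthread_join(t, NULL);
    return 0;
}
```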
With another set of bugfixes and improvements, I've managed to hit the following final results for this month:

```
check-tsan:
  Expected Passes    : 248
  Expected Failures  : 1
  Unsupported Tests  : 83
  Unexpected Failures: 44
```

At the end of the month, TSan can now reliably execute the same (already-working) program every time. The majority of failures are in tests verifying sanitization of correct mutex locking usage. There are still problems with NetBSD-specific libc and libpthread bootstrap code that conflicts with TSan. Certain functions (pthread_create(3), pthread_key_create(3), __cxa_atexit()) cannot be started early by TSan initialization, and must be deferred late enough for the sanitizer to work correctly.

MSan

I've prepared scratch support for MSan on NetBSD to help research how far along it is. I've also cloned and adapted the existing FreeBSD bits; however, the code still needs more work and isn't functional yet. The number of passing tests (5) is negligible, and it most likely does not work at all. The conclusion after this research is that TSan shall be finished first, as it touches similar code. In the future, there will likely be another round of iterating over the system structs and types and adding the missing ones for NetBSD. So far, this part has been done before executing the real MSan code. I've added one missing symbol that was detected when attempting to link a test program with MSan.

Sanitizers

The GCC team has merged the LLVM sanitizer code, which has resulted in almost-complete support for ASan and UBSan on NetBSD. It can be found in the latest GCC 8 snapshot, located in pkgsrc-wip/gcc8snapshot. Do note, though, that there is an issue with getting backtraces from libasan.so, which can be worked around by backtracing ASan events in a debugger. UBSan also passes all GCC regression tests and appears to work fine. The code enabling sanitizers on the GCC/NetBSD frontend will be submitted upstream once the backtracing issue is fixed and I'm satisfied that there are no other problems. I've managed to upstream a large portion of the generic+TSan+MSan code to compiler-rt and reduce the local patches to only the ones that are in progress. This deals with any rebasing issues, and allows me to focus on just the delta that is being worked on. I've tried out the LLDB builds which have TSan/NetBSD enabled, and they built and started fine. However, there were some false positives related to the mutex locking/unlocking code.

Plans for the next milestone

The general goals are to finish TSan and MSan and switch back to LLDB debugging. I plan to verify the impact of the TSan bootstrap initialization on the observed crashes and research the remaining failures. This work was sponsored by The NetBSD Foundation. The NetBSD Foundation is a non-profit organization and welcomes any donations that help it continue funding projects and services for the open-source community. Please consider visiting the following URL and chipping in what you can:

The scourge of systemd (https://blog.ungleich.ch/en-us/cms/blog/2017/12/10/the-importance-of-devuan/)

While this article is actually couched in terms of promoting Devuan, a de-systemd-ed version of Debian, it would seem the same logic could be applied to all of the BSDs.

Let's say every car manufacturer recently discovered a new technology named "doord", which lets you open up car doors much faster than before. It only takes 0.05 seconds, instead of 1.2 seconds on average. So every time you open a door, you are much, much faster! Many of the manufacturers decide to implement doord, because the company providing doord makes it clear that it is beneficial for everyone. And in addition to opening doors faster, it also standardises things. How do you turn on your car? It is the same everywhere now; it is not necessary to look for the keyhole anymore. Unfortunately though, sometimes doord does not stop the engine. Or if it is cold outside, it stops the ignition process, because it takes too long. Doord also changes the way your navigation system works, because that is totally related to opening doors, but this leads to some users being unable to navigate, which is accepted as collateral damage. In the end, you at least have faster door opening and a standard way to turn on the car. Oh, and if you are in a traffic jam and have to restart the engine often, it will stop restarting it after several times, because that's not what you are supposed to do. You can open the engine hood and tune that setting, but it will be reset once you buy a new car.
Some of you might now ask yourselves, "Is systemd THAT bad?". And my answer is: no, it is even worse. The systemd developers split the community over a tiny detail that decreases stability significantly and increases complexity, for not much real value. And this is not theoretical: we tried to build Data Center Light on Debian and Ubuntu, but servers that don't boot, servers that don't reboot, and systemd-resolved constantly interfering with our core network configuration made it too expensive to run Debian or Ubuntu. Yes, you read that right: too expensive. While I am writing here in flowery words, the reason to use Devuan is hard calculated costs. We are a small team at ungleich and we simply don't have the time to fix problems caused by systemd on a daily basis. This is even without calculating the security risks that come with systemd.

Using cabal on OpenBSD (https://deftly.net/posts/2017-10-12-using-cabal-on-openbsd.html)

Since W^X became mandatory in OpenBSD (https://undeadly.org/cgi?action=article&sid=20160527203200), W^X'd binaries are only allowed to be executed from designated locations (mount points). If you used the auto partition layout during install, your /usr/local/ will be mounted with wxallowed. For example, here is the entry for my current machine:

```
/dev/sd2g on /usr/local type ffs (local, nodev, wxallowed, softdep)
```

This is a great feature, but if you build applications outside of the wxallowed partition, you are going to run into some issues, especially in the case of cabal (and python as well). Here is an example of what you would see when attempting to do cabal install pandoc:

```
qbit@slip[1]:~? cabal update
Config file path source is default config file.
Config file /home/qbit/.cabal/config not found.
Writing default configuration to /home/qbit/.cabal/config
Downloading the latest package list from hackage.haskell.org
qbit@slip[0]:~? cabal install pandoc
Resolving dependencies...
.....
cabal: user error (Error: some packages failed to install:
JuicyPixels-3.2.8.3 failed during the configure step. The exception was:
/home/qbit/.cabal/setup-exe-cache/setup-Simple-Cabal-1.22.5.0-x86_64-openbsd-ghc-7.10.3: runProcess: runInteractiveProcess: exec: permission denied (Permission denied)
```

The error isn't actually what it says; the untrained eye would assume a permissions issue. A quick check of dmesg reveals what is really happening:

```
/home/qbit/.cabal/setup-exe-cache/setup-Simple-Cabal-1.22.5.0-x86_64-openbsd-ghc-7.10.3(22924): W^X binary outside wxallowed mountpoint
```

OpenBSD is killing the above binary because it is violating W^X and hasn't been safely kept in its /usr/local corral! We could solve this problem quickly by marking our /home as wxallowed; however, this would be heavy-handed and reckless (we don't want to allow other potentially unsafe binaries to execute... just the cabal stuff). Instead, we will build all our cabal stuff in /usr/local by using a symlink!

```
doas mkdir -p /usr/local/{cabal,cabal/build} # make our cabal and build dirs
doas chown -R user:wheel /usr/local/cabal    # set perms
rm -rf ~/.cabal                              # kill the old non-working cabal
ln -s /usr/local/cabal ~/.cabal              # link it!
```

We are almost there! Some cabal packages build outside of ~/.cabal:

```
cabal install hakyll
.....
Building foundation-0.0.14...
Preprocessing library foundation-0.0.14...
hsc2hs: dist/build/Foundation/System/Bindings/Posix_hsc_make: runProcess: runInteractiveProcess: exec: permission denied (Permission denied)
Downloading time-locale-compat-0.1.1.3...
.....
```
Fortunately, all of the packages I have come across that do this respect the TMPDIR environment variable!

```
alias cabal='env TMPDIR=/usr/local/cabal/build/ cabal'
```

With this alias, you should be able to cabal without issue (so far pandoc, shellcheck and hakyll have all built fine)!

TL;DR

```
# This assumes /usr/local/ is mounted as wxallowed.
doas mkdir -p /usr/local/{cabal,cabal/build}
doas chown -R user:wheel /usr/local/cabal
rm -rf ~/.cabal
ln -s /usr/local/cabal ~/.cabal
alias cabal='env TMPDIR=/usr/local/cabal/build/ cabal'
cabal install pandoc
```

FreeBSD and APRS, or "hm what happens when none of this is well documented.." (https://adrianchadd.blogspot.co.uk/2017/10/freebsd-and-aprs-or-hm-what-happens.html)

Here's another point along my quest for amateur radio on FreeBSD - bringing up basic APRS support. Yes, someone else has done the work, but in the normal open source way it was .. inconsistently documented. First is figuring out the hardware platform. I chose the following:

- A Baofeng UV5R2, since they're cheap, plentiful, and do both VHF and UHF
- A cable to do sound level conversion and isolation (and yes, I really should post a circuit diagram and picture..)
- A USB sound device, primarily so I can whack it into FreeBSD/Linux devices to get a separate sound card for doing radio work
- A FreeBSD laptop (it'll become a raspberry pi + GPS + sensor + LCD thingy later, but this'll do to start with)

The Baofeng is easy - set it to the right frequency (VHF APRS sits on 144.390MHz), turn on VOX so I don't have to make up a PTT cable, done/done. The PTT bit isn't that hard - one of the microphone jack pins is actually PTT (if you ground it, it engages PTT), so when you make the cable just ensure you expose a ground pin and PTT pin so you can upgrade it later. The cable itself isn't that hard either - I had a Baofeng handmic lying around (they're like $5) so I pulled it apart for the cable. I'll try to remember to take pictures of that. Here's a picture I found on the internet that shows the pinout: image (https://3.bp.blogspot.com/-58HUyt-9SUw/Wdz6uMauWlI/AAAAAAAAVz8/e7OrnRzN3908UYGUIRI1EBYJ5UcnO0qRgCLcBGAs/s1600/aprs-cable.png)

Now, I went a bit further. I bought a bunch of 600 ohm isolation transformers for audio work, and wired it up as follows: from the audio output of the USB sound card, I wired up a little attenuator - input is 2k to ground, then 10k to the input side of the transformer; the output side of the transformer has a 0.01uF greencap capacitor to the microphone input of the Baofeng. From the Baofeng I just wired it up to the transformer, then the output side of that went into a 0.01uF greencap capacitor in series to the microphone input of the sound card. In both instances those capacitors are there as DC blockers.

Ok, so that bit is easy. Then on to the software side. The normal way people do this stuff is "direwolf" on Linux. So, "pkg install direwolf" installed it. That was easy. Configuring it was a bit less easy. I found this guide to be helpful (https://andrewmemory.wordpress.com/tag/direwolf/). FreeBSD has the example direwolf config in /usr/local/share/doc/direwolf/examples/direwolf.conf. Now, direwolf will run as a normal user (there's no rc.d script for it yet!) and by default runs out of the current directory. So:

```
$ cd ~
$ cp /usr/local/share/doc/direwolf/examples/direwolf.conf .
$ (edit it)
$ direwolf
```

Editing it isn't that hard - you need to change your callsign and the audio device.
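For reference, those two settings are single lines in direwolf.conf; a sketch with placeholder values (N0CALL is not a real callsign, and the device node will vary per machine):

```
# direwolf.conf -- the two lines you actually need to touch
MYCALL N0CALL-1      # your amateur radio callsign plus SSID
ADEVICE /dev/dsp3    # the USB sound card, as a FreeBSD OSS device node
```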
OK, here is the main undocumented bit for FreeBSD - the sound device can just be /dev/dsp. It isn't an ALSA name! Don't waste time trying to use ALSA names; instead, just find the device you want and reference it. For me the USB sound card shows up as /dev/dsp3 (which is very non-specific, as USB sound devices come and go, but that's a later problem!) but it's enough to bring it up. So yes, following the above guide, using the right sound device name resulted in a working APRS modem. Next up - something to talk to it. This is called 'xastir'. It's .. well, when you run it, you'll find exactly how old an X application it is. It's very nostalgically old. But it is enough to get APRS positioning up and test both the TCP/IP side of APRS and the actual radio side. Here's the guide I followed: (https://andrewmemory.wordpress.com/2015/03/22/setting-up-direwolfxastir-on-a-raspberry-pi/). So, that was it! So far so good. It actually works well enough to decode and watch APRS traffic around me. I managed to get position information out to the APRS network over both TCP/IP and relayed via VHF radio.

Beastie Bits

- Zebras All the Way Down - Bryan Cantrill (https://www.youtube.com/watch?v=fE2KDzZaxvE)
- Your impact on FreeBSD (https://www.freebsdfoundation.org/blog/your-impact-on-freebsd/)
- The Secret to a good Gui (https://bsdmag.org/secret-good-gui/)
- containerd hits v1.0.0 (https://github.com/containerd/containerd/releases/tag/v1.0.0)
- FreeBSD 11.1 Custom Kernels Made Easy - Configuring And Installing A Custom Kernel (https://www.youtube.com/watch?v=lzdg_2bUh9Y&t=)
- Debugging (https://pbs.twimg.com/media/DQgCNq6UEAEqa1W.jpg:large)

***

Feedback/Questions

- Bostjan - Backup Tapes (http://dpaste.com/22ZVJ12#wrap)
- Philipp - A long time ago, there was a script (http://dpaste.com/13E8RGR#wrap)
- Adam - ZFS Pool Monitoring (http://dpaste.com/3BQXXPM#wrap)
- Damian - KnoxBug (http://dpaste.com/0ZZVM4R#wrap)

***
223: Compile once, debug twice
Picking a compiler for debuggability, how to port Rust apps to FreeBSD, what the point of Docker is on FreeBSD/Solaris, another EuroBSDcon recap, and network manager control in OpenBSD.

Headlines

Compile once, Debug twice: Picking a compiler for debuggability, part 1 of 3 (https://backtrace.io/blog/compile-once-debug-twice-picking-a-compiler-for-debuggability-1of3/)

An interesting look into why, when you try to debug a crash, you can often find that all of the useful information has been 'optimized out'. Have you ever had an assert get triggered, only to result in a useless core dump with missing variable information or an invalid callstack? Common factors that go into selecting a C or C++ compiler are availability, correctness, compilation speed and application performance. A factor that is often neglected is debug information quality, which symbolic debuggers use to reconcile application executable state with the source-code form that is familiar to most software engineers. When production builds of an application fail, the level of access to program state directly impacts the ability of a software engineer to investigate and fix a bug. If a compiler has optimized out a variable, or is unable to express to a symbolic debugger how to reconstruct the value of a variable, the engineer's investigation process is significantly impacted. Either the engineer has to attempt to recreate the problem, iterate through speculative fixes, or attempt to perform prohibitively expensive debugging, such as reconstructing program state through executable code analysis. Debug information quality is in fact not proportionally related to the quality of the generated executable code, and it varies wildly from compiler to compiler. Different compilers emit debug information at varying levels of quality and accuracy. However, certain optimizations will certainly impact any debugger's ability to generate accurate stack traces or extract variable values.

In the blog's example program, the value of argv is extracted and then the program is paused. The ck_pr_load_ptr function performs a read from the region of memory pointed to by argv, in a manner that prevents the compiler from performing optimization on it. This ensures that the memory access occurs, and for this reason the value of argv must be accessible by the time ck_pr_load_ptr is executed. When compiled with gcc, the debugger fails to find the value of the variable: the compiler determines that the value of argv is no longer needed after the ck_pr_load_ptr operation, and so doesn't bother paying the cost of saving the value.
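A hypothetical reconstruction of that example (ck_pr_load_ptr comes from the Concurrency Kit library; the pause mechanism here is my own stand-in, not necessarily the blog's):

```c
/* Build with: gcc -O2 -g example.c
 * Breaking at raise() and printing argv may yield "<optimized out>",
 * because gcc considers argv dead once the forced load has happened.
 */
#include <ck_pr.h>    /* Concurrency Kit: ck_pr_load_ptr() */
#include <signal.h>

int main(int argc, char **argv) {
    (void)argc;
    (void)ck_pr_load_ptr(&argv); /* forced read the compiler cannot elide */
    raise(SIGSTOP);              /* pause so a debugger can inspect argv */
    return 0;
}
```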
Some optimizations generate executable code whose call stack cannot be sufficiently disambiguated to reconcile a call stack that mirrors that of the source program. Two common culprits for this are tail call optimization and basic block commoning. In another example, if the program receives a first argument of 1, then function is called with the argument "a"; if the program receives a first argument of 2, then function is called with the argument "b". However, if we compile this program with clang, the stack traces in both cases are identical! clang informs the debugger that the function f invoked the function("b") branch where x = 2, even if x = 1.

Though some optimizations will certainly impact the accuracy of a symbolic debugger, some compilers simply lack the ability to generate debug information in the presence of certain optimizations. One common optimization is induction variable elimination. A variable that's incremented or decremented by a constant on every iteration of a loop, or derived from another variable that follows this pattern, is an induction variable. Coupled with other optimizations, the compiler is then able to generate code that doesn't actually rely on a dedicated counter variable "i" for maintaining the current offset into "buffer". As you can see, i is completely optimized out. The compiler determines it doesn't have to pay the cost of maintaining the induction variable i; it maintains the pointer in the register %rdi instead. The code is effectively rewritten: the for loop changes into a while loop, with a condition on reaching the end of the input.
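A sketch of what that transformation looks like (my own illustration of the pattern the post describes, not its exact listing):

```c
#include <stddef.h>

/* Before: an index-based loop with an induction variable i. */
void fill_before(char *buffer, size_t n, char c) {
    for (size_t i = 0; i < n; i++)
        buffer[i] = c;
}

/* After induction variable elimination, the compiler effectively keeps
 * only a moving pointer (held in a register such as %rdi) and compares
 * it against the end of the input; "i" no longer exists anywhere for a
 * debugger to print. */
void fill_after(char *buffer, size_t n, char c) {
    char *end = buffer + n;
    while (buffer < end)
        *buffer++ = c;
}
```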
We have shown some common optimizations that may get in the way of the debuggability of your application, and demonstrated a disparity in debug information quality across two popular compilers. In the next blog post of this series, we will examine how gcc and clang stack up with regards to debug information quality across a myriad of synthetic applications and real world applications. Looking forward to part 2.

***

This is how you can port your rust application to FreeBSD (https://medium.com/@andoriyu/this-is-how-you-can-port-your-rust-application-to-freebsd-7d3e9f1bc3df)

The FreeBSD Ports Collection is the way almost everyone installs applications ("ports") on FreeBSD. Like everything else about FreeBSD, it is primarily a volunteer effort. It is important to keep this in mind when reading this document. In FreeBSD, anyone may submit a new port, or volunteer to maintain an existing unmaintained port. No special commit privilege is needed. For this guide I will use the fd tool, written by David Peter, as the example project.

Prerequisites

- A FreeBSD installation (a VM is fine)
- A local ports tree (done via svn)
- portlint (located at devel/portlint)
- poudriere (located at ports-mgmt/poudriere) [optional]

Getting the ports tree

When you install FreeBSD, opt out of the ports tree. Then install svn and check the tree out:

```
pkg install svn
svn checkout https://svn.freebsd.org/ports/head /usr/ports
```

Poudriere

Sometimes you might get asked to show a poudriere build log, sometimes you won't; it's good to have anyway. If you choose to use poudriere, use ZFS. There are plenty of guides on the subject. The FreeBSD Porter's Handbook is the most complete source of information on porting to FreeBSD.

Makefile

The whole porting process, in most cases, is writing one Makefile. I recommend doing something like this. Here is the one I wrote for fd:

Port metadata

Each port must have one primary category; in the case of fd it will be sysutils, therefore the port lives in /usr/ports/sysutils/fd.

```
PORTNAME=	fd
CATEGORIES=	sysutils
```

Since this port conflicts with another util named fd, I specified a package suffix:

```
PKGNAMESUFFIX=	-find
```

and indicated the conflict:

```
CONFLICTS_INSTALL=	fd-[0-9]*
```

That means that to install it from packages, the user will have to type: pkg install fd-find

Licenses

This section is different for every port, but in the case of fd it's pretty straightforward:

```
LICENSE=	MIT APACHE20
LICENSE_COMB=	dual
```

Since fd includes the text of the licenses, you should do this as well:

```
LICENSE_FILE_MIT=	${WRKSRC}/LICENSE-MIT
LICENSE_FILE_APACHE20=	${WRKSRC}/LICENSE-APACHE
```

Distfiles

FreeBSD has a requirement that all ports must allow offline building. That means you have to specify which files need to be downloaded. Luckily, we now have helpers to download sources directly from GitHub:

```
USE_GITHUB=	yes
GH_ACCOUNT=	sharkdp
```

Since PORTNAME is fd, it will try to download sources for sharkdp/fd. By default it's going to download the tag ${DISTVERSIONPREFIX}${DISTVERSION}${DISTVERSIONSUFFIX}. fd uses v as the prefix, therefore we need to specify:

```
DISTVERSIONPREFIX=	v
```

It's also possible to specify GH_TAGNAME in case the tag name doesn't match that pattern.

Extra packages

There are very few rust projects that are standalone and use no crate dependencies. It used to be a PITA to make offline builds work, but now cargo is a first class citizen in ports:

```
USES=		cargo
CARGO_CRATES=	aho-corasick-0.6.3 \
		atty-0.2.3 \
		# and so it goes on
```

Yes, you have to specify each dependency. Luckily, there is a magic awk script that turns Cargo.lock into what you need. Execute make cargo-crates in the port root. This will fail because you're missing the checksum for the original source files:

```
make makesum
make cargo-crates
```

This will give you what you need. Double check that the result is correct. There is a way to ignore the checksum error, but I can't remember… Execute make makesum again.

CARGO_OUT

If build.rs relies on that, you have to change it. fd allows you to use SHELL_COMPLETIONS_DIR to specify where completions go, while ripgrep doesn't. In our case we just specify SHELL_COMPLETIONS_DIR:

```
SHELL_COMPLETIONS_DIR=	${WRKDIR}/shell-completions-dir
CARGO_ENV=	SHELL_COMPLETIONS_DIR=${SHELL_COMPLETIONS_DIR}
```

PLIST

FreeBSD is very strict about the files it's installing, and it won't allow you to install random files that get lost. You have to specify which files you're installing. In this case, it's just two:

```
PLIST_FILES=	bin/fd \
		man/man1/fd.1.gz
```

Note that the sources for fd have an uncompressed man file, while here it's listed as compressed. If a port installs a lot of files, specify them in pkg-plist like here. To actually install them:

```
post-install:
	@${STRIP_CMD} ${STAGEDIR}${PREFIX}/bin/fd
	${INSTALL_MAN} ${WRKSRC}/doc/fd.1 ${STAGEDIR}${MAN1PREFIX}/man/man1
```

Shell completions

clap-rs can generate shell completions for you; this is usually handled by the build.rs script. First, we need to define options:

```
OPTIONS_DEFINE=		BASH FISH ZSH # list options
OPTIONS_DEFAULT=	BASH FISH ZSH # select them by default
BASH_PLIST_FILES=	etc/bash_completion.d/fd.bash-completion
FISH_PLIST_FILES=	share/fish/completions/fd.fish
ZSH_PLIST_FILES=	share/zsh/site-functions/_fd
```

To actually install them:

```
post-install-BASH-on:
	@${MKDIR} ${STAGEDIR}${PREFIX}/etc/bash_completion.d
	${INSTALL_DATA} ${SHELL_COMPLETIONS_DIR}/fd.bash-completion \
		${STAGEDIR}${PREFIX}/etc/bash_completion.d

post-install-FISH-on:
	@${MKDIR} ${STAGEDIR}${PREFIX}/share/fish/completions
	${INSTALL_DATA} ${SHELL_COMPLETIONS_DIR}/fd.fish \
		${STAGEDIR}${PREFIX}/share/fish/completions

post-install-ZSH-on:
	@${MKDIR} ${STAGEDIR}${PREFIX}/share/zsh/site-functions
	${INSTALL_DATA} ${SHELL_COMPLETIONS_DIR}/_fd \
		${STAGEDIR}${PREFIX}/share/zsh/site-functions
```

Bonus round - Patching source code

Sometimes you have to patch the source and send the patch upstream. Merging it upstream can take a while, so you can apply the patch as part of the install process. An easy way to do it:

1. Go to the work/ dir
2. Copy the file you want to patch and add an .orig suffix to it
3. Edit the file you want to patch
4. Execute make makepatch in the port's root
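Putting the fragments above together, the assembled Makefile looks roughly like this before submission (a sketch only: the version, maintainer address and elided crate list are placeholders, not the real port):

```
# sysutils/fd/Makefile -- skeleton assembled from the fragments above
PORTNAME=	fd
DISTVERSIONPREFIX=	v
DISTVERSION=	6.0.0		# placeholder version
CATEGORIES=	sysutils
PKGNAMESUFFIX=	-find

MAINTAINER=	you@example.com	# placeholder
COMMENT=	Simple, fast and user-friendly alternative to find

LICENSE=	MIT APACHE20
LICENSE_COMB=	dual
LICENSE_FILE_MIT=	${WRKSRC}/LICENSE-MIT
LICENSE_FILE_APACHE20=	${WRKSRC}/LICENSE-APACHE

USES=		cargo
USE_GITHUB=	yes
GH_ACCOUNT=	sharkdp
CONFLICTS_INSTALL=	fd-[0-9]*

CARGO_CRATES=	aho-corasick-0.6.3 \
		atty-0.2.3
# (rest of the crate list from Cargo.lock elided)

PLIST_FILES=	bin/fd \
		man/man1/fd.1.gz

post-install:
	@${STRIP_CMD} ${STAGEDIR}${PREFIX}/bin/fd
	${INSTALL_MAN} ${WRKSRC}/doc/fd.1 ${STAGEDIR}${MAN1PREFIX}/man/man1

.include <bsd.port.mk>
```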
Submitting port

First, make sure portlint -AC doesn't give you any errors or warnings. Second, make sure poudriere can build it on both amd64 and i386; if it can't, you have to either fix it or mark the port broken for that arch. Then follow the steps like I did. If you have any issues, you can always ask your question in freebsd-ports on freenode, but try to find your answer in the Porter's Handbook before asking.

Conference Recap: EuroBSDCon 2017 Recap (https://www.freebsdfoundation.org/blog/conference-recap-eurobsdcon-2017-recap/)

The location was wonderful and I loved sneaking out and exploring the city when I could. From what I heard, it was the largest BSD conference in history, with over 320 attendees! Each venue is unique and draws many local BSD enthusiasts who normally wouldn't be able to travel to a conference. I love having the chance to talk to these people about how they are involved in the projects and what they would like to do. Most of the time, they are asking me questions about how they can get more involved and how we can help. Magical is how I would describe the conference social event. To stand on the dinner cruise on the Seine, with the Eiffel Tower standing tall, lit up in the night, while working - talking to our community members - was incredible.

But let me start at the beginning. We attend these conferences to talk to our community members, to find out what they are working on, to determine technologies that should be supported in FreeBSD, and to learn what we can do to help and improve FreeBSD. We started the week with a half-day board meeting on Wednesday. BSD conferences give us a chance not only to meet with community members around the world, but to have face-to-face meetings with our team members, who are also located around the world. We worked on refining our strategic direction and goals, determining what upcoming conferences we want FreeBSD presence at and who can give FreeBSD talks and workshops there, discussed current and potential software development projects, and discussed how we can help raise awareness about and increase the use of FreeBSD in Europe.

Thursday was the first day of the FreeBSD developer summit, led by our very own Benedict Reuschling. He surprised us all by having us participate in a very clever quiz on France. 45 of us signed into the software, where he'd show a question on the screen and we had a limited amount of time to select our answers, with the results listed on the screen. It was actually a lot of fun, especially since they didn't publicize the names of the people who got the questions wrong. The luckiest, or most knowledgeable, person on France was des@freebsd.org. Some of our board members ran tutorials in parallel to the summit. Kirk McKusick gave his legendary tutorial An Introduction to the FreeBSD Open-Source Operating System, George Neville-Neil gave his tutorial DTrace for Developers, and Benedict Reuschling gave a tutorial on Managing BSD systems with Ansible. I was pleased to have two chairs from ACM-W Europe run an "Increasing Diversity in the BSDs" BoF for the second year in a row. We broke up into three groups to discuss different gender bias situations, and what we can do to address these types of situations, to make the BSD projects more diverse, welcoming, and inclusive. At the end, people asked that we continue these discussions at future BSD conferences and suggested having an expert in the field give a talk on how to increase the diversity in our projects. As I mentioned earlier, the social dinner was on a boat cruising along the Seine. I had a chance to talk to community members in a more social environment. With the conference being in France, we had a lot of first-time attendees from France.
I enjoyed talking to many of them, as well as other people I only get to see at the European conferences. Sunday was full of more presentations and conversations. During the closing session, I gave a short talk on the Foundation and the work we are doing. Then Benedict Reuschling, Board Vice President, came up and gave out recognition awards to four FreeBSD contributors who have made an impact on the Project.

News Roundup

Playing with the pine64 (https://chown.me/blog/playing-with-the-pine64.html)

Daniel Jakots writes in his blog about his experiences with his two pine64 boards:

Finding something to install on it

Six weeks ago, I ordered two pine64 units. I didn't (and still don't) have much of a plan for them, but I wanted to play with some cheap boards. I finally received them this week. Initially I wanted to install some Linux stuff on them; I didn't have many requirements, so I thought I would just look at what seems to be easy and/or the best supported systemd flavour. I headed over to their wiki. Everything seems either not really maintained, done by some random people, or both. I am not saying random people do bad things, just that installing some random things from the Internet is not really my cup of tea. I heard about Armbian (https://www.armbian.com/pine64/) but the server flavour seems to be experimental, so I got scared of it. And sadly, the whole thing looks to be a lot undermanned. So I went for OpenBSD, because I know the stuff and who to har^Wkindly ask for help. Spoiler alert: it's boring, because it just works.

Getting OpenBSD on it

I downloaded miniroot62.fs and dd'ed it onto the micro SD card. I was afraid I'd need to fiddle with some things like sysutils/dtb, because I didn't know what I would have needed to do. That's because I didn't know what it does, and for this precise reason I was wrong and didn't need to do anything. So just dd the miniroot62.fs and you can go to the next checkpoint. I plugged in an HDMI cable, ethernet cable and the power; it booted, and I could read for 10 seconds, but then it went dark. Of course, that's because you need a serial console. Of course, I didn't have one. I thought about trying to install OpenBSD blindly; I could probably have succeeded with autoinstall, buuuuuut… Following some good pieces of advice from OpenBSD people, I bought a cp2102 (I didn't try to understand what it was or what the other possibilities were; I just wanted something that would work :D). I looked at how to plug the thing in. It appears you can plug it into two different places, but if you plug it into the Euler bus it could partially power the board, so if you try to reboot, it would mess with the power and could lead to an unclean reboot. You just need to plug in three cables: GND, TXD and RXD. Of course, the TXD goes on the RXD pin from the picture, and the RXD goes on the TXD pin. Guess why I'm telling you that! That's it. Then you can connect with the usual:

```
$ cu -dl /dev/cuaU0 -s 115200
```

What's the point of Docker on FreeBSD or Solaris? (http://blog.frankleonhardt.com/2017/whats-the-point-of-docker-on-freebsd-or-solaris/)

Penguinisters are very keen on their Docker, but for the rest of us it may be difficult to see what the fuss is all about - it's only been around a few years and everyone's talking about it. And someone asked again today: what are we missing? Well, Docker is a solution to a Linux (and Windows) problem that FreeBSD/Solaris doesn't have. Until recently, the Linux kernel only implemented the original user isolation model involving chroot. More recent kernels have had control groups added, which are intended to provide isolation for a group of processes (namespaces). This came out of Google, and they've extended the concept to include processor resource allocation as one of the knobs, which could be a good idea for FreeBSD. The scheduler is aware of the JID of the process it's about to schedule, and I might take a look in the forthcoming winter evenings. But I digress.

So if isolation (containerisation in Linux terms) is in the Linux kernel, what is Docker bringing to the party? The only thing I can think of is standardisation and an easy user interface (at the expense of having Python installed). You might think of it in similar terms to ezjail - a complex system intended to do something that is otherwise very simple. To make a jail in FreeBSD, all you need do is copy the files for your system to a directory. This can even be a whole server's system disk if you like, and jails can run inside jails. You then create a very simple config file, giving the jail a name, the path to your files and what IP addresses to pass through (if any), and you're done. Just type "service jail start nameofjail", and off it goes.
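That "very simple config file" is jail.conf(5); a minimal sketch with made-up values (the name, path and address are placeholders) looks something like this:

```
# /etc/jail.conf -- hypothetical minimal jail definition
myjail {
	path = "/usr/jails/myjail";       # directory holding the jail's files
	ip4.addr = "192.0.2.10";          # IP address passed through to the jail
	host.hostname = "myjail.example.org";
	exec.start = "/bin/sh /etc/rc";   # run the jail's normal rc boot
	exec.stop = "/bin/sh /etc/rc.shutdown";
}
```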
Is there any advantage to running Docker? Well, in a way, there is: Docker has a repository of system images that you can just install and run, and this is what a lot of people want. They're a bit like virtual appliances, but not mind-numbingly inefficient. You can actually run Docker on FreeBSD. A port was done a couple of years ago, but it relies on the 64-bit Linux emulation that started to appear in 10.x; the newer the version of FreeBSD, the better. Docker is in ports/sysutils/docker-freebsd. It makes use of jails instead of Linux cgroups, and requires ZFS rather than UFS for file system isolation. I believe the Linux version uses UnionFS, but I could be completely wrong on that. The FreeBSD port works with the Docker Hub repository, giving you access to thousands of pre-packaged system images to play with. And that's about as far as I've ever tested it. If you want to run the really tricky stuff (like Windows) you probably want full hardware emulation and something like Xen. If you want to deploy or migrate FreeBSD or Solaris systems, just copy a new tarball into the directory and go. It's a non-problem, so why make it more complicated? Given the increasing frequency with which Docker turns up in conversations, it's probably worth taking seriously, as Linux applications get packaged up into images for easy access. Jails/Zones may be more efficient, and Docker images are limited to binary, but convenience tends to win in many environments.

Network Manager Control for OpenBSD (http://www.vincentdelft.be/post/post_20171023)

I propose a small script allowing you to easily manage your network connections. This script is integrated with the openbox dynamic menus. Moreover, it allows you to automatically bring up the connections you have pre-defined. I was frustrated at not being able to swap quickly from one network interface to another: to connect simply and quickly to my wifi, to my cable connection, to the wifi of a friend, ... Every time, you have to type the ifconfig commands... This is nice, but boring. Especially when you are in the middle of a presentation and you just want a quick connection to your mobile in tethering mode. Thanks to OpenBSD those commands are not so hard, but it frustrated me not to be able to do it with one click, directly from my windowing environment.
Is there any advantage in running Docker? Well, in a way, there is. Docker has a repository of system images that you can just install and run, and this is what a lot of people want. They're a bit like virtual appliances, but not mind-numbingly inefficient. You can actually run Docker on FreeBSD. A port was done a couple of years ago, but it relies on the 64-bit Linux emulation that started to appear in 10.x. The newer the version of FreeBSD the better. Docker is in ports/sysutils/docker-freebsd. It makes use of jails instead of Linux cgroups, and requires ZFS rather than UFS for file system isolation. I believe the Linux version uses UnionFS, but I could be completely wrong on that. The FreeBSD port works with the Docker hub repository, giving you access to thousands of pre-packaged system images to play with. And that's about as far as I've ever tested it. If you want to run the really tricky stuff (like Windows) you probably want full hardware emulation and something like Xen. If you want to deploy or migrate FreeBSD or Solaris systems, just copy a new tarball into the directory and go. It's a non-problem, so why make it more complicated? Given the increasing frequency with which Docker turns up in conversations, it's probably worth taking seriously as Linux applications get packaged up into images for easy access. Jails/Zones may be more efficient, and Docker images are limited to binaries, but convenience tends to win in many environments.
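If you want to try the port, the bring-up looked roughly like this at the time; a sketch following the FreeBSD wiki's Docker instructions rather than the article itself, and assuming a ZFS pool named zroot:
```
# install the port's package and give the daemon a ZFS dataset to manage
pkg install docker-freebsd
zfs create -o mountpoint=/usr/docker zroot/docker

# enable and start the service, then grab an image from the Docker hub
sysrc docker_enable=yes
service docker start
docker pull centos
```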
Network Manager Control for OpenBSD (http://www.vincentdelft.be/post/post_20171023) I propose a small script allowing you to easily manage your network connections. This script is integrated within the openbox dynamic menus. Moreover, it allows you to automatically bring up the connections you have pre-defined. I was frustrated at not being able to swap quickly from one network interface to another, to connect simply and quickly to my wifi, to my cable connection, to the wifi of a friend, ... Every time you have to type the ifconfig commands, .... This is nice, but boring. Especially when you are in the middle of a presentation and you just want a quick connection to your mobile in tethering mode. Thanks to OpenBSD those commands are not so hard, but it frustrated me not to be able to do it with one click, directly from my windowing environment; since I'm using Openbox, from an openbox menu. So, I've looked around to see what currently exists. One tool I've found was netctl (https://github.com/akpoff/netctl). The idea is to have a repository of hostname.if files ready to use for different cases. The idea sounds great, but I had some difficulties using it. What annoys me the most is that it modifies the current hostname.if files in /etc. To my eyes, I would rather avoid modifying those files because they are my working basis. I want to rely on them and make sure that my network will be back to a normal mode after a reboot. Nevertheless, if I've understood netctl correctly, it has a feature where it will look for the predefined network config matching the environment you are in. Very cool. So, after having played with netctl and looked for alternatives on the internet, I've decided to create nmctl, a small python script which just performs the necessary network commands. 1. nmctl: a Network Manager Control tool for OpenBSD Nmctl is a small tool that allows you to manage your network connections. Why python? Just because it's the easiest programming language for me. But I should maybe rewrite it in shell, which is more standard in the OpenBSD world than python. 1.1. download and install I've put nmctl on my sourceforge account here (https://sourceforge.net/p/nmctl/code/ci/master/tree/) You can download the latest version here (https://sourceforge.net/p/nmctl/code/ci/master/tarball) To install you just have to run: make install (as root) The prerequisites are: - having python2.7 installed - Since nmctl must be run as root, I strongly recommend running it via doas (http://man.openbsd.org/doas.conf.5). 1.2. The config file First you have to create a config and store it in /etc/nmctl.conf. This file must respect a few rules: Each block must start with a line having the following format: '''<-name->:<-interface->''' Each following line must start with at least one space. Those lines have more or less the same format as for hostname.if. You have to create a block with the name "open". This will be used to establish a connection to the open wifis around you (in restaurants, for example). The order of those elements is important. In case you use the -restart option, nmctl will try each of those network configs one after another until it can ping www.google.com (if you want to ping something else, you can change it in the python script). You can use external commands. Just precede them with a "!". You have macros. Macros allow you to perform some actions. The 2 currently implemented are '''<-nwid->''' and '''<-random mac->'''. You can use keywords. Currently the only one implemented is "dhcp". Basically you can put in all the commands that nmctl will apply to the interface the block refers to. So, you will always have "ifconfig <-interface-> <-command you typed in the config file->". Check the manpage of ifconfig to see how flexible the commands are. You currently have 2 macros: - <-nwid-> which expands to "nwid <-nwid name->" when you select an open wifi with the --open option of nmctl. - <-random mac-> is a macro generating a random MAC address. This is useful to test a DHCP server, for example. The keyword "dhcp" will trigger a command like "dhclient <-interface->". 1.3. Config file sample. Let me show you one nmctl.conf example. It speaks for itself.
```
# the name "open" is required for open wifi.
# iwn0 is the interface that nmctl will take to establish a connection.
# We must put the <-nwid-> macro: this is where nmctl will put the nwid command
# and the open wifi selected by the parameter --open
open:iwn0
 !route flush
 <-nwid->
 -wpa
 dhcp
cable:em0
 !route flush
 dhcp
lgg4:iwn0
 !route flush
 nwid LGG4s_8114 wpakey aanotherpassword
 dhcp
home:iwn0
 !route flush
 nwid Linksys19594 wpakey apassword
 dhcp
college:iwn0
 !route flush
 nwid john wpakey haahaaaguessme
 dhcp
cable_fixip:em0
 !route flush
 inet 192.168.3.3 netmask 255.255.255.0
 !route add -host default 192.168.3.1
# with this network interface I'm using the <-random mac-> macro
# which will do what you guess it will do :-)
cable_random:em0
 !route flush
 lladdr <-random mac->
 dhcp
```
In this config we have several cable networks associated with my interface "em0" and several wifi networks associated with my wireless interface "iwn0". You see that you can switch from dhcp to a fixed IP, and you can even play with the random MAC address macro. Thanks to the network called "open", you can connect to any open wifi system. To do that, just type ''' nmctl --open <-name of the open wifi-> ''' So, now, with just one command you can switch from one network configuration to another. That's become cool :-). 2. Integration with openbox Thanks to the dynamic menu feature of openbox, you can have your different pre-defined networks one mouse click away. For that, you just have to add, at the most appropriate place for you, the relevant code in your ~/.config/openbox/menu.xml (the XML itself is not reproduced in these notes; a sketch follows below). In this case, you see the different networks as defined in the config file just above.
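As a rough idea of what such an entry can look like, here is a static sketch using only the nmctl invocations documented above (the nwid bbox2-d954 is borrowed from the scan output below; Vincent's real menu uses openbox's dynamic-menu feature to generate the entries from the config file):
```
<menu id="nmctl-menu" label="Network">
  <item label="Reconnect (scan pre-defined networks)">
    <action name="Execute"><command>doas nmctl -restart</command></action>
  </item>
  <item label="Open wifi: bbox2-d954">
    <action name="Execute"><command>doas nmctl --open bbox2-d954</command></action>
  </item>
</menu>
```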
3. Automatically identify your available connection and connect to it in one go But the most interesting part comes from a loop through all of your defined networks. This loop is reachable via the -restart option. Basically the idea is to loop from the first network config to the last and test a ping for each of them. Once the ping works, we break the loop and keep this setting. Thus, wherever you are, you just have to run nmctl -restart and you will be connected to the network you have defined for this place. There is one small exception: the open wifis. We do not include them in this loop exercise. Thus the way you define your config file is important. Since the network called "open" is dedicated to open wifi, it will not be part of this scan exercise. I propose you keep it in the first place. Then, in my case, if my mobile, called lgg4, is open and visible by my laptop, I will connect to it immediately. Second, I check if my home wifi is visible. Third, if I have a cable connected on my laptop, I use this connection and do a dhcp command. Then, I check to see if my laptop sees the "college" wifi, and so on until a ping command works. If you do not have a cable in your laptop and none of your pre-defined wifi connections are visible, the scan will stop. 3.1 examples No cable connected, no pre-defined wifi around me:
```
t420:~$ time doas nmctl -r
nwids around you: bbox2-d954
    0m02.97s real     0m00.08s user     0m00.11s system
t420:~$
```
I'm at home and my wifi router is running:
```
t420:~$ time doas nmctl -r
nwids around you: Linksys19594 bbox2-d954
ifconfig em0 down: 0
default fw done
fw 00:22:4d:ac:30:fd done
nas link#2 done
route flush: 0
ifconfig iwn0 nwid Linksys19594 ...: 0
iwn0: no link ........... sleeping
dhclient iwn0: 0
Done.
PING www.google.com (216.58.212.164): 56 data bytes
64 bytes from 216.58.212.164: icmp_seq=0 ttl=52 time=12.758 ms
--- www.google.com ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 12.758/12.758/12.758/0.000 ms
ping -c1 -w2 www.google.com: 0
    0m22.49s real     0m00.08s user     0m00.11s system
t420:~$
```
I'm at home but tethering is active on my mobile:
```
t420:~$ time doas nmctl -r
nwids around you: Linksys19594 bbox2-d954 LGG4s8114
ifconfig em0 down: 0
default fw done
fw 00:22:4d:ac:30:fd done
nas link#2 done
route flush: 0
ifconfig iwn0 nwid LGG4s8114 ...: 0
iwn0: DHCPDISCOVER - interval 1
iwn0: DHCPDISCOVER - interval 2
iwn0: DHCPOFFER from 192.168.43.1 (a0:91:69:be:10:49)
iwn0: DHCPREQUEST to 255.255.255.255
iwn0: DHCPACK from 192.168.43.1 (a0:91:69:be:10:49)
iwn0: bound to 192.168.43.214 -- renewal in 1800 seconds
dhclient iwn0: 0
Done.
ping: Warning: www.google.com has multiple addresses; using 173.194.69.99
PING www.google.com (173.194.69.99): 56 data bytes
64 bytes from 173.194.69.99: icmp_seq=0 ttl=43 time=42.863 ms
--- www.google.com ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 42.863/42.863/42.863/0.000 ms
ping -c1 -w2 www.google.com: 0
    0m13.78s real     0m00.08s user     0m00.13s system
t420:~$
```
Same situation, but I cut the tethering just after the scan. Thus the dhcp command will not succeed. We see that, after the timeouts, nmctl sees that the ping is failing (return code 1), so it moves on to the next possible pre-defined network.
```
t420:~$ time doas nmctl -r
nwids around you: Linksys19594 bbox2-d954 LGG4s8114
ifconfig em0 down: 0
default 192.168.43.1 done
192.168.43.1 a0:91:69:be:10:49 done
route flush: 0
ifconfig iwn0 nwid LGG4s8114 ...: 0
iwn0: no link ........... sleeping
dhclient iwn0: 0
Done.
ping: no address associated with name
ping -c1 -w2 www.google.com: 1
ifconfig em0 down: 0
192.168.43.1 link#2 done
route flush: 0
ifconfig iwn0 nwid Linksys19594 ...: 0
iwn0: DHCPREQUEST to 255.255.255.255
iwn0: DHCPACK from 192.168.3.1 (00:22:4d:ac:30:fd)
iwn0: bound to 192.168.3.16 -- renewal in 302400 seconds
dhclient iwn0: 0
Done.
PING www.google.com (216.58.212.164): 56 data bytes
64 bytes from 216.58.212.164: icmp_seq=0 ttl=52 time=12.654 ms
--- www.google.com ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 12.654/12.654/12.654/0.000 ms
ping -c1 -w2 www.google.com: 0
    3m34.85s real     0m00.17s user     0m00.20s system
t420:~$
```
OpenVPN Setup Guide for FreeBSD (https://www.c0ffee.net/blog/openvpn-guide) OpenVPN Setup Guide Browse securely from anywhere using a personal VPN with OpenVPN, LDAP, FreeBSD, and PF. A VPN allows you to securely extend a private network over the internet via tunneling protocols and traffic encryption. For most people, a VPN offers two primary features: (1) the ability to access services on your local network over the internet, and (2) secure internet connectivity over an untrusted network. In this guide, I'll describe how to set up a personal VPN using OpenVPN on FreeBSD. The configuration can use both SSL certificates and LDAP credentials for authentication. We'll also be using the PF firewall to NAT traffic from our VPN out to the internet.
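As a taste of that last step, the NAT part of such a setup is only a few lines of pf.conf; a sketch, where the interface name and tunnel subnet are assumptions rather than the guide's actual values:
```
# /etc/pf.conf fragment: NAT traffic from the OpenVPN tunnel out the public NIC
ext_if = "vtnet0"
vpn_net = "10.8.0.0/24"

nat on $ext_if inet from $vpn_net to any -> ($ext_if)
```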
One important note about running your own VPN: since you are most likely hosting your server using a VPS or hosting provider, with a public IP address allocated specifically to you, your VPN will not give you any extra anonymity on the internet. If anything, you'll be making yourself more of a target, since all your activity can be trivially traced back to your server's IP address. So while your VPN will protect you from a snooping hacker on the free WiFi at Starbucks, it won't protect you from a federal investigation. This guide assumes you are running FreeBSD with the PF firewall. If you're using a different Unix flavor, I'll probably get you most of the way there—but you'll be on your own when configuring your firewall and networking. Finally, I've used example.com and a non-routable public IP address for all the examples in this guide. You'll need to replace them with your own domain name and public IP address. Beastie Bits BSDCan 2017 videos (https://www.youtube.com/channel/UCuQhwHMJ0yK2zlfyRr1XZ_Q/feed) Getting started with OpenBSD device driver development PDF (https://www.openbsd.org/papers/eurobsdcon2017-device-drivers.pdf) AWS CloudWatch Logs agent for FreeBSD (https://macfoo.wordpress.com/2017/10/27/aws-cloudwatch-logs-agent-for-freebsd/) FreeBSD Foundation November 2017 Development Projects Update (https://www.freebsdfoundation.org/blog/november-2017-development-projects-update/) Schedule for the BSD Devroom at FOSDEM 2018 (https://fosdem.org/2018/schedule/track/bsd/) *** Feedback/Questions Matt - The show and Cantrill (http://dpaste.com/35VNXR5#wrap) Paulo - FreeBSD Question (http://dpaste.com/17E9Z2W#wrap) Steven - Virtualization under FreeBSD (http://dpaste.com/1N6F0TC#wrap) ***
222: How Netflix works
We take a look at two-faced Oracle, cover a FAMP installation, how Netflix works the complex stuff, and show you who the patron of yak shaving is. This episode was brought to you by Headlines Why is Oracle so two-faced over open source? (https://www.theregister.co.uk/2017/10/12/oracle_must_grow_up_on_open_source/) Oracle loves open source. Except when the database giant hates open source. Which, according to its recent lobbying of the US federal government, seems to be "most of the time". Yes, Oracle has recently joined the Cloud Native Computing Foundation (CNCF) to up its support for open-source Kubernetes and, yes, it has long supported (and contributed to) Linux. And, yes, Oracle has even gone so far as to (finally) open up Java development by putting it under a foundation's stewardship. Yet this same, seemingly open Oracle has actively hammered the US government to consider that "there is no math that can justify open source from a cost perspective as the cost of support plus the opportunity cost of forgoing features, functions, automation and security overwhelm any presumed cost savings." That punch to the face was delivered in a letter to Christopher Liddell, a former Microsoft CFO and now director of Trump's American Technology Council, by Kenneth Glueck, Oracle senior vice president. The US government had courted input on its IT modernisation programme. Others writing back to Liddell included AT&T, Cisco, Microsoft and VMware. In other words, based on its letter, what Oracle wants us to believe is that open source leads to greater costs and poorly secured, limply featured software. Nor is Oracle content to leave it there, also arguing that open source is exactly how the private sector does not function, seemingly forgetting that most of the leading infrastructure, big data, and mobile software today is open source. Details! Rather than take this counterproductive detour into self-serving silliness, Oracle would do better to follow Microsoft's path. Microsoft, too, used to Janus-face its way through open source, simultaneously supporting and bashing it. Only under chief executive Satya Nadella's reign did Microsoft realise it's OK to fully embrace open source, and its financial results have loved the commitment. Oracle has much to learn, and emulate, in Microsoft's approach. I love you, you're perfect. Now change Oracle has never been particularly warm and fuzzy about open source. As founder Larry Ellison might put it, Oracle is a profit-seeking corporation, not a peace-loving charity. To the extent that Oracle embraces open source, therefore it does so for financial reward, just like every other corporation. Few, however, are as blunt as Oracle about this fact of corporate open-source life. As Ellison told the Financial Times back in 2006: "If an open-source product gets good enough, we'll simply take it. So the great thing about open source is nobody owns it – a company like Oracle is free to take it for nothing, include it in our products and charge for support, and that's what we'll do. "So it is not disruptive at all – you have to find places to add value. Once open source gets good enough, competing with it would be insane... We don't have to fight open source, we have to exploit open source." "Exploit" sounds about right. While Oracle doesn't crack the top-10 corporate contributors to the Linux kernel, it does register a respectable number 12, which helps it influence the platform enough to feel comfortable building its IaaS offering on Linux (and Xen for virtualisation). 
Oracle has also managed to continue growing MySQL's clout in the industry while improving it as a product and business. As for Kubernetes, Oracle's decision to join the CNCF also came with P&L strings attached. "CNCF technologies such as Kubernetes, Prometheus, gRPC and OpenTracing are critical parts of both our own and our customers' development toolchains," said Mark Cavage, vice president of software development at Oracle. One can argue that Oracle has figured out the exploitation angle reasonably well. This, however, refers to the right kind of exploitation, the kind that even free software activist Richard Stallman can love (or, at least, tolerate). But when it comes to government lobbying, Oracle looks a lot more like Mr Hyde than Dr Jekyll. Lies, damned lies, and Oracle lobbying The current US president has many problems (OK, many, many problems), but his decision to follow the Obama administration's support for IT modernisation is commendable. Most recently, the Trump White House asked for feedback on how best to continue improving government IT. Oracle's response is high comedy in many respects. As TechDirt's Mike Masnick summarises, Oracle's "latest crusade is against open-source technology being used by the federal government – and against the government hiring people out of Silicon Valley to help create more modern systems. Instead, Oracle would apparently prefer the government just give it lots of money." Oracle is very good at making lots of money. As such, its request for even more isn't too surprising. What is surprising is the brazenness of its position. As Masnick opines: "The sheer contempt found in Oracle's submission on IT modernization is pretty stunning." Why? Because Oracle contradicts much that it publicly states in other forums about open source and innovation. More than this, Oracle contradicts much of what we now know is essential to competitive differentiation in an increasingly software and data-driven world. Take, for example, Oracle's contention that "significant IT development expertise is not... central to successful modernization efforts". What? In our "software is eating the world" existence Oracle clearly believes that CIOs are buyers, not doers: "The most important skill set of CIOs today is to critically compete and evaluate commercial alternatives to capture the benefits of innovation conducted at scale, and then to manage the implementation of those technologies efficiently." While there is some truth to Oracle's claim – every project shouldn't be a custom one-off that must be supported forever – it's crazy to think that a CIO – government or otherwise – is doing their job effectively by simply shovelling cash into vendors' bank accounts. Indeed, as Masnick points out: "If it weren't for Oracle's failures, there might not even be a USDS [the US Digital Service created in 2014 to modernise federal IT]. USDS really grew out of the emergency hiring of some top-notch internet engineers in response to the Healthcare.gov rollout debacle. And if you don't recall, a big part of that debacle was blamed on Oracle's technology." In short, blindly giving money to Oracle and other big vendors is the opposite of IT modernisation. In its letter to Liddell, Oracle proceeded to make the fantastic (by which I mean "silly and false") claim that "the fact is that the use of open-source software has been declining rapidly in the private sector". What?!? This is so incredibly untrue that Oracle should score points for being willing to say it out loud. 
Take a stroll through the most prominent software in big data (Hadoop, Spark, Kafka, etc.), mobile (Android), application development (Kubernetes, Docker), machine learning/AI (TensorFlow, MxNet), and compare it to Oracle's statement. One conclusion must be that Oracle believes its CIO audience is incredibly stupid. Oracle then tells a half-truth by declaring: "There is no math that can justify open source from a cost perspective." How so? Because "the cost of support plus the opportunity cost of forgoing features, functions, automation and security overwhelm any presumed cost savings." Which I guess is why Oracle doesn't use any open source like Linux, Kubernetes, etc. in its services. Oops. The Vendor Formerly Known As Satan The thing is, Oracle doesn't need to do this and, for its own good, shouldn't do this. After all, we already know how this plays out. We need only look at what happened with Microsoft. Remember when Microsoft wanted us to "get the facts" about Linux? Now it's a big-time contributor to Linux. Remember when it told us open source was anti-American and a cancer? Now it aggressively contributes to a huge variety of open-source projects, some of them homegrown in Redmond, and tells the world that "Microsoft loves open source." Of course, Microsoft loves open source for the same reason any corporation does: it drives revenue as developers look to build applications filled with open-source components on Azure. There's nothing wrong with that. Would Microsoft prefer government IT to purchase SQL Server instead of open-source-licensed PostgreSQL? Sure. But look for a single line in its response to the Trump executive order that signals "open source is bad". You won't find it. Why? Because Microsoft understands that open source is a friend, not foe, and has learned how to monetise it. Microsoft, in short, is no longer conflicted about open source. It can compete at the product level while embracing open source at the project level, which helps fuel its overall product and business strategy. Oracle isn't there yet, and is still stuck where Microsoft was a decade ago. It's time to grow up, Oracle. For a company that builds great software and understands that it increasingly needs to depend on open source to build that software, it's disingenuous at best to lobby the US government to put the freeze on open source. Oracle needs to learn from Microsoft, stop worrying and love the open-source bomb. It was a key ingredient in Microsoft's resurgence. Maybe it could help Oracle get a cloud clue, too. Install FAMP on FreeBSD (https://www.linuxsecrets.com/home/3164-install-famp-on-freebsd) The acronym FAMP refers to a stack of free, open source applications commonly used in web server environments: the FreeBSD operating system together with Apache, MySQL and PHP, providing web services, a database, and server-side scripting. Prerequisites: sudo installed and working (please read the link above), Apache, PHP 5 or PHP 7, MySQL or MariaDB, and your favorite editor (ours is vi). Note: You don't need to upgrade FreeBSD, but make sure all patches have been installed and your ports tree is up-to-date if you plan to update by ports. Install Ports portsnap fetch You must use sudo for each individual command during installation. Please see the link above for installing sudo. Searching Available Apache Versions to Install pkg search apache Install Apache To install Apache 2.4, use pkg. The user account managing Apache in FreeBSD is www.
pkg install apache24 Confirm by hitting y at the prompt to install Apache 2.4. This installs Apache and its dependencies. Enable Apache Use sysrc to register services to be started at boot time; the command below adds apache24_enable="YES" to the /etc/rc.conf file (see the link above for more on sysrc):
```
sysrc apache24_enable=yes
```
Start Apache:
```
service apache24 start
```
Visit the web address by accessing your server's public IP address in your web browser. How To Find Your Server's Public IP Address If you do not know what your server's public IP address is, there are a number of ways that you can find it. Usually, this is the address you use to connect to your server through SSH. ifconfig vtnet0 | grep "inet " | awk '{ print $2 }' Now that you have the public IP address, you may use it in your web browser's address bar to access your web server. Install MySQL Now that we have our web server up and running, it is time to install MySQL, the relational database management system. The MySQL server will organize and provide access to databases where our server can store information. Install MySQL 5.7 using pkg by typing pkg install mysql57-server Enter y at the confirmation prompt. This installs the MySQL server and client packages. To enable the MySQL server as a service, add mysql_enable="YES" to the /etc/rc.conf file. This sysrc command will do just that:
```
sysrc mysql_enable=yes
```
Now start the MySQL server:
```
service mysql-server start
```
Now run the security script that will remove some dangerous defaults and slightly restrict access to your database system:
```
mysql_secure_installation
```
Answer all questions to secure your newly installed MySQL database. Enter current password for root (enter for none): [RETURN] Your database system is now set up and we can move on. Install PHP 5.6 or PHP 7.0 pkg search php70 To install PHP 7.0 you would do the following: pkg install php70-mysqli mod_php70 Note: In these instructions we are using PHP 5.6, not PHP 7.0. We will be coming out with PHP 7.0 instructions with FPM. PHP is the component of our setup that will process code to display dynamic content. It can run scripts, connect to MySQL databases to get information, and hand the processed content over to the web server to display. We're going to install the mod_php, php-mysql, and php-mysqli packages. To install PHP 5.6 with pkg, run this command:
```
pkg install mod_php56 php56-mysql php56-mysqli
```
Copy the sample PHP configuration file into place:
```
cp /usr/local/etc/php.ini-production /usr/local/etc/php.ini
```
Regenerate the system's cached information about your installed executable files:
```
rehash
```
Before using PHP, you must configure it to work with Apache. Install PHP Modules (Optional) To enhance the functionality of PHP, we can optionally install some additional modules. To see the available options for PHP 5.6 modules and libraries, you can type this into your system: pkg search php56 To get more information about a package you can look at its long description, for example: pkg search -f apache24 Optional Install Example pkg install php56-calendar Configure Apache to Use PHP Module Open the Apache configuration file vim /usr/local/etc/apache24/Includes/php.conf and make sure the directory index covers PHP files: DirectoryIndex index.php index.html Next, we will configure Apache to process requested PHP files with the PHP processor.
Add these lines to the end of the file:
```
<FilesMatch "\.php$">
    SetHandler application/x-httpd-php
</FilesMatch>
<FilesMatch "\.phps$">
    SetHandler application/x-httpd-php-source
</FilesMatch>
```
Now restart Apache to put the changes into effect:
```
service apache24 restart
```
Test PHP Processing By default, the DocumentRoot is set to /usr/local/www/apache24/data. We can create the info.php file under that location by typing vim /usr/local/www/apache24/data/info.php Add the following line to info.php and save it:
```
<?php phpinfo(); ?>
```
Details on info.php The info.php file gives you information about your server from the perspective of PHP. It's useful for debugging and to ensure that your settings are being applied correctly. If this was successful, then your PHP is working as expected. You probably want to remove info.php after testing, because it could actually give information about your server to unauthorized users. Remove the file by typing rm /usr/local/www/apache24/data/info.php Note: Make sure the /usr/local/www structure is owned by the www user, which should have been created during the Apache install. That explains FAMP on FreeBSD.
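Pulled together, the whole procedure condenses to roughly the following; a sketch using the article's package names, not a tested script:
```
portsnap fetch
pkg install apache24 mysql57-server mod_php56 php56-mysql php56-mysqli
sysrc apache24_enable=yes mysql_enable=yes
service apache24 start
service mysql-server start
mysql_secure_installation
cp /usr/local/etc/php.ini-production /usr/local/etc/php.ini
service apache24 restart
```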
IXsystems TrueNAS X10 Torture Test & Fail Over Systems In Action with the ZFS File System (https://www.youtube.com/watch?v=GG_NvKuh530) How Netflix works: what happens every time you hit Play (https://medium.com/refraction-tech-everything/how-netflix-works-the-hugely-simplified-complex-stuff-that-happens-every-time-you-hit-play-3a40c9be254b) Not long ago, House of Cards came back for the fifth season, finally ending a long wait for binge watchers across the world who are interested in an American politician's ruthless ascendance to the presidency. For them, kicking off a marathon is as simple as reaching out for your device or remote, opening the Netflix app and hitting Play. Simple, fast and instantly gratifying. What isn't as simple is what goes into running Netflix, a service that streams around 250 million hours of video per day to around 98 million paying subscribers in 190 countries. At this scale, providing quality entertainment in a matter of a few seconds to every user is no joke. And as much as it means building top-notch infrastructure at a scale no other Internet service has done before, it also means that a lot of participants in the experience have to be negotiated with and kept satiated — from production companies supplying the content, to internet providers dealing with the network traffic Netflix brings upon them. This is, in short and in the most layman terms, how Netflix works. Let us just try to understand how Netflix is structured on the technological side with a simple example. Netflix literally ushered in a revolution around ten years ago by rewriting the applications that run the entire service to fit into a microservices architecture — which means that each application, or microservice's, code and resources are its very own. It will not share any of it with any other app by nature. And when two applications do need to talk to each other, they use an application programming interface (API) — a tightly-controlled set of rules that both programs can handle. Developers can now make many changes, small or huge, to each application as long as they ensure that it plays well with the API. And since each program knows the other's API properly, no change will break the exchange of information. Netflix estimates that it uses around 700 microservices to control each of the many parts of what makes up the entire Netflix service: one microservice stores what shows you watched, one deducts the monthly fee from your credit card, one provides your device with the correct video files that it can play, one takes a look at your watching history and uses algorithms to guess a list of movies that you will like, and one will provide the names and images of these movies to be shown in a list on the main menu. And that's just the tip of the iceberg. Netflix engineers can make changes to any part of the application and can introduce new changes rapidly while ensuring that nothing else in the entire service breaks down. They made a courageous decision to get rid of maintaining their own servers and move all of their stuff to the cloud — i.e. run everything on the servers of someone else who deals with maintaining the hardware, while Netflix engineers write hundreds of programs and deploy them on the servers rapidly. The someone else they chose for their cloud-based infrastructure is Amazon Web Services (AWS). Netflix works on thousands of devices, and each of them plays a different format of video and sound files. Another set of AWS servers takes the original film file and converts it into hundreds of files, each meant to play the entire show or film on a particular type of device and a particular screen size or video quality. One file will work exclusively on the iPad, one on a full HD Android phone, one on a Sony TV that can play 4K video and Dolby sound, one on a Windows computer, and so on. Even more of these files can be made with varying video qualities so that they are easier to load on a poor network connection. This is a process known as transcoding. A special piece of code is also added to these files to lock them with what is called digital rights management or DRM — a technological measure which prevents piracy of films. The Netflix app or website determines what particular device you are using to watch, and fetches the exact file for that show meant to specially play on your particular device, with a particular video quality based on how fast your internet is at that moment. Here, instead of relying on AWS servers, Netflix installs its very own servers around the world, with only one purpose — to store content smartly and deliver it to users. Netflix strikes deals with internet service providers and provides them the red Open Connect box (pictured in the original article) at no cost. ISPs install these along with their servers. These Open Connect boxes download the Netflix library for their region from the main servers in the US — where there are several of them, each will preferentially store the content that is most popular with Netflix users in its region, to prioritise speed. So a rarely watched film might take more time to load than a Stranger Things episode. Now, when you connect to Netflix, the closest Open Connect box to you will deliver the content you need, so videos load faster than if your Netflix app tried to load them from the main servers in the US. In a nutshell… This is what happens when you hit that Play button: Hundreds of microservices, or tiny independent programs, work together to make one large Netflix service. Content legally acquired or licensed is converted into a size that fits your screen, and protected from being copied. Servers across the world make a copy of it and store it so that the closest one to you delivers it at max quality and speed.
When you select a show, your Netflix app cherry-picks which of these servers it will load the video from. You are now gripped by Frank Underwood's chilling tactics, given depression by BoJack Horseman's rollercoaster life, tickled by Dev in Master of None and made phobic to the future of technology by the stories in Black Mirror. And your lifespan decreases as your binge watching turns you into a couch potato. It looked so simple before, right? News Roundup Moving FreshPorts (http://dan.langille.org/2017/11/15/moving-freshports/) Today I moved the FreshPorts website from one server to another. My goal is for nobody to notice. In preparation for this move, I have: reduced the DNS TTL to 60s, posted to Twitter, updated the status page, and put the website in offline mode. What was missed I turned off commit processing on the new server, but I did not do this on the old server. I should have: sudo svc -d /var/service/freshports That stops processing of incoming commits. No data is lost, but it keeps the two databases at the same spot in history. Commit processing could continue during the database dumping, but that does not affect the dump, which will be consistent regardless. The offline code Here is the basic stuff I used to put the website into offline mode. The main points are: header("HTTP/1.1 503 Service Unavailable"); ErrorDocument 404 /index.php I move the DocumentRoot to a new directory, containing only index.php. Every error invokes index.php, which returns a 503 code. The dump The database dump just started (Sun Nov 5 17:07:22 UTC 2017). root@pg96:~ # /usr/bin/time pg_dump -h 206.127.23.226 -Fc -U dan freshports.org > freshports.org.9.6.dump That should take about 30 minutes. I have set a timer to remind me. Total time was: 1464.82 real 1324.96 user 37.22 sys The MD5 is: MD5 (freshports.org.9.6.dump) = 5249b45a93332b8344c9ce01245a05d5 It is now: Sun Nov 5 17:34:07 UTC 2017 The rsync The rsync should take about 10-20 minutes. I have already done an rsync of yesterday's dump file. The rsync today should copy over only the deltas (i.e. differences). The rsync started at about Sun Nov 5 17:36:05 UTC 2017 That took 2m9.091s The MD5 matches. The restore The restore should take about 30 minutes. I ran this test yesterday. It is now Sun Nov 5 17:40:03 UTC 2017. $ createdb -T template0 -E SQL_ASCII freshports.testing $ time pg_restore -j 16 -d freshports.testing freshports.org.9.6.dump Done. real 25m21.108s user 1m57.508s sys 0m15.172s It is now Sun Nov 5 18:06:22 UTC 2017. Insert break here About here, I took a 30 minute break to run an errand. It was worth it. Changing DNS I'm ready to change DNS now. It is Sun Nov 5 19:49:20 EST 2017 Done. And nearly immediately, traffic started. How many misses? During this process, XXXXX requests were declined: $ grep -c '" 503 ' /usr/websites/log/freshports.org-access.log XXXXX That's it, we're done Total elapsed time: 1 hour 48 minutes. There are still a number of things to follow up on, but that was the transfer. The new FreshPorts Server (http://dan.langille.org/2017/11/17/x8dtu-3/) ***
Using bhyve on top of CEPH (https://lists.freebsd.org/pipermail/freebsd-virtualization/2017-November/005876.html) Hi, Just an info point. I'm preparing for a lecture tomorrow, and thought why not do an actual demo.... Like to be friends with Murphy :) So after I started the cluster: 5 jails with 7 OSDs. This is what I manually needed to do to boot a memory stick. Start a bhyve instance by importing the image and mapping it via ggate:
```
rbd --dest-pool rbddata --no-progress import memstick.img memstick
rbd-ggate map rbddata/memstick
```
The ggate device is available on /dev/ggate1. Load the required kernel modules and create the network plumbing:
```
kldload vmm
kldload nmdm
kldload if_tap
kldload if_bridge
kldload cpuctl
sysctl net.link.tap.up_on_open=1
ifconfig bridge0 create
ifconfig bridge0 addm em0 up
ifconfig tap11 create
ifconfig bridge0 addm tap11
ifconfig tap11 up
```
Load the GGate disk in bhyve and boot a single VM from it:
```
bhyveload -c /dev/nmdm11A -m 2G -d /dev/ggate1 FB11
bhyve -H -P -A -c 1 -m 2G -l com1,/dev/nmdm11A -s 0:0,hostbridge -s 1:0,lpc -s 2:0,virtio-net,tap11 -s 4,ahci-hd,/dev/ggate1 FB11 &
bhyvectl --vm=FB11 --get-stats
```
Connect to the VM:
```
cu -l /dev/nmdm11B
```
And that'll give you a bhyve VM running on an RBD image over ggate. In the installer I tested reading from the boot disk:
```
root@:/ # dd if=/dev/ada0 of=/dev/null bs=32M
21+1 records in
21+1 records out
734077952 bytes transferred in 5.306260 secs (138341865 bytes/sec)
```
which is a nice 138MB/sec.
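When the demo is over, tearing things down would look something like this; a sketch, with the subcommands taken from bhyvectl(8) and rbd-ggate(8) as I understand them rather than from Willem's mail:
```
# destroy the VM instance and release its resources
bhyvectl --vm=FB11 --destroy

# detach the RBD-backed ggate device
rbd-ggate unmap /dev/ggate1
```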
Hope the demonstration works out tomorrow. --WjW *** Donald Knuth - The Patron Saint of Yak Shaves (http://yakshav.es/the-patron-saint-of-yakshaves/) Excerpts: In 2015, I gave a talk in which I called Donald Knuth the Patron Saint of Yak Shaves. The reason is that Donald Knuth achieved the most perfect and long-running yak shave: TeX. I figured this is worth repeating. How to achieve the ultimate Yak Shave The ultimate yak shave is the combination of improbable circumstance, the privilege to be able to shave at your heart's will, and the will to follow things through to the end. Here's the way it was achieved with TeX. The retelling is purely mine, inaccurate, and obviously there for fun. I'll avoid the most boring facts that everyone always tells, such as why Knuth's checks have their own Wikipedia page. Community Shaving is Best Shaving Since the release of TeX, the community has been busy working on using it as a platform. If you ever downloaded the full TeX distribution, please bear in mind that you are downloading the amassed work of over 40 years, to make sure that each and every TeX document ever written builds. We're talking about documents here. But mostly, two big projects sprung out of that. The first is LaTeX by Leslie Lamport. Lamport is a very productive researcher, famous for research in formal methods through TLA+ and also known for laying the groundwork for many distributed algorithms. LaTeX is based on the idea of separating presentation and content. It is based around the idea of document classes, which then describe the way a certain document is laid out. Think Markdown, just much more complex. The second is ConTeXt, which is far more focused on fine-grained layout control. The Moral of the Story Whenever you feel like "can't we just replace this whole thing, it can't be so hard" when handling TeX, don't forget how many years of work and especially knowledge were poured into that system. Typesetting isn't the most popular knowledge among programmers. Especially see it in the context of the space it is in: they can't remove legacy. Ever. That would break documents. TeX is also not a programming language. It might resemble one, but mostly, it should be approached as a typesetting system first. A lot of its confusing lingo gets much better then. It's not programming lingo. By approaching TeX with an understanding of its history, a lot of things can be learned from it. And yes, a replacement would be great, but it would take ages. In any case, I hope I thoroughly convinced you why Donald Knuth is the Patron Saint of Yak Shaves. Extra Credits This comes out of an enjoyable discussion with [Arne from Lambda Island](https://lambdaisland.com/), who listened and said "you should totally turn this into a talk". Vincent's trip to EuroBSDCon 2017 (http://www.vincentdelft.be/post/post_20171016) My euroBSDCon 2017 Posted on 2017-10-16 09:43:00 from Vincent in OpenBSD Let me just share my feedback on those 2 days spent in Paris for EuroBSDCon, my 1st BSDCon. I'm not a developer, contributor, ... Do not expect to improve your OpenBSD skills with this text :-) I know, we are on October 16th and the EuroBSDCon in Paris was 3 weeks ago :( I'm not quick!!! Sorry for that. I arrive at 10h, too late for the start of the keynote. The few people behind the desk welcome me by talking in Dutch, mainly because of my name. Indeed, Delft is a city in the Netherlands, but also a well-known university. I inform them that I'm from Belgium, and the discussion moves to the fact that Fosdem is located in Brussels. I receive my nice white and blue T-shirt, a bit like the marine T-shirts, but with the nice EuroBSDCon logo. I ask where the different rooms reserved for the BSD event are. We have one big room on the 1st floor, one medium room one level below, and two small ones one level above. All are really easy to access. In the entrance we have 4 or 5 tables with people representing their companies. Those are mainly the big sponsors of the event, providing details about their activities and business. I talk a little bit with StormShield and Gandi. At other tables people are selling BSD T-shirts, and they quickly sell out. "Is it done yet?" The never ending story of pkg tools At the last Fosdem I had already heard Antoine and Baptiste presenting the OpenBSD and FreeBSD battle, so I decide to listen to Marc Espie in the medium room, called Karnak. Marc explains that he has completely rewritten the pkg_add command. He explains that, contrary to other parts of OpenBSD, the package tools must be backward compatible and stable over a longer period than 12 months (the support period for OpenBSD). On the funny side, he explains that he gets his best ideas in the bath. Hackathons are also used to validate some ideas with other OpenBSD developers. All in all, he explains that the most time-consuming part is to imagine a good solution. Coding it is quite straightforward. He adds that the better an idea is, the shorter the implementation will be. A Tale of six motherboards, three BSDs and coreboot After lunch I decide to listen to the talk about Coreboot. Indeed, 1 or 2 years ago I had listened to a talk on the Libreboot project at Fosdem. Since they made several references to Coreboot, it's a perfect occasion to learn more about this project. Piotr and Katazyba Kubaj explain how to boot a machine without the native BIOS. Indeed, Coreboot can replace the BIOS, and de facto avoid several binaries imposed by the vendor. They explain that some motherboards support their code. But they also show how difficult it is to flash a BIOS and replace it with Coreboot. They even destroyed a motherboard during an installation, apparently because the power supply they were using was not stable enough on the 3V line.
It's really amazing to see that open source developers can go, by themselves, to such a deep technical level. State of the DragonFly's graphics stack After this Coreboot talk, I decide to stay in the room to follow the presentation of François Tigeot. François is now one of the core developers of DragonFlyBSD, an amazing BSD system with its own filesystem, called HAMMER. HAMMER offers several amazing features like snapshots, checksummed data integrity, deduplication, ... François has spent the last few years integrating the video drivers developed for Linux into DragonFlyBSD. He explains that instead of adapting this video-card code to the kernel API of DragonFlyBSD, he has "simply" built an intermediate layer between the kernel of DragonFlyBSD and the video drivers. This is not said in the talk, but this effort is very impressive. Indeed, this is more or less a Linux emulation layer inside DragonFlyBSD. François explains that he started with the Intel video driver (drm/i915), but is now able to run drm/radeon quite well, as well as drm/amdgpu and drm/nouveau. Discovering OpenBSD on AWS Then I move to the small room on the upper level to follow a presentation by Laurent Bernaille on OpenBSD and AWS. First Laurent explains that he is re-using the work done by Antoine Jacoutot concerning the integration of OpenBSD into AWS. But on top of that he has integrated several other open source solutions allowing him to build OpenBSD machines very quickly with one command. Moreover, those machines will have the network config, the required packages, ... On top of the slides presented, he shows us, in a real demo, how this system works. An amazing presentation which shows that, by putting the correct tools together, a machine can build and configure other machines in one go. OpenBSD Testing Infrastructure Behind bluhm.genua.de Here Jan Klemkow explains that he has set up a lab where he is able to run different OpenBSD architectures. The system has been designed to install, on demand, a certain version of OpenBSD on the different available machines. On top of that, a regression test script can be triggered. This provides reports showing what is working and what is no longer working on the different machines. If I've understood correctly, Jan is willing to provide such a lab to the core developers of OpenBSD in order to allow them to validate their code easily and quickly. Some more effort is needed to reach this goal, but with what exists today, Jan and his colleague are quite close. Since his company is using OpenBSD for business, in his eyes this system is a "tit for tat" to the OpenBSD community. French story on cybercrime Then comes the second keynote of the day in the big auditorium. This talk is given by a colonel of the French gendarmerie, Mr Freyssinet, who is head of the cybercrime unit inside the Gendarmerie. Mr Freyssinet explains that the "bad guys" are more and more volatile across countries, and more and more organized. The lone hacker in his room is no longer the reality. As a consequence, the different national police investigators are collaborating more inside an organization called Interpol. What is amazing in his talk is that Mr Freyssinet talks about "crime as a service". Indeed, more and more hackers are selling their services to "bad and temporary organizations". Social event It's now time for the famous social event on the river: la Seine. The organizers ask us to go, in small groups, to a station. It's a 15 minute walk through Paris.
Luckily, the weather is perfect. To be clearly identifiable, several organizers carry a "beastie fork" in their hands as they walk along the sidewalk, generating some amazing reactions from citizens and tourists. Some of them recognize the FreeBSD logo and ask us for details. Amazing :-) We walk along small and big sidewalks until we reach a small stairway going under the street. There, we find a train station, a bit like a metro station. Three stops later they ask us to get out. We walk a few minutes and arrive in front of a boat with two decks: one inside, with nice tables and chairs, and one on the roof. The crew asks us to go up to the second deck, where we are welcomed with a glass of wine. The Eiffel Tower is just a few hundred meters from us. Every hour the Eiffel Tower blinks for 5 minutes with thousands of small lights. Brilliant :-) We also see the "statue de la Liberté" (the small one), which is on a small island in the middle of the river. During the whole night the bar is open with drinks and some appetizers, snacks, ... Such a walking dinner is perfect for talking with many different people. I've talked with several people who just use BSD; like me, they are not deep, specialized developers. One was from Switzerland, another from Austria, and another from the Netherlands. But I've also followed a discussion with Theo de Raadt and several people from the FreeBSD Foundation. Some are very technical guys, others just users, like me, but all with the same passion for one of the BSD systems. Amazing evening. OpenBSD's small steps towards DTrace (a tale about DDB and CTF) On the second day, I decide to sleep long enough to have the energy to drive back home (3 hours by car). So I miss the first presentations, and arrive at the event around 10h30. Lots of people are already present. Some faces are less "fresh" than others. I decide to listen to the talk on DTrace in OpenBSD. After 10 minutes I am so lost in the very technical explanations that I decide to open up my PC. My OpenBSD laptop rarely leaves my home, so I've never needed a screen locking system; in a crowded environment, having one is better. So I was looking for a simple solution: I've looked at how to use xlock and combined it with the /etc/apm/suspend script, ... OpenBSD is always very easy to use :-) The OpenBSD web stack Then I decide to follow the presentation of Michael W Lucas, a well-known person for his different books ("Absolute OpenBSD", "Relayd", ...). Michael talks about the httpd daemon inside OpenBSD. But he also presents its integration with CARP, relayd, PF, FastCGI, and the rules based on Lua regexes (as opposed to Perl regexes), ... For sure, he emphasizes the security aspects of those tools: privilege separation, chroot, ... OpenSMTPD, current state of affairs Then I follow the presentation of Gilles Chehade about the OpenSMTPD project. An amazing presentation that, on top of the technical challenges, shows how to manage such a project across the years. Gilles has been working on OpenSMTPD since 2007, thus 10 years!!! He explains the different decisions they took to make the software as simple as possible to use, but as secure as possible, too: privilege separation, chroot, pledge, randomized malloc, ... The development started on BSD systems, but once it became quite well known, they received lots of contributions from Linux developers. Hoisting: lessons learned integrating pledge into 500 programs After a small break, I decide to listen to Theo de Raadt, the founder of OpenBSD.
In his own style, with trekking boots, shorts and backpack, Theo starts by saying that pledge is the outcome of nightmares. Theo explains that the paper called "Hacking Blind", presenting BROP (blind return-oriented programming), has worried him for a few years. That's why he developed pledge as a tool that kills a process as soon as possible when there is unforeseen behavior in the program. For example, with pledge a program which can only write to disk will be immediately killed if it tries to reach the network. By implementing pledge in the roughly 500 programs present in the base system, OpenBSD is becoming more secure and more robust. Conclusion My first EuroBSDCon was a great, interesting and cool event. I've talked with several BSD enthusiasts. I've been using OpenBSD since 2010, but I'm not a developer, so I was worried about being "lost" in the middle of experts. In fact, that was not the case. At EuroBSDCon you have many different types of BSD enthusiasts and users. What is nice about EuroBSDCon is that the organizers take care of everything for you. You just have to sit and listen. They even plan how you will spend Saturday evening, in a fun and very cool way. The small drawback is that all of this has a cost. In my case the whole weekend cost me a bit more than 500 euro. Based on what I've learned and seen, this is a very acceptable price. Nearly all the presentations I saw gave me valuable input for my daily job. For sure, the total price is also linked to my personal choices: hotel, parking. And I'm surely biased because I'm used to going to Fosdem in Brussels, which costs nothing (entrance) and is approximately 45 minutes from my home. But Fosdem doesn't have the same atmosphere, and the presentations are less linked to my daily job. I do not regret my trip to EuroBSDCon and will surely plan other ones. Beastie Bits Important munitions lawyering (https://www.jwz.org/blog/2017/10/important-munitions-lawyering/) AsiaBSDCon 2018 CFP is now open, until December 15th (https://2018.asiabsdcon.org/) ZSTD Compression for ZFS by Allan Jude (https://www.youtube.com/watch?v=hWnWEitDPlM&feature=share) NetBSD on Allwinner SoCs Update (https://blog.netbsd.org/tnf/entry/netbsd_on_allwinner_socs_update) *** Feedback/Questions Tim - Creating Multi Boot USB sticks (http://dpaste.com/0FKTJK3#wrap) Nomen - ZFS Questions (http://dpaste.com/1HY5MFB) JJ - Questions (http://dpaste.com/3ZGNSK9#wrap) Lars - Hardening Diffie-Hellman (http://dpaste.com/3TRXXN4) ***
221: BSD in Taiwan
Allan reports on his trip to BSD Taiwan, new versions of Lumina and GhostBSD are here, and a bunch of OpenBSD p2k17 hackathon reports. This episode was brought to you by Headlines Allan's Trip Report from BSD Taiwan (https://bsdtw.org/) BSD TW, and Taiwan in general, was a fun and interesting experience. I arrived Thursday night and took the high speed train to Taipei main station, and then got on the Red line subway to the venue. The dorm rooms were on par with BSDCan, except the mattress was better. I spent Friday with a number of other FreeBSD developers doing touristy things. We went to Taipei 101, the world's tallest building from 2004 to 2010. It also features the world's fastest elevator (2004 - 2016), traveling at 60.6 km/h and transporting passengers from the 5th to the 89th floor in 37 seconds. We also got to see the "tuned mass damper", a 660 tonne steel pendulum suspended between the 92nd and 87th floors. This device resists the swaying of the building caused by high winds. There are interesting videos on display beside the damper, of its reaction during recent typhoons and earthquakes. The Taipei 101 building sits just 200 meters from a major fault line. Then we had excellent dumplings for lunch. After walking around the city for a few more hours, we retired to a pub to escape the heat of the sunny Friday afternoon. Then came the best part of each day in Taipei, dinner! We continued our efforts to cause a nationwide shortage of dumplings. Special thanks to Scott Tsai (https://twitter.com/scottttw), who took detailed notes for each of the presentations. Saturday marked the start of the conference: Arun Thomas provided background and then a rundown of what is happening with the RISC-V architecture. Notes (https://docs.google.com/document/d/1yrnhNTHaMDr4DG-iviXN0O9NES9Lmlc7sWVQhnios6g/edit#heading=h.kcm1n3yzl35q) George Neville-Neil talked about using DTrace in distributed systems as an in-depth auditing system (who did what to whom and when). Notes (https://docs.google.com/document/d/1qut6tMVF8NesrGHd6bydLDN-aKBdXMgHx8Vp3_iGKjQ/edit#heading=h.qdghsgk1bgtl) Baptiste Daroussin presented Poudrière image, an extension of everyone's favourite package building system, to build custom images of FreeBSD. There was discussion of making this generate ZFS based images as well, making it mesh very well with my talk the next day. Notes (https://docs.google.com/document/d/1LceXj8IWJeTRHp9KzOYy8tpM00Fzt7fSN0Gw83B9COE/edit#heading=h.incfzi6bnzxr) Brooks Davis presented his work on an API design for a replacement for mmap. It started with a history of address space management in the BSD family of operating systems, going all the way back to the beginning. This overview of the feature and how it evolved filled in many gaps for me, and showed why the newer work would be beneficial. The motivation for the work includes further extensions to support the CHERI hardware platform. Notes (https://docs.google.com/document/d/1LceXj8IWJeTRHp9KzOYy8tpM00Fzt7fSN0Gw83B9COE/edit#heading=h.incfzi6bnzxr) Johannes M Dieterich gave an interesting presentation about using FreeBSD and GPU acceleration for high performance computing. One of the slides showed that amd64 has taken almost the entire market for the top 500 supercomputers, and that Linux dominates the list, with only a few remaining non-Linux systems. Sadly, at the supercomputing conference the next week, it was announced that Linux has achieved 100% saturation of the top 500 supercomputers list.
Johannes detailed the available tools, what ports are missing, what changes should be made to the base system (mostly OpenMP), and generally what FreeBSD needs to do to become a player in the supercomputer OS market. Johannes' perspective is interesting, as he is a computational chemist, not a computer scientist. Those interested in improving the numerical libraries and GPU acceleration frameworks on FreeBSD should join the ports team. Notes (https://docs.google.com/document/d/1uaJiqtPk8WetST6_GnQwIV49bj790qx7ToY2BHC9zO4/edit#heading=h.nvsz1n6w3gyq) The final talk of the day was Peter Grehan, who spoke about how graphics support in bhyve came to be. He provided a history of how the feature evolved, and where it stands today. Notes (https://docs.google.com/document/d/1LqJQJUwdUwWZ0n5KwCH1vNI8jiWGJlI1j0It3mERN80/edit#heading=h.sgeixwgz7bjs) Afterwards, we traveled as a group to a large restaurant for dinner. There was even Mongolian Vodka, provided by Ganbold Tsagaankhuu of the FreeBSD project. Sunday: The first talk of the day Sunday was mine. I presented "ZFS: Advanced Integration", mostly talking about how boot environments work, and the new libbe and be(1) tools that my GSoC student Kyle Kneitinger created to manage them. I talked about how they can be used for laptop and developer systems, but also how boot environments can be used to replace nanobsd for appliances (as already done in FreeNAS and pfSense). I also presented zfsbootcfg (zfs nextboot), and some future extensions to it to make it even more useful in appliance type workloads. I also provided a rundown of new developments out of the ZFS developer summit, two weeks previous. Notes (https://docs.google.com/document/d/1Blh3Dulf0O91A0mwv34UnIgxRZaS_0FU2lZ41KRQoOU/edit#heading=h.gypim387e8hy) Theo de Raadt presented "Mitigations and other real Security Features", and made his case for changing to a 'fail closed' mode of interoperability. Computers cannot actually self-heal, so let's stop pretending that they can. Notes (https://docs.google.com/document/d/1fFHzlxJjbHPsV9t_Uh3PXZnXmkapAK5RkJsfaHki7kc/edit#heading=h.192e4lmbl70c) Ruslan Bukin talked about doing the port of FreeBSD for RISC-V and writing the device drivers. Ruslan walked through the process step by step, leading members of the audience to suggest he turn it into a developer's handbook article explaining how to do the initial bringup on new hardware. Ruslan also showed off a FreeBSD/MIPS board he designed himself and had manufactured in China. Notes (https://docs.google.com/document/d/1kRhRr3O3lQ-0dS0kYF0oh_S0_zFufEwrdFjG1QLyk8Y/edit#heading=h.293mameym7w1) Mariusz Zaborski presented case studies on sandboxing the base system with Capsicum. He discussed the challenges encountered as existing programs are modified to sandbox them, and recent advancements in the debugging tools available during that process. Mariusz also discussed the Casper service at length, including the features that are planned for 2018 and onwards. Notes (https://docs.google.com/document/d/1_0BpAE1jGr94taUlgLfSWlJOYU5II9o7Y3ol0ym1eZQ/edit#heading=h.xm9mh7dh6bay) The final presentation of the day was Mark Johnston on memory management improvements in FreeBSD 12.0. This talk provided a very nice overview of the memory management system in FreeBSD, and then detailed some of the recent improvements.
Notes (https://docs.google.com/document/d/1gFQXxsHM66GQGMO4-yoeFRTcmOP4NK_ujVFHIQJi82U/edit#heading=h.uirc9jyyti7w) The conference wrapped up with the Work-in-Progress session, including updates on: multi-device-at-once GELI attach, MP-safe networking on NetBSD, pkgsrc, NetBSD in general, BSD on Microsoft Azure, Mothra (send-pr for bugzilla), BSDMizer (a machine learning compiler optimizer), Hyperledger Sawtooth (blockchain), and finally VIMAGE and pf testing on FreeBSD. Notes (https://docs.google.com/document/d/1miHZEPrqrpCTh8JONmUKWDPYUmTuG2lbsVrWDtekvLc/edit#heading=h.orhedpjis5po) Group Photo (https://pbs.twimg.com/media/DOh1txnVoAAFKAa.jpg:large) BSDTW was a great conference. They are still considering whether it should be an annual event, alternate every second year with AsiaBSDCon, or something else. In order to continue, BSD Taiwan requires more organizers and volunteers. They have regular meetups in Taipei if you are interested in getting involved. *** Lumina 1.4.0 released (https://lumina-desktop.org/version-1-4-0-released/) The Lumina Theme Engine (and associated configuration utility) The Lumina theme engine is a new component of the “core” desktop, and provides enhanced theming capabilities for the desktop as well as all Qt5 applications. While it started out life as a fork of the “qt5ct” utility, it quickly grew all sorts of new features and functionality such as system-defined color profiles, modular theme components, and built-in editors/creators for all components. The backend of this engine is a standardized theme plugin for the Qt5 toolkit, so that all Qt5 applications will now present a unified appearance (if the application does not enforce a specific appearance/theme of its own). Users of the Lumina desktop will automatically have this plugin enabled: no special action is required. Please note that the older desktop theme system for Lumina has been rendered obsolete by the new engine, but a settings-conversion path has already been implemented which should transition your current settings to the new engine the first time you log in to Lumina 1.4.0. Custom themes for the older system may not be converted, but it is trivial to copy/paste any custom stylesheets from the old system into the editor for the new theme engine to register/re-apply them as desired. Lumina-Themes Repository I also want to give a shout-out to the trueos/lumina-themes github repository contributors. All of the wallpapers in the 1.4.0 screenshots I posted come from that package, and they are working on making more wallpapers, color palettes, and desktop styles for use with the Lumina Theme Engine. If your operating system does not currently provide a package for lumina-themes, I highly recommend that you make one as soon as possible! The Lumina PDF Viewer (lumina-pdf) This is a new, stand-alone desktop utility for viewing/printing/presenting PDF documents. It uses the poppler-qt5 library in the backend for rendering the document, but uses multi-threading in many ways (such as to speed up the loading of pages) to give the user a nice, streamlined utility for viewing PDF documents. There is also built-in presentation functionality which allows users to easily cast the document to a separate screen without mucking about in system menus or configuration utilities.
Lumina PDF Viewer (1.4.0) Important Packaging Changes One significant change of note for people who are packaging Lumina for their particular operating system is that the minimum supported versions of Qt for Lumina have been changed with this release: lumina-core: Qt 5.4+ lumina-mediaplayer: Qt 5.7+ Everything else: Qt 5.2+ Of course, using the latest version of the Qt5 libraries is always recommended. When packaging for Linux distributions, the theme engine also requires the availability of some of the “-dev” packages for Qt itself when compiling the theme plugin. For additional information (specifically regarding Ubuntu builds), please take a look at a recent ticket on the Lumina repository. The new lumina-pdf utility requires the availability of the “poppler-qt5” library. The includes for this library on Ubuntu 17.10 were found to be installed outside of the normal include directories, so a special rule for it was added to our OS-Detect file in the Lumina source tree. If your particular operating system also places the poppler include files in a non-standard place, please patch that file or send us the information and we can add more special rules for your particular OS. Other Changes of Note (in no particular order) lumina-config: Add a new page for changing the audio theme (login, logout, low battery). Add an option to replace fluxbox with some other WM (with appropriate warnings). Have the “themes” page redirect to launching the Lumina theme engine configuration utility. start-lumina-desktop: Auto-detect the active X11 displays and create a new display for the Lumina session (prevents conflicts with prior graphical sessions). Add a process-failure counter & restart mechanism. This is particularly useful for restarting Fluxbox from time to time (such as after any monitor addition/removal). lumina-xconfig: Restart fluxbox after making any monitor changes with xrandr. This ensures a more reliable session. Implement a new 2D monitor layout mechanism. This allows for the placement of monitors anywhere in the X/Y plane, with simplification buttons for auto-tiling the monitors in each dimension based on their current location. Add the ability to save/load monitor profiles. Distinguish between the “default” monitor arrangement and the “current” monitor arrangement. Allow the user to set the current arrangement as the new default. lumina-desktop: Completely revamp the icon loading mechanisms so it should auto-update when the theme changes. Speed up the initialization of the desktop quite a bit. Prevent loading/probing files in the “/net/” path for existence (assume they exist in the interest of providing shortcuts). On FreeBSD, these are special paths that actually pause the calling process in order to mount/load a network share before resuming the process, and can cause significant “hangs” in the desktop process. Add the ability to take a directory as a target for the wallpaper. This will open/probe the directory for any existing image files that it can use as a wallpaper and randomly select one. Remove the popup dialog prompting about system updates, and replace it with new “Restart (with updates)” buttons on the appropriate menus/windows instead. If no wallpaper selection is provided, try to use the “lumina-nature” wallpaper directory as the default, otherwise fall back on the original default wallpaper if the “lumina-themes” package is not installed. lumina-open: Make the *.desktop parsing a bit more flexible regarding quoted strings where there should not be any.
If selecting which application to use, only overwrite the user-default app if the option is explicitly selected. lumina-fileinfo: Significant cleanup of this utility. Now it can be reliably used for creating/registering XDG application shortcuts. Add a whole host of new ZFS integrations: If a ZFS dataset is being examined, show all the ZFS properties for that dataset. If the file being examined exists within ZFS snapshots, show all the snapshots of the file. lumina-fm: Significant use of additional multi-threading. Makes the loading of directories much faster (particularly ones with image files which need thumbnails). Add detection/warning when running as the root user. Also add an option to launch a new instance of lumina-fm as the root user. [FreeBSD/TrueOS] Fix up the detection of the “External Devices” list to also list available devices for the autofs system. Fix up some drag and drop functionality. Expose the creation, extraction, and insertion of files into archives (requires lumina-archiver at runtime). Expand the “Open With” option into a menu of application suggestions in addition to the “Other” option which runs “lumina-open” to find an application. Provide an option to set the desktop wallpaper to the selected image file(s) (if the running desktop session is Lumina). lumina-mediaplayer: Enable the ability to play back local video files. (NOTE: If Qt5 is set to use the gstreamer multimedia backend, make sure you have the “GL” plugin installed for smooth video playback.) lumina-archiver: Add CLI flags for auto-archive and auto-extract. This allows for programmatic/scriptable interactions with archives. That is not mentioning all of the little bugfixes, performance tweaks, and more that are also included in this release. *** The strongest KASLR, ever? (https://blog.netbsd.org/tnf/entry/the_strongest_kaslr_ever) Re: amd64: kernel aslr support (https://mail-index.netbsd.org/tech-kern/2017/11/14/msg022594.html) So, I did it. Now the kernel sections are split in sub-blocks, and are all randomized independently. See my drawing [1]. What it means in practice is that Kernel ASLR is much more difficult to defeat: a cache attack will at most allow you to know that a given range is mapped as executable, for example, but you don't know which sub-block of .text it is; a kernel pointer leak will at most allow you to reconstruct the layout of one sub-block, but you don't know the layout and address of the remaining blocks, and there can be many. The size and number of these blocks is controlled by the split-by-file parameter in Makefile.amd64. Right now it is set to 2MB, which produces a kernel with ~23 allocatable (i.e. useful at runtime) sections, which is a third of the total number supported (BTSPACE_NSEGS = 64). I will probably reduce this parameter a bit in the future, to 1.5MB, or even 1MB. All of that leaves us with about the most advanced KASLR implementation available out there. There are ways to improve it even more, but you'll have to wait a few weeks for that. If you want to try it out you need to make sure you have the latest versions of GENERIC_KASLR / prekern / bootloader. The instructions are still here, and haven't changed. Initial design As I said in the previous episode, I added in October a Kernel ASLR implementation in NetBSD for 64-bit x86 CPUs. This implementation would randomize the location of the kernel in virtual memory as one block: a random VA would be chosen, and the kernel ELF sections would be mapped contiguously starting from there. This design had several drawbacks: one leak, or one successful cache attack, could be enough to reconstruct the layout of the entire kernel and defeat KASLR.
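To see why the single-block design is so fragile, consider a toy computation (the addresses and offsets here are invented for illustration, not taken from NetBSD): with one contiguous mapping, a single leaked pointer plus the publicly known ELF layout of the kernel reveals the randomized base, and with it the address of everything else.

```
# Toy numbers, purely illustrative: under single-block KASLR, one
# leaked pointer minus that symbol's known ELF offset yields the base.
$ printf 'kernel base: 0x%x\n' $(( 0xc0a1b2c4 - 0x00a1b2c4 ))
kernel base: 0xc0000000
```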
NetBSD’s new KASLR design significantly improves this situation. New design In the new design, each kernel ELF section is randomized independently. That is to say, the base addresses of .text, .rodata, .data and .bss are not correlated. KASLR is already at this stage more difficult to defeat, since you would need a leak or cache attack on each of the kernel sections in order to reconstruct the in-memory kernel layout. Then, starting from there, several techniques are used to strengthen the implementation even more. Sub-blocks The kernel ELF sections are themselves split in sub-blocks of approximately 1MB. The kernel therefore goes from having { .text .rodata .data .bss } to having { .text .text.0 .text.1 ... .text.i .rodata .rodata.0 ... .rodata.j ... .data ...etc }. As of today, this produces a kernel with ~33 sections, each of which is mapped at a random address and in a random order. This implies that there can be dozens of .text segments. Therefore, even if you are able to conduct a cache attack and determine that a given range of memory is mapped as executable, you don’t know which sub-block of .text it is. If you manage to obtain a kernel pointer via a leak, you can at most guess the address of the section it finds itself in, but you don’t know the layout of the remaining 32 sections. In other words, defeating this KASLR implementation is much more complicated than in the initial design. Higher entropy Each section is put in a 2MB-sized physical memory chunk. Given that the sections are 1MB in size, this leaves half of the 2MB chunk unused. Once in control, the prekern shifts the section within the chunk using a random offset, aligned to the ELF alignment constraint. This offset has a maximum value of 1MB, so that once shifted the section still resides in its initial 2MB chunk. The prekern then maps these 2MB physical chunks at random virtual addresses; but addresses aligned to 2MB. For example, the two sections in Fig. A will be mapped at two distinct VAs. There is a reason the sections are shifted in memory: it offers higher entropy. If we consider a .text.i section with a 64-byte ELF alignment constraint, and give a look at the number of possibilities for the location of the section in memory: The prekern shifts the 1MB section in its 2MB chunk, with an offset aligned to 64 bytes. So there are (2MB-1MB)/(64B) = 2^14 possibilities for the offset. Then, the prekern uses a 2MB-sized 2MB-aligned range of VA, chosen in a 2GB window. So there are (2GB-2MB)/(2MB) = 2^10 - 1 possibilities for the VA. Therefore, there are 2^14 x (2^10 - 1) ≈ 2^24 possible locations for the section. As a comparison with other systems:

OS      | # of possibilities
Linux   | 2^6
MacOS   | 2^8
Windows | 2^13
NetBSD  | 2^24

Of course, we are talking about one .text.i section here; the sections that will be mapped afterwards will have fewer location possibilities because some slots will be already occupied. However, this does not alter the fact that the resulting entropy is still higher than that of the other implementations. Note also that several sections have an alignment constraint smaller than 64 bytes, and that in such cases the entropy is even higher. Large pages There is also a reason we chose to use 2MB-aligned 2MB-sized ranges of VAs: when the kernel is in control and initializes itself, it can now use large pages to map the physical 2MB chunks. This greatly improves memory access performance at the CPU level.
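The arithmetic above is easy to sanity-check; here is a quick shell verification of the figures quoted in the article (all sizes in bytes):

```
# offsets within the 2MB chunk: (2MB - 1MB) / 64B = 16384 = 2^14
$ echo $(( (2*1024*1024 - 1024*1024) / 64 ))
# 2MB-aligned VA slots in a 2GB window: 1023 = 2^10 - 1
$ echo $(( (2*1024*1024*1024 - 2*1024*1024) / (2*1024*1024) ))
# product: 16760832, just under 2^24 = 16777216
$ echo $(( 16384 * 1023 ))
```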
Countermeasures against TLB cache attacks With the memory shift explained above, randomness is therefore enforced at both the physical and virtual levels: the address of the first page of a section does not equal the address of the section itself anymore. It has, as a side effect, an interesting property: it can mostly mitigate TLB cache attacks. Such attacks operate at the virtual-page level; they will allow you to know that a given large page is mapped as executable, but you don’t know where exactly within that page the section actually begins. Strong? This KASLR implementation, which splits the kernel in dozens of sub-blocks, randomizes them independently, while at the same time allowing for higher entropy in a way that offers large page support and some countermeasures against TLB cache attacks, appears to be the most advanced KASLR implementation available publicly as of today. Feel free to prove me wrong, I would be happy to know! WIP Even if it is in a functional state, this implementation is still a work in progress, and some of the issues mentioned in the previous blog post haven't been addressed yet. But feel free to test it and report any issue you encounter. Instructions on how to use this implementation can still be found in the previous blog post, and haven’t changed since. See you in the next episode! News Roundup GhostBSD 11.1 Finally Ready and Available! (http://www.ghostbsd.org/11.1_release_announcement) Screenshots (https://imgur.com/a/Mu8xk) After a year of development, testing, debugging and working on our software package repository, we are pleased to announce that the release of GhostBSD 11.1 is now available on the 64-bit (amd64) architecture with MATE and XFCE desktops, as direct and torrent downloads. With 11.1 we dropped 32-bit (i386) support, and we now maintain our own software package repository for more stability. What's new in GhostBSD 11.1 GhostBSD software repository Support for VMware Workstation Guest Features New UFS full disk mirroring option in the installer New UFS full disk MBR and GPT options in the installer New UFS full disk swap size option in the installer Whisker Menu as the default application menu on XFCE All software developed by GhostBSD now gets updated ZFS configuration for disk What has been fixed in 11.1? Fix XFCE sound plugin Installer ZFS configuration file setting Installer ZFS setup appears to be incomplete The installer was not listing ZFS disks correctly The installer: the partition list was not cleared when pressing back XFCE and MATE shutdown/suspend/hibernate randomly missing Clicking the 'GhostBSD Bugs' item in the Main menu -> 'System Tools' brings up a 'Server not found' page XFCE installation - incorrect keyboard layout Locale setting not filling in correctly Update Station tray icon The image checksums and hybrid ISO (DVD, USB) images are available at GhostBSD (http://www.ghostbsd.org/download).
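If you want to give it a spin, a typical verify-and-write session from a BSD system might look like the following sketch; the ISO file name and the da0 target device are illustrative, so use the exact name and checksum published on the download page.

```
# File name and device are assumptions; compare the hash against the
# checksum published on the GhostBSD download page before writing.
$ sha256 GhostBSD11.1-MATE-amd64.iso
$ dd if=GhostBSD11.1-MATE-amd64.iso of=/dev/da0 bs=4m
```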
*** p2k17 Hackathon Reports p2k17 Hackathon Report: Matthias Kilian on xpdf, haskell, and more (https://undeadly.org/cgi?action=article;sid=20171107034258) p2k17 Hackathon Report: Herzliche grusse vom Berlin (espie@ on mandoc, misc packages progress) (https://undeadly.org/cgi?action=article;sid=20171107185122) p2k17 Hackathon Report: Paul Irofti (pirofti@) on hotplugd(8), math ports, xhci(4) and other kernel advancements (https://undeadly.org/cgi?action=article;sid=20171107225258) p2k17 Hackathon report: Jeremy Evans on ruby progress, postgresql and webdriver work (https://undeadly.org/cgi?action=article;sid=20171108072117) p2k17 Hackathon report: Christian Weisgerber on random devices, build failures and gettext (https://undeadly.org/cgi?action=article;sid=20171109171447) p2k17 Hackathon report: Sebastian Reitenbach on Puppet progress (https://undeadly.org/cgi?action=article;sid=20171110124645) p2k17 Hackathon Report: Anthony J. Bentley on firmware, games and securing pkg_add runs (https://undeadly.org/cgi?action=article;sid=20171110124656) p2k17 Hackathon Report: Landry Breuil on Mozilla things and much more (https://undeadly.org/cgi?action=article;sid=20171113091807) p2k17 Hackathon report: Florian Obser on network stack progress, kernel relinking and more (https://undeadly.org/cgi?action=article;sid=20171113235334) p2k17 Hackathon report: Antoine Jacoutot on ports+packages progress (https://undeadly.org/cgi?action=article;sid=20171120075903) *** TrueOS Talks Tech and Open Source at Pellissippi State (https://www.trueos.org/blog/trueos-talks-tech-open-source-pellissippi-state/) Ken Moore of the TrueOS project presented a talk to the AITP group at Pellissippi State today entitled “It’s A Unix(-like) system? An Introduction to TrueOS and Open source”. Joshua Smith of the TrueOS project was also in attendance. We were happy to see a good attendance of about 40 individuals who came to hear more about TrueOS and how we continue to innovate along with the FreeBSD project. Many good questions were raised about development, snapshots, cryptocurrency, and cyber-security. We’ve included a copy of the slides if you’d like to have a look at the talk on open source. We’d like to offer a sincere thanks to everyone who attended and offer an extended invitation for you to join us at our KnoxBUG group on October 30th @ the iXsystems offices! We hope to see you soon! Open Source Talk – Slideshare PDF (https://web.trueos.org/wp-content/uploads/2017/10/Open-Source-Talk.pdf) KnoxBug - Lumina Rising : Challenging Desktop Orthodoxy (http://knoxbug.org/content/octobers-talk-available-youtube) Ken gave his talk about the new Lumina 2.0 Window Manager, the same talk he presented at Ohio LinuxFest 2017. KnoxBUG October 2017 (https://youtu.be/w3ZrqxLTnIU) (OLF 2017) Lumina Rising: Challenging Desktop Orthodoxy (https://www.slideshare.net/beanpole135/olf-2017-lumina-rising-challenging-desktop-orthodoxy) *** Official OpenBSD 6.2 CD set - the only one to be made! (https://undeadly.org/cgi?action=article;sid=20171118190325) Our dear friend Bob Beck (beck@) writes: So, again this release the tradition of making Theo do art has continued! Up for sale by auction to the highest bidder on Ebay is the only OpenBSD 6.2 CD set to be produced. The case and CD's feature the 6.2 artwork, custom drawn and signed by Theo. All proceeds go to support OpenBSD. Go have a look at the auction! As with previous OpenBSD auctions, if you are not the successful bidder, we would like to encourage you to donate the equivalent of your highest bid to the project.
The Auction (https://www.ebay.ca/itm/Official-OpenBSD-6-2-CD-Set/253265944606) *** Beastie Bits HAMMER2 userspace on Linux (http://lists.dragonflybsd.org/pipermail/users/2017-October/313646.html) OpenBSD Porting Workshop (now changed to January 3, 2018) (http://www.nycbug.org/index.cgi?action=view&id=10655) Matt Ahrens on when Native Encryption for ZFS will land (https://twitter.com/mahrens1/status/921204908094775296) The first successful build of OpenBSD base system (http://nanxiao.me/en/the-first-successful-build-of-openbsd-base-system/) KnoxBug November Meeting (https://www.meetup.com/KnoxBUG-BSD-Linux-and-FOSS-Users-Unite/events/245291204/) Absolute FreeBSD, 3rd Edition, pre-orders available (https://www.michaelwlucas.com/os/af3e) Feedback/Questions Jon - Jails and Networking (http://dpaste.com/2BEW0HB#wrap) Nathan - bhyve Provisioning (http://dpaste.com/1GHSYJS#wrap) Lian - OpenSSL jumping the Shark (http://dpaste.com/18P8D8C#wrap) Kim - Suggestions (http://dpaste.com/1VE0K9E#wrap) ***
220: Opening ZFS in 2017
We have the first PS4 kernel exploit, the long awaited OpenZFS devsummit report by Allan, DragonflyBSD 5.0 is out, we show you vmadm to manage jails, and parallel processing with Unix tools. This episode was brought to you by Headlines The First PS4 Kernel Exploit: Adieu (https://fail0verflow.com/blog/2017/ps4-namedobj-exploit/) The First PS4 Kernel Exploit: Adieu Plenty of time has passed since we first demonstrated Linux running on the PS4. Now we will step back a bit and explain how we managed to jump from the browser process into the kernel such that ps4-kexec et al. are usable. Over time, ps4 firmware revisions have progressively added many mitigations and in general tried to lock down the system. This post will mainly touch on vulnerabilities and issues which are not present on the latest releases, but should still be useful for people wanting to investigate ps4 security. Vulnerability Discovery As previously explained, we were able to get a dump of the ps4 firmware 1.01 kernel via a PCIe man-in-the-middle attack. Like all FreeBSD kernels, this image included “export symbols” - symbols which are required to perform kernel and module initialization processes. However, the ps4 1.01 kernel also included full ELF symbols (obviously an oversight, as they have been removed in later firmware versions). This oversight was beneficial to the reverse engineering process, although of course not a true prerequisite. Indeed, we began exploring the kernel by examining built-in metadata in the form of the syscall handler table - focusing on the ps4-specific entries. Each process object in the kernel contains its own “idt” (ID Table) object. The hash table essentially just stores pointers to opaque data blobs, along with a given kind and name. Entries may be accessed (and thus “locked”) with either read or write intent. Note that IDTTYPE is not a bitfield consisting of only unique powers of 2. This means that if we can control the kind of an identry, we may be able to cause a type confusion to occur (it is assumed that we may control name). Exploitation To an exploiter without ps4 background, it might seem that the easiest way to exploit this bug would be to take advantage of the write off the end of the malloc’d namedobjusrt object. However, this turns out to be impossible (as far as I know) because of a side effect of the ps4 page size being changed to 0x4000 bytes (from the normal 0x1000). It appears that in order to change the page size globally, the ps4 kernel developers opted to directly change the related macros. One of the many changes resulting from this is that the smallest actual amount of memory which malloc may give back to a caller becomes 0x40 bytes. While this also results in tons of memory being completely wasted, it does serve to nullify certain exploitation techniques (likely completely by accident…). Adieu The namedobj exploit was present and exploitable (albeit using a slightly different method than described here) until it was fixed in firmware version 4.06. This vulnerability was also found and exploited by (at least) Chaitin Tech, so props to them! Taking a quick look at the 4.07 kernel, we can see a straightforward fix (4.06 is assumed to be identical - only had 4.07 on hand while writing this post):

int sys_namedobj_create(struct thread *td, void *args)
{
    // ...
    rv = EINVAL;
    kind = *((_DWORD *)args + 4);
    if ( !(kind & 0x4000) && *(_QWORD *)args ) {
        // ... (unchanged)
    }
    return rv;
}

And so we say goodbye to a nice exploit.
I hope you enjoyed this blast from the past :) Keep hacking! OpenZFS Developer Summit 2017 Recap (https://www.ixsystems.com/blog/openzfs-devsummit-2017/) The 5th annual OpenZFS Developer Summit was held in San Francisco on October 24-25. Hosted by Delphix at the Children’s Creativity Museum in San Francisco, over a hundred OpenZFS contributors from a wide variety of companies attended and collaborated during the conference and developer summit. iXsystems was a Gold sponsor and several iXsystems employees attended the conference, including the entire Technical Documentation Team, the Director of Engineering, the Senior Analyst, a Tier 3 Support Engineer, and a Tier 2 QA Engineer. Day 1 of the conference had 9 highly detailed, informative, and interactive technical presentations from companies which use or contribute to OpenZFS. The presentations highlighted improvements to OpenZFS developed “in-house” at each of these companies, with most improvements looking to be made available to the entire OpenZFS community in the near to long term. There’s a lot of exciting stuff happening in the OpenZFS community and this post provides an overview of the presented features and proof-of-concepts. The keynote was delivered by Mark Maybee who spoke about the past, present, and future of ZFS at Oracle. An original ZFS developer, he outlined the history of closed-source ZFS development after Oracle’s acquisition of Sun. ZFS has a fascinating history, as the project has evolved over the last decade in both open and closed source forms, independent of one another. While Oracle’s proprietary internal version of ZFS has diverged from OpenZFS, it has implemented many of the same features. Mark was very proud of the work his team had accomplished over the years, claiming Oracle’s ZFS products have accounted for over a billion dollars in sales and are used in the vast majority of Fortune 100 companies. However, with Oracle aggressively moving into cloud storage, the future of closed source ZFS is uncertain. Mark presented a few ideas to transform ZFS into a mainstream and standard file system, including adding more robust support for Linux. Allan Jude from ScaleEngine talked about ZStandard, a new compression method he is developing in collaboration with Facebook. It offers compression comparable to gzip, but at speeds fast enough to keep up with hard drive bandwidth. According to early testing, it improves both the speed and compression efficiency over the current LZ4 compression algorithm. It also offers a new “dictionary” feature for improving image compression, which is of particular interest to Facebook. In addition, when using ZFS send and receive, it will adapt the compression ratio to make the most efficient use of the network bandwidth. Currently, deleting a clone on ZFS is a time-consuming process, especially when dealing with large datasets that have diverged over time. Sara Hartse from Delphix described how “clone fast delete” speeds up clone deletion. Rather than traversing the entire dataset during clone deletion, changes to the clone are tracked in a “live list” which the delete process uses to determine which blocks to free. In addition, rather than having to wait for the clone to finish, the delete process backgrounds the task so you can keep working without any interruptions. Sara shared the findings of a test they ran on a clone with 500MB of data, which took 45 minutes to delete with the old method, and under a minute using the live list. 
This behavior is an optional property as it may not be appropriate for long-lived clones where deletion times are not a concern. At this time, it does not support promoted clones. Olaf Faaland from Lawrence Livermore National Labs demonstrated the progress his team has made to improve ZFS pool imports with MMP (Multi-Modifier Protection), a watchdog system to make sure that ZFS pools in clustered High Availability environments are not imported by more than one host at a time. MMP uses uberblocks and other low-level ZFS features to monitor pool import status and otherwise safeguard the import process. MMP adds fields to on-disk metadata so it does not depend on hardware, such as SAS. It supports multi-node HA configs and does not affect non-HA systems. However, it does have issues with long I/O delays, so existing HA software is recommended as an additional fallback. Jörgen Lundman of GMO Internet gave an entertaining talk on the trials and tribulations of porting ZFS to OS X. As a bonus, he talked about porting ZFS to Windows, and showed a working demo. While not yet in a usable state, it demonstrated a proof-of-concept of ZFS support for other platforms. Serapheim Dimitropoulos from Delphix discussed Faster Allocation with the Log Spacemap as a means of optimizing ZFS allocation performance. He began with an in-depth overview of metaslabs and how log spacemaps are used to track allocated and freed blocks. Since blocks are only allocated from loaded metaslabs but freed blocks may apply to any metaslab, over time logging the freed blocks to each appropriate metaslab with every txg becomes less efficient. Their solution is to create a pool-wide metaslab for unflushed entries. Shailendra Tripathi from Tegile presented iFlash: Dynamic Adaptive L2ARC Caching. This was an interesting talk on what is required to allow very different classes of resources to share the same flash device - in their case, ZIL, L2ARC, and metadata. To achieve this, they needed to address the following differences for each class: queue priority, metaslab load policy, allocation, and data protection (as cache has no redundancy). Isaac Huang of Intel introduced DRAID, or parity declustered RAID. Once available, this will provide the same levels of redundancy as traditional RAIDZ, while doubling the administrator's options for configuring that redundancy for their use case. The goals of DRAID are to address slow resilvering times and the bottleneck of a single replacement drive's write throughput. This solution skips block pointer tree traversal when rebuilding the pool after drive failure, which is the cause of long resilver times. This means that redundancy is restored quickly, mitigating the risk of losing additional drives before the resilver completes, but it does require a scrub afterwards to confirm data integrity. This solution supports logical spares, defined at vdev creation time, which are used to quickly restore the array. Prakash Surya of Delphix described how ZIL commits currently occur in batches, where waiting threads have to wait for the batch to complete. His proposed solution is to replace batch commits and instead notify each waiting thread after its own ZIL commit, in order to greatly increase throughput. A new tunable for the log write block timeout can also be used to log write blocks more efficiently. Overall, the quality of the presentations at the 2017 OpenZFS conference was high.
While quite technical, they clearly explained the scope of the problems being addressed and how the proposed solutions worked. We look forward to seeing the described features integrated into OpenZFS. The videos and slides for the presentations should be made available over the next month or so at the OpenZFS website. OpenZFS Photo Album (https://photos.google.com/share/AF1QipNxYQuOm5RDxRgRQ4P8BhtoLDpyCuORKWiLPT0WlvUmZYDdrX3334zu5lvY_sxRBA?key=MW5fR05MdUdPaXFKVDliQVJEb3N3Uy1uMVFFdVdR) DragonflyBSD 5.0 (https://www.dragonflybsd.org/release50/) DragonFly version 5.0 brings the first bootable release of HAMMER2, DragonFly's next generation file system. HAMMER2 Preliminary HAMMER2 support has been released into the wild as of the 5.0 release. This support is considered EXPERIMENTAL and should generally not yet be used for production machines or important data. The boot loader will support both UFS and HAMMER2 /boot. The installer will still use a UFS /boot even for a HAMMER2 installation because the /boot partition is typically very small and HAMMER2, like HAMMER1, does not instantly free space when files are deleted or replaced. DragonFly 5.0 has single-image HAMMER2 support, with live dedup (for cp's), compression, fast recovery, snapshot, and boot support. HAMMER2 does not yet support multi-volume or clustering, though commands for it exist. Please use non-clustered single images for now. ipfw Updates IPFW has gone through a number of updates in DragonFly and now offers better performance. pf and ipfw3 are also still supported. Improved graphics support The i915 driver has been brought up to match what's in the Linux 4.7.10 kernel. Intel GPUs are supported up to the Kabylake generation. vga_switcheroo(4) module added, allowing the use of Intel GPUs on hybrid-graphics systems. The new apple_gmux driver enables switching to the Intel video chipset on dual Intel/NVIDIA and Intel/Radeon Macbook computers. Other user-affecting changes efisetup(8) added. DragonFly can now support over 900,000 processes on a single machine. Client-side SSH by default does not try password authentication, which is the default behavior in newer versions of OpenSSH. Pass an explicit '-o PasswordAuthentication=yes' or change /etc/ssh/ssh_config if you need the old behavior. Public key users are unaffected. Clang status A starting framework has been added for using clang as the alternate base compiler in DragonFly, to replace gcc 4.7. It's not yet complete. Clang can of course be added as a package. Package updates Many package updates, but most notably chrome60 has finally made it into dports, with accelerated video and graphics support. 64-bit status Note that DragonFly is a 64-bit-only operating system as of 4.6, and will not run on 32-bit hardware. AMD Ryzen is supported, and DragonFly 5.0 has a workaround for a hardware bug (http://lists.dragonflybsd.org/pipermail/commits/2017-August/626190.html). DragonFly quickly released a v5.0.1 with a few patches. Download link (https://www.dragonflybsd.org/download/) News Roundup (r)vmadm – managing FreeBSD jails (https://blog.project-fifo.net/rvmadm-managing-freebsd-jails/) We are releasing the first version (0.1.0) of our clone of vmadm for FreeBSD jails today. It is not done or feature complete, but it does provide basic functionality. At this point, we think it would be helpful to get it out there and get some feedback. As of today, it allows basic management of datasets, as well as creating, starting, stopping, and destroying jails.
Why another tool to manage jails However, before we go into details, let's talk about why we built yet another jail manager. It is not the frequent NIH syndrome; actually quite the opposite. In FiFo 0.9.2 we experimented with iocage as a way to control jails. While iocage is a useful tool when used as a CLI utility, it has some issues when used programmatically. When managing jails automatically and not via a CLI tool, things like performance or a machine-parsable interface matter. While on a CLI it is acceptable if a call takes a second or two, for a tool that is consumed automatically this delay is problematic. Another reason for the decision was that vmadm is an excellent tool. It is very well designed. SmartOS has used vmadm for years now. Given all that, we opted for adopting a proven interface rather than trying to create a new one. Since we already interface with it on SmartOS, we can reuse a majority of our management code between SmartOS and FreeBSD. What can we do Today we can manage datasets, which are jail templates in the form of ZFS volumes. We can list and serve them from a dataset-server, and fetch those we want. At this point, we provide datasets for FreeBSD 10.0 to 11.1, but it is very likely that the list will grow. As an idea, here is a community-driven list of datasets (https://datasets.at/) that exist for SmartOS today. While those datasets will not work, we hope to see the same for BSD jails. After fetching the dataset, we can define jails by using a JSON file. This file is compatible with the zone description used on SmartOS. It does not provide all the same features but a subset. Resources such as CPU and memory can be defined, networking configured, a dataset selected, and necessary settings like the hostname set. With the jail created, vmadm allows managing its lifetime: starting and stopping it, accessing the console, and finally destroying it (see the sketch below). Updates to jails are supported too; however, as of today they are only taken into account after restarting the jail. This is in large part not a technical impossibility; it simply wasn't high up on the TODO list. It is worth mentioning that vmadm will not pick up jails created by other tools or manually. Only managing vmadm-created jails was a conscious decision to prevent it from interfering with existing setups or other utilities. While conventional tools can manage jails set up with vmadm just fine, we use some special tricks like nested jails to allow for restrictions required for multi-tenancy that are hard or impossible to achieve otherwise. Whats next First and foremost we hope to get some feedback and perhaps community engagement. In the meantime, as announced earlier this year (https://blog.project-fifo.net/fifo-in-2017/), we are hard at work integrating FreeBSD hypervisors in FiFo, and as of writing this, the core actions work quite well. Right now only the barebone functions are supported, and some of the output is not as clear as we would like. We hope to eventually add support for bhyve to vmadm the same way that it supports KVM on SmartOS. Moreover, the groundwork for this already exists in the nested jail techniques we are using. Other than that, we are exploring ways to allow for PCI pass-through in jails, something not possible in SmartOS zones right now that would be beneficial for some users. In general, we want to improve compatibility with SmartOS as much as possible, and features that we add over time should not make the specifications invalid for SmartOS.
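Since the project says it adopted the proven vmadm interface from SmartOS, a session might look roughly like the sketch below; the subcommand names and the JSON-driven create are modeled on the SmartOS tool, and the 0.1.0 FreeBSD clone may differ in detail.

```
# Hypothetical session modeled on the SmartOS vmadm interface;
# exact subcommands and flags in the FreeBSD clone may differ.
$ vmadm create < myjail.json   # define a jail from a JSON description
$ vmadm list                   # list known jails and their state
$ vmadm start <uuid>           # start the jail
$ vmadm console <uuid>         # attach to its console
$ vmadm stop <uuid>
$ vmadm destroy <uuid>         # remove it again
```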
You can get the tool from github (https://github.com/project-fifo/r-vmadm). *** Parallel processing with unix tools (http://www.pixelbeat.org/docs/unix-parallel-tools.html) There are various ways to use parallel processing in UNIX: piping An often underappreciated idea in the unix pipe model is that the components of the pipe run in parallel. This is a key advantage leveraged when combining simple commands that do "one thing well". split -n, xargs -P, parallel Note programs that are invoked in parallel by these need to output atomically for each item processed, which the GNU coreutils are careful to do for factor and sha*sum, etc. Generally commands that use stdio for output can be wrapped with the stdbuf -oL command to avoid intermixing lines from parallel invocations. make -j Most implementations of make(1) now support the -j option to process targets in parallel. make(1) is generally a higher level tool designed to process disparate tasks and avoid reprocessing already generated targets. For example it is used very effectively when testing coreutils, where about 700 tests can be processed in 13 seconds on a 40 core machine. implicit threading This goes against the unix model somewhat and definitely adds internal complexity to those tools. The advantages can be less data copying overhead and simpler usage, though its use needs to be carefully considered. A disadvantage is that one loses the ability to easily distribute commands to separate systems. Examples are GNU sort(1) and turbo-linecount; the example provided counts lines in parallel. The examples below will compare the above methods for implementing multi-processing, for the function of counting lines in a file. First of all let's generate some test data. We use both long and short lines to compare the overhead of the various methods compared to the core cost of the function being performed:
$ seq 100000000 > lines.txt # 100M lines
$ yes $(yes longline | head -n9) | head -n10000000 > long-lines.txt # 10M lines
We'll also define the add() { paste -d+ -s | bc; } helper function to add a list of numbers. Note the following runs were done against cached files, and thus are not I/O bound. Therefore we limit the number of processes run in parallel to $(nproc), though you would generally benefit from raising that if your jobs are waiting on network or disk, etc. We'll use this command to count lines for most methods, so here is the base non multi-processing performance for comparison:
$ time wc -l lines.txt
$ time wc -l long-lines.txt
split -n Note using -n alone is not enough to parallelize. For example this will run serially with each chunk, because, since --filter may write files, the -n pertains to the number of files to split into rather than the number to process in parallel:
$ time split -n$(nproc) --filter='wc -l' lines.txt | add
You can either run multiple invocations of split in parallel on separate portions of the file like:
$ time for i in $(seq $(nproc)); do split -n$i/$(nproc) lines.txt | wc -l& done | add
Or split can do parallel mode using round robin on each line, but that's huge overhead in this case. (Note also the -u option is significant with -nr.)
$ time split -nr/$(nproc) --filter='wc -l' lines.txt | add
Round robin would only be useful when the processing per item is significant. Parallel isn't well suited to processing a large single file, rather focusing on distributing multiple files to commands.
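To make that distinction concrete, here is a minimal sketch of the many-files workload parallel is aimed at (the file names are illustrative):

```
# compress many independent files in parallel, one gzip per file
$ find . -name '*.log' | parallel --will-cite 'gzip {}'
```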
It can't efficiently split to lightweight processing if reading sequentially from pipe:
$ time parallel --will-cite --block=200M --pipe 'wc -l' < lines.txt | add
Like parallel, xargs is designed to distribute separate files to commands, and with the -P option can do so in parallel. If you have a large file then it may be beneficial to presplit it, which could also help with I/O bottlenecks if the pieces were placed on separate devices: split -d -n l/$(nproc) lines.txt l. Those pieces can then be processed in parallel like:
$ time find -maxdepth 1 -name 'l.*' | xargs -P$(nproc) -n1 wc -l | cut -f1 -d' ' | add
If your file sizes are unrelated to the number of processors then you will probably want to adjust -n1 to batch together more files to reduce the number of processes run in total. Note you should always specify -n with -P to avoid xargs accumulating too many input items, thus impacting the parallelism of the processes it runs. make(1) is generally used to process disparate tasks, though it can be leveraged to provide low level parallel processing on a bunch of files. Note also the make -O option, which avoids the need for commands to output their data atomically, letting make do the synchronization. We'll process the presplit files as generated for the xargs example above, and to support that we'll use the following Makefile:

%: FORCE	# Always run the command
	@wc -l < $@
FORCE: ;
Makefile: ;	# Don't include Makefile itself

One could generate this and pass it to make(1) with the -f option, though we'll keep it as a separate Makefile here for simplicity. This performs very well and matches the performance of xargs.
$ time find -name 'l.*' -exec make -j$(nproc) {} + | add
Note we use the POSIX specified "find ... -exec ... {} +" construct, rather than conflating the example with xargs. This construct, like xargs, will pass as many files to make as possible, which make(1) will then process in parallel. OpenBSD gives a hint on forgetting unlock mutex (http://nanxiao.me/en/openbsd-gives-a-hint-on-forgetting-unlock-mutex/) OpenBSD gives a hint on forgetting unlock mutex Check the following simple C++ program:

```
#include <mutex>

int main(void)
{
    std::mutex m;
    m.lock();
    return 0;
}
```

The program forgot to unlock the mutex m before exiting the main function: m.unlock(); Test it on GNU/Linux, and I chose ArchLinux as the testbed:
$ uname -a
Linux fujitsu-i 4.13.7-1-ARCH #1 SMP PREEMPT Sat Oct 14 20:13:26 CEST 2017 x86_64 GNU/Linux
$ clang++ -g -pthread -std=c++11 test_mutex.cpp
$ ./a.out
$
The process exited normally, and no further output was given. Build and run it on OpenBSD 6.2:
$ clang++ -g -pthread -std=c++11 test_mutex.cpp
$ ./a.out
pthread_mutex_destroy on mutex with waiters!
OpenBSD prints “pthread_mutex_destroy on mutex with waiters!”. Interesting!
*** Beastie Bits Updates to the NetBSD operating system since OSHUG #57 & #58 (http://mailman.uk.freebsd.org/pipermail/ukfreebsd/2017-October/014148.html) Creating a jail with FiFo and Digital Ocean (https://blog.project-fifo.net/fifo-jails-digital-ocean/) I'm thinking about OpenBSD again (http://stevenrosenberg.net/blog/bsd/openbsd/2017_0924_openbsd) Kernel ASLR on amd64 (https://blog.netbsd.org/tnf/entry/kernel_aslr_on_amd64) Call for Participation - BSD Devroom at FOSDEM (https://people.freebsd.org/~rodrigo/fosdem18/) BSD Stockholm Meetup (https://www.meetup.com/BSD-Users-Stockholm/) *** Feedback/Questions architect - vBSDCon (http://dpaste.com/15D5SM4#wrap) Brad - Packages and package dependencies (http://dpaste.com/3MENN0X#wrap) Lars - dpb (http://dpaste.com/2SVS18Y) Alex re: PS4 Network Throttling (http://dpaste.com/028BCFA#wrap) ***
219: We love the ARC
Papers we love: ARC by Bryan Cantrill, SSD caching adventures with ZFS, OpenBSD full disk encryption setup, and a Perl5 Slack Syslog BSD daemon. This episode was brought to you by Headlines Papers We Love: ARC: A Self-Tuning, Low Overhead Replacement Cache (https://www.youtube.com/watch?v=F8sZRBdmqc0&feature=youtu.be) Ever wondered how the ZFS ARC (Adaptive Replacement Cache) works? How about if Bryan Cantrill presented the original paper on its design? Today is that day. Slides (https://www.slideshare.net/bcantrill/papers-we-love-arc-after-dark) It starts by looking back at a fundamental paper from the 40s where the architecture of general-purpose computers is first laid out. The main one is the description of memory hierarchies, where you have a small amount of very fast memory, then the next level is slower but larger, and on and on. As we look at the various L1, L2, and L3 caches on a CPU, then RAM, then flash, then spinning disks, this still holds true today. The paper then does a survey of the existing caching policies and tries to explain the issues with each. This includes ‘MIN’, which is the theoretically optimal policy, which requires future knowledge, but is useful for setting the upper bound: the best we could possibly do. The paper ends up showing that the ARC can end up being better than manually trying to pick the best number for the workload, because it adapts as the workload changes. At about 1:25 into the video, Bryan starts talking about the practical implementation of the ARC in ZFS, and some challenges they have run into recently at Joyent. A great discussion about some of the problems when ZFS needs to shrink the ARC. Not all of it applies 1:1 to FreeBSD because the kernel and the kmem implementation are different in a number of ways. There were some interesting questions asked at the end as well. *** How do I use man pages to learn how to use commands? (https://unix.stackexchange.com/a/193837) nwildner on StackExchange has a very thorough answer to the question of how to interpret man pages to understand complicated commands (xargs in this case, but not specifically). Have in mind what you want to do. When doing your research about xargs you did it for a purpose, right? You had a specific need that was reading standard output and executing commands based on that output. But, when I don't know which command I want? Use man -k or apropos (they are equivalent). If I don't know how to find a file: man -k file | grep search. Read the descriptions and find one that will better fit your needs. Apropos works with regular expressions by default (man apropos, read the description and find out what -r does), and in this example I'm looking for every manpage where the description starts with "report". Always read the DESCRIPTION before starting Take the time to read the description. By just reading the description of the xargs command we will learn that: xargs reads from STDIN and executes the command needed. This also means that you will need to have some knowledge of how standard input works, and how to manipulate it through pipes to chain commands. The default behavior is to act like /bin/echo. This gives you a little tip that if you need to chain more than one xargs, you don't need to use echo to print. We have also learned that unix filenames can contain blanks and newlines, that this could be a problem, and that the argument -0 is a way to prevent things from exploding by using null character separators.
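A minimal sketch of that -0 advice in action (both GNU and BSD find support -print0):

```
# NUL-separated names survive blanks and newlines in file names
$ find . -type f -print0 | xargs -0 wc -l
```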
The description warns you that the command being used as input needs to support this feature too, and that GNU find supports it. Great. We use a lot of find with xargs. xargs will stop if exit status 255 is reached. Some descriptions are very short and that is generally because the software works in a very simple way. Don't even think of skipping this part of the manpage ;) Other things to pay attention to... You know that you can search for files using find. There are a ton of options and if you only look at the SYNOPSIS, you will get overwhelmed by those. It's just the tip of the iceberg. Excluding NAME, SYNOPSIS, and DESCRIPTION, you will have several other sections. When this method will not work so well... Tips that apply to all commands Some options, mnemonics and "syntax style" travel through all commands, making you buy some time by not having to open the manpage at all. Those are learned by practice and the most common are: Generally, -v means verbose. -vvv is a variation "very very verbose" on some software. Following the POSIX standard, generally one-dash arguments can be stacked. Example: tar -xzvf, cp -Rv. Generally -R and/or -r means recursive. Almost all commands have a brief help with the --help option. --version shows the version of a software. -p, on copy or move utilities means "preserve permissions". -y means YES, or "proceed without confirmation" in most cases. Default values of commands. At the pager chunk of this answer, we saw that less -is is the pager of man. The default behavior of commands is not always shown in a separate section on manpages, or in the section that is placed closest to the top. You will have to read the options to find out defaults, or if you are lucky, typing /pager will lead you to that info. This also requires you to know the concept of the pager (software that scrolls the manpage), and this is a thing you will only acquire after reading lots of manpages. And what about the SYNOPSIS syntax? After getting all the information needed to execute the command, you can combine options, option-arguments and operands inline to get your job done. Overview of concepts: Options are the switches that dictate a command's behavior. "Do this" "don't do this" or "act this way". Often called switches. Check out the full answer and see if it helps you better grasp the meaning of a man page and thus the command. *** My adventure into SSD caching with ZFS (Home NAS) (https://robertputt.co.uk/my-adventure-into-ssd-caching-with-zfs-home-nas.html) Robert Putt has written about his adventure using SSDs for caching with ZFS on his home NAS. Recently I decided to throw away my old defunct 2009 MacBook Pro which was rotting in my cupboard, and I decided to retrieve the only useful part before doing so: the 80GB Intel SSD I had installed a few years earlier. Initially I thought about simply adding it to my desktop as a bit of extra space, but in 2017 80GB really wasn't worth it, and then I had a brainwave… Let's see if we can squeeze some additional performance out of my HP Microserver Gen8 NAS running ZFS by installing it as a cache disk. I installed the SSD in the cdrom tray of the Microserver using a floppy disk power to SATA power converter and a SATA cable; unfortunately it seems the CD-ROM SATA port on the motherboard is only a 3gbps port, although this didn't matter so much as it was an older 3gbps SSD anyway.
Next I booted up the machine and to my surprise the disk was not found in my FreeBSD install. Then I realised that the SATA port for the CD drive is actually provided by the RAID controller, so I rebooted into intelligent provisioning and added an additional RAID0 array with just the 1 disk to act as my cache. In fact, all of the disks in this machine are individual RAID0 arrays so it looks like just a bunch of disks (JBOD), as ZFS offers additional functionality over normal RAID (mainly scrubbing, deduplication and compression). Configuration Let's have a look at the zpool before adding the cache drive to make sure there are no errors or ugliness. Now let's prep the drive for use in the zpool using gpart. I want to split the SSD into two separate partitions, one for L2ARC (read caching) and one for ZIL (write caching). I have decided to split the disk into 20GB for ZIL and 50GB for L2ARC. Be warned: using 1 SSD like this is considered unsafe because it is a single point of failure in terms of delayed writes (a redundant configuration with 2 SSDs would be more appropriate), and the heavy write cycles on the SSD from the ZIL are likely to kill it over time. Now it's time to see if adding the cache has made much of a difference. I suspect not, as my home NAS sucks; it is a HP Microserver Gen8 with the crappy Celeron CPU and only 4GB RAM. Anyway, let's test it and find out. First off let's throw fio at the mount point for this zpool and see what happens, both with the ZIL and L2ARC enabled and disabled. Observations Ok, so the initial result is a little disappointing, but hardly unexpected: my NAS sucks and there are lots of bottlenecks - CPU, memory and the fact that only 2 of the SATA ports are 6gbps. There is no real difference performance-wise in comparison between the results; the IOPS, bandwidth and latency appear very similar. However, let's bear in mind fio is a pretty hardcore disk benchmark utility; how about some real world use cases? Next I decided to test a few typical file transactions that this NAS is used for: Samba shares to my workstation. For the first test I wanted to test reading a 3GB file over the network with both the cache enabled and disabled. I would run this multiple times to ensure the data is hot in the L2ARC and to ensure the test is somewhat repeatable; the network itself is an uncongested 1gbit link and I am copying onto the secondary SSD in my workstation. The dataset for these tests has compression and deduplication disabled. Samba Read Test Not bad; once the data becomes hot in the L2ARC, cached reads appear to gain a decent advantage compared to reading from the disk directly. How does it perform when writing the same file back across the network using the ZIL vs no ZIL? Samba Write Test Another good result in the real world test; this certainly helps the write transfer speed. However, I do wonder what would happen if you filled the ZIL transferring a very large file, though this is unlikely with my use case as I typically only deal with a couple of files of several hundred megabytes at any given time, so a 20GB ZIL should suit me reasonably well.
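A hedged sketch of the partitioning and cache attachment described in the Configuration section above; the SSD device name (ada1) and the pool name (tank) are assumptions for illustration.

```
# Sketch only: device name (ada1) and pool name (tank) are assumed.
$ gpart create -s gpt ada1
$ gpart add -t freebsd-zfs -s 20G -l zil ada1     # 20GB for the ZIL (SLOG)
$ gpart add -t freebsd-zfs -s 50G -l l2arc ada1   # 50GB for the L2ARC
$ zpool add tank log gpt/zil
$ zpool add tank cache gpt/l2arc
$ zpool status tank    # confirm the new log and cache vdevs
```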
Is ZIL and L2ARC worth it? I would imagine that with a big beefy ZFS server running in a company somewhere with a large disk pool and lots of users, multiple enterprise-level SSDs for ZIL and L2ARC would be well worth the investment; however, at home I am not so sure. Yes, I did see an increase in read speeds with cached data and a general increase in write speeds, however it is use case dependent. In my use case I rarely access the same file frequently; my NAS primarily serves as a backup and for archived data, and although the write speeds are cool I am not sure it's a deal breaker. If I built a new home NAS today I'd probably concentrate the budget on a better CPU, more RAM (for ARC cache) and more disks. However, if I had a use case where I frequently accessed the same files and needed to do so in a faster fashion, then yes, I'd probably invest in an SSD for caching. I think if you have a spare SSD lying around and you want something fun to do with it, sure, chuck it in your ZFS based NAS as a cache mechanism. If you were planning on buying an SSD for caching, then I'd really consider your needs and decide if the money could be spent on alternative stuff which would improve your experience with your NAS. I know my NAS would benefit more from an extra stick of RAM and a more powerful CPU, but as a quick evening project with some parts I had hanging around, adding some SSD cache was worth a go. More Viewer Interview Questions for Allan News Roundup Setup OpenBSD 6.2 with Full Disk Encryption (https://blog.cagedmonster.net/setup-openbsd-with-full-disk-encryption/) Here is a quick way to set up (in 7 steps) OpenBSD 6.2 with full disk encryption. First step: Boot and start the installation: (I)nstall: I Keyboard Layout: ENTER (I'm French so in my case I took the FR layout) Leave the installer with: ! Second step: Prepare your disk for encryption. Using an SSD, my disk is named sd0; the name may vary, for example: wd0. Initiating the disk: Configure your volume: Now we'll use bioctl to encrypt the partition we created, in this case sd0a (disk sd0 + partition "a"). Enter your passphrase. Third step: Let's resume the OpenBSD installer. We follow the install procedure. Fourth step: Partitioning of the encrypted volume. We select our new volume, in this case: sd1 The whole disk will be used: W(hole) Let's create our partitions: NB: You are more than welcome to create multiple partitions for your system. Fifth step: System installation It's time to choose how we'll install our system (network install by http in my case) Sixth step: Finalize the installation. Last step: Reboot and start your system. Enter your passphrase. Welcome to OpenBSD 6.2 with a fully encrypted file system. Optional: Disable the swap encryption. The swap is actually part of the encrypted filesystem; we don't need OpenBSD to encrypt it. Sysctl gives us this possibility. Step-by-Step FreeBSD installation with ZFS and Full Disk Encryption (https://blog.cagedmonster.net/step-by-step-freebsd-installation-with-full-disk-encryption/) 1. What do I need? For this tutorial, the installation has been made on an Intel Core i7 (amd64) machine. On a USB key, you would probably use this link: ftp://ftp.freebsd.org/pub/FreeBSD/releases/amd64/amd64/ISO-IMAGES/11.1/FreeBSD-11.1-RELEASE-amd64-mini-memstick.img If you can't do a network installation, you'd better use this image: ftp://ftp.freebsd.org/pub/FreeBSD/releases/amd64/amd64/ISO-IMAGES/11.1/FreeBSD-11.1-RELEASE-amd64-memstick.img You can write the image file to your USB device (replace XXXX with the name of your device) using dd: # dd if=FreeBSD-11.1-RELEASE-amd64-mini-memstick.img of=/dev/XXXX bs=1m 2. Boot and install: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F1.png)
Step-by-Step FreeBSD installation with ZFS and Full Disk Encryption (https://blog.cagedmonster.net/step-by-step-freebsd-installation-with-full-disk-encryption/) 1. What do I need? For this tutorial, the installation was done on an Intel Core i7, AMD64 architecture. For a USB key, you would probably use this link: ftp://ftp.freebsd.org/pub/FreeBSD/releases/amd64/amd64/ISO-IMAGES/11.1/FreeBSD-11.1-RELEASE-amd64-mini-memstick.img If you can't do a network installation, you'd better use this image: ftp://ftp.freebsd.org/pub/FreeBSD/releases/amd64/amd64/ISO-IMAGES/11.1/FreeBSD-11.1-RELEASE-amd64-memstick.img You can write the image file to your USB device (replace XXXX with the name of your device) using dd: # dd if=FreeBSD-11.1-RELEASE-amd64-mini-memstick.img of=/dev/XXXX bs=1m 2. Boot and install: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F1.png) 3. Configure your keyboard layout: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F2.png) & Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F3.png) 4. Hostname and system components configuration: Set the name of your machine: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F4.png) What components do you want to install? Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F5.png) 5. Network configuration: Select the network interface you want to configure. Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F6.png) First, we configure our IPv4 network. I used a static address so you can see how it works, but you can use DHCP for an automated configuration; it depends on what you want to do with your system (desktop/server). Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F7.png) & Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F7-1.png) & Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F8.png) IPv6 network configuration. Same as for IPv4; you can use SLAAC for an automated configuration. Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F9.png) & Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F10-1.png) & Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F10-2.png) Here you can configure your DNS servers; I used the Google DNS servers, so you can use them too if needed. Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F11.png) 6. Select the server you want to use for the installation: I always use the IPv6 mirror to ensure that my IPv6 network configuration is good. Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F12.png) 7. Disk configuration: As we want an easy full disk encryption, we'll use ZFS. Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F13.png) Make sure to select the disk encryption: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F14.png) Launch the disk configuration: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F15.png) Here everything is normal; you have to select the disk you'll use: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F16.png) I have only one SSD disk, named da0: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F17.png) Last chance before erasing your disk: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F18.png) Time to choose the password you'll use to start your system: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F19.png) & Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F20.png) & Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F21.png) 8. Last steps to finish the installation: The installer will download what you need and what you selected previously (ports, src, etc.) to create your system: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F22.png) 8.1. Root password: Enter your root password: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F22-1.png)
8.2. Time and date: Set your timezone, in my case Europe/France: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F22-2.png) & Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F23.png) & Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F23-1.png) Make sure the date and time are correct, or change them: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F24.png) & Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F25.png) 8.3. Services: Select the services you'll use at system startup, depending again on what you want to do. In many cases powerd and ntpd will be useful, and sshd if you're planning on using FreeBSD as a server. Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F26.png) 8.4. Security: Security options you want to enable. You'll still be able to change them after the installation with sysctl. Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F26-1.png) 8.5. Additional user: Create an unprivileged system user: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F26-2.png) Make sure your user is in the wheel group so they can use the su command. Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F26-3.png) & Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F26-4.png) 8.6. The end: End of your configuration; you can still make some modifications if you want: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F26-5.png) & Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F26-6.png) & Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F26-7.png) 9. First boot: Enter the passphrase you chose previously: Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F27.png) & Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F28.png) & Screenshot (https://blog.cagedmonster.net/content/images/2017/09/F29.png) Welcome to FreeBSD 11.1 with full disk encryption! *** The anatomy of ldd program on OpenBSD (http://nanxiao.me/en/the-anatomy-of-ldd-program-on-openbsd/) In the past week, I read the ldd (https://github.com/openbsd/src/blob/master/libexec/ld.so/ldd/ldd.c) source code on OpenBSD to get a better understanding of how it works, and this post should also be a reference for other *NIX OSs. ELF (https://en.wikipedia.org/wiki/Executable_and_Linkable_Format) files are divided into 4 categories: relocatable, executable, shared, and core. Only executable and shared object files may have dynamic object dependencies, so ldd only checks these two kinds of ELF files: (1) Executable. ldd in fact leverages the LD_TRACE_LOADED_OBJECTS environment variable, and the code is as follows:

if (setenv("LD_TRACE_LOADED_OBJECTS", "true", 1) < 0)
	err(1, "setenv(LD_TRACE_LOADED_OBJECTS)");

When LD_TRACE_LOADED_OBJECTS is set to 1 or true, running an executable shows the shared objects it needs instead of running it, so you don't even need ldd to check an executable. See the following outputs:

$ /usr/bin/ldd
usage: ldd program ...
$ LD_TRACE_LOADED_OBJECTS=1 /usr/bin/ldd
	Start            End              Type  Open Ref GrpRef Name
	00000b6ac6e00000 00000b6ac7003000 exe   1    0   0      /usr/bin/ldd
	00000b6dbc96c000 00000b6dbcc38000 rlib  0    1   0      /usr/lib/libc.so.89.3
	00000b6d6ad00000 00000b6d6ad00000 rtld  0    1   0      /usr/libexec/ld.so

(2) Shared object.
The code to print the dependencies of a shared object is as follows:

if (ehdr.e_type == ET_DYN && !interp) {
	if (realpath(name, buf) == NULL) {
		printf("realpath(%s): %s", name, strerror(errno));
		fflush(stdout);
		_exit(1);
	}
	dlhandle = dlopen(buf, RTLD_TRACE);
	if (dlhandle == NULL) {
		printf("%s\n", dlerror());
		fflush(stdout);
		_exit(1);
	}
	_exit(0);
}

Why is the condition for checking whether an ELF file is a shared object written like this?

if (ehdr.e_type == ET_DYN && !interp) {
	......
}

That's because a position-independent executable (PIE) has the same file type as a shared object, but normally a PIE contains an interpreter program header, since it needs the dynamic linker to load it, while a shared object lacks one (refer to this article). So the above condition filters out PIE files. The dlopen(buf, RTLD_TRACE) call is used to print the dynamic object information, and the actual code is like this:

if (_dl_traceld) {
	_dl_show_objects();
	_dl_unload_shlib(object);
	_dl_exit(0);
}

In fact, you can also implement a simple application yourself that outputs the dynamic object information for a shared object (the include line was lost in these notes; dlopen(3) requires dlfcn.h):

#include <dlfcn.h>

int main(int argc, char **argv)
{
	dlopen(argv[1], RTLD_TRACE);
	return 0;
}

Compile and use it to analyze /usr/lib/libssl.so.43.2:

$ cc lddshared.c
$ ./a.out /usr/lib/libssl.so.43.2
	Start            End              Type  Open Ref GrpRef Name
	000010e2df1c5000 000010e2df41a000 dlib  1    0   0      /usr/lib/libssl.so.43.2
	000010e311e3f000 000010e312209000 rlib  0    1   0      /usr/lib/libcrypto.so.41.1

The same as using ldd directly:

$ ldd /usr/lib/libssl.so.43.2
/usr/lib/libssl.so.43.2:
	Start            End              Type  Open Ref GrpRef Name
	00001d9ffef08000 00001d9fff15d000 dlib  1    0   0      /usr/lib/libssl.so.43.2
	00001d9ff1431000 00001d9ff17fb000 rlib  0    1   0      /usr/lib/libcrypto.so.41.1

Through studying the ldd source code, I also picked up many by-products: knowledge of the ELF format, linking and loading, etc. Diving into code really is a good way to learn *NIX more deeply! Perl5 Slack Syslog BSD daemon (https://clinetworking.wordpress.com/2017/10/13/perl5-slack-syslog-bsd-daemon/) So I have been working on my little Perl daemon for a week now. It is a simple syslog daemon that listens on port 514 for incoming messages. It listens on a port so it can process log messages from my consumer Linux router as well as the messages from my server. Messages above the alert level are sent, as are messages that match the regex for SSH or DHCP (I want to keep track of new connections to my WiFi). The rest of the messages are not sent to Slack but appended to a log file. This is very handy, as I can get access to info like failed SSH logins, disk failures, and new devices connecting to the network, all on my Android phone, even when I am not home. Screenshot (https://clinetworking.files.wordpress.com/2017/10/screenshot_2017-10-13-23-00-26.png) The situation arose today that the internet went down, and I thought to myself: what would happen to all my important syslog messages when they couldn't be sent? Previously, the script only wrapped the botsend() function in an eval block; the error was returned and handled, but nothing was done, and the unsent message was discarded. So I added a function that appends unsent messages to an array; they are sent later, when the server is not busy sending messages to Slack. Slack has a limit of one message per second. The new addition works well and means that if the internet fails, my server will store these messages in memory and resend them at a rate of one message per second when connectivity returns. It currently sends the newest ones first, but I am not sure if this is a bug or a feature at this point! It currently works with my Linux-based WiFi router and my FreeBSD server. It is easy to scale, as all you need to do is send messages to syslog to get them sent to Slack; you could send CPU temperatures, logged-in users, etc. There is a GitHub page: https://github.com/wilyarti/slackbot
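Since the daemon speaks plain syslog on port 514, pointing another machine at it only takes a one-line syslogd change; a minimal sketch for a FreeBSD client, with the daemon's hostname assumed for illustration:

# in /etc/syslog.conf on the client, forward everything to the daemon:
#   *.*    @nas.example.com:514
service syslogd restart
# send a test message that should match the alert-level rule:
logger -p auth.alert "test: this should end up in Slack"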
Lscpu for OpenBSD/FreeBSD (http://nanxiao.me/en/lscpu-for-openbsdfreebsd/) Github Link (https://github.com/NanXiao/lscpu) There is a neat command, lscpu, which is very handy for displaying CPU information on GNU/Linux:

$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             2

Unfortunately, the BSD OSs lack this command; maybe one reason is that lscpu relies heavily on the /proc filesystem, which the BSDs don't provide. :-) Take OpenBSD as an example: if I want to know the CPU information, dmesg is one choice:

$ dmesg | grep -i cpu
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM)2 Duo CPU P8700 @ 2.53GHz, 2527.35 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR
cpu0: 3MB 64b/line 8-way L2 cache
cpu0: apic clock running at 266MHz
cpu0: mwait min=64, max=64, C-substates=0.2.2.2.2.1.3, IBE

But the output feels messy and not very clear to me. As for dmidecode, it used to be another option, but it no longer works out of the box, because it accesses /dev/mem, which OpenBSD doesn't allow by default for security reasons (you can refer to this discussion):

$ ./dmidecode
# dmidecode 3.1
Scanning /dev/mem for entry point.
/dev/mem: Operation not permitted

Given this situation, I wanted a dedicated command for showing CPU information on my BSD boxes. So over the past 2 weeks I developed an lscpu program for OpenBSD/FreeBSD, or more accurately OpenBSD/FreeBSD on the x86 architecture, since I only have Intel processors at hand. The application gets CPU metrics from 2 sources: (1) the sysctl functions. The BSD OSs provide a sysctl interface which I can use to get general CPU particulars, such as how many CPUs the system contains, the byte order of the CPU, etc. (2) the CPUID instruction. On x86, the CPUID instruction can obtain very detailed CPU information. This coding work is a little tedious and error-prone, not only because I needed to reference both the Intel and AMD specifications (these two vendors have minor distinctions), but also because I needed to parse the bits of register values. The code is here (https://github.com/NanXiao/lscpu); if you run OpenBSD/FreeBSD on x86 processors, please try it. It would be even better if you could give some feedback or report issues; I'd appreciate it very much. In the future, if I get access to other CPUs, such as ARM or SPARC64, maybe I will extend this small program.
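A minimal sketch of the first source, as queried from the shell with sysctl(8); these MIB names exist on both FreeBSD and OpenBSD except where noted, though the values will obviously vary per machine:

sysctl hw.ncpu hw.machine hw.model hw.byteorder   # CPU count, architecture, model string, endianness
sysctl kern.smp.cores kern.smp.threads_per_core   # topology details (FreeBSD only)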
*** Beastie Bits OpenBSD Porting Workshop - Brian Callahan will be running an OpenBSD porting workshop in NYC for NYC*BUG on December 6, 2017. (http://daemonforums.org/showthread.php?t=10429) Learn to tame OpenBSD quickly (http://www.openbsdjumpstart.org/#/) Detect the operating system using UDP stack corner cases (https://gist.github.com/sortie/94b302dd383df19237d1a04969f1a42b) *** Feedback/Questions Awesome Mike - ZFS Questions (http://dpaste.com/1H22BND#wrap) Michael - Expanding a file server with only one hard drive with ZFS (http://dpaste.com/1JRJ6T9) - information based on Allan's IRC response (http://dpaste.com/36M7M3E) Brian - Optimizing ZFS for a single disk (http://dpaste.com/3X0GXJR#wrap) ***
218: A KRACK in the WiFi
FreeBSD 10.4-RELEASE is here, more EuroBSDcon travel notes, the KRACK attack, ZFS and DTrace on NetBSD, and pfSense 2.4. This episode was brought to you by Headlines FreeBSD 10.4-RELEASE Available (https://www.freebsd.org/releases/10.4R/announce.html) FreeBSD 10.4-RELEASE is out. The FreeBSD Project dedicates the FreeBSD 10.4-RELEASE to the memory of Andrey A. Chernov. Some of the highlights: 10.4-RELEASE is the first FreeBSD release to feature full support for eMMC storage, including eMMC partitions, TRIM and bus speed modes up to HS400. Please note, though, that availability of especially the DDR52, HS200 and HS400 modes requires support in the actual sdhci(4) front-end as well as by the hardware used. Also note that the SDHCI controller part of Intel® Apollo Lake chipsets is affected by several severe silicon bugs; apparently, it depends on the particular Apollo Lake platform whether the workarounds in place so far are sufficient to avoid timeouts when attaching sdhci(4) there. Also, in case a GPT disk label is used, the fsck_ffs(8) utility is now able to find alternate superblocks. The aesni(4) driver no longer shares a single FPU context across multiple sessions in multiple threads, addressing problems seen when employing aesni(4) for accelerating ipsec(4). Support for the Kaby Lake generation of Intel® i219(4)/i219(5) devices has been added to the em(4) driver. The em(4) driver is now capable of enabling Wake On LAN (WOL) also for Intel® i217, i218 and i219 chips. Note that stale interface configurations from previous unsuccessful attempts to enable WOL for these devices will now actually take effect. For example, an ifconfig em0 wol activates all WOL variants including wol_mcast, which might be undesirable. Support for WOL has been added to the igb(4) driver, which was not able to activate this feature on any device before. The same remark regarding stale WOL configurations as for the em(4) driver applies. Userland coredumps can now trigger events, such as generating a human-readable crash report via devd(8); this feature is off by default. The firmware shipping with the qlxgbe(4) driver has been updated to version 5.4.66. Additionally, this driver has received some TSO and locking fixes, performance optimizations, as well as SYSCTLs providing MAC, RX and TX statistics. Mellanox® ConnectX-4 series adapters are now supported by the newly added mlx5ib(4) driver. OpenSSH received an update to version 7.3p1. GNOME has been updated to version 3.18. Xorg-Server has been updated to version 1.18.4. Check out the full release notes and upgrade your systems to 10.4-RELEASE. Thanks to the FreeBSD Release Engineering Team for their efforts. *** EuroBSDcon 2017: "travel notes" after the conference (https://blog.netbsd.org/tnf/entry/eurobsdcon_2017_travel_notes_after) Leonardo Taccari posted in the NetBSD blog about his experiences at EuroBSDcon 2017: Let me tell you about my experience at EuroBSDcon 2017 in Paris, France. We will see what was presented during the NetBSD developer summit on Friday and then take a look at all of the NetBSD and pkgsrc presentations given during the conference sessions on Saturday and Sunday. Of course, a lot of fun also happened on the "hall track", the several breaks during the conference, and the dinners we had together with other *BSD developers and community! This is difficult to describe, and I will try to just share some of it with the photographs that we have taken.
I can just say that it was a really beautiful experience; I had a great time with the others and, after coming back home... I miss all of that! :) So, if you have never been to any BSD conference, I strongly suggest you go to the next one, so please stay tuned via NetBSD Events. Being there is probably the only way to understand these feelings! Thursday (21/09): NetBSD developers dinner Arriving in Paris via a night train from Italy, I literally sleep-walked through Paris, getting lost again and again. After getting in touch with other developers, we had a dinner together and went sightseeing for a^Wseveral beers! Friday (22/09): NetBSD developers summit On Friday morning we met for the NetBSD developers summit, kindly hosted by Arolla. NetBSD on Google Compute Engine -- Benny Siegert (bsiegert) Scripting DDB with Forth -- Valery Ushakov (uwe) News from the version control front -- Jörg Sonnenberger (joerg) Afternoon discussions and dinner After lunch we had several non-scheduled discussions, some time for hacking, etc. We then had a nice dinner together (it was in a restaurant with a very nice waiter who always shouted after every order or after accidentally dropping and crashing dishes! Yeah, that's probably a bit weird, but I liked that attitude! :)) and then did some sightseeing and had a beer together. Saturday (23/09): First day of the conference sessions and the social event A Modern Replacement for BSD spell(1) -- Abhinav Upadhyay (abhinav) Portable Hotplugging: NetBSD's uvm_hotplug(9) API development -- Cherry G. Mathew (cherry) Hardening pkgsrc -- Pierre Pronchery (khorben) Reproducible builds on NetBSD -- Christos Zoulas (christos) Social event The social event on Saturday evening took place on a boat that cruised on the Seine. It was a very nice and different way to sightsee Paris, eat, enjoy some drinks, and socialize and discuss with other developers and community members. Sunday (24/09): Second day of conference sessions The school of hard knocks - PT1 -- Sevan Janiyan (sevan) The LLDB Debugger on NetBSD -- Kamil Rytarowski (kamil) What's in store for NetBSD 8.0? -- Alistair Crooks (agc) Sunday dinner After the conference we did some sightseeing in Paris, had dinner together and then enjoyed some beers! Conclusion It was a very nice weekend and conference. It is worth mentioning that EuroBSDcon 2017 was the biggest BSD conference yet (more than 300 people attended!). I would like to thank the entire EuroBSDcon organising committee (Baptiste Daroussin, Antoine Jacoutot, Jean-Sébastien Pédron and Jean-Yves Migeon), the EuroBSDcon programme committee (Antoine Jacoutot, Lars Engels, Ollivier Robert, Sevan Janiyan, Jörg Sonnenberger, Jasper Lievisse Adriaanse and Janne Johansson) and the EuroBSDcon Foundation for organizing such a wonderful conference! I would also like to thank the speakers for presenting very interesting talks, all the developers and community members who attended the NetBSD devsummit and conference, in particular Jean-Yves and Jörg for organizing and moderating the devsummit, and Arolla, who kindly hosted us for the NetBSD devsummit! A special thanks also to Abhinav (abhinav) and Martin (martin) for the photographs, and to locals Jean-Yves (jym) and Stoned (seb) for helping us not get lost in Paris' rues! :) Thank you! *** WiFi Vulnerability in WPA2: KRACK (https://www.krackattacks.com/) "We discovered serious weaknesses in WPA2, a protocol that secures all modern protected Wi-Fi networks.
An attacker within range of a victim can exploit these weaknesses using key reinstallation attacks (KRACKs). Concretely, attackers can use this novel attack technique to read information that was previously assumed to be safely encrypted. This can be abused to steal sensitive information such as credit card numbers, passwords, chat messages, emails, photos, and so on. The attack works against all modern protected Wi-Fi networks. Depending on the network configuration, it is also possible to inject and manipulate data. For example, an attacker might be able to inject ransomware or other malware into websites." "Note that if your device supports Wi-Fi, it is most likely affected. During our initial research, we discovered ourselves that Android, Linux, Apple, Windows, OpenBSD, MediaTek, Linksys, and others, are all affected by some variant of the attacks. For more information about specific products, consult the database of CERT/CC, or contact your vendor." FreeBSD Advisory (https://www.freebsd.org/security/advisories/FreeBSD-SA-17:07.wpa.asc) As of the date of this recording, a few weeks ahead of when this episode will air, the issue is fixed in FreeBSD 11.0 and 11.1, and a workaround has been provided for 10.3 and 10.4 (install a newer wpa_supplicant from ports). A fix for 10.3 and 10.4 is expected soon and will more than likely be out by the time you are watching this. The fix for 10.3 and 10.4 is more complicated because the version of wpa_supplicant included in the base system is 2.0, from January 2013 (nearly 5 years old), so the patches do not apply cleanly. The security team is still considering whether it will try to patch 2.0, or just replace the version of wpa_supplicant with the 2.5 from FreeBSD 11.x. OpenBSD was unwilling to wait when the embargo was extended on this vulnerability and stealth-fixed the issue on Aug 30th (https://marc.info/?l=openbsd-cvs&m=150410571407760&w=2) stsp@openbsd.org's Mastodon post (https://mastodon.social/@stsp/98837563531323569) Lobste.rs conversation about the flaw and OpenBSD's reaction (https://lobste.rs/s/dwzplh/krack_attacks_breaking_wpa2#c_pbhnfz) "What happened is that he told me on July 15, and gave a 6 weeks embargo until end of August. We already complained back then that this was way too long and leaving people exposed. Then he got CERT (and, thus, US gov agencies) involved and had to extend the embargo even further until today. At that point we already had the ball rolling and decided to stick to the original agreement with him, and he gave us an agreeing nod towards that as well." "In this situation, a request for keeping the problem and fix secret is a request to leave our users at risk and exposed to insiders who will potentially use the bug to exploit our users. And we have no idea who the other insiders are. We have to assume that information of this kind leaks and dissipates pretty fast in the security "community"." "We chose to serve the needs of our users who are the vulnerable people in this drama.
I stand by that choice." As a result of this: "To avoid this problem in the future, OpenBSD will now receive vulnerability notifications closer to the end of an embargo." NetBSD: "patches for the WPA issues in KRACK Attacks were committed Oct 16th to HEAD & are pending pullup to 6/7/8 branches" (http://mail-index.netbsd.org/source-changes/2017/10/16/msg088877.html) As of this recording, DragonFly appears to use wpa_supplicant 2.1, which they imported in 2014 and which has not been touched in over a year (https://github.com/DragonFlyBSD/DragonFlyBSD/commits/master/contrib/wpa_supplicant) *** News Roundup NetBSD - dtrace and ZFS update (https://mail-index.netbsd.org/tech-kern/2017/10/13/msg022436.html) Chuck Silvers writes to the tech-kern mailing list of NetBSD: I've been working on updating NetBSD's copy of the dtrace and zfs code, rebasing from the existing ancient OpenSolaris version to a recent FreeBSD version. Most of the FreeBSD changes are pretty close to what NetBSD needs, so that seems like a more useful upstream for us. I have things working well enough now that I want to share the code in preparation for committing. This update improves upon our existing dtrace/zfs code in several ways: it picks up all the upstream zfs fixes and enhancements from the last decade; zfs now supports mmap on NetBSD, so you can run executables stored in zfs; and dtrace fbt probes can now be used in kernel modules (such as zfs). A patch is provided here: http://ftp.netbsd.org/pub/NetBSD/misc/chs/diff.cddl.20171012 which needs to be applied using "patch -E", as it adds and removes files (see the sketch below). He provides the following summary for the diff: FreeBSD's dtrace/zfs code as of r315983 (2017-03-26), adapted for NetBSD; a few updates to our copy of FreeBSD's libproc; build system support for using -fno-omit-frame-pointer everywhere and disabling other compiler optimizations that confuse dtrace; sample kernel config changes for a couple of evbarm configs (the ones I tested); module/ksyms enhancements for dtrace integration; genfs API enhancements to support zfs; an option to have mutexes not become no-ops during a panic; and a uvm_aobj API change to support 64-bit aobj sizes (e.g. for tmpfs). Known issues with the patch include: unloading the zfs module fails, even with no zpools imported, if you've done much with zfs since it was loaded; there's some refcounting problem that I haven't tracked down yet. The module refcounting for active fbt probes is bogus; currently module refcounting is protected by kernconfig_lock(), but taking that lock down in the bowels of dtrace seems likely to create deadlocks, so I plan to do something fancier but haven't gotten to it yet. The dtrace uregs[] stuff is probably still wrong, and the CTF typeid overflow problem is still there (more on this below). Unsupported features include: the ".zfs" virtual directory, e.g. ".zfs/snapshot/foo@bar"; zvols; ZFS ACLs (aka NFSv4 ACLs); NFS-exporting a ZFS file system; setting dtrace probes in application code; using ZFS as the root fs; the new crypto hashes SHA512_256, skein, and edonr (the last one is not in FreeBSD yet either); zio delay injection (used for testing zfs); and dtrace support for platforms other than x86 and arm. A more detailed description of the CTF typeid overflow is also provided. Check out the full thread with followups and try out the patch if you're on NetBSD.
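For anyone wanting to test, a minimal sketch of fetching and applying the diff to a NetBSD-current source tree (the source path is an assumption); -E is needed because the patch adds and removes files:

cd /usr/src
ftp http://ftp.netbsd.org/pub/NetBSD/misc/chs/diff.cddl.20171012
patch -E < diff.cddl.20171012   # -E removes files that become empty after patching;
                                # then rebuild the kernel and modules as usual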
*** pfSense 2.4.0-RELEASE Now Available! (https://www.netgate.com/blog/pfsense-2-4-0-release-now-available.html) Jim Pingle writes about the new release: We are excited to announce the release of pfSense® software version 2.4, now available for new installations and upgrades! pfSense software version 2.4.0 was a herculean effort! It is the culmination of 18 months of hard work by Netgate and community contributors, with over 290 items resolved. According to git, 671 files were changed, with a total of 1,651,680 lines added and 185,727 lines deleted. Most of those added lines are from translated strings for multiple-language support! Highlights: FreeBSD 11.1-RELEASE as the base operating system; a new pfSense installer based on bsdinstall, with support for ZFS, UEFI, and multiple types of partition layouts (e.g. GPT, BIOS); support for Netgate ARM devices such as the SG-1000; OpenVPN 2.4.x support, which brings features like AES-GCM ciphers, speed improvements, Negotiable Crypto Parameters (NCP), TLS encryption, and dual stack/multihome; translation of the GUI into 13 different languages! For more information on contributing to the translation effort, read our previous blog post and visit the project on Zanata. WebGUI improvements, such as a new login page, improved GET/POST CSRF handling, and significant improvements to the Dashboard and its AJAX handling; certificate management improvements including CSR signing and international character support; and Captive Portal has been rewritten to work without multiple instances of ipfw. Important information: 32-bit x86 and NanoBSD have been deprecated and are not supported on pfSense 2.4. Read the full release notes and let them know how you like the new release. *** OpenBSD changes of note 629 (https://www.tedunangst.com/flak/post/openbsd-changes-of-note-629) Use getrusage to measure CPU time in md5 benchmarking. Add guard pages at the end of kernel stacks so overflows don't run into important stuff. This would be useful in FreeBSD, even just to detect the condition. I had all kinds of strange crashes when I was accidentally overflowing the stack while working on the initial version of the ZSTD patches, before ZSTD gained a working heap mode. Add the dwxe driver for the ethernet found on Allwinner A64, H3 and H5 SoCs. Fix a regression caused by the removal of SIGIO from some devices. In malloc, always delay freeing chunks, and change the 'F' option to perform a more extensive check for double frees. Change the sendsyslog prototype to take a string, since there's little point logging non-strings. The config program tries to modify zero-initialized variables. Previous versions of gcc were patched to place these in the data segment instead of the bss, but clang has no such patches. Long long ago, this was the default behavior for compilers, which is why gcc was patched to maintain that existing behavior, but now we want a slightly less unusual toolchain. Fix the underlying issue for now by annotating such variables with a data section attribute. *** t2k17 Hackathon Report: Philip Guenther: locking and libc (https://undeadly.org/cgi?action=article;sid=20170824080132) Next up in our series of t2k17 hackathon reports is this one from Philip Guenther: I showed up at t2k17 with a couple of hold-over diffs from e2k17 that weren't stable then and hadn't gotten much better since, so after a red-eye through Chicago I arrived in the hackroom, fired up my laptop and synced trees. Meanwhile, people trickled in, and the best part of hackathons, the conversations and "what do you think about this?" chats, started.
Theo introduced me to Todd Mortimer (mortimer@), who's been hacking on clang to implement RETGUARD for C programs. Over the hackathon we discussed a few loose ends that cropped up, what the correct behavior should be for them, and the mechanics of avoiding 0xc3 bytes (the RET opcode) embedded in the middle of other multi-byte x86 machine code. Fun stuff. Martin (mpi@) and I had a conversation about the desirability of being able to sleep while holding the netlock, and pretty much came down on "oof, the scheduler does need work before the underlying issue driving this question can be resolved enough to answer it". :-( After some final hammering I got in an enhancement to pool(9) to let a pool use (sleeping) rwlocks instead of (spinning) mutexes, and then immediately used that for the per-CPU pool cache pool as well as the futex pool. Further pools are likely to be converted as well, as kernel upper-level locking changes are made. Speaking of which, a larger diff I had been working on for said upper-level locking was still suffering deadlock issues, so I took a stab at narrowing it down to just a lock for the process tree, mostly mirroring FreeBSD's proctree_lock. That appears to be holding up much better, and I just have some code arrangement issues around sys_ptrace() to sort out before that'll go out for final review. Then, most of the way through the week, Bob (beck@) vocally complained that life would be easier for LibreSSL if we had some version of pthread_once() and the pthread mutex routines in libc. This would make some other stuff easier too (cf. /usr/X11R6/lib/libpthread-stubs.*), and the TIB work over the last couple of years has basically eliminated the runtime costs of doing so, so I spent most of the rest of the hackathon finding the right place to draw a line through libpthread and move everything on the one side of the line into libc. That code seems pretty stable and the xenocara and ports people seem to like—or at least accept—the effects, so it will almost certainly go in with the next libc bump. Lots of other random conversations, hacking, meals, and beer. Many thanks to Ken (krw@) and the local conspirators for another excellent Toronto hackathon! Beastie Bits 2017 NetBSD Foundation Officers (https://blog.netbsd.org/tnf/entry/2017_netbsd_foundation_officers) New BSDMag is out - Military Grade Data Wiping in FreeBSD with BCWipe (https://bsdmag.org/download/military-grade-data-wiping-freebsd-bcwipe/) LibertyBSD 6.1 released (http://libertybsd.net/) *** Feedback/Questions Eddy - EuroBSDCon 2017 video and some help (http://dpaste.com/3WDNV05#wrap) Eric - ZFS monitoring (http://dpaste.com/2RP0S60#wrap) Tom - BSD Hosting (http://dpaste.com/31DGH3J#wrap) ***
217: Your questions, part II
OpenBSD 6.2 is here, style arguments, a second round of viewer interview questions, how to set CPU affinity for FreeBSD jails, containers on FreeNAS & more! Headlines OpenBSD 6.2 Released (https://www.openbsd.org/62.html) OpenBSD continues their six-month release cadence with the release of 6.2, the 44th release. On a disappointing note, the song for 6.2 will not be released until December. Highlights: Improved hardware support on modern platforms including ARM64/ARMv7 and octeon, while amd64 users will appreciate additional support for Intel Kaby Lake video cards. Network stack improvements include extensive SMPization work and a new FQ-CoDel queueing discipline, as well as enhanced WiFi support in general and improvements to the iwn(4), iwm(4) and athn(4) drivers. Improvements in vmm(4)/vmd include VM migration, as well as various compatibility and performance improvements. Security enhancements include a new freezero(3) function, further pledge(2)ing of base system programs, and the conversion of several daemons to the fork+exec model. Trapsleds, KARL, and random linking for libcrypto and ld.so dramatically increase security by making it harder to find helpful ROP gadgets and by creating a unique order of objects per boot. A unique kernel is now created by the installer to boot from after install/upgrade. The base system compiler on the amd64 and i386 platforms has switched to clang(1). New versions of OpenSSH, OpenSMTPd, LibreSSL and mandoc are also included. The kernel no longer handles IPv6 Stateless Address Autoconfiguration (RFC 4862), allowing cleanup and simplification of the IPv6 network stack. Improved IPv6 checks for IPsec policies and made them consistent with IPv4. Enabled the use of per-CPU caches in the network packet allocators. Improved UTF-8 line editing support for ksh(1) Emacs and Vi input mode. Breaking change for nvme(4) users with GPT: If you are booting from an nvme(4) drive with a GPT disk layout, you are affected by an off-by-one in the driver, with the consequence that the sector count in your partition table may be incorrect. The only way to fix this is to re-initialize the partition table. Back up your data to another disk before you upgrade. In the new bsd.rd, drop to a shell and re-initialize the GPT: fdisk -iy -g -b 960 sdN Why we argue: style (https://www.sandimetz.com/blog/2017/6/1/why-we-argue-style) I've been thinking about why we argue about code, and how we might transform vehement differences of opinion into active forces for good. My thoughts spring from a very specific context. Ten or twelve times a year I go to an arbitrary business and spend three or more days teaching a course in object-oriented design. I'm an outsider, but for a few days these businesses let me in on their secrets. Here's what I've noticed. In some places, folks are generally happy. Programmers get along. They feel as if they are all "in this together." At businesses like this I spend most of my time actually teaching object-oriented design. Other places, folks are surprisingly miserable. There's a lot of discord, and the programmers have devolved into competing "camps." In these situations the course rapidly morphs away from OO design and into wide-ranging group discussions about how to resolve deeply embedded conflicts. Tolstoy famously said that "Happy families are all alike; every unhappy family is unhappy in its own way." This is known as the Anna Karenina Principle, and describes situations in which success depends on meeting all of a number of criteria.
The only way to be happy is to succeed at every one of them. Unhappiness, unfortunately, can be achieved by any combination of failure. Thus, all happy businesses are similar, but unhappy ones appear unique in their misery. Today I'm interested in choices of syntax, i.e. whether or not your shop has agreed upon and follows a style guide. If you're surprised that I'm starting with this apparently mundane issue, consider yourself lucky in your choice of workplace. If you're shaking your head in rueful agreement about the importance of this topic, I feel your pain. I firmly believe that all of the code that I personally have to examine should come to me in a consistent format. Code is read many more times than it is written, which means that the ultimate cost of code is in its reading. It therefore follows that code should be optimized for readability, which in turn dictates that an application's code should all follow the same style. This is why FreeBSD, and most other open source projects, have a preferred style. Some projects are less specific and less strict about it. Most programmers agree with the prior paragraph, but here's where things begin to break down. As far as I'm concerned, my personal formatting style is clearly the best. However, I'm quite sure that you feel the same. It's easy for a group of programmers to agree that all code should follow a common style, but surprisingly difficult to get them to agree on just what that common style should be. Avoid appointing a human "style cop", which just forces someone to be an increasingly ill-tempered nag. Instead, supply programmers with the information they need to remedy their own transgressions. By the time a pull request is submitted, mis-stylings should long since have been put right. Pull request conversations ought to be about what code does rather than how code looks. What about old code? Ignore it. You don't have to re-style all existing code; just do better from this day forward. Defer updating old code until you touch it for other reasons. Following this strategy means that the code you most often work on will gradually take on a common style. It also means that some of your existing code might never get updated, but if you never look at it, who cares? If you choose to re-style code that you otherwise have no need to touch, you're declaring that changing the look of this old code has more value to your business than delivering the next item on the backlog. The opportunity cost of making a purely aesthetic change includes losing the benefit of what you could have done instead. The rule of thumb is: don't bother updating the styling of stable, existing code unless not doing so costs you money. Most open source projects also avoid reformatting code just to change the style, because of the merge conflicts this will cause for downstream consumers. If you disagree with the style guide upon which your team agrees, you have only two honorable options: First, you can obey the guide despite your aversion. As with me in the Elm story above, this act is likely to change your thinking so that over time you come to prefer the new style. It's possible that if you follow the guide you'll begin to like it. Alternatively, you can decide you will not obey the style guide. Making this decision demands that you leave your current project and find some other project whose guide matches your preferred style. Go there and follow that one. Notice that both of these choices have you following a guide. This part is not optional.
The moral of this story? It's more important for all code to be formatted the same than it is for any one of us to get our own way. Commit to agreeing upon and following a style guide. And if you find that your team cannot come to an agreement, step away from this problem and start a discussion about power. There have been many arguments about style, and it is often one of the first complaints of people new to any open source project. This article covers it fairly well from both sides: a) you should follow the style guide of the project you are contributing to; b) the project should review your actual code, then comment on the style after, providing gentle guidance towards the right style rather than acting as "style cops". *** Interview - The BSDNow Crew, Part II News Roundup Building FreeBSD for the Onion Omega 2 (https://github.com/sysadminmike/freebsd-onion-omega2-build) I got my Onion Omega 2 devices in the mail quite a while ago, but I had never gotten around to trying to install FreeBSD on them. They use a different MIPS SoC than the Onion Omega 1, so it would not work out of the box at the time. Now the SoC is supported! This guide provides the steps to build an image for the Omega 2 using the freebsd-wifi-build infrastructure. First, some config files are modified to make the image small enough for the Omega 2's flash chip. The DTS (Device Tree Source) files are not yet included in FreeBSD, so they are fetched from GitHub. Then the build for the Ralink SoC is run, with the provided DTS file and the MT7628_FDT kernel config. Once the build is complete, you'll have a tftp image file. That image is then compressed and bundled into a u-boot image. Write the files to a USB stick, and plug it into the Omega's dock. Turn it on while holding the reset button with the console open. Press 1 to get into the command line. You will need to reset the USB: usb reset Then load the kernel boot image: fatload usb 0:1 0x80800000 kernel.MT7628_FDT.lzma.uImage And boot it: bootm 0x80800000 At this point FreeBSD should boot. Mount a userland, and you should end up in multi-user mode. Hopefully this will get even easier in the next few weeks, and we'll end up with a more streamlined process to tftp boot the device, then write FreeBSD into the onboard flash so it boots automatically. *** Setting the CPU Affinity on FreeBSD Jails with ezjail (https://www.neelc.org/setting-the-cpu-affinity-on-freebsd-jails-with-ezjail/) While there are more advanced resource controls available for FreeBSD jails, one of the most basic ways to control CPU usage is to limit the subset of CPUs each jail can use. This can make sure that every jail has access to some dedicated resources, while no single jail has the ability to entirely dominate the machine. I just got a new home server: an HP ProLiant ML110 G6. Being a FreeBSD person myself, it was natural that I used it on my server instead of Linux. I chose ezjail to manage the jails on my ProLiant, with the initial one being a Tor middle node. Despite the fact that where my ML110 lives the upstream is only 35Mbps (which is pretty good for cable), I did not want to give my Tor jail access to all four cores. Setting the CPU affinity lets you choose a specific CPU core (or a range of cores) you want to use; it does not just let you pick the number of CPU cores you want and have FreeBSD choose the cores running your jail. Going forward, I assume that you have already created a jail using ezjail-admin. I also do not cover limiting a jail to a certain percentage of CPU usage. ezjail-admin config -c [CORE_NUMBER_FIRST]-[CORE_NUMBER_LAST] [JAIL_NAME] or ezjail-admin config -c [CORE_NUMBER_FIRST],[CORE_NUMBER_SECOND],...,[CORE_NUMBER_N] [JAIL_NAME]
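A minimal concrete example of the syntax above, assuming a jail named "tor" on a four-core box (the jail name and core numbers are illustrative): pin it to cores 2 and 3, leaving 0 and 1 for the base system and other jails.

ezjail-admin config -c 2-3 tor       # restrict the jail to CPUs 2 and 3
ezjail-admin restart tor             # make sure the new affinity is in effect
cpuset -g -j $(jls -j tor jid)       # verify the CPU mask on the running jail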
And hopefully, you should have your ezjail-managed FreeBSD jail limited to the CPU cores you want. While I did not cover limiting CPU percentage or RAM usage, this can be done with rctl. I'll admit: it doesn't really matter which CPU a jail runs on, but it might matter if you don't want a jail to have access to all the CPU cores available and only want [JAIL_NAME] to use one core. Since it's not really possible to just specify the number of CPU cores with ezjail (or even iocell), a fallback is to use CPU affinity, and that requires you to specify exact CPU cores. I know it's not the best solution (it would be better if we could let the scheduler choose, provided a jail only runs on one core), but it's what works. We use this at work on high-core-count machines. When we have multiple databases colocated on the same machine, we make sure each one has a few cores to itself, while it shares other cores with the rest of the machine. We often reserve a core or two for the base system as well. *** A practical guide to containers on FreeNAS for a depraved psychopath. (https://medium.com/@andoriyu/a-practical-guide-to-containers-on-freenas-for-a-depraved-psychopath-c212203c0394) If you are interested in playing with Docker, this guide sets up a Linux VM running on FreeBSD or FreeNAS under bhyve, then runs Linux docker containers on top of it. You know that jails are dope and I know that jails are dope, yet no one else knows it. So here we are, stuck with docker. Two years ago I would have been the last person to recommend using docker, but a whole lot of things have changed these past years… This tutorial uses iohyve to manage the VMs on FreeBSD or FreeNAS. There are many Linux variants you can choose from; RancherOS and CoreOS are the most popular for docker-only hosts. We are going to use RancherOS because it's more lightweight out of the box. Navigate to the RancherOS website and grab the link to the latest version.

sudo iohyve setup pool=zpool kmod=1 net=em0
sudo iohyve fetch https://releases.rancher.com/os/latest/rancheros.iso
sudo iohyve renameiso rancheros.iso rancheros-v1.0.4.iso
sudo pkg install grub2-bhyve
sudo iohyve create rancher 32G
sudo iohyve set rancher loader=grub-bhyve ram=8G cpu=8 con=nmdm0 os=debian
sudo iohyve install rancher rancheros-v1.0.4.iso
sudo iohyve console rancher

Then the tutorial does some basic configuration of RancherOS, and some housekeeping in iohyve to make RancherOS come up unattended at boot. The whole point of this guide is to reduce pain, and using the docker CLI is still painful. There are a lot of web UIs to control docker; most of them include a lot of orchestration services, so they are overkill. Portainer is very lightweight and can run even on a Raspberry Pi. Create a config file as described. After a reboot you will be able to access the WebUI on port 9000. Setup is very easy, so I won't go over it. The docker tools for FreeBSD are still being worked on.
Eventually you will be able to host native FreeBSD docker containers in FreeBSD jails, but we are not quite there yet. In the meantime, you can install sysutils/docker and use it to manage the docker instances running on a remote machine, or in this case, the RancherOS VM running in bhyve. *** Beastie Bits The Ghost of Invention: A Visit to Bell Labs, excerpt from the forthcoming book: "Kitten Clone: Inside Alcatel-Lucent" (https://www.wired.com/2014/09/coupland-bell-labs/) OpenBSD Cookbook (set of Ansible playbooks) (https://github.com/ligurio/openbsd-cookbooks) 15 useful sockstat commands to find open ports on FreeBSD (https://www.tecmint.com/sockstat-command-examples-to-find-open-ports-in-freebsd/) A prehistory of Slashdot (https://medium.freecodecamp.org/a-pre-history-of-slashdot-6403341dabae) Using ed, the unix line editor (https://medium.com/@claudio.santos.ribeiro/using-ed-the-unix-line-editor-557ed6466660) *** Feedback/Questions Malcolm - ZFS snapshots (http://dpaste.com/16EB3ZA#wrap) Darryn - Zones (http://dpaste.com/1DGHQJP#wrap) Mohammad - SSH Keys (http://dpaste.com/08G3VTB#wrap) Send questions, comments, show ideas/topics, or stories you want mentioned on the show to feedback@bsdnow.tv (mailto:feedback@bsdnow.tv)
216: Software is storytelling
EuroBSDcon trip report, how to secure OpenBSD's LDAP server, ZFS channel programs in FreeBSD HEAD and why software is storytelling. This episode was brought to you by Headlines EuroBSDcon Trip Report This is from Frank Moore, who has been supplying us with collections of links for the show and whom we met at EuroBSDcon in Paris for the first time. Here is his trip report. My attendance at the EuroBSDCon 2017 conference in Paris was sprinkled with several 'firsts'. My first visit to Paris, my first time travelling on a EuroTunnel Shuttle train and my first time at any BSD conference. Hopefully, none of these will turn out to be 'lasts'. I arrived on the Wednesday afternoon before the conference started on Thursday morning. My hotel was conveniently located close to the conference centre in Paris' 3rd arrondissement. This area is well-known as a buzzy enclave of hip cafes, eateries, independent shops, markets, modern galleries and museums. It certainly lived up to its reputation. Even better, the weather held over the course of the conference, only raining once, with the rest of the time being both warm and sunny. The first two days were taken up with attending Dr Kirk McKusick's excellent tutorial 'An Introduction to the FreeBSD Open-Source Operating System'. This is training "straight from the horse's mouth". Kirk has worked extensively on the FreeBSD operating system since the 1980s, helping to design the original BSD filesystem (FFS) and later working on UFS as well. Not only is Kirk an engaging speaker, making what could be a dry topic very interesting; he also sprinkles liberal doses of history and war stories throughout his lectures. Want to know why a protocol was designed the way that it was? Or why a system flag has a particular value or position in a record? Kirk was there and has the first-hand answer. He reminisces about his meetings and work with other Unix and BSD luminaries and debunks and confirms common myths in equal measure. Kirk's teaching style and knowledge are impressive. Every section starts with an overview and a big-picture diagram before drilling down into the nitty-gritty detail. Nothing feels superfluous, and everything fits together logically. It's easy to tell that the material and its delivery have been honed over many years, but without feeling stale. Topics covered included the kernel, processes, virtual memory, threads, I/O, devices, FFS, ZFS, and networking. The slides were just as impressive, with additional notes written by a previous student and every slide containing a reference back to the relevant page(s) in the 2nd edition of Kirk's operating system book. As well as a hard copy for those that requested it, Kirk also helpfully supplied soft copies of all the training materials. The breaks in between lectures were useful for meeting the students from the other tutorials and for recovering from the inevitable information overload. It's not often that you get to hear someone as renowned as Dr McKusick give a lecture on something as important as the FreeBSD operating system. If you have any interest in FreeBSD, Unix history, or operating systems in general, I would urge you to grab the opportunity to attend one of his lectures. You won't be disappointed. The last two days of the conference consisted of various hour-long talks by members of each of the main BSD systems. All of them were fairly evenly represented, except DragonFly BSD, which unfortunately only had one talk.
With three talks going on at any one time, it was often difficult to pick which one to go to; at other times there might be nothing to pique the interest. Attendance at a talk is not mandatory, so when no talks looked inviting, just hanging out in one of the lobby areas with other attendees was often just as interesting and informative. The conference centre itself was certainly memorable, with the interior design of an Egyptian temple or pyramid. All the classrooms were more than adequate, while the main auditorium was first-class and easily held the 300+ attendees comfortably. All in all, the facilities, catering and organisation were excellent. Kudos to the EuroBSDCon team, especially Bapt and Antoine, for all their hard work and hospitality. As a long-time watcher of and occasional contributor to the BSD Now podcast, it was good to meet both Allan and Benedict in the flesh. And having done some proofreading for Michael Lucas previously, it was nice to finally meet him as well. My one suggestion to the organisers of the next conference would be to provide more hand-holding for newbies. As a first-time attendee at a BSD conference, it would have been nice to have been formally introduced to various people within the projects as the go-to people for their areas. I could do this myself, but it's not always easy finding the right person and wrangling an introduction. I also think it is a missed opportunity for each project to recruit new developers to their cause. Apparently this is already in place at BSDCan, but it should probably be rolled out across all BSD conferences. Having said all that, my aims for the conference were to take Dr McKusick's course, meet a few BSD people, and make contacts within one of the BSD projects to start contributing. I was successful on all these fronts, so for me this was mission accomplished. Another first! autoconf/clang (No) Fun and Games (https://undeadly.org/cgi?action=article;sid=20170930133438) Robert Nagy (robert@) wrote in with a fascinating story of hunting down a recent problem with ports: You might have noticed the amount of commits to ports regarding autoconf and nested functions and asked yourself… what the hell is this all about? I was hanging out at my friend Antoine (ajacoutot@)'s place just before EuroBSDCon 2017 started, and we were having drinks when he told me that there is this weird bug where Gnome hangs completely after just a couple of seconds of usage, with the gnome-shell process just sitting in the fsleep state. This started to happen at the time when inteldrm(4) was updated, the default compiler was switched to clang(1), and futexes were turned on by default. The next day we started to have a look at the issue, and since the process was hanging in fsleep, it seemed clear that the cause must be futexes, so we started bisecting the base system, which resulted in random success and failure. In the end we figured out that it was neither futex- nor inteldrm(4)-related, so the only thing left was the switch to clang.
We were drunk and angry that now we would have to go and check hundreds of ports, because Gnome is not a small standalone port, so between two bottles of wine a build VM was fired up to do a package build with gcc, because manually building all the dependencies would just take too long and we had already spent almost two days on this. The next day ~200 packages were available to bisect and figure out what was going on. After a couple of tries it turned out that the hang was being caused by the gtk+3 package, which is bad, since almost everything uses gtk+3. Now it was time to figure out which file in the gtk+3 source, when built by clang, was causing the issue. (Compiler optimizations had already been ruled out at this point.) So another round of bisecting happened, building each subdirectory of gtk+3 with clang and waiting for the hang to manifest … and it did not. What the $f? Okay, so something else is going on, and maybe the configure script of gtk+3 is doing something weird with different compilers, so I quickly did two configure runs with gcc and clang and simply diff'd the two directories. Snippets from the diff:

-GDK_HIDDEN_VISIBILITY_CFLAGS = -fvisibility=hidden
+GDK_HIDDEN_VISIBILITY_CFLAGS =
-lt_cv_prog_compiler_rtti_exceptions=no
+lt_cv_prog_compiler_rtti_exceptions=yes
-#define GDK_EXTERN __attribute__((visibility("default"))) extern
-lt_prog_compiler_no_builtin_flag=' -fno-builtin'
+lt_prog_compiler_no_builtin_flag=' -fno-builtin -fno-rtti -fno-exceptions'

Okay, okay, that's something, but wait … clang has symbol visibility support, so what is going on? Let's take a peek at config.log:

configure:29137: checking for -fvisibility=hidden compiler flag
configure:29150: cc -c -fvisibility=hidden -I/usr/local/include -I/usr/X11R6/include conftest.c >&5
conftest.c:82:17: error: function definition is not allowed here
int main (void) { return 0; }
                ^
1 error generated.

Okay, that's clearly an error, but why exactly? autoconf basically generates a huge shell script that checks for whatever you throw at it by creating a file called conftest.c, putting chunks of code into it, and then trying to compile it. In this case the relevant part of the code was:

| int
| main ()
| {
| int main (void) { return 0; }
|   ;
|   return 0;
| }

That is a nested function declaration, which is a GNU extension and is not supported by clang. That's okay, but the question is why the hell you would use nested functions to check for simple compiler flags. The next step was to go and check what is going on in configure.ac to see how the configure script is generated. In the gtk+3 case the following snippet is used:

AC_MSG_CHECKING([for -fvisibility=hidden compiler flag])
AC_TRY_COMPILE([], [int main (void) { return 0; }],
               AC_MSG_RESULT(yes)
               enable_fvisibility_hidden=yes,
               AC_MSG_RESULT(no)
               enable_fvisibility_hidden=no)

According to the autoconf manual, the AC_TRY_COMPILE macro takes (roughly) the parameters AC_TRY_COMPILE(includes, function-body, [action-if-true], [action-if-false]). That clearly states that only a function body has to be specified, because the function definition is already provided automatically, so doing AC_TRY_COMPILE([], [int main (void) { return 0; }], ...) instead of AC_TRY_COMPILE([], [...], ...) results in a nested function declaration. This works just fine with gcc, even though the autoconf usage is wrong. After fixing the autoconf macro in gtk+3 and rebuilding the complete port from scratch with clang, the hang went away completely, as the proper CFLAGS and LDFLAGS were picked up by autoconf for the build.
At this point we realized that most of the ports tree uses autoconf, so this issue might be a lot bigger than we thought, so I asked sthen@ to do a grep on the ports object directory and just search for "function definition is not allowed here", which turned up about 60 additional affected ports. Out of that list there were only two false positive matches: those were actually trying to test whether the compiler supports nested functions. The rest were a combination of several autoconf macros used in the wrong way, e.g. AC_TRY_COMPILE and AC_TRY_LINK. Most of them were fixable by just removing the extra function declaration, or by switching to other autoconf macros like AC_LANG_SOURCE, where you can actually declare your own functions if need be. The conclusion is that this issue was a combination of two things: people copy/pasting autoconf snippets instead of reading the documentation and using the macros in the way they were intended, and the fact that switching to a new compiler is never easy, with bugs or undefined behaviour always lurking in the dark. Thanks to everyone who helped fix all the ports up this quickly! Hopefully all of the changes can be merged upstream, so that others can benefit as well. Interview - David Carlier - @devnexen (https://twitter.com/devnexen) Software Engineer at Afilias *** News Roundup Setting up OpenBSD's LDAP Server (ldapd) with StartTLS and SASL (http://blog.databasepatterns.com/2017/08/setting-up-openbsds-ldap-server-ldapd.html) A tutorial on setting up OpenBSD's native LDAP server with TLS encryption and SASL authentication. OpenBSD has its own LDAP server, ldapd. Here's how to configure it for use with StartTLS and SASL authentication. Create a certificate (acme-client anyone?), then create a basic config file containing: listen on em0 tls certificate ldapserver This will listen on the em0 interface with TLS, using the certificate called ldapserver.crt / ldapserver.key. Validate the configuration: /usr/sbin/ldapd -n Enable and start the service: rcctl enable ldapd; rcctl start ldapd On the client machine: pkg_add openldap-client Copy the certificate to /etc/ssl/trusted.crt and add this line to /etc/openldap/ldap.conf: TLS_CACERT /etc/ssl/trusted.crt Enable and start the SASL service: rcctl enable saslauthd; rcctl start saslauthd Connect to ldapd (-ZZ means force TLS, use -H to specify the URI): ldapsearch -H ldap://ldapserver -ZZ FreeBSD Picks Up Support for ZFS Channel Programs in -current (https://svnweb.freebsd.org/base?view=revision&revision=324163) ZFS channel programs (ZCP) add support for performing compound ZFS administrative actions via Lua scripts in a sandboxed environment (with time and memory limits). This initial commit includes both base support for running ZCP scripts, and a small initial library of API calls which support getting properties and listing, destroying, and promoting datasets. Testing: in addition to the included unit tests, channel programs have been in use at Delphix for several months for batch destroying filesystems. Take a simple task as an example: create a snapshot, then set a property on that snapshot. In the traditional system, when you issue the snapshot command, that closes the currently open transaction group (say #100) and opens a new one, #101. While #100 is being written to disk, other writes are accumulated in #101. Once #100 is flushed to disk, the 'zfs snapshot' command returns. You can then issue the 'zfs set' command, which actually ends up going into transaction group #102.
Each administrative action needs to wait for its transaction group to flush, which under heavy load could take multiple seconds. So if you want to create AND set, you need to wait for two or three transaction groups. Meanwhile, during transaction group #101, the snapshot existed without the property set, which could cause all kinds of side effects. ZFS channel programs solve this by allowing you to perform a small scripted set of actions as a single atomic operation. In Delphix's appliance, they often needed to do as many as 15 operations together, which might take multiple minutes; with channel programs it is much faster, far safer, and has fewer chances of side effects.
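To make this concrete, here is a minimal sketch of a channel program, modelled on the batch-destroy use case mentioned above (the pool and dataset names are invented, and the exact set of Lua API calls available in this initial FreeBSD commit may be narrower than in upstream OpenZFS):

cat > batch_destroy.lua <<'EOF'
-- destroy every snapshot of tank/tmp in a single transaction group
for snap in zfs.list.snapshots("tank/tmp") do
    zfs.sync.destroy(snap)
end
EOF
zfs program tank batch_destroy.lua

Because the whole script commits in one transaction group, there is no window in which only part of the work is visible, which is exactly the problem described above. BSDCan 2017 - Matt Ahrens: Building products based on OpenZFS, using channel programs -- Video Soon (http://www.bsdcan.org/2017/schedule/events/854.en.html) Software Is About Storytelling (http://bravenewgeek.com/software-is-about-storytelling/) Tyler Treat writes on the Brave New Geek blog: Software engineering is more a practice in archeology than it is in building. As an industry, we undervalue storytelling and focus too much on artifacts and tools and deliverables. How many times have you been left scratching your head while looking at a piece of code, system, or process? It's the story, the legacy left behind by that artifact, that is just as important—if not more—than the artifact itself. And I don't mean what's in the version control history—that's often useless. I mean the real, human story behind something. Artifacts, whether that's code or tools or something else entirely, are not just snapshots in time. They're the result of a series of decisions, discussions, mistakes, corrections, problems, constraints, and so on. They're the product of the engineering process, but the problem is they usually don't capture that process in its entirety. They rarely capture it at all. They commonly end up being nothing but a snapshot in time. It's often the sign of an inexperienced engineer when someone looks at something and says, "this is stupid" or "why are they using X instead of Y?" They're ignoring the context, the fact that circumstances may have been different. There is a story that led up to that point, a reason for why things are the way they are. If you're lucky, the people involved are still around. Unfortunately, this is not typically the case. And so it's not necessarily the poor engineer's fault for wondering these things. Their predecessors haven't done enough to make that story discoverable and share that context. I worked at a company that built a homegrown container PaaS on ECS. Doing that today would be insane with the plethora of container solutions available now. "Why aren't you using Kubernetes?" Well, four years ago when we started, Kubernetes didn't exist. Even Docker was just in its infancy. And it's not exactly a flick of a switch to move multiple production environments to a new container runtime, not to mention the politicking with leadership to convince them it's worth it to not ship any new code for the next quarter as we rearchitect our entire platform. Oh, and now the people behind the original solution are no longer with the company. Good luck! And this is on the timescale of about five years. That's maybe like one generation of engineers at the company at most—nothing compared to the decades or more software usually lives (an interesting observation is that timescale, I think, is proportional to the size of an organization).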
Don’t underestimate momentum, but also don’t underestimate changing circumstances, even on a small time horizon. The point is, stop looking at technology in a vacuum. There are many facets to consider. Likewise, decisions are not made in a vacuum. Part of this is just being an empathetic engineer. The corollary to this is you don’t need to adopt every bleeding-edge tech that comes out to be successful, but the bigger point is software is about storytelling. The question you should be asking is how does your organization tell those stories? Are you deliberate or is it left to tribal knowledge and hearsay? Is it something you truly value and prioritize or simply a byproduct? Documentation is good, but the trouble with documentation is it’s usually haphazard and stagnant. It’s also usually documentation of how and not why. Documenting intent can go a long way, and understanding the why is a good way to develop empathy. Code survives us. There’s a fantastic talk by Bryan Cantrill on oral tradition in software engineering (https://youtu.be/4PaWFYm0kEw) where he talks about this. People care about intent. Specifically, when you write software, people care what you think. As Bryan puts it, future generations of programmers want to understand your intent so they can abide by it, so we need to tell them what our intent was. We need to broadcast it. Good code comments are an example of this. They give you a narrative of not only what’s going on, but why. When we write software, we write it for future generations, and that’s the most underestimated thing in all of software. Documenting intent also allows you to document your values, and that allows the people who come after you to continue to uphold them. Storytelling in software is important. Without it, software archeology is simply the study of puzzles created by time and neglect. When an organization doesn’t record its history, it’s bound to repeat the same mistakes. A company’s memory is comprised of its people, but the fact is people churn. Knowing how you got here often helps you with getting to where you want to be. Storytelling is how we transcend generational gaps and the inevitable changing of the old guard to the new guard in a maturing engineering organization. The same is true when we expand that to the entire industry. We’re too memoryless—shipping code and not looking back, discovering everything old that is new again, and simply not appreciating our lineage. Beastie Bits 1st BSD Users Stockholm Meetup (https://www.meetup.com/en-US/BSD-Users-Stockholm/) Absolute FreeBSD, 3rd Edition draft completed (https://blather.michaelwlucas.com/archives/3020) Absolute FreeBSD, 3rd Edition Table of Contents (https://blather.michaelwlucas.com/archives/2995) t2k17 Hackathon Report: My first time (Aaron Bieber) (https://undeadly.org/cgi?action=article;sid=20170824193521) The release of pfSense 2.4.0 will be slightly delayed to apply patches for vulnerabilities in 3rd party packages that are part of pfSense (https://www.netgate.com/blog/no-plan-survives-contact-with-the-internet.html) Feedback/Questions Ben writes in that zrepl is in ports now (http://dpaste.com/1XMJYMH#wrap) Peter asks us about Netflix on BSD (http://dpaste.com/334WY4T#wrap) meka writes in about dhclient exiting (http://dpaste.com/3GSGKD3#wrap) ***
215: Turning FreeBSD up to 100 Gbps
We look at how Netflix serves 100 Gbps from an Open Connect Appliance, read through the 2nd quarter FreeBSD status report, show you a freebsd-update speedup via an nginx reverse proxy, and customize your OpenBSD default shell. This episode was brought to you by Headlines Serving 100 Gbps from an Open Connect Appliance (https://medium.com/netflix-techblog/serving-100-gbps-from-an-open-connect-appliance-cdb51dda3b99) In the summer of 2015, the Netflix Open Connect CDN team decided to take on an ambitious project. The goal was to leverage the new 100GbE network interface technology just coming to market in order to be able to serve at 100 Gbps from a single FreeBSD-based Open Connect Appliance (OCA) using NVM Express (NVMe)-based storage. At the time, the bulk of our flash storage-based appliances were close to being CPU limited while serving at 40 Gbps using a single-socket Xeon E5-2697v2. The first step was to find the CPU bottlenecks in the existing platform while we waited for newer CPUs from Intel, newer motherboards with PCIe Gen3 x16 slots that could run the new Mellanox 100GbE NICs at full speed, and for systems with NVMe drives. Fake NUMA: Normally, most of an OCA's content is served from disk, with only 10-20% of the most popular titles being served from memory (see our previous blog, Content Popularity for Open Connect (https://medium.com/@NetflixTechBlog/content-popularity-for-open-connect-b86d56f613b) for details). However, our early pre-NVMe prototypes were limited by disk bandwidth. So we set up a contrived experiment where we served only the very most popular content on a test server. This allowed all content to fit in RAM and therefore avoid the temporary disk bottleneck. Surprisingly, the performance actually dropped from being CPU limited at 40 Gbps to being CPU limited at only 22 Gbps! The ultimate solution we came up with is what we call "Fake NUMA". This approach takes advantage of the fact that there is one set of page queues per NUMA domain. All we had to do was to lie to the system and tell it that we have one Fake NUMA domain for every 2 CPUs. After we did this, our lock contention nearly disappeared and we were able to serve at 52 Gbps (limited by the PCIe Gen3 x8 slot) with substantial CPU idle time. After we had newer prototype machines, with an Intel Xeon E5-2697v3 CPU, PCIe Gen3 x16 slots for the 100GbE NIC, and more disk storage (4 NVMe or 44 SATA SSD drives), we hit another bottleneck, also related to a lock on a global list. We were stuck at around 60 Gbps on this new hardware, and we were constrained by pbufs. Our first problem was that the list was too small: we were spending a lot of time waiting for pbufs. This was easily fixed by increasing the number of pbufs allocated at boot time via the kern.nswbuf tunable.
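On FreeBSD that kind of change is a one-line boot-time setting; a sketch (the value here is purely illustrative, the right number depends on the workload):

# /boot/loader.conf
kern.nswbuf="4096"    # raise the number of pbufs allocated at boot

However, this update revealed the next problem, which was lock contention on the global pbuf mutex. To solve this, we changed the vnode pager (which handles paging to files, rather than the swap partition, and hence handles all sendfile() I/O) to use the normal kernel zone allocator. This change removed the lock contention, and boosted our performance into the 70 Gbps range. As noted above, we make heavy use of the VM page queues, especially the inactive queue. Eventually, the system runs short of memory and these queues need to be scanned by the page daemon to free up memory. At full load, this was happening roughly twice per minute.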
When this happened, all NGINX processes would go to sleep in vm_wait() and the system would stop serving traffic while the pageout daemon worked to scan pages, often for several seconds. This problem is actually made progressively worse as one adds NUMA domains, because there is one pageout daemon per NUMA domain, but the page deficit that it is trying to clear is calculated globally. So if the vm pageout daemon decides to clean, say, 1GB of memory and there are 16 domains, each of the 16 pageout daemons will individually attempt to clean 1GB of memory. To solve this problem, we decided to proactively scan the VM page queues. In the sendfile path, when allocating a page for I/O, we run the pageout code several times per second on each VM domain. The pageout code is run in its lightest-weight mode in the context of one unlucky NGINX process. Other NGINX processes continue to run and serve traffic while this is happening, so we can avoid bursts of pager activity that block traffic serving. Proactive scanning allowed us to serve at roughly 80 Gbps on the prototype hardware. Hans Petter Selasky, Mellanox's 100GbE driver developer, came up with an innovative solution to our next problem. Most modern NICs will supply a Receive Side Scaling (RSS) hash result to the host. RSS is a standard developed by Microsoft wherein TCP/IP traffic is hashed by source and destination IP address and/or TCP source and destination ports. The RSS hash result will almost always uniquely identify a TCP connection. Hans' idea was that rather than just passing the packets to the LRO engine as they arrive from the network, we should hold the packets in a large batch, and then sort the batch of packets by RSS hash result (and original time of arrival, to keep them in order). After the packets are sorted, packets from the same connection are adjacent even when they arrive widely separated in time. Therefore, when the packets are passed to the FreeBSD LRO routine, it can aggregate them. With this new LRO code, we were able to achieve an LRO aggregation rate of over 2 packets per aggregation, and were able to serve at well over 90 Gbps for the first time on our prototype hardware for mostly unencrypted traffic. So the job was done. Or was it? The next goal was to achieve 100 Gbps while serving only TLS-encrypted streams. By this point, we were using hardware which closely resembles today's 100GbE flash storage-based OCAs: four NVMe PCIe Gen3 x4 drives, 100GbE ethernet, and a Xeon E5v4 2697A CPU. With the improvements described in the Protecting Netflix Viewing Privacy at Scale blog entry, we were able to serve TLS-only traffic at roughly 58 Gbps. In the lock contention problems we'd observed above, the cause of any increased CPU use was relatively apparent from normal system-level tools like flame graphs, DTrace, or lockstat. The 58 Gbps limit was comparatively strange. As before, the CPU use would increase linearly as we approached the 58 Gbps limit, but then as we neared the limit, the CPU use would increase almost exponentially. Flame graphs just showed everything taking longer, with no apparent hotspots. We finally had a hunch that we were limited by our system's memory bandwidth. We used the Intel® Performance Counter Monitor Tools to measure the memory bandwidth we were consuming at peak load. We then wrote a simple memory thrashing benchmark that used one thread per core to copy between large memory chunks that did not fit into cache.
According to the PCM tools, this benchmark consumed the same amount of memory bandwidth as our OCA's TLS-serving workload, so it was clear that we were memory limited. At this point, we became focused on reducing memory bandwidth usage. To assist with this, we began using the Intel VTune profiling tools to identify memory loads and stores, and to identify cache misses. Because we are using sendfile() to serve data, encryption is done from the virtual memory page cache into connection-specific encryption buffers. This preserves the normal FreeBSD page cache in order to allow serving of hot data from memory to many connections. One of the first things that stood out to us was that the ISA-L encryption library was using half again as much memory bandwidth for memory reads as it was for memory writes. From looking at VTune profiling information, we saw that ISA-L was somehow reading both the source and destination buffers, rather than just writing to the destination buffer. We realized that this was because the AVX instructions used by ISA-L for encryption on our CPUs worked on 256-bit (32-byte) quantities, whereas the cache line size was 512 bits (64 bytes), thus triggering the system to do read-modify-writes when data was written. The problem is that the CPU will normally access the memory system in 64-byte cache-line-sized chunks, reading an entire 64 bytes to access even just a single byte. After a quick email exchange with the ISA-L team, they provided us with a new version of the library that used non-temporal instructions when storing encryption results. Non-temporals bypass the cache, and allow the CPU direct access to memory. This meant that the CPU was no longer reading from the destination buffers, and so this increased our bandwidth from 58 Gbps to 65 Gbps. At 100 Gbps, we're moving about 12.5 GB/s of 4K pages through our system unencrypted. Adding encryption doubles that to 25 GB/s worth of 4K pages. That's about 6.25 million mbufs per second. When you add in the extra 2 mbufs used by the crypto code for TLS metadata at the beginning and end of each TLS record, that works out to another 1.6M mbufs/sec, for a total of about 8M mbufs/second. With roughly 2 cache line accesses per mbuf, that's 128 bytes * 8M, which is 1 GB/s (8 Gbps) of data that is accessed at multiple layers of the stack (alloc, free, crypto, TCP, socket buffers, drivers, etc). At this point, we're able to serve 100% TLS traffic comfortably at 90 Gbps using the default FreeBSD TCP stack. However, the goalposts keep moving. We've found that when we use more advanced TCP algorithms, such as RACK and BBR, we are still a bit short of our goal. We have several ideas that we are currently pursuing, which range from optimizing the new TCP code to increasing the efficiency of LRO to trying to do encryption closer to the transfer of the data (either from the disk, or to the NIC) so as to take better advantage of Intel's DDIO and save memory bandwidth.
FreeBSD April to June 2017 Status Report (https://www.freebsd.org/news/status/report-2017-04-2017-06.html) FreeBSD Team Reports: FreeBSD Release Engineering Team, Ports Collection, The FreeBSD Core Team, The FreeBSD Foundation, The Postmaster Team. Projects: 64-bit Inode Numbers, Capability-Based Network Communication for Capsicum/CloudABI, Ceph on FreeBSD, DTS Updates. Kernel: Coda revival, FreeBSD Driver for the Annapurna Labs ENA, Intel 10G Driver Update, pNFS Server Plan B. Architectures: FreeBSD on Marvell Armada38x, FreeBSD/arm64. Userland Programs: DTC, Using LLVM's LLD Linker as FreeBSD's System Linker. Ports: A New USES Macro for Porting Cargo-Based Rust Applications, GCC (GNU Compiler Collection), GNOME on FreeBSD, KDE on FreeBSD, New Port: FRRouting, PHP Ports: Help Improving QA, Rust, sndio Support in the FreeBSD Ports Collection, TensorFlow, Updating Port Metadata for non-x86 Architectures, Xfce on FreeBSD. Documentation: Absolute FreeBSD, 3rd Edition, Doc Version Strings Improved by Their Absence, New Xen Handbook Section. Miscellaneous: BSD Meetups at Rennes (France). Third-Party Projects: HardenedBSD. DPDK, VPP, and the future of pfSense @ the DPDK Summit (https://www.pscp.tv/DPDKProject/1dRKZnleWbmKB?t=5h1m0s) The DPDK (Data Plane Development Kit) conference included a short update from the pfSense project. The video starts with a quick introduction to pfSense and the company behind it. It covers the issues they ran into trying to scale to 10 Gbps and beyond, and some of the solutions they tried: libuinet, netmap, packet-journey. Then they discovered VPP (Vector Packet Processing). The video then covers the architecture of the new pfSense. pfSense has launched on EC2, is coming to Azure soon, and will launch support for the new Atom C3000 and Xeon hardware with built-in QAT (Quick-Assist crypto offload) in November. The future: 100 Gbps, MPLS, VXLANs, and ARM64 hardware support. *** News Roundup Local nginx reverse proxy cache for freebsd-update (https://wiki.freebsd.org/VladimirKrstulja/Guides/FreeBSDUpdateReverseProxy) Vladimir Krstulja has created this interesting tutorial on the FreeBSD wiki about a freebsd-update reverse proxy cache. Either because you're a good netizen and don't want to repeatedly hammer the FreeBSD mirrors to upgrade all your systems, or you want to benefit from the speed of having a local "mirror" (cache, more precisely), running a freebsd-update reverse proxy cache with, say, nginx is dead simple. 1. Install nginx somewhere. 2. Configure nginx for a subdomain, say, freebsd-update.example.com. 3. On all your hosts, in all your jails, configure /etc/freebsd-update.conf with the new ServerName. And... that's it. Running freebsd-update will use the ServerName domain, which is your nginx reverse proxy. Note the comment about using a "nearby" server is not quite true: FreeBSD update mirrors are frequently slow, and running such a reverse proxy cache significantly speeds things up. Caveats: This is a simple cache. That means it doesn't consider the files as a whole repository, which in turn means updates to your cache are not atomic. It'd be advised to nuke your cache before your update run, as its point is only to retain the files in a local cache for the short period of time required for all your machines to be updated.
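A minimal sketch of what the nginx side might look like (server names, paths, and cache sizes here are illustrative, not taken from the wiki guide):

# nginx.conf fragment: cache freebsd-update requests locally
proxy_cache_path /var/cache/nginx/freebsd-update levels=1:2
                 keys_zone=fbsd_update:10m max_size=10g inactive=3d;
server {
    listen 80;
    server_name freebsd-update.example.com;
    location / {
        proxy_pass http://update.freebsd.org;
        proxy_cache fbsd_update;
        proxy_cache_valid 200 3d;
    }
}

Each client's /etc/freebsd-update.conf would then point at the proxy with a line like: ServerName freebsd-update.example.com

ClonOS is a free, open-source FreeBSD-based platform for virtual environment creation and management (https://clonos.tekroutine.com/) The operating system uses FreeBSD's development branch (12.0-CURRENT) as its base.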
ClonOS uses ZFS as the default file system and includes web-based administration tools for managing virtual machines and jails. The project's website also mentions the availability of templates for quickly setting up new containers and web-based VNC access to jails. Puppet, we are told, can be used for configuration management. ClonOS can be downloaded as a disk image file (IMG) or as an optical media image (ISO). I downloaded the ISO file, which is 1.6GB in size. Booting from ClonOS's media displays a text console asking us to select the type of text terminal we are using. There are four options and most people can probably safely take the default xterm option. The operating system, on the surface, appears to be a full installation of FreeBSD 12. The usual collection of FreeBSD packages is available, including manual pages, a compiler and the typical selection of UNIX command line utilities. The operating system uses ZFS as its file system and uses approximately 3.3GB of disk space. ClonOS requires about 50MB of active memory and 143MB of wired memory before any services or jails are created. Most of the key features of ClonOS, the parts which set it apart from vanilla FreeBSD, can be accessed through a web-based control panel. When we connect to this control panel, over a plain HTTP connection, using our web browser, we are not prompted for an account name or password. The web-based interface has a straightforward layout. Down the left side of the browser window we find categories of options and controls. Over on the right side of the window are the specific options or controls available in the selected category. At the top of the page there is a drop-down menu where we can toggle the displayed language between English and Russian, with English being the default. There are twelve option screens we can access in the ClonOS interface and I want to quickly give a summary of each one: Overview - this page shows a top-level status summary. The page lists the number of jails and nodes in the system. We are also shown the number of available CPU cores and available RAM on the system. Jail containers - this page allows us to create and delete jails. We can also change some basic jail settings on this page, adjusting the network configuration and hostname. Plus we can click a button to open a VNC window that allows us to access the jail's command line interface. Template for jails - provides a list of available jail templates. Each template is listed with its name and a brief description. For example, we have a Wordpress template and a bittorrent template. We can click a listed template to create a new jail with a vanilla installation of the selected software included. We cannot download or create new templates from this page. Bhyve VMs - this page is very much like the Jail containers page, but concerns creating and managing virtual machines. Virtual Private Network - allows for the management of subnets. Authkeys - upload security keys, though it is not clear what these keys will be used for. Storage media - upload ISO files that will be used when creating virtual machines and installing an operating system in the new virtual environment. FreeBSD Bases - I think this page downloads and builds source code for alternative versions of FreeBSD, but I am unsure and could not find any associated documentation for this page. FreeBSD Sources - download source code for various versions of FreeBSD. TaskLog - browse logs of events, particularly actions concerning jails.
SQLite admin - this page says it will open an interface for managing a SQLite database. Clicking the link on the page gives a file-not-found error. Settings - this page simply displays a message saying the settings page has not been implemented yet. While playing with ClonOS, I wanted to perform a couple of simple tasks. I wanted to use the Wordpress template to set up a blog inside a jail. I wanted a generic, empty jail in which I could play and run commands without harming the rest of the operating system. I also wanted to try installing an operating system other than FreeBSD inside a Bhyve virtual environment. I thought this would give me a pretty good idea of how quick and easy ClonOS would make common tasks. Conclusions: ClonOS appears to be in its early stages of development, more of a feature preview or proof-of-concept than a polished product. A few of the settings pages have not been finished yet, the web-based controls for jails are unable to create jails that connect to the network, and I was unable to upload even small ISO files to create virtual machines. The project's website mentions working with Puppet to handle system configuration, but I did not encounter any Puppet options. There also does not appear to be any documentation on using Puppet on the ClonOS platform. One of the biggest concerns I had was the lack of security on ClonOS. The web-based control panel and terminal both automatically log in as the root user. Passwords we create for our accounts are ignored and we cannot log out of the local terminal. This means anyone with physical access to the server automatically gains root access and, in addition, anyone on our local network gets access to the web-based admin panel. As it stands, it would not be safe to install ClonOS on a shared network. Some of the ideas present are good ones. I like the idea of jail templates and have used them on other systems. The graphical Bhyve tools could be useful too, if the limitations of the ISO manager are sorted out. But right now, ClonOS still has a way to go before it is likely to be safe or practical to use. Customize ksh display for OpenBSD (http://nanxiao.me/en/customize-ksh-display-for-openbsd/) The default shell for OpenBSD is ksh, and it looks a little monotonous. To make its user experience more friendly, I need to do some customizations: (1) Modify the "Prompt String" to display the user name and current directory: PS1='$USER:$PWD# ' (2) Install the colorls package: pkg_add colorls Use it to replace the stock ls command: alias ls='colorls -G' (3) Change the LSCOLORS environment variable to get your favorite colors. For example, I don't want directories displayed in the default blue, so I change it to magenta: LSCOLORS=fxexcxdxbxegedabagacad For a detailed explanation of LSCOLORS, please refer to the colorls manual. This is my final modification of .profile:

PS1='$USER:$PWD# '
export PS1
LSCOLORS=fxexcxdxbxegedabagacad
export LSCOLORS
alias ls='colorls -G'

DragonFly 5 release candidate (https://www.dragonflydigest.com/2017/10/02/20295.html) Commit (http://lists.dragonflybsd.org/pipermail/commits/2017-September/626463.html) I tagged DragonFly 5.0 (commit message list in that link) over the weekend, and there's a 5.0 release candidate for download (http://mirror-master.dragonflybsd.org/iso-images/). It's RC2 because the recent Radeon changes had to be taken out.
(http://lists.dragonflybsd.org/pipermail/commits/2017-September/626476.html) Beastie Bits Faster forwarding (http://www.grenadille.net/post/2017/08/21/Faster-forwarding) DRM-Next-Kmod hits the ports tree (http://www.freshports.org/graphics/drm-next-kmod/) OpenBSD Community Goes Platinum (https://undeadly.org/cgi?action=article;sid=20170829025446) Setting up iSCSI on TrueOS and FreeBSD12 (https://www.youtube.com/watch?v=4myESLZPXBU) *** Feedback/Questions Christopher - Virtualizing FreeNAS (http://dpaste.com/38G99CK#wrap) Van - Tar Question (http://dpaste.com/3MEPD3S#wrap) Joe - Book Reviews (http://dpaste.com/0T623Z6#wrap) ***
214: The history of man, kind
The costs of open sourcing a project are explored, we discover why PS4 downloads are so slow, delve into the history of UNIX man pages, and more. This episode was brought to you by Headlines The Cost Of Open Sourcing Your Project (https://meshedinsights.com/2016/09/20/open-source-unlikely-to-be-abandonware/) Accusing a company of “dumping” their project as open source is probably misplaced – it’s an expensive business no-one would do frivolously. If you see an active move to change software licensing or governance, it’s likely someone is paying for it and thus could justify the expense to an executive. A Little History Some case study cameos may help. From 2004 onwards, Sun Microsystems had a policy of all its software moving to open source. The company migrated almost all products to open source licenses, and had varying degrees of success engaging communities around the various projects, largely related to the outlooks of the product management and Sun developers for the project. Sun occasionally received requests to make older, retired products open source. For example, Sun acquired a company called Lighthouse Design which created a respected suite of office productivity software for Steve Jobs’ NeXT platform. Strategy changes meant that software headed for the vault (while Jonathan Schwartz, a founder of Lighthouse, headed for the executive suite). Members of the public asked if Sun would open source some of this software, but these requests were declined because there was no business unit willing to fund the move. When Sun was later bought by Oracle, a number of those projects that had been made open source were abandoned. “Abandoning” software doesn’t mean leaving it for others; it means simply walking away from wherever you left it. In the case of Sun’s popular identity middleware products, that meant Oracle let the staff go and tried to migrate customers to other products, while remaining silent in public on the future of the project. But the code was already open source, so the user community was able to pick up the pieces and carry on, with help from Forgerock. It costs a lot of money to open source a mature piece of commercial software, even if all you are doing is “throwing a tarball over the wall”. That’s why companies abandoning software they no longer care about so rarely make it open source, and those abandoning open source projects rarely move them to new homes that benefit others. If all you have thought about is the eventual outcome, you may be surprised how expensive it is to get there. Costs include: For throwing a tarball over the wall: Legal clearance. Having the right to use the software is not the same as giving everyone in the world an unrestricted right to use it and create derivatives. Checking every line of code to make sure you have the rights necessary to release under an OSI-approved license is a big task requiring high-value employees on the “liberation team”. That includes both developers and lawyers; neither come cheap. Repackaging. To pass it to others, a self-contained package containing all necessary source code, build scripts and non-public source and tool dependencies has to be created since it is quite unlikely to exist internally. Again, the liberation team will need your best developers. Preserving provenance. Just because you have confidence that you have the rights to the code, that doesn’t mean anyone else will. 
The version control system probably contains much of the information that gives confidence about who wrote which code, so the repackaging needs to also include a way to migrate the commit information. Code cleaning. The file headers will hopefully include origin information but the liberation team had better check. They also need to check the comments for libel and profanities, not to mention trade secrets (especially those from third parties) and other IP issues. For a sustainable project, all the above plus: Compliance with host governance. It is a fantastic idea to move your project to a host like Apache, Conservancy, Public Software and so on. But doing so requires preparatory work. As a minimum you will need to negotiate with the new host organisation, and they may well need you to satisfy their process requirements. Paperwork obviously, but also the code may need conforming copyright statements and more. That’s more work for your liberation team. Migration of rights. Your code has an existing community who will need to migrate to your new host. That includes your staff – they are community too! They will need commit rights, governance rights, social media rights and more. Your liberation team will need your community manager, obviously, but may also need HR input. Endowment. Keeping your project alive will take money. It’s all been coming from you up to this point, but if you simply walk away before the financial burden has been accepted by the new community and hosts there may be a problem. You should consider making an endowment to your new host to pay for their migration costs plus the cost of hosting the community for at least a year. Marketing. Explaining the move you are making, the reasons why you are making it and the benefits for you and the community is important. If you don’t do it, there are plenty of trolls around who will do it for you. Creating a news blog post and an FAQ — the minimum effort necessary — really does take someone experienced and you’ll want to add such a person to your liberation team. Motivations There has to be some commercial reason that makes the time, effort and thus expense worth incurring. Some examples of motivations include: Market Strategy. An increasing number of companies are choosing to create substantial, openly-governed open source communities around software that contributes to their business. An open multi-stakeholder co-developer community is an excellent vehicle for innovation at the lowest cost to all involved. As long as your market strategy doesn’t require creating artificial scarcity. Contract with a third party. While the owner of the code may no longer be interested, there may be one or more parties to which they owe a contractual responsibility. Rather than breaching that contract, or buying it out, a move to open source may be better. Some sources suggest a contractual obligation to IBM was the reason Oracle abandoned OpenOffice.org by moving it over to the Apache Software Foundation for example. Larger dependent ecosystem. You may have no further use for the code itself, but you may well have other parts of your business which depend on it. If they are willing to collectively fund development you might consider an “inner source” strategy which will save you many of the costs above. But the best way to proceed may well be to open the code so your teams and those in other companies can fund the code. Internal politics. 
From the outside, corporations look monolithic, but from the inside it becomes clear they are a microcosm of the market in which they exist. As a result, they have political machinations that may be addressed by open source. One of Oracle's motivations for moving NetBeans to Apache seems to have been political. Despite multiple internal groups needing it to exist, the code was not generating enough direct revenue to satisfy successive executive owners, who allegedly tried to abandon it on more than one occasion. Donating it to Apache meant that couldn't happen again. None of this is to say a move to open source guarantees the success of a project. A "Field of Dreams" strategy only works in the movies, after all. But while it may be tempting to look at a failed corporate liberation and describe it as "abandonware", chances are it was intended as nothing of the kind. Why PS4 downloads are so slow (https://www.snellman.net/blog/archive/2017-08-19-slow-ps4-downloads/) From the blog that brought us "The origins of XXX as FIXME (https://www.snellman.net/blog/archive/2017-04-17-xxx-fixme/)" and "The mystery of the hanging S3 downloads (https://www.snellman.net/blog/archive/2017-07-20-s3-mystery/)", this week it is: "Why are PS4 downloads so slow?" Game downloads on PS4 have a reputation of being very slow, with many people reporting downloads being an order of magnitude faster on Steam or Xbox. This had long been on my list of things to look into, but at a pretty low priority. After all, the PS4 operating system is based on a reasonably modern FreeBSD (9.0), so there should not be any crippling issues in the TCP stack. The implication is that the problem is something boring, like an inadequately dimensioned CDN. But then I heard that people were successfully using local HTTP proxies as a workaround. It should be pretty rare for that to actually help with download speeds, which made this sound like a much more interesting problem. Before running any experiments, it's good to have a mental model of how the thing we're testing works, and where the problems might be. If nothing else, it will guide the initial experiment design. The speed of a steady-state TCP connection is basically defined by three numbers: the amount of data the client is willing to receive on a single round trip (the TCP receive window), the amount of data the server is willing to send on a single round trip (the TCP congestion window), and the round-trip latency between the client and the server (RTT). To a first approximation, the connection speed will be:

speed = min(rwin, cwin) / RTT

With this model, how could a proxy speed up the connection? The speed through the proxy should be the minimum of the speed between the client and proxy, and the proxy and server; it should only possibly be slower. With a local proxy the client-proxy RTT will be very low; that connection is almost guaranteed to be the faster one. The improvement will have to be from the server-proxy connection being somehow better than the direct client-server one. The RTT will not change, so there are just two options: either the client has a much smaller receive window than the proxy, or the client is somehow causing the server's congestion window to decrease (e.g. the client is randomly dropping received packets, while the proxy isn't).
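Plugging the article's own window sizes into that formula makes the stakes obvious (the 30 ms RTT is an invented but plausible round trip to a nearby CDN node):

7 kB window, 30 ms RTT: 7 kB / 0.030 s ≈ 233 kB/s (about 1.9 Mbps)
650 kB window, 30 ms RTT: 650 kB / 0.030 s ≈ 21.7 MB/s (about 173 Mbps)

The roughly 100x gap between those two figures matches the observation below that a 7 kB receive window makes downloads take about 100 times longer than they should.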
After setting up a test rig, where the PS4's connection was bridged through a Linux box so packets could be captured and artificial latency could be added, some interesting results came up: The differences in receive windows at different times are striking. And more importantly, the changes in the receive windows correspond very well to specific things I did on the PS4. When the download was started, the game Styx: Shards of Darkness was running in the background (just idling on the title screen). The download was limited by a receive window of under 7 kB. This is an incredibly low value; it's basically going to cause the downloads to take 100 times longer than they should. And this was not a coincidence: whenever that game was running, the receive window would be that low. Having an app running (e.g. Netflix, Spotify) limited the receive window to 128 kB, for about a 5x reduction in potential download speed. Moving apps, games, or the download window to the foreground or background didn't have any effect on the receive window. Playing an online match in a networked game (Dreadnought) caused the receive window to be artificially limited to 7 kB. I ran a speedtest at a time when downloads were limited to a 7 kB receive window. It got a decent receive window of over 400 kB; the conclusion is that the artificial receive window limit appears to only apply to PSN downloads. When a game was started (causing the previously running game to be stopped automatically), the receive window could increase to 650 kB for a very brief period of time. Basically it appears that the receive window gets unclamped when the old game stops, and then clamped again a few seconds later when the new game actually starts up. I did a few more test runs, and all of them seemed to support the above findings. The only additional information from that testing is that the rest mode behavior was dependent on the PS4 settings. Originally I had it set up to suspend apps when in rest mode. If that setting was disabled, the apps would be closed when entering rest mode, and the downloads would proceed at full speed. The PS4 doesn't make it very obvious exactly what programs are running. For games, the interaction model is that opening a new game closes the previously running one. This is not how other apps work; they remain in the background indefinitely until you explicitly close them. So, FreeBSD and its network stack are not to blame: Sony used a poor method to try to keep downloads from interfering with your gameplay. The impact of changing the receive window is highly dependent upon RTT, so it doesn't work as evenly as actual traffic shaping or queueing would. An interesting deep dive; it is well worth reading the full article and checking out the graphs. *** OpenSSH 7.6 Released (http://www.openssh.com/releasenotes.html#7.6) From the release notes: This release includes a number of changes that may affect existing configurations: ssh(1): delete SSH protocol version 1 support, associated configuration options and documentation. ssh(1)/sshd(8): remove support for the hmac-ripemd160 MAC. ssh(1)/sshd(8): remove support for the arcfour, blowfish and CAST ciphers. Refuse RSA keys <1024 bits in length and improve reporting for keys that do not meet this requirement. ssh(1): do not offer CBC ciphers by default. Changes since OpenSSH 7.5: This is primarily a bugfix release. It also contains substantial internal refactoring. Security: sftp-server(8): in read-only mode, sftp-server was incorrectly permitting creation of zero-length files.
Reported by Michal Zalewski. New features: ssh(1): add RemoteCommand option to specify a command in the ssh config file instead of giving it on the client's command line. This allows the configuration file to specify the command that will be executed on the remote host. sshd(8): add ExposeAuthInfo option that enables writing details of the authentication methods used (including public keys where applicable) to a file that is exposed via a $SSH_USER_AUTH environment variable in the subsequent session. ssh(1): add support for reverse dynamic forwarding. In this mode, ssh will act as a SOCKS4/5 proxy and forward connections to destinations requested by the remote SOCKS client. This mode is requested using extended syntax for the -R and RemoteForward options and, because it is implemented solely at the client, does not require the server be updated to be supported. sshd(8): allow LogLevel directive in sshd_config Match blocks. ssh-keygen(1): allow inclusion of arbitrary string or flag certificate extensions and critical options. ssh-keygen(1): allow ssh-keygen to use a key held in ssh-agent as a CA when signing certificates. ssh(1)/sshd(8): allow IPQoS=none in ssh/sshd to not set an explicit ToS/DSCP value and just use the operating system default. ssh-add(1): added -q option to make ssh-add quiet on success. ssh(1): expand the StrictHostKeyChecking option with two new settings. The first, "accept-new", will automatically accept hitherto-unseen keys but will refuse connections for changed or invalid hostkeys. This is a safer subset of the current behaviour of StrictHostKeyChecking=no. The second setting, "off", is a synonym for the current behaviour of StrictHostKeyChecking=no: accept new host keys, and continue connection for hosts with incorrect hostkeys. A future release will change the meaning of StrictHostKeyChecking=no to the behaviour of "accept-new". ssh(1): add SyslogFacility option to ssh(1) matching the equivalent option in sshd(8). Check out the bugfixes and portability sections, too.
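As a quick illustration, a client-side sketch combining two of the new options (the host name is invented):

# ~/.ssh/config
Host buildbox
    RemoteCommand tmux new -A -s work    # run this instead of a plain login shell
    RequestTTY yes
    StrictHostKeyChecking accept-new     # take new keys, refuse changed ones

Reverse dynamic forwarding is similarly compact: ssh -R 1080 buildbox makes the remote side listen on port 1080 and treat it as a SOCKS proxy, with the resulting connections leaving from the local client.

*** News Roundup FreeBSD comes to FiFo 0.9.3 with vmadm (https://blog.project-fifo.net/freebsd-in-fifo-0-9-3/) What is Project FiFo? It's open-source SmartOS cloud management and orchestration software. FiFo can be installed in SmartOS zones, running on standard compute nodes; there is no need for dedicated hardware or server roles. FiFo 0.9.3 has been in the works for a while, and it comes with quite a few new features. With our last release, we started experimenting with FreeBSD support. Since then much work has gone into improving this. We also did something rather exciting with the mystery box! However, more on that in a later post. The stable release of 0.9.3 will land within a few days, with only packaging and documentation tasks left to do. Part of this means that we'll have packages for all major components that work natively on BSD. There is no more need for a SmartOS box to run the components! When we introduced FreeBSD support last version we marked it as an experimental feature. We needed to try out and experiment with what works and what does not, understand the way FreeBSD does things, what tools exist, and how those align with our workflow. Bottom line, we were not even sure BSD support was a thing in the future. We are happy to announce that with 0.9.3 we are now sure BSD support is a thing, and it is here to remain. That said, it was good that we experimented in the last release: we made some significant changes to what we have now.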
When first looking at FreeBSD, we went ahead and used existing tooling, namely iocage, to manage jails. It turns out the tooling around jails is not on par with what exists on illumos, and especially SmartOS; the goodness of vmadm as a CLI for managing zones is just unparalleled. So with 0.9.3, we did what every (in)sane person would do: we implemented a version of vmadm that works with FreeBSD and jails and keeps the same CLI. Our clone works completely standalone; vmadm is a compiled binary, written in Rust, which is blazing fast! The design takes up lessons learned from both zoneadm and vmadm on illumos/SmartOS for how things work, instead of trying to reinvent the wheel. Moreover, while we love giving the FreeBSD community a tool we learned to love on SmartOS, this also makes things a lot easier for us. FiFo can now use the same logic on SmartOS and FreeBSD, as the differences are abstracted away inside of vmadm. That said, there are a few notable differences. First of all, vmadm uses datasets the same way it does on SmartOS. However, there is no separate imgadm tool. Instead, we encapsulate those commands under vmadm images. To make this work we also provide a dataset server with base images for FreeBSD that uses the same API as SmartOS dataset servers. Second, we needed to work around some limitations in VNET to make jails capable of being fully utilized in multi-tenancy environments. Nested VNET jails on FreeBSD: While on illumos a virtual NIC can be bound to an IP that cannot be changed from inside the zone, VNET does not support this. Preventing tenants from messing with IP settings is crucial from a security standpoint! To work around that, each jail created by vmadm is actually two jails: a minimal outer jail with nothing but a VNET interface, no IP or anything, and an inner one that runs the user code. The outer jail creates the inner jail with an inherited NIC that gets a fixed IP, combining both the security of a VNET jail and the security of a fixed-IP interface. The nested jail layout resembles the way that SmartOS handles KVM machines, running KVM inside a zone. So in addition to working around VNET limitations, this already paves the way for bhyve nested in jails, which might come in a future release. We hope to leverage the same two-step with just a different executable started in the outer jail instead of the jail command itself.
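For readers who have not used vmadm on SmartOS, a sketch of the workflow the FiFo clone preserves (the image subcommands follow imgadm's naming and the manifest fields are invented for illustration; see the project's documentation for the real schema):

vmadm images avail                # list datasets offered by the dataset server
vmadm images import <uuid>        # fetch a FreeBSD base image
cat > web01.json <<'EOF'
{
  "alias": "web01",
  "image_uuid": "<uuid of the imported base image>",
  "ram": 1024,
  "nics": [{"ip": "192.0.2.10/24"}]
}
EOF
vmadm create < web01.json         # creates the nested outer/inner jail pair
vmadm list

History of UNIX Manpages (https://manpages.bsd.lv/history.html) Where do UNIX manpages come from? Who introduced the section-based layout of NAME, SYNOPSIS, and so on? And for manpage source writers and readers: where were those economical two- and three-letter instructions developed? The many accounts available on the Internet lack citations and are at times inconsistent. In this article, I reconstruct the history of the UNIX manpage based on source code, manuals, and first-hand accounts. Special thanks to Paul Pierce for his CTSS source archive; Bernard Nivelet for the Multics Internet Server; the UNIX Heritage Society for their research UNIX source reconstruction; Gunnar Ritter for the Heirloom Project sources; Alcatel-Lucent Bell Labs for the Plan 9 sources; BitSavers for their historical archive; and last but not least, Rudd Canaday, James Clarke, Brian Kernighan, Douglas McIlroy, Nils-Peter Nelson, Jerome Saltzer, Henry Spencer, Ken Thompson, and Tom Van Vleck for their valuable contributions. Please see the Copyright section if you plan on reproducing parts of this work.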
People: Abell, Vic; Canaday, Rudd; Capps, Dennis; Clarke, James; Dzonsons, Kristaps; Kernighan, Brian; Madnick, Stuart; McIlroy, Douglas; Morris, Robert; Ossanna, Joseph F.; Ritchie, Dennis; Ritter, Gunnar; Saltzer, Jerome H.; Spencer, Henry; Thompson, Ken *** BSDCam 2017 Trip Report: Mathieu Arnold (https://www.freebsdfoundation.org/blog/bsdcam-2017-trip-report-mathieu-arnold/) It seems that every time I try to go to England using the Eurostar, it gets delayed between 30 minutes and 2 hours. This year, it got my 45-minute layover down to 10 minutes. Luckily, King's Cross is literally across the street from Saint Pancras, and I managed to get onto my second train just in time. I arrived in Cambridge on Tuesday, right on time for tea. A quick walk from the station got me to St Catharine's College. This year, we were in a different building for the rooms, so I listened to the convoluted explanation the porter gave me to get to my room, and I managed to get there without getting lost more than once. That evening was almost organized, as we got together for dinner at the usual pub, the Maypole. Wednesday: The weather is lovely, and it is a good thing, as there is a 25-30 minute walk from the College to the Computer Laboratory where the devsummit happens. The first morning is for deciding what we are going to talk about for the rest of the week, so we all go in turn, introducing ourselves and voicing what we would like to talk about. There are a few subjects that are of interest to me, so I listen to the toolchain discussions while writing new bits for the Porter's Handbook. Thursday: I spent most of the day writing documentation, and talked a bit with a couple of DocEng members about joining the team, as I would like to give some love to the build framework that has not been touched in a long time. At the end of the afternoon there was a packaging session; we talked about the status of package in base, which is not really going anywhere right now. On the ports side, three aspects that are making good progress include package flavors, subpackages, and migrating some base libraries to private libraries, which is a nightmare because of openssl, and kerberos, and pam. That evening, we had the formal dinner at St John's College. I love those old buildings that remind me of Hogwarts. (I am sure there is a quidditch pitch somewhere nearby.) Friday: Last day. I continued to write documentation, while listening to a provisioning session. It would be great to have bhyve support in existing orchestration tools like vagrant, openstack, or maybe ganeti. We ended the day, and the devsummit, with short talks, some very interesting, some going way over my head. The weekend is here. I spent most of Saturday strolling in some of the numerous Cambridge parks, gardens, greens, fens… and I worked on a knitting pattern in the evening. On Sunday, I ate my last full gargantuan English breakfast of the year, and then got back on a train to King's Cross, and a Eurostar (this one on time) back to Paris. I would like to thank the FreeBSD Foundation for making this trip possible for me. GSoC 2017 Reports: Add SUBPACKAGES support to pkgsrc, part 1 (https://blog.netbsd.org/tnf/entry/gsoc_2017_reports_add_code) Introduction: SUBPACKAGES (on some package systems they are known as multi-packages, but that term is already used in pkgsrc for packages that can be built against several versions of something, e.g. Python, PHP, or Ruby packages) consist of generating multiple binary packages from a single pkgsrc package.
For example, from a pkgsrc package - local/frobnitzem - we will see how to generate three separate binary packages: frobnitzem-foo, frobnitzem-bar and frobnitzem-baz. This can be useful to split several components into their own binary packages (and avoid running the extract and configure phases twice!), for debug packages (so that all *.debug files containing debug symbols end up in a separate -debugpkg package that can be installed only when it is needed), etc. A high-level look at how SUBPACKAGES support is implemented: Most of the changes needed are in the mk/pkgformat/pkg/ hierarchy (previously known as mk/flavour, and then renamed and generalized to other package formats during Anton Panev's Google Summer of Code 2011). The code in mk/pkgformat/${PKGFORMAT}/ handles the interaction of pkgsrc with the particular ${PKGFORMAT}; e.g. for pkg it populates the meta-data files used by pkg_create(1) and installs/deletes packages via pkg_add(1) and pkg_delete(1). Conclusion: In this first part of this blog post series we have seen what SUBPACKAGES are, and when and why they can be useful. We then saw a practical example, taking a very trivial package and learning how to "subpackage-ify" it. Then we described, from a high-level perspective, the changes needed to the pkgsrc infrastructure for the SUBPACKAGES features that we have used. If you are interested in them, please have a look at the pkgsrc debugpkg branch, which contains all the work described in this blog post. In the next part we will see how to handle DEPENDS and buildlink3 inclusion for subpackages.
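As a rough sketch of the idea (the knob names follow the GSoC branch described in the report, but treat them as illustrative rather than final pkgsrc syntax), the local/frobnitzem Makefile would declare its subpackages along these lines:

# local/frobnitzem/Makefile (fragment)
SUBPACKAGES=    foo bar baz
# per-subpackage metadata and packing lists
COMMENT.foo=    Frobnitzem foo component
PLIST_SRC.foo=  ${PKGDIR}/PLIST.foo
# ...and similarly for bar and baz...

A single extract/configure/build run then yields frobnitzem-foo, frobnitzem-bar and frobnitzem-baz as separate binary packages.

Beastie Bits First partial boot of FreeBSD on Power8 (http://dmesgd.nycbug.org/index.cgi?do=view&id=3329) The new TNF Board of Directors are installed and patched for 2017. (https://blog.netbsd.org/tnf/entry/the_new_tnf_board_of) Open Source Summit 2017, October 23-26, 2017, Prague, Czech Republic: Giovanni Bechis will give a talk about seccomp(2) vs pledge(2) (https://osseu17.sched.com/event/BxJw/seccomp2-vs-pledge2-giovanni-bechis-snb-srl) My first patch to OpenBSD (http://nanxiao.me/en/my-first-patch-to-openbsd/) Feedback/Questions Brian - OPNSense Facebook Group (http://dpaste.com/35WA42Z#wrap) Mark Felder - ZFS Health via SNMP (http://dpaste.com/0B8QH2W) Matt - Cantrill Presentation (http://dpaste.com/1D9WTHV#wrap) ***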
213: The French CONnection
We recap EuroBSDcon in Paris, tell the story behind a pf PR, and show you how to do screencasting with OpenBSD. This episode was brought to you by Headlines Recap of EuroBSDcon 2017 in Paris, France (https://2017.eurobsdcon.org) EuroBSDcon was held in Paris, France this year, and drew record numbers. With over 300 attendees, it was the largest BSD event I have ever attended, and I was encouraged by the higher than expected number of first time attendees. The FreeBSD Foundation held a board meeting on Wednesday afternoon with the members who were in Paris. Topics included future conferences (including a conference kit we can mail to people who want to represent FreeBSD) and planning for next year. The FreeBSD Devsummit started on Thursday at the beautiful Mozilla Office in Paris. After registering and picking up our conference bag, everyone gathered for a morning coffee with lots of handshaking and greeting. We then gathered in the next room, which had a podium with a microphone and screens, as well as tables and chairs. After developers sat down, Benedict opened the devsummit with a small quiz about France for developers to win a Mogics Power Bagel (https://www.mogics.com/?page_id=3824). 45 developers participated and DES won the item in the end. After introductions and collecting topics of interest from everyone, we started with the Work in Progress (WIP) session. The WIP session had different people present a topic they are working on in 7-minute timeslots. Topics ranged from FreeBSD forwarding performance and fast booting options to a GELI patch under review to attach multiple providers. See their slides on the FreeBSD wiki (https://wiki.freebsd.org/DevSummit/201709). After lunch, the FreeBSD Foundation gave a general update on staff and funding, as well as a more focused presentation about our partnership with Intel. People were interested to hear what had been done so far, and asked the Intel representative, Glenn Weinberg, a few questions. Afterwards, developers worked quietly on their own projects. The mic remained open and occasionally people would step forward and give a short talk without slides, or start a discussion of common interest. The day concluded with a dinner at a nice restaurant in Paris, which allowed us to continue the discussions of the day. The second day of the devsummit began with a talk about the CAM-based SDIO stack by Ilya Bakulin. His work would allow access to wifi cards/modules on embedded boards like the Raspberry Pi Zero W and similar devices, as many of these use SDIO for data transfers. Next up was a discussion and Q&A session with the FreeBSD core team members who were there (missing only Benno Rice, Kris Moore, John Baldwin, and Baptiste Daroussin, the latter being busy with conference preparations). The new FCP (FreeBSD community proposals) process was introduced for those who were not at BSDCan this year, along with the hows and whys behind it. Allan and I were asked to describe our experiences as new members of core, and we encouraged people to run for core when the next election happens. After a short break, Scott Long gave an overview of the work that’s been started on NUMA (Non-Uniform Memory Architecture), what the goals of the project are, and who is working on it. Before lunch, Christian Schwarz presented his work on zrepl, a new ZFS replication solution he developed using Go.
This sparked interest among developers: a port was started (https://reviews.freebsd.org/D12462) and people suggested that Christian submit his talk to AsiaBSDCon and BSDCan next year. Benedict had to leave before lunch was done to teach his Ansible tutorial (which was well attended) at the conference venue. There were organized dinners for those two nights; it is quite a feat of organization to fit over 100 people into a restaurant and serve them quickly. On Saturday, there was a social event, a river cruise down the Seine. This took the form of a ‘standing’ dinner, with a wide selection of appetizer-type dishes, designed to get people to walk around and converse with many different people, rather than sit at a table with the same 6-8 people. I talked to a much larger group of people than I had managed to at the other dinners. I like having both dinner formats. We would also like to thank all of the BSDNow viewers who attended the conference and made a point of introducing themselves to us. It was nice to meet you all. The recordings of the live video stream from the conference are available immediately, so you can watch the raw versions of the talks now: Auditorium Keynote 1: Software Development in the Age of Heroes (https://youtu.be/4iR8g9-39LM?t=179) by Thomas Pornin (https://twitter.com/BearSSLnews) Tuning FreeBSD for routing and firewalling (https://youtu.be/4iR8g9-39LM?t=1660) by Olivier Cochard-Labbé (https://twitter.com/ocochardlabbe) My BSD sucks less than yours, Act I (https://youtu.be/4iR8g9-39LM?t=7040) by Antoine Jacoutot (https://twitter.com/ajacoutot) and Baptiste Daroussin (https://twitter.com/_bapt_) My BSD sucks less than yours, Act II (https://youtu.be/4iR8g9-39LM?t=14254) by Antoine Jacoutot (https://twitter.com/ajacoutot) and Baptiste Daroussin (https://twitter.com/_bapt_) Reproducible builds on NetBSD (https://youtu.be/4iR8g9-39LM?t=23351) by Christos Zoulas Your scheduler is not the problem (https://youtu.be/4iR8g9-39LM?t=26845) by Martin Pieuchot Keynote 2: A French story on cybercrime (https://youtu.be/4iR8g9-39LM?t=30540) by Éric Freyssinet (https://twitter.com/ericfreyss) Case studies of sandboxing base system with Capsicum (https://youtu.be/jqdHYEH_BQY?t=731) by Mariusz Zaborski (https://twitter.com/oshogbovx) OpenBSD’s small steps towards DTrace (a tale about DDB and CTF) (https://youtu.be/jqdHYEH_BQY?t=6030) by Jasper Lievisse Adriaanse The Realities of DTrace on FreeBSD (https://youtu.be/jqdHYEH_BQY?t=13096) by George Neville-Neil (https://twitter.com/gvnn3) OpenSMTPD, current state of affairs (https://youtu.be/jqdHYEH_BQY?t=16818) by Gilles Chehade (https://twitter.com/PoolpOrg) Hoisting: lessons learned integrating pledge into 500 programs (https://youtu.be/jqdHYEH_BQY?t=21764) by Theo de Raadt Keynote 3: System Performance Analysis Methodologies (https://youtu.be/jqdHYEH_BQY?t=25463) by Brendan Gregg (https://twitter.com/brendangregg) Closing Session (https://youtu.be/jqdHYEH_BQY?t=29355) Karnak “Is it done yet ?” The never ending story of pkg tools (https://youtu.be/1hjzleqGRYk?t=71) by Marc Espie (https://twitter.com/espie_openbsd) A Tale of six motherboards, three BSDs and coreboot (https://youtu.be/1hjzleqGRYk?t=7498) by Piotr Kubaj and Katarzyna Kubaj State of the DragonFly’s graphics stack (https://youtu.be/1hjzleqGRYk?t=11475) by François Tigeot From NanoBSD to ZFS and Jails – FreeBSD as a Hosting Platform, Revisited (https://youtu.be/1hjzleqGRYk?t=16227) by Patrick M.
Hausen Bacula – nobody ever regretted making a backup (https://youtu.be/1hjzleqGRYk?t=20069) by Dan Langille (https://twitter.com/DLangille) Never Lose a Syslog Message (https://youtu.be/qX0BS4P65cQ?t=325) by Alexander Bluhm Running CloudABI applications on a FreeBSD-based Kubernetes cluster (https://youtu.be/qX0BS4P65cQ?t=5647) by Ed Schouten (https://twitter.com/EdSchouten) The OpenBSD web stack (https://youtu.be/qX0BS4P65cQ?t=13255) by Michael W. Lucas (https://twitter.com/mwlauthor) The LLDB Debugger on NetBSD (https://youtu.be/qX0BS4P65cQ?t=16835) by Kamil Rytarowski What’s in store for NetBSD 8.0? (https://youtu.be/qX0BS4P65cQ?t=21583) by Alistair Crooks Louxor A Modern Replacement for BSD spell(1) (https://youtu.be/6Nen6a1Xl7I?t=156) by Abhinav Upadhyay (https://twitter.com/abhi9u) Portable Hotplugging: NetBSD’s uvm_hotplug(9) API development (https://youtu.be/6Nen6a1Xl7I?t=5874) by Cherry G. Mathew Hardening pkgsrc (https://youtu.be/6Nen6a1Xl7I?t=9343) by Pierre Pronchery (https://twitter.com/khorben) Discovering OpenBSD on AWS (https://youtu.be/6Nen6a1Xl7I?t=14874) by Laurent Bernaille (https://twitter.com/lbernail) OpenBSD Testing Infrastructure Behind bluhm.genua.de (https://youtu.be/6Nen6a1Xl7I?t=18639) by Jan Klemkow The school of hard knocks – PT1 (https://youtu.be/8wuW8lfsVGc?t=276) by Sevan Janiyan (https://twitter.com/sevanjaniyan) 7 years of maintaining firefox, and still looking ahead (https://youtu.be/8wuW8lfsVGc?t=5321) by Landry Breuil Branch VPN solution based on OpenBSD, OSPF, RDomains and Ansible (https://youtu.be/8wuW8lfsVGc?t=12385) by Remi Locherer Running BSD on AWS (https://youtu.be/8wuW8lfsVGc?t=15983) by Julien Simon and Nicolas David Getting started with OpenBSD device driver development (https://youtu.be/8wuW8lfsVGc?t=21491) by Stefan Sperling A huge thanks to the organizers, program committee, and sponsors of EuroBSDCon. Next year, EuroBSDcon will be in Bucharest, Romania. *** The story of PR 219251 (https://www.sigsegv.be//blog/freebsd/PR219251) This is the actual story I wanted Kristof to tell: the pf bug he fixed at the Essen Hackathon earlier this summer. As I threatened to do in my previous post, I'm going to talk about PR 219251 for a bit. The bug report dates from only a few months ago, but the first report (that I can remember) actually came from Shawn Webb on Twitter, of all places. Despite there being a stacktrace, it took quite a while (nearly 6 months in fact) before I figured this one out. It took Reshad Patuck managing to distill the problem down to a small-ish test script before real progress could be made on this. His testcase meant that I could get core dumps and experiment. It also provided valuable clues, because it could be tweaked to see what elements were required to trigger the panic. This test script starts a (vnet) jail, adds an epair interface to it, sets up pf in the jail, and then reloads the pf rules on the host. Interestingly, the panic does not seem to occur if that last step is not included. Obviously the panic is not the desired behaviour, but it also seems strange: the instances of pf in the jails are supposed to be separate. We try to fetch a counter value here, but instead we dereference a bad pointer. There are two pointers involved here, so already we need more information. Inspection of the core dump reveals that the state pointer is valid, and contains sane information. The rule pointer (rule.ptr) points to a sensible location, but the data is mostly 0xdeadc0de.
This is the memory allocator being helpful (in debug mode) and writing garbage over freed memory, to make use-after-free bugs like this one easier to find. In other words: the rule has been free()d while there was still a state pointing to it. Somehow we have a state (describing a connection pf knows about) which points to a rule which no longer exists. The core dump also shows that the problem always occurs with states and rules in the default vnet (i.e. the host pf instance), not one of the pf instances in one of the vnet jails. That matches the observation that the test script does not trigger the panic unless we also reload the rules on the host. Great, we know what's wrong, but now we need to work out how we can get into this state. At this point we're going to have to learn something about how rules and states get cleaned up in pf. Don't worry if you had no idea, because before this bug I didn't either. The states keep a pointer to the rule they match, so when rules are changed (or removed) we can't just delete them. States get cleaned up when connections are closed or they time out. This means we have to keep old rules around until the states that use them expire. When rules are removed, pf_unlink_rule() adds them to the V_pf_unlinked_rules list (more on that funny V_ prefix later). From time to time the pf purge thread will run over all states and mark the rules that are used by a state. Once that's done for all states, we know that any rule not marked as in-use can be removed (because none of the states use them). That can be a lot of work if we've got a lot of states, so pf_purge_thread() breaks it up into smaller chunks, iterating over only part of the state table on every run. We iterate over all of our virtual pf instances (VNET_FOREACH()), check if the instance is active (the check added for FreeBSD-EN-17:08.pf, where we've seen this code before), and then check the expired states with pf_purge_expired_states(). We start at state 'idx' and only process a certain number of states (determined by the PFTM_INTERVAL setting). The pf_purge_expired_states() function returns a new idx value to tell us how far we got. So, remember when I mentioned the odd V_ prefix? Those are per-vnet variables. They work a bit like thread-local variables. Each vnet (virtual network stack) keeps its state separate from the others, and the V_ variables use a pointer that's changed whenever we change the currently active vnet (say with CURVNET_SET() or CURVNET_RESTORE()). That's tracked in the 'curvnet' variable. In other words: there are as many V_pf_vnet_active variables as there are vnets: the number of vnet jails plus one (for the host system). Why is that relevant here? Note that idx is not a per-vnet variable, but we handle multiple pf instances here. We run through all of them, in fact. That means that we end up checking the first X states in the first vnet, then the second X states in the second vnet, the third X states in the third, and so on and so on. That of course means that we think we've run through all of the states in a vnet while we really only checked some of them. So when pf_purge_unlinked_rules() runs, it can end up free()ing rules that are actually still in use, because pf_purge_thread() skipped over the state(s) that used the rule. The problem only happened if we reloaded rules on the host, because the active ruleset is never free()d, even if there are no states pointing to a rule.
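To make that skipping concrete, here is a small userspace toy you can compile and run. This is not the pf code; the names and the chunked scan merely mimic the description above, but it shows how one index shared across per-vnet state tables leaves states unvisited in every vnet:

/*
 * Toy model of the PR 219251 bug -- NOT the FreeBSD source.
 * pf_purge_thread() scans each vnet's state table in chunks, but the
 * chunk index was shared across vnets instead of being per-vnet.
 */
#include <stdio.h>
#include <stdbool.h>

#define NVNETS  3     /* say, the host plus two vnet jails */
#define NSTATES 10    /* states per vnet */
#define CHUNK   4     /* states scanned per vnet on each pass */

static bool visited[NVNETS][NSTATES];

/* Mark CHUNK states as checked, starting at 'start'; return the next index. */
static int purge_chunk(int vnet, int start)
{
    for (int i = 0; i < CHUNK; i++)
        visited[vnet][(start + i) % NSTATES] = true;
    return (start + CHUNK) % NSTATES;
}

int main(void)
{
    int idx = 0;    /* the bug: one index shared by all vnets */

    /* Three passes would cover all ten states if idx were per-vnet. */
    for (int pass = 0; pass < NSTATES / CHUNK + 1; pass++)
        for (int v = 0; v < NVNETS; v++)        /* like VNET_FOREACH() */
            idx = purge_chunk(v, idx);

    for (int v = 0; v < NVNETS; v++)
        for (int s = 0; s < NSTATES; s++)
            if (!visited[v][s])
                printf("vnet %d: state %d was never checked\n", v, s);
    return 0;
}

Every vnet reports states that were never scanned, and any rule referenced only by such a state looks unused to pf_purge_unlinked_rules(). Give each vnet its own index and the output is empty.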
That explains the panic, and the fix is actually quite straightforward: idx needs to be a per-vnet variable, V_pf_purge_idx, and then the problem is gone. As is often the case, the solution to a fairly hard problem turns out to be really simple. As you might expect, finding the problem takes a lot more work than fixing it. Thanks to Kristof for writing up this detailed post explaining how the problem was found, and what caused it. *** vBSDcon 2017: BSD at Work (https://www.ixsystems.com/blog/vbsdcon-2017-dexter/) The third biennial vBSDcon hosted by Verisign took place September 7th through 9th, with the FreeBSD Developer Summit taking place the first day. vBSDcon and iXsystems’ MeetBSD event have been alternating between the East and West coasts of the U.S.A., and these two events play vital roles in reaching Washington, DC-area and Bay Area/Silicon Valley audiences. Where MeetBSD serves many BSD vendors, vBSDcon attracts a unique government and security industry demographic that isn’t found anywhere else. Conference time and travel budgets are always limited, and bringing these events to their attendees is a much-appreciated service provided by their hosts. The vBSDcon FreeBSD DevSummit had a strong focus on OpenZFS, the build system, and networking, with the FreeBSD 12 wish list of features in mind. How to best incorporate the steady flow of new OpenZFS features into FreeBSD, such as dataset-level encryption, was of particular interest. This feature coming from a GNU/Linux-based storage vendor is a tribute to the growth of the OpenZFS community, which is vital in light of the recent “Death of Solaris and ZFS” at Oracle. There has never been more demand for OpenZFS on FreeBSD, and the Oracle news further confirms our collective responsibility to meet that demand. The official conference opened with my talk on “Isolated BSD Build Environments”, in which I explained how the bhyve hypervisor can be used to effortlessly tour FreeBSD 5.0-onward and build specific source releases on demand to trace regressions to their offending commit. I was followed by a FreeNAS user who made the good point that FreeNAS is an exemplary “entry vector” into Unix and Enterprise Storage fundamentals, given that many of the vectors our generation had are gone. Where many of us discovered Unix and the Internet via console terminals at school or work, smart phones are only delivering the Internet without the Unix. With some irony, both iOS and Android are Unix-based, yet offer few opportunities for their users to learn and leverage their Unix environments. The next two talks were The History and Future of Core Dumps in FreeBSD by Sam Gwydir and Using pkgsrc for multi-platform deployments in heterogeneous environments by G. Clifford Williams. I strongly recommend that anyone wanting to speak at AsiaBSDCon read Sam’s accompanying paper on core dumps, because I consider it the perfect AsiaBSDCon topic and his execution is excellent. Core dumps are one of those things you rarely think about until they are a DROP EVERYTHING! priority. G. Clifford’s talk was about what I consider a near-perfect BSD project: pkgsrc, the portable BSD package manager. I put it up there with OpenSSH and mandoc as projects that have provided significant value to other Open Source operating systems. G. Clifford’s real-world experiences are perfectly in line with vBSDcon’s goal to be more production-oriented than other BSDCons. Of the other talks, any and all DTrace talks are always appreciated, and George Neville-Neil’s did not disappoint.
He based it on his experiences with the Teach BSD project, which is bringing FreeBSD-based computer science education to schools around the world. The security-related talks by John-Mark Gurney, Dean Freeman, and Michael Shirk also represented vBSDcon’s consideration of the local community and made a convincing point that the BSDs should make concerted efforts to qualify for Common Criteria, FIPS, and other government security requirements. While some security experts will scoff at these, they are critical to the adoption of BSD-based products by government agencies. BSD Now hosts Allan Jude and Benedict Reuschling ran an OpenZFS BoF and an Ansible talk respectively, and I hosted a bhyve hypervisor BoF. The Hallway Track and food at vBSDcon were excellent, and both culminated with an after-dinner dramatic reading of Michael W. Lucas’ latest book that raised money for the FreeBSD Foundation. A great time was had by all and it was wonderful to see everyone! News Roundup FreeBSD 10.4-RC2 Available (https://lists.freebsd.org/pipermail/freebsd-stable/2017-September/087848.html) FreeBSD 10.4 will be released soon; this is the last chance to find bugs before the official release is cut. Noteworthy Changes Since 10.4-RC1: Given that the amd64 disc1 image was overflowing, more of the base components installed into the disc1 (live) file systems had to be disabled. Most notably, this removed the compiler toolchain from the disc1 images. All disabled tools are still available with the dvd1 images, though. The aesni(4) driver now no longer shares a single FPU context across multiple sessions in multiple threads, addressing problems seen when employing aesni(4) for ipsec(4). Support for netmap(4) by the ixgbe(4) driver has been brought into line with the netmap(4) API present in stable/10. Also, ixgbe(4) now correctly handles VFs in its netmap(4) support again, instead of treating these as PFs. During the creation of amd64 and i386 VM images, etcupdate(8) and mergemaster(8) databases now are bootstrapped, akin to what happens during the extraction of base.txz as part of a new installation via bsdinstall(8). This change allows both of these tools to work out of the box on the VM images and avoids errors seen when upgrading these images via freebsd-update(8). If you are still on the stable/10 branch, you should test upgrading to 10.4 and make sure there are no problems with your workload. Additional testing, specifically of the features that have changed since 10.4-BETA1, would also be most helpful. This will be the last release from the stable/10 branch. *** OpenBSD changes of note 628 (https://www.tedunangst.com/flak/post/openbsd-changes-of-note-628) EuroBSDCon in two weeks. Be sure to attend early and often. Many and various documentation improvements for libcrypto. New man pages, rewrites, expanded bugs sections, and more. Only allow upward migration in vmd. There’s a README for the syspatch build system if you want to run your own. Move the kernel relinking code from /etc/rc into a separate script usable by syspatch. Kernel patches can now be reduced to just the necessary files. Make the callers of sogetopt() responsible for allocating memory. Now allocation and free occur in the same place. Use waitpid() instead of wait() in most programs to avoid accidentally collecting the wrong child. Have cu call isatty() before making assumptions.
Switch mandoc rendering of mathematical symbols and Greek letters from trying to imitate the characters’ graphical shapes, which resulted in unintelligible renderings in many cases, to transliterations conveying the characters’ meanings. Update libexpat to 2.2.4. Fix copying partial UTF-8 characters. Sigh, here we go again. Work around bug in F5’s handling of the supported elliptic curves extension. RFC 4492 only defines elliptic_curves for ClientHello. However, F5 is sending it in ServerHello. We need to skip over it since our TLS extension parsing code is now more strict. After a first install, run syspatch -c to check for patches. If SMAP is present, clear PSL_AC on kernel entry and interrupt so that only the code in copy{in,out}* that needs it runs with it set. Panic if it’s set on entry to trap() or syscall(). Prompted by Maxime Villard’s NetBSD work. Errata. New drivers for arm: rktemp, mvpinctrl, mvmpic, mvneta, mvmdio, mvpxa, rkiic, rkpmic. No need to exec rm from within mandoc. We know there’s exactly one file and directory to remove. Similarly with running cmp. Revert to Mesa 13.0.6 to hopefully address rendering issues a handful of people have reported with xpdf/fvwm on ivy bridge with modesetting driver. Rewrite ALPN extension using CBB/CBS and the new extension framework. Rewrite SRTP extension using CBB/CBS and the new extension framework. Revisit 2q queue sizes. Limit the hot queue to 1/20th the cache size up to a max of 4096 pages. Limit the warm and cold queues to half the cache. This allows us to more effectively notice re-interest in buffers instead of losing it in a large hot queue. Add glass console support for arm64. Probably not yet for your machine, though. Replace heaps of hand-written syscall stubs in ld.so with a simpler framework. 65535 is a valid port to listen on. When xinit starts an X server that listens only on UNIX socket, prefer DISPLAY=unix:0 rather than DISPLAY=:0. This will prevent applications from ever falling back to TCP if the UNIX socket connection fails (such as when the X server crashes). Reverted. Add -z and -Z options to apmd to auto suspend or hibernate when low on battery. Remove the original (pre-IETF) chacha20-poly1305 cipher suites. Add urng(4) which supports various USB RNG devices. Instead of adding one driver per device, start bundling them into a single driver. Remove old deactivated pledge path code. A replacement mechanism is being brewed. Fix a bug from the extension parsing rewrite. Always parse ALPN even if no callback has been installed to prevent leaving unprocessed data which leads to a decode error. Clarify what is meant by syslog priorities being ordered, since the numbers and priorities are backwards. Remove a stray setlocale() from ksh, eliminating a lot of extra statically linked code. Unremove some NPN symbols from libssl because ports software thinks they should be there for reasons. Fix saved stack location after resume. Somehow clang changed it. Resume works again on i386. Improve error messages in vmd and vmctl to be more informative. Stop building the miniroot installer for OMAP3 Beagleboards. It hasn’t worked in over a year and nobody noticed. Have the callers of sosetopt() free the mbuf for symmetry. On octeon, let the kernel use the hardware FPU even if emulation is compiled in. It’s faster. Fix support for 486DX CPUs by not calling cpuid. I used to own a 486. Now I don’t. Merge some drm fixes from linux. Defer probing of floppy drives, eliminating delays during boot.
Better handling of probes and beacons and timeouts and scans in the wifi stack to avoid disconnects. Move mutex, condvar, and thread-specific data routines, pthread_once, and pthread_exit from libpthread to libc, along with low-level bits to support them. Lets thread-aware (but not actually threaded) code work with just libc. New POSIX xlocale implementation. Complete as long as you only use ASCII and UTF-8, as you should. Round and round it goes; when 6.2 stops, nobody knows. A peek at the future? *** Screencasting with OpenBSD (http://eradman.com/posts/screencasting.html) USB Audio Any USB microphone should appear as a new audio device. Here is the dmesg for my mic by ART: uaudio0 at uhub0 port 2 configuration 1 interface 0 "M-One USB" rev 1.10/0.01 addr 2 uaudio0: audio rev 1.00, 8 mixer controls audio1 at uaudio0 audioctl can read off all of the specific characteristics of this device: $ audioctl -f /dev/audio1 | grep record mode=play,record record.rate=48000 record.channels=1 record.precision=16 record.bps=2 record.msb=1 record.encoding=slinear_le record.pause=0 record.active=0 record.block_size=1960 record.bytes=0 record.errors=0 Now test the recording from the second audio device using aucat(1): aucat -f rsnd/1 -o file.wav If the device also has a headset, audio can be played through the same device: aucat -f rsnd/1 -i file.wav Screen Capture using Xvfb The rate at which a framebuffer can be read from your video card is a feature of the hardware and software you're using, and it's often very slow. x11vnc will print an estimate of the bandwidth for the system you're running: x11vnc ... 09/05/2012 22:23:45 fb read rate: 7 MB/sec This is about 4fps. We can do much better by using a virtual framebuffer. Here I'm setting up a new screen, setting the background color, and starting cwm and an instance of xterm: Xvfb :1 -screen 0 720x540x16 & DISPLAY=:1 xsetroot -solid steelblue & DISPLAY=:1 cwm & DISPLAY=:1 xterm +sb -fa Hermit -fs 14 & Much better! Now we're up around 20fps. x11vnc -display :1 & ... 11/05/2012 18:04:07 fb read rate: 168 MB/sec Make a connection to this virtual screen using raw encoding to eliminate time wasted on compression: vncviewer localhost -encodings raw A test recording with sound then looks like this: ffmpeg -f sndio -i snd/1 -y -f x11grab -r 12 -s 800x600 -i :1.0 -vcodec ffv1 ~/out.avi Note: always stop the recording and playback using q, not Ctrl-C, so that audio inputs are shut down properly. Screen Capture using Xephyr Xephyr is perhaps the easiest way to run X with a shadow framebuffer. This solution also avoids reading from the video card's RAM, so it's reasonably fast. Xephyr -ac -br -noreset -screen 800x600 :1 & DISPLAY=:1 xsetroot -solid steelblue & DISPLAY=:1 cwm & DISPLAY=:1 xrdb -load ~/.Xdefaults & DISPLAY=:1 xterm +sb -fa "Hermit" -fs 14 & Capture works in exactly the same way. This command tries to maintain 12fps: ffmpeg -f sndio -i snd/1 -y -f x11grab -r 12 -s 800x600 -i :1.0 -vcodec ffv1 -acodec copy ~/out.avi To capture keyboard and mouse input press Ctrl then Shift. This is very handy for navigating a window manager in the nested X session. Arranging Windows I have sometimes found it helpful to launch applications and arrange them in a specific way.
This will open up a web browser listing the current directory and position windows using xdotool: DISPLAY=:1 midori "file://$(pwd)" & sleep 2 DISPLAY=:1 xdotool search --name "xterm" windowmove 0 0 DISPLAY=:1 xdotool search --class "midori" windowmove 400 0 DISPLAY=:1 xdotool search --class "midori" windowsize 400 576 This will position the window precisely so that it appears to be in a tmux window on the right. Audio/Video Sync If you find that the audio is way out of sync with the video, you can adjust the start using -ss before the audio input to specify the number of seconds to delay. My final recording command line, which delays the audio by 0.5 seconds and writes 12fps: ffmpeg -ss 0.5 -f sndio -i snd/1 -y -f x11grab -r 12 -s 800x600 -i :1.0 -vcodec ffv1 -acodec copy ~/out.avi Sharing a Terminal with tmux If you're trying to record a terminal session, tmux is able to share a session. In this way a recording of an X framebuffer can be taken without even using the screen. Start by creating the session: tmux -2 -S /tmp/tmux0 Then on the remote side connect on the same socket: tmux -2 -S /tmp/tmux0 attach Taking Screenshots Grabbing a screenshot of the Xvfb server is easily accomplished with ImageMagick's import command: DISPLAY=:1 import -window root screenshot.png Audio Processing and Video Transcoding The first step is to ensure that the clip begins and ends where you'd like it to. The following will make a copy of the recording starting at time 00:00 and ending at 09:45: ffmpeg -i interactive-sql.avi \ -vcodec copy -acodec copy -ss 00:00:00 -t 00:09:45 interactive-sql-trimmed.avi mv interactive-sql-trimmed.avi interactive-sql.avi Setting the gain correctly is very important with an analog mixer, but if you're using a USB mic there may not be a gain option; simply record using its built-in settings and then adjust the levels afterwards using a utility such as normalize. First extract the audio as a raw PCM file and then run normalize: ffmpeg -i interactive-sql.avi -c:a copy -vn audio.wav normalize audio.wav Next merge the audio back in again: ffmpeg -i interactive-sql.avi -i audio.wav \ -map 0:0 -map 1:0 -c copy interactive-sql-normalized.avi The final step is to compress the screencast for distribution. Encoding to VP8/Vorbis is easy: ffmpeg -i interactive-sql-normalized.avi -c:v libvpx -b:v 1M -c:a libvorbis -q:a 6 interactive-sql.webm H.264/AAC is tricky. For most video players the color space needs to be set to yuv420p. The -movflags puts the index data at the beginning of the file to enable streaming/partial content requests over HTTP: ffmpeg -y -i interactive-sql-normalized.avi -c:v libx264 \ -preset slow -crf 14 -pix_fmt yuv420p -movflags +faststart \ -c:a aac -q:a 6 interactive-sql.mp4 TrueOS @ Ohio Linuxfest ’17! (https://www.trueos.org/blog/trueos-ohio-linuxfest-17/) Dru Lavigne and Ken Moore are both giving presentations on Saturday the 30th. Sit in and hear about new developments for the Lumina and FreeNAS projects. Ken is offering Lumina Rising: Challenging Desktop Orthodoxy at 10:15 am in Franklin A. Hear his thoughts about the ideas propelling desktop environment development and how Lumina, especially Lumina 2, is seeking to offer a new model of desktop architecture. Elements discussed include session security, application dependencies, message handling, and operating system integration. Dru is talking about What’s New in FreeNAS 11 at 2:00 pm in Franklin D.
She’ll be providing an overview of some of the new features added in FreeNAS 11.0, including: Alert Services; starting specific services at boot time; AD monitoring to ensure the AD service restarts if disconnected; a preview of the new user interface; and support for S3-compatible storage and the bhyve hypervisor. She’s also giving a sneak peek of FreeNAS 11.1, which has some neat features: a complete rewrite of the Jails/Plugins system as FreeNAS moves from warden to iocage; writing new plugins with just a few lines of code; and a brand new asynchronous middleware API. Who’s going? Attending this year are: Dru Lavigne (dlavigne): Dru leads the technical documentation team at iX, and contributes heavily to open source documentation projects like FreeBSD, FreeNAS, and TrueOS. Ken Moore (beanpole134): Ken is the lead developer of Lumina and a core contributor to TrueOS. He also works on a number of other Qt5 projects for iXsystems. J.T. Pennington (q5sys): Some of you may be familiar with his work on BSDNow, but J.T. also contributes to the TrueOS, Lumina, and SysAdm projects, helping out with development and general bug squashing. *** Beastie Bits Lumina Development Preview: Theme Engine (https://www.trueos.org/blog/lumina-development-preview-theme-engine/) It's happening! Official retro Thinkpad lappy spotted in the wild (https://www.theregister.co.uk/2017/09/04/retro_thinkpad_spotted_in_the_wild/) LLVM libFuzzer and SafeStack ported to NetBSD (https://blog.netbsd.org/tnf/entry/llvm_libfuzzer_and_safestack_ported) Remaining 2017 FreeBSD Events (https://www.freebsdfoundation.org/news-and-events/event-calendar/2017-openzfs-developer-summit/) *** Feedback/Questions Andrew - BSD Teaching Material (http://dpaste.com/0YTT0VP) Seth - Switching to Tarsnap after Crashplan becomes no more (http://dpaste.com/1SK92ZX#wrap) Thomas - Native encryption in ZFS (http://dpaste.com/02KD5FX#wrap) Coding Cowboy - Passwords and clipboards (http://dpaste.com/31K0E40#wrap) ***
212: The Solaris Eclipse
We recap vBSDcon, give you the story behind a pf EN, reminisce about Solaris memories, and show you how to configure different DEs on FreeBSD. This episode was brought to you by Headlines [vBSDCon] vBSDCon was held September 7 - 9th. We recorded this only a few days after getting home from this great event. Things started on Wednesday night, as attendees of the Thursday developer summit arrived and broke into smallish groups for disorganized dinner and drinks. We then held an unofficial hacker lounge in a medium sized seating area, working and talking until we all decided that the developer summit started awfully early tomorrow. The developer summit started with a light breakfast and then we dove right in. Ed Maste started us off, and then Glen Barber gave a presentation about lessons learned from the 11.1-RELEASE cycle, comparing it to previous releases. 11.1 was released on time, and was one of the best releases so far. The slides are linked on the DevSummit wiki page (https://wiki.freebsd.org/DevSummit/20170907). The group then jumped into hackmd.io, a collaborative note-taking application, and listed various works in progress and upstreaming efforts. Then we listed wants and needs for the 12.0 release. After lunch we broke into pairs of working groups, with additional space for smaller meetings. The first pair were ZFS and Toolchain, followed by a break and then a discussion of IFLIB and network drivers in general. After another break, the last groups of the day met: pkgbase and secure boot. Then it was time for the vBSDCon reception dinner. This standing dinner was a great way to meet new people, and for attendees to mingle and socialize. The official hacking lounge Thursday night was busy, and included some great storytelling, along with a bunch of work getting done. It was very encouraging to watch a struggling new developer getting help from a seasoned veteran. Watching the new developer's eyes light up as the new information filled in gaps and they now understood so much more than just a few minutes before, and then racing off to continue working, was inspirational, and reminded me why these conferences are so important. The hacker lounge shut down relatively early by BSD conference standards, but the conference proper started at 8:45 sharp the next morning, so it made sense. Friday saw a string of good presentations; I think my favourite was Jonathan Anderson’s talk on Oblivious sandboxing. Jonathan is a very energetic speaker, and was able to keep everyone focused even during relatively complicated explanations. Friday night I went for dinner at ‘Big Bowl’, a stir-fry bar, with a largish group of developers and users of both FreeBSD and OpenBSD. The discussions were interesting and varied, and the food was excellent. Benedict had dinner with JT and some other folks from iXsystems. Friday night the hacker lounge was so large we took over a bigger room (it had better WiFi too). Saturday featured more great talks. The talk I was most interested in was from Eric McCorkle, who did the EFI version of my GELIBoot work. I had reviewed some of the work, but it was interesting to hear the story of how it happened, and to see the parallels with my own story. My favourite speaker was Paul Vixie, who gave a very interesting talk about the gets() function in libc. gets() was declared unsafe before the FreeBSD project even started. The original import of the CSRG code into FreeBSD includes the compile-time and run-time warnings against using gets().
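For anyone who has not had the pleasure: gets() is unfixable because the caller has no way to tell it how big the buffer is. A minimal illustration (my example, not Paul's):

#include <stdio.h>

int main(void)
{
    char buf[16];

    /* gets(buf) would read a whole line into buf with no length limit,
     * so input longer than 15 characters overwrites whatever sits
     * above buf on the stack, including the saved return address.
     * There is no safe way to call it. */

    /* fgets() is the bounded replacement: it is told sizeof(buf). */
    if (fgets(buf, sizeof(buf), stdin) != NULL)
        printf("read: %s", buf);
    return 0;
}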
OpenBSD removed gets() in version 5.6, in 2014. Following Paul’s presentation, various patches were raised to either cause use of gets() to crash the program, or to remove gets() entirely, causing such programs to fail to link. The last talk before the closing was Benedict’s BSD Systems Management with Ansible (https://people.freebsd.org/~bcr/talks/vBSDcon2017_Ansible.pdf). Shortly after, Allan won a MacBook Pro by correctly guessing the number of components in a jar that was standing next to the registration desk (Benedict was way off, but had a good laugh about the unlikely future Apple user). Saturday night ended with the Conference Social, and an excellent dinner with more great conversations. On Sunday morning, a number of us went to the Smithsonian Air and Space Museum site near the airport, and saw a Concorde, an SR-71, and the space shuttle Discovery, among many other exhibits. Check out the full photo album by JT (https://t.co/KRmSNzUSus), our producer. Thanks to all the sponsors for vBSDcon and all the organizers from Verisign, who made it such a great event. *** The story behind FreeBSD-EN-17.08.pf (https://www.sigsegv.be//blog/freebsd/FreeBSD-EN-17.08.pf) After our previous deep dive on a bug in episode 209, Kristof Provost, the maintainer of pf on FreeBSD (he is going to hate me for saying that), has written the story behind a recent ERRATA notice for FreeBSD. First things first, so I have to point out that I think Allan misremembered things. The heroic debugging story is PR 219251, which I'll try to write about later. FreeBSD-EN-17:08.pf is an issue that affected some FreeBSD 11.x systems, where FreeBSD would panic at startup. There were no reports for CURRENT. There's very little to go on here, but we do know the cause of the panic ("integer divide fault"), and that the current process was "pf purge". The pf purge thread is part of the pf housekeeping infrastructure. It's a housekeeping kernel thread which cleans up things like old states and expired fragments. The lack of mention of pf functions in the backtrace is a hint unto itself. It suggests that the error is probably directly in pf_purge_thread(). It might also be in one of the static functions it calls, because compilers often just inline those so they don't generate stack frames. Remember that the problem is an "integer divide fault". How can integer divisions be a problem? Well, you can try to divide by zero. The most obvious suspect for this is this code: idx = pf_purge_expired_states(idx, pf_hashmask / (V_pf_default_rule.timeout[PFTM_INTERVAL] * 10)); However, this variable is both correctly initialised (in pf_attach_vnet()) and can only be modified through the DIOCSETTIMEOUT ioctl() call, and that one checks for zero. At that point I had no idea how this could happen, but because the problem did not affect CURRENT I looked at the commit history and found this commit from Luiz Otavio O Souza: Do not run the pf purge thread while the VNET variables are not initialized, this can cause a divide by zero (if the VNET initialization takes to long to complete). Obtained from: pfSense Sponsored by: Rubicon Communications, LLC (Netgate) That sounds very familiar, and indeed, applying the patch fixed the problem. Luiz explained it well: it's possible to use V_pf_default_rule.timeout before it's initialised, which caused this panic. To me, this reaffirms the importance of writing good commit messages: because Luiz mentioned both the pf purge thread and the division by zero, I was easily able to find the relevant commit.
If I hadn't found it, this fix would have taken a lot longer. Next week we’ll look at the more interesting story, which I managed to nag Kristof into writing. *** The sudden death and eternal life of Solaris (http://dtrace.org/blogs/bmc/2017/09/04/the-sudden-death-and-eternal-life-of-solaris/) A blog post from Bryan Cantrill about the death of Solaris. As had been rumored for a while, Oracle effectively killed Solaris. When I first saw this, I had assumed that this was merely a deep cut, but in talking to Solaris engineers still at Oracle, it is clearly much more than that. It is a cut so deep as to be fatal: the core Solaris engineering organization lost on the order of 90% of its people, including essentially all management. Of note, among the engineers I have spoken with, I heard two things repeatedly: “this is the end” and (from those who managed to survive Friday) “I wish I had been laid off.” Gone is any of the optimism (however tepid) that I have heard over the years — and embarrassed apologies for Oracle’s behavior have been replaced with dismay about the clumsiness, ineptitude and callousness with which this final cut was handled. In particular, that employees who had given their careers to the company were told of their termination via a pre-recorded call — “robo-RIF’d” in the words of one employee — is both despicable and cowardly. To their credit, the engineers affected saw themselves as Sun to the end: they stayed to solve hard, interesting problems and out of allegiance to one another — not out of any loyalty to the broader Oracle. Oracle didn’t deserve them and now it doesn’t have them — they have been liberated, if in a depraved act of corporate violence. Assuming that this is indeed the end of Solaris (and it certainly looks that way), it offers a time for reflection. Certainly, the demise of Solaris is at one level not surprising, but on the other hand, its very suddenness highlights the degree to which proprietary software can suffer by the vicissitudes of corporate capriciousness. Vulnerable to executive whims, shareholder demands, and a fickle public, organizations can simply change direction by fiat. And because — in the words of the late, great Roger Faulkner — “it is easier to destroy than to create,” these changes in direction can have lasting effect when they mean stopping (or even suspending!) work on a project. Indeed, any engineer in any domain with sufficient longevity will have one (or many!) stories of exciting projects being cancelled by foolhardy and myopic management. For software, though, these cancellations can be particularly gutting because (in the proprietary world, anyway) so many of the details of software are carefully hidden from the users of the product — and much of the innovation of a cancelled software project will likely die with the project, living only in the oral tradition of the engineers who knew it. Worse, in the long run — to paraphrase Keynes — proprietary software projects are all dead. However ubiquitous at their height, this lonely fate awaits all proprietary software. There is, of course, another way — and befitting its idiosyncratic life and death, Solaris shows us this path too: software can be open source. In stark contrast to proprietary software, open source does not — cannot, even — die. Yes, it can be disused or rusty or fusty, but as long as anyone is interested in it at all, it lives and breathes.
Even should the interest wane to nothing, open source software survives still: its life as machine may be suspended, but it becomes as literature, waiting to be discovered by a future generation. That is, while proprietary software can die in an instant, open source software perpetually endures by its nature — and thrives by the strength of its communities. Just as the existence of proprietary software can be surprisingly brittle, open source communities can be crazily robust: they can survive neglect, derision, dissent — even sabotage. In this regard, I speak from experience: from when Solaris was open sourced in 2005, the OpenSolaris community survived all of these things. By the time Oracle bought Sun five years later in 2010, the community had decided that it needed true independence — illumos was born. And, it turns out, illumos was born at exactly the right moment: shortly after illumos was announced, Oracle — in what remains to me a singularly loathsome and cowardly act — silently re-proprietarized Solaris on August 13, 2010. We in illumos were indisputably on our own, and while many outsiders gave us no chance of survival, we ourselves had reason for confidence: after all, open source communities are robust because they are often united not only by circumstance, but by values, and in our case, we as a community never lost our belief in ZFS, Zones, DTrace and myriad other technologies like MDB, FMA and Crossbow. Indeed, since 2010, illumos has thrived; illumos is not only the repository of record for technologies that have become cross-platform like OpenZFS, but we have also advanced our core technologies considerably, while still maintaining highest standards of quality. Learning some of the mistakes of OpenSolaris, we have a model that allows for downstream innovation, experimentation and differentiation. For example, Joyent’s SmartOS has always been focused on our need for a cloud hypervisor (causing us to develop big features like hardware virtualization and Linux binary compatibility), and it is now at the heart of a massive buildout for Samsung (who acquired Joyent a little over a year ago). For us at Joyent, the Solaris/illumos/SmartOS saga has been formative in that we have seen both the ill effects of proprietary software and the amazing resilience of open source software — and it very much informed our decision to open source our entire stack in 2014. Judging merely by its tombstone, the life of Solaris can be viewed as tragic: born out of wedlock between Sun and AT&T and dying at the hands of a remorseless corporate sociopath a quarter century later. And even that may be overstating its longevity: Solaris may not have been truly born until it was made open source, and — certainly to me, anyway — it died the moment it was again made proprietary. But in that shorter life, Solaris achieved the singular: immortality for its revolutionary technologies. So while we can mourn the loss of the proprietary embodiment of Solaris (and we can certainly lament the coarse way in which its technologists were treated!), we can rejoice in the eternal life of its technologies — in illumos and beyond! News Roundup OpenBSD on the Lenovo Thinkpad X1 Carbon (5th Gen) (https://jcs.org/2017/09/01/thinkpad_x1c) Joshua Stein writes about his experiences running OpenBSD on the 5th generation Lenovo Thinkpad X1 Carbon: ThinkPads have sort of a cult following among OpenBSD developers and users because the hardware is basic and well supported, and the keyboards are great to type on. 
While no stranger to ThinkPads myself, most of my OpenBSD laptops in recent years have been from various vendors with brand new hardware components that OpenBSD does not yet support. As satisfying as it is to write new kernel drivers or extend existing ones to make that hardware work, it usually leaves me with a laptop that doesn't work very well for a period of months. After exhausting efforts trying to debug the I2C touchpad interrupts on the Huawei MateBook X (and other 100-Series Intel chipset laptops), I decided to take a break and use something with better OpenBSD support out of the box: the fifth generation Lenovo ThinkPad X1 Carbon. Hardware Like most ThinkPads, the X1 Carbon is available in a myriad of different internal configurations. I went with the non-vPro Core i7-7500U (it was the same price as the Core i5 that I normally opt for), 16Gb of RAM, a 256Gb NVMe SSD, and a WQHD display. This generation of X1 Carbon finally brings a thinner screen bezel, allowing the entire footprint of the laptop to be smaller which is welcome on something with a 14" screen. The X1 now measures 12.7" wide, 8.5" deep, and 0.6" thick, and weighs just 2.6 pounds. While not available at initial launch, Lenovo is now offering a WQHD IPS screen option giving a resolution of 2560x1440. Perhaps more importantly, this display also has much better brightness than the FHD version, something ThinkPads have always struggled with. On the left side of the laptop are two USB-C ports, a USB-A port, a full-size HDMI port, and a port for the ethernet dongle which, despite some reviews stating otherwise, is not included with the laptop. On the right side is another USB-A port and a headphone jack, along with a fan exhaust grille. On the back is a tray for the micro-SIM card for the optional WWAN device, which also covers the Realtek microSD card reader. The tray requires a paperclip to eject which makes it inconvenient to remove, so I think this microSD card slot is designed to house a card semi-permanently as a backup disk or something. On the bottom are the two speakers towards the front and an exhaust grille near the center. The four rubber feet are rather plastic feeling, which allows the laptop to slide around on a desk a bit too much for my liking. I wish they were a bit softer to be stickier. Charging can be done via either of the two USB-C ports on the left, though I wish more vendors would do as Google did on the Chromebook Pixel and provide a port on both sides. This makes it much more convenient to charge when not at one's desk, rather than having to route a cable around to one specific side. The X1 Carbon includes a 65W USB-C PD with a fixed USB-C cable and removable country-specific power cable, which is not very convenient due to its large footprint. I am using an Apple 61W USB-C charger and an Anker cable which charge the X1 fine (unlike HP laptops which only work with HP USB-C chargers). Wireless connectivity is provided by a removable Intel 8265 802.11a/b/g/n/ac WiFi and Bluetooth 4.1 card. An Intel I219-V chip provides ethernet connectivity and requires an external dongle for the physical cable connection. The screen hinge is rather tight, making it difficult to open with one hand. The tradeoff is that the screen does not wobble in the least bit when typing. The fan is silent at idle, and there is no coil whine even under heavy load. During a make -j4 build, the fan noise is reasonable and medium-pitched, rather than a high-pitched whine like on some laptops. 
The palm rest and keyboard area remain cool during high CPU utilization. The full-sized keyboard is backlit and offers two levels of adjustment. The keys have a soft surface and a somewhat clicky feel, providing very quiet typing except for certain keys like Enter, Backspace, and Escape. The keyboard has a reported key travel of 1.5mm and there are dedicated Page Up and Page Down keys above the Left and Right arrow keys. Dedicated Home, End, Insert, and Delete keys are along the top row. The Fn key is placed to the left of Control, which some people hate (although Lenovo does provide a BIOS option to swap it), but it's in the same position on Apple keyboards so I'm used to it. However, since there are dedicated Page Up, Page Down, Home, and End keys, I don't really have a use for the Fn key anyway. Firmware The X1 Carbon has a very detailed BIOS/firmware menu which can be entered with the F1 key at boot. F12 can be used to temporarily select a different boot device. A neat feature of the Lenovo BIOS is that it supports showing a custom boot logo instead of the big red Lenovo logo. From Windows, download the latest BIOS Update Utility for the X1 Carbon (my model was 20HR). Run it and it'll extract everything to C:\drivers\flash(some random string). Drop a logo.gif file in that directory and run winuptp.exe. If a logo file is present, it'll ask whether to use it and then write the new BIOS to its staging area, then reboot to actually flash it. OpenBSD support Secure Boot has to be disabled in the BIOS menu, and the "CSM Support" option must be enabled, even when "UEFI/Legacy Boot" is left on "UEFI Only". Otherwise the screen will just go black after the OpenBSD kernel loads into memory. Based on this component list, it seems like everything but the fingerprint sensor works fine on OpenBSD. *** Configuring 5 different desktop environments on FreeBSD (https://www.linuxsecrets.com/en/entry/51-freebsd/2017/09/04/2942-configure-5-freebsd-x-environments) This fairly quick tutorial over at LinuxSecrets.com is a great start if you are new to FreeBSD, especially if you are coming from Linux and miss your favourite desktop environment. It just goes to show how easy it is to build the desktop you want on modern FreeBSD. The tutorial covers: GNOME, KDE, Xfce, Mate, and Cinnamon. The instructions for each boil down to some variation of: Install the desktop environment and a login manager if it is not included: > sudo pkg install gnome3 Enable the login manager, and usually dbus and hald: > sudo sysrc dbus_enable="YES" hald_enable="YES" gdm_enable="YES" gnome_enable="YES" If using a generic login manager, add the DE startup command to your .xinitrc: > echo "exec cinnamon" > ~/.xinitrc And that is about it. The tutorial goes into more detail on other configuration you can do to get your desktop just the way you like it. To install Lumina: > sudo pkg install lumina pcbsd-utils-qt5 This will install Lumina and the pcbsd utilities package, which includes pcdm, the login manager. In the near future we hear the login manager and some of the other utilities will be split into separate packages, making it easier to use them on vanilla FreeBSD.
> sudo sysrc pcdm_enable="YES" dbus_enable="YES" hald_enable="YES" Reboot, and you should be greeted with the graphical login screen *** A return-oriented programming defense from OpenBSD (https://lwn.net/Articles/732201/) We talked a bit about RETGUARD last week, presenting Theo’s email announcing the new feature. Linux Weekly News has a nice breakdown on just how it works. Stack-smashing attacks have a long history; they featured, for example, as a core part of the Morris worm back in 1988. Restrictions on executing code on the stack have, to a great extent, put an end to such simple attacks, but that does not mean that stack-smashing attacks are no longer a threat. Return-oriented programming (ROP) has become a common technique for compromising systems via a stack-smashing vulnerability. There are various schemes out there for defeating ROP attacks, but a mechanism called "RETGUARD" that is being implemented in OpenBSD is notable for its relative simplicity. In a classic stack-smashing attack, the attack code would be written directly to the stack and executed there. Most modern systems do not allow execution of on-stack code, though, so this kind of attack will be ineffective. The stack does affect code execution, though, in that the call chain is stored there; when a function executes a "return" instruction, the address to return to is taken from the stack. An attacker who can overwrite the stack can, thus, force a function to "return" to an arbitrary location. That alone can be enough to carry out some types of attacks, but ROP adds another level of sophistication. A search through a body of binary code will turn up a great many short sequences of instructions ending in a return instruction. These sequences are termed "gadgets"; a large program contains enough gadgets to carry out almost any desired task — if they can be strung together into a chain. ROP works by locating these gadgets, then building a series of stack frames so that each gadget "returns" to the next. RETGUARD's defense, as the article describes it, is to have the compiler emit a small prologue in each function that mangles the return address on the stack (XORing it with a value the attacker does not know) and a matching epilogue that reverses the transformation just before the return; a return address overwritten by an attacker is thereby scrambled into garbage rather than followed. There is, of course, a significant limitation here: a ROP chain made up of exclusively polymorphic gadgets will still work, since those gadgets were not (intentionally) created by the compiler and do not contain the return-address-mangling code. De Raadt acknowledged this limitation, but said: "we believe once standard-RET is solved those concerns become easier to address separately in the future. In any case a substantial reduction of gadgets is powerful". Using the compiler to insert the hardening code greatly eases the task of applying RETGUARD to both the OpenBSD kernel and its user-space code. At least, that is true for code written in a high-level language. Any code written in assembly must be changed by hand, though, which is a fair amount of work. De Raadt and company have done that work; he reports that: "We are at the point where userland and base are fully working without regressions, and the remaining impacts are in a few larger ports which directly access the return address (for a variety of reasons)". It can be expected that, once these final issues are dealt with, OpenBSD will ship with this hardening enabled. The article wonders about applying the same to Linux, but notes it would be difficult because the Linux kernel cannot currently be compiled using LLVM. If any benchmarks have been run to determine the cost of using RETGUARD, they have not been publicly posted. The extra code will make the kernel a little bigger, and the extra overhead on every function is likely to add up in the end.
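As an aside, a crude way to see how much raw material ROP has to work with is to count the 0xc3 bytes (the single-byte x86 RET opcode) in any binary. This quick hack is mine, not from the article, and it overcounts, since 0xc3 also shows up in data and in the middle of longer instructions, but it conveys the scale:

/* Count candidate RET bytes in a file: a rough upper bound on the
 * gadget endpoints an attacker could scan for. Illustrative only. */
#include <stdio.h>

int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }
    FILE *f = fopen(argv[1], "rb");
    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    long rets = 0;
    int c;
    while ((c = getc(f)) != EOF)
        if (c == 0xc3)
            rets++;
    fclose(f);
    printf("%ld candidate RET bytes\n", rets);
    return 0;
}

For any sizable executable the count runs into the thousands, and that abundance of potential gadgets is exactly what RETGUARD's per-function mangling, at the cost of the overhead just mentioned, tries to neutralize.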
But if this technique can make the kernel that much harder to exploit, it may well justify the extra execution overhead that it brings with it. All that's needed is somebody to actually do the work and try it out. Videos from BSDCan have started to appear! (https://www.youtube.com/playlist?list=PLeF8ZihVdpFfVEsCxNWGDmcATJfRZacHv) Henning Brauer: tcp synfloods - BSDCan 2017 (https://www.youtube.com/watch?v=KuHepyI0_KY) Benno Rice: The Trouble with FreeBSD - BSDCan 2017 (https://www.youtube.com/watch?v=1DM5SwoXWSU) Li-Wen Hsu: Continuous Integration of The FreeBSD Project - BSDCan 2017 (https://www.youtube.com/watch?v=SCLfKWaUGa8) Andrew Turner: GENERIC ARM - BSDCan 2017 (https://www.youtube.com/watch?v=gkYjvrFvPJ0) Bjoern A. Zeeb: From the outside - BSDCan 2017 (https://www.youtube.com/watch?v=sYmW_H6FrWo) Rodney W. Grimes: FreeBSD as a Service - BSDCan 2017 (https://www.youtube.com/watch?v=Zf9tDJhoVbA) Reyk Floeter: The OpenBSD virtual machine daemon - BSDCan 2017 (https://www.youtube.com/watch?v=Os9L_sOiTH0) Brian Kidney: The Realities of DTrace on FreeBSD - BSDCan 2017 (https://www.youtube.com/watch?v=NMUf6VGK2fI) The rest will continue to trickle out, likely not until after EuroBSDCon *** Beastie Bits Oracle has killed Sun (https://meshedinsights.com/2017/09/03/oracle-finally-killed-sun/) Configure Thunderbird to send patch friendly (http://nanxiao.me/en/configure-thunderbird-to-send-patch-friendly/) FreeBSD 10.4-BETA4 Available (https://www.freebsd.org/news/newsflash.html#event20170909:01) iXsystems looking to hire kernel and ZFS developers (especially Sun/Oracle refugees) (https://www.facebook.com/ixsystems/posts/10155403417921508) Speaking of job postings, UnitedBSD.com has a few job postings related to BSD (https://unitedbsd.com/) Call for papers USENIX FAST '18 - February 12-15, 2018, Due: September 28, 2017 (https://www.freebsdfoundation.org/news-and-events/call-for-papers/usenix-fast-18-call-for-papers/) Scale 16x - March 8-11, 2018, Due: October 31, 2017 (https://www.freebsdfoundation.org/news-and-events/call-for-papers/scale-16x-call-for-participation/) FOSDEM '18 - February 3-4, 2018, Due: November 3, 2017 (https://www.freebsdfoundation.org/news-and-events/call-for-papers/fosdem-18-call-for-participation/) Feedback/Questions Jason asks about cheap router hardware (http://dpaste.com/340KRHG) Prashant asks about latest kernels with freebsd-update (http://dpaste.com/2J7DQQ6) Matt wants to know about VM Performance & CPU Steal Time (http://dpaste.com/1H5SZ81) John has config questions regarding Dell Precision 7720, FreeBSD, NVMe, and ZFS (http://dpaste.com/0X770SY) ***
211: It's HAMMER2 Time!
We explore whether a BSD can replicate Cisco router performance; RETGUARD, OpenBSD's new exploit mitigation technology; DragonFly's HAMMER2 filesystem implementation & more! This episode was brought to you by Headlines Can a BSD system replicate the performance of a Cisco router? (https://www.reddit.com/r/networking/comments/6upchy/can_a_bsd_system_replicate_the_performance_of/) Short Answer: No, but it might be good enough for what you need Traditionally, routers were built with a tightly coupled data plane and control plane. Back in the 80s and 90s the data plane ran in proprietary software on commodity CPUs. As the needs and desires for more speeds and feeds grew, the data plane had to be implemented in ASICs and FPGAs with custom memories and TCAMs. While these were still programmable in a sense, they certainly weren't programmable by anyone but a small handful of people who developed the hardware platform. The data plane was often layered, where features not handled by the hardware data plane were punted to a software-only data path running on a more general CPU. The performance difference between the two was typically an order or two of magnitude. source (https://fd.io/wp-content/uploads/sites/34/2017/07/FDioVPPwhitepaperJuly2017.pdf) Except for encryption (e.g. IPsec) or IDS/IPS, the true measure of router performance is packets forwarded per unit time. This is normally expressed as packets per second, or PPS. To forward at 'line-rate' on a 1Gbps interface, you must be able to forward packets at 1.488 million pps (Mpps). To forward at "line-rate" between 10Gbps interfaces, you must be able to forward at 14.88Mpps. Even on large hardware, kernel forwarding is limited to speeds that top out below 2Mpps. George Neville-Neil and I did a couple of papers on this back in 2014/2015. You can read the papers (https://github.com/freebsd-net/netperf/blob/master/Documentation/Papers/ABSDCon2015Paper.pdf) for the results. However, once you export the code from the kernel, things start to improve. There are a few open source code bases that show the potential of kernel-bypass networking for building a software-based router. The first of these is netmap-fwd (https://github.com/Netgate/netmap-fwd), which is the FreeBSD ip_forward() code hosted on top of netmap (https://github.com/luigirizzo/netmap), a kernel-bypass technology present in FreeBSD (and available for Linux). Full disclosure: netmap-fwd was done at my company, Netgate. (And by "my company" I mean that I co-own it with my spouse.) netmap-fwd will L3 forward around 5 Mpps per core. slides (https://github.com/Netgate/netmap-fwd/blob/master/netmap-fwd.pdf) Nanako Momiyama of the Keio Univ Tokuda Lab presented on IP Forwarding Fastpath (https://www.bsdcan.org/2017/schedule/events/823.en.html) at BSDCan this past May. She got about 5.6Mpps (roughly 10% faster than netmap-fwd) using a similar approach where the ip_forward() function was rewritten as a module for VALE (the netmap-based in-kernel switch). Slides (https://2016.eurobsdcon.org/PresentationSlides/NanakoMomiyama_TowardsFastIPForwarding.pdf) from her previous talk at EuroBSDCon 2016 are available. 
(Speed at the time was 2.8Mpps.) Also a paper (https://www.ht.sfc.keio.ac.jp/~nanako/conext17-sw.pdf) from that effort, if you want to read it. Of note: They were showing around 1.6Mpps even after replacing the in-kernel routing lookup algorithm with DXR. (DXR was written by Luigi Rizzo, who is also the primary author of netmap.) Not too long after netmap-fwd was open sourced, Gandi announced packet-journey, an application based on drivers and libraries from DPDK. Packet-journey is also an L3 router. The GitHub page for packet-journey lists performance as 21,773.47 Mbps (so 21.77Gbps) for 64-byte UDP frames with 50 ACLs and 500,000 routes. Since they're using 64-byte frames, this translates to roughly 32.4Mpps. Finally, there is recent work in FreeBSD (which is part of 11.1-RELEASE) that gets performance up to 2x the level of netmap-fwd or the work by Nanako Momiyama: around 10 million PPS. Here (http://blog.cochard.me/2015/09/receipt-for-building-10mpps-freebsd.html) is a decent introduction. But of course, even as FreeBSD gets up to being able to do 10Gbps at line-rate, 40 and 100 gigabit interfaces are not uncommon now. Even with the fastest modern CPUs, there is very little time to do any kind of meaningful packet processing. At 10Gbps, your total budget per packet, to receive (Rx) the packet, process the packet, and transmit (Tx) the packet, is 67.2 ns. Complicating the task is the simple fact that main memory (RAM) is 70 ns away. The simple conclusion here is that, even at 10Gbps, if you have to hit RAM, you can't generate the PPS required for line-rate forwarding. There is some detail about design tradeoffs in the Ryzen architecture and how that might impact using those machines as routers. Anyway... those are all interesting, but the natural winner here is FD.io's Vector Packet Processing (VPP). Read this (http://blogs.cisco.com/sp/a-bigger-helping-of-internet-please). VPP is an efficient, flexible open source data plane. It consists of a set of forwarding nodes arranged in a directed graph and a supporting framework. The framework has all the basic data structures, timers, drivers (and interfaces to both DPDK and netmap), a scheduler which allocates the CPU time between the graph nodes, and performance and debugging tools, like counters and built-in packet trace. The latter allows you to capture the paths taken by the packets within the graph with high timestamp granularity, giving full insight into the processing on a per-packet level. The net result here is that Cisco (again, Cisco) has shown the ability to route packets at 1 Tb/s using VPP on a four-socket Purley system. There is also much discussion of the future of pfSense, as they transition to using VPP. This is a very lengthy writeup which deserves a full read, plus there are some comments from other people *** RETGUARD, the OpenBSD next level in exploit mitigation, is about to debut (https://marc.info/?l=openbsd-tech&m=150317547021396&w=2) This year I went to BSDCAN in Ottawa. I spent much of it in the 'hallway track', and had an extended conversation with various people regarding our existing security mitigations and hopes for new ones in the future. I spoke a lot with Todd Mortimer. Apparently I told him that I felt return-address protection was impossible, so a few weeks later he sent a clang diff to address that issue... The first diff is for amd64 and i386 only -- in theory RISC architectures can follow this approach soon. The mechanism is like a userland 'stackghost' in the function prologue and epilogue. 
The preamble XORs the return address at the top of the stack with the stack pointer value itself. This perturbs the value by introducing bits from ASLR. The function epilogue undoes the transform immediately before the RET instruction. ROP attack methods are impacted because existing gadgets are transformed to consist of "<mangle ret address> RET". That pivots the return sequence off the ROP chain in a highly unpredictable and inconvenient fashion. The compiler diff handles this for all the C code, but the assembly functions have to be done by hand. I did this work first for amd64, and more recently for i386. I've fixed most of the functions and only a handful of complex ones remain. For those who know about polymorphism and pop/jmp or JOP, we believe once standard-RET is solved those concerns become easier to address separately in the future. In any case a substantial reduction of gadgets is powerful. For those worried about introducing worse polymorphism with these "xor; ret" epilogues themselves, the nested gadgets for 64bit and 32bit variations are +1 "xor %esp,(%rsp); ret", +2 "and $0x24,%al; ret" and +3 "and $0xc3,%al; int3". Not bad. Over the last two weeks, we have received help and advice to ensure debuggers (gdb, egdb, ddb, lldb) can still handle these transformed callframes. Also in the kernel, we discovered we must use a smaller XOR, because otherwise userland addresses are generated, and we cannot rely on SMEP as it is a really new feature of the architecture. There were also issues with pthreads and dlsym, which led to a series of uplifts around __builtin_return_address and DWARF CFI. Application of this diff doesn't require anything special; a system can simply be built twice, or shortcut by building & installing gnu/usr.bin/clang first, then doing a full build. We are at the point where userland and base are fully working without regressions, and the remaining impacts are in a few larger ports which directly access the return address (for a variety of reasons). So work needs to continue with handling the RET-addr swizzle in those ports, and then we can move forward. You can find the full message with the diff here (https://marc.info/?l=openbsd-tech&m=150317547021396&w=2) *** Interview - Ed Maste, Charlie & Siva - @ed_maste (https://twitter.com/ed_maste), @yzgyyang (https://twitter.com/yzgyyang) & @svmhdvn (https://twitter.com/svmhdvn) Co-op Students for the FreeBSD Foundation *** News Roundup Next DFly release will have an initial HAMMER2 implementation (http://lists.dragonflybsd.org/pipermail/users/2017-August/313558.html) The next DragonFly release (probably in September some time) will have an initial HAMMER2 implementation. It WILL be considered experimental and won't be an installer option yet. This initial release will only have single-image support operational plus basic features. It will have live dedup (for cp's), compression, fast recovery, snapshot, and boot support out of the gate. This first H2 release will not have clustering or multi-volume support, so don't expect those features to work. I may be able to get bulk dedup and basic mirroring operational by release time, but it won't be very efficient. Also, right now, sync operations are fairly expensive and will stall modifying operations to some degree during the flush, and there is no reblocking (yet). The allocator has a 16KB granularity (on HAMMER1 it was 2MB), so for testing purposes it will still work fairly well even without reblocking. The design is in a good place. I'm quite happy with how the physical layout turned out. 
Allocations down to 1KB are supported. The freemap has a 16KB granularity with a linear counter (one counter per 512KB) for packing smaller allocations. Inodes are 1KB and can directly embed 512 bytes of file data for files <= 512 bytes, or have four top-level blockrefs for files > 512 bytes. The freemap is also zoned by type for I/O locality. The blockrefs are 'fat' at 128 bytes but enormously powerful. That will allow us to ultimately support up to a 512-bit crypto hash and blind dedup using said hash. Not on release, but that's the plan. I came up with an excellent solution for directory entries. The 1KB allocation granularity was a bit high but I didn't want to reduce it. However, because blockrefs are now 128 byte entities, and directory entries are hashed just like in H1, I was able to code them such that a directory entry is embedded in the blockref itself and does not require a separate data reference or allocation beyond that. Filenames up to 64 bytes long can be accommodated in the blockref using the check-code area of the blockref. Longer filenames will use an additional data reference hanging off the blockref to accommodate up to 255 char filenames. Of course, a minimum of 1KB will have to be allocated in that case, but filenames are <= 64 bytes in the vast majority of use cases so it just isn't an issue. This gives directory entries optimal packing and indexing and is a huge win in terms of performance since blockrefs are arrayed in 16KB and 64KB blocks. In addition, since inodes can embed up to four blockrefs, the directory entries for 'small' directories with <= 4 entries ('.' and '..' don't count) can actually be embedded in the directory inode itself. So, generally speaking, the physical layout is in a very happy place. The basics are solid on my test boxes so it's now a matter of implementing as many of the more sophisticated features as I can before release, and continuing to work on the rest after the release. Removing Some Code (https://www.openssl.org/blog/blog/2017/06/17/code-removal/) This is another update on our effort to re-license the OpenSSL software. Our previous post in March was about the launch of our effort to reach all contributors, with the hope that they would support this change. So far, about 40% of the people have responded. For a project that is as old as OpenSSL (including its predecessor, SSLeay, it's around 20 years) that's not bad. We'll be continuing our efforts over the next couple of months to contact everyone. Of those people responding, the vast majority have been in favor of the license change – less than a dozen objected. This post describes what we're doing about those and how we came to our conclusions. The goal is to be very transparent and open with our processes. First, we want to mention that we respect their decision. While it is inconvenient to us, we acknowledge their rights to keep their code under the terms that they originally agreed to. We are asking permission to change the license terms, and it is fully within their rights to say no. The license website is driven by scripts that are freely available in the license section of our tools repository on GitHub. When we started, we imported every single git commit, looked for anything that resembled an email address, and created tables for each commit, each user we found, and a log that connects users to commits. 
This did find false positives: sometimes email Message-IDs showed up, and there were often mentions of folks just as a passing side-comment, or because they were in the context diff. (That script is also in the tools repository, of course.) The user table also has columns to record approval or rejection of the license change, and comments. Most of the comments have been supportive, and (a bit surprisingly) only one or two were obscene. The whattoremove script finds the users who refused, and all commits they were named in. We then examined each commit – there were 86 in all – and filtered out those that were cherry-picks to other releases. We are planning on only changing the license for the master branch, so the other release branches aren't applicable. There were also some obvious false positives. At the end of this step, we had 43 commits to review. We then re-read every single commit, looking for what we could safely ignore as not relevant. We found the following: Nine of them showed up only because the person was mentioned in a comment. Sixteen of them were changes to the OpenBSD configuration data. The OpenSSL team had completely rewritten this, refactoring all BSD configurations into a common hierarchy, and the config stuff changed a great deal for 1.1.0. Seven were not copyrightable because they were minimal changes (e.g., fixing a typo in a comment). One was a false positive. This left us with 10 commits. Two of them were about the CryptoDev engine. We are removing that engine, as can be seen in this PR, but we expect to have a replacement soon (for at least Linux and FreeBSD). As for the other commits, we are removing that code, as can be seen in this first PR and this second PR. Fortunately, none of them are particularly complex. Copyright, however, is a complex thing – at times it makes debugging look simple. If there is only one way to write something, then that content isn't copyrightable. If the work is very small, such as a one-line patch, then it isn't copyrightable. We aren't lawyers, and have no wish to become such, but those viewpoints are valid. It is not official legal advice, however. For the next step, we hope that we will be able to "put back" the code that we removed. We are looking to the community for help, and encourage you to create the appropriate GitHub pull requests. We ask the community to look at that second PR and provide fixes. And finally, we ask everyone to look at the list of authors and see if their name, or the name of anyone they know, is listed. If so please email us. deraadt@ moves us to 6.2-beta! (http://undeadly.org/cgi?action=article&sid=20170821133453) Theo has just committed the diff that marks the end of the development cycle and the beginning of the testing phase for the upcoming 6.2 release:
```
CVSROOT:    /cvs
Module name:    src
Changes by: deraadt@cvs.openbsd.org 2017/08/20 10:56:43

Modified files:
    etc/root       : root.mail
    share/mk       : sys.mk
    sys/arch/macppc/stand/tbxidata: bsd.tbxi
    sys/conf       : newvers.sh
    sys/sys        : param.h

Log message:
crank to 6.2-beta
```
You all know what this means: get to testing! Find whatever hardware you have and install the latest snapshots, stress the upgrade procedure, play your favorite games, build your own code - whatever you use OpenBSD for, try it in the new snaps and report any problems you find. Your testing efforts will help make sure 6.2 is another great release! 
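If you have never tested a snapshot before, the move is short. A minimal sketch of one common way to do it, assuming an amd64 machine and the ftp.openbsd.org mirror (both are assumptions; substitute your architecture and closest mirror):
```
# Fetch the snapshot ramdisk kernel into /, reboot into it,
# and choose (U)pgrade at the installer prompt.
cd / && ftp https://ftp.openbsd.org/pub/OpenBSD/snapshots/amd64/bsd.rd
reboot
# at the boot> prompt, type: boot bsd.rd
```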
Beastie Bits 64 Hijacked ARMs (https://www.soldierx.com/news/64-Hijacked-ARMs) Smartisan Makes Another Iridium Donation to the OpenBSD Foundation (http://undeadly.org/cgi?action=article&sid=20170817195416) Andrey A. Chernov (ache), long-time FreeBSD core team member and dev, has passed away (https://twitter.com/Keltounet/status/898092662657560576) OpenBSD hacker Mickey Shalayeff has passed away (https://twitter.com/dugsong/status/897338038212276224) FreeBSD 10.4-BETA1 Available (https://www.freebsd.org/news/newsflash.html#event20170819:01) vmadm in action, from getting a dataset to running a jail w/ vnet networking in just a few commands (https://asciinema.org/a/M8sjN0FC64JPBWZqjKIG5sx2q) Interview with Patrick Wildt (from t2k17 OpenBSD Hackathon) (https://garbage.fm/episodes/41) *** Feedback/Questions Seth - Switching to tarsnap (http://dpaste.com/0F2382X) Johnny - memcmp (http://dpaste.com/1F576QS) Thomas - Drives and NAS (http://dpaste.com/0F9QSZZ) Ben - Nvidia (http://dpaste.com/1Z6CFWE) David - ZFS performance variations over nfs (http://dpaste.com/0B23QZB) ***
210: Your questions, part I
In this episode, we take a look at the reimplementation of NetBSD using a microkernel, check out what makes DHCP faster, see what high-process-count support for DragonFlyBSD has to offer, and answer the questions you've always wanted to ask us. This episode was brought to you by Headlines A Reimplementation of NetBSD Using a Microkernel (http://theembeddedboard.review/a-reimplementation-of-netbsd-using-a-microkernel-part-1-of-2/) Minix author Andy Tanenbaum writes in Part 1 of a-reimplementation-of-netbsd-using-a-microkernel (http://theembeddedboard.review/a-reimplementation-of-netbsd-using-a-microkernel-part-1-of-2/) Based on the MINIX 3 microkernel, we have constructed a system that to the user looks a great deal like NetBSD. It uses pkgsrc, NetBSD headers and libraries, and passes over 80% of the KYUA tests. Research at the Vrije Universiteit has resulted in this reimplementation of NetBSD using a microkernel instead of the traditional monolithic kernel. However, inside, the system is completely different. At the bottom is a small (about 13,000 lines of code) microkernel that handles interrupts, message passing, low-level scheduling, and hardware related details. Nearly all of the actual operating system, including memory management, the file system(s), paging, and all the device drivers run as user-mode processes protected by the MMU. As a consequence, failures or security issues in one component cannot spread to other ones. In some cases a failed component can be replaced automatically and on the fly, while the system is running, and without user processes noticing it. The latest work has been adding live update, making it possible to upgrade to a new version of the operating system WITHOUT a reboot and without running processes even noticing. No other operating system can do this. The system is built on MINIX 3, a derivative of the original MINIX system, which was intended for education. However, after the original author, Andrew Tanenbaum, received a 2 million euro grant from the Royal Netherlands Academy of Arts and Sciences and a 2.5 million euro grant from the European Research Council, the focus changed to building a highly reliable, secure, fault tolerant operating system, with an emphasis on embedded systems. The code is open source and can be downloaded from www.minix3.org. It runs on the x86 and ARM Cortex-A8 (e.g., BeagleBones). Since 2007, the website has been visited over 3 million times and the bootable image file has been downloaded over 600,000 times. The talk will discuss the history, goals, technology, and status of the project. Part 2 (http://theembeddedboard.review/a-reimplementation-of-netbsd-using-a-microkernel-part-2-of-2/) is also available. *** Rapid DHCP: Or, how do Macs get on the network so fast? 
(https://cafbit.com/post/rapid_dhcp_or_how_do/) One of life's minor annoyances is having to wait on my devices to connect to the network after I wake them from sleep. All too often, I'll open the lid on my EeePC netbook, enter a web address, and get the dreaded "This webpage is not available" message because the machine is still working on connecting to my Wi-Fi network. On some occasions, I have to twiddle my thumbs for as long as 10-15 seconds before the network is ready to be used. The frustrating thing is that I know it doesn't have to be this way. I know this because I have a Mac. When I open the lid of my MacBook Pro, it connects to the network nearly instantaneously. In fact, no matter how fast I am, the network comes up before I can even try to load a web page. My curiosity got the better of me, and I set out to investigate how Macs are able to connect to the network so quickly, and how the network connect time in other operating systems could be improved. I figure there are three main categories of time-consuming activities that occur during network initialization: Link establishment. This is the activity of establishing communication with the network's link layer. In the case of Wi-Fi, the radio must be powered on, the access point detected, and the optional encryption layer (e.g. WPA) established. After link establishment, the device is able to send and receive Ethernet frames on the network. Dynamic Host Configuration Protocol (DHCP). Through DHCP handshaking, the device negotiates an IP address for its use on the local IP network. A DHCP server is responsible for managing the IP addresses available for use on the network. Miscellaneous overhead. The operating system may perform any number of mundane tasks during the process of network initialization, including running scripts, looking up preconfigured network settings in a local database, launching programs, etc. My investigation thus far is primarily concerned with the DHCP phase, although the other two categories would be interesting to study in the future. I set up a packet capture environment with a spare wireless access point, and observed the network activity of a number of devices as they initialized their network connection. For a worst-case scenario, let's look at the network activity captured while an Android tablet is connecting: This tablet, presumably in the interest of "optimization", is initially skipping the DHCP discovery phase and immediately requesting its previous IP address. The only problem is this is a different network, so the DHCP server ignores these requests. After about 4.5 seconds, the tablet stubbornly tries again to request its old IP address. After another 4.5 seconds, it resigns itself to starting from scratch, and performs the DHCP discovery needed to obtain an IP address on the new network. In all fairness, this delay wouldn't be so bad if the device was connecting to the same network as it was previously using. However, notice that the tablet waits a full 1.13 seconds after link establishment to even think about starting the DHCP process. Engineering snappiness usually means finding lots of small opportunities to save a few milliseconds here and there, and someone definitely dropped the ball here. In contrast, let's look at the packet dump from the machine with the lightning-fast network initialization, and see if we can uncover the magic that is happening under the hood: The key to understanding the magic is the first three unicast ARP requests. 
It looks like Mac OS remembers certain information about not only the last connected network, but the last several networks. In particular, it must at least persist the following tuple for each of these networks: > 1. The Ethernet address of the DHCP server > 2. The IP address of the DHCP server > 3. Its own IP address, as assigned by the DHCP server During network initialization, the Mac transmits carefully crafted unicast ARP requests with this stored information. For each network in its memory, it attempts to send a request to the specific Ethernet address of the DHCP server for that network, in which it asks about the server's IP address, and requests that the server reply to the IP address which the Mac was formerly using on that network. Unless network hosts have been radically shuffled around, at most only one of these ARP requests will result in a response—the request corresponding to the current network, if the current network happens to be one of the remembered networks. This network recognition technique allows the Mac to very rapidly discover if it is connected to a known network. If the network is recognized (and presumably if the Mac knows that the DHCP lease is still active), it immediately and presumptuously configures its IP interface with the address it knows is good for this network. (Well, it does perform a self-ARP for good measure, but doesn't seem to wait more than 13ms for a response.) The DHCP handshaking process begins in the background by sending a DHCP request for its assumed IP address, but the network interface is available for use during the handshaking process. If the network was not recognized, I assume the Mac would know to begin the DHCP discovery phase, instead of sending blind requests for a former IP address as the Galaxy Tab does. The Mac's rapid network initialization can be credited to more than just the network recognition scheme. Judging by the use of ARP (which can be problematic to deal with in user-space) and the unusually regular transmission intervals (a reliable 1.0ms delay between each packet sent), I'm guessing that the Mac's DHCP client system is entirely implemented as tight kernel-mode code. The Mac began the IP interface initialization process a mere 10ms after link establishment, which is far faster than any other device I tested. Android devices such as the Galaxy Tab rely on the user-mode dhcpcd program, which no doubt brings a lot of additional overhead such as loading the program, context switching, and perhaps even running scripts. The next step for some daring kernel hacker is to implement a similarly aggressive DHCP client system in the Linux kernel, so that I can enjoy fast sign-on speeds on my Android tablet, Android phone, and Ubuntu netbook. There already exists a minimal DHCP client implementation in the Linux kernel, but it lacks certain features such as configuring the DNS nameservers. Perhaps it wouldn't be too much work to extend this code to support network recognition and interface with a user-mode daemon to handle such auxiliary configuration information received via DHCP. If I ever get a few spare cycles, maybe I'll even take a stab at it. You can also find other ways of optimizing the dhclient program and how it works in the dhclient tutorial on Calomel.org (https://calomel.org/dhclient.html). 
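Repeating this kind of investigation on your own network takes nothing more than a packet capture on a machine that can see the device's traffic. A minimal sketch, where the interface name wlan0 is an assumption for whatever interface sits on the capture path in your setup:
```
# Print link-level headers (-e) and inter-packet delta timestamps (-ttt)
# for ARP and DHCP (UDP ports 67/68) while the device associates,
# so the gap between link-up and the first DHCP packet is visible.
tcpdump -e -ttt -i wlan0 'arp or (udp port 67 or udp port 68)'
```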
*** BSDCam Trip Report (https://www.freebsdfoundation.org/blog/bsdcam-2017-trip-report-michael-lucas/) Over the decades, FreeBSD development and coordination has shifted from being purely on-line to involving more and more in-person coordination and cooperation. The FreeBSD Foundation sponsors a devsummit right before BSDCan, EuroBSDCon, and AsiaBSDCon, so that developers traveling to the con can leverage their airfare and hammer out some problems. Yes, the Internet is great for coordination, but nothing beats a group of developers spending ten minutes together to sketch on a whiteboard and figuring out exactly how to make something bulletproof. In addition to the coordination efforts, though, conference devsummits are hierarchical. There’s a rigid schedule, with topics decided in advance. Someone leads the session. Sessions can be highly informative, passionate arguments, or anything in between. BSDCam is… a little different. It’s an invaluable part of the FreeBSD ecosystem. However, it’s something that I wouldn’t normally attend. But right now, is not normal. I’m writing a new edition of Absolute FreeBSD. To my astonishment, people have come to rely on this book when planning their deployments and operations. While I find this satisfying, it also increases the pressure on me to get things correct. When I wrote my first FreeBSD book back in 2000, a dozen mailing lists provided authoritative information on FreeBSD development. One person could read every one of those lists. Today, that’s not possible—and the mailing lists are only one narrow aspect of the FreeBSD social system. Don’t get me wrong—it’s pretty easy to find out what people are doing and how the system works. But it’s not that easy to find out what people will be doing and how the system will work. If this book is going to be future-proof, I needed to leave my cozy nest and venture into the wilds of Cambridge, England. Sadly, the BSDCam chair agreed with my logic, so I boarded an aluminum deathtrap—sorry, a “commercial airliner”—and found myself hurtled from Detroit to Heathrow. And one Wednesday morning, I made it to the William Gates building of Cambridge University, consciousness nailed to my body by a thankfully infinite stream of proper British tea. BSDCam attendance is invitation only, and the facilities can only handle fifty folks or so. You need to be actively working on FreeBSD to wrangle an invite. Developers attend from all over the world. Yet, there’s no agenda. Robert Watson is the chair, but he doesn’t decide on the conference topics. He goes around the room and asks everyone to introduce themselves, say what they’re working on, and declare what they want to discuss during the conference. The topics of interest are tallied. The most popular topics get assigned time slots and one of the two big rooms. Folks interested in less popular topics are invited to claim one of the small breakout rooms. Then the real fun begins. I started by eavesdropping in the virtualization workshop. For two hours, people discussed FreeBSD’s virtualization needs, strengths, and weaknesses. What needs help? What should this interface look like? What compatibility is important, and what isn’t? By the end of the session, the couple dozen people had developed a reasonable consensus and, most importantly, some folks had added items to their to-do lists. Repeat for a dozen more topics. 
I got a good grip on what's really happening with security mitigation techniques, FreeBSD's cloud support, TCP/IP improvements, advances in teaching FreeBSD, and more. A BSDCan devsummit presentation on packaging the base system is informative, but eavesdropping on two dozen highly educated engineers arguing about how to nail down the final tidbits needed to make that a real thing is far more educational. To my surprise, I was able to provide useful feedback for some sessions. I speak at a lot of events outside of the FreeBSD world, and was able to share much of what I hear at Linux conferences. A tool that works well for an experienced developer doesn't necessarily work well for everyone. Every year, I leave BSDCan tired. I left BSDCam entirely exhausted. These intense, focused discussions stretched my brain. But, I have a really good idea where key parts of FreeBSD development are actually headed. This should help future-proof the new Absolute FreeBSD, as much as any computer book can be future-proof. Plus, BSDCam throws the most glorious conference dinner I've ever seen. I want to thank Robert Watson for his kind invitation, and the FreeBSD Foundation for helping defray the cost of this trip Interview - The BSDNow Crew As a kid, what did you dream of becoming as an adult? JT: An Astronaut BR: I wanted to be a private detective, because of all the crime novels that I read back then. I didn't get far with it. However, I think the structured analysis skills (who did what, when, and such) help me in debugging and sysadmin work. AJ: Didn't think about it much How do you manage to stay organized day to day with so many things you're actively doing each day? (Day job, wife/girlfriend, conferences, hobbies, friends, etc.) JT: Who said I was organized? BR: A lot of stuff in my calendar as reminders, open browser tabs as a "to read later" list. A few things like task switching when getting stuck help. Also, focus on a single goal for the day, even though there will be distractions. Slowly but steadily chip away at the things you're working on. Rather than procrastinate and put things back to review later, get started early with easy things for a big task and then tackle the hard part. Often, things look totally chaotic and unmanageable, until you start working on them. AJ: I barely manage. Lots of Google Calendar reminders, and the entire wall of my office is covered in whiteboard sheet todo lists. I use pinboard.in to deal with finding and organizing bookmarks. Write things down, don't trust your memory. What hobbies outside of IT do you have? JT: I love photography, but I do that professionally part time, so I'm not sure if that counts as a hobby anymore. I guess it'd have to be working in the garage on my cars. BR: I do Tai Chi to relax once a week in a group, but can also do it alone, pretty much everywhere. Way too much YouTube watching and browsing the web. I did play some games before studying at the university and I'm still proud that I could control it to the bare minimum not to impact my studies. A few "lapses" from time to time, revisiting the old classics since the newer stuff won't run on my machines anyway. Holiday time is pretty much spent on BSD conferences and events; this is where I can relax and talk with like-minded people from around the world, which is fascinating. Plus, it gets me to various places and countries I never would have dared to visit on my own. 
AJ: I play a few video games, and I like to ski, although I don't go very often as most of my vacation time is spent hanging out with my BSD friends at various conferences How do you relax? JT: What is this word 'relax' and what does it mean? BR: My Tai Chi plays a big part in it, I guess. It really calms you and the constant stream of thoughts for a while. It also gives you better clarity of what's important in life. Watching movies, sleeping long. AJ: Usually watching TV or movies, although I have taken to doing most of my TV watching on my exercise bike now, but it is still mentally relaxing If FreeBSD didn't exist, which BSD flavour would you use? Why? JT: I use TrueOS, but if FreeBSD didn't exist, that project might not either… so… My other choice would be HardenedBSD, but since it's also based on FreeBSD I'm in the same dilemma. BR: I once installed NetBSD to see what it can do. If FreeBSD didn't exist, I would probably try my luck with it. OpenBSD is also appealing, but I've never installed it. AJ: When I started using FreeBSD in 2000, the only other BSD I had heard of at the time was OpenBSD. If FreeBSD wasn't around, I don't think the world would look like it does, so it is hard to speculate. If any of the BSDs weren't around and you had to use Linux, which camp would you belong to? (Redhat, SUSE, Debian, Ubuntu, Gentoo?) JT: I learned Linux in the mid 90s using Slackware, which I used consistently up until the mid 2000s, when I joined the PuppyLinux community and eventually became a developer (FYI, Puppy was/is/can be based on Slackware -- it's complicated). So I'd go back to using either Slackware or PuppyLinux. BR: I tried various Linux distributions until I landed at Debian. I used it pretty extensively as my desktop OS at home, building custom kernels and packages to install them until I discovered FreeBSD. I ran both side by side for a few months for learning until one day I figured out that I had not booted Debian in a while, so I switched completely. AJ: The first Linux I played with was Slackware, and it is the most BSD-like, but the bits of Linux I learned in school were Redhat and so I can somewhat wrap my head around it, although now that they are changing everything to systemd, all of that old knowledge is more harmful than useful. Do you still find yourself needing to use Windows/Mac OS? Why? JT: I work part time as a professional photographer, so I do use Windows for my photography work. While I can do everything I need to do in Linux, it comes down to being pragmatic about my time. What takes me several hours to accomplish in Linux I can accomplish in 20 minutes on Windows. BR: I was a long time Windows-only user before my Unix days. But back when Vista was about to come out and I needed a new laptop, my choice was basically learning to cope with Vista's awful features or learn MacOS X. I did the latter; it increased my productivity since it's really a good Unix desktop experience (at least, back then). I only have to use Windows at work from time to time as I manage our Windows Terminal server, which keeps the exposure low enough, and I only connect to it to use a certain app not available for the Mac or the BSDs. AJ: I still use Windows to play games, for a lot of video conferencing, and to produce BSD Now. Some of it could be done on BSD but not as easily. I have promised myself that I will switch to 100% BSD rather than upgrade to Windows 10, so we'll see how that goes. Please describe your home networking setup. 
Router type, router OS, router hardware, network segmentation, wifi apparatus(es), other devices connected, and anything else that might be interesting about your home network. BR: Very simple and boring: an Apple Airport Express base station and an AVM FritzBox for DNS, DHCP, and the link to my provider. A long network cable to my desktop machine, which I use less and less often. I just bought an RPi 3 for some home use in the future to replace it. Mostly my brother's and my MacBook Pros are connected, plus our phones and my mother's iPad. AJ: I have an E3-1220 v3 (dual 3.1GHz + HT) with 8 GB of RAM, and 4x Intel gigabit server NICs as my router, and it runs vanilla FreeBSD (usually some snapshot of -current). I have 4 different VLANs: Home, Office, DMZ, and Guest WiFi. WiFi is served via a tiny USB-powered device I bought in Tokyo years ago; it serves 3 different SSIDs, one for each VLAN except the DMZ. There are ethernet jacks in every room wired for 10 gigabit, although the only machines with 10 gigabit are my main workstation, file server, and some machines in the server rack. There are 3 switches, one for the house (in the laundry room), one for the rack, and one for 10gig stuff. There is a rack in the basement spare bedroom; it has 7 servers in it, mostly storage for live replicas of customer data for my company. How do you guys manage to get your work done on FreeBSD desktops? What do you do when you need a Linux or Windows app that isn't ported, or working? I've made several attempts to switch to FreeBSD, but each attempt failed because of tools not being available (e.g. Zoom, Dropbox, TeamViewer, Crashplan) or broken (e.g. VirtualBox). BR: I use VirtualBox for everything that is not natively available or Windows-only. Unfortunately, that means no modern games. I mostly do work in the shell when I'm on FreeBSD, and when it has to be a graphical application, I use Fluxbox as the DE. I want to get work done, not look at fancy eye-candy that gets boring after a while. I deactivated the same stuff on my Mac for the same reason. I look for alternative software online, but my needs are relatively easy to satisfy as I'm not doing video editing/rendering and such. AJ: I generally find that I don't need these apps. I use Firefox, Thunderbird, OpenSSH, Quassel, KomodoEdit, and a few other apps, so my needs are not very demanding. It is annoying when packages are broken, but I usually work around this with boot environments, and being able to just roll back to a version that worked for a few days until the problem is solved. I do still have access to a Windows machine for the odd time I need specific VPN software or access to Dell/HP etc. out-of-band management tools. Which desktop environments are your favorite, and why? For example, I like i3, Xfce, and I'm drawn to Lumina's ethos, but so far always seem to end up back on Xfce because of its ease of use, flexibility, and dashing good looks. JT: As a Lumina Desktop developer, I think my preference is obvious. ;) I am also a long time OpenBox user, so I have a soft place in my heart for that as well. BR: I use Fluxbox when I need to work with a lot of windows or an application demands X11. KDE and others are too memory-heavy for me and I rarely use even 20% of the features they provide. AJ: I was a long time KDE user, but I have adopted Lumina. I find it fast, and that it gets out of my way and lets me do what I want. It had some annoyances early on, but I've nagged the developers into making it work for me. 
Which command-line shells do you prefer, why, and how (if at all) have you customised the environment or prompt? BR: I use zsh, but without all the fancy stuff you can find online. It might make you more productive, yes. But again, I try to keep things simple. I'm slowly learning tmux and want to work more in it in the future. I sometimes look at other BSD people's laptops and am amazed at what they do with window-management in tmux. My prompt looks like this: bcr@Voyager:~> 20:20 17-08-17 Put this in your .zshrc to get the same result: PROMPT='%n@%m:%~>' RPROMPT='%T %D' AJ: I started using tcsh early on, because it was the shell on the first box I had access to, and because one of the first things I read in "BSD Hacks" was how to enable 'typo correction', which made my life a lot better, especially on dial-up in the early days. My shell prompt looks like this: allan@CA-TOR1-02:/usr/home/allan% What is one thing (or more) missing in FreeBSD you would import from another project or community? Could be tech, process, etc. JT: AUFS from Linux BR: Nohup from Illumos, where you can detach an already running process and put it in the background. I often forget that and I'm not in tmux when that happens, so I can see myself using that feature a lot. AJ: Zones (more complete Jails) from IllumOS How do you manage your time to learn about and work on FreeBSD? Does your work/employment enable what you do, or are your contributions mainly done in private time? JT: These days I'm mostly learning things I need for work, so it just falls into something I'm doing while working on work projects. BR: We have a lot of time during the semester holidays to learn on our own; it's part of the idea of being in a university to keep yourself updated, at least for me, especially in the fast moving world of IT. I also read a lot in my free time. My interests can shift sometimes, but then I devour everything I can find on the topic. Can be a bit excessive, but it has gotten me where I am now and I still need to learn a lot (and want to). Since I work with FreeBSD at work (my own doing), I can try out many things there. AJ: My work means I spend a lot of time working with FreeBSD, but not that much time working ON it. My contributions are mostly done outside of work, but as I own the company I do get more flexibility to take time off for conferences and other FreeBSD related stuff. We know we can bribe Michael W Lucas with gelato (good gelato that is), but what can we use to bribe you guys? Like when I want to have Allan work on fixing a bug which prevents me from running ZFS on this fancy rock64 board? BR: Desserts of various kinds. AJ: I am probably not the right person to look at your rock64 board. Most people in the project have taken to bribing me with chocolate. In general, my todo list is so long, the best way is a trade: you take this task and I'll take that task. Is your daily mobile device iOS, Android, Windows Mobile, or other? Why? JT: These days I'm using Android on my Blackberry Priv, but until recently I was still a heavy user of Sailfish OS. I would use SailfishOS every day, if I could find a phone with a keyboard that I could run it on. BR: iOS on the iPhone 7 currently. Never used an Android phone; I've seen it on other people's devices and what they can do with it (much more). But the infrequent security updates (if any at all) keep me away from it. AJ: I have a Google Nexus 6 (Android 7.1). I wanted the 'pure' Android experience, and I had been happy with my previous Nexus S. 
I don't run a custom OS/ROM or anything because I use the phone to verify that video streams work on an 'average user's device'. I am displeased that support for my device will end soon. I am not sure what device I will get next, but it definitely won't be an iPhone. News Roundup Beta Update - Request for (more) Testing (http://undeadly.org/cgi?action=article&sid=20170808065718&mode=flat&count=30) https://beta.undeadly.org/ has received an update. The most significant changes include: The site has been given a less antiquated "look". (As the topic icons have been eliminated, we are no longer seeking help with those graphics.) The site now uses a moderate amount of semantic HTML5. Several bugs in the HTML fragment validator (used for submissions and comments) have been fixed. To avoid generating invalid HTML, submission content which fails validation is no longer displayed in submission/comment previews. Plain text submissions are converted to HTML in a more useful fashion. (Instead of just converting each EOL to <br>, the converter now generates proper paragraphs and interprets two or more consecutive EOLs as indicating a paragraph break.) The redevelopment remains a work-in-progress. Many thanks to those who have contributed! As before, constructive feedback would be appreciated. Of particular interest are reports of bugs in behaviour (for example, in the HTML validator or in authentication) that would preclude the adoption of the current code for the main site. High-process-count support added to master (http://lists.dragonflybsd.org/pipermail/users/2017-August/313552.html) We've fixed a number of bottlenecks that can develop when the number of user processes runs into the tens of thousands or higher. One thing led to another and I said to myself, "gee, we have a 6-digit PID, might as well make it work to a million!". With the commits made today, master can support at least 900,000 processes with just a kern.maxproc setting in /boot/loader.conf, assuming the machine has the memory to handle it. And, in fact, as today's machines start to ratchet up there in both memory capacity and core count, with fast storage (NVMe) and fast networking (10GigE and higher), even in consumer boxes, this is actually something that one might want to do. With AMD's Threadripper and EPYC chips now out, the Intel<->AMD cpu wars are back on! Boasting up to 32 cores (64 threads) per socket and two sockets on EPYC, terabytes of RAM, and motherboards with dual 10GigE built-in, the reality is that these numbers are already achievable in a useful manner. In any case, I've tested these changes on a dual-socket Xeon. I can in fact start 900,000 processes. They don't get a whole lot of cpu and running 'ps' would be painful, but it works and the system is still responsive from the shell with all of that going on.
```
xeon126# uptime
 1:42PM  up 9 mins, 3 users, load averages: 890407.00, 549381.40, 254199.55
```
In fact, judging from the memory use, these minimal test processes only eat around 60KB each. 900,000 of them ate only 55GB on a 128GB machine. So even a million processes is not out of the question, depending on the cpu requirements for those processes. Today's modern machines can be stuffed with enormous amounts of memory. Of course, our PIDs are currently limited to 6 digits, so a million is kinda the upper limit in terms of discrete user processes (versus pthreads, which are less restricted). I'd rather not go to 7 digits (yet). 
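Trying this on your own DragonFly box is a one-tunable affair, as the post says. A quick sketch (900000 simply mirrors the number from the test above; size it to your machine's RAM):
```
# Raise the boot-time process limit, then confirm it after rebooting.
echo 'kern.maxproc="900000"' >> /boot/loader.conf
reboot
# after the reboot:
sysctl kern.maxproc
```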
CFT: Driver for generic MS Windows 7/8/10 - compatible USB HID multi-touch touchscreens (https://lists.freebsd.org/pipermail/freebsd-current/2017-August/066783.html) The following patch [1] adds support for generic MS Windows 7/8/10 - compatible USB HID multi-touch touchscreens via the evdev protocol. It is intended to be a native replacement of the hid-multitouch.c driver found in Linux distributions and the multimedia/webcamd port. The patch is made for 12-CURRENT and most probably can be applied to recent 11-STABLE and 11.1-RELEASE (not tested). How to test: 1. Apply patch [1] 2. To compile this driver into the kernel, place the following lines into your kernel configuration file:
```
device wmt
device usb
device evdev
```
Alternatively, to load the driver as a module at boot time, place the following line in loader.conf(5): wmt_load="YES" 3. Install the x11-drivers/xf86-input-evdev or x11-drivers/xf86-input-libinput port 4. Tell Xorg to use the evdev or libinput driver for the device:
```
Section "ServerLayout"
        InputDevice    "TouchScreen0" "SendCoreEvents"
EndSection

Section "InputDevice"
        Identifier     "TouchScreen0"
        Driver         "evdev"
#       Driver         "libinput"
        Option         "Device" "/dev/input/eventXXX"
EndSection
```
The exact value of "/dev/input/eventXXX" can be obtained with the evemu-record utility from devel/evemu. Note 1: Currently, the driver does not support pens or touchpads. Note 2: wmt.ko should be kld-loaded before the uhid driver to take precedence over it! Otherwise uhid can be kld-unloaded after loading of wmt. wmt review: https://reviews.freebsd.org/D12017 Raw diff: https://reviews.freebsd.org/D12017.diff *** Beastie Bits BSDMag Programming Languages Infographic (https://bsdmag.org/programm_history/) t2k17 Hackathon Report: Bob Beck on buffer cache tweaks, libressl and pledge progress (http://undeadly.org/cgi?action=article&sid=20170815171854) New FreeBSD Journal (https://www.freebsdfoundation.org/past-issues/resource-control/) NetBSD machines at Open Source Conference 2017 Kyoto (http://mail-index.netbsd.org/netbsd-advocacy/2017/08/10/msg000744.html) *** Feedback/Questions Dan - HDD question (http://dpaste.com/3H6TDJV) Benjamin - scrub of death (http://dpaste.com/10F086V) Jason - Router Opinion (http://dpaste.com/2D9102K) Sohrab - Thanks (http://dpaste.com/1XYYTWF) ***