| commit | 7e9ef4d35c9aa707f274b019acaf06b682639f50 | [log] [tgz] |
|---|---|---|
| author | Barret Rhoden <brho@cs.berkeley.edu> | Tue Jul 12 14:30:39 2016 -0400 |
| committer | Barret Rhoden <brho@cs.berkeley.edu> | Tue Jul 19 11:43:10 2016 -0400 |
| tree | 054e7fa2a4c62a955c4a23298f5aa7847c2ac6f8 | |
| parent | d7abf316488c4540764f6682632b053b308654fc | [diff] |
x86: Ensure boot_pgdir's user entries are unmapped

For sanity reasons, we should never have anything in boot_pgdir below ULIM. Technically, we could make it work, but not without some thought.

The issue is that PML4 entries are pointers, pointing to common PML3s. Any entry in boot_pgdir is shared with every process, from the PML3 on down. The kernel expects to manage and synchronize global access to the kernel mappings (above ULIM). But memory below ULIM is managed per-process.

Backstory: I was trying to debug a null function pointer by mapping something at page 0. The kernel panicked after decreffing a page too many times. What happened was that inserting one page at virtual addr 0 created a PML3, PML2, and PML1 that were shared between every process - not just the page mapped at 0. This PTE reach was 512 GB, including the program binary (which was in the page cache).

The first process was fine. However, when we forked, the pages for e.g. busybox's text segment already had PTEs in the new process's address space. Technically, they were the same PTEs (and PML3, 2, and 1) as the parent process, since they were shared data structures.

Anyway, map_page_at_addr() saw that the PTE had a mapping, so it decreffed the page before inserting a new page. It just so happened that the new page was the same as the old one, since it was a fork (duplicate_vmrs, etc). That page had a single refcnt, since it was managed by the page cache, causing it to be freed. Now the page is freed, but it is in the page tables still, since map_page_at_addr() wrote the PTE with the value of the page. Basically, it was a "replace the PTE with its own value", but it triggered a decref.

Next up was the VMR for busybox's MAP_PRIVATE vmr. Since it's a private mapping, it gets its own page. upage_alloc() gives us the most recently freed page, which was the one that was still in the address space, right at the top of the text segment.

Eventually the process page faults, probably because of the madness of its address space. When it does, the kernel puts every page in the VMRs. At least one page (the one we discussed) was in there twice. The second decref triggers the kref assert, since we decreffed something that was already 0. Good times.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
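The 512 GB reach and the "nothing below ULIM" rule from the commit message can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the code from the commit: `SKETCH_ULIM`, `sketch_pte_t`, `pml4_index()`, and `first_user_entry()` are hypothetical names, and the ULIM value is simply the top of the lower canonical half on x86-64, which may differ from the kernel's actual definition. The paging constants (4 KiB pages, 512 entries per level) are architectural.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x86-64 4-level paging constants (architectural). */
#define PGSHIFT        12                               /* 4 KiB pages */
#define BITS_PER_LEVEL 9                                /* 512 entries per table */
#define NPTENTRIES     (1u << BITS_PER_LEVEL)
#define PML4_SHIFT     (PGSHIFT + 3 * BITS_PER_LEVEL)   /* 39 */
#define PML4_REACH     (1ULL << PML4_SHIFT)             /* bytes covered by one PML4 entry */

/* ASSUMPTION: illustrative ULIM, the top of the lower canonical half.
 * The real kernel constant may be different. */
#define SKETCH_ULIM    0x0000800000000000ULL

/* Hypothetical stand-in for a raw page table entry. */
typedef uint64_t sketch_pte_t;

/* Index of the PML4 slot that translates virtual address va. */
unsigned pml4_index(uint64_t va)
{
	return (unsigned)((va >> PML4_SHIFT) & (NPTENTRIES - 1));
}

/* Scan the PML4 slots that cover addresses below ulim.  Returns the index
 * of the first non-zero slot, or -1 if all of them are empty.  A non-zero
 * slot here would point at a PML3 shared by every process that copies
 * this top-level table. */
int first_user_entry(const sketch_pte_t *pml4, uint64_t ulim)
{
	/* Round up so a partially covered slot is also checked. */
	uint64_t nr_user = (ulim + PML4_REACH - 1) >> PML4_SHIFT;

	for (uint64_t i = 0; i < nr_user && i < NPTENTRIES; i++) {
		if (pml4[i] != 0)
			return (int)i;
	}
	return -1;
}
```

A boot-time sanity check could then be as simple as `assert(first_user_entry(pgdir, SKETCH_ULIM) == -1);`. With the assumed ULIM, slots 0 through 255 are the user half, and a single mapping at virtual address 0 populates slot 0, whose PML3 covers the first `PML4_REACH` bytes (512 GB) of every address space that shares it.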
Akaros is an open source, GPL-licensed operating system for manycore architectures. Its goal is to provide better support for parallel and high-performance applications in the datacenter. Unlike traditional OSs, which limit access to certain resources (such as cores), Akaros provides native support for application-directed resource management and 100% isolation from other jobs running on the system.
Although not yet integrated as such, it is designed to operate as a low-level node OS with a higher-level Cluster OS, such as Mesos, governing how resources are shared amongst applications running on each node. Its system call API and “Many Core Process” abstraction better match the requirements of a Cluster OS, eliminating many of the obstacles faced by other systems when trying to isolate simultaneously running processes. Moreover, Akaros’s resource provisioning interfaces allow for node-local decisions to be made that enforce the resource allocations set up by a Cluster OS. This can be used to simplify global allocation decisions, reduce network communication, and ultimately promote more efficient sharing of resources. There is limited support for such functionality on existing operating systems.
Akaros is still very young, but preliminary results show that processes running on Akaros have an order of magnitude less noise than on Linux, as well as fewer periodic signals, resulting in better CPU isolation. Additionally, its non-traditional threading model has been shown to outperform the Linux NPTL across a number of representative application workloads. This includes a 3.4x faster thread context switch time, competitive performance for the NAS parallel benchmark suite, and a 6% increase in throughput over nginx for a simple thread-based webserver we wrote. We are actively working on expanding Akaros's capabilities even further.
Visit us at akaros.org
Instructions on installation and getting started with Akaros can be found in GETTING_STARTED.md
Our current documentation is very lacking, but it is slowly getting better over time. Most documentation is typically available in the Documentation/ directory. However, many of these documents are outdated, and some general cleanup is definitely in order.
Send an email to akaros+subscribe@googlegroups.com.
Or visit our Google group and click “Join Group”.
Create a new issue here.
brho hangs out (usually alone) in #akaros on irc.freenode.net. The other devs may pop in every now and then.
Instructions on contributing can be found in Documentation/Contributing.md.
The Akaros repository contains a mix of code from different projects across a few top-level directories. The kernel is in kern/, userspace libraries are in user/, and a variety of tools can be found in tools/, including the toolchain.
The Akaros kernel is licensed under the GNU General Public License, version 2. Our kernel is made up of code from a number of other systems. Anything written for the Akaros kernel is licensed “GPLv2 or later”. However, other code, such as that from Linux and Plan 9, is licensed GPLv2, without the “or later” clause. There is also code from BSD, Xen, JOS, and Plan 9 derivatives. As a whole, the kernel is licensed GPLv2.
Note that the Plan 9 code that is a part of Akaros is also licensed under the Lucent Public License. The University of California, Berkeley, has been authorised by Alcatel-Lucent to release all Plan 9 software previously governed by the Lucent Public License, Version 1.02 under the GNU General Public License, Version 2. Akaros derives its Plan 9 code from this UCB release. For more information, see LICENSE-plan9 or here.
Our user code is likewise from a mix of sources. All code written for Akaros, such as user/parlib/, is licensed under the GNU LGPLv2.1, or later. Plan 9 libraries, including user/iplib and user/ndblib, are licensed under the LGPLv2.1, but without the “or later”. See each library for details.
Likewise, tools/ is a collection of various code. All of our contributions to existing code bases, such as GCC, glibc, and busybox, are licensed under their respective projects' licenses.