Kernel Traffic #266 For 1 Jul 2004

By Zack Brown

Table Of Contents

* Mailing List Stats For This Week
* Threads Covered

 1. 29 May 2004 - 10 Jun 2004 (24 posts) Linux 2.6.7-rc2 Released; Discussion Of 2.8 Compiler Requirements; Build System Confusion
 2. 2 Jun 2004 - 10 Jun 2004 (10 posts) New kcopyd Memory Copying Tool
 3. 3 Jun 2004 - 14 Jun 2004 (24 posts) Linux 2.6.7-rc2-mm2 Released
 4. 4 Jun 2004 - 14 Jun 2004 (14 posts) Status Of JFFS2
 5. 4 Jun 2004 - 10 Jun 2004 (5 posts) NAPI Configuration Help Text
 6. 7 Jun 2004 - 14 Jun 2004 (9 posts) Another Shot At A Debian Build Target
 7. 8 Jun 2004 - 10 Jun 2004 (10 posts) NTFS Update Including Overwriting Resident Files
 8. 9 Jun 2004 - 11 Jun 2004 (38 posts) Linux 2.6.7-rc3-mm1 Released
 9. 14 Jun 2004 (12 posts) Linux 2.6.7-rc3-mm2 Released
10. 16 Jun 2004 (1 post) CONFIG_PREEMPT For PPC64

Mailing List Stats For This Week

We looked at 1547 posts in 8858K.

There were 467 different contributors. 241 posted more than once. 182 posted last week too.

The top posters of the week were:

* 46 posts in 306K by William Lee Irwin III
* 43 posts in 232K by Andrew Morton
* 42 posts in 227K by Bartlomiej Zolnierkiewicz
* 27 posts in 116K by Jens Axboe
* 27 posts in 95K by Christoph Hellwig

1. Linux 2.6.7-rc2 Released; Discussion Of 2.8 Compiler Requirements; Build System Confusion

29 May 2004 - 10 Jun 2004 (24 posts)
Archive Link: "Linux 2.6.7-rc2"
Topics: FS: CIFS, FS: XFS, Kernel Release Announcement, Networking, Sound: ALSA, USB
People: Linus Torvalds, Peter Osterlund, Albert Cahalan, Thomas Zehetbauer, Denis Vlasenko, Bill Davidsen, Geert Uytterhoeven, Andrew Morton

Linus Torvalds announced Linux 2.6.7-rc2, saying:

    An ALSA update, and tons of sparse type-fixes from Al Viro. Some even uncovered real bugs in user pointer usage.

    USB, firewire, network driver, XFS, CIFS updates.

One change included a 'make checkstack' build target from Andrew Morton; and Peter Osterlund used it to identify some heavy stack usage in certain areas. After hunting around, he identified the problem, and explained, "it looks like gcc is building a temporary structure on the stack and then copies the whole thing to *wdata." Albert Cahalan replied, "This would be required because of the -Wno-strict-aliasing option. For the 2.7.xx kernels, how about we start off by replacing -Wno-strict-aliasing with -std=gnu99 ? It's been 5 years since 1999. The "restrict" keyword is useful too." But Linus said:

    No can do. Aliasing in gcc is so broken (_purely_ type-based and no way to avoid it sanely in older versions) that it's not going to happen. When we can depend on everybody having gcc-3.3+ something, and that one properly supports the "may_alias" attribute, we may change that.

    "restrict" is pretty much useless. It just weakens the already too-weak alias rules of standard gcc.

Albert replied, "By the time Linux 2.8.0 is out, gcc-3.3+ should be a perfectly reasonable requirement."

Elsewhere, along a different train of thought, Thomas Zehetbauer noticed that "Make oldconfig silently disabled support for my CONFIG_TIGON3 NIC. It seems that it depends on CONFIG_NET_GIGE which in turn depends on CONFIG_NET_ETHERNET which was not required in 2.6.6 kernel." Denis Vlasenko replied, "Many days ago I read on lkml that separating 10,100 and 1000 Mbit ethernet is not really justified. There are devices which have 100 and 1000 variants. Just keeping all ethernet devices in one menu sounds sane to me." And Bill Davidsen said:

    There are other issues with the build process, when a driver supports a chipset used in several products there's no reasonable way to find out which driver should be used, and as you say the split of speed makes less and less sense, and will just get worse when 10Ge is more common.

    The solution may be an external table, program, or whatever, since the situation changes as drivers are modified to support new models, chipsets move to new vendors, etc. But it would be *really nice* to find the 3c940 with 3COM drivers, instead of grepping driver source and looking at spec sheets to find out that the driver is called something like sk98lin, it's in an unobvious place and has a name unrelated to 3COM.

    Here's a suggestion if someone wants to do something about this, like LDP. Produce a CSV list of vendor name, like 3c940, name used for config in the menu, module name and symbol in the .config file. Would let users find things a lot faster, and could be used with grep as well as some spreadsheet tool.

Geert Uytterhoeven also remarked on this, "Another problem with the sk98lin driver is that it hasn't yet been converted to the new driver model. Even when booting Knoppix, you have to manually modprobe sk98lin to use the 3c940. Took me a while to find out..."

2. New kcopyd Memory Copying Tool

2 Jun 2004 - 10 Jun 2004 (10 posts)
Archive Link: "[PATCH] 2/5: Device-mapper: kcopyd"
People: Alasdair G. Kergon, Kevin Corry, Andrew Morton

Alasdair G. Kergon posted a patch to implement something called kcopyd, but offered no explanation; after inquiries from several folks including Andrew Morton, he explained that it was "A daemon for copying regions of block devices around in an efficient manner. Multiple destinations can be specified for a copy. Designed to perform well both with many small chunks or few large chunks." Kevin Corry added that it would also have asynchronous completion notification.

3. Linux 2.6.7-rc2-mm2 Released

3 Jun 2004 - 14 Jun 2004 (24 posts)
Archive Link: "2.6.7-rc2-mm2"
Topics: Device Mapper, Framebuffer, Kernel Release Announcement
People: Andrew Morton

Andrew Morton announced Linux 2.6.7-rc2-mm2, saying:

    ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.7-rc2/2.6.7-rc2-mm2/

    * Huge update to the SiS framebuffer driver. Please test.
    * As soon as I merged Andrey's big dmi cleanup patches everyone started madly patching dmi_scan.c. The subsequent reject storm forced me to drop them.
    * Big devicemapper update - new feature work.
    * Various fixes, some quite serious.

4. Status Of JFFS2

4 Jun 2004 - 14 Jun 2004 (14 posts)
Archive Link: "jff2 filesystem in vanilla"
Topics: FS: FAT, FS: ReiserFS, FS: ext2, FS: ext3, Version Control
People: Daniel Egger, David Woodhouse

Michal Semler wanted to use JFFS2, and asked where to find patches for 2.4 and 2.6; Daniel Egger replied:

    JFFS2 is included in the standard kernels IIRC, however I'd recommend using the CVS version from the official repository as there are huge improvements in there.

    To use it on a non-MTD (Memory technology devices, e.g. flash chips soldered on some board with direct access) device you will need an emulation layer, the pseudo Block-MTD device. And you will need some additional partition using ext2/ext3/reiserfs/FAT containing the kernel for your Grub/LILO bootloader.

David Woodhouse added:

    JFFS2 in the 2.4 kernel is an old stable branch. The code in 2.6 and in CVS is much faster to mount, especially, and it also supports NAND flash.

    Linus' tree is updated periodically when I'm sufficiently happy with the stability of the development tree in CVS, and when I have time to merge it, test it and read through all the changes for sanity -- which often involves redoing some of them.

    You should be OK using what's in the kernel -- let me know if you have problems.

Daniel remarked, "The original version in the 2.4 kernel has a dramatic problem leading to FS corruption, at least when used with blkmtd on CF. That's why I'm using 2.4 and a CVS snapshot, not only because it is much faster." David asked for a more specific bug report, but Daniel said unfortunately he couldn't. He did explain in general, however, that "We had misterious kernel oopses on bootup in changing places in the source that appeared and vanished at will. At first I had broken memory in mind but this wasn't the case. My next guess was that the log checking (the looong version) might temporarily overheat the passively cooled CPU but we could scrap that possibility as well after reproducing the problem in a very cool environment. After the tedious upgrade to the CVS version, everything works like a charm and is now in never-touch-a-running system mode."

5. NAPI Configuration Help Text

4 Jun 2004 - 10 Jun 2004 (5 posts)
Archive Link: "[2.6 patch] add NAPI help texts"
People: David S. Miller, Adrian Bunk, Jeff Garzik, Andrew Morton, Linus Torvalds

Adrian Bunk added some help text to the NAPI configuration option:

    NAPI is a new driver API designed to reduce CPU and interrupt load when the driver is receiving lots of packets from the card. It is still somewhat experimental and thus not yet enabled by default.

    If your estimated Rx load is 10kpps or more, or if the card will be deployed on potentially unfriendly networks (e.g. in a firewall), then say Y here.

    See for more information.

    If in doubt, say N.

The patch looked good to David S. Miller and Jeff Garzik, and they applied it to the 2.6 patch queue for Linus Torvalds and Andrew Morton.

6. Another Shot At A Debian Build Target

7 Jun 2004 - 14 Jun 2004 (9 posts)
Archive Link: "kbuild make deb patch"
Topics: Kernel Build System
People: Wichert Akkerman, David Vrabel, Flavio Stanchina, Sam Ravnborg

See Issue #238, Section #3 (11 Oct 2003: Makefile .deb Target) for the earlier discussion of this.
This time, Wichert Akkerman said:

    I originally posted this before 2.6.0 was out and was told to wait until things have stabilized a bit. At least from my point of view that has happened by now so I'm bringing this one up again.

    kbuild has had a rpm make target for some time now. Since the concept of kernel packages is quite convenient I added a deb target as well, using the patch below. Since I'm (still) not familiar with kbuild Makefile bits are quite rough, but they Work For Me(Tm).

David Vrabel asked, "Why this and not the make-kpkg utility in Debian's kernel-package package?" And Wichert replied:

    Several reasons:

    1. it works on non-Debian systems which use dpkg
    2. it is a *lot* simpler and faster than make-kpkg

Elsewhere, Flavio Stanchina remarked, "I like the idea a lot, but your patch to the makefile touches quite a few things in the clean target that AFAICT are not related to the deb target in any way. Perhaps you are diffing from an older tree?" Wichert smacked himself on the head and agreed that yes, he'd inadvertently included some extraneous work in the patch. He posted an updated patch.

Elsewhere, Sam Ravnborg also said to Wichert:

    I'm in progress of doing some infrastructure work to better support building different packages. I have requests for .tar.gz, tar.gz2 as well as deb.

    I hope to post a few patches later this week. I will include your script in the patch-set then.

7. NTFS Update Including Overwriting Resident Files

8 Jun 2004 - 10 Jun 2004 (10 posts)
Archive Link: "[2.6.7-BK] NTFS 2.1.13 - Enable overwriting of resident files and housekeeping of system files."
Topics: FS: NTFS, Virtual Memory
People: Anton Altaparmakov

Anton Altaparmakov announced "The next NTFS release. This one is a milestone in that it finally allows people to overwrite resident, i.e. very small, files, too. As a bonus we also do all the necessary housekeeping of the NTFS system files to ensure data integrity and hence ntfsfix is no longer needed to be run after unmounting."

Among the patches, the following were included:

Patch 1: "Implement writing of mft records (fs/ntfs/mft.[hc]), which includes keeping the mft mirror in sync with the mft when mirrored mft records are written. The functions are write_mft_record{,_nolock}(). The implementation is quite rudimentary for now with lots of things not implemented yet but I am not sure any of them can actually occur so I will wait for people to hit each one and only then implement it."

Patch 2: "Commit open system inodes at umount time. This should make it virtually impossible for sync_mft_mirror_umount() to ever be needed."

Patch 3: "Implement ->write_inode (fs/ntfs/inode.c::ntfs_write_inode()) for the ntfs super operations. This gives us inode writing via the VFS inode dirty code paths. Note: Access time updates are not implemented yet."

Patch 4:

    * Implement fs/ntfs/mft.[hc]::{,__}mark_mft_record_dirty() and make fs/ntfs/aops.c::ntfs_writepage() and ntfs_commit_write() use it, thus finally enabling resident file overwrite! (-8 This also includes a placeholder for ->writepage (ntfs_mft_writepage()), which for now just redirties the page and returns. Also, at umount time, we for now throw away all mft data page cache pages after the last call to ntfs_commit_inode() in the hope that all inodes will have been written out by then and hence no dirty (meta)data will be lost. We also check for this case and emit an error message telling the user to run chkdsk.
    * If the user is trying to enable (dir)atime updates, warn about the fact that we are disabling them.

Patch 5: "Use set_page_writeback()/end_page_writeback() in ntfs_writepage() resident attribute write code path as otherwise the radix-tree tag PAGECACHE_TAG_DIRTY remains set even though the page is clean."

Patch 6: "Implement ntfs_mft_writepage() so it now checks if any of the mft records in the page are dirty and if so redirties the page and returns.
Otherwise it just returns (after doing set_page_writeback(), unlock_page(), end_page_writeback() or the radix-tree tag PAGECACHE_TAG_DIRTY remains set even though the page is clean), thus allowing the VM to do with the page as it pleases. Also, at umount time, now only throw away dirty mft (meta)data pages if dirty inodes are present and ask the user to email us if they see this happening."

Patch 7: "Add functions ntfs_{clear,set}_volume_flags(), to modify the volume information flags (fs/ntfs/super.c)."

And patch 8:

    * Enable overwriting of resident files and housekeeping of system files.
    * Mark the volume dirty when (re)mounting read-write and mark it clean when unmounting or remounting read-only. If any volume errors are found, the volume is left marked dirty to force chkdsk to run.
    * Add code to set the NT4 compatibility flag when (re)mounting read-write for newer NTFS versions but leave it commented out for now since we do not make any modifications that are NTFS 1.2 specific yet and since setting this flag breaks Captive-NTFS which is not nice. This code must be enabled once we start writing NTFS 1.2 specific changes otherwise Windows NTFS driver might crash / cause corruption.
    * Fix a silly bug that caused a deadlock in ntfs_mft_writepage(). For inode 0, i.e. $MFT itself, we cannot use ilookup5() from there because the inode is already locked by the kernel (fs/fs-writeback.c::__sync_single_inode()) and ilookup5() waits until the inode is unlocked before returning it and it never gets unlocked because ntfs_mft_writepage() never returns. )-: Fortunately, we have inode 0 pinned in icache for the duration of the mount so we can access it directly.

8. Linux 2.6.7-rc3-mm1 Released

9 Jun 2004 - 11 Jun 2004 (38 posts)
Archive Link: "2.6.7-rc3-mm1"
Topics: Kernel Release Announcement
People: Andrew Morton

Andrew Morton announced Linux 2.6.7-rc3-mm1, saying:

    ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.7-rc3/2.6.7-rc3-mm1/

    * Included the dreaded cpumask rework.
    * Lots of little fixes.
    * Added support for the NX (no execute) pagetable flag on ia32.

9. Linux 2.6.7-rc3-mm2 Released

14 Jun 2004 (12 posts)
Archive Link: "2.6.7-rc3-mm2"
Topics: FS: ext2, FS: ext3, Kernel Release Announcement
People: Andrew Morton

Andrew Morton announced Linux 2.6.7-rc3-mm2, saying:

    ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.7-rc3/2.6.7-rc3-mm2/

    * Mainly lots of little fixes.
    * Added the ext3 online-resize patch. See http://sourceforge.net/projects/ext2resize/ for some details. Needs a bit of work, and documentation.

10. CONFIG_PREEMPT For PPC64

16 Jun 2004 (1 post)
Archive Link: "[PATCH] Implement CONFIG_PREEMPT for PPC64"
People: Paul Mackerras

Paul Mackerras said, "This patch implements CONFIG_PREEMPT for ppc64. Aside from the entry.S changes to check the _TIF_NEED_RESCHED bit when returning from an exception, there are various changes to make the ppc64-specific code preempt-safe, mostly adding preempt_enable/disable or get_cpu/put_cpu calls where needed. I have been using this on my desktop G5 for the last week without problems."

Sharon And Joy

Kernel Traffic is grateful to be developed on a computer donated by Professor Greg Benson and Professor Allan Cruse in the Department of Computer Science at the University of San Francisco. This is the same department that invented FlashMob Computing. Kernel Traffic is hosted by the generous folks at kernel.org. All pages on this site are copyright their original authors, and distributed under the terms of the GNU General Public License version 2.0.