Fixing my annoying kernel bug(s) – Part 2

Published May 01 2012 under linux

This blog entry details the fix for some of the problems outlined in these posts.

This is a lengthy technical post:
TL;DR
There is a detailed manual for building a Debian kernel at http://users.wowway.com/~zlinuxman/Kernel.htm. I was familiar with much of the content already but it was still a very helpful reference; for example, using the ‘src’ group to avoid root was a useful thing to learn.

The information on patching the Kernel for Phenom was cobbled together from various websites including the Gentoo forums.

My system is currently built from Debian squeeze with a bunch of packages from various other repositories including the Debian backports and packages I manually backported from Wheezy (testing).

I could have applied the necessary patch to this kernel but I decided at the same time to have another go at getting to the latest 3-series kernel.

For a long time I was stuck on a 2.6.39 kernel because I couldn’t build a later kernel package from the Debian sources in testing. I could have built from the kernel.org sources, but I prefer to maintain my system with .deb packages wherever possible. In the intervening year, however, 3.2 was released in backports, which saved me a lot of potential problems.

So I upgraded my kernel and applied the patch. Here is Yet Another Tutorial on building a kernel the ‘Debian Way’. This will yield a .deb package that you can install without clobbering any other kernels.

Prerequisites

  • Install various prerequisites – these will vary depending on your system.
    A fresh system will need many others; I needed these for LZ compression and for ‘make xconfig’
  • You need to have the Debian backports in your APT sources.list file.

    Having added this, do a sudo apt-get update.
  • The ideal method these days is to do most of the work without dropping to root or using sudo. To achieve this, add your account to the ‘src’ group and set up permissions accordingly.

    At this point you will need to log out and log in (although you could ssh back in as yourself, and I read somewhere recently that this may not be strictly necessary with the right ‘magic’ incantations any more…)
  • I like to experiment with virtualisation so along the way I downloaded a patch that may be necessary for this from http://users.wowway.com/~zlinuxman/kernel-package/linuxv3.diff (This is also an attachment to this post)
    Apply like:
  • I also made my own patches to do an optimized build for my AMD64 Phenom:
    File phenom_1.patch:

    File phenom_2.patch:

    File phenom_3.patch:
  • And of course, the patch to fix my Firewire subsystem crash as described in the previous post:
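
A minimal sketch of how the backports and ‘src’ group prerequisites above could be checked before starting. The helper names has_backports and in_group are made up for illustration, and the sources.list line and permission commands in the comments are assumptions about a squeeze-era setup rather than the original listings, so verify them against your own system.

```shell
#!/bin/sh
# Pre-flight checks for the prerequisites above. Function names and the
# default file path are illustrative assumptions, not from the original post.

# Succeeds if FILE (default /etc/apt/sources.list) already contains an
# active (uncommented) squeeze-backports entry.
has_backports() {
    grep -Eq '^[[:space:]]*deb[[:space:]].*squeeze-backports' \
        "${1:-/etc/apt/sources.list}"
}

# Succeeds if USER is a member of GROUP (default: the 'src' group).
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx "${2:-src}"
}

# Typical one-off setup, run as root (deliberately left commented out):
#   echo 'deb http://backports.debian.org/debian-backports squeeze-backports main' \
#       >> /etc/apt/sources.list
#   apt-get update
#   adduser "$USER" src          # group change takes effect at next login
#   chgrp src /usr/src && chmod 2775 /usr/src
```

For example, `has_backports || echo "add the backports line first"` and `in_group "$USER" || echo "not in src yet"` give a quick sanity check before unpacking anything under /usr/src.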

Procedure

  1. Install the Debian source package and unpack the tree:
  2. Note – it turns out that this is in fact a 3.2.9 kernel. For some reason the Debian version is 3.2.4-1~bpo60+1 ; go figure…

  3. Apply patches (this assumes the patch files are in /usr/src):
  4. Configure the kernel build:
    I started by copying the default config from the binary backports kernel, and tweaking it for my own purposes (not shown here)
  5. Finally, build the Debian packages.

    Here, CONCURRENCY_LEVEL=4 sets the number of concurrent make processes used, for taking advantage of a multi-core system.
    From the above settings, the actual package will become ‘linux-image-3.2.9-xxx-preempt-amd64’ with a Debian version of ‘1~yyy.00.00’ and cat /proc/version output of ‘3.2.9-xxx-preempt-amd64’.
    This mechanism lets you keep several ‘flavours’ of a kernel installed side by side, each still upgradable within its own flavour.
  6. Installation:

    This should also trigger any DKMS modules to rebuild if present.
    My NVidia 280.13 driver rebuilds fine with this version.
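
The procedure above can be sketched as a dry-run script. This is a reconstruction under assumptions, not the original command listing: the function and variable names are invented, the tarball name and the squeeze-backports target are what I would expect for the linux-source-3.2 package, and the flavour and revision strings are the placeholders quoted in step 5. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Dry-run sketch of the build procedure. All names here are illustrative
# assumptions. Set DRY_RUN=0 to execute the commands instead of printing them.
DRY_RUN="${DRY_RUN:-1}"
TREE=/usr/src/linux-source-3.2
FLAVOUR='-xxx-preempt-amd64'     # placeholder flavour suffix from the post
REVISION='1~yyy.00.00'           # placeholder Debian revision from the post
JOBS=4                           # number of concurrent make processes

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

fetch_source() {      # step 1: install the source package and unpack the tree
    run sudo apt-get -t squeeze-backports install linux-source-3.2
    run tar -xjf /usr/src/linux-source-3.2.tar.bz2 -C /usr/src
}

apply_patches() {     # step 3: apply the named patch files from /usr/src
    for p in "$@"; do
        run patch -d "$TREE" -p1 -i "/usr/src/$p"
    done
}

build_packages() {    # step 5: assumes step 4 left a .config in the tree
    run cd "$TREE"
    run env CONCURRENCY_LEVEL="$JOBS" make-kpkg --rootcmd fakeroot --initrd \
        --append-to-version "$FLAVOUR" --revision "$REVISION" \
        kernel_image kernel_headers
}

install_packages() {  # step 6: make-kpkg leaves the .deb beside the tree
    run sudo dpkg -i "/usr/src/linux-image-3.2.9${FLAVOUR}_${REVISION}_amd64.deb"
}
```

Calling `apply_patches phenom_1.patch phenom_2.patch phenom_3.patch` followed by `build_packages` prints the patch and make-kpkg invocations, which makes it easy to see how the flavour suffix and revision end up in the package file name.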

Testing

Of course the proof is in the pudding.

After rebooting, I repeated the sequence necessary to trigger the fault, and it did not recur. Woot!
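
One way to confirm that the fault has not recurred is to search the kernel log for the warning text. The sketch below is an assumption about how one might check, not something from the original test run. The helper name bsg_warning_count is invented, and the search string is the ‘sysfs: can not remove 'bsg'’ message from the stack trace in Part 1.

```shell
#!/bin/sh
# Count occurrences of the sysfs 'bsg' removal warning in a kernel log.
# Reads stdin when called without arguments, or a single log file otherwise.
# The function name is an illustrative assumption.
bsg_warning_count() {
    # grep -c prints 0 but exits non-zero when nothing matches; mask that.
    grep -c "sysfs: can not remove 'bsg'" "$@" || true
}

# Examples:
#   dmesg | bsg_warning_count
#   bsg_warning_count /var/log/kern.log
```

A count of 0 after repeating the trigger sequence suggests the patched kernel no longer hits the warning.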

Attachments:
linuxv3.diff


Fixing my annoying kernel bug(s) – Part 1

Published May 01 2012 under linux

This blog entry details some of the problems outlined in this post.

Regularly enough to be almost annoying, I was getting a kernel fault popup (see the stack trace at the end of this post). For a long time this was not quite annoying enough to do something about, because the computer hadn’t crashed and there were no obvious side effects. Eventually, however, I realised that each time it occurred a new instance of my external backup drive was being mounted automagically, so, being a little cautious about potential data loss, I decided to try to get to the bottom of things.

After a few days taking notes and some experimentation I discovered the following:

  • It would happen with regularity after waking the computer up from suspend to RAM.
  • I could force it to happen by ‘ejecting’ the external backup drive.

Looking at the stack trace, I initially suspected it was something to do with firewire or ACPI (shudder). However, given the coincidence with removing the drive, it seemed in fact to be an issue somewhere in the SCSI subsystem. I then worked out that the following commands would always reproduce the problem:
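
The exact commands are not reproduced here. As a hedged sketch only: the stack trace below shows bash writing to a sysfs file and ending up in driver_unbind for the firewire_sbp2 device fw1.0, so a manual driver unbind through /sys is one plausible trigger. The helper name unbind_device and its optional sysfs-root argument are invented for illustration; the device name fw1.0 comes from the log.

```shell
#!/bin/sh
# Unbind a firewire unit device from whatever driver is bound to it by
# writing its name to the driver's 'unbind' file in sysfs.
# Inferred from the stack trace, not the author's original commands.
# Usage: unbind_device DEVICE [SYSFS_ROOT]   (SYSFS_ROOT defaults to /sys)
unbind_device() {
    dev="$1"
    sysfs="${2:-/sys}"
    # Each bound device has a 'driver' symlink pointing at its driver directory.
    unbind="$sysfs/bus/firewire/devices/$dev/driver/unbind"
    if [ ! -e "$unbind" ]; then
        echo "no driver bound to $dev" >&2
        return 1
    fi
    printf '%s' "$dev" > "$unbind"
}

# Example (as root, with the drive unmounted first):
#   unbind_device fw1.0
```

Going through the device's driver symlink avoids hard-coding the driver directory name, which I am not certain of for this kernel version.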

At this stage I stumbled over an almost identical stack trace in the lkml.org mailing list archive, which luckily short-circuited my experimentation. Learning about /sys and SCSI device manipulation is kind of useful, maybe, but I had a lot of other things to do as well.

The patch for the problem is documented at https://lkml.org/lkml/2012/2/8/246.

The next stage was working out how to apply it to my system. This is described in the next blog post.

TL;DR

The offending stack trace:

Mar 16 22:44:47 atlantis3 kernel: [ 2020.140704] sd 15:0:0:0: [sdh] Stopping disk
Mar 16 22:44:48 atlantis3 kernel: [ 2021.495923] firewire_sbp2: released fw1.0, target 15:0:0
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849206] ------------[ cut here ]------------
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849232] WARNING: at /build/buildd-linux-2.6_3.2.4-1~bpo60+1-amd64-Ns0wYl/linux-2.6-3.2.4/debian/build/source_amd64_none/fs/sysfs/inode.c:323 sysfs_hash_and_remove+0x30/0x8b()
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849242] Hardware name: To be filled by O.E.M.
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849248] sysfs: can not remove 'bsg', no directory
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849253] Modules linked in: nls_utf8 nls_cp437 vfat fat rfcomm bridge stp bnep speedstep_lib cpufreq_powersave cpufreq_userspace powernow_k8 ppdev cpufreq_stats lp mperf cpufreq_conservative nfsd lockd nfs_acl auth_rpcgss sunrpc kvm_amd kvm binfmt_misc ext3 jbd fuse ext2 it87 hwmon_vid loop btusb joydev bluetooth rfkill usbhid hid snd_usb_audio snd_usbmidi_lib cx22702 cx88_dvb cx88_vp3054_i2c videobuf_dvb dvb_core rc_winfast tuner_simple tuner_types tda9887 ir_lirc_codec lirc_dev ir_mce_kbd_decoder tda8290 snd_hda_codec_realtek firewire_sbp2 ir_sony_decoder snd_hda_intel snd_hda_codec ir_jvc_decoder ir_rc6_decoder snd_hwdep tuner ir_rc5_decoder cx8800 cx88_alsa ir_nec_decoder snd_pcm_oss snd_mixer_oss cx8802 cx88xx rc_core i2c_algo_bit tveeprom snd_pcm gspca_ov519 gspca_main v4l2_common snd_seq_midi videodev snd_rawmidi snd_seq_midi_event media snd_seq usblp v4l2_compat_ioctl32 videobuf_dma_sg snd_timer snd_seq_device videobuf_core btcx_risc sp5100_tco k10temp edac_core parpo
Mar 16 22:44:51 atlantis3 kernel: rt_pc parport snd i2c_piix4 tpm_tis tpm edac_mce_amd i2c_core tpm_bios soundcore processor evdev pcspkr thermal_sys mxm_wmi wmi snd_page_alloc button ext4 mbcache jbd2 crc16 dm_mod nbd btrfs zlib_deflate crc32c libcrc32c usb_storage uas sg sr_mod cdrom sd_mod crc_t10dif ata_generic ohci_hcd ehci_hcd firewire_ohci firewire_core crc_itu_t pata_jmicron ahci libahci libata xhci_hcd r8169 mii scsi_mod usbcore usb_common [last unloaded: scsi_wait_scan]
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849487] Pid: 9293, comm: bash Not tainted 3.2.0-0.bpo.1-amd64 #1
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849493] Call Trace:
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849507]  [] ? warn_slowpath_common+0x78/0x8c
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849517]  [] ? warn_slowpath_fmt+0x45/0x4a
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849527]  [] ? sysfs_hash_and_remove+0x30/0x8b
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849538]  [] ? kobject_get+0x12/0x17
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849547]  [] ? mutex_lock+0xd/0x2c
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849555]  [] ? bsg_unregister_queue+0x3f/0x78
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849587]  [] ? __scsi_remove_device+0x34/0xb7 [scsi_mod]
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849613]  [] ? scsi_remove_device+0x20/0x2b [scsi_mod]
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849627]  [] ? sbp2_remove+0x77/0x138 [firewire_sbp2]
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849639]  [] ? __device_release_driver+0x7f/0xca
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849648]  [] ? device_release_driver+0x1d/0x28
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849665]  [] ? driver_unbind+0x56/0x8b
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849674]  [] ? sysfs_write_file+0xe0/0x11c
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849682]  [] ? vfs_write+0xa4/0xff
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849690]  [] ? sys_write+0x45/0x6e
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849699]  [] ? system_call_fastpath+0x16/0x1b
Mar 16 22:44:51 atlantis3 kernel: [ 2023.849706] ---[ end trace d8b356d84e0828d4 ]---
Mar 16 22:44:51 atlantis3 kernel: [ 2024.653272] firewire_sbp2: released fw1.1, target 16:0:0
Mar 16 22:46:06 atlantis3 kerneloops: Submitted 1 kernel oopses to www.kerneloops.org


Patching and Building a custom Linux Kernel in Debian

Published Apr 10 2012 under linux

These posts cover a topic which seems to be documented to varying degrees across the net, but nothing quite exactly matched what I wanted to do. In the end this is a result of multiple sources of information and inspiration (and perspiration…)

For some time I had been getting a kernel fault report popup with irritating regularity. I eventually isolated it to something going wrong with my external Firewire drive after my computer resumed from suspend (specifically Suspend to RAM).
Chasing this down required working through the following tasks:

  1. Disabling the proprietary NVidia driver and activating ‘nv’ (I was unable to configure nouveau to work with my particular dual head configuration), so that my kernel was no longer ‘TAINTED’, which would have led me into a brick wall if I had been required to report a kernel bug.
  2. Consistently replicating the fault, which included learning about a bunch of stuff in the Linux /sys filesystem.
  3. Finally getting a 3-series kernel to work on Debian Squeeze – it turns out 3.2 has by now been packaged into Debian backports, which gets me past an earlier roadblock with kernel upgrades. Upgrading to the latest kernel would establish whether the problem had already been resolved (which it had not, at least as of 3.2.9).
  4. Rebuilding the kernel from source (something I have done many times before, but it doesn’t hurt to recap) and applying the patches needed.
  5. Re-enabling NVidia – which involved verifying my DKMS setup was still working.

I haven’t blogged recently due to various family mini-crises to do with pets, sickness and other issues, as well as extra busyness at work.

As it is getting late this post will conclude with the command line used to build and install my kernel, and I will expand on this in the next post.

Things to note:

  • The above will build a kernel using the same configuration as an installed Debian backports 3.2 kernel, assuming the backports kernel and source packages have been installed. There are no changes or patches yet.
  • Your user must be in the ‘src’ group for the make-kpkg command to work as-is.
  • The 3.2 kernel in backports (as of March 2012) was in fact version 3.2.9 although this is not indicated in the Debian version for some reason.
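
The first note (reusing the configuration of the installed backports kernel) can be sketched as below. This is an illustration under assumptions rather than the original command line: the helper name seed_config and its arguments are invented, and it relies on Debian kernel packages installing their configuration as /boot/config-<release>.

```shell
#!/bin/sh
# Copy the configuration of an installed kernel into a source tree as .config.
# Usage: seed_config TREE [RELEASE] [BOOT_DIR]
#   RELEASE defaults to the running kernel (uname -r), BOOT_DIR to /boot.
# The function name and argument order are illustrative assumptions.
seed_config() {
    tree="$1"
    release="${2:-$(uname -r)}"
    boot="${3:-/boot}"
    src="$boot/config-$release"
    if [ ! -r "$src" ]; then
        echo "missing $src" >&2
        return 1
    fi
    cp "$src" "$tree/.config"
}

# Example, using the backports release string seen in the Part 1 stack trace:
#   seed_config /usr/src/linux-source-3.2 3.2.0-0.bpo.1-amd64
#   cd /usr/src/linux-source-3.2 && make oldconfig
```

Running make oldconfig afterwards lets the build system ask about any options that are new relative to the copied configuration.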


Debian Squeeze and Linux Kernel 3

Published Dec 27 2011 under linux

A couple of months ago I splashed out and upgraded my motherboard to an ASUS Sabertooth 990FX, in theory to complement my quad core Athlon II 640 with faster RAM and SATA 3, more PCI-E slots etc. (but the camouflage colour scheme and mil-spec capacitors didn’t hurt either… :-) )

However, it wasn’t all smooth sailing…
