Thursday, September 15, 2016

OOM during rsync with USB drives

I had a (maybe/probably) dying external USB drive. It's part of a non-RAID LVM so I needed to grab everything off it before it died. I mounted a second external USB drive and started an rsync job to copy everything from the failing drive to the new one. After copying around 250GB the rsync would fail with an OOM (out of memory).

Subsequent attempts failed the same way, but much more quickly. Running rsync under ionice -c 3 (suggested in an ubuntuforums post) didn't help, and I wasn't using any of the options that make rsync's recursion non-incremental.

 I tried quite a few things (including --delete-during so that successive rsync attempts would have fewer files to keep track of) but finally found the solution. 

While rsync is running in one screen window, in a separate screen window I run: 


 
# run as root; 3 = drop the page cache plus dentries and inodes
while true
do
  echo 3 > /proc/sys/vm/drop_caches
  sleep 2
done
 


There may be interaction between the USB drivers (perhaps timing/slowness?) and the memory used by the buffer caches. Perhaps when rsync needs memory the kernel isn't able to free it fast enough and the oom killer steps in before the memory becomes available to give to rsync? 

In any case, with this drop_caches loop, rsync is now running happily for hours. I expect it'll actually finish. I don't mind the loss of caching. It's only for very large copies from USB drives and the copies are reading everything sequentially.

There won't be much of a benefit to caching. Perhaps I could use just "1" though instead of "3", so that we'd keep inodes and dentries in the cache. But the limiting factor here is really USB transfer rate and the transfer rate while dropping caches is just the same as the rate when not dropping caches (before oom).
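A minimal sketch of the same idea wrapped in a function, so the drop loop stops by itself when the copy finishes. The function name run_with_cache_drops and the variables DROP_FILE, DROP_MODE and INTERVAL are my own inventions for illustration. The defaults match the loop above, and it still needs root to write to /proc/sys/vm/drop_caches.

```shell
#!/bin/bash
# Hypothetical helper: run a long command and drop caches while it is alive.
# DROP_FILE, DROP_MODE and INTERVAL are assumed names; defaults match the post.
DROP_FILE="${DROP_FILE:-/proc/sys/vm/drop_caches}"
DROP_MODE="${DROP_MODE:-3}"     # 3 = page cache + dentries + inodes
INTERVAL="${INTERVAL:-2}"       # seconds between drops

run_with_cache_drops() {
  "$@" &                        # start the copy in the background
  local pid=$!
  # kill -0 sends no signal; it only tests whether the process still exists
  while kill -0 "$pid" 2>/dev/null
  do
    echo "$DROP_MODE" > "$DROP_FILE"
    sleep "$INTERVAL"
  done
  wait "$pid"                   # return the copy's own exit status
}
```

Usage would be something like run_with_cache_drops rsync -a /mnt/old/ /mnt/new/ (both paths are placeholders). Setting DROP_MODE=1 would be the way to try the "just 1" variant, which I haven't tested.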

Sunday, July 31, 2016

Building a mainline kernel for the odroid

Shuah Khan has a howto on building a 4.x mainline kernel for the odroid XU4.  That was very helpful when I built the mainline kernel for my odroid XU3.  I've also got an odroid U3 that I'll be upgrading soon.

I wrote a little shell script to let me do most things automatically (because I needed to rebuild the kernel a few times, adding features that I needed).

Mainly I needed ecryptfs support, iptables modules and some cgroups options for unprivileged lxc containers.  Also a little modification to set the default cpufreq governor to ondemand instead of performance.  I'll update this post at some point to document which additional options I enabled to get all of those things.

I've seen a few more minor patches elsewhere that I'll want to test out.  In particular I'll look for the Mali GPU driver config fix to set it to either not run at all or to run at its lowest energy level (because I don't use that at all, these odroids run headless).

A list of things that I've needed to enable (I'm adding items as I find missing items and rebuild the kernel iteratively :-):

ecryptfs support

  • CONFIG_OVERLAY_FS
  • CONFIG_ECRYPT_FS


tun device for openvpn

  • CONFIG_TUN

IO accounting stuff for iotop

  • CONFIG_TASKSTATS
  • CONFIG_TASK_DELAY_ACCT
  • CONFIG_TASK_XACCT
  • CONFIG_TASK_IO_ACCOUNTING


netfilter/iptables modules

  • CONFIG_NETFILTER_ADVANCED
  • most things in IP: Netfilter Configuration
  • most things in IPv6: Netfilter Configuration
  • most things in Core Netfilter Configuration


user namespaces for unprivileged lxc

  • CONFIG_USER_NS

  • I forget if anything else was disabled in namespaces support, I just enabled everything under there.

veth support for lxc

  • CONFIG_VETH
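Since I keep discovering missing options only after a rebuild, here is a small sketch for checking a .config before building. The function name missing_options is made up, and it assumes the usual .config format where an enabled option appears as CONFIG_FOO=y or CONFIG_FOO=m.

```shell
#!/bin/bash
# Hypothetical helper: print each listed option that is NOT enabled
# (neither =y nor =m) in the given kernel config file.
missing_options() {
  local config=$1; shift
  local opt
  for opt in "$@"
  do
    # disabled options show up as "# CONFIG_FOO is not set" or not at all
    grep -Eq "^${opt}=(y|m)$" "$config" || echo "$opt"
  done
}
```

For example: missing_options .config CONFIG_ECRYPT_FS CONFIG_TUN CONFIG_USER_NS CONFIG_VETH prints nothing when all four are enabled.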


#!/bin/bash

# optionally, make menuconfig and set options.
# particularly for the cgroups stuff for unprivileged lxc and ecryptfs

make menuconfig

#make exynos_defconfig

make prepare modules_prepare
time make -j 8 LOCALVERSION="-tiger" zImage modules dtbs

cp arch/arm/boot/dts/exynos5422-odroidxu3.dtb /media/boot/exynos5422-odroidxu3_next.dtb
cp arch/arm/boot/zImage /media/boot/zImage_next

make modules_install

apt-get install -y live-boot

# read the release string once instead of re-running cat on every line
KREL="$(cat include/config/kernel.release)"
cp .config "/boot/config-$KREL"
update-initramfs -c -k "$KREL"
mkimage -A arm -O linux -T ramdisk -C none -a 0 -e 0 -n uInitrd -d "/boot/initrd.img-$KREL" "/boot/uInitrd-$KREL"
cp "/boot/uInitrd-$KREL" /media/boot/
echo "$KREL"

echo "edit /media/boot/boot.ini"
echo "comment out the current setenv bootcmd and replace it with"

echo "setenv bootcmd \"fatload mmc 0:1 0x40008000 zImage_next; fatload mmc 0:1 0x42000000 uInitrd-`cat include/config/kernel.release`; fatload mmc 0:1 0x44000000 exynos5422-odroidxu3_next.dtb; bootz 0x40008000 0x42000000 0x44000000\""

Wednesday, May 18, 2016

x11vnc xauth

I've had problems lately starting x11vnc (to connect to :0.0) when I'm starting it over ssh.  I don't want x11vnc running all the time, so I don't start it when I'm at work.  But when I need it I'm *not* on the desktop in question, instead I'm remote.

I found the solution at ubuntuforums:

http://ubuntuforums.org/showthread.php?t=1314958

I do run screen on the remote desktop (my work desktop), and I generally start screen when X is already up, so *that* has the correct $XAUTHORITY entries.  So I can just connect to the running screen instance and do:

echo $XAUTHORITY


That prints something like this (example taken from the ubuntuforums link above):

/var/run/gdm/auth-for-fred-s0G6r1/database


and, from ssh, I can use that:

x11vnc -display :0 -auth /var/run/gdm/auth-for-fred-s0G6r1/database

I should probably just have .bashrc detect if there's a $XAUTHORITY variable and if there is, echo it out to a file that I can then just source :-).
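A rough sketch of that .bashrc idea. The cache file name (~/.xauthority-path via XAUTH_CACHE) and the function name save_xauthority are assumptions of mine, not anything x11vnc or gdm knows about. It only records the value when DISPLAY looks like a local display, so that an ssh login doesn't overwrite the good value.

```shell
# Hypothetical ~/.bashrc fragment: remember the X session's XAUTHORITY.
# XAUTH_CACHE is an assumed file name chosen for this sketch.
XAUTH_CACHE="${XAUTH_CACHE:-$HOME/.xauthority-path}"

save_xauthority() {
  # Record only when XAUTHORITY names a real file and DISPLAY is local (":0" etc.)
  if [ -n "${XAUTHORITY:-}" ] && [ -f "${XAUTHORITY:-}" ]
  then
    case "${DISPLAY:-}" in
      :*) printf 'export XAUTHORITY=%q\n' "$XAUTHORITY" > "$XAUTH_CACHE" ;;
    esac
  fi
}
save_xauthority
```

From ssh it would then be: . ~/.xauthority-path followed by x11vnc -display :0 -auth "$XAUTHORITY".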

Friday, April 15, 2016

btrfs COW breaks VirtualBox

btrfs copy on write breaks VirtualBox so that starting a VM will often not complete due to spurious disk errors.

I'm trying out the chattr fix that turns off COW.

I've just done this:

cd ~/VirtualBox\ VMs

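# NUL-delimited so image paths containing spaces survive
find . -name '*.vdi' -print0 | while IFS= read -r -d '' F
do
  T="$F.nocow"
  touch "$T"
  # +C (no COW) only takes effect on an empty file, so set it before copying
  chattr +C "$T"
  # replace the original only if the copy succeeded
  dd if="$F" of="$T" bs=1M && mv "$T" "$F"
done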

Here's hoping it all works :-).  It's faster than mkfs.ext4 and restore from backup :-).

Tuesday, March 01, 2016

Lightning floating timezone warning

For the longest time (because I'm lazy), every time thunderbird+lightning started it would show three dialog box warnings about unknown timezones being treated as the "floating" timezone.

This is easily fixed (in the sense that the warnings don't show anymore, and, given the nature of the fix, likely in the sense that timezone handling is now correct).


  1. Edit | Preferences | Advanced | Config Editor
  2. Click on the "I'll be careful" button.
  3. Search for the calendar.icaljs boolean preference.
  4. Set the value to true.

Tuesday, February 09, 2016

attic prune when using a single repository for multiple hosts

I use Attic Backup.  I'm considering switching to the BorgBackup fork, but that'll be for the future.  Maybe when BorgBackup gets pull backups implemented.

I have multiple computers (some servers, some desktops/laptops).  All of them back up to the same attic repository.  This is mainly to take advantage of deduplication.  Some files are sufficiently important to me (and disk space is now sufficiently large) that I keep the important files synced on ALL my machines, servers, laptops and desktops alike.

The disk savings are large, which is the obvious reason for having just one backup repository.  There are disadvantages too.  Attic locks the repository when doing any work on it (including just listing the repository or showing information about individual archives).  So I have to schedule backups so that they don't interfere with each other.

Backups of the servers run fast enough (after the first 20+hour full backup) now that I can just set up their schedules 2 hours apart and be confident that the second won't need to wait in the queue for the first.  And if it has to, it's no big deal.

The laptops don't have daily backups (mainly because I haven't figured out yet how to get anacron to start up a backup when the laptop wakes up from suspend AND hasn't done a backup yet for that day).  I'll probably just run those backups once a week.  Or maybe I'll figure out how to get anacron to do it.

I also use

attic prune -v -d 30 -w 5 -m 12 -y 100 [REPOSITORY]

to prune old backups.  What this does is keep 30 daily backups, 5 weekly backups and so on.  The problem with this command line is that if I ran backups for server1, server2, laptop1, and laptop2 on the same day then only one of those backups will be kept.  Attic doesn't know which backups were for which servers.  It just knows which backups were done on which day, so it chooses one backup for the day and prunes/deletes the others.

The solution is to use the -p parameter.  -p [PREFIX] tells attic prune that it should ignore all backups except the ones that start with [PREFIX].  My backup archives are named, e.g.,

server1-2016-02-09

so in the scheduled shell script that runs the backup, before attic create runs, first attic prune runs:

 attic prune -v -d 30 -w 5 -m 12 -y 100 -p server1 [REPOSITORY]

This will consider only archives whose names start with server1.  So archives for other servers and laptops are safe and are kept.  Only archives matching the pattern server1* are considered for pruning according to the rules given on the command line.
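Here is a sketch of what the scheduled per-host script could look like. The function name backup_host and the ATTIC variable are made up for illustration, and the repository and paths are placeholders. One deliberate difference from the command above: the sketch passes the prefix with the trailing dash (server1-), on the assumption that a bare server1 prefix would also match archives from a host named server10.

```shell
#!/bin/bash
# Hypothetical per-host backup job: prune this host's archives, then create today's.
# Set ATTIC="echo attic" for a dry run that only prints the commands.
ATTIC="${ATTIC:-attic}"

backup_host() {
  local repository=$1 host=$2; shift 2
  local archive="${host}-$(date +%Y-%m-%d)"   # e.g. server1-2016-02-09

  # -p restricts pruning to archive names starting with this host's prefix
  $ATTIC prune -v -d 30 -w 5 -m 12 -y 100 -p "${host}-" "$repository" &&
  $ATTIC create "${repository}::${archive}" "$@"
}
```

On server1 the cron job would call something like backup_host [REPOSITORY] server1 /etc /home, with the paths being whatever that host needs backed up.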

Thursday, January 21, 2016

Java8+Minecraft server prerequisites

I had to set up a minecraft server on a new VPS when the old VPS gave up the ghost.

The new VPS is still Trusty (14.04.3 now).

This setup is a bit better than previously since it's a Xen instance, has twice the RAM and I'm now able to run lxc (and, for minecraft, use an unprivileged lxc container). I had a backup of the minecraft server, so just copied that over. I then installed Oracle's Java 8 since that's what minecraft recommends.

sudo apt-get install software-properties-common
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
sudo apt-get install libxrender1 libxtst6 libxi6

The last line above is needed since there's no headless option for java8 and when it starts it just requires some shared objects from those packages so it can link against them even though it'll never use them (since this is a VPS, and so, headless).

minecraft is running again now.  We'll see whether the 2GHz Opterons on there are fast enough to keep up with the boys' redstone creations.

I am going to *HAVE* to set up an hourly backup regime though, with hourly rsync.  That's the price of going with the cheapest VPS providers out there: they might not survive more than 6 months or so, and I need to be able to restore things on the next VPS in the chain.

Thursday, January 07, 2016

Thunderbird corrupted panacea.dat causing slowness

I had major thunderbird slowness issues. Just scrolling the folder list (grabbing the folder list scrollbar and moving it, or moving the scrollwheel while in the list) would lead to no thunderbird screen updates for a second or two. Clicking on a mailbox did the same, and clicking on messages took even longer. Some discussion on https://bugzilla.mozilla.org/show_bug.cgi?id=794401 led me to remove panacea.dat. This could have been the wrong thing to do, since alta88 does say that folder pretty names might be lost. I'm on thunderbird 38.4.0 though, not the 12, 13, 14 and 15 versions discussed in the bug report. That may be why, after just removing the file, I didn't lose any pretty folder names. Removing the file certainly sped up thunderbird, so it's now usable and I won't need to switch to evolution :-)