Porting Android


Going to Vietnam

Sun, 02 Nov 2008 21:01:35

Well, for a change I will be flying for pleasure, not business (although 8 hours in economy is hardly pleasurable). Suzy and I will be taking time off in Vietnam, Cambodia and Singapore. I don't expect to be contactable or regularly checking e-mail for the next two weeks, so please don't expect a reply to any correspondence. For those still interested in talking at the Open Mobile Miniconf, please keep the submissions coming!

permalink

Android booting on Neo 1973

Sun, 02 Nov 2008 16:03:45

Well, it started almost a year ago, but I finally now have Android booting on my Neo 1973 phone:

[Screenshot: Android booting on the Neo 1973]

It ain't exactly running fast yet, and not everything is working 100%, but I think most of the tricky bits are done. I'm starting to push most of these changes back to the Android project. It seems that while I've been working on this, Sean McNeill has been having similar successes getting Android up on the latest Freerunner phones.

permalink

Android on ARMv4 (take 2)

Mon, 27 Oct 2008 21:36:32

So, my earlier post on this was a little premature; anyone who has tried out the code has found out that it pretty much doesn't work (hey, I did warn you!). Now there are a range of fun reasons why this didn't work, most of which I've now solved.

Firstly, it turns out that EABI and ARMv4T are pretty much incompatible. (I'll post separately about that!) In short, thumb interworking doesn't (can't) work, so I've reverted back to plain old ARMv4 architecture as my target (the only difference between ARMv4 and ARMv4T is the thumb stuff, which we can't use until the compiler / spec is fixed). So I've updated the linux-arm.mk to support ARMv4 for now as well.

Of course the next problem this introduces is that the bx instruction doesn't exist on ARMv4, and GCC (helpfully) complains and stops the compilation. Now, a bx without thumb support is simply a mov pc, instruction, so I went through and provided a BX macro that expands to either bx or mov pc,. This is a little bit nasty/invasive because it touches all the system call bindings; thankfully these are generated anyway, but it makes the diff quite large. (When I have time I'll make it so that generation is part of the build system, not a manual process.)

The next problem is that the provided compiler's libgcc library is built for ARMv5, and has instructions that just don't exist on ARMv4 (such as clz), so I went and built a new compiler targeted at ARMv4. There is no reason why this couldn't be set up as a multi-lib compiler that supports both, but I don't have enough GCC wizardry in me to work that out right now. So, a new compiler.

This got things to a booting stage, but not able to mount /system or /data. Basically, Android by default uses yet another flash file-system (YAFFS), but for some reason, which I couldn't fully work out initially, the filesystem just didn't seem to cleanly initialise and then mount. So, without diving too deep, I figured I could just use jffs2 instead, which I know works on the target. So I upgraded the Android build system to let you choose which filesystem type to use, and provided jffs2 as an option. This was going much better, and I got a lot further; far enough that I needed to recompile my kernel with support for some of the Android specific drivers like ashmem, binder and logger. Unfortunately I was getting a hang on an mmap call, for reasons that I couldn't quite work out. After a lot of tedious debugging (my serial console is broken, so I have to rely on the graphics console, which is really just an insane way to try and debug anything), it turns out that part of what the Dalvik virtual machine does when optimising class files is to mmap the file as writable memory. This was what was failing, with the totally useless error invalid argument. Do you know how many unique paths along the mmap system call can set EINVAL? Well, it's a lot. Anyway, long story short, it turns out that the jffs2 filesystem doesn't support writable mmaps! %&!#.

After I finished cursing, I decided to go back to using yaffs and work out what the real problem was. After upgrading u-boot (in a pointless attempt to fix my serial console), I noticed a new write yaffs[1] command. This wasn't there in the old version. OK, cool, maybe this has something to do with the problem. But what is the deal with yaffs versus yaffs1? Well, it turns out that NAND comes with different page sizes, 512 bytes and 2k (or multiples thereof, maybe??), and that YAFFS takes advantage of this and has different file systems for different sized NAND pages. And of course, since everything that can go wrong will, the filesystem image that the build system creates is YAFFS2, which is for 2k pages, not 512 byte pages. So, I again updated the build system to firstly build both the mkyaffs2image and the mkyaffsimage tools, and then set off building a YAFFS file system.
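For the record, building the 512-byte-page image with that tool is then just one command. This is a sketch from memory, so treat the exact arguments as an assumption and check the tool's usage output:

# build a YAFFS (512-byte page) image from the populated output directory;
# paths follow the usual Android out/ layout, arguments are from memory
mkyaffsimage out/target/product/generic/data userdata.img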

Now, while u-boot supports the yaffs filesystem, device firmware update (DFU) doesn't (appear to). So this means I need to copy the image to memory first, then on the device copy it from memory to flash. The other fun thing is that dfu can only copy 2MB or so to RAM at a time, and the system.img file is around 52MB or so, which means that it takes around 26 individual copies of 2MB sections.... very, very painful. But in the end this more or less worked. So now I have a 56MB partition for the system, and a 4MB partition for the user, and things are looking good.

Good, that is, right up until the point where Dalvik starts up and writes out cached versions of class files to /data. You see, it needs more than 4MB, a lot more, so I'm kind of back to square one. I mean, if I'd looked at the requirements I would have read 128MB of flash, but meh, who reads requirements? The obvious option would be some type of MMC card, but as it turns out the number of handy Fry's stores on a Boeing 747 from Sydney to LA numbers in the zeroes.

So the /system partition is read-only, and since the only problem with jffs2 was when we were writing to it, it seems that we could use jffs2 for the read-only system partition. This has the advantage that jffs2 does compression, so the image fits in about 30MB rather than about 50MB, leaving plenty of room for the user data partition, which is where the Dalvik cached files belong. It also has the advantage of being able to use normal DFU commands to install the image (yay!). So after more updates to the build system, to support individually setting the system filesystem type and the user filesystem type, things seem a lot happier.

Currently, I have a system that boots init, starts up most of the system services, including the Dalvik VM, and runs a bunch of code, but bombs out with an out-of-memory error in the pixelflinger code, which I'm yet to have any luck tracing. Currently my serial console is fubar, so I can't get any useful logging, which makes things doubly painful. The next step is to get adb working over USB so I at least have an output of the errors and warnings, which should give me half a chance of tracking down the problem.

So if you want to try and get up to this point, what are the steps? Well, firstly go and download the Android toolchain source code and compile it for a v4 target. You use the --target=armv4-android-eabi argument to configure, if I remember correctly.
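Something along these lines should do the job. This is only a sketch: the source directory name and install prefix are examples, and the exact configure options may differ slightly from what I used.

# build a toolchain targeting plain ARMv4 (out-of-tree build); the prefix
# matches the TARGET_TOOLS_PREFIX used in the build script further below
mkdir build-toolchain && cd build-toolchain
../android-toolchain/configure --target=armv4-android-eabi --prefix=/opt/benno
make
sudo make install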

Once you have that done, grab my latest patch and apply it to the Android source code base. (That is a tar file with diffs for each individual project; applying these correctly is left as an exercise for the reader.) Then you want to compile it with the new toolchain. I use a script like this:

#!/bin/sh
 
make TARGET_ARCH_VERSION=armv4 \
     MKJFFS2_CMD="ssh nirvana -x \"cd `pwd`; mkfs.jffs2\"" \
     SYSTEM_FSTYPE=jffs2 \
     USERDATA_FSTYPE=yaffs \
     TARGET_TOOLS_PREFIX=/opt/benno/bin/armv4-android-eabi- $@

Things you will need to change are the tools prefix and the mkjffs2 command. The evil hackery above is to run it on my Linux virtual machine (I'm compiling the rest under OS X, and I can't get mkfs.jffs2 to compile under it yet).

After some time passes you should end up with ramdisk.img, userdata.img and system.img files. The next step is to get a usable kernel.

I'm using the OpenMoko stable kernel, which is 2.6.24 based. I've patched this with bits of the Android kernel (enough, I think, to make it run). Make sure you configure support for yaffs, binder, logger and ashmem. Here is the kernel config I'm currently using.

At this stage it is important that you have a version of u-boot supporting the yaffs write commands; if you don't, your next step is to install that. After this, the next step is to re-partition your flash device. In case it isn't obvious, this will trash your current OS. The useful parts from my u-boot environment are:

mtdids=nand0=neo1973-nand
bootdelay=-1
mtdparts=mtdparts=neo1973-nand:256k(uboot)ro,16k(uboot-env),752k(ramdisk),2m(kernel),36m(system),24m(userdata)
rdaddr=0x35000000
kaddr=0x32000000
bootcmd=setenv bootargs ${bootargs_base} ${mtdparts} initrd=${rdaddr},${rdsize}; nand read.e ${kaddr} kernel; nand read.e ${rdaddr} ramdisk; bootm ${kaddr}
bootargs_base=root=/dev/ram rw console=tty0 loglevel=8

Note the mtdparts variable, which defines the partitions, and the bootcmd. (I'm not entirely happy with the boot command, mostly because when I install a new RAM image I need to manually update $rdsize, which is a pain.)
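For what it's worth, updating it is just a couple of commands at the u-boot prompt; the value below is only an example (use the byte size of your ramdisk.img, in hex):

setenv rdsize 0x16a000
saveenv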

With this in place you are ready to start. The first image to move across is your userdata image. Now to make this happen we first copy it into memory using dfu-util:

sudo dfu-util -a 0 -R -D source/out/target/product/generic/userdata.img

Then you need to use the nand write.yaffs1 command to copy it to the data partition. Note, at this stage I get weird behaviour; I'm not convinced that the yaffs support truly works yet! Afterwards I get some messed up data in other parts of the flash (which is why we are doing it first). After you have copied it in, I suggest resetting the device, and you may find you need to reinitialise u-boot (using dyngen, and setting up the environment again as above).
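For the record, the u-boot side of that copy looks roughly like the following. The RAM address and size are examples only (use whatever address the image was downloaded to, and the actual size of your userdata.img), and the exact argument order may vary between u-boot versions:

nand write.yaffs1 0x32000000 userdata 400000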

After this you are good to use dfu-util to copy across the kernel, system.img and ramdisk.img. After copying the ramdisk.img across, update the rdsize variable with the size of the ramdisk.
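In concrete terms that is just three more dfu-util invocations, along these lines. The alt-setting names are assumptions that follow the mtdparts partition names above, and the kernel image name is only an example; run dfu-util -l first to check what your device actually exposes:

sudo dfu-util -a kernel  -R -D uImage.bin
sudo dfu-util -a system  -R -D source/out/target/product/generic/system.img
sudo dfu-util -a ramdisk -R -D source/out/target/product/generic/ramdisk.img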

Once all this is done, you are good to boot. I wish you luck! If you have a working serial console you can probably try the logcat command to see why graphics aren't working. If you get this far please email me the results!
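For reference, from a shell on the device that is simply:

logcat
# or, to cut the noise down to warnings and errors only
logcat "*:W"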

permalink

Compiling the Android source code for ARMv4T

Thu, 23 Oct 2008 23:02:13

After a lot of stuffing around installing new hard drives so I had enough space to actually play with the source code, and getting screwed by Time Machine when trying to convert my filesystem from case-insensitive to case-sensitive (I gave up and am now using a case-sensitive disk image on top of my case-insensitive file system... sigh), I finally have the Android source code compiling, yay!
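(For future reference, making that disk image is a one-liner with hdiutil; the size and names below are only examples:)

hdiutil create -size 20g -type SPARSE -fs "Case-sensitive Journaled HFS+" -volname android android
hdiutil attach android.sparseimage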

Compiling is fairly trivial: just make and away it goes. The fun thing is trying to work out exactly what the hell the build system is actually doing. I've got to admit though, it is a pretty clean build system, although it isn't going to win any speed records. I'm going to go into more detail on the build system when I have more time, and when I've actually worked out what the hell is happening.

Anyway, after a few false starts I now have the build system compiling for ARMv4T processors (such as the one inside the Neo1973), and hopefully at the same time I haven't broken compilation for ARMv5TE.

For those interested I have a patch available. Simply apply this to the checked out code, and then build using make TARGET_ARCH_VERSION=armv4t. Now, of course I haven't actually tried to run this code yet, so it might not work, but it seems to compile fine, so that is a good start! Once I work out how to make git play nice I'll actually put this into a branch and make it available, but the diff will have to suffice for now. Of course I'm not the only one looking at this; check out Christopher's page for more information. (Where he actually starts solving some problems instead of just working around them ;)

The rest of this post documents the patch. For those interested it should give you some idea of the build system and layout, and hopefully it is something that can be applied to mainline.

The first changes made are to the linux-arm.mk file. A new make variable, TARGET_ARCH_VERSION, is added. For now this defaults to armv5te, but it can be overridden on the command line as shown above.

project build/
diff --git a/core/combo/linux-arm.mk b/core/combo/linux-arm.mk
index adb82d3..a43368f 100644
--- a/core/combo/linux-arm.mk
+++ b/core/combo/linux-arm.mk
@@ -7,6 +7,8 @@ $(combo_target)TOOLS_PREFIX := \
        prebuilt/$(HOST_PREBUILT_TAG)/toolchain/arm-eabi-4.2.1/bin/arm-eabi-
 endif
 
+TARGET_ARCH_VERSION ?= armv5te
+
 $(combo_target)CC := $($(combo_target)TOOLS_PREFIX)gcc$(HOST_EXECUTABLE_SUFFIX)
 $(combo_target)CXX := $($(combo_target)TOOLS_PREFIX)g++$(HOST_EXECUTABLE_SUFFIX)
 $(combo_target)AR := $($(combo_target)TOOLS_PREFIX)ar$(HOST_EXECUTABLE_SUFFIX)

The next thing is to make the GLOBAL_CFLAGS variable dependent on the architecture version. The armv5te defines stay in place, but an armv4t architecture version is added. Most of the cflags are pretty similar, except we change the -march flag, and change the pre-processor defines. These will become important later in the patch as they provide the mechanism for distinguishing between versions in the code.

@@ -46,6 +48,7 @@ ifneq ($(wildcard $($(combo_target)CC)),)
 $(combo_target)LIBGCC := $(shell $($(combo_target)CC) -mthumb-interwork -print-libgcc-file-name)
 endif
 
+ifeq ($(TARGET_ARCH_VERSION), armv5te)
 $(combo_target)GLOBAL_CFLAGS += \
                       -march=armv5te -mtune=xscale \
                       -msoft-float -fpic \
@@ -56,6 +59,21 @@ $(combo_target)GLOBAL_CFLAGS += \
                       -D__ARM_ARCH_5__ -D__ARM_ARCH_5T__ \
                       -D__ARM_ARCH_5E__ -D__ARM_ARCH_5TE__ \
                       -include $(call select-android-config-h,linux-arm)
+else
+ifeq ($(TARGET_ARCH_VERSION), armv4t)
+$(combo_target)GLOBAL_CFLAGS += \
+                      -march=armv4t \
+                      -msoft-float -fpic \
+                      -mthumb-interwork \
+                      -ffunction-sections \
+                      -funwind-tables \
+                      -fstack-protector \
+                      -D__ARM_ARCH_4__ -D__ARM_ARCH_4T__ \
+                      -include $(call select-android-config-h,linux-arm)
+else
+$(error Unknown TARGET_ARCH_VERSION=$(TARGET_ARCH_VERSION))
+endif
+endif
 
 $(combo_target)GLOBAL_CPPFLAGS += -fvisibility-inlines-hidden

The next bit we update is the prelink-linux-arm.map file. The dynamic libraries in Android are laid out explicitly in virtual memory according to this map file. If I'm not mistaken those addresses look suspiciously 1MB aligned, which means they should fit nicely in the pagetable, and provides some opportunity to use fast-address-space-switching techniques. In the port to ARMv4 I have so far been lazy and, instead of fixing up any assembler code, I've just gone with the existing C code. One outcome of this is that I need libffi.so for my foreign function interface, so I've added this to the map for now. I'm not 100% sure that this won't cause a problem when compiling for ARMv5. Will need to see. Fixing up the code to avoid needing libffi is probably high on the list of things to do.

diff --git a/core/prelink-linux-arm.map b/core/prelink-linux-arm.map
index d4ebf43..6e0bc43 100644
--- a/core/prelink-linux-arm.map
+++ b/core/prelink-linux-arm.map
@@ -113,3 +113,4 @@ libctest.so             0x9A700000
 libUAPI_jni.so          0x9A500000
 librpc.so               0x9A400000 
 libtrace_test.so        0x9A300000 
+libffi.so               0x9A200000
 
 

The next module is bionic, the light-weight C library that is part of Android. This has some nice optimised routines for memory copy and compare, but unfortunately they rely on ARMv5 instructions. I've changed the build system to only use the optimised assembler when compiling for ARMv5TE, falling back to the C routines in the other cases. (The strlen implementation isn't pure assembly, but the optimised C implementation has inline asm, so again it needs to drop back to a plain old dumb strlen.)

project bionic/
diff --git a/libc/Android.mk b/libc/Android.mk
index faca333..3fb3455 100644
--- a/libc/Android.mk
+++ b/libc/Android.mk
@@ -206,13 +206,9 @@ libc_common_src_files := \
        arch-arm/bionic/_setjmp.S \
        arch-arm/bionic/atomics_arm.S \
        arch-arm/bionic/clone.S \
-       arch-arm/bionic/memcmp.S \
-       arch-arm/bionic/memcmp16.S \
-       arch-arm/bionic/memcpy.S \
        arch-arm/bionic/memset.S \
        arch-arm/bionic/setjmp.S \
        arch-arm/bionic/sigsetjmp.S \
-       arch-arm/bionic/strlen.c.arm \
        arch-arm/bionic/syscall.S \
        arch-arm/bionic/kill.S \
        arch-arm/bionic/tkill.S \
@@ -274,6 +270,18 @@ libc_common_src_files := \
        netbsd/nameser/ns_print.c \
        netbsd/nameser/ns_samedomain.c
 
+
+ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
+libc_common_src_files += arch-arm/bionic/memcmp.S \
+              arch-arm/bionic/memcmp16.S \
+              arch-arm/bionic/memcpy.S \
+              arch-arm/bionic/strlen.c.arm
+else
+libc_common_src_files += string/memcmp.c string/memcpy.c string/strlen.c string/ffs.c
+endif
+endif
+
 # These files need to be arm so that gdbserver
 # can set breakpoints in them without messing
 # up any thumb code.

Unfortunately, it is clear that this C-only code hasn't been used in a while, as there was a trivial bug, fixed by the patch below. This makes me worry about what other bugs that aren't caught by the compiler may be lurking.

diff --git a/libc/string/memcpy.c b/libc/string/memcpy.c
index 4cd4a80..dea78b2 100644
--- a/libc/string/memcpy.c
+++ b/libc/string/memcpy.c
@@ -25,5 +25,5 @@
  * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
-#define MEM_COPY
+#define MEMCOPY
 #include "bcopy.c"

Finally, and frustratingly, the compiler's ffs() implementation appears to fall back to calling the C library's ffs() implementation if it can't do something optimised. This happens when compiling for ARMv4, so I've added an ffs() implementation (stolen from FreeBSD).

#include <sys/cdefs.h>
#include <strings.h>
 
/*
 * Find First Set bit
 */
int
ffs(int mask)
{
        int bit;
 
        if (mask == 0)
                return (0);
        for (bit = 1; !(mask & 1); bit++)
                mask = (unsigned int)mask >> 1;
        return (bit);
}

The next module for attention is the Dalvik virtual machine. Again this has some code that relies on ARMv5, but there is a C version that we fall back on. In this case it also means pulling in libffi. This is probably the module that needs the most attention in actually updating the code to ARMv4 assembler in the near future.

project dalvik/
diff --git a/vm/Android.mk b/vm/Android.mk
index dfed78d..c66a861 100644
--- a/vm/Android.mk
+++ b/vm/Android.mk
@@ -189,6 +189,7 @@ ifeq ($(TARGET_SIMULATOR),true)
 endif
 
 ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
        # use custom version rather than FFI
        #LOCAL_SRC_FILES += arch/arm/CallC.c
        LOCAL_SRC_FILES += arch/arm/CallOldABI.S arch/arm/CallEABI.S
@@ -204,6 +205,16 @@ else
               mterp/out/InterpC-desktop.c \
               mterp/out/InterpAsm-desktop.S
        LOCAL_SHARED_LIBRARIES += libffi
+       LOCAL_SHARED_LIBRARIES += libdl
+endif
+else
+       # use FFI
+       LOCAL_C_INCLUDES += external/libffi/$(TARGET_OS)-$(TARGET_ARCH)
+       LOCAL_SRC_FILES += arch/generic/Call.c
+       LOCAL_SRC_FILES += \
+              mterp/out/InterpC-desktop.c \
+              mterp/out/InterpAsm-desktop.S
+       LOCAL_SHARED_LIBRARIES += libffi
 endif
 
 LOCAL_MODULE := libdvm

Next is libjpeg which, again, has assembler optimisations that we can't easily use without real porting work, so we fall back to the C code.

project external/jpeg/
diff --git a/Android.mk b/Android.mk
index 9cfe4f6..3c052cd 100644
--- a/Android.mk
+++ b/Android.mk
@@ -19,6 +19,12 @@ ifneq ($(TARGET_ARCH),arm)
 ANDROID_JPEG_NO_ASSEMBLER := true
 endif
 
+# the assembler doesn't work for armv4t
+ifeq ($(TARGET_ARCH_VERSION),armv4t)
+ANDROID_JPEG_NO_ASSEMBLER := true
+endif
+
+
 # temp fix until we understand why this broke cnn.com
 #ANDROID_JPEG_NO_ASSEMBLER := true
 

For some reason compiling with ARMv4 doesn't allow the prefetch loop array compiler optimisation, so we turn it off for ARMv4.

@@ -29,7 +35,10 @@ LOCAL_SRC_FILES += jidctint.c jidctfst.S
 endif
 
 LOCAL_CFLAGS += -DAVOID_TABLES 
-LOCAL_CFLAGS += -O3 -fstrict-aliasing -fprefetch-loop-arrays
+LOCAL_CFLAGS += -O3 -fstrict-aliasing
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
+LOCAL_CFLAGS += -fprefetch-loop-arrays
+endif
 #LOCAL_CFLAGS += -march=armv6j
 
 LOCAL_MODULE:= libjpeg
 

Next up is libffi, which is just a case of turning it on since we now need it for ARMv4.

project external/libffi/
diff --git a/Android.mk b/Android.mk
index f4452c9..07b5c2f 100644
--- a/Android.mk
+++ b/Android.mk
@@ -6,7 +6,7 @@
 # We need to generate the appropriate defines and select the right set of
 # source files for the OS and architecture.
 
-ifneq ($(TARGET_ARCH),arm)
+ifneq ($(TARGET_ARCH_VERSION),armv5te)
 
 LOCAL_PATH:= $(call my-dir)
 include $(CLEAR_VARS)
 

The external module opencore contains a lot of software implemented codecs. (I wonder about the licensing restrictions on these things...) Not surprisingly these too are tuned for ARMv5, but again we fall back to plain old C.

project external/opencore/
diff --git a/codecs_v2/audio/aac/dec/Android.mk b/codecs_v2/audio/aac/dec/Android.mk
index ffe0089..6abdc2d 100644
--- a/codecs_v2/audio/aac/dec/Android.mk
+++ b/codecs_v2/audio/aac/dec/Android.mk
@@ -150,7 +150,7 @@ LOCAL_SRC_FILES := \
 LOCAL_MODULE := libpv_aac_dec
 
 LOCAL_CFLAGS := -DAAC_PLUS -DHQ_SBR -DPARAMETRICSTEREO  $(PV_CFLAGS)
-ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
  LOCAL_CFLAGS += -D_ARM_GCC
  else
  LOCAL_CFLAGS += -DC_EQUIVALENT
diff --git a/codecs_v2/audio/gsm_amr/amr_wb/dec/Android.mk b/codecs_v2/audio/gsm_amr/amr_wb/dec/Android.mk
index e184178..3223841 100644
--- a/codecs_v2/audio/gsm_amr/amr_wb/dec/Android.mk
+++ b/codecs_v2/audio/gsm_amr/amr_wb/dec/Android.mk
@@ -48,7 +48,7 @@ LOCAL_SRC_FILES := \
 LOCAL_MODULE := libpvamrwbdecoder
 
 LOCAL_CFLAGS :=   $(PV_CFLAGS)
-ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
  LOCAL_CFLAGS += -D_ARM_GCC
  else
  LOCAL_CFLAGS += -DC_EQUIVALENT
diff --git a/codecs_v2/audio/mp3/dec/Android.mk b/codecs_v2/audio/mp3/dec/Android.mk
index 254cb6b..c2430fe 100644
--- a/codecs_v2/audio/mp3/dec/Android.mk
+++ b/codecs_v2/audio/mp3/dec/Android.mk
@@ -28,8 +28,8 @@ LOCAL_SRC_FILES := \
        src/pvmp3_seek_synch.cpp \
        src/pvmp3_stereo_proc.cpp \
        src/pvmp3_reorder.cpp
-       
-ifeq ($(TARGET_ARCH),arm)
+
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
 LOCAL_SRC_FILES += \
        src/asm/pvmp3_polyphase_filter_window_gcc.s \
        src/asm/pvmp3_mdct_18_gcc.s \
@@ -46,7 +46,7 @@ endif
 LOCAL_MODULE := libpvmp3
 
 LOCAL_CFLAGS :=   $(PV_CFLAGS)
-ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
  LOCAL_CFLAGS += -DPV_ARM_GCC
  else
  LOCAL_CFLAGS += -DC_EQUIVALENT

Unfortunately it is not just the build file that needs updating in this module. I need to manually go and update the headers so that some optimised inline assembler is only used in the ARMv5 case. To be honest this messes these files up a little bit, so a nicer solution would be preferred.

diff --git a/codecs_v2/video/m4v_h263/enc/src/dct_inline.h b/codecs_v2/video/m4v_h263/enc/src/dct_inline.h
index 86474b2..41a3297 100644
--- a/codecs_v2/video/m4v_h263/enc/src/dct_inline.h
+++ b/codecs_v2/video/m4v_h263/enc/src/dct_inline.h
@@ -22,7 +22,7 @@
 #ifndef _DCT_INLINE_H_
 #define _DCT_INLINE_H_
 
-#if !defined(PV_ARM_GCC)&& defined(__arm__)
+#if !(defined(PV_ARM_GCC) && defined(__arm__) && defined(__ARCH_ARM_5TE__))
 
 #include "oscl_base_macros.h"
 
@@ -109,7 +109,7 @@ __inline int32 sum_abs(int32 k0, int32 k1, int32 k2, int32 k3,
 #elif defined(__CC_ARM)  /* only work with arm v5 */
 
 #if defined(__TARGET_ARCH_5TE)
-
+#error
 __inline int32 mla724(int32 op1, int32 op2, int32 op3)
 {
     int32 out;
@@ -266,7 +266,7 @@ __inline int32 sum_abs(int32 k0, int32 k1, int32 k2, int32 k3,
     return abs_sum;
 }
 
-#elif defined(PV_ARM_GCC) && defined(__arm__) /* ARM GNU COMPILER  */
+#elif defined(PV_ARM_GCC) && defined(__arm__) && defined(__ARCH_ARM_5TE__) /* ARM GNU COMPILER  */
 
 __inline int32 mla724(int32 op1, int32 op2, int32 op3)
 {
diff --git a/codecs_v2/video/m4v_h263/enc/src/fastquant_inline.h b/codecs_v2/video/m4v_h263/enc/src/fastquant_inline.h
index 6a35d43..fbfeddf 100644
--- a/codecs_v2/video/m4v_h263/enc/src/fastquant_inline.h
+++ b/codecs_v2/video/m4v_h263/enc/src/fastquant_inline.h
@@ -25,7 +25,7 @@
 #include "mp4def.h"
 #include "oscl_base_macros.h"
 
-#if !defined(PV_ARM_GCC) && defined(__arm__) /* ARM GNU COMPILER  */
+#if !(defined(PV_ARM_GCC) && defined(__arm__) && defined(__ARCH_ARM_V5TE__)) /* ARM GNU COMPILER  */
 
 __inline int32 aan_scale(int32 q_value, int32 coeff, int32 round, int32 QPdiv2)
 {
@@ -423,7 +423,7 @@ __inline int32 coeff_dequant_mpeg_intra(int32 q_value, int32 tmp)
     return q_value;
 }
 
-#elif defined(PV_ARM_GCC) && defined(__arm__) /* ARM GNU COMPILER  */
+#elif defined(PV_ARM_GCC) && defined(__arm__) && defined(__ARCH_ARM_V5TE__) /* ARM GNU COMPILER  */
 
 __inline int32 aan_scale(int32 q_value, int32 coeff,
                          int32 round, int32 QPdiv2)
diff --git a/codecs_v2/video/m4v_h263/enc/src/vlc_encode_inline.h b/codecs_v2/video/m4v_h263/enc/src/vlc_encode_inline.h
index 69857f3..b0bf46d 100644
--- a/codecs_v2/video/m4v_h263/enc/src/vlc_encode_inline.h
+++ b/codecs_v2/video/m4v_h263/enc/src/vlc_encode_inline.h
@@ -18,7 +18,7 @@
 #ifndef _VLC_ENCODE_INLINE_H_
 #define _VLC_ENCODE_INLINE_H_
 
-#if !defined(PV_ARM_GCC)&& defined(__arm__)
+#if !(defined(PV_ARM_GCC) && defined(__arm__) && defined(__ARCH_ARM_V5TE__))
 
 __inline  Int zero_run_search(UInt *bitmapzz, Short *dataBlock, RunLevelBlock *RLB, Int nc)
 {
@@ -208,7 +208,7 @@ __inline  Int zero_run_search(UInt *bitmapzz, Short *dataBlock, RunLevelBlock *R
     return idx;
 }
 
-#elif defined(PV_ARM_GCC) && defined(__arm__) /* ARM GNU COMPILER  */
+#elif defined(PV_ARM_GCC) && defined(__arm__) && defined(__ARCH_ARM_V5TE__) /* ARM GNU COMPILER  */
 
 __inline Int m4v_enc_clz(UInt temp)
 {
 

A similar approach is needed in the skia graphics library.

project external/skia/
diff --git a/include/corecg/SkMath.h b/include/corecg/SkMath.h
index 76cf279..5f0264f 100644
--- a/include/corecg/SkMath.h
+++ b/include/corecg/SkMath.h
@@ -162,7 +162,7 @@ static inline int SkNextLog2(uint32_t value) {
     With this requirement, we can generate faster instructions on some
     architectures.
 */
-#if defined(__arm__) && !defined(__thumb__)
+#if defined(__arm__) && defined(__ARM_ARCH_5TE__) && !defined(__thumb__)
     static inline int32_t SkMulS16(S16CPU x, S16CPU y) {
         SkASSERT((int16_t)x == x);
         SkASSERT((int16_t)y == y);
 

The sonivox module (no idea what that is!) has the same requirement of updating the build to avoid building ARMv5 specific code.

project external/sonivox/
diff --git a/arm-wt-22k/Android.mk b/arm-wt-22k/Android.mk
index 565c233..a59f917 100644
--- a/arm-wt-22k/Android.mk
+++ b/arm-wt-22k/Android.mk
@@ -73,6 +73,7 @@ LOCAL_COPY_HEADERS := \
        host_src/eas_reverb.h
 
 ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
 LOCAL_SRC_FILES+= \
        lib_src/ARM-E_filter_gnu.s \
        lib_src/ARM-E_interpolate_loop_gnu.s \

The low-level audio code in audioflinger suffers from the same optimisations, and we need to dive into the code on this occasion to fix things up.

project frameworks/base/
diff --git a/libs/audioflinger/AudioMixer.cpp b/libs/audioflinger/AudioMixer.cpp
index 9f1b17f..4c0890c 100644
--- a/libs/audioflinger/AudioMixer.cpp
+++ b/libs/audioflinger/AudioMixer.cpp
@@ -400,7 +400,7 @@ void AudioMixer::process__validate(state_t* state, void* output)
 static inline 
 int32_t mulAdd(int16_t in, int16_t v, int32_t a)
 {
-#if defined(__arm__) && !defined(__thumb__)
+#if defined(__arm__) && defined(__ARCH_ARM_5TE__) && !defined(__thumb__)
     int32_t out;
     asm( "smlabb %[out], %[in], %[v], %[a] \n"
          : [out]"=r"(out)
@@ -415,7 +415,7 @@ int32_t mulAdd(int16_t in, int16_t v, int32_t a)
 static inline 
 int32_t mul(int16_t in, int16_t v)
 {
-#if defined(__arm__) && !defined(__thumb__)
+#if defined(__arm__) && defined(__ARCH_ARM_5TE__) && !defined(__thumb__)
     int32_t out;
     asm( "smulbb %[out], %[in], %[v] \n"
          : [out]"=r"(out)
@@ -430,7 +430,7 @@ int32_t mul(int16_t in, int16_t v)
 static inline 
 int32_t mulAddRL(int left, uint32_t inRL, uint32_t vRL, int32_t a)
 {
-#if defined(__arm__) && !defined(__thumb__)
+#if defined(__arm__) && defined(__ARCH_ARM_5TE__) && !defined(__thumb__)
     int32_t out;
     if (left) {
         asm( "smlabb %[out], %[inRL], %[vRL], %[a] \n"
@@ -456,7 +456,7 @@ int32_t mulAddRL(int left, uint32_t inRL, uint32_t vRL, int32_t a)
 static inline 
 int32_t mulRL(int left, uint32_t inRL, uint32_t vRL)
 {
-#if defined(__arm__) && !defined(__thumb__)
+#if defined(__arm__) && defined(__ARCH_ARM_5TE__) && !defined(__thumb__)
     int32_t out;
     if (left) {
         asm( "smulbb %[out], %[inRL], %[vRL] \n"
diff --git a/libs/audioflinger/AudioResamplerSinc.cpp b/libs/audioflinger/AudioResamplerSinc.cpp
index e710d16..88b8c22 100644
--- a/libs/audioflinger/AudioResamplerSinc.cpp
+++ b/libs/audioflinger/AudioResamplerSinc.cpp
@@ -62,7 +62,7 @@ const int32_t AudioResamplerSinc::mFirCoefsDown[] = {
 static inline 
 int32_t mulRL(int left, int32_t in, uint32_t vRL)
 {
-#if defined(__arm__) && !defined(__thumb__)
+#if defined(__arm__) && defined(__ARCH_ARM_5TE__) && !defined(__thumb__)
     int32_t out;
     if (left) {
         asm( "smultb %[out], %[in], %[vRL] \n"
@@ -88,7 +88,7 @@ int32_t mulRL(int left, int32_t in, uint32_t vRL)
 static inline 
 int32_t mulAdd(int16_t in, int32_t v, int32_t a)
 {
-#if defined(__arm__) && !defined(__thumb__)
+#if defined(__arm__) && defined(__ARCH_ARM_5TE__) && !defined(__thumb__)
     int32_t out;
     asm( "smlawb %[out], %[v], %[in], %[a] \n"
          : [out]"=r"(out)
@@ -103,7 +103,7 @@ int32_t mulAdd(int16_t in, int32_t v, int32_t a)
 static inline 
 int32_t mulAddRL(int left, uint32_t inRL, int32_t v, int32_t a)
 {
-#if defined(__arm__) && !defined(__thumb__)
+#if defined(__arm__) && defined(__ARCH_ARM_5TE__) && !defined(__thumb__)
     int32_t out;
     if (left) {
         asm( "smlawb %[out], %[v], %[inRL], %[a] \n"
 

The AndroidConfig.h header file is included on every compile. We mess with it to convince it that we don't have an optimised memcmp16 function.

project system/core/
diff --git a/include/arch/linux-arm/AndroidConfig.h b/include/arch/linux-arm/AndroidConfig.h
index d7e182a..76f424e 100644
--- a/include/arch/linux-arm/AndroidConfig.h
+++ b/include/arch/linux-arm/AndroidConfig.h
@@ -249,8 +249,9 @@
 /*
  * Do we have __memcmp16()?
  */
+#if defined(__ARCH_ARM_5TE__)
 #define HAVE__MEMCMP16  1
-
+#endif
 /*
  * type for the third argument to mincore().
  */

Next up is the pixelflinger, where things get interesting, because all of a sudden we have armv6 code. I've taken the rash decision of wrapping this in conditionals that are only enabled if you actually have an ARMv6 version, not a pesky ARMv5E, but I really need to better understand the intent here. It seems a little strange.

diff --git a/libpixelflinger/Android.mk b/libpixelflinger/Android.mk
index a8e5ee4..077cf47 100644
--- a/libpixelflinger/Android.mk
+++ b/libpixelflinger/Android.mk
@@ -5,7 +5,7 @@ include $(CLEAR_VARS)
 # ARMv6 specific objects
 #
 
-ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv6)
 LOCAL_ASFLAGS := -march=armv6
 LOCAL_SRC_FILES := rotate90CW_4x4_16v6.S
 LOCAL_MODULE := libpixelflinger_armv6
@@ -39,7 +39,7 @@ PIXELFLINGER_SRC_FILES:= /
        raster.cpp /
        buffer.cpp
 
-ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv5te)
 PIXELFLINGER_SRC_FILES += t32cb16blend.S
 endif
 
@@ -67,7 +67,7 @@ ifneq ($(BUILD_TINY_ANDROID),true)
 LOCAL_MODULE:= libpixelflinger
 LOCAL_SRC_FILES := $(PIXELFLINGER_SRC_FILES)
 LOCAL_CFLAGS := $(PIXELFLINGER_CFLAGS) -DWITH_LIB_HARDWARE
-ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv6)
 LOCAL_WHOLE_STATIC_LIBRARIES := libpixelflinger_armv6
 endif
 include $(BUILD_SHARED_LIBRARY)

Finally scanline has an optimised asm version it calls in preference to doing the same thing inline with C code. Again, I take the easy way out, and use the C code.

diff --git a/libpixelflinger/scanline.cpp b/libpixelflinger/scanline.cpp
index d24c988..685a3b7 100644
--- a/libpixelflinger/scanline.cpp
+++ b/libpixelflinger/scanline.cpp
@@ -1312,7 +1312,7 @@ void scanline_t32cb16blend(context_t* c)
     const int32_t v = (c->state.texture[0].shade.it0>>16) + y;
     uint32_t *src = reinterpret_cast<uint32_t*>(tex->data)+(u+(tex->stride*v));
 
-#if ((ANDROID_CODEGEN >= ANDROID_CODEGEN_ASM) && defined(__arm__))
+#if ((ANDROID_CODEGEN >= ANDROID_CODEGEN_ASM) && defined(__arm__) && defined(__ARCH_ARM_5TE__))
     scanline_t32cb16blend_arm(dst, src, ct);
 #else
     while (ct--) {

And that, my friends, is that! Now to see if I can actually run this code!

permalink

Android Source Code Released

Wed, 22 Oct 2008 09:14:34

When companies say "release 4th quarter 2008" you usually see something around January 2009, and to be honest that was when I was expecting the Android source code to finally drop. So I was a little surprised to see that the code was released early this morning.

Stay tuned, more to come as I start playing.

permalink

Open Mobile Miniconf @ linux.conf.au 2009

Wed, 24 Sep 2008 15:12:02

It's been almost two years now since I helped organise and run linux.conf.au 2007, and I thought it was time to jump back into the fray again. This time I'll not be doing something as silly as trying to organise the whole conference, but I will be running a small miniconf on the first two days of the conference. So I'd like to invite you all to the Open Mobile Miniconf in January next year. And if you think you've got something cool you'd like to share with other developers, please take a look at the call for presentations, and drop me a line.

permalink

What makes code trustworthy?

Mon, 08 Sep 2008 09:22:52

So last week I posted on the difference between trusted and trustworthy code. Picking up that thread again: if there is some code in my system that is trusted, how can I tell if it is actually trustworthy?

Now, ruling out truly malicious code, our assessment of the trustworthiness of code really comes down to an assessment of the quality of the code, and a pretty reasonable proxy for code quality is the number of errors in the code, or more specifically, the lack thereof. So the question is: how can we determine whether a piece of code has a small number of bugs?

The number of errors in a body of code is going to be the product of two important factors: the defect density and the size of the code. In general the defect density is measured in bugs per thousand lines of code (KLOC), and the size of the code is measured in lines of code. Now, there are plenty of arguments about what the "right" method is for measuring lines of code, and in general you can only know the exact defect density after all the bugs are found, and of course, program testing can be used to show the presence of bugs, but never to show their absence! So, to lower the number of bugs that exist in a code base there are really two options: reduce the number of lines of code, or improve the defect density.
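To make that product concrete with some purely illustrative numbers:

\text{bugs} \approx \text{defect density} \times \text{size}, \qquad
\text{e.g. } 2\ \text{bugs/KLOC} \times 50\ \text{KLOC} = 100\ \text{bugs}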

So, how do we improve, that is reduce, the defect density? Well, there are a number of pretty well known ways. Effective testing, despite its caveats, goes a long way to reducing the number of bugs in a code base (assuming that you actually fix the bugs you find!). Static analysis (in its various forms) is also an important tool to help increase the quality of code, and is often a great complement to testing, as it can expose bugs in code that is impractical to test. And of course there are other formal methods, like model checking, which can help eliminate bugs from the design phase. A great example of this is the SPIN model checker. Code reviews, while time intensive, are also a great way of finding bugs that would otherwise fly under the radar. Another way to improve code quality is to write simple, rather than complex, code; McCabe's cyclomatic complexity measure can be one good indicator of this. This is, of course, just a sampling of some of the aspects of software quality; see Wikipedia and McConnell's Code Complete for more information.

Now, how do you know if some code base has actually undergone testing? How do you know if static analysis has been performed on the source code? How do you know if the code has undergone a thorough code review? Well, generally there are two ways: you trust the developer of that code, or you get someone you do trust to do the quality analysis (which may be yourself). This is the point where things quickly shift from technological solutions into the fuzzy world of economics, social science, psychology and legal theory, as we try to determine the trustworthiness of another entity, be it person or corporation. The one technologically relevant part is that it is much more difficult to do a 3rd party analysis of code quality without access to the source code. Note: I am not saying that open source software is more trustworthy, simply that making the source available enables 3rd party assessments of code quality, which may make it easier for some people to trust the code.

So, improving code quality, and thereby reducing defect density, is one side of the equation, but even if you have a very low defect density, for example less than 1/KLOC, you can still have a large number of bugs if your code base is large. So it is very important to reduce the size of the code base as well. A small code base doesn't just have the direct benefit of reducing the code size part of the equation, it also helps improve the defect density part of the equation. Why? Well, almost all the techniques mentioned above are more effective or tractable on a small code base. You can usually get much better code coverage, and even a reasonable amount of path coverage, with a smaller code base. Code reviews can be more comprehensive. Now, to some extent those techniques can work for large code bases, just throw more programmers at it, but using static analysis is another matter. Many of the algorithms and techniques involved with static analysis have polynomial, or even exponential, computational complexity, with n being the number of lines of code. So an analysis that may take an hour on a 10,000 line code base could end up taking all week to run on a code base of 100,000 lines of code.

Of course, this doesn't address the problem of how you assure that the code you think is in your TCB is really what you think it is. That topic really gets us into trusted boot, trusted platform modules, code signing and so on, which I'm not going to try and address in this post.

Now, it should be very clear that if you want to be able to trust your trusted computing base, then it is going to need to be both small and high quality.

permalink

Trusted vs. Trustworthy

Tue, 02 Sep 2008 13:01:05

If you've seen me give a presentation recently, or have been talking to me about some of the stuff I've been doing lately, you've probably heard me mention the term trusted computing base, or TCB. (Not to be confused with thread control blocks, the other TCB in operating systems.) So what is the trusted computing base?

The TCB for a given system is all the components, both hardware and software, that must be relied upon to operate correctly if the security of the system is to be maintained. In other words, an error that occurs in the TCB can affect the overall system security, while an error outside the TCB cannot affect the overall system security.

Now, the TCB depends on the scope of the system and the defined security policy. For example, if we are talking about a UNIX operating system, and its applications, then the trusted computing base contains at least the operating system kernel, and probably any system daemons and setuid programs. As the kernel enforces the security mechanism of process boundaries, it should be obvious that an error in the kernel can affect the overall system security. Traditionally on UNIX, there is a user, root, who is all powerful, and can change the system security policies, so an error in any piece of software that runs with root privileges also forms part of the trusted computing base. Of course, any applications are outside the trusted computing base. An error in a database server should not affect the overall system security.

Of course, if we are using a UNIX operating system as the foundation of a database server, then the definition of the TCB changes. In this case not only is the operating system part of the TCB, but the database server is as well. This is because the database server is enforcing the security of which users can access which rows, tables and columns in the database, so an error in the database server can clearly impact the security of the system.

OK, so we now know we have to trust all the code that falls inside the TCB if we want to put any trust into our security system. The problem is, just because we have to trust this code does not give us any rational reason to believe that we can trust this code. Just because code is trusted doesn't give us any indication at all as to whether the code is, in fact, trustworthy.

To put any faith in the security of the system we should ensure that any trusted code is trustworthy code.

There are a number of things that we can do to increase our confidence in the trustworthiness of our code, which I will explore in coming posts. For more information on the trusted computing base, the Wikipedia page gives a good overview, and links to some useful papers.

permalink

Feedback on talks

Tue, 02 Sep 2008 12:44:16

I recently spoke at the Open Mobile Exchange (OMX), at the O'Reilly Open Source Conference (OSCON). Now, a totally fantastic thing about OSCON is that there is an easy way for audience members to provide feedback to speakers via the conference website. (Unfortunately, I was spending too much time prepping my slides to give useful feedback to the other speakers, which I must apologise for.) I really hope that the zookeepr guys add similar functionality so linux.conf.au can have a similar feedback mechanism.

So, the great thing is that I got some great feedback from my last talk, which confirmed something I was worried about in my talks. When giving a talk, there is always a lot of background material that I feel I need to cover to adequately explain my position, but the problem is that I then have a lot less time to present the crux of my position.

So I've decided that I'm going to try and cover background information on my blog, so that I can refer to that in my talk, rather than going into lots of detail during the talk. This also gives all those people with laptops something useful to do during my talk.

Stay tuned!

permalink

Nokia acquires Symbian, Open Sources Symbian OS

Tue, 24 Jun 2008 20:04:11

On the 10th anniversary of the creation of Symbian Ltd, Nokia has announced that they will be acquiring Symbian Ltd with the aim of opening up the Symbian OS under an Eclipse based license. The mobile operating system market is really getting a shake up at the moment!

The Symbian Foundation has been created to build a new platform for mobile phones based on Symbian OS, S60, UIQ and MOAP(S). The foundation is expected to launch in H1 2009.

The new Symbian Foundation platform will be made up of a common set of application suites, runtimes, UI framework, middleware, OS, tools and SDK, with foundation members able to provide differentiated experiences on top. The platform is expected to be released to foundation members in 2009, and eventually open sourced over the following two years.

This obviously makes a huge change in the market place. It will be interesting to see how the Symbian platform fares vs. Android, vs. LiMo, vs. Windows Mobile vs. the iPhone.

Of course it is all about developers, developers, developers, and it will be extremely interesting to see where developers will want to go.

permalink

memoize.py: a build tool framework

Fri, 06 Jun 2008 15:50:47

I've recently started using memoize.py as the core of my build system for a new project I'm working on. The simplicity involved is pretty neat. Rather than manually needing to work out the dependencies (or having specialised tools for determining the dependencies), with memoize.py you simply write the commands you need to build your project, and memoize.py works out all the dependencies for you.

So, what's the catch? Well, the way memoize.py works is by using strace to record all the system calls that a program makes during its execution. By analyzing this list memoize.py can work out all the files that are touched when a command is run, and then stores this as a list of dependencies for that command. Then, the next time you run the same command, memoize.py first checks to see if any of the dependencies have changed (using either md5sum or timestamp), and only runs the command if they have. So the catch of course is that this only runs on Linux (as far as I know, you can't get strace anywhere else, although that doesn't mean the same techniques couldn't be used with a different underlying system call tracing tool).
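If you want to see the kind of raw material memoize.py works from, a rough shell equivalent is below (illustrative only; memoize.py traces more system calls than this and parses the log properly):

# run a build command under strace, logging file-related system calls,
# then pull out the file names it touched -- those become the dependencies
strace -f -o /tmp/cmd.trace -e trace=open,stat,execve gcc -c -o hello.o hello.c
grep -o '"[^"]*"' /tmp/cmd.trace | sort -u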

This technique is quite radically different from other tools, which determine a large dependency graph of the entire build and then recursively work through this graph to fulfil unmet dependencies. As a result this form is a lot more imperative, rather than declarative, in style. Traditional tools (SCons, make, etc.) provide a language which allows you to essentially describe a dependency graph, and then the order in which things are executed is really hidden inside the tool. Using memoize.py is a lot different. You go through defining the commands you want to run (in order!), and that is basically it.

Some of the advantages of this approach are:

  • Easy to debug builds. You very easily see the order in which things run.
  • More obvious what is happening.
  • Single pass, no need to parse files, then later run the commands.
  • Gets the dependencies right! You don't end up missing a dependency because your scanner failed to pick up a header, or because you forgot to declare it. This makes builds much more reliable.

There are however some disadvantages:

  • Running commands through strace is slow, and parsing the output of strace is even slower. This is to a certain extent mitigated by not needing special commands for scanning files for dependencies, and because it is one pass. This could probably be improved by directly using ptrace to perform the system call tracing.
  • No parallel builds. Because there is no job executor running through a dependency graph, it is harder to take advantage of parallel builds. This could definitely be a show-stopper for large projects. Of course it might be possible to explicitly define some jobs as parallel, which might mitigate this problem.
  • No simple way to build just one target. If you have a very large build, and you just want one target which only needs a small subset of the full build, then you are in trouble. Of course, it is possible to set up your build system so that there are explicit targets, and you specify a subset of commands to run, but this must now be explicit, whereas the traditional approach gives you this for free.
  • Linux only. It would be possible to handle this on other OSes if you have a system call tracing mechanism, but for my current project the compilers are Linux only anyway, so I'm not too fussed. I did extend memoize.py a little so that you could simply choose not to run strace. Obviously you can't determine dependencies in this case, but you can at least build the thing.

As with many good tools in your programming kit, memoize.py is available under a very liberal BSD style license, which is nice, because I've been able to fix up some problems and add some extra functionality. In particular I've added options to:

  • Select verbosity of output.
  • Provide a different string to print when running a command.
  • Option to skip using strace.
  • Tracked directory creation, as well as file creation.
  • Provide an option to force building (e.g: ignore the dependencies)

The patch and full file are available. These have of course been provided upstream, so with any luck, some or most of them will be merged upstream.

So, if you have a primarily Linux project, and want to try something different to SCons or make, I'd recommend considering memoize.py.

permalink

Video editing

Thu, 29 May 2008 19:09:42

I recently posted my videos from linux.conf.au earlier this year. I ended up spending a lot of time in post-production with these, probably more than I spent in preparing for the talk (and coming up with all the demos for the talk was a lot of work too!).

I ended up shelling out for Final Cut Express (FCE) as I really couldn't find anything in the free/open source arena that could really do all the effects that I wanted. My biggest shock was how bloody difficult it was to actually use! Don't let the express part fool you, the learning curve is far from quick. I was also a bit surprised how film oriented FCE is. It is much more geared towards production of video captured on tape that will be viewed on a real screen, than towards digitally captured video destined for the web. (Or at least that was my impression.)

The other surprising bit of the process was that I really couldn't find a suitable place to host my video on the web. Most of the free video places didn't want hour long movies, and I found the quality of the video once it was transcoded to be pretty terrible in most cases. This is probably due to the fine detail that I'm attempting to show, which probably doesn't get treated too nicely by most encoders. In any case, I ended up hosting the video using Amazon web services, since the storage and transfer fees were a lot more attractive than slicehost (where the rest of my website is hosted).

Anyway, as with most of my posts, the main point of this one was to remind future Benno how to export decent quality movies with FCE. (There are about a million different options to play with, and it took a lot of tweaking to get right.) So, in summary, you want something along the lines of:

Format: QuickTime Movie
Options
 -Video
  -Settings
   Compression Type: H.264
   Motion:
    Frame Rate: Current
    Key Frames: Automatic
     Frame Reordering: x
   Data Rate:
    Data Rate: Automatic
   Compressor:
    Quality: Best
    Encoding: Best
  -Filter: None
  -Size:
    640x480
    Preserve: using letterbox
    Deinterlace
 -Sound
   Linear PCM
   Stereo L R
   Rate: 48khz
   Render Settings: Quality: Normal Linear PCM Settings: Sample Size: 16 Little Endian: x
 -Prepare for stream --- nope

To get Ogg Theora output, use the XiphQT tools.

One of the best/worst things about doing your own post production is that you become very familiar with your own annoying habits and tics. If you watch the video, um, I'm sure you will, um, realise, um, what I, um, mean. (Note to self: rehearse my talks more!)

By the end of the editing process I was both sick of my own voice, and sick of anyone who says computers are fast enough; when you spend a good 14 hours encoding and compressing a video, you realise that for some things computers are still damn slow. I would expect most encoding and compression is reasonably easily parallelisable (if that is a real word?), so this massively multi-core revolution will hopefully help my future video editing projects.

permalink

Video - Porting OKL4 to a new SoC

Thu, 29 May 2008 18:45:49

Earlier this year I presented at the linux.conf.au embedded miniconf about how to port OKL4 to a new SoC. The video was taped and had until recently been available on the linux.conf.au 2008 website, but for some reason that website has gone awol, so I thought it was a good time to put up my own copy. These videos have the advantage that they have gone through a painstaking post-production phase, which seamlessly melds the slides into the video (well, not quite seamlessly), and also all the bad bits have been removed.

This presentation gives a really good overview of what is involved in porting OKL4 to a new SoC. However, please note that the specific APIs have been somewhat simplified for pedagogical reasons, so this is more an introduction to the concepts, rather than a tutorial as such.

The videos are available in Ogg/Theora and also Quicktime/H264 formats, in either CIF (352x288) or PAL (720x576). If you can afford the bandwidth I would recommend the hi-res ones, as then you can actually see what is on the screen.

permalink

Musings on literate programming

Wed, 21 May 2008 15:35:14

You know that any blog post with musings in the title is going to be a lot of navel-gazing babble, so I don't blame you if you skip out now; this is mostly for me to consolidate my thoughts on literate programming.

The idea behind literate programming is that you should write programs with a human reader as the primary audience, and the compiler as the secondary audience. This means that you organise your program in a logical order that aids explanation of the program (think chapters and sections), rather than organising your program in a way that is oriented towards the compiler (think files and functions).

One of the outcomes of writing your programs in this literate manner is that you think a lot more about how to explain things to another programmer (who is your audience) than if you are writing with the compiler as your audience. I'm quite interested in things that can improve the quality of my code, and of my team's code, so I thought I'd try it out.

I first tried a somewhat traditional tool called noweb. I took a fairly complex module of a kernel that I'm writing as the base for this. The output I produced was some quite nice looking LaTeX that I think did a good job of explaining the code, as well as some of the design decisions that might have otherwise been difficult to communicate to another programmer. I was able to structure my prose in a way that I thought was quite logical for presentation of the ideas, but which ended up being quite different to the actual structure of the original code. It is no surprise that the tool that takes the source file and generates the source files to be used by the compiler is called tangle. Unfortunately I can't really share the output of this experiment as the code is closed (at the moment).
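For those who haven't seen noweb, the whole workflow is driven by two commands, which is where the tangle name comes from (options here are from memory, so check the man pages):

# extract the code, for the compiler
notangle -Rmain.c kernel.nw > main.c
# extract the prose, for the human reader
noweave -latex -index kernel.nw > kernel.tex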

While I liked the experience of using noweb, it seemed a lot like a case of write the code, then write the documentation, and going back to modify the code would be a real nightmare. There is a lot of evidence (i.e.: working bodies of code) that a body of code can be worked on by multiple people at once reasonably effectively. I'm yet to see a piece of literature that can be effectively modified by multiple parties. (And no, Wikipedia doesn't count.)

One person who agrees is Zed Shaw. He agreed so much that he made his own tool, Idiopidae, which allows you to have code, but then separately create documentation describing that code, in a mostly literate manner, in a separate file. This seemed like a good alternative, and I tried it out when documenting simplefilemon. Here the documentation is separate, but the code has markers in it so that the documentation can effectively refer to blocks of code, which goes some way to eliminating the problems of traditional literate programming. For a start, syntax highlighting actually worked! (Yes, emacs has ways of doing dual major modes, but none of them really worked particularly well.) With this approach, I have to admit it felt less literate, which is pretty wishy-washy, but I felt more like I was documenting the code, rather than really taking a holistic approach to explaining the program. Of course, that isn't exactly a good explanation, but it definitely felt different to the other approach. Maybe I felt dirty because I wasn't following the Knuth religion to the letter. I think this approach probably has more legs, but I did end up with a lot of syntactic garbage in the source file, which made it more difficult to read than it should have been. Also, I couldn't find a good way of summarising large chunks of code; for example, in noweb I could present the code for a loop with the body replaced by a reference to another block of code, which is one of the nice things I could do there. Of course that is probably something that can be added to the tool in the future, and isn't really the end of the world.

Where to go next? Well, I think I'm going to go back and reproduce my original kernel documentation using Idiopidae, to see what the experience is like when only one variable (the tool) changes. If that produces something reasonably good, I might invest some time in extending Idiopidae to get it working exactly how I want it to.

permalink

VMware fusion, hard links, and zsh

Wed, 21 May 2008 11:52:57

While I end up using Mac OS X as my primary GUI, I still do a lot of development work on Linux. I'm using VMware Fusion to host a virtual headless Linux machine, which is all good. Recently I decided to upgrade my OS to Ubuntu 8.04, which promotes having a just-enough OS (JeOS); this seemed perfect for what I wanted to do. Unfortunately the process of getting the VMware client tools installed was less than simple. To cut a long story short, the fix is described by Peter Cooper, and things work well after that. (It is a little annoying that the Ubuntu documentation doesn't explain this, or at least link to it.)

Anyway, after this I'm able to share my home directory directly between OS X and my virtual machine, which is absolutely fantastic, as I no longer need TRAMP or some network filesystem to shuffle files back and forth between the virtual machine and the main machine.

Unfortunately, I ran into a bit of a problem: history was not working in zsh. Specifically, saving the history into the history file was not working, which is a really painful situation. It was not really clear why; running fc -W manually didn't work either, but it managed to fail silently, with no stderr output and no error code returned. Failing that, I went back to the massively useful debugging tool strace. This finally gave me the clue that link() (hard linking) was failing. I confirmed that using ln.

So, it turns out that the VMware hgfs filesystem doesn't support hard linking, which is a real pain, especially since the underlying OS X filesystem does. So I'm down to the workaround of storing my history file in /tmp rather than my home directory, which is slightly annoying, but not the end of the world.
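
If you want to check this on your own setup, a minimal sketch along these lines reproduces what strace and ln showed me: create a file on the shared filesystem and see whether link() succeeds. (The file names here are just illustrative.)

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    /* directory to test, defaulting to the current one */
    const char *dir = argc > 1 ? argv[1] : ".";
    char src[1024], dst[1024];
    int fd;

    snprintf(src, sizeof src, "%s/.linktest.src", dir);
    snprintf(dst, sizeof dst, "%s/.linktest.dst", dir);

    /* create a scratch file to link against */
    fd = open(src, O_CREAT | O_WRONLY, 0600);
    if (fd == -1) {
        perror("open");
        return 1;
    }
    close(fd);

    /* the actual test: does this filesystem support hard links? */
    if (link(src, dst) == -1) {
        printf("link() failed: %s\n", strerror(errno));
    } else {
        printf("link() works here\n");
        unlink(dst);
    }
    unlink(src);
    return 0;
}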

As it turns out I'm not the first to discover this, Roger C. Clermont also found this out a few days ago. With any luck we will find a solution in the near future.

permalink

Simple File Monitoring on Mac OS X

Thu, 15 May 2008 02:21:22

Mac OS X has the kevent() system call, which allows you to monitor various kernel events. This is useful, because I want to watch a file and then do something when it changes. Now, I would have thought I could find something really simple to do this, but I could only find massive GUI programs, which is not so great for scripting.

Anyway, long story short, I decided to write my own. It was pretty straightforward. I thought it was worth documenting how it works so that Benno 5 years from now can remember how to use kevent.

The first thing you need to do is create a kernel queue using the kqueue() system call. This call returns a descriptor which you then use on subsequent calls to kevent(). These descriptors come out of the file descriptor namespace, but don't actually get inherited on fork().

int kq;

    kq = kqueue();
    if (kq == -1) {
        err(1, "kq!");
    }

After creating the kernel queue, an event is registered. The EV_SET macro is used to initialise the struct kevent. The 1st argument is the address of the event structure to initialise. The 2nd argument is the file descriptor we wish to monitor. The 3rd argument is the type of event we wish to monitor; in this case we want to monitor the file underlying our file descriptor, which is the EVFILT_VNODE filter. The 5th argument is the filter-specific flags, in this case NOTE_WRITE, which means we want an event when the file is modified. The 4th argument describes what action to perform when the event happens. In particular we want the event added to the queue, so we use EV_ADD | EV_CLEAR. The EV_ADD is obvious, but EV_CLEAR less so: the NOTE_WRITE condition is triggered by the first write to the file after registration, and remains set, which means you would continue to receive the event indefinitely. By using the EV_CLEAR flag, the state is reset, so that an event is only delivered once for each write. (Actually it could be less than once per write, since events are coalesced.) The final arguments are data values, which aren't used for our event.

The kevent system call actually registers the event we initialised with EV_SET. The kevent function takes the kqueue descriptor as its 1st argument. The 2nd and 3rd arguments are a list of events to register (pointer and length); in this case we register the event we just initialised. The 4th and 5th arguments are a list of events to receive (in this case empty). The final argument is a timeout, which is not relevant here (as we aren't receiving any events).

    struct kevent ke;

    EV_SET(&ke,
           /* the file we are monitoring */ fd,
           /* we monitor vnode changes */ EVFILT_VNODE,
           /* when the file is written add an event, and then clear the
              condition so it doesn't re-fire */ EV_ADD | EV_CLEAR,
           /* just care about writes to the file */ NOTE_WRITE,
           /* don't care about value */ 0, NULL);
    r = kevent(kq, /* register list */ &ke, 1, /* event list */ NULL, 0, /* timeout */ NULL);

    if (r == -1) {
        err(1, "kevent failed");
    }

After we have registered our event we go into an infinite loop receiving events. This time we aren't registering any events, so the register list is simply NULL, but the 4th and 5th arguments now provide a list of up to 1 event to receive. We still don't want a timeout. We want to check that the event we received is what we expected, so we assert that it is.

        r = kevent(kq,
                   /* register list */ NULL, 0,
                   /* event list */ &ke, 1,
                   /* timeout */ NULL);
        if (r == -1) {
            err(1, "kevent");
        }
        assert(ke.filter == EVFILT_VNODE && ke.fflags & NOTE_WRITE);

The aim of this program is to run a shell command whenever a file changes. Simply getting the write event is not good enough: a program that is updating a file will cause a number of consecutive writes, and since our shell command will most likely want to operate on the file in a consistent state, we want to try to ensure the file has reached a quiescent point. UNIX doesn't really provide a good way of doing this. Well, actually, there is a bunch of file locking APIs, but I haven't really used them much, it isn't clear that the program writing the file would be using them, and as far as I can tell the writer would have to be using the same locking mechanism anyway. Also, the commands I want to run are only going to be reading the file, not writing to it, so at worst I'm going to end up with some broken output until the next write. Anyway, to get something that will work almost all the time, I've implemented a simple debouncing technique: a loop that waits until the file has not been written to for 0.5 seconds. 0.5 seconds is a good tradeoff between latency and ensuring the file is quiescent. It is far from ideal, but it will do.

To implement this, a struct timespec object is created to pass as the timeout parameter to kevent.

struct timespec debounce_timeout;

    /* Set debounce timeout to 0.5 seconds */
    debounce_timeout.tv_sec = 0;
    debounce_timeout.tv_nsec = 500000000;

In the debounce loop, kevent is used again, but this time with the 0.5 second timeout.

        /* debounce */
        do {
            r = kevent(kq,
                       /* register list */ NULL, 0,
                       /* event list */ &ke, 1,
                       /* timeout */ &debounce_timeout);
            if (r == -1) {
                err(1, "kevent");
            }
        } while (r != 0);

Finally, after the debounce, we run the command that the user specified on the command line. The following code shows the declaration, initialisation and execution of the command.

char *command;

    command = argv[2];

        system(command);

Using simplefilemon is easy, e.g.: simplefilemon filename "command to run".

You can compile simplefilemon.c with gcc simplefilemon.c -o simplefilemon.
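
For reference, here is roughly how the snippets above fit together as a complete program. This is my reconstruction rather than a verbatim copy of simplefilemon.c: in particular, the argument checking and the choice of open() with O_RDONLY are assumptions on my part.

#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <assert.h>
#include <err.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

int kq;
char *command;
struct timespec debounce_timeout;

int main(int argc, char *argv[])
{
    int fd, r;
    struct kevent ke;

    if (argc != 3) {
        errx(1, "usage: %s filename \"command to run\"", argv[0]);
    }

    /* open the file we want to monitor */
    fd = open(argv[1], O_RDONLY);
    if (fd == -1) {
        err(1, "open");
    }
    command = argv[2];

    /* Set debounce timeout to 0.5 seconds */
    debounce_timeout.tv_sec = 0;
    debounce_timeout.tv_nsec = 500000000;

    kq = kqueue();
    if (kq == -1) {
        err(1, "kq!");
    }

    /* register interest in writes to the file */
    EV_SET(&ke, fd, EVFILT_VNODE, EV_ADD | EV_CLEAR, NOTE_WRITE, 0, NULL);
    r = kevent(kq, &ke, 1, NULL, 0, NULL);
    if (r == -1) {
        err(1, "kevent failed");
    }

    for (;;) {
        /* block until the file is written to */
        r = kevent(kq, NULL, 0, &ke, 1, NULL);
        if (r == -1) {
            err(1, "kevent");
        }
        assert(ke.filter == EVFILT_VNODE && ke.fflags & NOTE_WRITE);

        /* debounce: loop until no write arrives for 0.5 seconds */
        do {
            r = kevent(kq, NULL, 0, &ke, 1, &debounce_timeout);
            if (r == -1) {
                err(1, "kevent");
            }
        } while (r != 0);

        system(command);
    }
}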

Download: simplefilemon.c

permalink

Flying

Tue, 13 May 2008 06:18:52

You would think after a few hundred flights and around 300,000 miles the wonder of flying would have worn off. And to a very large extent it has. There is nothing magical or exciting about being stuck in a cramped narrow seat for 12 hours, but there are definitely times when you can't help but be amazed at where technology and industrialisation have got us.

Taking off for the first time on the massive double-decker super jumbo, the A380, is definitely one of those experiences. Despite the solid engineering and science behind it, it is still pretty amazing when something that big actually gets off the ground. The fact that this aircraft is so quiet in operation just adds to the experience.

I was lucky enough to get a window seat on the upper deck on my flight from Sydney to Singapore last week. It was a comfortable seat, there is storage right next to you, which is great, and the entertainment system is freaking cool. Nice, large, crisp LCD screens, and a huge range of TV shows (I watched Buffy, and Bones), movies (I finally saw Juno), and multiplayer games (I cleaned up on Texas Hold-em). All in all, Singapore still gets my vote for best airline.

The next 10 flights (Singapore-Frankfurt, Frankfurt-Marseille, Marseille-Munich, Munich-Berlin, Berlin-Copenhagen, Copenhagen-Helsinki, Helsinki-Frankfurt, Frankfurt-Zürich, Zürich-Washington D.C., Washington D.C.-San Francisco) were nothing to write home about. I didn't get any upgrades, I came very close to missing connections, I ran out of battery on my laptop: all the usual things that make flying fun. I really must recommend not flying through Dulles. It took around 90 minutes to get through immigration, customs, baggage recheck, and security. It looked as though they were upgrading the airport, but if you are flying Europe to the west coast of the US I'd recommend anywhere else, except maybe Denver where you are liable to get snowed in, or Chicago where you are likely to miss your connection. In fact, just try to fly direct.

Thankfully, after the 8 hours to the east coast plus 6 more hours to the west coast, I was able to look forward to flying home to Sydney in business class. I'm not sure if it was the 14 hours of flying in economy, but this has been one of my most relaxed flights ever. For some reason the flight was basically empty; the business class cabin was only half-full, and I think anyone in economy probably got a row to themselves.

But none of that would usually inspire me to bother writing. What really did it was the view from the airplane at dawn. Seeing the sun rise over the horizon when you are flying 10 km above the planet is pretty amazing when you think about it.

Trying to capture the view is not easy; shooting out of the plane window is not exactly ideal, and I just don't think my point-and-shoot is up to it (blame the tools). Anyway, this photo is the best of the lot. It kind of works, but in real life the blues are bluer, the sun a deeper orange, and the view far more expansive.

clip_image004

permalink

At CELF and Embedded Systems Conference this week

Tue, 15 Apr 2008 09:48:56

I'm in San Jose at the moment for both the CELF Embedded Linux Conference and the Embedded Systems Conference (ESC). (Which are conveniently scheduled at the same time, in different places!) I'm not quite sure how much of each I'll see. I'm primarily going to be at CELF, but will probably end up spending some time as a booth babe at the Open Kernel Labs stand at ESC.

Most importantly, there will be beer at Gordon Biersch (San Jose) on Tuesday night from around 7pm. (Not Thursday night, as I may have told people previously; of course, if anyone wants to meet up on Thursday as well, that works too.)

I did manage to take a quick break from work yesterday and took advantage of the awesome weather in northern California to drive down to Big Sur along Highway 1. It was some pretty spectacular scenery. Hopefully I won't have a sprained ankle next time and will be able to do some hiking.

clip_image006

ESC seems to bring out some fun billboards, such as this one that I saw while driving near my hotel today.

clip_image008

permalink

pexif 0.11 released

Thu, 27 Mar 2008 13:22:11

I released a new version of pexif today. This release fixes some small bugs and now deals with files containing multiple application markers. This means files that have XMP metadata now work.

Now I just wish I had time to actually use it for its original purpose of attaching geo data to my photos.

permalink

On apostrophes, Unicode and XML

Thu, 28 Feb 2008 12:51:39

So, I started with something reasonably straight-forward: update my blog posts so that the
