"In het verleden behaalde resultaten bieden geen garanties voor de toekomst"
About this blog

These are the ramblings of Matthijs Kooijman, concerning the software he hacks on, hobbies he has and occasionally his personal life.

Most content on this site is licensed under the WTFPL, version 2 (details).

New career addition: Education

I don't really have a well-defined plan for my professional career, but things keep popping up. A theme for this month seems to be "education".

I just returned from a day at Saxion hogeschool in Enschede, where I gave a lecture/workshop about programming Arduino boards without using the Arduino IDE or Arduino core code.

It was nice to be in front of students again and things went reasonably well. It's a bit different to engage non-academic (HBO) students and I didn't get around to telling and doing everything I had planned, but I have another followup lecture next week to improve on things.

At the same time, I'm currently talking with a publisher of technical books about writing a book about building wireless sensor networks with Arduino and XBee. I'll have to do some extensive research into the details of XBee for this, but I'm looking forward to seeing how I like writing such a technical book. If this blog is any indication, that should work out just fine.

For now, these are just small things next to my other work, but who knows what I'll come across next?

0 comments -:- permalink -:- 23:02
Reviving Xanthe somewhat

S270 notebook

I was prompted to write this post after I got an email from another S270 user, thanking me for all the info I posted about these machines. He was also having power supply issues, with all the same resulting troubles I had (keyboard stuttering, USB issues, etc.).

He said he fixed this by simply replacing the notebook power supply and all his problems were gone. Interestingly I had already tried that and it didn't work for me, so we were having different problems.

Another reason for writing this post is that I actually fixed my power supply issues a few months ago. I can't really remember what prompted me to look into it again, but this time I didn't have anything to lose - I wasn't really using Xanthe for anything anymore. I opened up the casing completely and took out the mainboard to properly access the underside.

The power connector felt like it was fixed to the mainboard pretty tightly, but I put my soldering iron to it anyway. I just let the solder reflow nicely.

To my surprise, this actually worked. The power connector now powers the laptop properly, without any blinking. I suspect that the solder in one of the contacts had split and caused a flakey connection.

So, Xanthe is still idling around in a cabinet somewhere, but at least she's now usable as a backup or LARP prop sometime :-)

0 comments -:- permalink -:- 22:20
JTAGICE3 converter board

Side view after assembly

Last year, I got myself an Atmel JTAGICE3 programmer, in order to speed up programming my Pinoccio boards. This worked great, except that as can be expected of the tiny flatcable they used (1.27mm connector and even smaller flatcable), the cable broke within 6 months.

Atmel support didn't want to replace it, because it wasn't broken when I first unpacked the programmer, and told me to find and buy a new cable myself. Since finding these cables turns out to be tricky, and I didn't feel like breaking another cable in 6 months, I designed a converter board.

This board is a PCB to be mounted on top of the JTAGICE3, which can be connected permanently to the JTAGICE3 using a very short (1") flatcable, and converts to a "normal" (2.54mm pitch) connector for everyday use. It has both 10-pin JTAG and 6-pin ICSP headers, replacing the two converters that Atmel ships. Additionally, the board has another 1.27mm pitch connector, so you can still connect target boards which also use this tiny connector. This still requires using a fragile cable, but now at least only when it's really needed. Finally, the board has a reset button, which can be used to reset the target board (useful for Pinoccio boards, which have none themselves).

Connector polarity

Flatcables (top to bottom: original reversed, new straight, new reversed)

One odd thing about the JTAGICE3 is that the connector on the programmer itself is reversed with respect to the polarity notch. Most flatcable connectors have a polarity key, with a matching notch in the socket, to prevent reversing the connector. Normally, the key and notch are at the side that has pin 1, but the JTAGICE3 has it reversed (rotated), so it's on the side that has pin 2. The flatcable that Atmel supplies also has one of the connectors reversed (it's actually upside down), which cancels out this reversal again.

This reversal actually makes it even harder to get a replacement cable, since most of the cables you can find in this size have both connectors applied normally.

Since I didn't want any of this reversed socket business on my board, I made sure to do the reversal in the cable between the JTAGICE3 and the converter board, and have all sockets on the board using the normal orientation.

This does mean that the original cable supplied by Atmel is now useless, even if it hadn't broken: it's way too long to connect the JTAGICE3 to the converter board, while connecting the converter board to a target that uses this tiny connector needs a "straight" cable. The same goes for the squid cable, though that can still be used by manually correcting the pin numbers (1 becomes 10, 2 becomes 9, etc.).
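The pin renumbering for the squid cable follows a simple rule. As a minimal sketch (the function name reversed_pin is made up for illustration, and it assumes the 10-pin connector discussed here), rotating the connector 180° maps pin n to pin 11 - n:

```cpp
// Hypothetical helper: pin number seen on a 10-pin connector that is
// rotated 180 degrees, so 1 becomes 10, 2 becomes 9, and so on.
constexpr int reversed_pin(int pin) { return 11 - pin; }

// The two examples from the text, checked at compiletime.
static_assert(reversed_pin(1) == 10, "pin 1 should map to pin 10");
static_assert(reversed_pin(2) == 9, "pin 2 should map to pin 9");
```

Applying the mapping twice gives the original pin number again, which is why two reversals cancel each other out.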

Note that if you do want to keep using the reversed cable supplied by Atmel (perhaps because yours hasn't broken yet), that's just a matter of soldering on the JTAG2 socket rotated 180° (the notch is towards the board edge instead of toward the middle) - no need to change the PCB.

Above, there's a photo of the various flatcables. From top to bottom: original cable, with one connector upside-down, new regular cable, short cable with one connector reversed (effectively the same as upside-down).


Some parts needed

So, for this project, I needed:

  • a custom-designed PCB
  • a 2x5 pin shrouded pin header, regular 2.54mm spacing
  • a 2x3 pin shrouded pin header, regular 2.54mm spacing
  • two 2x5 pin shrouded pin headers, 1.27mm spacing (Samtec SHF-105-01-L-D-TH)
  • a tiny, reversed flatcable (Samtec FFSD-05-D-01.00-01-N-RN2)
  • a longer, non-reversed flatcable (Samtec FFSD-05-D-06.00-01-N)
  • Some jumper wire
  • Some bolts, nuts, copper standoff and insulating rings

For the half-pitch flatcables and connectors, I found that Samtec manufactures parts that fit the bill (though there should be others as well). I used their SHF series for the sockets, and FFSD series for the cables. Since I had some peculiar requirements (1" cable, reversed polarity key) and Digi-Key only stocks a very limited number of products from their line, I ended up ordering what I needed as (free) samples from Samtec directly. There, you can just specify whatever parameters you need (once you figure out their part number system). Given the cables took a few days longer to ship, I believe they assembled them on demand for me. Thanks, Samtec!

Board design and milling

I designed the board in KiCad, my favorite open-source schematic and board tool. I had to design a custom footprint for the 1.27mm pitch connector and modify some other footprint to get 3mm mounting holes. I already had custom AVR-ICSP and AVR-JTAG pin header components (just normal headers, but with named pins). All else is just standard KiCad stuff.

Since the board would cover up the status LEDs of the JTAGICE3, I added a cutout in the design, so you can still see the LEDs. I used pcb2gcode and a Mantis CNC mill to mill out the board, but I couldn't quite figure out how to get internal cutouts working properly. In the end, pcb2gcode would do the cutout, but it would mill on the outside of the cutout line instead of on the inside. I modified my design to shrink the cutout line by 1mm (size of my milling tool), so you might need to modify this line for your workflow if you want to create this board as well.

For connecting the 1.27mm connectors, I'm running traces in between the pins of the connector. This is really tiny stuff, which isn't conforming to my clearance requirements. This is intentional: If I reduced my clearance requirement or reduced the trace size, this would affect the traces everywhere on the board, not just the part where they run between the pins. By keeping the traces wide, pcb2gcode will also see a clearance violation but do its best to fulfill it by just milling exactly through the middle - which gives me the largest trace and pad size possible with my machine. You might need to modify this a bit for your workflow.

Because squeezing in these traces reduces the width of these pads significantly (in a few places they're no wider than the holes), I modified the footprint to make the pads a bit higher, so you at least have some area left for soldering.

After milling the board, and applying the silkscreen markings using a lasercutter, I realized I had forgotten markers for pin 1 on the connectors. I've fixed this in the design, for anyone else to benefit.

For a lot more info about milling PCBs using a Mantis CNC mill, see this page at fablabamersfoort. It's an ongoing report of my experiments. It is written in Dutch, though.

I made the board design available on github, for anyone else to play with.

Top side PCB, with lasercutter silkscreen markings Bottom side PCB Schematic Board


Top side PCB, after soldering Bottom side PCB, after soldering

Soldering the board was easy, though some of the traces ended up a bit small because I'm running against the limits of my CNC mill. One trace actually got destroyed while soldering, so I fixed that using a piece of wire on the copper side.

To mount the board on top of the JTAGICE3, I had to make a few modifications to the programmer. The original PCB in the JTAGICE3 already had three (unused) mounting holes, which I could conveniently use. I found some 8mm high copper standoffs, which together with a 1.5mm plastic ring (for insulation) fit snugly between the original PCB and the top of the casing.

I slightly enlarged the mounting holes in the original PCB to 3mm, so I could fit standard M3 bolts. I also drilled 3mm holes in the top side of the casing, to match the mounting holes. I also needed to mill away a bit of plastic from the inside of the bottom casing, just below the three mounting holes, to make room for the head of the bolt.

To solidly fix the new board, I used a fourth mounting hole in my own PCB, which only attaches to the JTAGICE3 casing, not the original PCB.

During assembly, it turned out that my cutout was positioned slightly too high. I already corrected this in the board design I published, so that one should be good.

The end result is an elegant and solid extension of the programmer, with easily accessible connectors. There's still a tiny ribbon cable in there, but it fits perfectly and should never be under much stress. Now, back to actual work :-p

Original PCB with standoffs Standoffs closeup Original PCB, bottom side Front view after assembly Side view after assembly JTAGICE3 with bolts protruding Top view after assembly

0 comments -:- permalink -:- 17:05
Efficient compiletime initialization of variables in C++

Every now and then I work on some complex C++ code (mostly stuff running on Arduino nowadays) so I can write up some code in a nice, concise and abstracted manner. This almost always involves classes, constructors and templates, which serve their purpose in the abstraction, but once you actually call them, the compiler should optimize all of them away as much as possible.

This usually works nicely, but there was one thing that kept bugging me. No matter how simple your constructors are, initializing using constructors always results in some code running at runtime.

In contrast, when you initialize a normal integer variable, or a struct variable using aggregate initialization, the compiler can completely do the initialization at compiletime. E.g. this code:

#include <stdint.h>
struct Foo {uint8_t a; bool b; uint16_t c;};
Foo x = {0x12, false, 0x3456};

Would result in four bytes (0x12, 0x00, 0x34, 0x56, assuming no padding and big-endian) in the data section of the resulting object file. This data section is loaded into memory using a simple loop, which is about as efficient as things get.

Now, if I write the above code using a constructor:

struct Foo {
    uint8_t a; bool b; uint16_t c;
    Foo(uint8_t a, bool b, uint16_t c) : a(a), b(b), c(c) {}
};
Foo x = Foo(0x12, false, 0x3456);

This will result in those four bytes being allocated in the bss section (which is zero-initialized), with the constructor code being executed at startup. The actual call to the constructor is inlined of course, but this still means there is code that loads every byte into a register, loads the address in a register, and stores the byte to memory (assuming an 8-bit architecture, other architectures will do more bytes at a time).

This doesn't matter much if it's just a few bytes, but for larger objects, or multiple small objects, having the loading code intermixed with the data like this easily requires 3 to 4 times as much code as having it loaded from the data section. I don't think CPU time will be much different (though first zeroing memory and then loading actual data is probably slower), but on embedded systems like Arduino, code size is often limited, so not having the compiler just resolve this at compiletime has always frustrated me.

Constant Initialization

Today I learned about a new feature in C++11: Constant initialization. This means that any global variables that are initialized to a constant expression will be resolved at compiletime and initialized before any (user) code (including constructors) starts to actually run.

A constant expression is essentially an expression that the compiler can guarantee can be evaluated at compiletime. They are required for e.g. array sizes and non-type template parameters. Originally, constant expressions included just simple (arithmetic) expressions, but since C++11 you can also use functions and even constructors as part of a constant expression. For this, you mark a function using the constexpr keyword, which essentially means that if all parameters to the function are compiletime constants, the result of the function will also be (additionally, there are some limitations on what a constexpr function can do).

So essentially, this means that if you add constexpr to all constructors and functions involved in the initialization of a variable, the compiler will evaluate them all at compiletime.
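As a minimal sketch of what that looks like for the Foo struct from the start of this post (same fields, only the constexpr keyword is added, and the static_assert is just there as a sanity check):

```cpp
#include <stdint.h>

// Same struct as before, but with the constructor marked constexpr.
struct Foo {
    uint8_t a; bool b; uint16_t c;
    constexpr Foo(uint8_t a, bool b, uint16_t c) : a(a), b(b), c(c) {}
};

// All arguments are compiletime constants and the constructor is
// constexpr, so the initializer is a constant expression and x is
// eligible for constant initialization. Note that x itself is not const.
Foo x = Foo(0x12, false, 0x3456);

// Sanity check: this only compiles if the initializer can be evaluated
// at compiletime.
static_assert(Foo(0x12, false, 0x3456).c == 0x3456,
              "initializer is not a constant expression");
```

With this change, the compiler should be able to put the bytes for x directly into the data section again, though that remains something to verify in the generated binary.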

(On a related note - I'm not sure why the compiler doesn't deduce constexpr automatically. If it can verify if it's allowed to use constexpr, why not add it? Might be too resource-intensive perhaps?)

Note that constant initialization does not mean the variable has to be declared const (i.e. immutable) - it's just that the initial value has to be a constant expression. These are really different concepts: it's perfectly possible for a const variable to have a non-constant expression as its value. This means that the value is set by normal constructor calls or whatnot at runtime, possibly with side-effects, without allowing any further changes to the value after that.
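To make the difference concrete, here is a small sketch. The names read_setting, twice, limit, counter and bump are made up for illustration:

```cpp
// Hypothetical function that is NOT constexpr: as far as the language is
// concerned, its result is only known at runtime.
int read_setting() { return 42; }

// Hypothetical constexpr function: usable inside constant expressions.
constexpr int twice(int v) { return 2 * v; }

// const, but not constant-initialized: the initializer calls a
// non-constexpr function, so the language only guarantees that this
// happens at runtime, before main() (an optimizer may still fold it).
const int limit = read_setting();

// Constant-initialized, but not const: the initial value is a constant
// expression, yet the variable can be modified afterwards.
int counter = twice(21);

void bump() { counter++; }
```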

Enforcing constant initialization?

Anyway, so much for the introduction of this post, which turned out longer than I planned :-). I learned about this feature from this great post by Andrzej Krzemieński. He also writes that it is not really possible to enforce that a variable is constant-initialized:

It is difficult to assert that the initialization of globals really took place at compile-time. You can inspect the binary, but it only gives you the guarantee for this binary and is not a guarantee for the program, in case you target for multiple platforms, or use various compilation modes (like debug and retail). The compiler may not help you with that. There is no way (no syntax) to require a verification by the compiler that a given global is const-initialized.

If you accidentally forget constexpr on one function involved, or some other requirement is not fulfilled, the compiler will happily fall back to less efficient runtime initialization instead of notifying you so you can fix this.

This smelled like a challenge, so I set out to investigate if I could figure out some way to implement this anyway. I thought of using a non-type template argument (which are required to be constant expressions by C++), but those only allow a limited set of types to be passed. I tried using __builtin_constant_p, a non-standard gcc construct, but that doesn't seem to recognize class-typed constant expressions.

Using static_assert

It seems that using the (also introduced in C++11) static_assert statement is a reasonable (though not perfect) option. The first argument to static_assert is a boolean that must be a constant expression. So, if we pass it an expression that is not a constant expression, it triggers an error. For testing, I'm using this code:

class Foo {
public:
  constexpr Foo(int x) { }
  Foo(long x) { }
};

Foo a = Foo(1);
Foo b = Foo(1L);

We define a Foo class, which has two constructors: one accepts an int and is constexpr and one accepts a long and is not constexpr. Above, this means that a will be const-initialized, while b is not.

To use static_assert, we cannot just pass a or b as the condition, since the condition must return a bool type. Using the comma operator helps here (the comma accepts two operands, evaluates both and then discards the first to return the second):

static_assert((a, true), "a not const-initialized"); // OK
static_assert((b, true), "b not const-initialized"); // OK :-(

However, this doesn't quite work: neither of these results in an error. I was actually surprised here - I would have expected them both to fail, since neither a nor b is a constant expression. In any case, this doesn't work. What we can do is simply copy the initializer used for both into the static_assert:

static_assert((Foo(1), true), "a not const-initialized"); // OK
static_assert((Foo(1L), true), "b not const-initialized"); // Error

This works as expected: The int version is ok, the long version throws an error. It doesn't trigger the assertion, but recent gcc versions show the line with the error, so it's good enough:

test.cpp:14:1: error: non-constant condition for static assertion
 static_assert((Foo(1L), true), "b not const-initialized"); // Error
test.cpp:14:1: error: call to non-constexpr function ‘Foo::Foo(long int)’

This isn't very pretty though - the comma operator doesn't make it very clear what we're doing here. Better is to use a simple inline function, to effectively do the same:

template <typename T>
constexpr bool ensure_const_init(T t) { return true; }

static_assert(ensure_const_init(Foo(1)), "a not const-initialized"); // OK
static_assert(ensure_const_init(Foo(1L)), "b not const-initialized"); // Error

This achieves the same result, but looks nicer (though the ensure_const_init function does not actually enforce anything, it's the context in which it's used, but that's a matter of documentation).

Note that I'm not sure if this will actually catch all cases, I'm not entirely sure if the stuff involved with passing an expression to static_assert (optionally through the ensure_const_init function) is exactly the same stuff that's involved with initializing a variable with that expression (e.g. similar to the copy constructor issue below).

The function itself isn't perfect either - It doesn't do anything with (const) (rvalue) references so I believe it might not work in all cases, so that might need some fixing.

Also, having to duplicate the initializer in the assert statement is a big downside - If I now change the variable initializer, but forget to update the assert statement, all bets are off...

Using constexpr constant

As Andrzej pointed out in his post, you can mark variables with constexpr, which requires them to be constant initialized. However, this also makes the variable const, meaning it cannot be changed after initialization, which we do not want. However, we can still leverage this using a two-step initialization:

constexpr Foo c_init = Foo(1); // OK
Foo c = c_init;

constexpr Foo d_init = Foo(1L); // Error
Foo d = d_init;

This isn't very pretty either, but at least the initializer is only defined once. This does introduce an extra copy of the object. With the default (implicit) copy constructor this copy will be optimized out and constant initialization still happens as expected, so no problem there.

However, with user-defined copy constructors, things are different:

class Foo2 {
public:
  constexpr Foo2(int x) { }
  Foo2(long x) { }
  Foo2(const Foo2&) { }
};

constexpr Foo2 e_init = Foo2(1); // OK
Foo2 e = e_init; // Not constant initialized but no error!

Here, a user-defined copy constructor is present that is not declared with constexpr. This results in e being not constant-initialized, even though e_init is (this is actually slightly weird - I would expect the initialization syntax I used to also call the copy constructor when initializing e_init, but perhaps that one is optimized out by gcc in an even earlier stage).

We can use our earlier ensure_const_init function here:

constexpr Foo f_init = Foo(1);
Foo f = f_init;
static_assert(ensure_const_init(f_init), "f not const-initialized"); // OK

constexpr Foo2 g_init = Foo2(1);
Foo2 g = g_init;
static_assert(ensure_const_init(g_init), "g not const-initialized"); // Error

This code is actually a bit silly - of course f_init and g_init are const-initialized, they are declared constexpr. I initially tried this separate init variable approach before I realized I could (need to, actually) add constexpr to the init variables. However, this silly code does catch our problem with the copy constructor. This is just a side effect of the fact that the copy constructor is called when the init variables are passed to the ensure_const_init function.

Remaining Problems

There are two significant problems left:

  1. None of these approaches actually guarantee that const-initialization happens. It seems they catch the most common problem: Having a non-constexpr function or constructor involved, but inside the C++ minefield that is (copy) constructors, implicit conversions, half a dozen initialization methods, etc., I'm pretty confident that there are other caveats we're missing here.

  2. None of these approaches are very pretty. Ideally, you'd just write something like:

    constinit Foo f = Foo(1);

    or, slightly worse:

    Foo f = constinit(Foo(1));

Implementing the second syntax seems to be impossible using a function - function parameters cannot be used in a constant expression (they could be non-const). You can't mark parameters as constexpr either.

I considered using a preprocessor macro to implement this. A macro can easily take care of duplicating the initialization value (and since we're enforcing constant initialization, there are no side effects to worry about). It's tricky, though, since you can't just put a static_assert statement, or additional constexpr variable declaration inside a variable initialization. I considered using a C++11 lambda expression for that, but those can only contain a single return statement and nothing else (unless they return void) and cannot be declared constexpr...

Perhaps a macro that completely generates the variable declaration and initialization could work, but still a single macro that generates multiple statements is messy (and the usual do {...} while(0) approach doesn't work in global scope). It's also not very nice...
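For what it's worth, a rough sketch of such a declaration-generating macro could look like the code below. The macro name CONST_INIT and its argument order are made up, and the value member was added to Foo only so there is something to inspect. It relies on static_assert being a declaration, so it is allowed at global scope. It inherits all the caveats of ensure_const_init mentioned above, and an initializer containing a top-level comma would need a variadic macro instead. (Later C++ standards, starting with C++20, added a constinit specifier for exactly this purpose; the sketch stays within C++11.)

```cpp
// ensure_const_init as defined earlier: the function itself enforces
// nothing, but using it inside static_assert forces its argument to be
// a constant expression.
template <typename T>
constexpr bool ensure_const_init(T) { return true; }

// Hypothetical macro: first checks the initializer, then declares the
// variable using that same initializer, so it is only written once.
#define CONST_INIT(type, name, init) \
    static_assert(ensure_const_init(init), #name " not const-initialized"); \
    type name = init

class Foo {
public:
    constexpr Foo(int x) : value(x) { }
    Foo(long x) : value(static_cast<int>(x)) { }
    int value;  // added for illustration only
};

CONST_INIT(Foo, f, Foo(1));      // OK
// CONST_INIT(Foo, g, Foo(1L));  // Error: call to non-constexpr constructor
```

The variable declared this way is still a normal, mutable global; only its initializer is checked.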

Any other suggestions?

0 comments -:- permalink -:- 21:25
Replaced GPG key

For anyone that cares: I just replaced my GPG (GNU Privacy Guard) key that I use for signing my emails and Debian uploads.

My previous key was already 9 years old and used a 1024-bit DSA key. That seemed like a good idea at the time, but for some time these small keys and signatures using SHA-1 have been considered weak and their use is discouraged. By the end of this year, Debian will be actively removing the weak keys from their keyring, so about time I got a stronger key as well (not sure why I didn't act on this before, perhaps it got lost on a TODO list somewhere).

In any case, my new key has ID A1565658 and fingerprint E7D0 C6A7 5BEE 6D84 D638 F60A 3798 AF15 A156 5658. It can be downloaded from the keyservers, or from my own webserver (the latter includes my old key for transitioning).

Now, I should find some Debian Developers to meet in person and sign my key. Should have taken care of this before T-Dose last year...

0 comments -:- permalink -:- 14:03
Automatically restarting my serial console on Arduino uploads

Minicom running under arduinoconsole script Arduino Community Logo

When working with an Arduino, you often want the serial console to stay open, for debugging. However, while you have the serial console open, uploading will not work (because the upload relies on the DTR pin going from high to low, which happens when opening up the serial port, but not if it's already open). The official IDE includes a serial console, which automatically closes when you start an upload (and once this pullrequest is merged, automatically reopens it again).

However, of course I'm not using the GUI serial console in the IDE, but minicom, a text-only serial console I can run inside my screen. Since the IDE (which I do use for compiling and uploading, by calling it on the commandline using a Makefile - I still use vim for editing) does not know about my running minicom, uploading breaks.

I fixed this using some clever shell scripting and signal-passing. I have an arduinoconsole script (that you can pass the port number to open - pass 0 for /dev/ttyACM0) that opens up the serial console, and when the console terminates, it is restarted when you press enter, or a proper signal is received.

The other side of this is the Makefile I'm using, which kills the serial console before uploading and sends the restart signal after uploading. This means that usually the serial console is already open again before I switch to it (or, I can switch to it while still uploading and I'll know uploading is done because my serial console opens again).

For convenience, I pushed my scripts to a github repository, which makes it easy to keep them up-to-date too:

0 comments -:- permalink -:- 10:01
Bouncing packets: Kernel bridge bug or corner case?


While setting up Tika, I stumbled upon a fairly unlikely corner case in the Linux kernel networking code that prevented some of my packets from being delivered to the right place. After quite some digging through debug logs and kernel source code, I found the cause of this problem in the way the bridge module handles netfilter and iptables.

Just in case someone else actually finds himself in this situation and actually manages to find this blogpost, I'll detail my setup, the problem and its solution here.

Tika's network setup

Tika runs Debian wheezy, with a single network interface to the internet (which is not involved in this problem). Furthermore, Tika runs a number of lxc containers, which are isolated systems sharing the same kernel, but running a complete userspace of their own. Using kernel namespaces and cgroups, these containers obtain a fair degree of separation: Each of them has its own root filesystem, a private set of mounted filesystems, separate user ids, separated network stacks, etc.

Each of these containers then connects to the outside world using a virtual ethernet device. This is sort of a named pipe, but then for ethernet. Each veth device has two ends, one inside the container, and one outside, which are connected. On the inside, it just looks like each container has a single ethernet device, which is configured normally. On the outside, all of these veth interfaces are grouped together into a bridge device, br-lxc, which allows the containers to talk amongst themselves (just as if they were connected to the same ethernet switch). The bridge device in the host is configured with an IP address as well, to allow communication between the host and containers.

Now, I have a few port forwarding rules: when traffic comes in on my public IP address on specific ports, it gets forwarded to a specific container. There is nothing special about this, this is just like forwarding ports to LAN hosts on a NAT router.

A problem with port forwarding like this is that by default, packets coming in from the internal side cannot be properly handled. As an example, one of the containers is running a webserver, which serves a custom Debian repository on the domain. When another container tries to connect to that, DNS resolution will give it the external IP of tika, but connecting to that IP fails.

Usually, the DNAT rule used for portforwarding is configured to only process packets from the external network. But even if it would process internal packets, it would not work. The DNAT rule changes the destination address of these packets to point to my web container so they get sent to the web container. However, the source address is unchanged. Since the containers have a direct connection (through the network bridge), reply packets get sent directly to the original container - the host does not have a chance to "undo" the DNAT on the reply packets. For external connections, this is not a problem because the host is the default gateway for the containers and the replies need to go through the host to reach the external IP.

The most common solution to this is split-horizon DNS - make sure that all these domains resolve to the internal address of the web container, so no port forwarding is needed. For various practical reasons, this didn't work for me, so I settled for the other solution: Apply SNAT in addition to DNAT, which causes the source address of the forwarded packets to be changed to the host's address, forcing replies to pass through the host. The Vuurmuur firewall I was using even had a special "bounce" rule for exactly this purpose (setting up a DNAT and SNAT iptables rule).
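The reasoning about reply packets can be illustrated with a toy model. This is not real netfilter code: a packet is reduced to a source and destination address, all addresses are made-up examples, and all names (Packet, dnat, snat, reply_to, reply_passes_host) are hypothetical.

```cpp
#include <string>

// Toy model: a packet is just a source and a destination address.
struct Packet {
    std::string src;
    std::string dst;
};

// Made-up example addresses.
const std::string PUBLIC_IP = "203.0.113.1";  // external address of the host
const std::string HOST_IP   = "10.0.0.1";     // host address on the bridge
const std::string WEB_IP    = "10.0.0.2";     // web container
const std::string CLIENT_IP = "10.0.0.3";     // another container

// DNAT: rewrite the destination from the public address to the web container.
Packet dnat(Packet p) {
    if (p.dst == PUBLIC_IP) p.dst = WEB_IP;
    return p;
}

// SNAT: rewrite the source, so the reply will be addressed to the host.
Packet snat(Packet p) {
    p.src = HOST_IP;
    return p;
}

// The web container answers whoever appears as the source of the request.
Packet reply_to(const Packet &request) {
    return Packet{request.dst, request.src};
}

// Only a reply addressed to the host gives the host a chance to undo the NAT.
bool reply_passes_host(const Packet &reply) {
    return reply.dst == HOST_IP;
}
```

With only dnat applied to a request from CLIENT_IP to PUBLIC_IP, the reply is addressed straight to CLIENT_IP, with WEB_IP as its source instead of the PUBLIC_IP the client was talking to. With snat applied as well, the reply is addressed to HOST_IP and so passes through the host first.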

This setup worked perfectly - when connecting to the web container from other containers. However, when the web container tried to connect to itself (through the public IP address), the packets got lost. I initially thought the packets were dropped - they went through the PREROUTING chain as normal, but never showed up in the FORWARD chain. I also thought the problem was caused by the packet having the same source and destination addresses, since packets coming from other containers worked as normal. Neither of these turned out to be true, as I'll show below.

Simplifying the setup

Since reproducing the problem on a different and/or simpler setup is always a good approach in debugging, I tried to reproduce the problem on my laptop, using a (single) reguler ethernet device and applying DNAT and SNAT rules. This worked as expected, but when I added a bridge interface, containing just the ethernet interface, it broke again. Adding a second (vlan) interface to the bridge uncovered that the problem was not traffic DNATed back to its source, but rather traffic DNATed back to the same bridge port it originated from - traffic from one bridge port DNATed to the other worked normally.

Digging down into the kernel sources for the bridge module, I uncovered a piece of code that applies some special handling for exactly these DNATed packets on a bridge. It seems this is either a performance optimization, or a way to allow DNATing packets inside a bridge without having to enable full routing, though I find the exact effects of this code rather confusing.

I also found that setting the bridge device to promiscuous mode (e.g. running tcpdump) makes everything work. Setting /proc/sys/net/bridge/bridge-nf-call-iptables to 0 also makes this work. This setting is to prevent bridged packets from passing through iptables, but since this packet wasn't actually a bridged packet before PREROUTING, this actually makes the packet be processed using the normal routing code and progresses through all regular chains normally.

Here's what I think happens:

  • The packet comes in br_handle_frame
  • The frame gets dumped into the NF_BR_PRE_ROUTING netfilter chain (i.e. the bridge / ebtables version, not the ip / iptables one).
  • The ebtables rules get called
  • The br_nf_pre_routing hook for NF_BR_PRE_ROUTING gets called. This interrupts (returns NF_STOLEN) the handling of the NF_BR_PRE_ROUTING chain, and calls the NF_INET_PRE_ROUTING chain.
  • The br_nf_pre_routing_finish finish handler gets called after completing the NF_INET_PRE_ROUTING chain.
  • This handler resumes the handling of the interrupted NF_BR_PRE_ROUTING chain. However, because it detects that DNAT has happened, it sets the finish handler to br_nf_pre_routing_finish_bridge instead of the regular br_handle_frame_finish finish handler.
  • br_nf_pre_routing_finish_bridge runs; this sets skb->dev to the parent bridge, sets the BRNF_BRIDGED_DNAT flag and calls neigh->output(neigh, skb), which presumably resolves to one of the neigh_*output functions, each of which in turn calls dev_queue_xmit, which should (eventually) call br_dev_xmit.
  • br_dev_xmit sees the BRNF_BRIDGED_DNAT flag and calls br_nf_pre_routing_finish_bridge_slow instead of actually delivering the packet.
  • br_nf_pre_routing_finish_bridge_slow sets up the destination MAC address, sets skb->dev back to skb->physindev and calls br_handle_frame_finish.
  • br_handle_frame_finish calls br_forward. If the bridge device is set to promiscuous mode, this also delivers the packet up through br_pass_frame_up. Since enabling promiscuous mode fixes my problem, it seems likely that the packet manages to get all the way to here.
  • br_forward calls should_deliver, which returns false when skb->dev == p->dev (i.e. the packet would leave through the bridge port it came in on) and "hairpin mode" is not enabled, causing the packet to be dropped.

This seems like a bug, or at least an unfortunate side effect. It seems there are currently two ways to work around this problem:

  • Setting /proc/sys/net/bridge/bridge-nf-call-iptables to 0, so there is no need for this DNAT + bridge stuff. The side effect of this solution is that bridge packets don't go through iptables, but that's really what I'd have expected in the first place, so this is not a problem for me.
  • Setting the bridge port to "hairpin" mode, which allows sending packets back into it. The side effect here is, AFAICS, that broadcast packets are sent back into the bridge port as well, which isn't really needed (but shouldn't really hurt either).
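
In terms of commands, the two workarounds come down to something like the following sketch. The names br0 and eth0 are placeholders for the bridge and the affected bridge port, and everything needs to run as root.

```shell
# Placeholder names: br0 is the bridge, eth0 the bridge port involved.

# Workaround 1: don't pass bridged packets through iptables at all.
echo 0 > /proc/sys/net/bridge/bridge-nf-call-iptables
# (equivalent: sysctl -w net.bridge.bridge-nf-call-iptables=0)

# Workaround 2: enable hairpin mode on the port, using iproute2 ...
bridge link set dev eth0 hairpin on
# ... or using bridge-utils:
# brctl hairpin br0 eth0 on
```

Only one of the two is needed. Neither setting survives a reboot unless it is also added to the system's network or sysctl configuration.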

Next up is reporting this to a kernel mailing list to confirm if there is an actual kernel bug, or just a bug in my expectations :-)

Update: Turns out this behaviour was previously spotted, but no consensus about a fix was reached.

0 comments -:- permalink -:- 18:40
Introducing Tika

[Images: Tika Tovenaar; Supermicro 5015A]

(This post has been lying around as a draft for a few years, thought I'd finish it up and publish it now that Tika has finally been put into production)

A few years back, I purchased a new server together with some friends, which we've named "Tika" (daughter of "Tita Tovenaar", both wizards from a Dutch television series from the 70's). This name combines Daenney's "wizards and magicians" naming scheme with my "Television shows from my youth" naming scheme quite neatly. :-)

It's a Supermicro 5015A rack server sporting an Atom D510 dual core processor, 4GB of RAM, 500GB of HD storage and a recently added 128GB of SSD storage. It is intended to replace Drsnuggles, my current HP DL360G2 (which has been very robust and loyal so far, but just draws too much power) as well as Daenney's Zeratul, an Apple Xserve. Both of our current machines draw around 180W, versus just around 20-30W for Tika. :-D You've got to love the Atom processor (and it probably outperforms our current hardware anyway, just by being over 5 years newer...).

Over the past three years, I've been working together with Daenney and Bas on setting up the software stack on Tika, which proved to be a bit more work than expected. We wanted to have a lot of cool things, like LXC containers, privilege separation for web applications, a custom LDAP schema and a custom web frontend for user (self-)management, etc. Me being the perfectionist I am, it took quite some effort to get things done, also producing quite a number of bug reports, patches and custom scripts in the process.

Last week, we finally put Tika into production. My previous server, Drsnuggles, had a hardware breakdown, which forced me to wrap up Tika's configuration into something usable (which still took me a week, since I seem to be unable to compromise on perfection...). So now my e-mail, websites and IRC are working as expected on Tika, with the stuff from Bas and Daenney still needing to be migrated.

I also still have some draft postings lying around about Maroesja, the custom LDAP schema / user management setup we are using. I'll try to wrap those up in case others are interested. The user management frontend we envisioned hasn't been written yet, but we'll soon tire of manual LDAP modification and get to that, I expect :-)

0 comments -:- permalink -:- 14:10
JTAG and SPI headers for the Pinoccio Scout

[Image: Pinoccio Scout]

The Pinoccio Scout is a wonderful Arduino-like microcontroller board that has builtin mesh networking, a small form factor and a ton of resources (at least in Arduino terms: 32K of SRAM and 256K of flash).

However, flashing a new program into the scout happens through a serial port at 115200 baud. That's perfectly fine when you only have 32K of flash or for occasional uploads. But when you upload a 100k+ program dozens of times per day, it turns out that that's actually really slow! Uploading and verifying a 104KiB sketch takes over 30 seconds, just too long to actually wait for it (so you do something else, get distracted, and gone is the productivity).

$ time avrdude -p atmega256rfr2  -cwiring -P/dev/ttyACM0 -b115200 -D -Uflash:w:Bootstrap.cpp.hex:i

avrdude: AVR device initialized and ready to accept instructions

Reading | ################################################## | 100% 0.01s

avrdude: Device signature = 0x1ea802
avrdude: reading input file "Bootstrap.cpp.hex"
avrdude: writing flash (106654 bytes):

Writing | ################################################## | 100% 18.80s

avrdude: 106654 bytes of flash written
avrdude: verifying flash memory against Bootstrap.cpp.hex:
avrdude: load data flash data from input file Bootstrap.cpp.hex:
avrdude: input file Bootstrap.cpp.hex contains 106654 bytes
avrdude: reading on-chip flash data:

Reading | ################################################## | 100% 13.51s

avrdude: verifying ...
avrdude: 106654 bytes of flash verified

avrdude: safemode: Fuses OK (H:DE, E:10, L:DE)

avrdude done.  Thank you.

real    0m32.824s
user    0m0.120s
sys     0m0.560s
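
As a back-of-the-envelope check on these numbers: assuming the usual 8N1 serial framing (an assumption, the framing is not shown in the output above), every byte takes 10 bits on the wire, which puts a hard lower bound on each pass over the data.

```shell
# Lower bound for moving a number of bytes over a UART, assuming 8N1 framing
# (1 start + 8 data + 1 stop = 10 bits on the wire per byte).
# $1 = payload size in bytes, $2 = baud rate
uart_seconds() {
    awk -v bytes="$1" -v baud="$2" 'BEGIN { printf "%.1f\n", bytes * 10 / baud }'
}

uart_seconds 106654 115200   # prints 9.3
```

So about 9.3 seconds is the best case for writing the sketch, and again for reading it back. The measured 18.80s and 13.51s come on top of that, presumably due to protocol and flash programming overhead.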

Using an external (ISP) programmer is supposed to be faster. I had an STK500 development board to use, but that's also connected to my PC through a 115200 baud serial port, so no help there.

$ time avrdude -patmega256rfr2 -cstk500v2 -P/dev/serial/stk500 -b115200 -D -Uflash:w:Bootstrap.cpp.hex:i
real    0m27.523s

So, I got myself a JTAGICE3 programmer, which as an added bonus can do in-circuit debugging as well (e.g. stepping through the code, dumping variables). After setting up the udev permission rules on my Linux system, I used a bunch of jumper wires to hook the JTAGICE3 to the ISP/SPI pins of my Scout.

[Image: JTAGICE3 connected to Scout with jumper wires]


SPI pin   Function   Pinoccio pin
2         VTARGET    SCL (through diode)

One pin was particularly cumbersome: the VTARGET pin. On my STK500 board, this pin supplies a configurable voltage, to power the target board. However, on the Scout, there is no direct access to the VCC line on the pin headers (the 3V3 pin is the output from a secondary regulator, not connected to the main MCU's VCC and it cannot be used as a power input).

I tried powering the board through the USB cable as normal and leaving the VTARGET pin on my JTAGICE3 disconnected, but that didn't work. Interestingly enough, using the 6-pin ISP "connector" (no pins, only pads) on the bottom of my Scout did work right away.

It turns out the JTAGICE3 doesn't actually provide power on the VTARGET pin. Instead, it uses the pin to sample the target's voltage, so it can drive its data pins at the same voltage as the target, without needing to explicitly configure the voltage. This presented me with a challenge: I needed to put 3.3V on the VTARGET pin, but the 3V3 pin needs to be explicitly enabled by the Scout (and gets turned off when the AVR MCU is held in reset during ISP programming).

Fortunately, the SDA, SCL and BKP (backpack bus) pins each have a pull-up resistor. When nothing else is connected to these pins, their voltage is approximately VCC. Connecting VTARGET to one of these made things work!

Update: It seems that connecting VTARGET to BKP doesn't work, since its pullup resistor is big (100kΩ) and the VTARGET pin draws about 15 μA, resulting in a 1.5V voltage drop.
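
The voltage drop is just Ohm's law. A quick check of what remains on VTARGET, assuming the 3.3V supply mentioned earlier:

```shell
# Voltage left on VTARGET behind a pull-up resistor (Ohm's law: V = I * R).
# $1 = supply in volts, $2 = pull-up in ohms, $3 = current in microamps
vtarget_volts() {
    awk -v vcc="$1" -v ohms="$2" -v ua="$3" \
        'BEGIN { printf "%.1f\n", vcc - ohms * ua / 1000000 }'
}

vtarget_volts 3.3 100000 15   # BKP pull-up: prints 1.8
```

That leaves roughly 1.8V on VTARGET, well below the voltage the target actually runs at.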

Connecting VTARGET to SDA or SCL prevents I²C from working properly while the JTAGICE3 is still connected; it seems the JTAGICE3 acts like a capacitor, keeping the pins high and messing up the signals. Adding a 10k resistor helps to fix the I²C, but breaks JTAG programming again for reasons I don't fully grasp. However, adding a diode (I used a 1N4148) between SCL and VTARGET (connected such that current can flow from SCL to VTARGET but not the other way around) fixes everything. This does mean the actual programming happens at 2.8V due to the diode voltage drop, but that's still more than enough.

$ time avrdude -patmega256rfr2 -cjtag3isp -B1 -D -Uflash:w:Bootstrap.cpp.hex:i
real    0m17.280s

Note the -B1 option, which selects the fastest SPI speed supported by the JTAGICE3 (You can also just pass a lower number, which should also use the fastest supported speed, according to the avrdude sources).

This is already a bit better, but still not as good as I'd want.

Using JTAG

JTAG pin   Function   Pinoccio pin
4          VTG        SCL (through diode)

Until now, we used ISP, which talks to the target chip over the SPI pins (MISO/MOSI/SCK) and is limited to 1MHz operation on the JTAGICE3. However, you can also program this AVR chip using JTAG, a protocol originally designed for testing circuit boards and now widely used for debugging. Using JTAG, the JTAGICE3 can run at up to 10MHz according to avrdude. This isn't 10x as fast as ISP (probably because JTAG has more protocol overhead), but the speedup is significant.

Before we can do JTAG, though, we'll have to enable (switch to 0) the JTAGEN fuse in the atmega256rfr2. We can do this using the JTAGICE3's SPI mode:

$ avrdude -p atmega256rfr2 -c jtag3isp -U hfuse:w:0x10:m

Note that this fuse setting is specific to the Pinoccio Scout / Atmega256rfr2. If you have another device, check the datasheet for the correct fuse settings. Furthermore, this fuse setting also programs the OCDEN (on-chip debugging) fuse, though I haven't actually tried to use it. Keep in mind that with the JTAGEN fuse enabled, you can't use the JTAG pins for other purposes. Also, enabling the JTAGEN and OCDEN fuses increases power usage.
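
To see what a high fuse value like 0x10 sets, the byte can be decoded bit by bit. In the sketch below, the bit names (bit 7 down to bit 0) are an assumption based on the high fuse layout commonly used by the larger ATmega parts, so double check them against the ATmega256RFR2 datasheet. Remember that for AVR fuses, a 0 bit means "programmed".

```shell
# Decode an AVR high fuse byte. The bit names are ASSUMED (typical large
# ATmega layout, bit 7 first); verify them against the datasheet.
# $1 = fuse value, e.g. 0x10
decode_hfuse() {
    value=$(( $1 ))
    bit=7
    for name in OCDEN JTAGEN SPIEN WDTON EESAVE BOOTSZ1 BOOTSZ0 BOOTRST; do
        # A fuse bit that reads 0 is programmed (enabled)
        if [ $(( (value >> bit) & 1 )) -eq 0 ]; then
            state=programmed
        else
            state=unprogrammed
        fi
        printf '%s=%s\n' "$name" "$state"
        bit=$(( bit - 1 ))
    done
}

decode_hfuse 0x10
```

Under that assumed layout, 0x10 shows both JTAGEN and OCDEN as programmed, matching the description above.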

After enabling the fuse, I can flash sketches through the JTAG pins (which are mapped to A4-A7 on the Scout):

$ time avrdude -patmega256rfr2 -cjtag3 -B0.1 -D -Uflash:w:Bootstrap.cpp.hex:i
real    0m6.715s

Yeah, now we're talking!

Note again the -B option, for which 0.1 seems the fastest value for JTAG (though these values are a bit finicky; it seems that the JTAGICE3 firmware does some manipulation with this value).
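
For reference, avrdude's -B option takes a bit clock period in microseconds, so the nominal clock frequency is simply its inverse. A quick conversion (nominal values only, since the programmer may not use exactly the requested period):

```shell
# Convert an avrdude -B bit clock period (in microseconds) to MHz.
# $1 = period in microseconds
bitclock_mhz() {
    awk -v period_us="$1" 'BEGIN { printf "%g\n", 1 / period_us }'
}

bitclock_mhz 1     # -B1, as used for ISP: prints 1
bitclock_mhz 0.1   # -B0.1, as used for JTAG: prints 10
```

This lines up with the 1MHz ISP and 10MHz JTAG limits mentioned earlier.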

Proper headers

However, having to plug in 6 or 7 jumper wires whenever I want to flash a different board is a bit cumbersome (especially having to remember which wire goes where). Having proper headers (6-pin ISP header and a 10-pin JTAG header) would help here. This is where the Pinoccio Protoboard comes in: it's essentially a DIY backpack that you can solder components - or connectors - onto. After figuring out the pinout puzzle (so many crossed wires, good that these boards have two sides!) I ended up with this piece of work:

[Images: JTAGICE3 connected to Scout through custom backpack; backpack top; backpack bottom]

I marked pin 1 with a small black dot, which is commonly done with these kinds of connectors.

Next up: figuring out the debugging support using avarice on Linux, or if that doesn't work, Atmel Studio on Windows...

Update: The SVN version of avarice supports the JTAGICE3, so I can single-step through my code using gdb :-D

Update: I've been working on a proper PCB design for this board (since I needed one or two more). I haven't actually built one yet, but the design so far is available on github. See below for the rendered schematic and board files.

[Images: JTAG Backpack schematic; JTAG Backpack board]

Peter wrote at 2014-09-11 02:54

Hi Matthijis

Thanks for this useful piece. "Also, enabling the JTAGEN and OCDEN fuses increases power usage." Can I disable the fuse after I prototype with it to lower power usage?

Matthijs Kooijman wrote at 2014-09-11 07:35

Yes, if you disable the fuse again, the power usage will drop back to normal (and of course you can no longer access the chip through JTAG anymore).

Peter Northling wrote at 2014-09-16 02:35

Hey Matthijs

Do you think you can release a simplified schematics for the protoboard connections?

Matthijs Kooijman wrote at 2014-09-16 10:29

Incidentally, I was already working on a proper PCB design for this, so I had a schematic lying around already (planning to publish things once I got the first working PCB milled). But, now that you asked, I cleaned up the schematic a bit and published stuff right away, see the update in the post above.

Thanks for your interest! Let me know if you build one yourself!


4 comments -:- permalink -:- 18:01
Using a JTAGICE3 programmer under Linux: Setting up permissions


Last week, I got a fancy new JTAGICE3 programmer / debugger. I wanted to achieve two things in my Pinoccio work: Faster uploading of programs (Having 256k of flash space is nice, but flashing so much code through a 115200 baud serial connection is slow...) and doing in-circuit debugging (stepping through code and dumping variables should turn out easier than adding serial prints and re-uploading every time).

In any case, the JTAGICE3 device is well-supported by avrdude, the open source uploader for AVR boards. However, unlike devices like the STK500 development board, the AVR Dragon programmer/debugger and the Arduino bootloader, which use an (emulated) serial port to communicate, the JTAGICE3 uses a native USB protocol. The upside is that the data transfer rate is higher, but the downside is that the kernel doesn't know how to talk to the device, so it doesn't expose something like /dev/ttyUSB0 as it does for the other devices.

avrdude solves this by using libusb, which can talk to USB devices directly, through files in /dev/bus/usb/. However, by default these device files are writable only by root, since the kernel has no idea what kind of devices they are and whom to give permissions.

To solve this, we'll have to configure the udev daemon to create the files in /dev/bus/usb with the right permissions. I created a file called /etc/udev/rules.d/99-local-jtagice3.rules, containing just this line:

SUBSYSTEM=="usb", ATTRS{idVendor}=="03eb", ATTRS{idProduct}=="2110", GROUP="dialout"

This matches the JTAGICE3 specifically using its USB VID:PID (03eb:2110, use lsusb to find the ID of a given device) and changes the group for the device file to dialout (which is also used for serial devices on Debian Linux), but you might want to use another group (don't forget to add your own user to that group and log in again, in any case).
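
For the new rule to take effect without a reboot, something along these lines should do. This is a sketch, assuming a reasonably recent udev with udevadm, sudo access, and that $USER is the account that will run avrdude. Unplugging and replugging the programmer instead of the trigger step works too.

```shell
# Reload the udev rules and re-apply them to already connected USB devices
sudo udevadm control --reload-rules
sudo udevadm trigger --subsystem-match=usb

# Add the current user to the dialout group (takes effect on next login)
sudo usermod -a -G dialout "$USER"

# After logging in again, the JTAGICE3 should show up here
lsusb -d 03eb:2110
```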

0 comments -:- permalink -:- 13:57
Copyright by Matthijs Kooijman