fun file facts!
Posted by Mark Mon, 09 Jul 2007 06:09:36 GMT
So, I’m doing this odd little visualisation project. Part of it is to do with Facebook, which means that if it gets popular at all, my poor little server is going to get pounded harder than a goat at a furry convention. Therefore, I have some interesting constraints on resource usage.
I’m using the graph drawing library GraphViz, which is in C. I’m writing my app in Haskell using HAppS, and the Haskell interface to C is all fine and dandy: the difficulty comes because GraphViz wants to output its graph to a file, rather than making it available as a string in memory. This makes things difficult: I really don’t want even the possibility of hitting the disk.
To make this more concrete, I wrote a little C test program to see how fast each option is on my laptop. It runs one of three approaches, 100 times over:
- Copy the strings back and forth in memory - this should simulate what would happen if GraphViz generated the graph in memory rather than insisting on writing it to a file.
- Create a ramdisk, and write the strings out to it
- Write the strings to disk, then read them in again: the hope here is that the built-in IO caching will save me
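For illustration, the two kinds of round trip in the list above can be sketched roughly as follows. This is not the test program that produced the timings, just a minimal sketch: the function names `roundtrip_memory` and `roundtrip_file` are made up for this example, and the ramdisk and disk cases are assumed to differ only in the path handed to `roundtrip_file`.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Approach 1: bounce the payload through a scratch buffer in memory.
   Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int roundtrip_memory(const char *src, char *dst, size_t len)
{
    char *tmp = malloc(len);
    if (tmp == NULL)
        return -1;
    memcpy(tmp, src, len);   /* "write" the graph */
    memcpy(dst, tmp, len);   /* "read" it back */
    free(tmp);
    return 0;
}

/* Approaches 2 and 3: write the payload to `path`, then read it back.
   Whether this hits a ramdisk or a real disk depends only on where
   `path` lives. Returns 0 on success, -1 on any I/O error. */
int roundtrip_file(const char *path, const char *src, char *dst, size_t len)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    size_t written = fwrite(src, 1, len, f);
    /* fclose flushes the stdio buffer, so check its result too. */
    if (fclose(f) != 0 || written != len)
        return -1;

    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t got = fread(dst, 1, len, f);
    fclose(f);
    remove(path);            /* tidy up the temporary file */
    return got == len ? 0 : -1;
}
```

Calling either function 100 times in a loop under `time` would give numbers of the same flavour as the ones in this post, though the absolute values will depend heavily on payload size and hardware.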
The results for each approach respectively:
10:47 ~/projects/current % time ./a.out -m
./a.out -m 1.68s user 0.09s system 71% cpu 2.480 total
10:47 ~/projects/current % time ./a.out -r
./a.out -r 0.43s user 2.56s system 48% cpu 6.233 total
10:47 ~/projects/current % time ./a.out -f
./a.out -f 0.49s user 4.09s system 28% cpu 16.107 total
So it looks like the built-in caching is not so great: going through the disk took about 16.1 seconds of wall-clock time, against 6.2 for the ramdisk and 2.5 for plain memory copies. The ramdisk is faster than the disk, but still roughly two and a half times slower than keeping the strings in memory - presumably system call overhead is hurting me (note the 2.56s of system time, versus 0.09s for the in-memory run).
Another option I haven’t yet investigated would be for the Haskell process to open a named pipe and have C write to it, but I think that would require at least two processes: at the moment, I just have the one HAppS process and would like to keep it that way if possible. (My current host has a limit of 20 processes, which is a bit anemic.) In any case, I’d expect it to be at least as bad as the ramdisk approach, since the data still has to go through the kernel, although possibly a bit more doable on a shared host.
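To show where the named pipe idea gets awkward, here is a rough, untimed sketch of pushing a small payload through a FIFO from inside a single process. The function name `fifo_roundtrip` and the path passed to it are hypothetical. The big caveat is that this only works while the payload fits in the kernel's pipe buffer: anything larger would block in `write` until something else reads, which is exactly why a second process (or at least a second thread) comes into the picture.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Send `len` bytes through a named pipe at `fifo_path` and read them back,
   all within one process. Returns the number of bytes read, or -1 on error.
   ASSUMPTION: `len` is smaller than the pipe buffer; a larger payload would
   block forever in write(), because nothing is reading concurrently. */
ssize_t fifo_roundtrip(const char *fifo_path, const char *src,
                       char *dst, size_t len)
{
    if (mkfifo(fifo_path, 0600) != 0)
        return -1;

    /* Open the read end non-blocking, so open() does not wait for a writer. */
    int rfd = open(fifo_path, O_RDONLY | O_NONBLOCK);
    if (rfd < 0) {
        unlink(fifo_path);
        return -1;
    }

    /* A reader now exists, so opening the write end returns immediately. */
    int wfd = open(fifo_path, O_WRONLY);
    if (wfd < 0) {
        close(rfd);
        unlink(fifo_path);
        return -1;
    }

    ssize_t result = -1;
    if (write(wfd, src, len) == (ssize_t)len)
        result = read(rfd, dst, len);  /* data is already sitting in the pipe */

    close(wfd);
    close(rfd);
    unlink(fifo_path);                 /* remove the FIFO from the filesystem */
    return result;
}
```

Every byte still crosses into the kernel and back through `write` and `read`, so there is no obvious reason to expect this to beat the ramdisk numbers.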
In other news, I saw the Maladies play last night at the Hoey. They did an absolutely blistering set: I’ve never seen them quite that sharp. Roll on the album…

