Sunday, July 16, 2006

Reminder: reading from disk is expensive

Diddlbiker is working on a little project at work. What the project consists of is not important, but in one step of the process I have to find transportation links between two locations.
There is all kinds of theory written about this, but in this case the problem was very simple: I have rail links in table 'A', and truck links in table 'B'. What is the best routing option from Houston to New Jersey? Straight over Newark, or is railing to Philadelphia so much cheaper that the extra trucking cost doesn't matter?
So, for each route there are about thirty rail links to choose from, and there is only one truck link that connects each rail link to the final destination. So what's the big deal? I have to do it for 200,000 entries, and that takes some time.
The initial approach was a brute-force pure-SQL attempt. Well, that didn't go over that well.
The second approach was a VBA approach. Each route would start with a recordset of all rail links originating from its starting point; for each rail link I can then look up (in another recordset) the truck link, and when I have them all, find the cheapest one. That worked, but it took an amazing amount of time - after 45 minutes of processing, only about 1% of the entries were done!
From here I had three options:

  1. Intelligent approach: go and study and find an optimal way of doing it. I don't think there is a lot to be gained here - I'm only investigating 30 options per link, after all.

  2. Reduced data: instead of 30 options per link, I could cut the rail set by roughly two thirds by predetermining which 10 links are closest to the final destination. I wouldn't want to go below 10 - 5 or 6 links might be the closest location, but different stations, and you really want to see if routing over a different location works. It would reduce processing time by about 70% - but at 45 minutes per 1% the full run takes around 75 hours, so processing would still take around 20 to 30 hours.

  3. Caching: a lot of time is spent reading the truck rates, and there aren't that many of them (about 110,000 to choose from). Writing a simple cache wasn't that hard, and the result was amazing - total processing time went down to 45 minutes.

One of the nice things about VBA is that you get a pretty good Dictionary for free (it's in the Scripting Runtime Library). A small hashing function was written in minutes.
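The caching idea can be sketched in a few lines. This is a minimal illustration in Python rather than the VBA used for the real job; the names `TruckRateCache` and `slow_lookup` and the rates themselves are made up for the example, and a tuple key stands in for the small hashing function.

```python
class TruckRateCache:
    """Remember every truck rate after the first read (hypothetical helper)."""

    def __init__(self, fetch):
        self._fetch = fetch      # the expensive lookup, e.g. a database query
        self._rates = {}         # (origin, destination) -> rate
        self.disk_reads = 0      # how often the expensive lookup really ran

    def get(self, origin, destination):
        key = (origin, destination)
        if key not in self._rates:
            self.disk_reads += 1
            self._rates[key] = self._fetch(origin, destination)
        return self._rates[key]


def slow_lookup(origin, destination):
    # Stand-in for the recordset query; the rates are invented for illustration.
    rates = {
        ("Newark", "New Jersey"): 250.0,
        ("Philadelphia", "New Jersey"): 400.0,
    }
    return rates[(origin, destination)]


cache = TruckRateCache(slow_lookup)
for _ in range(1000):
    cache.get("Newark", "New Jersey")
    cache.get("Philadelphia", "New Jersey")

print(cache.disk_reads)  # 2 expensive reads instead of 2000
```

The gain comes from the ratio between lookups and distinct keys: the same truck links are requested over and over, so almost every request after the first is served from memory.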

Total time invested: about one hour
Total time saved: about 80 hours

Yay me!

Wednesday, July 05, 2006

Brilliant!





On really, really rare occasions Diddlbiker feels like he accomplished something brilliant. Today was one of those days. And the best thing is: my artwork is actually something small, elegant and comprehensible.

What happened was that my clients needed a table in one format, but the model it feeds demanded a different format:























Client Wishes      Model Demands
Year  Rate         Year  Factor
2006  0%           2006  1.0000
2007  5%           2007  1.0500
2008  4%           2008  1.0920
2009  3%           2009  1.1248
2010  2%           2010  1.1473
2011  1%           2011  1.1587


Clearly, my users want to specify a percentage increase for each year. I, on the other hand, want a factor that compounds the increases year after year. The solution I came up with was a non-equi self join with a kick:

SELECT tblYear.Year, Exp(Sum(Log([tblRate].[Rate] + 1.0))) AS Factor
FROM tblIncrease AS tblYear
INNER JOIN tblIncrease AS tblRate
ON tblRate.Year <= tblYear.Year
GROUP BY tblYear.Year;

I achieve two things in the query. First of all, by using a JOIN ... ON ... <= ... I'm able to pick up all years up to and including the 'current' year.

Second, SQL doesn't have a PRODUCT aggregate function. By turning the factors into logarithms I can multiply them by adding them together - and SUM is something that you can do in SQL! Once they're all summed, I can reverse the logarithm by using an EXP (exponent) function.

The query basically turns the first table (input-friendly) into the second table (model-friendly) without any scripts, temp tables or other garbage. Brilliant!
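The same arithmetic can be checked outside the database. Below is a minimal Python sketch, assuming the rates are stored as fractions (0.05 for 5%); the function name `factors_via_logs` is made up for the example. For each year it sums the logarithms of (1 + rate) over all years up to and including that year, then exponentiates, which mirrors what the join and the Exp(Sum(Log(...))) do.

```python
import math

# Yearly increases as fractions, taken from the "Client Wishes" table
increases = {2006: 0.00, 2007: 0.05, 2008: 0.04, 2009: 0.03, 2010: 0.02, 2011: 0.01}


def factors_via_logs(rates_by_year):
    """Compound the yearly increases using exp(sum(log(1 + rate)))."""
    result = {}
    for year in rates_by_year:
        # Same condition as the join: every rate year <= the current year
        log_sum = sum(
            math.log(1.0 + rate)
            for rate_year, rate in rates_by_year.items()
            if rate_year <= year
        )
        result[year] = math.exp(log_sum)
    return result


factors = factors_via_logs(increases)
for year in sorted(factors):
    print(year, f"{factors[year]:.4f}")
```

Rounded to four decimals this reproduces the "Model Demands" column. One caveat of the trick: the logarithm only exists for positive numbers, so it breaks down if a rate is -100% or lower.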



Speaking of brilliant: I saw a link on Joel on Software today about Elastic Tabstops - I wish every editor were that smart!

Tuesday, July 04, 2006

Nature in Bergen County

Diddlbiker's father-in-law got himself a bird feeder. That doesn't mean that you will get birds, though. Those dinosaur-descendants are very, very careful and seem to have a sixth sense when it comes to photography: they will simply not show up.

The good news is that those dang squirrels have no problem with biped mammals at all, so while I was sitting outside in the blistering heat, there was at least something that I could take pictures of. An FYI for my European friends: squirrels are not looked upon kindly in this country. They are enjoyed about as much as pigeons in large European cities. So, by analogy with 'winged rats', one could call them rats with fluffy tails.
So, why the picture of the squirrel? This would be a great moment to quote Ansel Adams, or Cartier-Bresson, about photographing the ordinary, but the truth is: they are adorable creatures to photograph!

So what about the birds? Eventually they got used to me and a couple of birds did show up. Thanks to Birds of the Mid-Atlantic I am able to find out what bird this is: a Mourning Dove. Just like in Dutch, American bird names are weird. I mean, you might think "tjiftjaf" and "lepelaar" are weird names, but what about "Grackle" or "Tufted Titmouse"?

I got even luckier after a few more minutes when another rare bird showed up, in this case a House Sparrow. I even managed to get them into a single frame! Without being sarcastic, there are a lot of interesting birds to be seen - I just never carry my camera with me when I see them (which is rare). So far I have seen:
  • Pelican (in Charleston)
  • Turkey Vultures (lots)
  • Eagles (not bald)
  • Cardinals (without bats)
And of course a whole list of ducks, warblers, sparrows, geese and other common birds.

And one day I will get a picture of that special bird. It will just take a while - I'm not a bird watcher after all.

Sunday, June 25, 2006

...and team Holland is next

Couldn't watch the entire game today; Diddlbiker had to take his son to a birthday party. When I got back, Holland was trailing Portugal 1-0 and an incredible bunch of red and yellow cards had been handed out.
Portugal limped into the next round, decapitated and with most of their players vulnerable with yellow cards - they will not survive the encounter with England.

Thursday, June 22, 2006

Team USA goes home

The USA team lost today in a crucial match at the football World Cup (I refuse to call it soccer). Not unexpected, when in a group together with Italy and the Czech Republic, although the final ranking of the group (Italy - Ghana - Czech Republic - USA) was not completely expected.
The inexperience of the American press with football and the role the USA team plays in it was fascinating to see. A lot of value was given to FIFA rankings. Rankings work well in sports like tennis, golf and chess, since the players play against each other a lot, so their ratings are calibrated all the time. National teams, on the other hand, only play a handful of matches each year, and that makes the ratings far from reliable.
So, going into Group E as the #5 team in the world, it looked like there would be a solid chance. To be honest, the team is good enough to have a serious chance of ending high in the group. But some realism should be applied as well - both the Czech and Italian teams are fearsome opponents, and African teams have proven over the years to be tough opponents as well.
So, the loss against the Czech Republic came as a total shock. Losing 3-0? Sadly, that was the only match in which the Czechs played really well, and the USA played really badly. Hence the result. So far, nothing shocking. Anybody can lose to the Czechs; there's nothing to be ashamed of.

What surprised me was the 'analysis' before the second game. Ghana absolutely kicks ass against the Czechs, and the reaction of the commentators: 'this is good news for team USA. They should get hope out of this game'. Basically, now 'all' that is needed is a draw against Italy, a win over Ghana and Italy beating the Czechs, putting the USA in second place. Let me repeat this: your previous opponent, who humiliated you, gets the crap kicked out of them by the team you will have to beat in the last round. And that is good news?

To be honest, I really hoped that the USA team would advance to the next round. Hopefully football will survive a blow like this - a lot of the enthusiasm for 'soccer' had to do with the good performance of the USA team so far. This moment had to come at some point; you can't progress forever, and hoping that team USA would be a serious contender for the World Cup in the next ten years would be hopelessly optimistic. Any MLS game (and I've watched too many) will confirm that. Still, an honorable exit against Brazil would have been better than an early exit after two losses and a draw. But the agony of defeat makes the next success taste better - something that team USA hopefully will learn now as well.

Monday, June 19, 2006

A trip to the terminal

Most terminals that Diddlbiker is visiting these days are airport terminals, but sometimes I get lucky. Last week Diddlbiker's department had its bi-annual meeting.
This year, the meeting took place at a container terminal, and included a tour as well! Now here is a place where I'd love to walk around for the entire day, taking pictures at will. Unfortunately, that will not happen for several reasons:
  • Liability: the chance that Diddlbiker gets run over by a truck, and that the terminal has to pay a huge amount of money to Diddlinabiker and kid, is fairly large
  • Security: the usual vague reasons that are always given - anything that is worth photographing seems to be a potential terrorist target nowadays (remember, taking pictures of the GWB will get you arrested!)
  • Competition - competitors of terminal X would love to know how many cranes there are on terminal Y, etc (after all, it's not like you can see those cranes from 5 miles distance...)
So, locked up in a van that rarely stopped, and of course sitting at the left (wrong) side of the van, I still managed to shoot some nice pictures. Worth a visit!

Saturday, June 10, 2006

Back from Charleston

One day later than expected. My flight home on Friday night was cancelled. I spent the night in the Embassy Suites hotel in Charleston, near the airport - I can recommend it to anyone!
The flight home brought me right over the house that Diddlbiker lives in. Unfortunately, I couldn't see it - it was under the plane, not next to it...

Thursday, June 08, 2006

Visiting Charleston

The Hilton Resort in Charleston

Diddlbiker is staying a couple of days in Charleston, for work. Since bringing your camera is always a good idea, I'm able to show some pictures as well.
The hotel is located on the bay in South Carolina, with a view of the magnificent new bay bridge and the USS Yorktown (an old aircraft carrier).
A few things to note about the hotel:
  • There is no gym (kind of surprising for a resort)
  • When going out for a walk in the nearby marshes, watch out for alligators!
Shooting pictures of the bridge is harder than I thought. But I won't be leaving without them!

Wednesday, June 07, 2006

How on earth do we fly in this country

Diddlbiker is writing today from his hotel room in Charleston, SC. The hotel room has free internet, which is nice, by the way. Anyway, my boss decided that we'd leave earlier - I didn't understand why, because the weather wasn't that severe. Just some rain.
Long story short: our 15:15 flight left at 17:45 - and that just for some rain?! No wonder airlines operate at a loss if a few drops of rain already ruin their schedules.

Interesting stuff to read: somebody decided to eat monkey food (as in dog food, but then for monkeys) for a week - after all, humans are primates. Hilarious!

Tuesday, June 06, 2006

Transcending into the next level

Diddlbiker got a little bit closer to Python-Zen today. Discovering that you have a folder with over two hundred files named '05????.txt' and '06????.txt' isn't really fun if they should be named 'm05????.txt' and 'm06????.txt'. I surprised myself by considering this the easiest solution:

import os

# Folder holding the misnamed files (placeholder path)
myfolder = 'C:\\blablabla'

# Prefix every file in the folder with 'm'
for myfile in os.listdir(myfolder):
    oldname = os.path.join(myfolder, myfile)
    newname = os.path.join(myfolder, 'm' + myfile)
    os.rename(oldname, newname)

Time to write script: about 40 seconds - and done!
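The script above renames every file in the folder, which is fine for a one-off job on a clean directory. A slightly more careful variant, sketched here under the assumption that only names matching the two patterns from the post should be touched, could use the standard fnmatch module. The function name `add_prefix` and its parameters are hypothetical.

```python
import fnmatch
import os


def add_prefix(folder, patterns=("05????.txt", "06????.txt"), prefix="m"):
    """Prepend `prefix` to files in `folder` whose names match a pattern.

    Returns the new file names in sorted order.
    """
    renamed = []
    for name in sorted(os.listdir(folder)):
        # '?' matches exactly one character, so an already renamed
        # 'm05????.txt' no longer matches and is left alone.
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns):
            new_name = prefix + name
            os.rename(os.path.join(folder, name), os.path.join(folder, new_name))
            renamed.append(new_name)
    return renamed
```

Because already prefixed files no longer match, running it a second time by accident does nothing.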

One day, I'll be a real Pythonista.

Monday, June 05, 2006

There are no dumb questions...

But some of them... Diddlbiker was doing his regular workout during lunch: 30 minutes on the stationary bike ('Cascades', 30 min, level 15). The program involves two cascades of increasing and decreasing resistance. I like the program; most of the time you spend spinning with low resistance, but you also spend a couple of minutes at each of the two peaks at high resistance.
Anyway, as I was just past the second 'top', a coworker who was working out on the elliptical machine next to me asked me whether it is possible to get a good workout on that bike. Well, let me see... My heart rate is racing at 172, sweat is literally flowing down my face in rivers, and my shirt is soaking wet from top to bottom. But I stayed friendly, and instead of being sarcastic and answering 'no, it's like a walk in the park', I explained that it all depends on the resistance level and the pedal speed. The weird thing is, she took spinning classes in the past, so she should know better - maybe because it is a recumbent bike?

Sunday, June 04, 2006

They canceled my radio show!

Diddlbiker's morning commute is a 40 minute trip over Rt 80 and Rt 287. Listening to a decent radio station makes the trip feel shorter. Living in the NYC area means that there is a good selection of radio stations available. Adding a few reasonable constraints (decent music, English language, traffic information) cuts the list pretty short. The one I stuck with was KTU. I really liked the morning show with DJs Goumba Johnny and Balthazar.
So, surprisingly, they weren't there on Thursday morning. Just another DJ, presenting his records as if he always did. Co-hosts Speedy and Cindy were acting like it was the most normal thing in the world as well. Mmmh. Both hosts sick at the same time?
Friday, same story. Now I start to do some research, but apart from learning that Goumba Johnny (real name John Sialiano) did six months of jail time in early 2000 for tax evasion, at first I learn nothing. Then I find out that they fired the hosts of a successful morning show to replace them with Whoopi Goldberg. Hopefully they'll move to another station. As for KTU - I don't think their morning show will survive this. Grr!

Saturday, June 03, 2006

JPEG Myths

Today was a rainy day, so Diddlbiker decided to try something out. What I wanted to do was show how image quality deteriorates through subsequent saves as a JPEG file. After all, JPEG is a lossy format, so every time you save, you lose some image quality, right?
Wrong. It seems that the truth is slightly more complicated. Save a picture with exactly the same image quality over and over again, and the image data will be saved in exactly the same way. Here is an illustration:




















This is the original picture.
This is the picture saved at a 25% quality level. Yes, it's pretty bad.
This is the previous picture opened, and saved again at a 25% quality level. You'd expect an even worse picture, but it looks exactly the same!
The process was repeated two more times (four saves in total), with no visible difference from the first saved version...

Amazingly, image quality doesn't degrade over consecutive saves. There is the loss of image quality due to the JPEG quality used, but that's about it. I'm figuring that once the JPEG compression has taken place, the end result will, when compressed again, yield the same file. Some caveats however:
  • This only works if the picture stays the same (except for the parts that are changed, of course). Resize the picture, crop it, add a border around it, mirror it, etc, and you'll get another round of quality degradation. Any part that is changed will suffer as well of course.
  • There is of course the initial quality loss - and there's not a lot of difference in file size between 95% quality and 75% quality (more about that in another blog)
  • JPEG supports rotation (at 90 degrees) as well. If you are using a good image editor, all that will happen when you rotate the picture is that a rotation flag will be changed - the picture itself will still stay unchanged.
Lesson learned: never believe 'general knowledge' if you can check the results for yourself - happy experimenting!
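One way to see why re-saving at the same quality changes so little is to look at the lossy step in isolation. The sketch below is a toy model in Python, not a JPEG codec: it only rounds a handful of made-up coefficients to multiples of a step size, which is roughly what the quantization stage does. A real encoder also converts colors, runs a DCT and rounds pixel values, so small differences can still creep in; the function name `quantize` and the numbers are assumptions for illustration.

```python
def quantize(coefficients, step):
    """Round each coefficient to the nearest multiple of `step` (the lossy part)."""
    return [step * round(c / step) for c in coefficients]


original = [103.7, -41.2, 12.9, 3.4, -0.6]   # made-up coefficient values
step = 16                                    # a coarse step stands in for low quality

first_save = quantize(original, step)        # information is lost here
second_save = quantize(first_save, step)     # nothing left to lose at this step

print(first_save == second_save)             # True: same step, same result
print(quantize(first_save, 10) == first_save)  # False: a different step changes it again
```

Once the values already sit on multiples of the step, rounding them to the same step again is a no-op, while a different step (a different quality setting) moves them again.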

Swabbing

Diddlbiker encountered the problem that most DSLR owners will encounter after some time: dust on the sensor. Now, this is the point where nitpickers want to point out that it is not the sensor, but a piece of glass covering the sensor (aka 'low pass filter'), but the end result is still the same: nasty blobs on the pictures.
Getting a brush didn't get the job done. I could still see that friggin' speck of dust sitting there on my sensor, taunting me! Turns out that Copper Hill sells high-quality sensor cleaning kits for very reasonable prices!
There's still some dust on the sensor, but those are really tiny specks and you'll really have to work hard in Photoshop to see them. But that big black blob in the middle of my pictures is gone at least.

Thursday, June 01, 2006

On photo expedition in the 'hood

With Memorial Day, nice weather has arrived. This looked like a good photo opportunity, so Diddlbiker set course to nearby Garfield, to take pictures of the Orthodox church. It is actually called the Three Saints Russian Orthodox Church. Originally built of wood in 1903, it burned down in 1915.
A new church was rebuilt at the same spot in 1916, made of brick this time. In the beginning, the interior of the church was very sparse (it almost went bankrupt because of the rebuilding cost) but eventually the inside became as rich as the outside with its golden cupolas.
Taking pictures of the church turned out to be tricky. My initial thought was to have the setting sun lighting up the gold, and putting the whole building in a warm orange glow. Well, I was right about lighting up the gold, but that warm glow never quite arrived, due to a lot of clouds and an upcoming thunderstorm.
The alternative was to wait for the 'magic light' that appears after sunset. In this case, it took me about 45 minutes before the camera started to capture dramatic light as shown on the left!
The downside of the time exposure is blown-out highlights; the center cupola and the light spots are just white spots without any detail. But it was worth it, given the wonderful colors in the rest of the picture (there's no Photoshop involved, this is just the way the camera saw it - reality was less spectacular). Sometimes the camera beats Eyeball Mk I...

Monday, May 29, 2006

Memorial Day 2006

Monday was Memorial Day. It is a very important day in the US, for various reasons:
  • First of all, all the troops that have fallen are remembered. As a superpower, the USA has always been involved in wars everywhere, so there is a lot to remember. Unlike May 4th in Holland, which emphasizes mourning about anyone dying in any conflict anywhere, Memorial Day is purely about USA soldiers.
  • Memorial Day also marks the beginning of the summer. Granted, baseball season is already on its way, but grilling outside is simply not done before Memorial Day.
  • For that reason, Memorial Day simply cannot be celebrated in any other way than with the parade, followed by a meal of hamburgers and hotdogs.
The opening of the parade is done by the local police force, and the Grand Marshal of the parade:





Then, of course, there are the veterans. They come in all kinds of shapes and sizes and groups: the old ones from WW2 and Korea, those from Vietnam, but also veterans who recently returned from Iraq.
As Memorial Day kicks off the summer, you can expect the weather to be hot, and not all veterans are able to walk the parade in uniform in weather like that, even in summer uniform.
So they'll be driven around in fancy vehicles as well.








Then there is the rest of the parade. No parade is complete without marching bands!
Then there is the local fire brigade, counting four engines in Elmwood Park. Engine #4 is the one closest to us, wedged between the railroad tracks and the local sports fields.
Poor Dakota really doesn't like all the sirens and horns from the ambulances and the fire trucks. So we always bring his earmuffs to any parade that we take him to.
Then it is time for all the special interest groups: little league teams, soccer teams, boy scouts, girl scouts, more marching bands and the local Hot Rod association.
And with the last Little League team disappearing in the distance, the parade is over...
But that doesn't mean that Memorial Day is over! Now it's time to visit the in-laws, play around the pool, and eat hamburgers and hotdogs from the grill!

Happy Memorial Day!

Wednesday, May 24, 2006

Photography == Money pit

Diddlbiker discovers that the more money you pour into your photography gear, the more you'll need. The never ending spiral of more! more! better! faster! bigger! seems to be feeding on the stuff that you throw in it, and as the Monster is getting bigger, so does its appetite.

When recently going through some pictures, I discovered how much better RAW is than JPEG. One would think that the difference wouldn't be that big (and if only Rawshooter would handle JPEG as well), but the pictures in Rawshooter really do come out better than the JPEGs straight from the camera - at least the ones that need some post processing, since there is a lot more room (bits) to play with. When the light is good, the problem isn't that big, but in bad light, when a lot of curving and levelling is needed, RAW is sooo much better.
So, where does the money pit part come into this story? Well, there is of course a price to be paid for all that RAW goodness. I wasn't shooting the Five Borough tour in JPEG just for the hell of it - 8MB per RAW picture versus 2MB per JPEG means that you can fit a lot more JPEGs on your memory card! "So? Get a bigger memory card!", and there is the money pit part. I also need a sensor cleaner, but that is an entirely different story...

Sunday, May 21, 2006

Use the right style

One of the things I have learned is that there is great value in being familiar with more than one programming language. The value of being 'multi-lingual' increases if the languages are from different families as well. Without digressing too much - programming language trees are usually horribly incorrect. I saw one where VB.Net was pictured as a derivative of VB 6, and not of C#, which it effectively is (maybe with a dotted line coming from VB6?).
I never realized the style issue until I got better at writing Python code. Before that, the only languages I knew were Pascal, Visual Basic, JavaScript, and C++ - and there is, quite honestly, not much difference in coding style between VB and C++. The possibilities might be different, but the chosen solutions are conceptually the same. After all, they're both Algol-like languages.
It took me a while to figure out why I had such a hard time with larger Python projects: I was trying to design them as Algol-like solutions. Not that Python is completely Lisp-oriented, but once I started to think of tuples as basic primitive types, solutions became elegant instead of just efficient.
Forcing good practices or habits from one language onto another is not only dumb, it prevents you from writing efficient code. For example, unlike most languages, VB doesn't exit a function when the return value is assigned. Which is convenient, because it allows cleanup before exiting the function (DAO recordsets have a history of leaking when not explicitly closed). If you're unaware of this, or you force yourself to do an Exit Function immediately after assigning the return value 'because every other language does it this way', then you'll end up storing the return value in a temporary variable, cleaning up, and returning the temp value. And probably showing off your ignorance by telling all your 'coderz' friendz how uncool VB is.
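For contrast, here is how the same need (hand back a value, then clean up) tends to look in Python, where return does leave the function immediately. This is only a sketch; `FakeRecordset` and `cheapest` are made-up stand-ins, not a real database API.

```python
class FakeRecordset:
    """Stand-in for a database recordset (hypothetical, for illustration only)."""

    def __init__(self, rows):
        self.rows = rows
        self.closed = False

    def close(self):
        self.closed = True


def cheapest(recordset):
    """Return the lowest value in the recordset and always close it."""
    try:
        return min(recordset.rows)   # leaves the function right away...
    finally:
        recordset.close()            # ...but this cleanup still runs first


rs = FakeRecordset([3, 1, 2])
print(cheapest(rs), rs.closed)       # 1 True
```

Each language has its own idiom for the same job: assigning the result and cleaning up afterwards in VB, try/finally (or a with block) in Python. Copying one into the other is where the temporary variables come from.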

Saturday, May 20, 2006

Back from hibernation

Actually, Diddlbiker didn't hibernate, he was just very busy. I'm not a ! So, what kept me so busy?
Mainly work - a few business trips to Charlotte, NC, a place that I have yet to see under a clear blue sky. The last trip had a Semi Flight From Hell on the way back, more about that in a second.
My blogging plan for the coming weeks is to write a lot about Python. I did a lot of coding in it, and yesterday was the magical day where I Saw The Light. What light? At the end of the tunnel? The one in the basement that always burns out? Nope - the same light that Jake of the Blues Brothers saw. Exactly - that light. I'm now Enlightened in How To Write Pythonic code - but that will be subject for another blog.

So, what about the Flight From Hell? I was supposed to fly with a colleague on the 19:30 back to Newark, but I never saw him. The flight was heavily delayed, due to thunderstorms over Newark - and according to my wife, the sky in northern Jersey was as clear as glass. So, the pilots explained to us that the thunderstorms would be there when we'd land, and that nobody really understands what Washington traffic control is thinking anyway.
We finally boarded around 22:00, and were warned that take-off would only be at 23:11 - but we had to clear the gate for an incoming flight. Most passengers had already canceled and stayed in a hotel at that point, so the flight was very empty and service was good.
We landed in Newark around 0:30, in thunderstorms, just as traffic control had told us - thunderstorms that started when we got there. If our flight hadn't been delayed for four hours, we would have been fine!

And my colleague? He was "lucky" enough to get a seat on the 18:00 flight (completely booked full of course), selfishly not thinking about me. His punishment: their flight sat on the tarmac for three and a half hours before leaving at 21:30. Now, that was a real Flight From Hell...

Sunday, May 07, 2006

Bike New York

Diddlbiker hasn't posted in a very long time, and feels very bad about that. I do have some valid excuses (or not?). Business trips and family time ate up most of my time available to blog. Of course, 'no time' is not a valid excuse, you can always make time. Let's just say that I made time for things that I deemed more important...
Anyway, today was the Five Borough tour, a bike tour through the five boroughs of New York: Manhattan, the Bronx, Queens, Brooklyn and Staten Island. To be honest, the ride is mainly Manhattan, Queens and Brooklyn - both the Bronx and Staten Island are 'enter and then leave via the shortest way' kind of adventures. Later this week I'll post some pictures of the event.
Traditionally we finished the event with a dinner at Carmine's. It still amazes me - we had appetizers, four different entrees, ice cream and coffee and a large amount of wine - all for $33 per person (including a generous tip).
And no, I didn't get weighed this week - traffic was so bad on the way back from my son's karate class that I missed Weight Watchers - but I did watch what I ate.