ActiveRecord and Thrift Part Deux

I posted about AR and thrift earlier (or late last year) and was reminded this morning that I didn’t follow up that post with the solution. So here it is: Download spike code.

Now it has been a while since i solved this, and the versions may have changed since then, but I do remember that the “tricksy” part was boot-strapping ActiveRecord. That was done using the supplied boot.rb. Once that’s done, you can happily run the thrift server daemon based on textbook instructions and the rest is history. Your models and all the ActiveRecord learnings you’ve had with rails can be re-used quite happily.

Happy coding.


Morty py

Morty py is the same Morty that was built using Ruby on Rails as part of a bigger scheme related to the basics of financial learning, specifically the concept of amortisation. Morty py, as its creative name suggests, is a Python implementation. Moreover, it’s also hosted on Google’s AppEngine.

In all, the differences between the frameworks and development experience are both varied and the same. It doesn’t really matter which is “better”- that discussion is a moot point. But in summary, i love the RoR implementation for its expressiveness in code and coherence of the MVC pattern. But Google’s AppEngine rocks when it comes to functionality and the tool chain. Performance (for this app) is much of a muchness. Python is really nice, and so is Ruby. Granted, getting to grips with Python was much easier, but that’s only because my multi-lingual skills have improved greatly. And being a multi-linguist is so much more satisfying.

After all, imagine, in one day, coding the backend in Python, some related services in Ruby, maybe an optimized service in C, with a front end in C#, possibly ASP.NET or WinForms, a mobile front end in Java and then some obligatory JavaScript to boot. Not forgetting the frameworks that come with each of those languages that make them ultimately productive. For me, seventh heaven 😉


Golden Section Search

For me, implementing code really helps me to understand the algorithms (i need to know) better. It might sound a bit odd in that you might need to understand the algorithm before you can implement it. And that’s partially true. An understanding is definitely required. But, with TDD and iterative processes ingrained, discovery of what makes the algorithm tick is made possible through doing it (by repetition and/or implementation).

Repetition helps you to understand it and use it. Implementing it in code is like rediscovering the algorithm from the beginning. It’s a small taste of that journey- and it’s addictive 🙂 Anyhooo, the Golden Section Search is not an exception- but surprisingly trivial. Maybe it was just all the fluff around the topic that got me distracted….

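class Golden
  def search(tolerance)
    return if (@b - @a) <= tolerance

    dif = R * (@b - @a)
    x1 = @b - dif
    x2 = @a + dif
    vx1 = i_eval(x1)
    vx2 = i_eval(x2)

    if vx1 > vx2
      @b = x2
    else
      @a = x1
    end
    search(tolerance) # repeat on the narrowed interval (assumed: the tail of the original listing is missing)
  end
end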


“i_eval” just evaluates the formula you supply (and uses the built-in Ruby expression evaluation) but the part that makes the grok for me is the if(vx1 > vx2) bit.

The way the text books explain it is pretty long-winded. The interval of uncertainty changes, but you’ve got the unknowns a,b,x1 and x2 floating around and changing positions all the time during the explanation. Plus you’ve got to now try to remember case 1, case 2 and case 3. Eish. And all that inter-mixed with function and set notation. But that can be maths for you: pick any number between 0 and 10 => pick any integer from the domain of real numbers over the interval from, and including 0 to, and including 10.

The short of it lies in… :

‘b’ changes to x2 if vx2 [or f(x2)] is smaller (or equal) and
‘a’ changes to x1 if vx1 [or f(x1)] is smaller ..

… for the next iteration, everything else keeps the same value. That’s it.

And graphically, it also makes more sense to simplify that for one second and draw the lines in and see how ‘a’ and ‘b’ move along the axis iteration after iteration. Once that’s settled, going back to the ever pedantic yet accurate language of maths is then a whole bunch easier.
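To watch ‘a’ and ‘b’ move without any of the surrounding class, here is a minimal self-contained sketch of the same idea. It is not the original Golden class: the method name golden_max, the use of a block instead of i_eval, and the default tolerance are all assumptions made for illustration. It looks for a maximum, which is what the if(vx1 > vx2) comparison above implies, and it assumes the function has a single peak on the interval.

```ruby
# Golden ratio conjugate: each step keeps about 61.8% of the interval.
R = (Math.sqrt(5) - 1) / 2

# Returns the midpoint of the final interval bracketing the maximum of the
# block over [a, b]. Hypothetical helper; assumes a single peak in [a, b].
def golden_max(a, b, tolerance = 1e-6)
  while (b - a) > tolerance
    dif = R * (b - a)
    x1 = b - dif            # left probe
    x2 = a + dif            # right probe
    if yield(x1) > yield(x2)
      b = x2                # the peak cannot lie to the right of x2
    else
      a = x1                # the peak cannot lie to the left of x1
    end
  end
  (a + b) / 2.0
end

# Example: a parabola that peaks at x = 2.
peak = golden_max(0.0, 5.0) { |x| -(x - 2)**2 + 3 }
puts peak.round(4)
```

Like the snippet above, this evaluates the function twice per iteration. The usual refinement reuses one of the two probes from the previous step, at the cost of a little more bookkeeping.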


Rails, PDF + prawn jumpstart

Looking for a “to-PDF” solution for your rails application? Well, if your journey is (has been) anything like mine [which is pretty standard judging from what i’ve read] then you’ve probably decided prawn is the way forward. And you’re also probably convinced that prawnto is a good idea (it is also highly recommended).

So by now you’ve created a blank rails app, installed the gem, added the plugin, got a controller of sorts setup- all for the purposes of test-driving prawn (with prawnto). You’ve reread the documentation but it’s not going _that_ smoothly. There are one or two little things (not mentioned as explicitly as you might want) that you should be aware of. After trawling through some prawn discussions on google groups, i picked a bright penny-moment. Aha! Of course 🙂

Sometimes, while learning a new library/tool/tech/whatever, your brain seems to focus too hard on the problems you’re experiencing and forgets about the basics of the bigger context. When you step away, you (hopefully) realise that the issue you’re having is not the library/tool/tech/whatever, but with the fact that some fundamental got glossed over/ignored/forgotten about. This was one of those.

My controller code was fine. The actions were pretty much {empty}. My view filename was either action.pdf.prawn, action.pdf, action.prawn.pdf, action.prawn or actionpdf.prawn- all sorts of variations ‘cos i couldn’t consistently get the desired effect: an inline pdf render. But the documentation said quite clearly ‘.pdf.prawn’. What does it take to get a simple demo going to start experimenting with? Well… it turns out all i had wrong was the request. I completely ignored the basics (routing) and focused on the problem (pdf library). Not exactly a library issue.

http://localhost:3000/controller/action won’t work => renders .html
http://localhost:3000/controller/action.pdf won’t work either => no route

What i wanted to request was: http://localhost:3000/controller/action/:id.pdf
and :id is the identifier for your model (even if you’re not actually riding with one in our test drive).
And this is due to the default configuration of a default rails app. Check out your routes.rb. Back to basics. Evidently, if you want a different ‘route’ to get to your pdf, then, you guessed it: create a route 🙂
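As a rough illustration of “create a route”, this is what an extra entry in routes.rb might look like. It is a sketch only: it assumes a Rails 2.x style routes file (the Rails version is not stated above), and the first map.connect line is a hypothetical addition, not something taken from the prawnto documentation.

```ruby
# config/routes.rb (Rails 2.x style; the Rails version is an assumption)
ActionController::Routing::Routes.draw do |map|
  # Hypothetical extra route: lets /controller/action.pdf resolve without an :id
  map.connect ':controller/:action.:format'

  # The default routes generated with a new app
  map.connect ':controller/:action/:id'
  map.connect ':controller/:action/:id.:format'
end
```

With a route like the first one in place, the .pdf extension arrives as the request format, which is what lets the .pdf.prawn view get picked up.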


Deploying Rails

A while ago, i got addicted to RoR. Life before RoR was… well. Mundane. Don’t get me wrong. There was still a lot of exciting stuff going on, but RoR opened up a brave new world and its “differentness” added to its appeal. And since then, i’ve written a fair amount of Rails apps and a few libraries in Ruby for my own use. And then i tried to deploy a Rails app.

… ?:o

It was hard. And especially hard since i couldn’t eat, sleep and breathe the environment; so every opportunity i got to tackle the problem, i had to relearn the same commands. But i got used to it. I read _a lot_. And i managed to actually understand the conversations at one point. A major plus 🙂

In case you’re wondering what a *normal (or typical?) deployment might look like, take a peek here.

*Normal or typical probably doesn’t even exist, it’s just a phrase which suits my goals at the moment.

In any event, there’s some configuring going on. Examples are here, and here, and here. And there are more.

And despite the seeming “mission” related to deploying apps (and why a lot of folk just abandoned the platform altogether), i still believed it would get better. It just had to.

Hello, Phusion Passenger. Phenomenal! And suddenly, the roses are redder, the skies are bluer, the birds sing clearer and the apps deploy smoother. Waaaay smoother. Keep your eye on this one!

Oh. and here’s more about using Phusion Passenger in development.


Optimizing And Readability

Optimizing code is generally an expensive process (read: time-consuming) and there are established ways of getting to the bottom of “what to optimize”. Thankfully, profilers are available to help with a lot of the guesswork, so it’s generally a good idea to make sure you work with one *most of the time*. Moving along, it was high time for me to look at some Ruby profiling.

The documentation for ruby-prof is pretty neat and the library itself is quick to get up and running with. And so we start. For my initial problem, I wrote a goal-seek algorithm for accurately estimating gross earnings, given a target nett earning using a tax table- as opposed to just using a base tax-rate. Anyhow, my first stab algorithm (a simple linear search) included the lines:

def seek_annual_gross(m_nett, base_perc)
  sample_gross = m_nett * base_perc
  paye =
  p_nett = sample_gross - paye.monthly_tax
  margin = MARGIN*m_nett
  if((p_nett-margin < m_nett) && (p_nett+margin > m_nett))
    return paye.annual_gross.round_to2.to_f
  elsif(p_nett-margin > m_nett)
    return seek_annual_gross(m_nett, (base_perc - (margin)))
  elsif(p_nett+margin < m_nett)
    return seek_annual_gross(m_nett, (base_perc + (margin)))
  end
end

The profiler showed up what i kinda suspected- always a good sign. Essentially, my incremental margin for the next step was too small (fixed) and thus, getting closer to the solution was taking too long- and endangered the stack 🙂 What i needed was a better guess at how much to increment.

  %    cumulative    self               self     total
 time    seconds    seconds    calls   ms/call  ms/call  name
 22.58      0.28       0.28      177      1.58     3.22  Integer#times
 16.13      0.48       0.20      171      1.17   259.18  NettGoalSeek#seek_annual_gross
  6.45      0.56       0.08      179      0.45     0.50  Float#round_to2

Some minor adjustments to the routine, including an adjusted guess:

increment = MARGIN*(p_nett - m_nett)/margin

and modifying the appropriate calls

if((p_nett-margin < m_nett) && (p_nett+margin > m_nett))
  return paye.annual_gross.round_to2.to_f
elsif(p_nett-margin > m_nett)
  return seek_annual_gross(m_nett, (base_perc + increment))
elsif(p_nett+margin < m_nett)
  return seek_annual_gross(m_nett, (base_perc - increment))
end

And the profiler now reports:

  %    cumulative    self               self     total
 time    seconds    seconds    calls   ms/call  ms/call  name
  2.78      0.16       0.01       11      0.91     6.36  NettGoalSeek#seek_annual_gross
  2.78      0.35       0.01       17      0.59     1.18  Integer#times
  0.00      0.36       0.00       19      0.00     0.00  Float#round_to2

A significant difference! The recursive calls to seek_annual_gross dropped from 171 to 11. Incidentally, the time to run, according to the test harness, went down from 1.122316 seconds to 0.189661 seconds, so the high-level indicator shows enough of a difference as well.

The by-product of this optimization included the ability to get even more accurate estimations since the stack never overflowed, despite the required margin of error.
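To make the “better guess” idea concrete, here is a small self-contained sketch of a goal-seek that steps by the size of the error rather than by a fixed margin. It is not the NettGoalSeek code: the names BRACKETS, TOLERANCE, monthly_tax and seek_monthly_gross are hypothetical, and the tax figures are invented for illustration and are NOT the real tax tables.

```ruby
# Hypothetical progressive tax table: [lower bound of bracket, marginal rate].
# Invented figures, for illustration only.
BRACKETS = [[0.0, 0.18], [10_000.0, 0.25], [20_000.0, 0.30], [40_000.0, 0.40]].freeze
TOLERANCE = 0.01 # acceptable error in the nett amount

# Tax on a monthly gross amount, taxing each slice at its bracket's rate.
def monthly_tax(gross)
  tax = 0.0
  BRACKETS.each_with_index do |(floor, rate), i|
    break if gross <= floor
    ceiling = i + 1 < BRACKETS.length ? BRACKETS[i + 1][0] : Float::INFINITY
    tax += ([gross, ceiling].min - floor) * rate
  end
  tax
end

# Recursive goal-seek: the next guess moves by the current shortfall in nett,
# so big errors take big steps and small errors take small ones.
# Returns [gross, number of adjustments made].
def seek_monthly_gross(target_nett, gross = target_nett, steps = 0)
  nett = gross - monthly_tax(gross)
  error = target_nett - nett
  return [gross.round(2), steps] if error.abs <= TOLERANCE
  seek_monthly_gross(target_nett, gross + error, steps + 1)
end

gross, steps = seek_monthly_gross(30_000.0)
puts "gross #{gross} reached in #{steps} steps"
```

Because each step removes a fixed fraction of the remaining error, the recursion depth stays small even when the tolerance is tightened, which is the same effect the profiler showed above.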

The moral: optimization doesn’t need to sacrifice code readability. At the right time, in the right spot, for the right reasons, you can achieve a sweetspot (of sorts) between two opposing(?) constraints. But that’s not to assume i’ve found the nicest sweetspot in this little piece 🙂

So in between refactorings or when there’s a lull in production, indulge the geek inside you.



UPDATE: 3 July 2008
Updated code to reflect more recent tax tables (2009)

When I was asked to estimate PAYE on a gross monthly salary, i hauled out the calculator and started chipping away, according to the SARS Tax Tables. Not being a tax consultant or looking at various structured packages, the first stab is mostly always a straightforward estimate without investigating further deductions. While doing this, the math-programmer inside me went… “Mmmm. Calculator. Boring. Ruby. Smile”

Turns out, it’s a simple little script; a useful little snippet and, bonus, i migrated some more learning onto Ruby. An aside; there’s definitely something about the difference in speed and endurance of learning between my brain versus my hands. You know the feeling. You can forget a password mentally, but let your fingers do the talking… And utilising muscle memory as an aid is just one of the many senses you can draw on…

Back to the snippet. Not too much interesting going on in terms of code. I chose a multidimensional array for storing the tax table. First used a hash, found it was overkill, reverted. Also did a classic switch in the beginning, to determine which “bracket” your pay falls into, but then figured a straightforward loop works just as well. This was an interesting break in habits from C# however.

In C#, looping through an array would be: for(int ii=0;ii<array.Length;ii++)
I did the same in Ruby, transliterated the code, first time round: for ii in 0..array.length-1
But then, the knowledge of the “times” method changed my thinking completely: 6.times { |ii| … }
There are, after all, only 6 tax brackets you can fall into (for all positive salaries).
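As a sketch of how the table-plus-loop approach might look (this is not the script linked at the end of the post): the names TAX_TABLE, annual_tax and monthly_paye are hypothetical, and the six brackets use invented figures, NOT the SARS tables.

```ruby
# Hypothetical tax table as an array of arrays, one row per bracket:
# [lower bound of bracket, tax due at the lower bound, marginal rate].
# Invented figures, for illustration only.
TAX_TABLE = [
  [0,       0,       0.18],
  [100_000, 18_000,  0.25],
  [200_000, 43_000,  0.30],
  [300_000, 73_000,  0.35],
  [400_000, 108_000, 0.38],
  [500_000, 146_000, 0.40]
].freeze

# Walk the brackets with "times" and remember the last one the income reaches.
def annual_tax(annual_income)
  bracket = 0
  TAX_TABLE.length.times do |ii|
    bracket = ii if annual_income >= TAX_TABLE[ii][0]
  end
  lower, base, rate = TAX_TABLE[bracket]
  base + (annual_income - lower) * rate
end

# Straightforward estimate: annualise the salary, tax it, spread over 12 months.
def monthly_paye(monthly_gross)
  annual_tax(monthly_gross * 12) / 12.0
end

puts monthly_paye(20_000).round(2)
```

Using TAX_TABLE.length.times rather than a literal 6.times means the loop keeps working if a bracket is added or removed when the tables change.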

And that’s where it hit me: the uncomfortable (more about that later) shift away from a corporate-sponsored, statically typed, IDE-integrated, certificate-oriented, compilable(?) programming language into a community-driven, dynamic scripting language is underpinned by these sudden ferocious rushes of freedom. Too much freedom? Certainly, too much for what i’m accustomed to sometimes.

Why’s that uncomfortable? Well, MSDN, VS, MS communities and the framework tell you how to code- to a large extent. They dictate the patterns, the constructs, the idioms; in short, they impose a very definite way of doing things. And it’s a big abstraction layer, forever changing (but not really) and giving you tons of resources to make your coding easier. This is good. Books, online help, built-in help, IDEs, intellisense… and more of all the good stuff. Don’t get me wrong, these things made me very productive and i’m grateful for that. But then you break away from that.

You gotta search for help (no nicely packaged MSDN DVD delivered to your door). You gotta scratch under the hood. You have to engage with community blogs and real people in a virtual world. You are forced to read opinions. You are stripped down to only the simplest of tools- a plain text editor. On a coding level, you are forced to remember namespaces, method names, variable names, libraries… no more intellisense to rely on. And this is where it was difficult. No more crutches to help me be more productive. I had to start thinking- for real now- and remembering stuff. And then i got scared: what if the “community” changed something and i didn’t know about it- or worse, didn’t agree with it? And I didn’t get an email with an updated change delivered to me automatically via updates or DVD? Hang on! Is that really the way it’s supposed to be? Have i become that lazy? Oops.

And all i was doing was having some fun, writing a little script to calculate PAYE so that next time someone asks me, i save myself a little more time.

Btw, the Source Code is here, if you’re interested.


Investing in the Learning Curve

We have a concept of what the learning curve represents, and unfortunately, the same thing can represent 2 opposite concepts. What makes more sense to me is looking at a learning curve from a classical labour cost perspective and more keenly towards labour productivity. In this sense, the learning curve is interpreted, broadly, as: the more you work with something, the more productive (cheaper, better, faster, more knowledgeable) you become. And one can recognise that sentiment when it’s expressed variously with respect to success, specialization, expertise, productivity, quality or minimizing costs. So where’s the investment?

Programming technology changes rapidly, and sometimes to the detriment of programming and business, sometimes not, but also to the advantage of progress and for the sake of technology itself. But changes are also forced to be incremental in order to be successfully adopted, since any radical departure will result in a prohibitively expensive learning curve where the economic costs outweigh the advantages of the change. Similarly, you also cannot force change too frequently, even if it’s small enough, since you never get to break even or realise a profit from the previous change. I think this last point might also be reflected in the current developer attitudes towards the “next big Microsoft thing” and the ubiquitous jading of old hats. Everyone seems to be hanging five for a bit before moving forward. Or maybe Douglas Adams’s theory is kicking in?

At the same time though, you need to keep moving forward. So where do you invest your next generation of development so as to minimize the costs of the learning curve if you want to remain marketable and competitive across:
* web development
* mobile application development
* backend systems
* any platform (platform agnostic)

C++, C, C#, VB, Perl, PHP, Python, Ruby… ?

All these languages have their pros and cons and more importantly, costs. As an example, I recently looked at Symbian C++ development, and the learning curve is relatively expensive. The idioms alone take time to get to grips with, so although you get a very powerful API, C++ on Windows desktop, or legacy ATL knowledge is not easily transferable to a Symbian C++ development effort. Possible, but not as easy as say, being able to use a standard framework and language, with the same idioms, on both (all) platforms. That would be the ultimate prize (for me).

And it doesn’t have to be the same language. Case in point, i’m currently using Monorail (.NET web development) and RoR (other web development) concurrently on different projects and i’m enjoying the benefit of being able to work in a predictable (hence productive) manner switching between the two, relatively seamlessly. Whereas, switching between webform development and RoR, as an example would not be feasible. The traditional 30% context switch overhead would double.

So if you’re faced with “what to learn next”, take a closer look at the learning curve and where you can (need to) apply that knowledge in the future, in order to remain competitive- whether globally, or within your own department. Maybe it’s stating the obvious, but it’s surprising just how un-obvious the obvious can become when there are a lot of flashing lights going off all the time. So it’s not always about the language or the technology, but also a lot about the “way” in which things are done; which, by nature, is usually a little more subtle to spot since it’s hiding in plain sight 😉


Learn By Do. Part III

This is kinda like a bumper edition, since it wasn’t intended but after hacking this for a bit, it turned out to be a pretty useful exercise, no matter the language. Why? Primarily since you get to, depending on implementation, write your own data structures and use recursion. This combination of tasks usually allows you to explore and discover some interesting language-related issues. So then, without further ado.

The task: generate all the possible word combinations from a telephone number. For the first stab, ignore dictionaries and words or phrases that actually make sense in English, and just be able to “translate” the number 23 => ad, ae, af, bd, be, bf, cd, ce and cf. If the number is 233, the options become add, ade, adf, aed, aee, aef and so on all the way through to cff. And then later maybe we can reverse the process and make up a word to come up with a number?

Right, the strategy: write tests first! The first thing i need is a class that can accept and validate a string as being a legit phone number.

require 'test/unit'

class TestPhoneNumber < Test::Unit::TestCase
  def test_ctor
    p = PhoneNumber.new("0215556767")
  end
  def test_cant_have_alpha
    assert_raise(ArgumentError) { PhoneNumber.new("065a223") }
  end
  def test_cant_have_punctuation
    assert_raise(ArgumentError) { PhoneNumber.new("087!322") }
  end
  def test_cant_have_other
    assert_raise(ArgumentError) { PhoneNumber.new("0987%") }
  end
  def test_split_numbers
    p = PhoneNumber.new("0123")
    assert_equal(["0", "1", "2", "3"], p.split_numbers)
  end
end

I also wanted to store the individual digits since in my mind’s eye, i can see myself using a map to index and reference the letters using the digits so it just feels like something i need to do right now. We’ll see how it plays out. Turns out, the implementation is pretty straightforward:

def initialize(number_as_string)
   if number_as_string =~ /\D/
     raise ArgumentError.new("Can only contain numbers")
   end
   @split_numbers = split_the_numbers(number_as_string)
end

A simple regular expression takes care of most of what i need to cover. Splitting the numbers is a simple loop, no fuss there.

def split_the_numbers(numbers)
  i = 0
  result = [""]
  while i< numbers.length
    result[i] = numbers[i,1]; i += 1
  end
  return result
end

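The trick now is to be able to write something like:

PhoneNumber.new("23").possible_words.each { |word|
 print word
}

And get the nine combinations from the example above as output: ad, ae, af, bd, be, bf, cd, ce and cf.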


Mmmm… Tree? Graph? Simple arrays and hashes? What will it be? (…to be continued)
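For what it’s worth, simple arrays and a hash are enough for the first stab. The sketch below is one possible approach and not the solution this series goes on to build: KEYPAD and the standalone possible_words method are hypothetical names, and letting 0 and 1 stand for themselves is an assumption, since those keys carry no letters.

```ruby
# Standard telephone keypad letters per digit.
KEYPAD = {
  "2" => %w[a b c], "3" => %w[d e f], "4" => %w[g h i], "5" => %w[j k l],
  "6" => %w[m n o], "7" => %w[p q r s], "8" => %w[t u v], "9" => %w[w x y z]
}.freeze

# Returns every letter combination for the digits, in keypad order.
def possible_words(number_as_string)
  raise ArgumentError, "Can only contain numbers" if number_as_string =~ /\D/
  # Digits without letters (0 and 1) stand for themselves: an assumption.
  choices = number_as_string.chars.map { |digit| KEYPAD.fetch(digit, [digit]) }
  return [] if choices.empty?
  first, *rest = choices
  # Array#product builds the cartesian product of the per-digit letter lists.
  first.product(*rest).map(&:join)
end

possible_words("23").each { |word| print word, " " }
puts
```

Array#product does the work that a hand-written recursion or tree walk would otherwise do, so this version skips exactly the part of the exercise that teaches the most. It is useful mainly as a reference to test a hand-rolled implementation against.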


Learn By Do. Part II

In the first installment, we discovered, from the eyes of a noob, a handful of basic concepts about the Ruby language. In this episode, we’ll complete the functions for mean, median, variance and hence standard deviation and discover something you might consider magic…

The mean. It’s the average and so its calculation is simple. Add up all the elements and divide by the number of elements. The sum:

    @sum = 0.0
    @elements.each{ |i| @sum += i }

The mean:

  def mean
    if 0 == @elements.length
      return 0
    end
    return @sum / @elements.length
  end

Straightforward and fairly clean. The median is also pretty straightforward; you want the middle element, in the case of an odd-numbered quantity and the average of the middle elements in the case of an even-numbered quantity.

  def median
    if 0 == @elements.length
      return 0
    end
    if 0 == (@elements.length % 2)
      return even_median
    else
      return odd_median
    end
  end

I used the modulus operator to determine if the quantity of elements is odd or even, and it’s pretty much the same as almost every other language i’ve used. There is another way, using the remainder method:
any_even_number.remainder(2) is equal to 0
any_odd_number.remainder(2) is equal to 1
The even/odd_median methods just return the appropriate middle element. First the odd_median:

  def odd_median
    return @elements[(@elements.length/2).floor]
  end

It might seem a little odd, at first, to not use the calculation (n+1)/2:

return @elements[(@elements.length + 1 )/2]

The reason we use the floor is because, mathematically, an odd number of elements divided by 2 will always yield a n.5 and we take the index down from that because we’re using a ZERO-based index in the array. (In Ruby, dividing one integer by another already discards the fraction, so the explicit floor is a harmless safety net.) (n+1)/2 works perfectly for a ONE-based index array. We could still use that approach however, but keeping in mind to subtract 1 from the result to get the correct index: ((n+1)/2) -1.

  def even_median
    top = @elements.length/2;
    bottom = top -1
    agg = @elements[bottom] + @elements[top]
    return agg / 2.0
  end

Again, because we’re working with ZERO-based index arrays, the calculations are slightly different to what you’d expect or do in “normal mathematics” (if there can ever be such a term?). Easy, peasy, japaneasy, right?

Variance is slightly more complicated, although, not complex by any stretch of the imagination. There are two different algorithms for calculating the variance and in the provided code, i have implemented both. Armed with the knowledge of the equations and a decent grip on how to perform some calculations, the implementation is left to you as an exercise in applying your learning.
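If you’d like something to check your own attempt against, here is a minimal sketch of two common ways to compute it. These may not be the same two algorithms as in the provided code, the method names are hypothetical, and both assume the population variance (dividing by n rather than n-1).

```ruby
# Two-pass: find the mean first, then average the squared deviations from it.
def variance_two_pass(elements)
  return 0.0 if elements.empty?
  mean = elements.sum.to_f / elements.length
  elements.sum { |x| (x - mean)**2 } / elements.length
end

# Single pass: mean of the squares minus the square of the mean.
def variance_shortcut(elements)
  return 0.0 if elements.empty?
  n = elements.length.to_f
  sum = 0.0
  sum_of_squares = 0.0
  elements.each do |x|
    sum += x
    sum_of_squares += x * x
  end
  sum_of_squares / n - (sum / n)**2
end

# Standard deviation is just the square root of the variance.
def standard_deviation(elements)
  Math.sqrt(variance_two_pass(elements))
end

data = [2, 4, 4, 4, 5, 5, 7, 9]
puts variance_two_pass(data)
puts variance_shortcut(data)
puts standard_deviation(data)
```

The single-pass version only touches each element once, but it subtracts two large, nearly equal numbers when the values are big and close together, so the two-pass version tends to be the safer default.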

The magic. At last, the magic. (from Ruby-coloured glasses)

class Float
  def prec(x)
    mult = 10 ** x
    return (self * mult).truncate.to_f / mult
  end
end

The almost crazy thing happening here is that you have the ability to extend built-in types without much fuss. With this definition, you now have the prec method available to all your Floats. Try it in your console too.

irb(main):006:0> class Fixnum
irb(main):007:1> def plus10
irb(main):008:2> return self + 10
irb(main):009:2> end
irb(main):010:1> end
=> nil
irb(main):011:0> 43.plus10
=> 53

Now that’s cool. Strings, Times, Fixnums, Floats… If your domain requires a specific handling on a particular type; the ability to extend and make that appear “normal” for the remainder of the coding experience is easy. It just slides right in.

And that about wraps up Part II, save one small little detail you’ll find in the code but not elaborated just yet: the accessor. The Ruby User’s Guide has a great explanation, and now you also know about that website too!

The full source code for Learn By Do is available for download: sample_data.rb