Dissecting an interview question

So, it’s time to retire my interview question. It feels like people are doing better on it than they used to (Glassdoor?), and frankly, I’ve asked it so many times that just walking into an interview room is enough to make me want to start chewing my own arm off. Still, if I’m going to stop asking it, then I thought it might be useful to dissect how and why it worked, common mistakes, and so on. I did this a little obliquely in a previous post, but this will be the real deal.

The problem

First, let’s go over the problem itself (I use Java in the code samples below, though I allow candidates to write the code in C, C++, C#, Java, or Python). I have a little script that I’ve developed over the years, that lets me describe the problem as succinctly and concretely as possible:

I’d like for you to implement a binary search tree (BST) class. There are two requirements. The first requirement is that when I initially instantiate a binary search tree object, I would like for it to start out empty. That is to say, with no initial values. Secondly, I would like to be able to insert specifically ints into the binary search tree. So, I don’t want this to be generic/templated – it should be a data structure specifically for use with ints. Does that make sense?

A couple more things. First of all, I don’t want you to worry about balancing the BST (wait for sigh of relief from interviewee). Also, don’t bother writing getters and setters – they’re just boilerplate code, and I don’t want us to waste time on that. You can either assume that an appropriate getter or setter exists, or access elements directly – either way, it doesn’t matter to me. Lastly, don’t worry about any other feature of the binary search tree. For example, don’t worry about removing values from the binary search tree.

To summarize, all that I want is for you to write a binary search tree class that starts out empty, and that you can insert ints into. Does that make sense?

At this point, I write the following on the whiteboard in whatever language they had chosen:

BST t = new BST();
t.insert(10);
t.insert(20);
t.insert(5);
t.insert(15);
t.insert(30);

    10
   /  \
  5    20
      /  \
    15    30

For the coding portion of the interview, I’m not trying to test whether they know what a BST is or not – I want to make sure they have a visual representation of what we’re doing, so that even candidates whose CS101 might be a little weak can complete the problem.

But first, a note to the haters

Before proceeding to the discussion, I’d like to acknowledge and address a couple of complaints. Though they don’t come up that often, they’re common enough that I’d like to just get them out of the way up front.

  • “I haven’t done this since college.”

The only reason I do a binary search tree problem is that most people know what they are, and the problem is easy to describe quickly. Even if you haven’t studied or used BSTs in a while – even if I have to explain them to you on the spot – you should be able to complete the problem without much difficulty. It’s one step more complicated than a linked list problem, and Lord knows those have been done to death.

  • “Normally, I would just use a library, or google it.”

Sure – me too. Except that the goal of this exercise is not to prove that you know how to use libraries or the internet, but that you can solve a very simple coding exercise. Most people can’t program. Hell, most programmers can’t program. This is my version of fizzbuzz, and if you can’t do it, then you aren’t going to be able to solve completely novel problems.

  • “The traditional whiteboard coding exercise is a bad way to interview people.”

I’m open to other approaches, but all of the alternatives I’ve seen are unrealistic for the majority of top candidates (most of whom are currently employed, are interviewing with other companies, and have a limited amount of time to make a decision). And while I’m sympathetic to this opinion, it’s also completely irrelevant. Unless you’re going to limit the companies you apply to based on their interview style, you don’t get to choose how you get interviewed.

Back to our regularly scheduled program, already in progress

One of the reasons I like the BST problem is that there are a lot of different valid solutions. I don’t like problems where you have to make an intuitive leap, or figure out a trick in order to get to a solution – I much prefer evaluating candidates on core coding skills, and deductive problem solving ability. The other nice thing about the problem is that it tests candidates in a variety of ways. It’s an OOP design problem wrapped in a logical puzzle, inside a coding challenge. Following are some common solutions:

One class v1
public class BST {
    private BST left, right;
    private Integer value;
}
One class v2
public class BST {
    private BST left, right;
    private int value;
    private boolean isInitialized;
}
Two classes v1
public class Node {
    private Node left, right;
    private int value;
}

public class BST {
    private Node root;
}
Two classes v2 (inner class)
public class BST {
    private Node root;

    private static class Node {
        Node left, right;
        int value;
    }
}
Iterative insert
public void insert(int value){
    if (root == null) {
        root = new Node(value);
        return;
    }
    Node cur = root;
    while (true) {
        if (value < cur.value) {
            if (cur.left == null) {
                cur.left = new Node(value);
                break;
            }
            cur = cur.left;
        }
        else {
            if (cur.right == null) {
                cur.right = new Node(value);
                break;
            }
            cur = cur.right;
        }
    }
}
Recursive insert v1
public void insert(int value) {
    if (root == null) {
        root = new Node(value);
        return;
    }
    insertHelper(root,value);
}

private static void insertHelper(Node n, int value) {
    if (value < n.value) {
        if (n.left == null)
            n.left = new Node(value);
        else
            insertHelper(n.left,value);
    }
    else {
        if (n.right == null)
            n.right = new Node(value);
        else
            insertHelper(n.right,value);
    }
}
Recursive insert v2 (kind of functional)
public void insert(int value) {
    root = insertHelper(root,value);
}

private static Node insertHelper(Node n, int value) {
    if (n == null)
        n = new Node(value);
    else if (value < n.value)
        n.left = insertHelper(n.left,value);
    else
        n.right = insertHelper(n.right,value);
    return n;
}

All of the above are valid answers – I don’t ding candidates for any of them, though to be honest I find the one class versions a little ugly. They don’t allow meta-data to be stored with the top-level tree, and require additional information to be stored unnecessarily at each node. Still, as long as the candidate is able to write the code quickly and correctly, I’m willing to accept different approaches.

One of my goals, though, is to make sure the candidate can code both iteratively and recursively. So, if they do the recursive version of insert(), I also ask them to do it iteratively. Look at the code – the first recursive version isn’t really that different from the iterative version (it’s tail-recursive, after all), so it should be easy for them to make the conversion. This feels like a gimme, but some candidates go totally blank. Likewise, if they start with the iterative version of insert(), then I give them another problem that’s easiest to do recursively.

Common mistakes

The following are some of the mistakes that pop up over and over again.

  • Forgetting to handle the base case (i.e., the empty tree).
  • Forgetting to handle the case in which the value already exists. I don’t actually care how people handle this – they can put it to the left or right, add a “count” member to nodes, make the tree a set – but if you first check for a lesser value, then a greater value, then you can end up with an infinite loop in the iterative version.
  • Switching the left and right sides. This generally happens when candidates write the conditional starting with the node value, instead of the parameter (i.e., “if (node.value < value)” instead of “if (value < node.value)”).

The above are mistakes I actually look forward to – they aren’t deal killers, and they give me the opportunity to see if candidates can find bugs in their own code through inspection. The following bug, though, is extraordinarily aggravating. It’s also exceedingly common. Candidates will write this code and then stare at it, unable to figure out why it doesn’t work.

Bad code – iterative version
public void insert(int value) {
    Node cur = root;
    while (cur != null) {
        if (value < cur.value)
            cur = cur.left;
        else
            cur = cur.right;
    }
    cur = new Node(value);
}
Bad code – recursive version
public void insert(int value) {
    insertHelper(root,value);
}

private static void insertHelper(Node n, int value) {
    if (n == null)
        n = new Node(value);
    else if (value < n.value)
        insertHelper(n.left,value);
    else
        insertHelper(n.right,value);
}

Can you see the problem? Once the code gets around to instantiating a new object, it’s being assigned to a local variable or parameter that has no connection to the original data structure. I could understand it if this were an outlier, a weird error that very junior candidates sometimes made, but even senior engineers make this mistake all the time. If you can’t see the bug, look harder, walk through the code, do whatever you need to do to grok what’s happening at a deep level – you absolutely need to understand why the above won’t work.

Brownie points

The problem really is simple enough that there aren’t that many ways to show off, but every once in a while someone does something cool. Usually, this involves using a language-specific feature particularly well. Like in the above examples, using the static keyword correctly (very, very few people do). Or in C++, using double pointers well:

void BST::insert(int value)
{
    Node **cur = &this->root;
    while (*cur) {
        if (value < (*cur)->value)
            cur = &(*cur)->left;
        else
            cur = &(*cur)->right;
    }
    *cur = new Node(value);
}

The above isn’t possible in Java, but if you’re coding in C++, it’s a pretty elegant solution. If you’re going to do this, though, I want you to put all of your ampersands, asterisks, and parentheses in the right place – sure, the compiler or IDE will catch bad assignments, but I want you to show me that you know what you’re doing. If you do, great! You’ve just demonstrated that you understand how pointers work at a deep level. If you don’t, well, I’m going to assume you’re making the same mistake as the people up above.

Scoring

In the interview, I don’t really care about specification of public/private. I don’t care about mistakes an IDE will catch. I don’t care about use of getters/setters vs. direct access of member variables. I’ll give you the obvious Node constructor – no need to waste time writing the boilerplate.

The most important thing is correctness. After that, I care about how clean the code is. There are some things I find annoying, like the following “four ifs” solution:

void insert(int value) {
    if (root == null) {
        root = new Node(value);
        return;
    }
    Node cur = root;
    while (true) {
        if (value < cur.value && cur.left == null) {
            cur.left = new Node(value);
            break;
        }
        else if (value >= cur.value && cur.right == null) {
            cur.right = new Node(value);
            break;
        }
        else if (value < cur.value)
            cur = cur.left;
        else
            cur = cur.right;
    }
}

This code hurts my eyes. Besides which, it’s error prone. It’s too easy to use “>” instead of “>=” in the second if, which can lead to an infinite loop, or to data corruption if the final else doesn’t have a condition. Even if it’s written correctly, it’s an ugly anti-pattern that should be avoided.

The other thing I care about is speed. It’s almost always a bad sign if candidates write really slowly – it’s usually because they don’t feel comfortable coding, are out of practice or aren’t that adept in their supposed “language of choice”. Candidates who have trouble with basic syntax, or who think in individual keywords instead of phrases, aren’t going to be successful in a fast-paced environment.

Last thoughts

So there you have it. It was never a question designed to expose brilliance, but I felt that it served me well in evaluating a candidate’s basic ability to code. As with any tool, I came to understand it better over time, learned which parts to pay attention to and which to throw away. Sometimes, the hardest part of an interview is just not getting stuck on the wrong thing. I have a new question now, something quite different – it’ll be interesting to see where this one goes.

If you liked this post, you’ll also like:

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

24 thoughts on “Dissecting an interview question

  1. Please continue to use this question. It’s a wonderful marker for candidates (like me) who don’t want to work with folks who think questions like that are useful.

    To be honest, I think it’s far more interesting to set aside a real world problem that you have encountered at your company and see how a candidate would think through it (out loud, of course). Many issues you’ll need to solve on a day-to-day basis have nothing to do with programming — and instead have to do with knowing related technologies (like a TCP/IP network, or a database) and how they interact.

    You might be shocked to discover how many developers have a deep understanding of algorithms but can’t troubleshoot/design/architect/solve their way out of a paper bag.

    Also: the thought of you ACTUALLY chewing your arm off is morbidly humorous. Thanks for that. :-)

    • In reading your comment, the thing that strikes me is this – a great coder can pick up new languages and new technologies. I’m much more interested in hiring someone with great core skills, than someone who’s expert in a couple of key technologies. So understanding the TCP stack, or a particular database, is less important to me than verifying that the engineer can actually code – which is a surprisingly rare skill, even among software engineers. I do agree, however, that being able to debug complex systems is a critical skill for a senior engineer – I just don’t know how to test that in a non-gotcha way. It sounds like you and I have different points of view on the interview process – would you be willing to share a breakdown of one of your interview questions?

  2. I’ve always found that having a good, free-ranging discussion with a prospective developer shines the light on their strengths and weaknesses much better than stuff like this. Like Dan, when I’m a candidate, I’m glad these questions usually come up quickly, as it allows me a good opportunity to leave quickly. :-)

    • Good, free-ranging discussions are fun, but they are, unfortunately, not indicative of a candidate’s ability to code.

      Circus Manager: How long have you been juggling?
      Candidate: Oh, about six years.
      Manager: Can you handle three balls, four balls, and five balls?
      Candidate: Yes, yes, and yes.
      Manager: Do you work with flaming objects?
      Candidate: Sure.
      Manager: … knives, axes, open cigar boxes, floppy hats?
      Candidate: I can juggle anything.
      Manager: Do you have a line of funny patter that goes with your juggling?
      Candidate: It’s hilarious.
      Manager: Well, that sounds fine. I guess you’re hired.
      Candidate: Umm … Don’t you want to see me juggle?
      Manager: Gee, I never thought of that.
      Peopleware, DeMarco and Lister

  3. That’s nice but you both acknowledge and ignore the real problem: that this isn’t a good way to interview candidates. You’ve just fully-scripted a bad way. That’s not a solution. If you want ketchup for your fries, you don’t streamline the soda dispenser.

    The primary issue here is that, with the exception of interviews, no one ever writes code without running it. This is a trivial problem to see if interviewees can do something stunningly trivial (implement basic binary trees), but do it perfectly the first time.

    If they can’t write a trivial binary tree, they shouldn’t have made it to the interview: have a better phone screen. So being able to write one proves little. Being able to write one perfectly proves little more, because the problem is still trivial, and writing absolutely perfect code before running and testing is not anyone’s job.

    You say you’re not testing someone’s ability to “google it,” but in reality this is probably a better test than simply writing a completely trivial data structure. Give someone a problem they can reasonably solve in an hour, sit them down in front of a box with an editor and their language of choice, and watch them write code *for real*, even if it’s still something as trivial as a binary tree. If they can look it up, understand the concept, debug and implement it correctly and nicely in the time you’ve allotted, this is someone who is going to get the job done day in and day out.

    • You’re showing your age when you say “no one ever writes code without running it”. I have in the past and maybe if you did you’d write better code. Imagine if you only had access to a machine to run your code once a day, week, or month. I think the author is searching for a way to engage the candidate’s brain and not his debugger.

    • “The primary issue here is that, with the exception of interviews, no one ever writes code without running it.”

      I’ve definitely worked at places where people will write and commit code without ever running it. You could say that’s a sign of a bad developer, and you might be right, but that just makes a test like this even more important.

      Besides, being able to ‘run’ code by stepping through it by hand is also an important part of programming today. If you can’t step through a “trivial” 20-line BST insert, how are you going to manage on anything more complex? Debuggers certainly help but I’m not sure I’d believe you can handle real problems if you can’t figure out 2 if-statements without a computer.

      “This is a trivial problem to see if interviewees can do something stunningly trivial (implement basic binary trees),”

      If it’s so stunningly trivial, then people wouldn’t get it wrong so often, at the conceptual level. A test designed to weed out the people who can’t even solve “stunningly trivial problems” sounds like a perfect first test to me!

      “but do it perfectly the first time.”

      False — he explicitly said that he doesn’t require candidates to write perfect code the first time. Part of the test is being able to find bugs in your own code, which is one of the most important parts of programming.

      Imagine auditioning a violinist by listening to them play a piece of music. They don’t need to play it perfectly, but if you play back a recording of it to them, they should definitely be able to identify the problem areas. I don’t know any half-decent musicians who *can’t* observe their own work and tell the difference between good and bad (and similarly with programmers, martial artists, cooks, or athletes).

      “If they can look it up, understand the concept, debug and implement it correctly and nicely in the time you’ve allotted, this is someone who is going to get the job done day in and day out.”

      You’ve described an alternative method which takes more time, takes more resources, and is harder to administer. Do you have any evidence that it’s any better at judging whether a candidate “is going to get the job done day in and day out” than the whiteboard test here?

      You make it sound like you want a real-world test, but your proposed test is completely fake, too: programmers rarely work on a program in a vacuum, without a feature request or bug report, without version history for the source code, without teammates to query, etc.

  4. I don’t code a lot on a day to day basis anymore, and my handwriting kind of got bad (after typing on the computer keyboard so much), so I’m not sure I can do it fast. Hm. Does it mean I’m not the right candidate here?

  5. Interesting discussion.

    Testing’s the key; the ability to take code (whether you wrote it or not) and step through it mentally and figure out what’s going on, with examples.

    I’d expect an interviewee to test this code by stepping through it, and fix errors. (I personally don’t care if the initial version has errors, like mixing ; they’ll be found in testing). That’s the sign of a good engineer.

    But interviewing engineers can be a crap shoot. I personally get a lot more mileage talking to candidates than doing coding questions. Obviously, the reason I have the person in front of me is because I LOVE their resume. I’ll use at least half (maybe all) the interview time trying to see if they can really do everything they say they can do on the resume. If they can’t, then (i) they’re not qualified anyway, and/or (ii) they’ve started the relationship by being a lying sack of shit.

    Maybe Dan’s kind of question is better for graduates than experienced engineers. When I’m looking at paying the asking price for experienced engineers, I’m paying for their knowledge and ability, not just the ability to solve any problem. I need that TCP/IP experience now, not in 6 months.

  6. I’ve seen the same problem with assigning a new object to a local reference and somehow thinking that’s all that’s required. It’s really strange, and some people, senior, junior, etc. really can’t explain it.

    It drives me nuts because it means people are *super-close* to the end of my problem, but they failed to understand what “pass-by-value” means with object references.

  7. I think this is a much better question than a lot of the reactions are indicating here. TCP/IP experience doesn’t define whether a programmer is worth hiring (at least not if it wasn’t in the job description), it determines what they happen to be familiar with. Get that knowledge by actually hiring for it.

    Meanwhile you’re left with two potential problem hires: the guy who knows Strassen’s algorithm cold, but can’t work out that black buttons on a black background don’t work, and the guy who’s full of common sense but doesn’t know a single language properly. My experience has always been that the algorithm’s guy stands out immediately – even on an algorithm question he’s unused to using actual code, and it’ll be clear from a brief conversation (also, these people are rare in job interviews). The guy who can’t write Hello World, on the other hand, is often personable, a good “soft” interviewer, and impossible to distinguish from a real candidate without actually testing his skills. You said you don’t reject people over simple syntax – given that and some amount of “soft” discussion, this seems like a great interview.

    • Sometimes/Often/Always (delete as applicable), you need to hire somebody with a lot of Domain-specific knowledge. You just don’t have the time to bring on a general coder and let them get up to speed. When you need an expert, and you only have an hour of time to interview them, you need to pick your questions carefully; they don’t have to be “soft” questions, but they don’t need to involved a white board either.

  8. // maybe it’s more important to see how much java the candidate
    // knows ???

    public class WhyDoYouAskPeopleToRecreateTheWheel{
    private final Object lock = new Object();
    private final TreeSet ts = new TreeSet();
    public void put(Integer value){
    synchronized (lock) {
    ts.add(value);
    }
    }
    public List getSorted(){
    synchronized (lock) {
    return new ArrayList(ts);
    }
    }
    }

    • Because I’m just as happy hiring a great C++, C#, C, or Python developer. If you’re a great coder, you’ll be able to pick Java up quickly – although someone can impress me by demonstrating particular virtue in their language of choice, I want to avoid language-specific issues in my question.

  9. Your requirements don’t include being able to read data from your binary search tree, so the most efficient implementation of your data structure has no-ops for both initialization and insertion.

      • It reminds me of when I turned in a blank page for my essay on “nothingness” in my Intro to Buddhism class. The teacher indicated that my F would be equally representative of the idea of nothingness.

  10. In my current job, we are experimenting with a take home test. Choose from a large set of problems, varying difficulty. You can spend as long as you want on it, choose your favorite language, but make it be production quality code, e.g. unit tests, comments, good structure, etc. A good developer can typically do a good job in a couple hours.

    If we like what we see, we bring you in. The first session when you come in is to review your code and discuss ways of extending it.

    We’ve seen interns up to 30 years of experience take the test, and most people can at least do it. We see a huge range in quality, and so far we like the correlation of quiz quality with what we find in the rest of the interview.

    I’m personally fine with whiteboard testing, but it is pretty stressful for the candidate and (I believe) causes you to miss out on some good candidates. I really don’t buy most of the negative arguments around whiteboard testing, but we’re finding the take home variant may work better.

    — david

Leave a comment