My Favourite Interview Question To Assess Design

When interviewing higher level software engineers, it’s good to assess not just ability to write code, but also to design systems. To accomplish that, I used to ask people to design something, such as an elevator control system.  However, that leaves some gaps.  For example, if you ask them to design a distributed storage system, they may decide to use a two phase commit.  But do they remember why?  How do simpler schemes fail?  Also, when designing with a group of engineers, are they able to give useful feedback to other people?

So instead, I give them naive designs, and ask them to list the designs’ pros and cons of each.  For example:

I want you to help the team design a distributed storage system.  I’ll play the role of the team, suggesting designs, and I’d like you to give feedback on the pros and cons of the designs.

Let’s design a distributed file system. Companies like Facebook and Google don’t use centralized storage, instead they have lots of computers each with commodity disks, and if one fails, they can continue serving requests without losing data.  They have an API that makes the disks look like a single block device with a single, unified address space.

So let’s say the team comes to you with this design:

def write(int address, byte data[4096]):
     Store the data on the computer where the
     write takes place.

def read(int address):
    Lookup the address on the computer where
    the read was issued.
    If we find it, return the data.
    Otherwise, broadcast a message to all
    computers, "please send me the data for
    this address".
    If any computer has it, return that data.
    If no computers have it, return "not found."

 

I then draw a diagram with four computers, and work through an example where one of the computers writes a block, then another computer reads it.

It’s interesting to see what people list as pros and cons. Many talk about the high cost of broadcasting to all nodes. Few catch the potential data inconsistency. The simplicity is a pro, but lack of redundancy is a con.

When they point out that there’s no redundancy, I suggest writing to two machines independently. If both succeed, then we return “success,” otherwise, we return “fail.” Can they spot the problems? Can they point out pros like that disks will be fill up evenly, or will they focus only on cons?

I find that asking people to critique designs gives me a better idea of how they understand a system. Do they just throw around buzz words and boxes, or can they think through the details? Do they have a sense for where race conditions might hide? (Hint: multiple computers touching the same data; things finishing in a non-intuitive order; nodes failing.) Can they actually work through those examples to find the problem? Can they understand that every design has pluses and minuses, or only focus on the parts they don’t like?

And I can also use it as a background for asking: Do you facilitate discussions and encourage team members to come up with their ideas? Or do you see generating designs and architectures as your job alone?

Posted in Uncategorized | Leave a comment

A Few Things I’ve Learned About Agile

Many years ago I worked on code that integrated tightly with 3rd party software. The APIs were not always documented, and when they were, the documentation sometimes had omissions or was just plain wrong. In a situation like this, what to do?

Management’s idea was to involve experts, people who were already familiar with the 3rd party software, in fact some had worked for the company that made it had a hand in creating it. However, the software was vast and nobody could understand more than a fraction of it. Once again, QA could only test the basics during development, thorough testing would have to wait until just before release. And that’s when we’d discover that our current design wouldn’t work and would need significant revision.

Lesson learned: When there are lots of things you can’t know, iterative development is the key to validating designs and getting a predictable schedule as soon as possible. It’s more important to get something working and testable first, then refine it, than it is to build it “production quality” from the start.

At another job, the team would divide up each project into independent components, and have each person work on a different component. We’d intend to do design reviews, and sometimes even do them. Then, after a few weeks, when it was time for a code review, someone would often point out a simpler way to achieve the same goal, or where more defensive programming was needed throughout the work. Only now it was a lot more work to change it. In other words, each part is exposed to the entire genius and foibles of the individual programmer who worked on it.

So we kept saying we needed more reviews and tried various things to encourage that. The problem is, when developers had a choice of working on their own code or going over someone else’s, they usually chose their own. That’s the task that was on the backlog that everybody else was tracking. Interrupting a co-worker to say “can you show me what’s changed since the last time I stopped by?” felt unusual.

The one thing that worked — and worked well — was for people to trade off tasks that lasted around a day. That way, I need to understand the code you wrote yesterday since I’m modifying it today. Plus, your change probably touched code that I’ve worked on and have opinions on. And discussing those opinions is exactly what we needed.

Lesson learned: It’s important to have multiple developers thinking about the details, and it’s really hard to thoroughly consider the details of something you’re not working on.  So explicitly resist developers urge to work on independent parts of a system.  Instead, assign people interdependent tasks that each take around a day.

 

None of this replaces good coding practices such as always checking return values (even if with just an assert), automated tests, readable code, etc. In fact, it’s designed to encourage those, to encourage people to share such best practices early and often.

Posted in Uncategorized | Leave a comment

Higher order functions are (sometimes) more natural

I had a debate with some friends at work the other day, about whether the higher-order function map just adds needless complexity to a language. They considered it a kind of “advanced” function, which surprised me. They said that by not having a variable that holds the current item, it was too concise and cryptic. So I did an experiment.

Daddy: I have a number i, it starts at 1, increases each time and stops and three, and for each i, take Oreo i and eat it.

5 year old: Oreo i? Oreos?

He walked away puzzled. A few days later, I tried again:

Daddy: For each Oreo o, eat o.

5 year old: ??

Again he just walked away without understanding what I’d said.

Daddy: Eat the Oreos.

5 year old: Sure! Yay!

In natural language, at least, we often don’t introduce a term to stand for the thing we’re talking about. Introducing it can just make your expression round about and clunky, giving you lots of noun and verb phrases to decode. Strunk’s Elements of Style says “Vigorous writing is concise.” The same can apply in programming.

Posted in Uncategorized | Leave a comment

Writing To A Binary Stream In C/C++: Harder Than It Should Be

Suppose you want to write some code to communicate using a binary protocol. You would think C (or C++) was a natural language for this. After all, low level data manipulation is one of it’s strengths. But it’s surprisingly hard to get this right — and surprisingly easy to write code that works now, but might stop working in the future.

Here’s a simplified version of the header:

struct request_header {
    uint8_t magic;  // Always 0x80
    uint8_t opcode; // e.g. get, set, add, etc.
    uint16_t key_length;
    uint32_t body_length;
};

How do you format a packet to send? Here’s the most straightforward way:

void write_get_header(uint8_t *buffer, uint16_t key_length)
{
    request_header *header = reinterpret_cast(buffer);
    header->magic = 0x80;
    header->opcode = 0x00; // GET
    header->key_length = key_length;
    header->body_length = key_length;
}

This is exactly what memcached itself does. The problem is, it’s not guaranteed to do what we want, because it uses undefined behavior. The reinterpret_cast is valid (if buffer happens to be aligned to the alignment of request_header, more on this later), but dereferencing the pointer leads to undefined behavior. Note that casting the other direction — casting an existing request_header to uint8_t * — lets you dereference, which is handy when you’re reading from the wire. But when formatting data to send, you’re out of luck.

On x86 with common compilers (g++, clang++, I assume MSVC) this compiles, runs, and does the right thing. So what’s the problem? If you change optimization settings, or upgrade you compiler, it could stop working, because according to the C++ language spec, the compiler is allowed to generate code that does just about anything, including going into an infinite loop or reformatting your hard drive. And this stuff can actually happen. There are contests for Craziest Compiler Output due to Undefined Behavior, some of the source files look quite reasonable, yet don’t do what you expect. Here’s a good introduction, and What Every C Programmer Should Know About Undefined Behavior.

If the compiler won’t warn you, even with -Wall -Wextra, and the code runs just fine, how can you detect undefined behavior? In general there’s no reliable way. Valgrind finds some of it, but because it only works on the compiled output, it won’t find this. Clang has an “undefined behavior sanitizer,” which doesn’t catch the reinterpret_cast, although it does catch the unaligned writes (see below).

But low level data manipulation like this are strengths for C and C++. So there must be a standard idiom for this, right?

The memcached server itself has this problem when it prepares to send replies. It uses the “union” of the struct with a char array of the same size:

union {
    struct {
        uint8_t magic;  // Always 0x80
        uint8_t opcode; // e.g. get, set, add, etc.
        uint16_t key_length;
        uint32_t body_length;
    } request;
    uint8_t bytes[8];
} request_header;

The idea is to take your uint8_t *, cast it to a request_header *, then write to it through the field. That’s supported in C99, but unfortunately not in C++.

And even in C, it’s only supported with caveats. The struct can have padding, and the amount of padding is up to the compiler and the target architecture.  This doesn’t lead to undefined behavior, but it does lead to incorrect behavior: putting data in the wrong place.  If you restrict yourself to processors with byte addressing and power-of-two word size you’re probably ok, because of the way the fields are laid out: the total size of all fields before the uint16_t, assuming no padding, are a multiple of 16 bits, and before the uint32_t, 32 bits.

To tell the compiler you don’t want padding, you can use #pragma pack But now you’re not using ISO standard C, meaning your code isn’t portable to other architectures. And if #pragma pack actually changes the layout of your struct, accesses to it are unaligned, pretty much by definition. What’s the problem with that?

Even on x86_64, uint16_t has an alignment of 2, which means it’s address needs to be even. If we start our header at an odd address, because the previous request had an odd-lengthed key, we’ll be writing through an odd address which is — you guessed it — undefined behavior, and on some popular architectures can be hundreds of times slower, or can simply seg fault. And even on x86, a compiler is free to assume that the lower order bit of the pointer is zero and store other data there if it likes.

In practice, on x86 such writes turn into movl assembly instructions, which will do the right thing with un-aligned addresses. But again, we’d rather not have to rely in intimate knowledge of the compiler and processor. We’d like to say what we mean in valid C/C++, so the compiler can exploit all that. C/C++ is designed for low level stuff like creating network messages, right? So what’s the right idiom?

The official way is to use memcpy() and trust the compiler to optimize it away:

void write_get_header_with_memcpy(uint8_t *buffer, uint16_t key_length)
{
    uint8_t magic = 0x80;
    memcpy(buffer, &magic, sizeof(uint8_t));  // Not needed for uint8_t, but I like the consistency.

    uint8_t opcode = 0x00;
    memcpy(buffer+1, &opcode, sizeof(uint8_t));  // Not needed for uint8_t, but I like the consistency.

    uint16_t key_length_network_order = htons(key_length);
    memcpy(buffer+2, &key_length_network_order, sizeof(uint16_t));

    uint32_t body_length_network_order = htonl(key_length);
    memcpy(buffer+4, &body_length_network_order, sizeof(uint32_t));
}

Because memcpy needs an address, we need to create temporaries, so every line turns into two. If we’re not worried about padding problems we can use offsetof(), otherwise we need to compute the offsets ourselves. This is pretty verbose and error prone.

Helper functions can make things a little better, e.g.:

void write_uint16_t(uint8_t *buffer, uint16_t value)
{
    int16_t value_network_order = htons(value);
    memcpy(buffer, &value_network_order, sizeof(uint16_t);
}

And you can introduce constants for the offsets, to make things a little more readable. But it’s still harder to read than the struct version.

You could also write the entire header in a local variable, then memcpy() the entire thing to the buffer, and hope the compiler optimizes out the copy. I tried this, and in simple example code, it worked! But in production code, with argument checks and returning the address of the next byte to write to, neither gcc nor clang were able to optimize away the memcpy().

So in the end, people mostly use the struct, assume there won’t be any extra padding as long as the total size of all fields before a given one is a multiple of the field’s size, with the union (most often, because it works in C) or the reinterpret_cast, and that writing to unalinged memory won’t cause any problems. This is exactly what memcached itself does.

Posted in Uncategorized | 3 Comments

In Defense Of Implicit Code In C++

tl;dr: When C programmers start using RAII in C++, they’re less productive at first because they don’t think of return; as cleaning up and returning, just returning. They blame the language, but they just need to adjust their mental habits a little. The problem isn’t C++ (it’s got lots of other problems), it’s just that they’re trying to write C in another language. Then you can keep the resource handling code separate from your main logic, you’ll get it correct a lot sooner, etc.

At work we’re having a debate on what language we should use to implement our software, C or C++. There has been a lot written about why C++ sucks, and it really is a bloated language with a lot of traps for programmers at all levels of experience. But I believe that when it comes to writing good, efficient, system-level code, using the right set of C++ features can make your code better. And Resource Acquisition Is Initialization is one of those.

In RAII, the compiler inserts implicit calls to destructors, so you don’t need to remember them, and therefore can’t forget them. But this can lead to confusion, because now it’s harder to know what a particular line of code does. People with a C background look at a statement like “return;” and think “that returns from the function,” not “it cleans up and returns.”  And until they change that mental habit, RAII is murky, confusing and full of gotchas.

Like many things in life, the implicit thing has advantages and disadvantages. Being explicit has the benefit that every line of code is “self contained.” To know what it does, you only need to look at that line of code. At a previous company, we had some horrible implicit code, where a[n] = b[n] created all sorts of temporaries and did all sort of magic under the hood, because operator[] and operator= were overloaded. That style of code was promoted by those who were quick to perceive the benefits of abstraction, but slow to realize its costs. To understand the performance of that one line — where a and b were rows of a matrix — you had to find and understand 5 different classes. That was just too much for most people, and so a lot of sloppy code was written, which we spent months trying to understand and speed up.

So there’s a cost to implicit, but what problems does it let you avoid? With explicit resource management, there’s a convention that when a function allocates some resource, then later encounters an error, it should free the resource before returning. With C, you can’t tell the compiler that explicitly, so you have to do the work of putting in the free() call in all error paths. This means you might forget to put it somewhere, or if someone adds a new error handler they might forget to call free(), or if they reorder the code they might overlook a call to free() they should have added, etc. So while explicit makes it easy to see what a line of code does, it doesn’t make it easy to see if there’s anything missing.

It also means the code that implements separate concerns are mixed together, making it harder to get any one of them correct. The logic for allocating and freeing is mixed in with the main work of the function, as well as error checking. If you’re looking at the code and thinking about the typical, non-error case, it’s easy to overlook a problem with error detection or resource management. So when I’m writing or reading the code, I find it hard to understand all aspects of the code in a single pass. Instead, after I’ve read the code once, I need to do a separate pass of “now did they remember to call free() everywhere?” With practice, you can keep a stack in your head of all the things allocated up to this line of code, and a mental checklist of all the error conditions you might want to check. But that’s a set of mental habits that we need to develop, and before we develop them, we get a lot of leaks and missed error checking.

An alternative is to specify all the steps in one place, e.g. a class definition, then have the compiler insert them for you. That’s what happens with std::unqiue_ptr, or a class to lock and free a Mutex. So it means that when variables go out of scope, things happen that aren’t explicitly listed in that line of code, so return ; can perform arbitrary computation, as can }. That means you need to change the way you look at such statements, because new code conventions develop. But just because those code conventions are new, and different from what you’re used to, doesn’t mean they’re bad. Any tool takes a little getting used to, and until you do, there will be confusion and gotchas. I’d say the mental habits of the “implicit” approach are different — and easier for people to learn. Which is why I like the implicit approach in moderation.

So it’s true you have to learn something about the order of constructors and destructors to get std::unique_ptr right. But it’s actually pretty easy, because it’s designed to do the obvious thing as much as possible.

Posted in Brain Rental | 1 Comment

Finding A Job You’ll Love: Negotiating An Offer

So you did well in an interview and they’ve made you an offer. And now you just have to decide which offer to accept. Right?

Well, not quite. You can always go back to your top choice and ask for more. And now, when they’ve decided that they want you but before you’ve accepted, is the best time. Once you’re employed, you’ll have very little leverage. So, even though your value to the company goes up dramatically in the first couple years, as you learn more of the details of their code and systems, you won’t get much of a raise.

So you’d better negotiate now. Most companies will ask how much you’re currently making and offer you a small increment on top of that. They’ll even ask you what your “salary expectations” are, meaning they’ll ask you for the smallest increment that you’ll accept! I’ve heard it said that you should respond “I hope to be paid fairly” and otherwise refuse to give them a number. Maybe you can explain to them that you feel the process is unfair, and that you won’t give them a number any more than they would if you asked “What’s the most you’re willing to pay me?”

I’ve never done that, because I’ve been worried that I would come across as difficult to work with, or just generally being obstructive. But as much as possible, I try to follow the principals in Getting to Yes. You should all go out and read that now: You’re guaranteed to have to negotiate something at some point in your life, such as buying a house or a used car. The short version is:

  • Emphasize objective criteria. Look for salary surveys on the web. The tricky part is finding a job description that accurately describes your job, but you should be able to get a range of possibilities. Here’s a tip: most organizations believe they do a thorough job of interviewing people, and thus end up with people who are above average. So once you’ve found the salary range, you should be able to say “I think my background and skills match this position more than most of these people, so I think a fair salary is somewhere in the upper end.”

    Emphasize that you want to be paid fairly, not that you’re on the fence about going there. It’s not about “would I still work for you if you didn’t increase the offer,” it’s “I want to be paid what I’m worth.” In a market based economy, “what I’m worth” is more or less what the market is willing to bear, so…

    Get an offer from another company. This is part of why you can’t just let a recruiter do the whole search for you, they’ll only get an offer from a single company. When another company makes you an offer that’s a little more than what you’re making now, go back to the first company and say “If you offer me a little more than that, I’ll be happy to work for you.”
     

  • Invent options for mutual gain. Salary is the big one that everyone focuses on, but its often hard to get approval for more than a small increase in salary. Look at the benefits that other companies are offering. How many holidays, vacation days and sick days? Will they pay for parking or a subway pass? Do they have a 401(k) match? If you’ll be taking a hit on any of those, compared with either your current job or other companies you’re applying to, bring it up. If the company can’t address them directly (e.g. they can’t institute a 401(k) matching program just for you), you can use them to argue for greater salary.
     
  • And with that, and the info in the rest of the “Finding A Job You’ll Love” series, I hope you’ll be a little better and finding and landing that great job. Good luck, and please let me know if you found it useful!

Posted in Brain Rental | 6 Comments

Finding a Job You’ll Love: The Interview

It’s best if you can do some prep for the interview. Try to understand what the company’s product is. It’s amazing how cryptic a web site can be, especially if it’s enterprise software in some industry you don’t know. Still, make an attempt, and ask about anything that’s confusing. The fact that you’ve understood some basic things about their product, and that you want to understand more, will set you apart. Of course, it will also help you understand what you’d be doing if you worked there.

Then you should try to read up on the sorts of questions people like to ask. For some reason, people ask algorithms questions far out of proportion to how often they come up in an actual job. Pick up a copy of The Algorithm Design Manual
and read the first section (chapters 1-10) during lunch or in the evening. Even if you only get half way through, you’ll be better prepared than most.

The interview is lopsided: you’ll spend almost all your time answering their questions and only have a little time to ask yours. That’s ok. If you’re still uncertain after the reverse phone screen and the in-person interview, you can always ask to talk to someone on the phone another day.

Keep in mind that they’re trying to figure out whether to hire you. So while it’s good to mention how great this job would be for you, you want to emphasize how great you would be at this job. They don’t really care that you’ve always wanted to learn Lisp and this job would let you do so. They care that you can design & implement software, quickly understand complexity, etc. With that in mind, ask yourself “what are they really trying to get at with this question?” For example, if they ask about a particular job on your resume, feel free to say that another job is probably more relevant and ask them if they’d rather hear about that.

Generally you’ll meet 5-6 people for an hour each, and there are two great questions you should ask everyone. After the initial small talk, ask “What does someone need to be really successful at this position?” Not only will many interviewers think this is a good question for a candidate to ask, but they’ll tell you what they’re looking for. For the rest of the hour, when answering questions, try to answer with stuff related to what they’re looking for. Also, write down the answer. You’ll want to refer to it much later when you’re trying to decide between the different places you interviewed at.

Save the second question for near the end of the hour. “What is the best and worst thing about working at X?” Again, you should write down the answer, because you’ll have 5 or 6 of them from every company you interview at. Both the best and worst thing are important to consider.

Beyond that, here are some good questions to ask:

  • What are the day-to-day tasks I’d have?
     
  • What day-to-day tasks do you typically do? What have you been doing over the past month?

For a manager:
These were covered in the Reverse Phone Screen post.

For the tech lead:

  • How is scheduling done? How are estimates and deadlines created?
     
  • How are tasks divided up among programmers?
     
  • Who does the overall design/architecture?
     
  • Who does the design/architecture of individual contributor’s tasks?
     
  • How much time do leads spend with their programmers?
     
  • Do you do code reviews? How are they done?
     
  • How much do you look over people’s check ins?
     

Good luck, and try to be relaxed and friendly. But most of all, focus on the technical puzzles they give you. If you got into programming because you enjoy that kind of puzzle solving, hopefully the puzzles will at least be interesting.

Posted in Brain Rental | 1 Comment