Return to Blogs

3/16/2010 11:06:18 AM - Google Interview Questions

I found a list of Google interview questions here:

Personally, I've always thought that asking questions like these during an interview is inane. In my opinion, the best - bar none - trait that a programmer can have is an ability to adapt. That's where I think that old school game programmers - the best ones - had an edge. The best old school game programmers couldn't rely upon a programming team being so large that you could just pass off everything that you didn't know to someone who specialized in that field. Need to calculate a second-order differential equation? Perform a fluid dynamics simulation for your water currents? Resolve collision detection issues within a physics simulation? Create a proprietary algorithm to compress video and then allow it to be decompressed in real-time on a double-speed CD-ROM player on an 80486? Render shadow volumes? As a computer programmer, you're likely to run into many situations where you need a bit of knowledge that you don't currently possess, and oftentimes that knowledge will reside in a field in which you've done little or no studying, or which you haven't studied for many years.

The typical response to such criticisms is that the interviewer isn't really looking to see if the interviewee gets the right answer or not, but rather how they think. That's nonsense.

First of all, if you really want to see how a person thinks, you shouldn't be asking ridiculous questions about things like the basic quick sort algorithm or what specific C library function does some obscure task. You should ask conceptual questions - things that the interviewee isn't going to be able to recite from memory (such as "use the so-and-so sorting algorithm which executes in O (n log n) time") and thus would require them to demonstrate their ability to truly understand the underlying issues and derive, whether through their current knowledge or research, a potential solution to the problem.

Second, you shouldn't ask dozens and dozens of questions intended to "see how a person thinks" because if the questions are any good they'll take a bit of time for the person to comprehend, potentially research, and formulate a solution.

To illustrate the problem with Google's current approach I'd point to the software engineer question asking:

Given a file of 4 billion 32-bit integers, how to find one that appears at least twice?

One of the more popular answers suggested is (and I quote, hence the grammatical errors):

The maximum size of int is 2,147,483,647 in case of 32-bit integers. Thus we need declare an array of long long int. Then we can do a merge sort and in doing so we can find the integers which appear at least twice in the merge step. Thus we can solve the problem in n log n time.

Would you really learn anything from this answer in regards to the programmer's skill? The solution would work but it wastes both CPU cycles and memory, and you'd have no idea if the interviewee could come up with something better with a bit more time and effort (rather than just spouting whatever comes to their mind first), or whether they'd always be constrained to such an "academic" approach to solving such a problem.

First of all, they don't need to declare "an array of long long int". The question states that the values are integers - thus you only need an integer array to store them. However, you need more integer storage areas than you could specify with a signed 32-bit integer...which is why you should go with an unsigned integer (which can specify up to 4,294,967,296 values) for your indexing value.

Using a merge sort for a problem like this is wildly inefficient for multiple reasons. It will waste billions of CPU cycles, require far more memory than should actually be necessary, and will consume vast amounts of memory bandwidth as the nodes are constantly re-organized. This is often, however, exactly the type of "quick" answer that people asking these sorts of questions think says something positive about a programmer. Trust me. It doesn't. This is the kind of solution that you'll get from people that can quote a "standard" answer to a "standard" problem from a book but don't have any real-world experience in terms of trying to build a custom solution to a specific problem in the real world, where things like execution speed and efficient utilization of memory actually matter.

Here's my solution:

Read the entire file of 4 billion 32-bit integers into memory (for speed - the hard drive will be able to accomplish that operation much faster than if you tried to read each integer individually.) Allocate 500 million bytes to utilize as a bit field. Traverse through the numbers sequentially. As each number is read, perform two operations upon it - a divide to ascertain the byte index and a modulo to determine the bit offset within that byte. The pseudo-code would look like this:

// Assumptions:
// 4 billion integers are stored in an integer array called IntegerArray[].
// 500 million bytes are reserved (and set to zero) in an unsigned char array called ExistsArray[].

unsigned int c;

for (c=0;c<4000000000;c++)
int ByteIndex=IntegerArray[c]/8;
int BitIndex=IntegerArray[c]%8;
unsigned char ByteValue=ExistsArray[ByteIndex];

if (ByteValue&(1 << BitIndex))
; // The value specified by IntegerArray[c] already existed.
// The value specified by IntegerArray[c] did not already exist.

ExistsArray[ByteIndex]=ByteValue|(1 << BitIndex);

The above pseudo-code will blow the doors off of any algorithm utilizing a general-purpose merge sort and use a lot less memory...and that's before implementing obvious technical optimizations like unrolling the loop to minimize some of the calculations. (For example, the 'ByteIndex' value is divided by 8 and thus only needs to be calculated once for every 8 iterations of the main loop. But I digress....) The solution demonstrates a true understanding of the problem, an ability to derive a solution (rather than simply quote some existing function that could do it in a less than optimal way), and it will attract more ladies because of the prodigious amounts of bit twiddling. If the amount of memory used was an issue, there are a variety of things that you could to do address that. As but one example, you could compress "strides" of the integer existence bits - say, 256 at a time - via RLE (run-length encoding - a fast and efficient method of encoding data that favors sequences that repeat for at least short periods of time.) Each time that you needed to read or set a given bit, you'd decompress the stride (and recompress it if you change something.) Your access time - latency - per read/write operation would slow, but that might be less important than minimizing memory use. Some operations could be optimized by noting whether an entire sequence was set to a given value (so that read operations could immediately return the bit's value without decompressing the stride and write operations could return - avoiding the decompression and recompression - if the value being set matched the value utilized throughout the stride.) You could buffer writes so that the decompression and recompression of a "set bit" operation wouldn't stall the main thread (which would improve performance on a CPU supporting more than one core.)

The point here is that in programming the devil is often in the details. It's frequently the case that a programmer can't craft an intelligent solution to a problem without knowing the specifics. Thus, if you want to truly test their ability, you should have a discussion about how they might go about solving a particular problem rather than simply listing a variety of problems on a piece of paper and expecting an answer in a few sentences. This means, of course, that you probably ought to ask fewer such questions (so that the conversation can go into more depth) rather than more. It's a lot more work on the part of the interviewer, but you'll actually gain some real insight into how the person might approach a given problem.

- TZ

Add Comment

3/17/2010 12:09:44 AM by TZ

I just found a different set of "Google Interview Questions" and I think that these ones are a bit better than the previous list...although I'd still seriously question their predictive value in terms of accurately identifying superior talent for the job in question. Here are some example questions...and my immediate responses:

Question: How much should you charge to wash all of the windows in Seattle?
My answer: I'd charge whatever my costs were, plus 20%.

Question: How many piano tuners are there in the world?
My answer: Roughly the number of pianos divided by 100. (I would have divided by 50, but there are an awful lot of pianos that clearly aren't tuned very often.)

Question: Why are manhole covers round?
My answer: Because if they were square they wouldn't fit in the round holes in the street.

Question: A man pushed his car to a hotel and lost his fortune. What happened?
My answer: He bet someone his fortune that he'd win a race back to the hotel and his car broke down.

Question: You need to check that your friend, Bob, has your correct phone number but you cannot ask him directly. You must write the question on a card and give it to Eve, who will take the card to Bob and return the answer to you. What must you write on the card, besides the question, to ensure Bob can encode the message so that Eve cannot read your phone number?
My answer: Calculate a standard CRC-32 (checksum) on my phone number and write it down on a piece of paper.

Question: You're the captain of a pirate ship and your crew gets to vote on how the gold is divided up. If fewer than half of the pirates agree with you, you die. How do you recommend apportioning the gold in such a way that you get a good share of the booty, but still survive?
Answer: Tell the pirates that you're going to split the booty evenly...with 50% of the crew. If you really wanted to maximize your share, though, you'd tell them that you'll split the treasure with the last man standing and let them duke it out until there's only one pirate left (thus ensuring that you get a full 50%, rather than splitting it evenly amongst a much larger number of people.) Yes, it's brutal, but that's pirate life on the high seas.

Question: Explain a database in three sentences to your eight-year-old nephew.
Answer: A database is like a computerized file cabinet. It allows you to store information within it and retrieve that information based upon your selected search criteria. Learning more about databases on your own may result in you making a lot more money in the future than you otherwise would.


3/17/2010 1:43:39 AM by TZ

Since I'm not a fan of questions such as those listed above in terms of identifying superior talent for a given job, what types of questions would I recommend instead?

How long have you wanted to do a job like this? Why?

Why do you think that our culture is a good fit for you?

How do you spend your free time?

What have you done that makes you think that you'll be able to succeed in this job? Provide some details.

What changes would you make to our business and company if you were the CEO?

I could go on and on but you get the point. With questions such as these, you'll learn something about the person - what makes them tick, motivates them, drives them...and just how driven they actually are. You'll learn what they've worked on in the past that's relevant to the job for which they're applying, and whether they're the creative type that's always wondering how something that's "good" could be made "better".

If you follow up on the their answers - say, by asking some detailed questions about work they've done in the past - you'll get a good idea of their technical prowess, their creativity, their ability to efficiently schedule their time, and a multitude of other things. Notice, though, that doing that type of interview means that you need to actually trust the interviewer to think on their feet. Look at it this way - do you think that you're likely to get a better result from a pre-planned interview, or from one that actually takes into account what the interviewee says and responds to it to learn even more?

It's idiotic. It reminds me of how banks got into their current mess. In the old days, banks used to pay loan officers real salaries and had them do real work - actually get to know the customer coming into a branch and formulate, within some basic guidelines, whether loaning that person or company that amount of money made sense and, if so, at what interest rate and on what terms. Nowadays, though, regardless of whether you bank at Wells Fargo, JP Morgan Chase, Bank of America, or your local community bank, the person with whom you're going to interact is viewed by the bank's upper management as too dimwitted to make such decisions (which is a byproduct of the banks wanting to ratchet up their profitability by lowering wages, which required that they standardize loan terms since you can't have inexperienced people making such important financial decisions.) Thus, what you get are standardized loan application sheets that are relevant for the "average" person but overstate and understate risk for many others.

That's basically what Google and companies like them are doing when they pre-fabricate interview questions - especially technical ones and others intended, they say, to help the interviewer get a sense of "how the interviewee thinks" - rather than relying upon the skill of the interviewer to customize the interview as the two parties get to know one another and carry on a conversation. Questions are fine as a starting point, but what's really important is where they lead, and some questions - such as the ones that I listed above - are far better than others at getting a conversation going that's going to result in the company getting an accurate idea of who the interviewee really is and a rough idea of their potential.

I think that technical questions, in particular, are often used as a crutch by companies that don't actually trust their interviewers. They want them to ask a standardized set of questions, rate the interviewee's answers, and then mail out offer letters based upon the scores. It's all very methodical, formulaic, scientific...and, often, completely inaccurate at identifying superior talent.


3/14/2011 10:40:08 PM by Tyronne Mayadunne

Why you dont you just declare a hashtable and hash all 4 billion integers into the array. Each time you have a collision, check if the two numbers (ie the one previously present and the one just hashed in) are equal or whether they just hashed to the same slot due to the hash function.

Running time is O(n) where n is the number of integers (4 billion in this case).

-Tyronne Mayadunne

Home | Download Games | Buy Games | News | Blog | Links | Press | PAD Files | About Us | Contact