Return to Blogs|
3/16/2010 11:06:18 AM - Google Interview Questions
I found a list of Google interview questions here: http://blog.seattleinterviewcoach.com/2009/02/140-google-interview-questions.html
Personally, I've always thought that asking questions like these during an interview is inane. In my opinion, the best - bar none - trait that a programmer can have is an ability to adapt. That's where I think that old school game programmers - the best ones - had an edge. The best old school game programmers couldn't rely upon a programming team being so large that you could just pass off everything that you didn't know to someone who specialized in that field. Need to calculate a second-order differential equation? Perform a fluid dynamics simulation for your water currents? Resolve collision detection issues within a physics simulation? Create a proprietary algorithm to compress video and then allow it to be decompressed in real-time on a double-speed CD-ROM player on an 80486? Render shadow volumes? As a computer programmer, you're likely to run into many situations where you need a bit of knowledge that you don't currently possess, and oftentimes that knowledge will reside in a field in which you've done little or no studying, or which you haven't studied for many years.
The typical response to such criticisms is that the interviewer isn't really looking to see if the interviewee gets the right answer or not, but rather how they think. That's nonsense.
First of all, if you really want to see how a person thinks, you shouldn't be asking ridiculous questions about things like the basic quick sort algorithm or what specific C library function does some obscure task. You should ask conceptual questions - things that the interviewee isn't going to be able to recite from memory (such as "use the so-and-so sorting algorithm which executes in O (n log n) time") and thus would require them to demonstrate their ability to truly understand the underlying issues and derive, whether through their current knowledge or research, a potential solution to the problem.
Second, you shouldn't ask dozens and dozens of questions intended to "see how a person thinks" because if the questions are any good they'll take a bit of time for the person to comprehend, potentially research, and formulate a solution.
To illustrate the problem with Google's current approach I'd point to the software engineer question asking:
Given a file of 4 billion 32-bit integers, how to find one that appears at least twice?
One of the more popular answers suggested is (and I quote, hence the grammatical errors):
The maximum size of int is 2,147,483,647 in case of 32-bit integers. Thus we need declare an array of long long int. Then we can do a merge sort and in doing so we can find the integers which appear at least twice in the merge step. Thus we can solve the problem in n log n time.
Would you really learn anything from this answer in regards to the programmer's skill? The solution would work but it wastes both CPU cycles and memory, and you'd have no idea if the interviewee could come up with something better with a bit more time and effort (rather than just spouting whatever comes to their mind first), or whether they'd always be constrained to such an "academic" approach to solving such a problem.
First of all, they don't need to declare "an array of long long int". The question states that the values are integers - thus you only need an integer array to store them. However, you need more integer storage areas than you could specify with a signed 32-bit integer...which is why you should go with an unsigned integer (which can specify up to 4,294,967,296 values) for your indexing value.
Using a merge sort for a problem like this is wildly inefficient for multiple reasons. It will waste billions of CPU cycles, require far more memory than should actually be necessary, and will consume vast amounts of memory bandwidth as the nodes are constantly re-organized. This is often, however, exactly the type of "quick" answer that people asking these sorts of questions think says something positive about a programmer. Trust me. It doesn't. This is the kind of solution that you'll get from people that can quote a "standard" answer to a "standard" problem from a book but don't have any real-world experience in terms of trying to build a custom solution to a specific problem in the real world, where things like execution speed and efficient utilization of memory actually matter.
Here's my solution:
Read the entire file of 4 billion 32-bit integers into memory (for speed - the hard drive will be able to accomplish that operation much faster than if you tried to read each integer individually.) Allocate 500 million bytes to utilize as a bit field. Traverse through the numbers sequentially. As each number is read, perform two operations upon it - a divide to ascertain the byte index and a modulo to determine the bit offset within that byte. The pseudo-code would look like this:
// 4 billion integers are stored in an integer array called IntegerArray.
// 500 million bytes are reserved (and set to zero) in an unsigned char array called ExistsArray.
unsigned int c;
unsigned char ByteValue=ExistsArray[ByteIndex];
if (ByteValue&(1 << BitIndex))
; // The value specified by IntegerArray[c] already existed.
// The value specified by IntegerArray[c] did not already exist.
ExistsArray[ByteIndex]=ByteValue|(1 << BitIndex);
The above pseudo-code will blow the doors off of any algorithm utilizing a general-purpose merge sort and use a lot less memory...and that's before implementing obvious technical optimizations like unrolling the loop to minimize some of the calculations. (For example, the 'ByteIndex' value is divided by 8 and thus only needs to be calculated once for every 8 iterations of the main loop. But I digress....) The solution demonstrates a true understanding of the problem, an ability to derive a solution (rather than simply quote some existing function that could do it in a less than optimal way), and it will attract more ladies because of the prodigious amounts of bit twiddling. If the amount of memory used was an issue, there are a variety of things that you could to do address that. As but one example, you could compress "strides" of the integer existence bits - say, 256 at a time - via RLE (run-length encoding - a fast and efficient method of encoding data that favors sequences that repeat for at least short periods of time.) Each time that you needed to read or set a given bit, you'd decompress the stride (and recompress it if you change something.) Your access time - latency - per read/write operation would slow, but that might be less important than minimizing memory use. Some operations could be optimized by noting whether an entire sequence was set to a given value (so that read operations could immediately return the bit's value without decompressing the stride and write operations could return - avoiding the decompression and recompression - if the value being set matched the value utilized throughout the stride.) You could buffer writes so that the decompression and recompression of a "set bit" operation wouldn't stall the main thread (which would improve performance on a CPU supporting more than one core.)
The point here is that in programming the devil is often in the details. It's frequently the case that a programmer can't craft an intelligent solution to a problem without knowing the specifics. Thus, if you want to truly test their ability, you should have a discussion about how they might go about solving a particular problem rather than simply listing a variety of problems on a piece of paper and expecting an answer in a few sentences. This means, of course, that you probably ought to ask fewer such questions (so that the conversation can go into more depth) rather than more. It's a lot more work on the part of the interviewer, but you'll actually gain some real insight into how the person might approach a given problem.