Category: Algorithms

# 面试高频题(转）

http://www.mitbbs.com/article_t/JobHunting/32058385.html

coding:
– JOIN: nested join, hash join, sort-merge join
– Number: Fibonacci, prime，随机取文件某一行
– String: strstr, wordcount
– Tree: height, lca, balance tree
– Heap: 查找最大的k个数
– DP: 最大连续子串和
– array: find a key in rotated array, 去除重复字符
– 递归回溯：变化很多，这方面需要大量练习

java GC
C++ virtual, smart pointer
regex使用

search engine: 倒排表,拉链，稀疏索引，空间向量模型，tf*idf,
large scale data: hash, consistent hash, bloom filter, bitmap, 外排序，
partition

network: socket, tcp3次握手, asyschnoized io, epoll, select, 惊群

queue/stack实现
LRU
trie　tree

# 自底向上遍历树避免重复计算

1 Given a binary tree, find the lowest common ancestor of two given nodes in the tree.

http://leetcode.com/2011/07/lowest-common-ancestor-of-a-binary-tree-part-i.html

2

### 判断二叉树是不是平衡

http://zhedahht.blog.163.com/blog/static/25411174201142733927831/

http://www.geeksforgeeks.org/check-given-binary-tree-follows-height-property-red-black-tree/

3.

http://leetcode.com/2010/11/largest-binary-search-tree-bst-in.html

Naive Approach:
A naive approach is to reuse the solution from Determine if a Binary Tree is a Binary Search Tree. Starting from the root, we process in the order of current node, left child, then right child. For each node, you would call isBST() to check if the current subtree is a BST. If it is, then we have found the largest BST subtree. If it is not, then we have to continue examining its left and right child. If only one of the subtrees is BST, then we can return that subtree. However, if both left and right subtrees are BSTs, then we have to compare which subtree is larger (has more descendant nodes), then return the larger one.

Assume that we have a complete tree (ie, all leaves are at the same depth) with n nodes, the naive approach’s run time complexity is O(n lg n). The proof is left as an exercise to the reade

A Bottom-up Approach:
The naive approach is using a top-down approach. It is hardly efficient, simply because we are calling isBST() over and over again. Each time isBST() is called, it traverses down to the leaves to verify if the subtree is a BST.

# Counting sort, bucket sort and radix sort

When comparing counting sort, bucket sort and radix sort, it’s important to know it’s about trade off between memory, time, and simplicity of data structure. Counting sort, bucket sort and radix sort can essentially all be considered as bucket sorting.

Radix sort: n log_r(k). Essentially radix sort performs bucket sort log_r(K) times. It can deal with larger key range (than bucket sort).    r is the base here. the larger is r the better time complexity we got.  but it also means more memory. remember the bucket   array in radix sort is of length r (smaller than n or maximum value of K)  If we use n as the base, it becomes a normal bucket sort. why?   cuz now we have only one ‘digit’, so we only do one bucket sort/counting sort.

Bucket sort: is a generalization of counting sort. The number of bucket sort used  is n. and range can be larger than n. thus we saved some space than in counting sort. but it loses the worse case of O(n+K).  Also the data has to be uniformly distributed.

Counting sort: O(n+k) time and O(K) space

Reference:

https://www.ics.uci.edu/~eppstein/161/960123.html

# Given a stream of unsorted integers, find the median element in sorted order at any given time

http://www.ardendertat.com/2011/11/03/programming-interview-questions-13-median-of-integer-stream/

http://codercareer.blogspot.com/2012/01/no-30-median-in-stream.html

http://www.geeksforgeeks.org/reservoir-sampling/

# 取样问题:从未知总数的元素中选择一个

从事先未知总数为n的对象中随机选择一个的方法。有两种常见的具体问题：

1.读取一个未知行数的文件，随机输出其中的的一行，同时最多只能缓冲一行的内容（《编程珠玑（续）》习题15.3利用了这种形式）；

2.对链表进行一趟遍历，随机输出其中的一个结点的元素，只能使用一个临时指针。