Leetcode 5: Longest Palindromic Substring

Longest Palindromic Substring

Total Accepted: 71879 Total Submissions: 343504 Difficulty: Medium

Given a string S, find the longest palindromic substring in S. You may assume that the maximum length of S is 1000, and there exists one unique longest palindromic substring.

Code:

class Solution(object):
    def longestPalindrome(self, s):
        """
        :type s: str
        :rtype: str
        """
        if len(s) == 0:
            return ""

        start = 0
        max_len = 1        

        for i in xrange(len(s)):
            if i-max_len-1 >=0 and s[i-max_len-1:i+1] == s[i-max_len-1:i+1][::-1]:
                start = i - max_len - 1
                max_len = max_len + 2
                continue

            if i-max_len >=0 and s[i-max_len:i+1] == s[i-max_len:i+1][::-1]:
                start = i - max_len
                max_len = max_len + 1

        return s[start:start+max_len]

 

Idea: 

The optimal solution is O(N^2) time complex. So nothing very special. You traverse every position in `s`. You always compare if s[i-max_len-1:i+1] reads same as s[i-max_len-1:i+1][::-1]. See the picture below.

Untitled Diagram

We first test whether s[i-max_len-1:i+1] is palindrome then s[i-max_len:i+1] because the next palindrome can possibly be longer than the previous palindrome by at most 2 characters. 

 

Reference:

https://leetcode.com/discuss/21332/python-o-n-2-method-with-some-optimization-88ms

Compile Hadoop 2.7 on Eclipse on Ubuntu 15.04

  1. Install libtool: sudo apt-get install libtool 
  2. Install protoc 2.5: http://stackoverflow.com/questions/29797763/how-do-i-install-protobuf-2-5-on-arch-linux-for-compiling-hadoop-2-6-0-using-mav
  3. Git clone hadoop-common and compile using Maven then import into Eclipse: https://wiki.apache.org/hadoop/EclipseEnvironment
  4. Make sure Eclipse uses officially supported JRE: https://wiki.apache.org/hadoop/HadoopJavaVersions, http://help.eclipse.org/mars/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-add_new_jre.htm
  5. You may still find some class cannot be resolved, or some import package can’t be resolved. This is due to that Maven generates some java files that Eclipse doesn’t know how to link. The only way to do that is to move those mvn generated files (usually in `some nested directory/target/generated-sources or generated-test-sources`) to the correct place as indicated from the Eclipse error message.
  6. For other miscellaneous errors you can refer to http://diary-of-programmer.blogspot.com/2014/11/hadoop-tips-hadoop-build-error-related-findbugs-protobuf-avro.html
  7. Run the word count example to test!

Python multiprocessing map function with shared memory object as additional parameter

Suppose you want to create a bunch of processes to consume a list of elements:

import multiprocessing
def consume(ele):
    return ele ** 2    

if __name__ == '__main__': 
    m_list = [9,8,7,6,5]
    pool = multiprocessing.Pool(processes=300)
    result = pool.map(consume, m_list)
    print result

 

Now, if you want to have a dictionary to be shared across processes, and pass it as an additional parameter to `consume`:

import multiprocessing
from multiprocessing import Manager
import functools

def consume(ele, dic):
    if dic.get(ele) is None:
        dic[ele] = ele ** 2
    return ele ** 2    

if __name__ == '__main__': 
    m_list = [9,8,7,6,5]
    # Use proc manager to create a shared memory dict across processes
    proc_manager = Manager()
    m_dict = proc_manager.dict([(s, s**2) for s in [3,4,5,6]])
    pool = multiprocessing.Pool(processes=300)
    result = pool.map(functools.partial(consume, dic=m_dict), m_list)
    print result
    print m_dict

 

Notice that we use functools to add additional parameter. We use Manager to create a shared memory object across processes.

 

p.s.: This post may be helpful to understand how (when) processes are forked in Python: http://stackoverflow.com/questions/8640367/python-manager-dict-in-multiprocessing/9536888#9536888

Very important to set verbose output for Ubuntu User

I often mess up with Ubuntu System so much that I can’t manage to boot the system. I always end up with being stuck at the splash screen with no clue what is going on in the background. So, I suggest that for every ubuntu user you should turn on verbose output during the boot. This can be achieved by:

  1. open /etc/default/grub, change quiet splash to empty string for `GRUB_CMDLINE_LINUX_DEFAULT`.
    # GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
    GRUB_CMDLINE_LINUX_DEFAULT=""
  2.  Update to take effect:

    sudo update-grub

     

Reference:

http://askubuntu.com/questions/248/how-can-i-show-or-hide-boot-messages-when-ubuntu-starts

Leetcode 4: Median of Two Sorted Arrays

https://leetcode.com/problems/median-of-two-sorted-arrays/

Median of Two Sorted Arrays

Total Accepted: 65901 Total Submissions: 385447 Difficulty: Hard

There are two sorted arrays nums1 and nums2 of size m and n respectively. Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).

Let me first show the code and then show the idea.

import sys

class Solution(object):
    def findMedianSortedArrays(self, nums1, nums2):
        """
        :type nums1: List[int]
        :type nums2: List[int]
        :rtype: float
        """
        if len(nums1) == 0 and len(nums2) == 0:
           return None
        	
        if len(nums1) < len(nums2):
           nums1, nums2 = nums2, nums1

        if len(nums2) == 0:        
           return (nums1[(len(nums1)-1)/2] + nums1[len(nums1)/2]) / 2.0

        jmin, jmax = 0, 2*len(nums2)        
        while jmin <= jmax:
            jmid = (jmin + jmax) / 2
            imid = len(nums1) + len(nums2) - jmid
            i_left = nums1[(imid-1)/2] if imid else - sys.maxint
            i_right = nums1[imid/2] if 2*len(nums1) - imid else sys.maxint
            j_left = nums2[(jmid-1)/2] if jmid else - sys.maxint
            j_right = nums2[jmid/2] if 2*len(nums2) - jmid else sys.maxint
            if i_left > j_right:
                jmin = jmid + 1
            elif j_left > i_right:
                jmax = jmid - 1
            else: // i_left <= j_right and j_left <= i_right               
                return (max(i_left, j_left) + min(i_right, j_right)) / 2.0 

 

Idea

The length of nums1 or nums2 can be odd or even. The total length of nums1 and nums2 can also be odd or even. To calculate median, if the list has N (N is odd) elements, you should return the very middle element list[N/2]. If N is even, you should return the average of the two middle elements: (list[(N-1)/2] + list[N/2]) / 2. All these considerations will make your brain twisted. So first of all we need to come up with a more general notation system to deal with either odd or even numbers.

The notation system can be better illustrated in the following two examples:

original array: 99, 100, 101

original index: 0, 1, 2

expanded array: #, 99, #, 100, #, 101, #

expanded index: 0, 1, 2, 3, 4, 5, 6

median cut original index: 1 

median cut expanded index: 3

median: 100

 

original array: 99, 100, 101, 102

original index: 0, 1, 2, 3

expanded array: #, 99, #, 100, #, 101, #, 102, #

expanded index: 0, 1, 2, 3, 4, 5, 6, 7, 8

median cut original index: 2,3 

median cut expanded index: 4

median: (100 + 101)/2

The two arrays are expanded by adding `#` between numbers as well as at the beginning and at the end of the array. You may find that ‘#’ always have odd index in the expanded array. Numbers always have even index in the expanded array. Every position in the expanded array denotes a possible cut. A cut at position `c` (0-based, c!=0 and c!=len(arr)) in the expanded array will return:

  1. If you cut at `#`, then the cut will return the average of two numbers on its left and right. i.e., (ori[(c-1)/2] + ori)/2
  2. If you cut at a number in the expanded array, you return that number directly. Or equivalently, you can imagine you cut on that number so the number is copied and splitted both on the cut’s left side and right side. So you still return the average of two (same) numbers on its left and right, i.e., (ori[(c-1)/2] + ori)/2, since ori[(c-1)/2] and ori always point to that number in this case.

The expanded array always has odd number length. If we know where we should cut in the expanded array to find the median of the original array (say position `c`), you can always return (ori[(c-1)/2] + ori)/2. You can think that the largest value on the left of cut `c` in the original array is always ori[(c-1)/2]. The smallest value on the right of cut `c` in the original array is always ori. 

Now go back to our problem. We have nums1 and nums2. We make sure that nums1 always has no shorter length than nums2 in order to make us focus only on one case. (line 13-14.) 

From now on we suppose that nums1 and nums2 have already been expanded. So the two expanded arrays have 2*N1+1 and 2*N2+1 length respectively. (N1 >= N2). The total length of nums1 and nums2 is thus 2*N1+2*N2+2, an even number.  The whole algorithm’s idea is to find two cuts, cut_i and cut_j, in nums1 and nums2 so that the median of the two lists resides in the middle of the two cuts. Such two cuts, cut_i and cut_j, will have two following properties:

  1. There should be totally (N1+N2) elements on the left of cut_i in nums1 and on the left of cut_j in nums2. There should also be totally (N1+N2) elements on the right of cut_1 in nums1 and on the left of cut_j in nums2. So once cut_i’s or (cut_j’s) index is determined, the other is determined also.
  2. The median is between the two cuts’ values. So, the largest element in nums1 which is on the left of cut_i (nums1[(cut_i-1)/2]), should be no larger than the smallest element on the right of cut_j in nums2 (nums2[cut_j/2]). The largest element on the left of cut_j in nums2, nums2[(cut_j-1)/2], should be no larger than the smallest element on the right of cut_i in nums1 (nums1[cut_i/2]).  

Using these two properties, you can start to binary search a cut in nums2, and consequently determine the cut in nums1. If either of the two properties doesn’t hold, you need to adjust the cut in nums2 in a binary search manner. However, we shouldn’t first start binary search a cut in nums1 (the longer list) and then determine the cut in nums2 (the shorter list). Because you may find a cut in nums1 on the left of which there already are more elements than N1+N2. In that case, you can’t even determine where is the cut in nums2.

The only corner cases you need to handle is when any cut is at index 0 or end index. But such cases can be handled intuitively: if you cut at index 0, your left value should be -inf. If you cut at end index, your right value should be inf.

 

 

Reference

https://leetcode.com/discuss/41621/very-concise-iterative-solution-with-detailed-explanation

 

 

 

Leetcode 3: Longest Substring Without Repeating Characters

https://leetcode.com/problems/longest-substring-without-repeating-characters/

Longest Substring Without Repeating Characters

Total Accepted: 95017 Total Submissions: 467741 Difficulty: Medium

Given a string, find the length of the longest substring without repeating characters. For example, the longest substring without repeating characters for “abcabcbb” is “abc”, which the length is 3. For “bbbbb” the longest substring is “b”, with the length of 1.

 
Idea:
In a string, if you see the same character duplicated at two position `i` and `j`, then the max length of string without duplicated character must either end before `j` or start after `i`. 
 
Use a dictionary in which keys are characters and values correspond to the last positions seeing those characters. Traverse every character of input string `s`. Use a variable `start_pos` to denote what is the foremost starting position of a substring that can end at current position without seeing repeating characters. So the maximum length of non-repeating characters string containing the current character will always be `max(max_len, cur_pos – start_pos + 1)`. If  the previous position of seeing current character (`last_pos`) is equal to or after `start_pos`, we need to move `start_pos` one position after `last_pos` because any other string that end at further position can’t start from `last_pos` or before.
 
 

Code:

class Solution(object):
    def lengthOfLongestSubstring(self, s):
        """
        :type s: str
        :rtype: int
        """
        start_pos = 0
        character_dict = {}
        max_len = 0
        
        for i in xrange(len(s)):
            last_pos = character_dict.get(s[i], -1)
            if last_pos != -1 and last_pos >= start_pos:
                start_pos = last_pos + 1
            cur_len = i - start_pos + 1
            if cur_len > max_len:
                max_len = cur_len
                
            character_dict[s[i]] = i
        
        return max_len

 

 

 

Leetcode 90: Find all possible subsets of a subset

https://leetcode.com/problems/subsets-ii/

Subsets II

Total Accepted: 47791 Total Submissions: 170711 Difficulty: Medium

Given a collection of integers that might contain duplicates, nums, return all possible subsets.

Note:

  • Elements in a subset must be in non-descending order.
  • The solution set must not contain duplicate subsets.

For example,
If nums = [1,2,2], a solution is:

[
  [2],
  [1],
  [1,2,2],
  [2,2],
  [1,2],
  []
]

 

Idea:

Initialize the return list, `res`, with a list of an empty list, since any set will have empty set as one of its subset. You need to sort the numbers first, because you are asked to give non-descending order subsets. 

Then, starting from the first number in the sorted list, you compare the current number with the previous number. If they are different (or the current number is the first number you scan), you need to concatenate the current number with all existing subsets in `res`.  If the current number is the same as the previous number, the current number has no need to concatenate with all previous subsets. It will introduce duplication if you do so. Instead, you only need to concatenate the current number with the subsets that have been added into `res` by the previous same number. For example, if the previous number is 2, and it introduces [2], [1,2] into the `res` by concatenating with [] and [1]. Then, the current number being 2 will only need to append new subsets [2,2] and [1,2,2] by concatenating with [2] and [1,2]. However, it is not needed to concatenate with [] and [1] again.

 

Code:

class Solution(object):
    def subsetsWithDup(self, nums):
        """
        :type nums: List[int]
        :rtype: List[List[int]]
        """
        res = [[]]
        nums.sort()
        for i in xrange(len(nums)):
            if len(res) == 1 or nums[i] != nums[i-1]:
                l = len(res)
            for j in xrange(len(res)-l, len(res)):
                res.append(res[j] + [nums[i]])
        return res

 

 

 

Leetcode 218: The Skyline Problem

https://leetcode.com/problems/the-skyline-problem/

The Skyline Problem

A city’s skyline is the outer contour of the silhouette formed by all the buildings in that city when viewed from a distance. Now suppose you are given the locations and height of all the buildings as shown on a cityscape photo (Figure A), write a program to output the skyline formed by these buildings collectively (Figure B).

Buildings Skyline Contour

The geometric information of each building is represented by a triplet of integers [Li, Ri, Hi], where Li and Ri are the x coordinates of the left and right edge of the ith building, respectively, and Hi is its height. It is guaranteed that 0 ≤ Li, Ri ≤ INT_MAX, 0 < Hi ≤ INT_MAX, and Ri - Li > 0. You may assume all buildings are perfect rectangles grounded on an absolutely flat surface at height 0.

For instance, the dimensions of all buildings in Figure A are recorded as: [ [2 9 10], [3 7 15], [5 12 12], [15 20 10], [19 24 8] ] .

The output is a list of “key points” (red dots in Figure B) in the format of [ [x1,y1], [x2, y2], [x3, y3], ... ] that uniquely defines a skyline. A key point is the left endpoint of a horizontal line segment. Note that the last key point, where the rightmost building ends, is merely used to mark the termination of the skyline, and always has zero height. Also, the ground in between any two adjacent buildings should be considered part of the skyline contour.

For instance, the skyline in Figure B should be represented as:[ [2 10], [3 15], [7 12], [12 0], [15 10], [20 8], [24, 0] ].

Notes:

  • The number of buildings in any input list is guaranteed to be in the range [0, 10000].
  • The input list is already sorted in ascending order by the left x position Li.
  • The output list must be sorted by the x position.
  • There must be no consecutive horizontal lines of equal height in the output skyline. For instance, [...[2 3], [4 5], [7 5], [11 5], [12 7]...] is not acceptable; the three lines of height 5 should be merged into one in the final output as such: [...[2 3], [4 5], [12 7], ...]

Credits:
Special thanks to @stellari for adding this problem, creating these two awesome images and all test cases.

 

I failed to do it myself. My first thought was to traverse each building once and determine all skyline points that should be added. However, `O(N)` seems unable to solve this problem. Because there are many cases that you need to go back to check already added skyline points.

 

The correct idea is to use merge and sort, though there are so many edge cases. Two solutions that seem easy to understand:

https://leetcode.com/discuss/37444/my-220ms-divide-and-conquer-solution-in-python-o-nlogn

http://www.programcreek.com/2014/06/leetcode-the-skyline-problem-java/

http://www.geeksforgeeks.org/divide-and-conquer-set-7-the-skyline-problem/

 

One piece of code from them:

#
class Solution:
    # @param {integer[][]} buildings
    # @return {integer[][]}
    def getSkyline(self, buildings):
        if buildings==[]:
            return []
        if len(buildings)==1:
            return [[buildings[0][0],buildings[0][2]],[buildings[0][1],0]]
        mid=(len(buildings)-1)/2
        left=self.getSkyline(buildings[0:mid])
        right=self.getSkyline(buildings[mid:])
        return self.merge(left,right)

    def merge(self,left,right):
        i=0
        j=0
        result=[]
        h1=None
        h2=None
        while i<len(left) and j<len(right):
            if left[i][0]<right[j][0]:                            # Condition 1
                h1=left[i][1]
                new=[left[i][0],max(h1,h2)]
                if result==[] or result[-1][1]!=new[1]:
                    result.append(new)
                i+=1
            elif left[i][0]>right[j][0]:                          # Condition 2
                h2=right[j][1]
                new=[right[j][0],max(h1,h2)]
                if result==[] or result[-1][1]!=new[1]:
                    result.append(new)
                j+=1
            else:                                                 # Condition 3
                h1=left[i][1]
                h2=right[j][1]
                new=[right[j][0],max(h1,h2)]
                if result==[] or result[-1][1]!=new[1]:
                    result.append([right[j][0],max(h1,h2)])
                i+=1
                j+=1
        while i<len(left):                                        # Condition 4
            if result==[] or result[-1][1]!=left[i][1]:
                result.append(left[i][:])
            i+=1
        while j<len(right):                                       # Condition 5
            if result==[] or result[-1][1]!=right[j][1]:
                result.append(right[j][:])
            j+=1

        return result

 

The basic idea is to divide and conquer the problem. In a basis case, where we only have one building, we return the building’s left up corner and its right down corner. (line 8-9). The crux of this problem is the `merge` function. It will always merge two sets of skyline points so that a set of merged valid skyline points is returned. So the `getSkyline` function is always merging two sets of already valid skyline points.

In `merge`, no matter how you append a skyline point to `result`, you always compare two things: 1. whether `result` is empty, which means you can always safely add it as a first element to `result`. 2. otherwise, whether the last element in `result` has the same height as the current skyline point that is about to be appended. That is because we want to avoid two consecutive skyline points on the same height.

In another perspective, `merge` always scans a pair of sets of skyline points from left to right, and only add those which:

  1. if the pair of sets both have skyline points left to be scanned, then the current skyline point will be added either if it has a higher height than the previous added skyline point from the other set, or the current result is empty.
  2. if one set has no skyline point left to be scanned,  then we will add all the remaining skyline points to the result, as long as they don’t have the same height as the element added most recently.

Let’s look at four simple examples. All the four examples have input of two buildings.

leetcode218i (2)

Example 1:

In `merge`, the `left` list has point 1 and 2. The `right` list has point 3 and 4. Point 1 will be scanned first and be added according to condition 1. Next, point 2 will be scanned and be added according to condition 1.  After that, point 3 and point 4 will be added according to condition 5.

 

Example 2:

In `merge`, the `left` list has point 1 and 2. The `right` list has point 3 and 4. Point 1 will be scanned first and be added according to condition 1. Point 3 will be scanned next and be added according to condition 2. Point 2 will be scanned next but be discarded by condition 1’s sub-condition `result[-1][1]!=new[1]`. Point 4 will be added according to condition 5.

 

Example 3:

In `merge`, the `left` list has point 1 and 2. The `right` list has point 3 and 4. Point 1 will be scanned first and be added according to condition 1. Point 3 will be scanned next and be added according to condition 2. Point 4 will be scanned but be discarded by condition 2’s sub-condition `result[-1][1]!=new[1]`.  Point 2 will be scanned at last and be added according to condition 4.

 

Example 4:

In `merge`, the `left` list has point 1 and 2. The `right` list has point 3 and 4. Point 1 and point 3 will be scanned together according to condition 3 and point 3 will be added to the result. Then, point 2 and point 4 will be scanned together according to condition 3 and point 2 (or point 4) will be added to the result.

Leetcode 42: Trapping Rain Water

Trapping Rain Water

Total Accepted: 47601 Total Submissions: 157411 Difficulty: Hard

Given n non-negative integers representing an elevation map where the width of each bar is 1, compute how much water it is able to trap after raining.

For example,
Given [0,1,0,2,1,0,1,3,2,1,2,1], return 6.

https://leetcode.com/problems/trapping-rain-water/

 

Naive idea: traverse from left to right. left[i] means the height of the tallest wall on the left of element i (element i itself excluded). right[i] means the height of the tallest wall on the right of element i (element i itself excluded). Then you traverse the `height` list again, determine how high each element can hold water. This idea will take O(N) time and O(N) space.

class Solution(object):
    def trap(self, height):
        """
        :type height: List[int]
        :rtype: int
        """
        # Get left[i]
        prev = 0
        left = [0] * len(height)
        for idx, val in enumerate(height):
            left[idx] = prev
            if val > prev:
                prev = val
        
        # Get right[i]
        prev = 0
        right = [0] * len(height)
        for idx, val in enumerate(height[::-1]):
            right[len(height)- 1 - idx] = prev
            if val > prev:
                prev = val
        
        sum = 0
        for i in xrange(0, len(height)):
            diff = min(left[i], right[i]) - height[i]
            if diff > 0:
                sum += diff
        print left
        print right
                
        return sum

 

Another idea takes O(N) time but O(1) space: You have two pointers, left and right. The water that an element can hold is dependent on min height (the tallest wall to its left, the tallest wall to its right).  We use `max_left` and `max_right` to record both heights. You roll out the algorithm from the two ends of the list, every time either moving `left` to one element right or `right` to one element left until they meet somewhere in the central. If the current element has a lower height of the tallest wall to its left than the tallest wall to its right, its vertical water holding amount will be determined by `max_left` as well as the element’s own height. If the current element has a lower height of the tallest wall to its right than the tallest wall to its left, its vertical water holding amount will be determined by `max_right` as well as the element’s own height.

class Solution(object):
    def trap(self, height):
        """
        :type height: List[int]
        :rtype: int
        """
        left = 0
        max_left = 0
        right = 0
        max_right = 0
        right = len(height) -1
        
        sum = 0
        while left <= right:
            if max_left < max_right:
                if height[left] > max_left:
                    max_left = height[left]
                else:
                    sum += max_left - height[left]
                left += 1
            else:
                if height[right] > max_right:
                    max_right = height[right]
                else:
                     sum += max_right - height[right]
                right -= 1

        return sum

 

 

Leetcode 2: Add Two Numbers

Add Two Numbers

 
Total Accepted: 89439 Total Submissions: 437620 Difficulty: Medium

You are given two linked lists representing two non-negative numbers. The digits are stored in reverse order and each of their nodes contain a single digit. Add the two numbers and return it as a linked list.

Input: (2 -> 4 -> 3) + (5 -> 6 -> 4)
Output: 7 -> 0 -> 8

https://leetcode.com/problems/add-two-numbers/

Idea: traverse two linked lists to get the numeric values they represent. Add them. Then convert to string back and save each character as linked list element.
 
# Definition for singly-linked list.
# class ListNode(object):
#     def __init__(self, x):
#         self.val = x
#         self.next = None

class Solution(object):
    def addTwoNumbers(self, l1, l2):
        """
        :type l1: ListNode
        :type l2: ListNode
        :rtype: ListNode
        """
        if not l1 or not l2:
            return None
        
        digit = 0
        l1_v = 0
        while l1:
            l1_v = l1_v + l1.val * (10 ** digit)
            digit = digit + 1
            l1 = l1.next
        
        digit = 0
        l2_v = 0
        while l2:
            l2_v = l2_v + l2.val * (10 ** digit)
            digit = digit + 1
            l2 = l2.next
        
        sum_str = str(l1_v + l2_v)
        head = ListNode(int(sum_str[-1]))
        prev = head
        for s in sum_str[:-1][::-1]:
            prev.next = ListNode(int(s))
            prev = prev.next
        
        return head

 

Later I found that you don’t actually need such traverse three times, though time complexity is always O(N). Here is the correct concise code:

def addTwoNumbers(self, l1, l2):
    dummy = cur = ListNode(0)
    carry = 0
    while l1 or l2 or carry:
        if l1:
            carry += l1.val
            l1 = l1.next
        if l2:
            carry += l2.val
            l2 = l2.next
        cur.next = ListNode(carry%10)
        cur = cur.next
        carry //= 10
    return dummy.next

A dummy head is needed.