Leetcode 274: H-index

https://leetcode.com/problems/h-index/

H-Index

Difficulty: Medium

Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher’s h-index.

According to the definition of h-index on Wikipedia: “A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each.”

For example, given citations = [3, 0, 6, 1, 5], which means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, his h-index is 3.

Note: If there are several possible values for h, the maximum one is taken as the h-index.

 

Code:

class Solution(object):
    def hIndex(self, citations):
        """
        :type citations: List[int]
        :rtype: int
        """
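        # cnt[i] = number of papers with exactly i citations, for i < len(citations);
        # cnt[-1] = number of papers with len(citations) or more citations.
        cnt = [0] * (len(citations)+1)
        for c in citations:
            if c >= len(citations):
                cnt[-1] += 1
            else:
                cnt[c] += 1

        # Scan from the largest candidate down; total = papers with >= i citations.
        total = 0
        for i in range(len(cnt)-1, -1, -1):
            total += cnt[i]
            if total >= i:
                return i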

 

Idea:

This algorithm needs O(N) time and O(N) space, where N = len(citations). A researcher’s h-index can’t exceed the number of his or her publications, i.e., len(citations), so we create an array `cnt` of length len(citations)+1. For i < len(citations), cnt[i] is the number of publications with exactly i citations. The last slot, cnt[-1], is the exception: it counts every publication with citations >= `len(citations)`, since any count that high can be treated as len(citations) when computing the h-index.

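Then we traverse from cnt[-1] down to cnt[0], accumulating the counts in `total`. At index `i`, `total` is the number of publications with at least `i` citations. If cnt[-1] already equals len(citations), every publication has at least len(citations) citations, so the h-index is len(citations). Otherwise, the first `i` for which `total` >= `i` means we have found at least `i` publications with at least `i` citations. Because we scan from the largest candidate downward, this `i` is the maximum valid value, so we return it immediately.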
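To make the traversal concrete, here is a small tracing sketch run on the example citations = [3, 0, 6, 1, 5]. The function name `trace_h_index` and its return shape are made up for illustration; it only records the `cnt` array and the running total at each index.

```python
def trace_h_index(citations):
    """Return (cnt, steps, h), where steps is a list of (i, total) pairs.

    Illustrative helper only; the name and return shape are assumptions.
    """
    n = len(citations)
    cnt = [0] * (n + 1)
    for c in citations:
        # Citation counts of n or more all land in the last bucket.
        cnt[min(c, n)] += 1

    steps = []
    total = 0
    for i in range(n, -1, -1):
        total += cnt[i]          # papers with at least i citations
        steps.append((i, total))
        if total >= i:           # always true at i == 0, so we always return
            return cnt, steps, i


cnt, steps, h = trace_h_index([3, 0, 6, 1, 5])
print(cnt)    # [1, 1, 0, 1, 0, 2]
print(steps)  # [(5, 2), (4, 2), (3, 3)]
print(h)      # 3
```

The two papers with 6 and 5 citations both fall into cnt[-1]. The running total first reaches the index at i = 3, which matches the expected h-index of 3.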

Another idea is to sort `citations` in place first, in O(n log n) time; then you don’t need the extra O(n) counting array. See H-Index II.
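A minimal sketch of the sorting idea, assuming it is acceptable to reorder the input list. The function name `h_index_sorted` is hypothetical, and this is one possible way to write it rather than the referenced solution. Note that Python's built-in sort may still use some temporary memory internally; the point is only that no separate counting array is needed.

```python
def h_index_sorted(citations):
    """Compute the h-index by sorting in descending order (mutates the input)."""
    citations.sort(reverse=True)
    h = 0
    for i, c in enumerate(citations):
        # After sorting, the first i + 1 papers all have at least c citations.
        if c >= i + 1:
            h = i + 1
        else:
            break
    return h


print(h_index_sorted([3, 0, 6, 1, 5]))  # 3
```

For the example, the sorted list is [6, 5, 3, 1, 0]; the third paper has 3 citations (3 >= 3) but the fourth has only 1 (1 < 4), so the loop stops with h = 3.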

 

Reference:

https://leetcode.com/discuss/56987/python-o-n-lgn-time-with-sort-o-n-time-with-o-n-space

 

 
