【LeetCode OJ】Max Points on a Line

Posted on 2014-01-10 10:53 by 卢泽尔

Problem:

Given n points on a 2D plane, find the maximum number of points that lie on the same straight line.

Suppose that the structure Point is already defined as follows:

 

/**
 * Definition for a point.
 * struct Point {
 *     int x;
 *     int y;
 *     Point() : x(0), y(0) {}
 *     Point(int a, int b) : x(a), y(b) {}
 * };
 */

 


Solution:

The first idea that came to my mind was brute force. After thinking hard and doing some research, the best I have is still an O(n^2) solution. The brute-force solution is very straightforward: given n points, there are at most n(n-1)/2 distinct lines through pairs of them. To solve the problem, we just check every such line and count the number of points on it. Done naively, this takes O(n^3) time. However, if we go through every pair of points and use a hash table to store the lines, we can do it in O(n^2) time.
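For reference, the naive O(n^3) check could look roughly like the sketch below. It is only an illustration: `Pt` is a local stand-in for LeetCode's `Point`, and the names `collinear` and `maxPointsBrute` are made up here. Collinearity is tested with an integer cross product, so no division is needed; the sketch assumes coordinates stay below about 10^9 in absolute value so that the products fit in `long long`.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for LeetCode's Point.
struct Pt { int x; int y; };

// True if c lies on the line through a and b (a and b must differ).
// Uses the cross product of (b - a) and (c - a), widened to long long.
bool collinear(const Pt &a, const Pt &b, const Pt &c) {
    long long cross = ((long long)b.x - a.x) * ((long long)c.y - a.y)
                    - ((long long)b.y - a.y) * ((long long)c.x - a.x);
    return cross == 0;
}

// O(n^3): for every pair of distinct points, count the points on their line.
int maxPointsBrute(const std::vector<Pt> &pts) {
    int n = (int)pts.size();
    if (n <= 2) return n;
    int best = 0;
    bool found_line = false;
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            // Two identical points do not determine a unique line.
            if (pts[i].x == pts[j].x && pts[i].y == pts[j].y) continue;
            found_line = true;
            int count = 0;
            for (int k = 0; k < n; k++)
                if (collinear(pts[i], pts[j], pts[k])) count++;
            best = std::max(best, count);
        }
    }
    // If all points coincide, every one of them lies on a common line.
    return found_line ? best : n;
}
```

Duplicates of a point on the line are counted automatically here, because the cross product is zero for them as well.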

There is also another way to look at the problem: convert the 2D plane to its dual space. In short, map the (points, lines) space to a (lines, points) space. The n points become n lines in the dual space, and each intersection point of dual lines corresponds to a line through the matching points in the original space. So instead of solving "given n points, find the line passing through the most points", we are asked to solve "given n lines, find the intersection point that the most lines pass through". The bad news is that the time complexity of this new problem seems to be O(n^2) as well...
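To make the duality idea concrete, here is a small sketch using one common choice of transform, which is an assumption on my part since the post does not fix one: the point (a, b) maps to the line y = a*x - b, and the line y = m*x + c maps to the point (m, -c). The names `DualLine`, `dualOfPoint` and `passesThrough` are hypothetical. Under this mapping, points lying on y = m*x + c become dual lines that all pass through (m, -c).

```cpp
// A non-vertical line y = slope * x + intercept.
struct DualLine { long long slope; long long intercept; };

// Dual of the point (a, b) under the assumed transform: the line y = a*x - b.
DualLine dualOfPoint(long long a, long long b) {
    return {a, -b};
}

// True if the line l passes through the point (x, y).
bool passesThrough(const DualLine &l, long long x, long long y) {
    return y == l.slope * x + l.intercept;
}
```

For example, (0,1), (1,3) and (2,5) all lie on y = 2x + 1, and their three dual lines all pass through the dual point (2, -1). Note that this particular transform cannot represent vertical lines.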

I also tried to attack the problem from an algorithmic angle, using Dynamic Programming. Let M[n] be the solution to the problem with points P1, P2, ..., Pn. Then we have the following recurrence:

  M[n] = n, if n <= 2

  M[n] = max( BestSolutionPassingPoint(Pn, {P1...Pn-1}), M[n-1] ), otherwise

However, this DP is really just a simple divide-and-conquer idea. Given points P1, ..., Pn, there are only two mutually exclusive cases: (1) the solution line passes through the point Pn; or (2) the solution line does not pass through the point Pn. In case (1), we only need to find the maximum number of points on a line that also passes through Pn (the BestSolutionPassingPoint function). With hashing this takes O(n) expected time for each point, so O(n^2) overall, and it is much easier than the brute-force method (I will explain why below). In case (2), removing Pn does not change anything, which means the solution for P1,...,Pn-1 is equal to that for P1,...,Pn.

Finally, I decided to use the DP approach for two reasons:

1. The line representation and hashing lines. The key of the algorithm is to represent the line passing through two points and hash that line. Let (x1,y1) and (x2,y2) be two points; then we can represent the line through them by kx+y=d. The pair (k,d) is unique to a line, so it can be used as the hash key. However, this representation has a drawback: it cannot cover the case x1==x2. (Another representation is ax+by=c, but personally I do not like it since it requires gcd and more values to hash.) In the DP approach, on the other hand, all lines we care about pass through the same point. If we take this point as the origin, every line can be written as y=kx, or as x=c for the vertical line (the one parallel to the y-axis). So we can use the value of k alone as the hash key. Note that the line x=c has to be handled separately.

2. Duplicate points. We have to consider the case where the point set contains identical points. With the brute-force approach, we need to scan the points and count the duplicates first. With the DP approach, duplicates become much easier to handle. When calculating BestSolutionPassingPoint(P, S), suppose S contains a point Q identical to P. All we have to do is add 1 to every count, since every line that passes through P also passes through Q.
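For comparison only, here is a sketch of the gcd-style key mentioned in point 1, adapted to the case where all lines pass through one fixed point. This is not the method used in this post; `gcdll` and `slopeKey` are hypothetical names. The direction from the fixed point to another point is reduced by the gcd and sign-normalised, which gives an exact key and covers the vertical line without a special case, at the cost of hashing two values instead of one.

```cpp
#include <cstdlib>
#include <utility>

// Euclid's algorithm on non-negative values.
long long gcdll(long long a, long long b) {
    while (b != 0) {
        long long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Reduced direction (dx, dy) from (x0, y0) to (x, y).
// Precondition: the two points differ, so the gcd is non-zero.
// The sign is normalised so that dx > 0, or dx == 0 and dy > 0.
std::pair<long long, long long> slopeKey(int x0, int y0, int x, int y) {
    long long dx = (long long)x - x0;
    long long dy = (long long)y - y0;
    long long g = gcdll(std::llabs(dx), std::llabs(dy));
    dx /= g;
    dy /= g;
    if (dx < 0 || (dx == 0 && dy < 0)) {
        dx = -dx;
        dy = -dy;
    }
    return {dx, dy};
}
```

A `std::pair` works directly as a key of `std::map`; using it with `std::unordered_map` would additionally need a user-supplied hash function.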

The C++ code looks like this:

/**
 * Definition for a point.
 * struct Point {
 *     int x;
 *     int y;
 *     Point() : x(0), y(0) {}
 *     Point(int a, int b) : x(a), y(b) {}
 * };
 */
#include <unordered_map>
#include <vector>
using std::vector;

class Solution {
public:
    static const int EPSILON = 1000000;  // scale factor: slopes are compared to about 6 decimal places
    // Hash table: key is the scaled slope of a line through points[last], value is the number of points on it
    std::unordered_map<long long, int> lines;
    int find_sub_max_points(vector<Point> &points, int last) {
        lines.clear();  // Clear the hash table
        int num_same_x = 0;  // number of other points on the vertical line through points[last]
        int num_same_point = 1;  // points[last] itself plus its duplicates
        const int x0 = points[last].x;
        const int y0 = points[last].y;
        int x, y;
        long long tmp;
        int res = 0;
        for (int i = last-1; i >= 0; i--) {
            x = points[i].x;
            y = points[i].y;
            if (x == x0) {
                if (y == y0) num_same_point++;
                else if (++num_same_x > res)
                    res = num_same_x;
            }
            else {
                // Widen to long long before scaling so the product cannot overflow int
                tmp = ((long long)y - y0) * EPSILON / ((long long)x - x0);
                if (++lines[tmp] > res) res = lines[tmp];
                /* the line above is equal to following lines
                std::unordered_map<long long,int>::const_iterator got = lines.find (tmp);
                if (got == lines.end())
                    lines[tmp] = 1;
                else
                    lines[tmp]++;
                if (lines[tmp]>res) res = lines[tmp];
                */
            }
        }
        return res+num_same_point; // points[last] and its duplicates are counted here
    }
    int maxPoints(vector<Point> &points) {
        int n = points.size();
        if (n <= 2) return n;
        int res = 2;
        int tmp = 0;
        for(int i=2; i<n; i++) {
            tmp = find_sub_max_points(points, i);
            if (tmp > res) res = tmp;
        }
        return res;
    }
};

 


C++ basics:

In C++, we can use <unordered_map> as a hash table. There is a very quick way to do the following update:

if the key is not contained in the hash table, then set the key's value to 1;

otherwise, add 1 to the key's value.

This update can be done in one line (see code), since HashTable[key] is created with value 0 if the key does not exist.
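A minimal, standalone sketch of the same one-line update is shown below. The function `countWords` is a made-up example and not part of the solution; it only demonstrates that `operator[]` inserts a missing key with a value-initialised `int` (that is, 0) before the increment runs.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Count how many times each word occurs.
std::unordered_map<std::string, int> countWords(const std::vector<std::string> &words) {
    std::unordered_map<std::string, int> counts;
    for (const std::string &w : words)
        ++counts[w];  // a missing key is inserted with value 0, then incremented to 1
    return counts;
}
```

One thing to keep in mind: because `operator[]` inserts missing keys, using it for a pure lookup also grows the table. When you only want to test whether a key is present, `find` or `count` is the safer choice.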