Uniform distribution on sphere

 

http://corysimon.github.io/articles/uniformdistn-on-sphere/

 

Mathemathinking

Math, Data, Machine learning, Chemical engineering

Generating uniformly distributed numbers on a sphere

Cory Simon bio photoBY CORY SIMON

So, we want to generate uniformly distributed random numbers on a unit sphere. This came up today in writing a code for molecular simulations. Spherical coordinates give us a nice way to ensure that a point (x,y,z)(x,y,z) is on the sphere for any (θ,ϕ)(θ,ϕ):

image
x=rsin(ϕ)cos(θ)x=rsin⁡(ϕ)cos⁡(θ)
y=rsin(ϕ)sin(θ)y=rsin⁡(ϕ)sin⁡(θ)
z=rcos(ϕ).z=rcos⁡(ϕ).

In spherical coordinates, rr is the radius, θ[0,2π]θ∈[0,2π] is the azimuthal angle, and ϕ[0,π]ϕ∈[0,π] is the polar angle.

A tempting way to generate uniformly distributed numbers in a sphere is to generate a uniform distribution of θθ and ϕϕ, then apply the above transformation to yield points in Cartesian space (x,y,z)(x,y,z), as with the following C++ code.

#include<random>
#include<cmath>
#include<chrono>

int main(int argc, char *argv[]) {
// Set up random number generators
unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();
std::mt19937 generator (seed);
std::uniform_real_distribution<double> uniform01(0.0, 1.0);

// generate N random numbers
int N = 1000;

// the incorrect way
FILE * incorrect;
incorrect = fopen("incorrect.csv", "w");
fprintf(incorrect, "Theta,Phi,x,y,z\n");
for (int i = 0; i < N; i++) {
// incorrect way
double theta = 2 * M_PI * uniform01(generator);
double phi = M_PI * uniform01(generator);
double x = sin(phi) * cos(theta);
double y = sin(phi) * sin(theta);
double z = cos(phi);
fprintf(incorrect, "%f,%f,%f,%f,%f\n", theta, phi, x, y, z);
}
fclose(incorrect);
}

The distribution in the θθ-ϕϕ plane in this strategy is uniform:

image

After mapping these points in the θθ-ϕϕ plane to the sphere using the relationship between spherical and Cartesian coordinates above, this incorrect strategy yields the following distribution of points on the sphere. We see that the points are clustered around the poles (ϕ=0ϕ=0 and ϕ=πϕ=π) and sparse around the equator (ϕ=π/2ϕ=π/2).

image

The reason for this is that the area dAdA of a differential surface element in spherical coordinates is dA(dθ,dϕ)=r2sin(ϕ)dϕdθdA(dθ,dϕ)=r2sin⁡(ϕ)dϕdθ. This formula for the area of a differential surface element comes from treating it as a square of dimension rdϕrdϕ by rsin(ϕ)dθrsin⁡(ϕ)dθ. These dimensions of the differential surface element come from simple trigonometry.

image

So, close to the poles of the sphere (ϕ=0ϕ=0 and ϕ=πϕ=π), the differential surface area element determined by dθdθ and dϕdϕ gets smaller since sin(ϕ)0sin⁡(ϕ)→0. Thus, we should include less points near ϕ=0ϕ=0 and ϕ=πϕ=π and more points near ϕ=π/2ϕ=π/2 to achieve a uniform distribution on the sphere.

Our goal is to find and then draw samples from the probability distribution f(θ,ϕ)f(θ,ϕ) that maps from the θθ-ϕϕ plane to a uniform distribution on the sphere.

Let vv be a point on the unit sphere SS. We want the probability density f(v)f(v) to be constant for a uniform distribution. Thus f(v)=14πf(v)=14π since Sf(v)dA=1∫∫Sf(v)dA=1 and SdA=4π∫∫SdA=4π. We want to represent points vv using the parameterization in θθ and ϕϕ and find the corresponding probability density function f(θ,ϕ)f(θ,ϕ) that maps to a uniform distribution on the sphere. We can obtain a uniform distribution by enforcing:

f(v)dA=14πdA=f(θ,ϕ)dθdϕ,f(v)dA=14πdA=f(θ,ϕ)dθdϕ,

since f(v)dAf(v)dA is the probability of finding a point in an area dAdA about vv on the sphere. Because dA=sin(ϕ)dϕdθdA=sin⁡(ϕ)dϕdθ, it follows that f(θ,ϕ)=14πsin(ϕ)f(θ,ϕ)=14πsin⁡(ϕ).

Marginalizing the joint distribution to get the p.d.f of θθ and ϕϕ separately:

f(θ)=π0f(θ,ϕ)dϕ=12πf(θ)=∫0πf(θ,ϕ)dϕ=12π
f(ϕ)=2π0f(θ,ϕ)dθ=sin(ϕ)2.f(ϕ)=∫02πf(θ,ϕ)dθ=sin⁡(ϕ)2.

We see that θθ is a uniformly distributed variable and f(ϕ)f(ϕ) scales with sin(ϕ)sin⁡(ϕ); we want more points around the equator, ϕ=π/2ϕ=π/2, which is where sin(ϕ)sin⁡(ϕ) takes its maximum.

Now, how can we sample numbers ϕϕ that follow the distribution f(ϕ)f(ϕ)? We’d like to use the readily available uniform random number generator in [0,1][0,1] as before. Inverse Transform Sampling is a method that allows us to sample a general probability distribution using a uniform random number. For this, we need the cumulative distribution function of ϕϕ:

F(ϕ)=ϕ0f(ϕ^)dϕ^=12(1cos(ϕ)).F(ϕ)=∫0ϕf(ϕ^)dϕ^=12(1−cos⁡(ϕ)).

Keep in mind that F(ϕ)F(ϕ) is a monotonically increasing function from [0,π][0,1][0,π]→[0,1] since it is a cumulative distribution function. Thus, it has an inverse function F1F−1.

Let UU be the uniform random number in [0,1][0,1] that we do know how to generate. To see how inverse transform sampling works, note that

Pr(UF(ϕ))=F(ϕ).Pr(U≤F(ϕ))=F(ϕ).

This is a property of the uniform random variable U[0,1]U[0,1], since for any number x[0,1]x∈[0,1], Pr(Ux)=xPr(U≤x)=x. As FF is invertible and monotone, we can preserve this inequality by writing:

Pr(F1(U)ϕ)=F(ϕ).Pr(F−1(U)≤ϕ)=F(ϕ).

Aha! This shows that F(ϕ)F(ϕ) is the cumulative distribution function for the random variable F1(U)F−1(U)! Thus, F1(U)F−1(U) follows the same distribution as ϕϕ. The algorithm for sampling the distribution f(ϕ)f(ϕ) using inverse transform sampling is then:

  • Generate a uniform random number uu from the distribution U[0,1]U[0,1].

  • Compute ϕϕ such that F(ϕ)=uF(ϕ)=u, i.e. F1(u)F−1(u).

  • Take this ϕϕ as a random number drawn from the distribution f(ϕ)f(ϕ).

In our case, F1(u)=arccos(12u)F−1(u)=arccos⁡(1−2u).

The algorithm below in C++ shows how to generate uniformly distributed numbers on the sphere using this method:

#include<random>
#include<cmath>
#include<chrono>

int main(int argc, char *argv[]) {
// Set up random number generators
unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();
std::mt19937 generator (seed);
std::uniform_real_distribution<double> uniform01(0.0, 1.0);

// generate N random numbers
int N = 1000;

// the correct way
FILE * correct;
correct = fopen("correct.csv", "w");
fprintf(correct, "Theta,Phi,x,y,z\n");
for (int i = 0; i < N; i++) {
// incorrect way
double theta = 2 * M_PI * uniform01(generator);
double phi = acos(1 - 2 * uniform01(generator));
double x = sin(phi) * cos(theta);
double y = sin(phi) * sin(theta);
double z = cos(phi);
fprintf(correct, "%f,%f,%f,%f,%f\n", theta, phi, x, y, z);
}
fclose(correct);
}

We then get the following distribution of points in the (θ,ϕ)(θ,ϕ) plane. There are more points around ϕ=π/2ϕ=π/2 (the equator) than around the poles (ϕ=0,πϕ=0,π), as we had hoped for.

image

And, finally, a uniform distribution of points on the sphere.

image

Alternative method 1

An alternative method to generate uniformly disributed points on a unit sphere is to generate three standard normally distributed numbers XX, YY, and ZZ to form a vector V=[X,Y,Z]V=[X,Y,Z]. Intuitively, this vector will have a uniformly random orientation in space, but will not lie on the sphere. If we normalize the vector V:=V/VV:=V/‖V‖, it will then lie on the sphere.

The following Julia code implements this. We have to be careful in the case that the vector has a norm close to zero, in which we must worry about floating point precision by dividing by a very small number. This is the reason for the while loop.

n = 100

f_normal = open("normal.csv", "w")
write(f_normal, "x,y,z\n")

for i = 1:n
v = [0, 0, 0] # initialize so we go into the while loop

while norm(v) < .0001
x = randn() # random standard normal
y = randn()
z = randn()
v = [x, y, z]
end

v = v / norm(v) # normalize to unit norm

@printf(f_normal, "%f,%f,%f\n", v[1], v[2], v[3])
end

To prove this, note that the standard normal distribution is:

f(x)=12π−−√e12x2.f(x)=12πe−12x2.

As XX, YY, and ZZ each follow the standard normal distribution and are generated independently:

f(x,y,z)=f(v)=(12π−−√e12x2)(12π−−√e12y2)(12π−−√e12z2).f(x,y,z)=f(v)=(12πe−12x2)(12πe−12y2)(12πe−12z2).

With some algebra:

f(x,y,z)=f(v)=1(2π)32e12(x2+y2+z2)=1(2π)32e12v2.f(x,y,z)=f(v)=1(2π)32e−12(x2+y2+z2)=1(2π)32e−12‖v‖2.

This shows that the probability distribution of vv only depends on its magnitude and not any direction θθ and ϕϕ. The vectors vv are thus indeed pointing in uniformly random directions. By finding where the ray determined by this vector vv intersects the sphere, we have a sample from a uniform distribution on the sphere.

Alternative method 2

Credit to FX Coudert for pointing this out in a comment, another method is to generate uniformly distributed numbers in the cube [1,1]3[−1,1]3 and ignore any points that are further than a unit distance rr from the origin. This will ensure a uniform distribution in the region r1r≤1. Next, normalize each random vector to have unit norm so that the vector retains its direction but is extended to the sphere of unit radius. As each vector within the region r1r≤1 has a random direction, these points will be uniformly distributed on a sphere of radius 1.

Alternative method 1 can be used to efficiently generate uniformly distributed numbers on a hypersphere– i.e. in higher dimensions. On the other hand, as the number of dimensions grows, the ratio of the volume of the edges of a cube to the volume of the ball inside of it grows (see Wikipedia article). Hence, a larger and larger fraction of the uniformly generated numbers in the cube will be rejected using Alternative method 2, and so this algorithm will be inefficient.

posted on 2018-04-23 10:20  fanbird2008  阅读(476)  评论(0编辑  收藏  举报

导航