Conversion to millimeters, 3D points, OpenNI

Hi,

I have two questions:

  • how do I convert the depth values I register to millimeters? The pixel format I use is PIXEL_FORMAT_DEPTH_100_UM and the values coming from the sensor are ushort

  • how to get 3D point coordinates?

Regarding the second question, I have found another thread on the forum. According to the code posted there, the conversion looks like this:

    float *data = (GLfloat *)_pointsData.mutableBytes;
    const float *depths = [depthFrame depthAsMillimeters];

    for (int r = 0; r < _rows; r++)
    {
        for (int c = 0; c < _cols; c++)
        {
            // depth in meters (depthAsMillimeters returns millimeters)
            float depth = depths[r * _cols + c] / 1000.0f;
            float *point = data + (r * _cols + c) * 3;
            // back-project through the pinhole model (y axis pointing up)
            point[0] = depth * (c - _cx) / _fx;
            point[1] = depth * (_cy - r) / _fy;
            // z is negated and offset here, presumably for that thread's rendering setup
            point[2] = 2.0f - depth;
        }
    }

Where:

    #define QVGA_COLS 320
    #define QVGA_ROWS 240
    #define QVGA_F_X  305.73
    #define QVGA_F_Y  305.62
    #define QVGA_C_X  159.69
    #define QVGA_C_Y  119.86

    // scale the QVGA intrinsics to the resolution actually streamed
    _fx = QVGA_F_X / QVGA_COLS * cols;
    _fy = QVGA_F_Y / QVGA_ROWS * rows;
    _cx = QVGA_C_X / QVGA_COLS * cols;
    _cy = QVGA_C_Y / QVGA_ROWS * rows;

If this code works with the iOS Structure SDK, then the only thing still needed is the actual conversion from raw depth to millimeters, which I cannot find in the OpenNI SDK.

Okay, how stupid of me. If I set the pixel format to PIXEL_FORMAT_DEPTH_100_UM, then the only thing I need to do is divide each depth value by 10 to get millimeters (as floats). Now all that remains is to get the intrinsic parameters of the Structure Sensor for VGA resolution and I will obtain a full point cloud.
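
For reference, here’s a minimal sketch of that first step in plain C (the function and buffer names are my own, assuming the raw frame is a uint16_t array in 100 µm units, which is what PIXEL_FORMAT_DEPTH_100_UM implies):

    #include <stdint.h>

    /* Convert raw PIXEL_FORMAT_DEPTH_100_UM samples (uint16_t, 100 µm units)
       into millimeters as floats. rawDepth and depthMM are caller-provided
       buffers holding width * height samples; 0 stays 0 ("no reading"). */
    void depth100umToMillimeters(const uint16_t *rawDepth, float *depthMM,
                                 int width, int height)
    {
        for (int i = 0; i < width * height; i++)
            depthMM[i] = rawDepth[i] / 10.0f;   /* 100 µm -> mm */
    }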

Hi phellstein,
I’m working with something very similar.
I’m trying to work out how to determine the values of the focal lengths and the focal plane centers.

I would imagine that these values differ slightly for each camera; do you know how to determine them for an individual camera?
Or do these values work for every Structure Sensor?
I know that each stream (depth and IR) can return its horizontal and vertical field of view, but I’m a bit stuck on converting a field of view into a focal length and a focal plane center.
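
For what it’s worth, under a simple pinhole model you can get approximate focal lengths in pixels directly from those fields of view and the image size, with the focal plane center approximated by the image center. A rough sketch (assuming hFov and vFov are the full horizontal/vertical angles in radians, which is what the OpenNI stream getters report as far as I know; a real device will deviate somewhat from this, hence the per-device calibration question):

    #include <math.h>

    /* Approximate pinhole intrinsics from a stream's reported field of view.
       hFov and vFov are the full horizontal/vertical angles in radians;
       the principal point is assumed to sit at the image center. */
    void intrinsicsFromFov(float hFov, float vFov, int cols, int rows,
                           float *fx, float *fy, float *cx, float *cy)
    {
        *fx = (cols / 2.0f) / tanf(hFov / 2.0f);
        *fy = (rows / 2.0f) / tanf(vFov / 2.0f);
        *cx = cols / 2.0f;
        *cy = rows / 2.0f;
    }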

Maybe a manual calibration (measuring the distance between the camera and a subject, plus the x and y dimensions of the subject, and back-solving for the focal values) is required for high accuracy on each device. Is there a good write-up of this conversion model anywhere?
I found this article about using a matrix to transform between depth pixels and 3D point coordinates, but it is unclear to me how exactly to retrieve the z coordinate. They suggest linear interpolation between the known coordinate at the center of the image and the z coordinate at the edge of the field of view, but that would lose significant accuracy; with the wide field of view of the Structure Sensor, a linear interpolation would be especially inaccurate near the camera. My next thought was to use the Pythagorean theorem on the x, y, and depth values to find the z coordinate, but that’s a non-linear equation and can’t simply be incorporated into a linear transform. Now that I’m thinking about it, there might be a linear solution using trigonometry…
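
On the z question specifically: at least in the snippet quoted earlier in this thread, the depth sample itself is treated as the z coordinate (distance along the optical axis, not the radial distance to the point), so no interpolation or Pythagorean correction is applied at all; x and y then follow from the intrinsics. A small sketch of that back-projection, reusing the same conventions as that code:

    /* Back-project one depth sample (already converted to meters) into
       camera-space coordinates. The depth value is taken directly as the
       z coordinate, exactly as the earlier snippet does; fx, fy, cx, cy
       are the pinhole intrinsics in pixels. */
    void depthToPoint(float depth, int c, int r,
                      float fx, float fy, float cx, float cy,
                      float point[3])
    {
        point[0] = depth * (c - cx) / fx;
        point[1] = depth * (cy - r) / fy;   /* y axis pointing up, as above */
        point[2] = depth;
    }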

Anyways, have you come across any particularly good write-ups on the subject?

Thanks,
Andrew