My company is producing an app using the Structure Sensor which needs to display highly precise and accurate scans of details on a surface. I am aware of the limits on the accuracy of a single depth frame taken by the scanner (https://s3.amazonaws.com/io.structure.assets/structure_sensor_precision.pdf). However, by taking many frames, aligning them with each other, discarding outlier points and doing some local averaging on the remaining points, I have been able to remove a lot of the noise and systematic error from the scans, giving a fairly smooth but accurate surface.
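As a rough illustration of the prune-then-average idea (a minimal sketch, not the exact pipeline used in the app), the step for one pixel across already-aligned frames might look like the following. The function name `fuse_depth_samples`, the threshold `k`, and the convention that invalid samples are NaN or non-positive are all assumptions made for this example.

```python
import math
from statistics import fmean, median

def fuse_depth_samples(samples_mm, k=3.0):
    """Fuse aligned depth samples (in mm) for a single pixel.

    Samples further than k robust standard deviations from the median
    are treated as outliers; the rest are averaged.
    Returns None when no valid sample exists.
    """
    # Assumption: invalid depth is encoded as NaN or a non-positive value.
    valid = [s for s in samples_mm if math.isfinite(s) and s > 0]
    if not valid:
        return None

    med = median(valid)
    # Median absolute deviation, scaled to approximate a standard
    # deviation for normally distributed noise.
    scale = 1.4826 * median(abs(s - med) for s in valid)
    if scale == 0:
        # More than half the samples are identical; use that value.
        return med

    kept = [s for s in valid if abs(s - med) <= k * scale]
    return fmean(kept)

# Five consistent readings and one gross outlier (hypothetical values).
print(fuse_depth_samples([500.0, 500.2, 499.8, 500.1, 499.9, 520.0]))
```

A median-based threshold is used here because a single gross outlier would otherwise inflate a mean-and-standard-deviation test and let itself through.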
However, I am now at the stage where the remaining noise is spatially fairly wide, and I cannot remove it easily without risking removing real details in the surface being scanned. For example, when scanning a perfectly flat surface, I get fairly wide bumps and dips with heights of less than 1mm. If I want to be able to capture a 0.5mm dent in the surface, that dent risks disappearing if I try to remove the wide noise with filtering algorithms alone.
That said, do you have any advice on removing as much noise as possible from the combined depth frames before I do the pruning and averaging? For example, I remember reading somewhere that hardware-registered depth (which I am using) is less accurate than the normal depth frames (those not aligned to the colour camera). If the normal depth frames really are more accurate, would it be worth using them and aligning them to the colour camera myself?
I’ve tried simply taking more frames and combining them, but I get sharply diminishing returns in terms of removing the wider noise when I do that, so another solution would be preferred if one exists.
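One possible reading of these diminishing returns, offered as an assumption rather than a diagnosis: if part of the error is independent from frame to frame and part is the same in every frame, averaging N frames only shrinks the independent part (by a factor of sqrt(N)). The sketch below shows that model; the noise figures are made-up placeholders, not measurements from these scans.

```python
import math

def residual_noise_mm(n_frames, sigma_random_mm, sigma_systematic_mm):
    """Expected noise after averaging n_frames aligned frames.

    Model assumption: random noise is independent per frame and averages
    down as 1/sqrt(n); systematic error is identical in every frame and
    is not reduced by averaging at all.
    """
    return math.sqrt(sigma_random_mm ** 2 / n_frames + sigma_systematic_mm ** 2)

# Hypothetical values: 1.0 mm random noise, 0.3 mm systematic error.
for n in (1, 5, 25, 100):
    print(n, round(residual_noise_mm(n, 1.0, 0.3), 3))
```

Under this model the result can never drop below the systematic term, however many frames are combined, which would match the behaviour described above.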
Some cropped screenshots to show what I mean:
- An example of what I’m scanning
- The resulting scan (coloured section uses all 25 frames for better accuracy)
- What the depths and cross sections look like when looking at the flat table between the objects
All numbers are deviations in millimetres from a flat plane defined by three points on the table in the scan, which will themselves have errors/noise. The number in each grid square represents the most extreme depth within that square. As you can see, the depth along the flat surface’s cross section ranges from +0.5mm to -0.5mm, and can get more extreme in other parts of the scan.
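For clarity, the deviation figures can be computed along these lines. This is a minimal sketch with hypothetical helper names (`plane_from_points`, `deviation_mm`, `most_extreme`), assuming points are (x, y, z) tuples in millimetres; it is not the app's actual code.

```python
import math

def plane_from_points(p0, p1, p2):
    """Return (unit_normal, d) such that unit_normal . x + d = 0 on the plane.

    The direction of the normal, and so the sign of every deviation,
    depends on the order in which the three points are given.
    """
    u = [p1[i] - p0[i] for i in range(3)]
    v = [p2[i] - p0[i] for i in range(3)]
    # Cross product u x v is perpendicular to the plane.
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = math.sqrt(sum(c * c for c in n))
    if length == 0:
        raise ValueError("the three points are collinear")
    n = [c / length for c in n]
    d = -sum(n[i] * p0[i] for i in range(3))
    return n, d

def deviation_mm(point, plane):
    """Signed distance of a point from the plane, in mm."""
    n, d = plane
    return sum(n[i] * point[i] for i in range(3)) + d

def most_extreme(deviations):
    """Deviation with the largest magnitude, sign preserved."""
    return max(deviations, key=abs)

# Hypothetical table plane and three scanned points in one grid square.
plane = plane_from_points((0, 0, 0), (1, 0, 0), (0, 1, 0))
square = [deviation_mm(p, plane) for p in [(5, 5, 0.4), (2, 3, -0.5), (1, 1, 0.2)]]
print(most_extreme(square))
```

Because the plane is pinned to exactly three noisy points, any error in those points tilts or offsets the reference, and that shows up as a shared bias in every deviation.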