RGB alignment/registration to depth frame.


#1

Hello, I am hoping someone can help me. I am working on an AR application and I need to get the RGB image aligned properly with the depth image on the Structure Core. Is there a standard way to do that? I have seen that the RealSense SDK has a function to do that kind of alignment.

I haven’t seen anything in the Structure Core SDK. The infrared is nicely aligned by default; the image size is slightly different from the depth, but I just overlay it on the top left corner and it aligns almost perfectly.

The RGB does not appear to be corrected and aligned in the same way. Are there any code examples for doing the alignment?


#2

No answers? I would have thought this would be a pretty important basic feature to have. In any case, after further investigation I found that there is a general way of going about it, but it is still not easy for a beginner. I am also not super experienced with C++; I do most of my computer vision in Python.

I found a code base to start from:

It would be nice if there were a function in the SDK that just worked out of the box, though. You would think this would be a universally wanted, basic feature.


#3

OK, it looks like I have been able to get the depth registration code from the above link working in my own project, which doesn’t use ROS.

Now I think I understand why it isn’t included in the SDK by default. I didn’t realize depth registration was so computationally expensive. It really seems to lower the framerate, and I need low latency and a high framerate. I will have to get by without RGB alignment then.
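
For anyone curious, this is roughly the per-pixel work involved, as I understand it. This is a simplified sketch with my own placeholder names and layout, not the actual code from the link: back-project each depth pixel with the depth intrinsics, transform it into the RGB camera frame, and re-project it with the RGB intrinsics.

```cpp
// Simplified sketch of per-pixel depth registration (my own names, not the
// linked code). Doing this for every pixel of every frame is the costly part.
#include <cstddef>
#include <cstdint>
#include <vector>

struct Intrinsics { float fx, fy, cx, cy; };  // placeholder intrinsics struct

void registerDepthToColor(const uint16_t* depthMm, int dw, int dh,
                          const Intrinsics& dK, const Intrinsics& cK,
                          const float R[9], const float t[3],   // depth -> color
                          std::vector<uint16_t>& outMm, int cw, int ch)
{
    outMm.assign(static_cast<std::size_t>(cw) * ch, 0);
    for (int v = 0; v < dh; ++v) {
        for (int u = 0; u < dw; ++u) {
            uint16_t d = depthMm[v * dw + u];
            if (d == 0) continue;                       // no depth at this pixel
            float z = d * 0.001f;                       // millimeters -> meters
            float x = (u - dK.cx) / dK.fx * z;          // back-project
            float y = (v - dK.cy) / dK.fy * z;
            float xc = R[0]*x + R[1]*y + R[2]*z + t[0]; // into RGB camera frame
            float yc = R[3]*x + R[4]*y + R[5]*z + t[1];
            float zc = R[6]*x + R[7]*y + R[8]*z + t[2];
            if (zc <= 0.0f) continue;
            int uc = static_cast<int>(cK.fx * xc / zc + cK.cx + 0.5f); // re-project
            int vc = static_cast<int>(cK.fy * yc / zc + cK.cy + 0.5f);
            if (uc < 0 || uc >= cw || vc < 0 || vc >= ch) continue;
            uint16_t& dst = outMm[vc * cw + uc];
            uint16_t zMm = static_cast<uint16_t>(zc * 1000.0f + 0.5f);
            if (dst == 0 || zMm < dst) dst = zMm;       // keep the nearest point
        }
    }
}
```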


#4

Just in case anyone reading this in the future wants to do the same thing: I solved the speed issue by modifying the above code to precalculate the transforms into a three-dimensional array used as a lookup table.

For every possible (u, v, z) value I store the calculated depth in the RGB frame.

At centimeter precision it takes 2 GB of RAM, but it is fast: I get full and consistent framerates.

Higher precision would go up into the tens of gigabytes. I will try a memory-mapped file for that and see if it’s still fast enough.

If not, I will try a neural network, see if it can learn the transformation from the lookup table, and test whether that is fast enough.
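
For a sense of scale, here is roughly where those numbers come from, assuming a 640x480 depth image, about 10 m of range, and two float values stored per entry (my assumptions, not measurements):

```cpp
// Back-of-the-envelope sizing for the 3D lookup table (assumed numbers).
#include <cstddef>

constexpr std::size_t W = 640, H = 480;                     // depth image size
constexpr std::size_t Z_BINS = 1000;                        // ~10 m at 1 cm steps
constexpr std::size_t BYTES_PER_ENTRY = 2 * sizeof(float);  // rgb x + rgb y
constexpr std::size_t TOTAL = W * H * Z_BINS * BYTES_PER_ENTRY;  // ~2.3 GB
// At 1 mm steps Z_BINS grows 10x, which puts the table into the tens of gigabytes.
```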


#5

OK, I really have it solved now. No need for fancy stuff like a neural net or anything.

I was trying to cache the transform for x, y, z in a 3D array, which was huge (gigabytes). However, the Z value doesn’t change much between the depth camera’s point of view and the RGB camera’s point of view, so caching along the Z dimension was unnecessary.

Some precision is lost by ignoring the Z transformation, but it is negligible.

So all that was needed was initializing two 2D arrays of X and Y using the above code; when a new frame comes in, I use the values for rgbx and rgby to rearrange the depth pixels.

It uses a few MB and is very fast for aligning depth to RGB.
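
Here is a rough sketch of the idea. The projection math is the same as in my earlier snippet, just evaluated once at a fixed nominal depth; the names and the nominal depth value are my own placeholders:

```cpp
// Precompute, once, where each depth pixel lands in the RGB image, assuming a
// fixed nominal depth (this is where the small, negligible error comes from).
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Intrinsics { float fx, fy, cx, cy; };  // placeholder intrinsics struct

void buildLookup(const Intrinsics& dK, const Intrinsics& cK,
                 const float R[9], const float t[3],   // depth -> color
                 int dw, int dh, float zNominal,       // e.g. 1.5 m for my scene
                 std::vector<int16_t>& rgbx, std::vector<int16_t>& rgby)
{
    rgbx.assign(static_cast<std::size_t>(dw) * dh, -1);
    rgby.assign(static_cast<std::size_t>(dw) * dh, -1);
    for (int v = 0; v < dh; ++v)
        for (int u = 0; u < dw; ++u) {
            float x = (u - dK.cx) / dK.fx * zNominal;   // back-project at zNominal
            float y = (v - dK.cy) / dK.fy * zNominal;
            float xc = R[0]*x + R[1]*y + R[2]*zNominal + t[0];
            float yc = R[3]*x + R[4]*y + R[5]*zNominal + t[1];
            float zc = R[6]*x + R[7]*y + R[8]*zNominal + t[2];
            if (zc <= 0.0f) continue;
            rgbx[v * dw + u] = static_cast<int16_t>(cK.fx * xc / zc + cK.cx + 0.5f);
            rgby[v * dw + u] = static_cast<int16_t>(cK.fy * yc / zc + cK.cy + 0.5f);
        }
}

// Per frame: just scatter the depth values into an RGB-aligned buffer.
void alignFrame(const uint16_t* depth, int dw, int dh,
                const std::vector<int16_t>& rgbx, const std::vector<int16_t>& rgby,
                uint16_t* aligned, int cw, int ch)
{
    std::fill(aligned, aligned + static_cast<std::size_t>(cw) * ch, 0);
    for (int i = 0; i < dw * dh; ++i) {
        int uc = rgbx[i], vc = rgby[i];
        if (depth[i] == 0 || uc < 0 || uc >= cw || vc < 0 || vc >= ch) continue;
        aligned[vc * cw + uc] = depth[i];
    }
}
```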

I couldn’t find examples of that anywhere; I just had to think about it for a bit. It works. It’s just odd that nobody ever mentions it, since it is essential to me for using this camera for AR.


#6

@matthewpottinger are you blogging, tweeting, etc. about your work?

-jim


#7

Haha, I suppose I am. I found the lack of information and replies very frustrating, though, so I hoped to help anyone who ends up in the same situation in the future and contribute in some small way.

This forum seems pretty dead right now, at least. Figures that the only reply is your somewhat funny but unhelpful comment.

By the way, I had no reply from Occipital support on this topic either, just silence. I was a little disappointed in my purchase because of this, but hey, at least I have it solved. Nice hardware, though.


#8

Oops, let me rephrase that! Please post your twitter handle, blog, etc., if you’re sharing the details of your work somewhere. I’d love to hear more about it!

I tried finding you, but there are an amazing number of Matthew Pottingers out there and none seemed like you!

-jim


#9

Ohhh, you were serious! I could have sworn it was sarcasm. Sorry about that; it seems my sarcasm detector was a bit too sensitive!

I don’t really keep a blog or anything, or use Twitter. I haven’t even posted on GitHub.

I think I will in the future once I get AR fully working with this.

I am not sure yet whether what I am working on will be a commercial product, so maybe I won’t share the entirety of what I am trying to build, but I might at least show how to get basic AR with occlusion, etc.

There seems to be precious little info out there even on basic features like real-time occlusion in Unity with a camera like this. I suppose it is a niche area of interest.


#10

No worries!

My current project uses the Structure Sensor, so I haven’t even grabbed a Core to play with. I did work on color calibration for a laser triangulation scanner back in the day, but we designed both the depth and color cameras, so that was pretty straightforward after all was said and done.

One day maybe I’ll get a chance to play with the Core, and also dust off the RealSense cameras that have been sitting in their boxes for a while!

@alessandro.negri have you been following this thread?

-jim


#11

I just saw this, thank you Jim!

I did this a while ago too, on maybe the same scanner as Jim, or a different version of it. I am not sure which RGB frame you are using, but for the best color mapping results: rectify the RGB source with OpenCV, then calculate the extrinsic parameters between the color camera and the IR camera (again with OpenCV), and finally apply those transformations so you can look up the texture coordinate for each point of the point cloud.

This way you can use any RGB camera as a source. Just make sure not to change the distance between the color camera and the sensor, and keep the camera as close as you can to the Structure Sensor’s camera so you lose the least amount of overlap. (The nice thing about the Intel D415/D435 is that it is already calibrated fairly well out of the box. I am not sure if you can use part of the Intel software with the Structure Sensor, but with some time I am sure you could.)
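
A rough sketch of that pipeline with OpenCV (the calibration matrices here are placeholders, which you would fill from your own checkerboard calibration with cv::calibrateCamera and cv::stereoCalibrate):

```cpp
// Sketch only: undistort/rectify the RGB source, then project the point cloud
// (in the IR/depth camera frame) into the color image to get texture coordinates.
#include <opencv2/opencv.hpp>
#include <vector>

// Placeholders: fill these from your own calibration.
cv::Mat rgbK, rgbDist;   // color intrinsics + distortion (cv::calibrateCamera)
cv::Mat R, T;            // IR/depth -> color extrinsics (cv::stereoCalibrate)

cv::Mat rectifyColor(const cv::Mat& rgb)
{
    cv::Mat rectified;
    cv::undistort(rgb, rectified, rgbK, rgbDist);   // remove lens distortion
    return rectified;
}

// 3D points in meters, in the depth/IR camera frame -> pixel coordinates in the
// rectified color image, usable as texture coordinates for the point cloud.
std::vector<cv::Point2f> textureCoords(const std::vector<cv::Point3f>& cloud)
{
    std::vector<cv::Point2f> uv;
    cv::Mat rvec;
    cv::Rodrigues(R, rvec);                                   // matrix -> rotation vector
    cv::projectPoints(cloud, rvec, T, rgbK, cv::Mat(), uv);   // image already undistorted
    return uv;
}
```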

Do post code on what you have if you are able to share!


#12

Hey Matt,

Sorry for the late reply. I was on vacation and just got back into the office today.

We currently do not have a way to do this with a function in the Structure SDK (Cross-Platform), but we have added it as a feature request that will, hopefully, make it into a new release of the SDK sometime in the future.


#13

No problem. I am happy now that I have it working, thanks to the code from Chad Rockey. I would still be lost without that code example!

It was frustrating because I knew the SDK provided a transformation matrix to use, but I could not find any examples anywhere of how to use it, except that code in the ROS wrapper.


#14

I am just using the RGB frame from the color camera on the Structure Core. I am using the camera with a Surface Go tablet, which also has its own camera, but that one doesn’t have a Linux driver yet.

The Structure SDK has a function that returns a transformation matrix between the RGB and depth cameras, as well as the intrinsics. I had no idea how to make use of that information once I had it, though!

I have it working quite well now, though.

I will look further into what OpenCV can do; hopefully it can give me something I can just plug into the existing code. It would be nice to have the option of using another RGB camera in the future!