All sorts of interesting topics in this set of student posts, including some inside stories from the creator of ALVINN!
Emphatic Camera Calibration With OpenCV
While trying to undistort his camera images, Chris walked into a store and asked to take a photo of their floor. Then things got really weird.
“I wrote a program that iterated through all possible grid sizes and looked at all images. Now I was finding grids. Ah ha! Turning to the documentation to figure out what exactly was going on, I noticed the function had a parameter, flags, which could be set to enable certain grid finding techniques. I set one of the flags and the grids I could detect changed quite a bit. Now I added to my program another inner loop to iterate through all the detection modes.”
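For readers who want to try the same brute-force search, here is a minimal sketch of the nested loops Chris describes. It is not his program. The function name `search_grids`, the `detect` callable, and the size limits are assumptions made for illustration. The detector is passed in so the loop itself does not depend on any library; with OpenCV installed, `detect` could wrap `cv2.findChessboardCorners`, which returns a found flag together with the corners.

```python
from itertools import product


def search_grids(images, detect, max_cols=10, max_rows=10, flag_options=(0,)):
    """Try every grid size and every flag value on every image.

    `detect(image, (cols, rows), flags)` is a hypothetical callable that
    returns (found, corners), mirroring cv2.findChessboardCorners.
    Returns a list of (image_index, (cols, rows), flags) for each hit.
    """
    hits = []
    # OpenCV requires more than 2 inner corners per side, so start at 3.
    sizes = product(range(3, max_cols + 1), range(3, max_rows + 1))
    for (cols, rows), flags in product(sizes, flag_options):
        for index, image in enumerate(images):
            found, _corners = detect(image, (cols, rows), flags)
            if found:
                hits.append((index, (cols, rows), flags))
    return hits


# With OpenCV available, a detector could look like this (not run here):
#   import cv2
#   detect = lambda img, size, flags: cv2.findChessboardCorners(img, size, flags=flags)
#   flag_options = (0, cv2.CALIB_CB_ADAPTIVE_THRESH, cv2.CALIB_CB_NORMALIZE_IMAGE)
```

The result tells you which grid size and detection mode worked for which image, which is usually enough to pick one setting for the real calibration run.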
Dean Pomerleau, the creator of ALVINN, responded to Param Aggarwal with some cool stories about how ALVINN took advantage of confusion in the network to estimate how confident it was about its own steering ability:
“Using the OARE technique and a related one called Input Reconstruction Reliability Estimation (IRRE), ALVINN was able to localize itself (e.g. ‘I’ve reached the fork in the road!’), tell the human safety driver (me) when it needed help, arbitrate between networks trained on different road types, and even tell when there was crap on the windshield in front of the camera obstructing its view of the road.”
Uki riffs here on all of the various projects he could be working on, how he chooses to spend his limited time, and where that intersects with career development.
“The next part of the career development is keeping up with the computer science basics. Honestly, it does not matter how much programming you do on daily basis, you will not pass the ‘whiteboard hazing’ without any preparation. I lost countless of interviews with fine companies like Amazon, to what I thought was a ‘power trip’ of some engineer without any social skills in a cookie factory — for years I was saying, ‘Why do I need that? I can make good money on my own’. Only later, I have read books and articles on interviewing and realized that the ‘whiteboard’ is simply a thing they do and that people prepare for it for months.”
“A voxel is a volume unit in space, similar to pixel in 2D images. I first constrained our space so x-dimension (front), y-dimension (L-R) varied between -30 and 30, and vertical dimension varied between -1.5 and 1 m. I next constructed voxels of width and length 0.1 m and height 0.3125 m. I then computed maximum height in each voxel and used this value as the height of the point cloud in that voxel. This gave us a height map of 600X600X5 features. We specifically chose 5 height maps because Udacity’s data uses vlp-16 lidar and having more fine discretization can result in height slices without any points.”
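The height-map idea in the quote can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the author's code: the function name `height_map` and the sparse dictionary output are made up for the example, and the vertical range is simply split into 5 equal slices, since the quoted slice height and slice count do not obviously line up.

```python
def height_map(points, xy_range=(-30.0, 30.0), z_range=(-1.5, 1.0),
               cell=0.1, slices=5):
    """Return {(ix, iy, iz): max z} for lidar points inside the box.

    `points` is an iterable of (x, y, z) tuples in metres. Empty voxels
    are simply absent from the result (a sparse stand-in for a dense
    600 x 600 x 5 array).
    """
    lo, hi = xy_range
    z_lo, z_hi = z_range
    slice_height = (z_hi - z_lo) / slices  # assumption: equal slices
    n = round((hi - lo) / cell)            # cells per horizontal axis
    grid = {}
    for x, y, z in points:
        # Drop points outside the constrained space.
        if not (lo <= x < hi and lo <= y < hi and z_lo <= z < z_hi):
            continue
        # Clamp indices to guard against floating-point edge cases.
        ix = min(int((x - lo) / cell), n - 1)
        iy = min(int((y - lo) / cell), n - 1)
        iz = min(int((z - z_lo) / slice_height), slices - 1)
        key = (ix, iy, iz)
        # Keep only the highest point seen in each voxel.
        if key not in grid or z > grid[key]:
            grid[key] = z
    return grid
```

A dense array would be the natural input for a convolutional network; the dictionary just keeps the sketch short and dependency-free.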
What is a Kalman filter? Why do we use it? An gives a more intuitive explanation here than you will find on Wikipedia:
“Assume the car makes the lane change successfully to get in front of me, I still continuously observe the car and adjust my speed so my car can always stay in the safe zone. If the car goes slow, I predict the car will still be slow in the next seconds and I’ll stay at a slow speed behind it. However, if it suddenly goes fast, I can speed up a little bit (as long as under speed limit) and update my belief. What I did there is a continuous process of prediction and update.”
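The predict-and-update loop in that story maps directly onto a one-dimensional Kalman filter. Below is a minimal sketch that tracks the speed of the car ahead. The function names, the constant-speed motion model, and the noise values are all assumptions picked for illustration; they do not come from An's post.

```python
def predict(mean, var, process_var):
    """Predict: assume the speed stays the same, but grow the uncertainty."""
    return mean, var + process_var


def update(mean, var, measurement, measurement_var):
    """Update: blend the prediction with a new measurement.

    The gain is close to 1 when the prediction is uncertain (trust the
    measurement) and close to 0 when the measurement is noisy.
    """
    gain = var / (var + measurement_var)
    new_mean = mean + gain * (measurement - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var


def track(measurements, mean=0.0, var=1000.0,
          process_var=1.0, measurement_var=4.0):
    """Run predict then update for each measurement; return the estimates."""
    estimates = []
    for z in measurements:
        mean, var = predict(mean, var, process_var)
        mean, var = update(mean, var, z, measurement_var)
        estimates.append(mean)
    return estimates


# The car ahead drives at about 20 m/s, then suddenly speeds up to 30 m/s.
speeds = track([20.0, 20.0, 20.0, 30.0, 30.0, 30.0])
```

After the jump, the estimate moves toward 30 over a few steps rather than snapping to it, which is the "speed up a little bit and update my belief" behavior from the quote.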