CS332 Assignment 6

Due: Monday, November 18

This assignment contains two problems. The first concerns the recovery of the motion of an observer from 2-D image velocities, and the second concerns the recovery of the time-to-contact of approaching objects. To begin this assignment, download the following folder from the CS file server: /home/cs332/download/observer and set the Current Directory in MATLAB to this folder.

Problem 1 (65 points): Recovering Observer Motion

Part a: The following equations specify the x and y components of the image velocity (V_x,V_y) as a function of the movement of the observer (translation (T_x,T_y,T_z) and rotation (R_x,R_y,R_z)) and depth Z(x,y):

V_x = (-T_x + x T_z)/Z + R_x x y - R_y (x² + 1) + R_z y
V_y = (-T_y + y T_z)/Z + R_x (y² + 1) - R_y x y - R_z x
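As a quick sanity check on these equations, the following sketch evaluates them at a single image point. The parameter values are arbitrary assumptions chosen for illustration. With zero rotation, both components should vanish at the point (T_x/T_z, T_y/T_z).

```matlab
% Illustrative values only (assumed, not taken from foeScript.m)
Tx = 0.2; Ty = 0.1; Tz = 0.4;     % observer translation
Rx = 0.0; Ry = 0.0; Rz = 0.0;     % no rotation in this check
Z  = 50;                          % depth at the image point

% Evaluate at the point where pure-translation flow should be zero
x = Tx/Tz;
y = Ty/Tz;

% Direct transcription of the two equations above
Vx = (-Tx + x*Tz)/Z + Rx*x*y - Ry*(x^2 + 1) + Rz*y;
Vy = (-Ty + y*Tz)/Z + Rx*(y^2 + 1) - Ry*x*y - Rz*x;

fprintf('Vx = %g, Vy = %g\n', Vx, Vy);
```

Setting any of the rotation parameters to a nonzero value makes the velocity at this point nonzero, since the rotational terms do not depend on depth or on the translation.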

Write a function setupImageVelocities that computes the image velocity field that results from a movement of the observer:

function [vx vy xfoe yfoe] = setupImageVelocities(Tx,Ty,Tz,Rx,Ry,Rz,zmap)

The input zmap is a 2-D matrix of the depths of the surfaces that project to each image location. The output matrices vx and vy should be the same size as the input zmap. Assume that the origin of the x,y image coordinate system is represented at the center of the zmap, vx, and vy matrices. The indices of these matrices should also be scaled to obtain the x and y coordinates that are used in the above equations. In particular, let i and j represent the indices of the zmap, vx, and vy matrices (where i is the first (row) index and j is the second (column) index), and let icenter and jcenter denote the indices of the center location in these matrices (icenter is the middle row and jcenter is the middle column). Then the x and y coordinates that are used in the above expressions for V_x and V_y should be calculated as follows:

x = 0.05*(i-icenter); y = 0.05*(j-jcenter);
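One possible way to build the coordinate arrays without explicit loops is sketched below. The use of ceil to pick the center index and the 5-by-7 example size are assumptions; the handout does not fix how the center is chosen when a dimension has an even number of elements.

```matlab
zmap = 50*ones(5, 7);             % small stand-in depth map (assumed size)
[nrows, ncols] = size(zmap);

icenter = ceil(nrows/2);          % middle row (assumed rounding rule)
jcenter = ceil(ncols/2);          % middle column (assumed rounding rule)

% i varies down the rows, j varies across the columns
% (these names shadow MATLAB's imaginary unit within this script)
[i, j] = ndgrid(1:nrows, 1:ncols);

x = 0.05*(i - icenter);           % x follows the row index, per the handout
y = 0.05*(j - jcenter);           % y follows the column index, per the handout

disp(x(icenter, jcenter));        % x coordinate at the center location
disp(size(x));                    % same size as zmap
```

Because x and y are full matrices of the same size as zmap, the velocity equations can then be applied with element-wise operators (.*, ./, .^).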

The setupImageVelocities function should also return the x and y coordinates of the focus of expansion (FOE), T_x/T_z and T_y/T_z. If T_z is 0, the FOE is undefined, and the function can simply return a value such as 1000.0 for both coordinates to indicate this.

The function displayVelocityField in the observer folder displays velocities at evenly spaced locations in the horizontal and vertical directions. It takes three inputs: the vx and vy matrices, and the distance between the locations where a velocity vector is displayed. The foeScript.m code file contains some initial statements for testing your setupImageVelocities function and displaying the results. The depth map used in these examples consists of a central square surface at a distance of 25 from the observer, in front of a background surface at a distance of 50. There are three examples, and each velocity field is displayed in a separate figure window. The coordinates of the FOEs are also printed.

You can expand on these examples to explore the appearance of the velocity field for different combinations of observer translation, observer rotation, and surface depths. Given the depths and image coordinates used in the initial examples, velocities with a reasonable range of speeds can be obtained if the translation parameters are in the range of about 0.0-0.5 and the rotation parameters are in the range of about 0.0-0.03.
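The rule for the focus of expansion and a depth map of the kind described above can be sketched as follows. The 101-by-101 size, the extent of the central square, and the translation values are assumptions for illustration; the actual values used in foeScript.m may differ.

```matlab
% Depth map in the style described above: a near square on a far background.
% The matrix size and the extent of the square are assumed values.
zmap = 50*ones(101, 101);         % background surface at depth 50
zmap(36:66, 36:66) = 25;          % central square surface at depth 25

% Hypothetical translation parameters, with Tz = 0 to exercise the guard
Tx = 0.1; Ty = 0.2; Tz = 0.0;

% Focus of expansion, with a sentinel value when Tz is zero
if Tz == 0
    xfoe = 1000.0;                % FOE undefined: sentinel suggested in the handout
    yfoe = 1000.0;
else
    xfoe = Tx/Tz;
    yfoe = Ty/Tz;
end

fprintf('FOE: (%g, %g)\n', xfoe, yfoe);
```

Changing Tz to a nonzero value in this sketch exercises the other branch and yields the ratios T_x/T_z and T_y/T_z.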

Part b: In class, we discussed an algorithm for recovering the direction of motion of the observer that was proposed by Longuet-Higgins and Prazdny. This algorithm is based on the following observation. At the location of a sudden change in depth in the scene, the component of image motion due to the observer's translation changes abruptly in the image, but there is very little change in the component of image motion that is due to the observer's rotation. Furthermore, the vector difference between the two 2-D image velocities on either side of a depth change lies on a line that points toward the focus of expansion (the observer's heading point).

The computeObserverMotion function in the observer folder implements a simple version of Longuet-Higgins and Prazdny's algorithm. At each image location, this function first determines whether there is a large change in 2-D velocity in the horizontal or vertical direction. If so, the vector difference in velocity in that direction contributes toward the computation of the observer's heading point. To determine this heading point, the function combines velocity differences from a large number of image locations. In particular, it finds the best intersection point of all of the lines containing large velocity differences.

There are two tests of the computeObserverMotion function in the foeScript.m code file that are initially commented out. Each call to this function is followed by a statement (also initially commented out) that prints the coordinates of the true heading point and the computed heading point for that example. The true and computed values can be compared to determine the accuracy of the computed heading. Expand on the examples provided in the foeScript.m file to demonstrate the following two properties:

The calculation of the heading point degrades if the range of depths in the scene is reduced
The calculation of the heading point degrades if there is significant rotation of the observer during translation
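The observation behind this algorithm can be illustrated with a small numerical sketch. This is not the code in computeObserverMotion, whose internals are not shown here. The sample points, depths, and motion parameters are assumptions, and the least-squares formulation is just one common way to find a best intersection point of a set of lines. Both velocities are evaluated at the same image point, so the rotational component cancels exactly; with neighboring pixels the cancellation would only be approximate.

```matlab
% Assumed observer motion: translation plus a small rotation
Tx = 0.2; Ty = 0.1; Tz = 0.4;
Rx = 0.01; Ry = -0.02; Rz = 0.015;
Znear = 25; Zfar = 50;            % depths on either side of a depth edge

% A few assumed image locations lying on depth edges
pts = [-1.0 -0.5; 1.2 0.8; -0.7 1.5; 0.9 -1.1];

A = zeros(2, 2);                  % accumulates sum of n*n'
b = zeros(2, 1);                  % accumulates sum of n*(n'*p)
for k = 1:size(pts, 1)
    x = pts(k, 1); y = pts(k, 2);
    % Rotational component does not depend on depth
    rotx = Rx*x*y - Ry*(x^2 + 1) + Rz*y;
    roty = Rx*(y^2 + 1) - Ry*x*y - Rz*x;
    vnear = [(-Tx + x*Tz)/Znear + rotx; (-Ty + y*Tz)/Znear + roty];
    vfar  = [(-Tx + x*Tz)/Zfar  + rotx; (-Ty + y*Tz)/Zfar  + roty];
    d = vnear - vfar;             % velocity difference across the edge
    n = [-d(2); d(1)] / norm(d);  % unit normal of the line through (x,y) along d
    A = A + n*n';
    b = b + n*(n'*[x; y]);
end

foe = A \ b;                      % point minimizing squared distance to all lines
fprintf('computed FOE: (%g, %g), true FOE: (%g, %g)\n', ...
        foe(1), foe(2), Tx/Tz, Ty/Tz);
```

Reducing the gap between Znear and Zfar shrinks the difference vectors d, which hints at why a small depth range makes the estimate more sensitive to noise.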

Part c: Longuet-Higgins and Prazdny's algorithm assumes that the scene is stationary. In other words, the scene cannot contain objects that undergo their own motion in space. In general, if an object undergoes its own motion, the vector differences in velocity across the boundaries of the object may no longer point to the observer's true heading point.

Consider the diagram below, which shows sample velocities in four regions around the border of a square surface that is placed in front of a background surface. Assume that the true heading point is located at the center of the square, as shown. All of the velocity vectors point away from the FOE, and the velocities inside the border of the square have larger magnitude because the central surface is closer to the observer. The vectors obtained by computing the difference between the velocities inside and outside the border of the square also have a direction that lies on a line through the FOE.

Suppose that the central square undergoes its own motion, consisting of a constant translation of the square patch to the right, as shown by the arrow attached to the FOE. The final motion of each point on the central surface would then be the result of adding this constant rightward velocity to the velocity that results from the observer's translation. For each of the four regions circled below, show how this added object motion would alter the difference in velocity measured across the border. Draw the lines that contain the four velocity difference vectors and show roughly where these lines intersect in the image (this is the heading point that would be computed by Longuet-Higgins and Prazdny's algorithm). Do the same construction for the case where the object shifts to the left instead (by the same amount). Can this algorithm compute the correct heading point for these situations where the object undergoes its own motion?

Problem 2 (35 points): Measuring Time-to-Contact

In lab, we derived a simple approximation to the time-to-contact (TTC) between an observer and an object surface, where the observer, the object, or both are moving. In particular, we observed that the time-to-contact is roughly equal to the ratio between the size of an object and its rate of change of size over time. Not being able to let go of the excitement of the 2013 Baseball World Series, we will apply this strategy in this problem to the analysis of images of baseballs moving toward the observer. The script named makeBaseballs in the observer folder creates a sequence of images of five baseballs moving roughly toward the viewer, and displays the sequence as a movie. The sequence simulates a set of baseballs that each start at different distances from the observer and move with different speeds in depth. The question that you need to answer is: which baseball will hit you first?
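The size-over-rate-of-change approximation can be tried out numerically under a simple pinhole model, in which an object of physical size S at distance Z has an image size proportional to S/Z. All of the numbers below are assumptions chosen for illustration and are unrelated to the values used in makeBaseballs.

```matlab
S  = 0.074;                       % assumed object diameter
Z0 = 100;                         % assumed initial distance
v  = 2;                           % assumed approach speed (distance per frame)

t1 = 39; t2 = 40;                 % two frame times
Z1 = Z0 - v*t1;                   % distance at each frame
Z2 = Z0 - v*t2;
s1 = S/Z1;                        % image size grows as distance shrinks
s2 = S/Z2;

dsdt = (s2 - s1)/(t2 - t1);       % finite-difference rate of change of size
ttcEstimate = s2/dsdt;            % size divided by rate of change of size
ttcTrue     = Z2/v;               % frames remaining until contact at time t2

fprintf('estimated TTC = %g frames, true TTC = %g frames\n', ...
        ttcEstimate, ttcTrue);
```

In this constant-speed model, the backward finite difference makes the estimate exceed the true value at the later frame by the gap between the two frames, so a wider frame gap trades measurement precision against this bias.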

The sequence of 40 images is stored in a vector of structures that each contain a 2-D image. Use the last image (index 40) and one that is some distance from the end (e.g. index 35) to compute the time-to-contact of each of the baseballs from an analysis of these two images, using the above approximation. You must compute the TTC by analyzing the images; you cannot use information in the makeBaseballs script to determine it! Create a separate script for your code and add a comment to your script indicating the TTC of each of the five baseballs, and which one will hit you first (assuming they keep moving along their current trajectory). Also add comments to answer the question: under what assumptions is this approximation to time-to-contact valid? Give an example where the approximation would not be valid.

Hint: built-in MATLAB functions that you may have used in the previous assignment, like bwlabel and find, may be handy here.
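As a rough illustration of how these two functions fit together, the sketch below measures the width of each connected region in a synthetic binary image. The image contents are made up for this example, bwlabel requires the Image Processing Toolbox, and real frames would first need to be converted to a binary image (for instance by thresholding, which is an assumption about the image format).

```matlab
% Synthetic binary frame with two rectangular blobs (stand-ins for objects)
bw = false(20, 30);
bw(3:8, 4:9)     = true;          % blob spanning 6 columns
bw(10:18, 15:26) = true;          % blob spanning 12 columns

[labels, nblobs] = bwlabel(bw);   % label connected regions 1..nblobs

widths = zeros(1, nblobs);
for k = 1:nblobs
    [rows, cols] = find(labels == k);       % pixel coordinates of region k
    widths(k) = max(cols) - min(cols) + 1;  % horizontal extent in pixels
end

disp(widths);
```

The same loop could record the vertical extent from rows, or the pixel count from numel(rows), depending on which measure of size is preferred.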

Submission details: Hand in a hardcopy of your setupImageVelocities.m code file, your script for Problem 2, and your answers to the questions in Problem 1. Drop off an electronic copy of your code files by logging into the CS file server, connecting to your observer folder, and executing the following command:

submit cs332 assign6 *.*