Intro to Computer Vision (Tue Nov 5, lect 19)

Some basics about CV to get you started

Logistics

** QUIZ ** TODAY

Computer Vision

Seeing ==? Vision

We focus here on cameras
Ignore other kinds of “seeing” such as touch sensors or LIDAR.
A lot more than taking pictures
Can be used for:
1. Mapping
2. Localization
3. Object recognition
4. Trajectory Estimation
5. Driver Distraction
6. Redundancy

Camera Types: Regular (webcam-like)

RGB images
Video is basically a stream of images
Frame rate: How many images sent per second
The higher the frame rate, the more bandwidth and CPU are used
Video is usually processed exactly that way, as a stream of separate images
Each image is analyzed individually
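To get a feel for why frame rate matters, here is a rough back-of-the-envelope sketch of the raw (uncompressed) data rate. The resolution and frame rates are made-up example figures, and `raw_video_bandwidth` is a hypothetical helper, not part of any library.

```python
def raw_video_bandwidth(width, height, fps, bytes_per_pixel=3):
    """Return the uncompressed data rate in bytes per second."""
    return width * height * bytes_per_pixel * fps

# Example figures (assumed, not taken from any particular camera)
for fps in (10, 30):
    rate = raw_video_bandwidth(640, 480, fps)
    print(f"640x480 RGB at {fps} fps: {rate / 1e6:.1f} MB/s")
```

Real cameras and ROS transports often compress images, so actual numbers can be much lower; the point is only that the cost scales linearly with frame rate.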

Cameras for other parts of the light spectrum

Infrared, etc.
Not common

Depth cameras and other special cameras

Mobile phone facial recognition
Microsoft Kinect
Many others
Conceptually, in addition to R, G and B, each pixel has a number saying how far away that surface is
Also, “AI” Cameras like HuskyLens
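One way to picture the "extra number per pixel" idea for depth cameras is a color array plus a separate distance array of the same height and width. This is a minimal sketch with invented sizes and distances; real depth cameras deliver their data in their own formats.

```python
import numpy as np

height, width = 4, 6  # tiny made-up image so the arrays are easy to inspect

color = np.zeros((height, width, 3), dtype=np.uint8)        # R, G, B per pixel
depth_m = np.full((height, width), 2.0, dtype=np.float32)   # meters per pixel

# Pretend an object 0.5 m away covers the two middle columns
depth_m[:, 2:4] = 0.5

row, col = 1, 3
print("color at pixel:", color[row, col])
print("distance at pixel:", depth_m[row, col], "m")

# Row and column of the nearest surface in the frame (first one found)
nearest = np.unravel_index(np.argmin(depth_m), depth_m.shape)
print("nearest surface at (row, col):", nearest)
```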

Considerations

Resolution of image
Power requirements
Fixed direction or sometimes a swivel
ROS needs to know how the camera's position and direction relate to the overall robot base
Another job for TFs
What would happen if this information is incorrect?
Where do the images have to be “sent” to?

Connections

USB, direct to Raspberry Pi
The image has to be viewed through a Unix utility
Which is not always easy
Bandwidth and speed of connection

Role of TFs

What is the likely TF that would be needed?
What would that tell us?
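One plausible answer is a fixed transform from the robot base frame to the camera frame. The sketch below uses plain NumPy rather than the ROS tf library, and assumes the camera axes are aligned with the base axes (x forward) so that only a translation is needed. The mounting offsets are invented for illustration.

```python
import numpy as np

def translation(x, y, z):
    """4x4 homogeneous transform for a pure translation (no rotation)."""
    t = np.eye(4)
    t[:3, 3] = [x, y, z]
    return t

# Hypothetical mounting: camera 0.10 m ahead of and 0.20 m above the base origin
base_from_camera = translation(0.10, 0.0, 0.20)

# An object seen 1.5 m straight ahead of the camera (homogeneous coordinates)
point_in_camera = np.array([1.5, 0.0, 0.0, 1.0])
point_in_base = base_from_camera @ point_in_camera
print("object in base frame:", point_in_base[:3])

# If the forward offset were wrongly recorded as 0, every object would be
# placed 0.10 m too close to the robot
wrong = translation(0.0, 0.0, 0.20) @ point_in_camera
print("position error:", point_in_base[:3] - wrong[:3])
```

The transform tells us where something seen by the camera is relative to the robot itself, which is what navigation code needs. An incorrect transform shifts every detection by the same error.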

Computer Vision (CV)

Recognizing: faces, locations, fiducials, gestures
Image processing: filtering colors, isolating shapes etc.
Machine Learning (ML): Statistical analysis of tagged images

Code

Note: the Burger TB3 model in Gazebo does not have a camera! You must use the Waffle

roslaunch prrexamples linemission.launch model:=waffle
Look at cvexample.py

Image Properties

Images are stored in arrays
Top left pixel is 0,0
Bottom right pixel is m-1,n-1 for an image with m rows and n columns (row index first)
Each element of this array is another array, with the values for that pixel
And that depends on the image encoding.
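A small sketch of this layout, assuming the image is a NumPy array (which is what OpenCV's Python bindings use) with a made-up size of 240 rows by 320 columns. Note that the first index is the row (vertical position), not x.

```python
import numpy as np

rows, cols = 240, 320                       # assumed image size
image = np.zeros((rows, cols, 3), dtype=np.uint8)

image[0, 0] = (255, 0, 0)                   # top-left pixel
image[rows - 1, cols - 1] = (0, 0, 255)     # bottom-right pixel

print(image.shape)      # (rows, columns, values per pixel)
print(image[0, 0])      # the inner array holding that pixel's values
print(image[-1, -1])    # negative indices count from the end
```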

Image Encodings

Usually each component is an integer from 0-255
Zero means none of that component; 255 means the maximum of that component
The first three components determine the color
There can be more components, e.g. distance, or transparency, or others.
Generally you can freely convert between encodings

Encoding for color: RGB

RGB - Red, Green, Blue
Like mixing colored lights. 100% of Red + Blue + Green gives white

Encoding for color: HSV

HSV - Hue, Saturation, Value
A little more subtle. You have to train your eye
Hue is the color, with all colors arranged in a rainbow (on 8-bit images OpenCV stores hue as 0 to 179; saturation and value run from 0 to 255)
Saturation is how “pure” or “saturated” that color is
Value is how “bright” the color is
Sometimes it's easier to think in terms of HSV
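To build some intuition, the sketch below converts a few reds from RGB to HSV using Python's standard `colorsys` module (not OpenCV). `colorsys` works with values from 0 to 1, so the hypothetical helper `rgb255_to_hsv` rescales the inputs and reports hue in degrees; OpenCV uses a different numeric scale for the same idea.

```python
import colorsys

def rgb255_to_hsv(r, g, b):
    """Convert 0-255 RGB to (hue in degrees, saturation 0-1, value 0-1)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v

for name, rgb in [("pure red", (255, 0, 0)),
                  ("pale red", (255, 128, 128)),
                  ("dark red", (128, 0, 0))]:
    h, s, v = rgb255_to_hsv(*rgb)
    print(f"{name}: hue={h:.0f} sat={s:.2f} val={v:.2f}")
```

All three have the same hue. The pale red differs only in saturation and the dark red only in value, which is why "find everything red" is often easier to express in HSV than in RGB.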


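# Simple manipulations (image is a NumPy array indexed as image[row, column])

tl_pixel = image[0][0]                # top-left pixel: one value per channel
r_channel = image[0][0][0]            # 'R' channel of the top-left pixel (if the encoding is rgb8; with bgr8, index 0 is 'B')
cropped_img = image[120:240, 0:320]   # rows 120-239, columns 0-319: the bottom half of a 240 x 320 image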

Image processing with OpenCV

OpenCV is the key library used for image processing in ROS
It is totally distinct from ROS but great bridging functions exist
In Python it is imported as cv2 (a module name, not the version number; the library itself is at 4.x)

Basics to understand

Understand representations of images as 2+ dimensional arrays
Understand that each image array carries its pixel data in a particular encoding
Different encodings change what is stored in the innermost array
Monochrome, Depth, RGB, HSV, and more
Different OpenCV functions expect and generate images of specific types

Connecting OpenCV to ROS …

CvBridge() provides a series of methods that convert from ROS image messages to OpenCV (cv2) image formats
And vice versa
e.g. CvBridge().imgmsg_to_cv2(msg, desired_encoding='bgr8')
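A minimal sketch of how this call might sit inside a ROS 1 node. This is not the course's cvexample.py. The node name and the topic `/camera/rgb/image_raw` are assumptions, so check `rostopic list` for the real topic. The ROS imports are placed inside `main()` only so that the processing function `mean_color` (a hypothetical helper) can be tried on a machine without ROS.

```python
import numpy as np

def mean_color(image):
    """Return the average value of each channel of an H x W x 3 image."""
    return image.reshape(-1, image.shape[-1]).mean(axis=0)

def main():
    # ROS imports live here so mean_color can be tried without ROS installed
    import rospy
    from cv_bridge import CvBridge
    from sensor_msgs.msg import Image

    bridge = CvBridge()

    def on_image(msg):
        # Convert the ROS Image message to a NumPy array in B, G, R order
        frame = bridge.imgmsg_to_cv2(msg, desired_encoding='bgr8')
        rospy.loginfo("mean B, G, R: %s", mean_color(frame))

    rospy.init_node('image_listener')                            # hypothetical name
    rospy.Subscriber('/camera/rgb/image_raw', Image, on_image)   # assumed topic
    rospy.spin()

# Off-robot check with a synthetic frame (no ROS needed)
test_frame = np.zeros((2, 2, 3), dtype=np.uint8)
test_frame[..., 2] = 200          # fill the third (R) channel
print(mean_color(test_frame))
```

On a machine with ROS and a running camera, calling `main()` starts the node.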

How to think about algorithms

They are a series of distinct steps (i.e. a pipeline)
Often a step takes an image and transforms it into a different image
The order really matters
Sometimes a step takes an image, analyzes it, and generates data (a number, an array, a list, etc.)
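The pipeline idea can be sketched with three small steps: two that turn an image into another image and one that turns an image into a number. This uses plain NumPy on a synthetic frame as a stand-in for the OpenCV calls you would normally use. The function names, threshold and image size are all made up for illustration.

```python
import numpy as np

def crop_bottom(image, rows):
    """Step 1 (image -> image): keep only the last `rows` rows."""
    return image[-rows:, :]

def bright_mask(gray, threshold):
    """Step 2 (image -> image): True where a pixel is brighter than threshold."""
    return gray > threshold

def column_centroid(mask):
    """Step 3 (image -> data): mean column index of the True pixels, or None."""
    cols = np.nonzero(mask)[1]
    return float(cols.mean()) if cols.size else None

# Synthetic 240 x 320 grayscale frame with a bright vertical stripe
gray = np.zeros((240, 320), dtype=np.uint8)
gray[:, 200:220] = 255

strip = crop_bottom(gray, 40)
mask = bright_mask(strip, 128)
center = column_centroid(mask)
error = center - gray.shape[1] / 2   # positive: stripe is right of image center
print(center, error)
```

Swapping the steps changes the cost and sometimes the result: cropping first means the threshold only runs on 40 rows instead of 240, and the centroid only makes sense on the mask, not on the raw image.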

Thank you. Questions?