Overview of Computer Vision
Ahid Naif • Sep 11 2020
You may have heard the term computer vision before but still not understand exactly what it is, or you may confuse it with related terms such as image processing and artificial intelligence. In this article, I will give you an idea of what computer vision is and clear up the confusion among the three terms: computer vision, image processing, and artificial intelligence. I will also discuss why computer vision is important.
Computer vision is a scientific field that deals with how computers can be made to gain a high-level understanding from digital images or videos. For humans, this is a trivial task. We see images every day, and we can analyze and interpret them very easily. Humans can infer a lot of information and ideas just by looking at a picture. There is an English idiom that says "A picture is worth a thousand words". This means that a lot of complex ideas can be conveyed with just a single picture. For example, looking at the picture below, we can see four tabby kittens inside a brown basket. The basket is on the grass and it has a tag with a kind of serial number on it, "368 3 13". There is also a small plant with a few purple flowers behind the basket. At a glance, we can obtain all of this information from the image. You can probably get even more information than I could.
However, this is not the case for computers. From a computer's perspective, an image is just an array of pixels. This is what the image of the kittens above looks like to a computer:
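To make the idea of an array of pixels concrete, here is a minimal sketch in plain Python. The 4x4 grayscale values are made up for illustration and are not taken from the kitten image; a real photo would simply be a much larger grid, usually with three numbers (red, green, blue) per pixel.

```python
# A tiny 4x4 grayscale "image": each number is one pixel's brightness,
# from 0 (black) to 255 (white). The values are invented for illustration.
image = [
    [0,   0, 255, 255],
    [0,   0, 255, 255],
    [90, 90, 180, 180],
    [90, 90, 180, 180],
]

height = len(image)
width = len(image[0])
print(f"{width}x{height} image, {width * height} pixels")

# To the computer, "looking" at the image only means reading numbers.
top_right = image[0][3]
mean_brightness = sum(sum(row) for row in image) / (width * height)
print("pixel at row 0, column 3:", top_right)
print("mean brightness:", mean_brightness)
```

Nothing in these numbers says "kitten" or "basket". Any meaning has to be computed from them, which is the job of computer vision.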
On its own, a computer can interpret nothing from an image. This is where the field of computer vision comes in: it is concerned with the automatic extraction, analysis, and understanding of useful information from a single image or a sequence of images (video). The ultimate goal of computer vision is to emulate human vision, including the ability to learn, to make inferences from input images, and to take actions based on those inferences.
As the name suggests, image processing involves processing an image: smoothing, sharpening, adjusting contrast, stretching, and even cutting out regions of interest. In other words, image processing is the field that deals with performing operations on images, either to extract useful information from them or simply to enhance them. The image below is an example of image processing in which I used mathematical algorithms to detect the edges in an image of a child.
<div style="text-align:center"><img src ="//binarytorch.com.my/storage/edge_detector.jpg" /></div>
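As a rough illustration of what an edge-detection algorithm can look like, here is a minimal sketch of the Sobel operator in plain Python. This is one common technique and not necessarily the one used to produce the image above; the function name `sobel_edges`, the threshold of 200, and the tiny test image are all assumptions made for the example.

```python
def sobel_edges(image, threshold=200):
    """Return a binary edge map (1 = edge) for a grayscale image given as a list of rows."""
    # Sobel kernels: gx responds to horizontal brightness changes, gy to vertical ones.
    gx_kernel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    gy_kernel = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    # Border pixels are skipped (left as 0) because the 3x3 window does not fit there.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += gx_kernel[dy][dx] * pixel
                    gy += gy_kernel[dy][dx] * pixel
            magnitude = (gx * gx + gy * gy) ** 0.5
            if magnitude >= threshold:
                edges[y][x] = 1
    return edges


# Hypothetical test image: dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_edges(image)
for row in edges:
    print("".join("#" if e else "." for e in row))
```

On this test image, the sketch marks the two columns on either side of the dark-to-bright boundary and leaves the flat regions empty. In practice you would more likely use a library such as OpenCV than hand-written loops.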
Artificial intelligence (AI) is an area of computer science that emphasizes the creation of intelligent machines that work and react like humans. AI is essentially the umbrella term for any technology that tackles complex problems in a human-like manner, or that generally acts in a human-like way. AI projects aim to make machines perform tasks as well as, or even better than, humans. Traditional systems are given specific rules by a developer and always behave the same way when they receive a specific input. AI systems, by contrast, are designed to respond to inputs in a way that emulates human behavior. Through this approach, it has become possible for computers to start doing tasks that used to depend on humans, such as customer service, textual analysis, or driving cars.
Thus, we can see that systems designed with artificial intelligence do not necessarily involve computer vision. Assistants such as [Siri](//simple.wikipedia.org/wiki/Siri_(software)), [Cortana](//en.wikipedia.org/wiki/Cortana), and [Alexa](//en.wikipedia.org/wiki/Amazon_Alexa) are examples of systems that apply AI algorithms and emulate human speech. However, these systems do not apply any computer vision algorithms. So, computer vision is not a subfield of artificial intelligence, although the two overlap.
A very famous example of an artificial intelligence application is the driverless car. A driverless car can detect people in front of it. Not only that, it can also distinguish people from vehicles, traffic signs, and other important objects on the road. And, as its name implies, it can drive itself, which means it can take actions based on what it sees.
<div style="text-align:center"><img src ="//binarytorch.com.my/storage/tesla_driverless_car.gif" />
Tesla’s driverless system detecting objects in a foggy scenario via <a href="//www.bloomberg.com/news/articles/2016-12-20/the-tesla-advantage-1-3-billion-miles-of-data">Source</a></div>
The terms artificial intelligence and machine intelligence are used interchangeably!
Difference between Image Processing and Computer Vision
Now that all the terms are defined, I will make a comparison that can help you distinguish them. First, let us start with the main difference between computer vision and image processing, since they are closely related. Image processing is when operations are performed to get a better version of an image, in other words, to enhance it. Computer vision, on the other hand, is much wider than image processing. It can be defined as the science of machines or computers that can see. Computer vision is concerned with making computers extract useful information from the scenes they see. The applications of computer vision are numerous and include:
Augmented Reality (AR)
Industrial quality inspection
By listing some applications, we can see that computer vision is a very wide field that is interconnected with other fields, among them image processing and machine learning. Machine learning is about making computers or machines capable of learning so that they can perform high-level tasks with human precision without being explicitly programmed.
The main difference between image processing and computer vision is in goals, not in methods, because some algorithms may be applied in both fields. For example, if the goal is to enhance an image for later use, then, at this stage, it is called image processing. However, if the goal is to emulate human vision, as in object recognition or automatic driving, then it is called computer vision. Take a face detection system: the stage where operations are performed on the image, such as removing noise and extracting the region of interest (faces in this case), is image processing, while the stage where the system learns faces is machine learning. However, the whole face detection system, where both image processing and machine learning algorithms are applied, is considered a computer vision application.
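The division of labor described above can be sketched as a toy pipeline in plain Python. This is only an illustration under strong assumptions, not a real face detector: the helper names (`crop`, `normalize`, `train_centroids`, `predict`), the 2x2 patches, and the "face" and "background" labels are all hypothetical, and the learning step is a simple nearest-centroid classifier chosen for brevity.

```python
def crop(image, top, left, size):
    """Image processing step: cut out a square region of interest."""
    return [row[left:left + size] for row in image[top:top + size]]


def normalize(patch):
    """Image processing step: scale pixels from 0..255 to 0..1 and flatten."""
    return [pixel / 255 for row in patch for pixel in row]


def train_centroids(samples):
    """Machine learning step: average the feature vectors of each label."""
    sums, counts = {}, {}
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, features):
    """Machine learning step: pick the label whose centroid is closest."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: distance(centroids[label], features))


# Made-up training patches: bright ones stand in for "face", dark ones for "background".
training = [
    (normalize([[200, 210], [190, 220]]), "face"),
    (normalize([[180, 230], [210, 200]]), "face"),
    (normalize([[20, 30], [10, 40]]), "background"),
    (normalize([[35, 15], [25, 5]]), "background"),
]
centroids = train_centroids(training)

# Made-up 4x4 scene with a bright 2x2 region in the middle.
scene = [
    [10, 20, 30, 25],
    [15, 205, 215, 20],
    [30, 195, 225, 10],
    [5, 10, 15, 20],
]
patch = crop(scene, top=1, left=1, size=2)   # image processing
label = predict(centroids, normalize(patch))  # machine learning
print(label)
```

Taken together, the cropping, normalizing, learning, and predicting steps form one small computer vision application, even though each function on its own belongs to image processing or machine learning.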
Difference between Computer Vision and Artificial Intelligence
The difference between these two can be understood very easily. In the previous section, I mentioned that the computer vision field is interconnected with machine learning. Machine learning, in turn, is a subfield of artificial intelligence. Thus, computer vision is interconnected with artificial intelligence. The conclusion is that a computer vision application may or may not have artificial intelligence algorithms applied.
Keep in mind that these fields are not strictly categorized in this way. Different scientists and practitioners might have different perspectives.
Having an overview of the methods applied in all of the fields will help you have a deep understanding of the differences among them.
Why is Computer Vision Important?
Computer vision is used across industries all over the world. There are industries that would not even exist without computer vision. Computer vision helps industries increase efficiency, increase security, reduce cost and enhance the consumer experience. It provides solutions to a wide variety of fields:
In manufacturing, industries use computer vision to identify defects in products. Computers nowadays can detect many different kinds of defects, even on very small products.
In the medical field, computer vision systems can help doctors identify abnormalities on different kinds of scans. On some tasks, these systems have managed to reach a human level of precision.
Defense and Security
In high-security environments like banking, computer vision is used to identify customers when large amounts of money are being exchanged. Computer vision systems can analyze hundreds of video feeds at once, which is impossible for security guards.
Daily Life Usage
Computer vision makes our lives easier. Imagine being able to find out how many calories are in your meal, or to identify the type of a pill, just by taking a snap of it. Also, imagine being able to identify objects or read labels and text by pointing your smartphone's camera at them. Such applications make life easy and fancy! Actually, these applications already exist!
The applications and uses of computer vision are too numerous to list. The examples I mentioned are just a drop in the ocean!
Do you feel excited to learn about computer vision and play a role in developing computer vision systems?
Go to [Learn Image Processing and Computer Vision with OpenCV](//mysenior.io/blog/image-processing/learn-image-processing-and-computer-vision-with-opencv) and start your computer vision learning journey!