Q&A: A Look into Computer Vision
It’s no secret that computer vision, machine learning, and augmented reality are on a rapid incline, transforming the tech field as a whole. We sat down with two of our team’s computer vision interns, Manasa Manohora and Deepika Kanade (pictured below), to chat about where they see the CV industry going.
1. Since you started in computer vision, what overall progress have you seen?
Manasa: It’s been over a year since I started working in the field of computer vision. Previously, many companies stuck to traditional OpenCV methods to process and analyze image data. Now, most of them are moving towards deep learning to solve the same problems. Deep learning methods have achieved accuracies in identifying and classifying image data that were previously unattainable with traditional methods. Identification, reconstruction, and classification problems are all now being solved by deep learning and machine learning algorithms.
Deepika: I feel that the computer vision industry has progressed a lot in the last five years or so, especially due to the increase in companies working on autonomous cars. The use of deep learning frameworks is on the rise, which has brought about a lot of new applications of computer vision. I feel that today’s engineers are working on many other applications of computer vision that can permanently change the world.
2. What excites or interests you about the field?
Manasa: Computer vision gives a machine the ability to comprehend what we see. There are a lot of applications of CV, like reconstructing 3D data, surveillance, and stores like Amazon Go that use multiple CV algorithms in parallel to identify and understand what customers are purchasing. Unlike humans, who can get overwhelmed or biased, a computer can see many things at once, and I think the prospect of completely automating everyday activities is what excites me.
Deepika: My favorite aspect of working in this industry is that there is a lot of scope for research. As the industry is blooming, there is a lot of freedom to experiment and come up with new models. What also excites me is that whatever we as engineers are doing can change how things will work some years down the line. AI, machine learning, and computer vision can change the way the world functions now. I am mostly interested in computer vision and mixed reality, and the best thing about that is that every algorithm we come up with can help us see things better visually. The concept of making machines see the way humans do intrigues me a lot.
3. Give us insight into a project you are working on and the challenges that have come with it.
Manasa: The projects that I am currently working on use deep learning to convert 2D to 3D and to obtain stereo depth maps. There are challenges right from the start, as you are looking through multiple papers and publications to narrow down your approaches to the problem. Some papers don’t define their approaches clearly, which makes it harder to understand where we could improve on current state-of-the-art implementations. Others do provide their implementations, but none of the results match the figures reported in the paper. After we decide on an approach, there are a lot of problems with choosing the correct dataset, preprocessing it, exploring hardware options, and trying out different frameworks to achieve the result. 2D to 3D conversion in real time would be really cool, as you could process and display 3D content with no extra hardware on the camera.
Deepika: I am working on a project that involves converting 2D pictures into 3D pictures and generating depth maps. This will basically let all the pictures we take with our cameras appear in 3D, resembling how we see the world. The 3D industry is growing rapidly, and I feel that this project will garner a good response from customers, as nowadays everyone wants to see everything in 3D. The challenges I faced while working on the project were related to hardware. As we are deploying deep learning in the project, GPUs have to be used, and they have inherent memory limitations with larger datasets. Also, there is always some unpredictability in the output when deep learning is used, which makes the process harder.
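For readers new to the depth maps both interns mention, here is a minimal sketch of the idea. It is not the interns' deep learning approach, which the interview does not describe. It only illustrates the standard pinhole stereo relation, depth = focal length × baseline / disparity, for a pair of rectified cameras. The function name, the focal length, the baseline, and the sample disparity values are all assumptions made up for illustration.

```python
from typing import List, Optional


def disparity_to_depth(
    disparity: List[List[float]],
    focal_length_px: float,
    baseline_m: float,
) -> List[List[Optional[float]]]:
    """Convert a disparity map (in pixels) to a depth map (in meters).

    Uses Z = f * B / d for a rectified stereo pair. Pixels with
    non-positive disparity have no valid depth and are returned as None.
    """
    depth: List[List[Optional[float]]] = []
    for row in disparity:
        depth_row: List[Optional[float]] = []
        for d in row:
            if d > 0:
                depth_row.append(focal_length_px * baseline_m / d)
            else:
                depth_row.append(None)  # no match found for this pixel
        depth.append(depth_row)
    return depth


# Hypothetical 2x2 disparity map and camera parameters (illustrative only).
example_disparity = [[64.0, 32.0], [16.0, 0.0]]
example_depth = disparity_to_depth(
    example_disparity, focal_length_px=800.0, baseline_m=0.1
)
```

Larger disparities correspond to closer objects, which is why halving the disparity doubles the estimated depth. A learned model of the kind the interns describe would typically predict the disparity or depth map directly from the input images rather than compute it from matched pixels.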
4. Where do you, personally, see the industry going?
Manasa: Computer vision technology is very versatile and can be adapted to various fields. Healthcare, content creation, automotive, and retail are all going to be revolutionized by CV. The main challenge that I see for the field is a lack of the high-quality data required to train the machine.
Deepika: I see a lot of scope for innovation in the computer vision, artificial intelligence, and machine learning industry, as making a machine work like a human brain has varied applications across many domains. I feel that running multiple computer vision algorithms in parallel is an aspect that many companies have still not explored. If we can do that, it can definitely ease our everyday life.