A Comparison of Deep Learning and Human Vision
Overview
Deep learning models have been found to be highly successful in achieving state-of-the-art results on image recognition tasks. Since AI began as a field that sought to represent human cognitive behavior, it is worth examining how cutting-edge models perform on tasks that are mundane for humans. This paper explores the connections between certain deep learning models and human cognition, especially in the area of vision. It begins with an overview of AI, deep learning, and cognitive science, and briefly describes vision transformers. The project then compares these models with humans by building computational models using vision transformers and a convolutional neural network trained on the CIFAR-10 dataset. The results of these models are compared with human performance using various metrics. Finally, the paper explores architectural similarities between transformers and the human brain. The findings have implications for building more robust models that are closer to humans, not only in laboratory settings but also in complex real-world tasks.