Sree Harsha Kalli

I am a graduate student at Carnegie Mellon University pursuing a master's degree in Computer Vision. I am particularly interested in applying deep learning techniques to solve vision-related problems.

My resume is here.

Contact me at


Carnegie Mellon University

 Master of Science in Computer Vision
     QPA : 4.11
 Courses : Computer Vision, Machine Learning, Deep Reinforcement Learning, Visual Learning and Recognition

Indian Institute of Technology, Hyderabad

 Bachelor of Technology in Electrical Engineering, 2015
     CGPA : 9.59/10

Honors and Awards

  • Silver Medal
     Recipient of the silver medal from IITH for being the branch topper in EE
  • TODAI Scholarship
     Was awarded this scholarship for excellent academic performance by the University of Tokyo in association with the Mori Seiki Company
  • JENESYS student exchange program
     Was among the four students selected from my batch to visit Japan as part of an exchange program. I visited various research hubs and universities in Japan.
  • Certificate of Academic Excellence
     Received this award twice for excelling in academics in my undergrad


I have worked on several projects in the fields of computer vision and deep learning. Some of them are listed below.

Large Scale Video Action Recognition in collaboration with Disney Research

We used a two-stream CNN to achieve 90% mAP on the ActivityNet dataset. We also proposed a new form of optical flow that preserves long-term motion dependencies in videos.


Fooling Neural Networks

Came up with an architecture, using GANs and VAEs, to generate images that look like dogs but fool the neural network into classifying them as cats.


Surround View System

Developed a surround view system to detect moving objects around a vehicle using views from four fisheye cameras mounted on the vehicle. This module was developed as part of an autonomous vehicle project at CyLab.


Video Segmentation using Convolutional Neural Networks

Augmented pixel-level semantic segmentation with object masks to produce better semantic segmentation in videos.


Adaptive Tracking of an object using deep networks

Built a recognition system to robustly track an object in a video. The weights of the neural network were initialized using a pre-trained stacked denoising autoencoder. The object was tracked using particle filtering and identified by passing it through the MLP network.

Circuit Solver Using Image Processing

Came up with an algorithm (using Hough transforms) to analyse a circuit from just an image of the circuit. Won the best project award for the course.


3D Reconstruction using Prior Information

In this project I addressed the problem of determining the number of views and the transformation required to reconstruct a 3D object, using prior information embedded in the form of a manifold.


Audio Fingerprinting

Developed algorithms to recognize a song hummed by a user. The pitch of each frame was extracted from the hummed song and compared with the pitch of the reference signal using a progressive filtering framework based on Dynamic Time Warping and edit distance.
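
To give a flavour of the Dynamic Time Warping step, here is a minimal sketch and not the project's actual code. It assumes each song is already reduced to a list of per-frame pitch values; the function name `dtw_distance` and the absolute-difference cost are illustrative choices of mine.

```python
def dtw_distance(query, reference):
    """Dynamic Time Warping cost between two pitch sequences.

    Hypothetical helper: `query` and `reference` are lists of per-frame
    pitch values (for example in semitones). The local cost is the
    absolute pitch difference, which is an assumption for illustration.
    """
    n, m = len(query), len(reference)
    inf = float("inf")
    # cost[i][j] = best alignment cost of query[:i] against reference[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(query[i - 1] - reference[j - 1])
            # Extend the cheapest of: advance query only, advance
            # reference only, or advance both together.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

Because the alignment may stretch or compress time, a hummed melody that holds a note longer than the reference can still match it at zero cost, which is why DTW suits humming that drifts in tempo.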


Glucometer Reader

Developed a portable add-on for a glucometer that reads the value on the display using a photodiode. An ARM microcontroller in the add-on then processes this signal to determine the reading.



As an intern at IISc Bangalore, I worked on implementing different algorithms used in an NMR spectrometer on FPGAs. Here is a cool video of John Conway's Game of Life being displayed on a CRT display interfaced with a Xilinx FPGA.