The Problem

As undergraduate participants in the Indiana University Computer Vision Lab, we were tasked with finding ways to combat the problem of identifying counterfeit integrated circuits (ICs). Counterfeit ICs are essentially electronics made to look and act like legitimate ICs, but they may be cheaper to produce or even contain malicious components. One way to find counterfeit ICs is simply to count the number of ICs on a circuit board and compare that number to what it should be. However, doing this manually, by eye, is extremely time-consuming.

The Solution

I trained a deep convolutional neural network (CNN or convnet), a particular method of machine learning (deep learning, specifically), to count the number of cars in an image of a parking lot. I took this approach at the suggestion of my faculty advisor because 1.) viewed from above, parking lots and circuit boards containing ICs share a similar structure; 2.) there are, to our knowledge, no large datasets of circuit boards, while the ~12,000-sample parking lot dataset we found was ready to use; and 3.) I wanted to have some fun learning about convnets!

Existing Literature

In research conducted by Cazamias and Marek at Stanford University, the authors attempted to count the number of empty parking spaces in images of parking lots. They did this in two ways, the second of which became the basis for my attempt at counting cars instead: they trained a CNN on a 70%/20%/10% train/test/validation split of 12,417 images of parking lots, yielding a model with an accuracy of about 81%. The images consisted of three different camera angles of two different parking lots, and each was labeled with the number of cars present.

[Figure] Image from the dataset publication by Almeida et al.: "Images captured under different weather conditions: (a) sunny (b) overcast, and (c) rainy from UFPR04; (d) sunny (e) overcast, and (f) rainy from UFPR05; and (g) sunny (h) overcast, and (i) rainy from PUCPR"

The CNN, which I reconstructed from scratch in TensorFlow/Keras, was composed of 3 convolutional layers of depths 10, 20, and 30 filters respectively. A 5 x 5 kernel was used at each convolutional layer, and each was followed by batch normalization, a ReLU activation function, then max pooling of size 2 x 2. After these three layers, the output was flattened and fed to three fully connected layers of sizes 30, 20, and 10 respectively, with ReLU activations after each. A final softmax layer classified a "count" of the number of empty spaces or cars. In their research, they classified between 0 and 20 empty spaces, reasoning that anything over 20 empty spaces wouldn't matter in application since it would then be easy to find a parking spot; my research, however, used classes between 0 and 100, since no image contained more than 100 cars. This difference, I'll argue below, is likely why my accuracy on my problem was much lower than theirs on their problem.
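Below is a minimal sketch of how this architecture can be reconstructed in Keras. The input resolution, padding, and optimizer are my assumptions (the paper's exact training settings aren't reproduced here), and num_classes=101 covers counts 0 through 100:

```python
# Minimal Keras sketch of the Cazamias/Marek-style counting CNN.
# Input resolution, padding, and optimizer are assumptions, not from the paper.
from tensorflow.keras import layers, models

def build_counting_cnn(input_shape=(192, 256, 1), num_classes=101):
    """3 conv blocks (10/20/30 filters, 5x5 kernels) + dense layers of 30/20/10."""
    model = models.Sequential([layers.Input(shape=input_shape)])
    for filters in (10, 20, 30):
        model.add(layers.Conv2D(filters, (5, 5), padding="same"))
        model.add(layers.BatchNormalization())
        model.add(layers.Activation("relu"))
        model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    for units in (30, 20, 10):
        model.add(layers.Dense(units, activation="relu"))
    # Softmax over integer counts 0..100 (0..20 in the original paper).
    model.add(layers.Dense(num_classes, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```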

[Figure] The convolutional neural network architecture from Cazamias and Marek. Layer depths were 10, 20, 30, 30, 20, and 10 respectively (with 5 x 5 convolution windows). The figure depicts a single parking space/car as the input; for my problem, however, the entire parking lot image was used.

Technologies

My favorite part of the project was learning about and using cutting-edge technologies. I started off using TensorFlow directly on my laptop. I did use my GPU, but training ate up the machine's resources and took about 10-15 minutes per model. Then I got access to FutureSystems, a cloud computing service associated with our university. There I trained on an NVIDIA Tesla K80, which reduced training time to less than 5 minutes and freed up my machine for other things. This came in very handy when training and retraining for cross-validation. Eventually, a colleague told me about Keras, a library that sits on top of TensorFlow, which I tried and loved. It made building and tinkering with the CNN much easier and more intuitive.

Accuracy

My CNN ultimately achieved an accuracy of about 60%. Given the difference in difficulty between our problems (as reasoned above, my random-guess baseline was 1/100 while theirs was 1/20), it's fair that my accuracy is lower than their 81%. On their task of counting empty spaces, however, I surpassed their result, achieving 84% by applying the same data-preparation technique I used to increase accuracy on my problem.

I tried several things to improve on an initial accuracy of ~55%. The first was using a continuous predictor (regression) instead of a discrete classifier. This produced similar, but slightly worse, results. I also tried using color images instead of grayscale ones (grayscaling is often done to speed up training). This did help a bit, but the results weren't much better. Finally, the most productive of my efforts was manipulating the images before training. I had noticed that some images labeled as having 0 cars actually contained some. Looking at the details of the dataset, I found that only cars within certain coordinates of the parking lots were counted, so I cropped the pictures to bound only those coordinates. I would argue this is fair, because when counting ICs on a circuit board, one would be taking many practically identical pictures and could crop the images to the regions known to contain ICs. This is what ultimately led to my highest accuracy of 60%, and it's how I achieved 84% accuracy on the previous researchers' task, compared to their 81%.
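Here's a sketch of that cropping step. The bounding coordinates below are hypothetical placeholders; the real ones came from the dataset documentation, one region per camera view:

```python
# Sketch of the cropping preprocessing step. The bounding boxes here are
# hypothetical placeholders; the actual coordinates of the counted spaces
# came from the dataset documentation, one region per camera view.
from PIL import Image

# (left, upper, right, lower) pixel bounds enclosing the counted spaces.
CROP_BOXES = {
    "UFPR04": (100, 150, 1100, 650),  # hypothetical values
    "UFPR05": (80, 120, 1150, 700),   # hypothetical values
    "PUCPR":  (60, 100, 1200, 680),   # hypothetical values
}

def crop_to_counted_region(path, camera):
    """Crop an image so only the parking spaces that were labeled remain."""
    with Image.open(path) as img:
        return img.crop(CROP_BOXES[camera])
```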

[Figure] Only cars in the bounded areas were counted. For example, this image was labeled as containing 0 cars even though 2 are clearly present. Cropping to these coordinate bounds therefore removed much of the noise from the other cars. All images came from two parking lots, one of which had two different camera angles; coordinates of the counted spaces were included in the dataset documentation.

Validation

I used ten iterations of Monte Carlo cross-validation to validate these results. The standard deviation among trials was low.
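A sketch of that validation loop, assuming image and label arrays X and y and the build_counting_cnn() sketch from earlier (the epoch and batch-size values are assumptions):

```python
# Sketch of Monte Carlo cross-validation: repeated random splits, retraining
# from scratch each time. Assumes numpy arrays X, y and the
# build_counting_cnn() sketch above; epochs/batch size are assumptions.
import numpy as np
from sklearn.model_selection import train_test_split

def monte_carlo_cv(X, y, trials=10, test_size=0.2):
    accuracies = []
    for trial in range(trials):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=trial)
        model = build_counting_cnn(input_shape=X.shape[1:])
        model.fit(X_tr, y_tr, epochs=20, batch_size=32, verbose=0)
        _, acc = model.evaluate(X_te, y_te, verbose=0)
        accuracies.append(acc)
    return np.mean(accuracies), np.std(accuracies)
```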

With the cropped-photo method detailed above, I graphed the following frequencies of car-count labels and of the trained model's car-count predictions in order to visualize their distributions. I'd say they are fairly similar, similar enough to conclude that the model wasn't just guessing and getting lucky.
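A sketch of how such histograms can be produced, assuming a trained model and held-out arrays X_test and y_test:

```python
# Sketch of how the label vs. prediction distributions can be compared.
# Assumes a trained `model` and held-out arrays X_test, y_test as above.
import matplotlib.pyplot as plt
import numpy as np

def plot_count_distributions(model, X_test, y_test):
    predictions = np.argmax(model.predict(X_test), axis=1)
    bins = np.arange(0, 102)  # one bin per possible count, 0..100
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
    ax1.hist(y_test, bins=bins)
    ax1.set(title="Labels", xlabel="Number of cars", ylabel="Frequency")
    ax2.hist(predictions, bins=bins)
    ax2.set(title="Model predictions", xlabel="Number of cars")
    plt.tight_layout()
    plt.show()
```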

[Figure] Distribution of Number of Cars Across Samples According to Labels: The distribution of the number of cars in the full dataset was skewed toward 0, i.e. many photos had 0 or few cars. Even so, with the much wider classification range (the model predicted from 0 to 100 cars), the probability that a given photo had 0 cars was still less than 25%.
[Figure] Distribution of Number of Cars Across Samples According to Classification by the Trained Model: Here the distribution of predictions resembles the actual distribution. Interesting spikes occur around 72, 90, and 95 cars, where the model was biased toward those counts instead of spreading its predictions out.

There are several things that could be done to improve this model. Since there seems to be high bias, further meta-analysis of the model and its results could determine whether the bias-variance trade-off is biased towards bias (haha). If that is the case, the architecture could be made deeper or the regularization parameters relaxed, since high bias is indicative of underfitting. Another interesting approach would be transfer learning, perhaps from a model trained on ImageNet, which might help the algorithm identify cars.
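A sketch of what that transfer-learning setup might look like in Keras, assuming color input (ImageNet backbones expect 3 channels); the choice of MobileNetV2 and the head sizes are my assumptions, not part of this project:

```python
# Sketch of the transfer-learning idea: reuse frozen ImageNet features and
# train only a small counting head. The backbone choice (MobileNetV2) and
# head sizes are assumptions; any keras.applications model would do.
from tensorflow.keras import layers, models
from tensorflow.keras.applications import MobileNetV2

def build_transfer_model(input_shape=(224, 224, 3), num_classes=101):
    backbone = MobileNetV2(input_shape=input_shape, include_top=False,
                           weights="imagenet")
    backbone.trainable = False  # freeze the pretrained features
    model = models.Sequential([
        backbone,
        layers.GlobalAveragePooling2D(),
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```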

However, if I do choose to proceed with this project, I'll probably try to bring it back to the original problem of counting ICs, perhaps by transfer learning, after some adjustments, from this model to a model trained on an IC/circuit board dataset. (During this research, I helped our lab label ICs in about 1,000 images of circuit boards.) Regardless, it was a fun project, and I'm glad I got to test my machine learning knowledge and learn a lot more in the process.