CMU's AI Summer Scholars Program




In July 2023, I volunteered as a project leader for the AI Summer Scholars Program, a CS Pathways initiative by CMU. The program gives 30 rising high school seniors from backgrounds underrepresented in STEM fields the opportunity to learn about Artificial Intelligence and computing over the course of four weeks. It requires no coding background: students often start as absolute beginners and go on to create their first AI projects (e.g., a simple image classifier for a problem statement of their choice).

The program was designed to teach the students AI/ML fundamentals, followed by hands-on experience in the form of group projects. We kicked off the first week with each of the six project leaders teaching a 90-minute tutorial session on concepts like image processing, convolutions, CNNs and CNN optimization, along with a demo of sample project code to help the students get started. I had a lot of fun preparing my convolutions tutorial, where I taught the convolution operation, pooling, strides and padding, and ended the session with code examples.
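For readers who have not seen these operations before, here is a minimal pure-Python sketch of the ideas from that tutorial. It is not the code I used in the session: the function names `conv2d` and `max_pool2d`, the toy image and the kernel are all made up for illustration, and the kernel is assumed to be square.

```python
def conv2d(image, kernel, stride=1, padding=0):
    """Slide `kernel` over `image` (lists of lists) and return the feature map.

    Like most deep learning libraries, this computes cross-correlation:
    the kernel is not flipped.
    """
    k = len(kernel)  # assumes a square k x k kernel
    width = len(image[0]) + 2 * padding
    # Zero-pad the image on all four sides.
    padded = [[0] * width for _ in range(padding)]
    padded += [[0] * padding + list(row) + [0] * padding for row in image]
    padded += [[0] * width for _ in range(padding)]

    # Output size: (padded size - kernel size) // stride + 1
    out_h = (len(padded) - k) // stride + 1
    out_w = (width - k) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            row.append(sum(
                padded[top + a][left + b] * kernel[a][b]
                for a in range(k) for b in range(k)
            ))
        output.append(row)
    return output


def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 4x4 "image" and a 2x2 diagonal-difference kernel (illustrative values).
image = [
    [1, 2, 3, 0],
    [4, 5, 6, 0],
    [7, 8, 9, 0],
    [0, 0, 0, 0],
]
kernel = [[1, 0], [0, -1]]

features = conv2d(image, kernel)               # 3x3 feature map
strided = conv2d(image, kernel, stride=2)      # 2x2: stride skips positions
padded = conv2d(image, kernel, padding=1)      # 5x5: padding grows the output
pooled = max_pool2d(image, size=2)             # 2x2: each window keeps its max
```

A larger stride shrinks the output, padding grows it, and pooling summarises each window with a single value.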

From the next week, each of us was assigned a group of 5 students, and we started building their very first image classification project! In the first session, I encouraged every member of my team to suggest ideas so that I could get a sense of the topics the team would be interested in. By the end of the session, we had 9 ideas, 4 of them strong and interesting. I asked the students to go back, discuss these ideas and find publicly available datasets for each of them before the next session. Interestingly, by the next session they had all converged on the idea of ‘Air Pollution Classification’ from images. Usually, Air Quality Index (AQI) prediction is a regression problem, and the dataset is a time series of different pollutant concentrations at different times of the day. After some research, my mentees came across one dataset collected from India and Nepal, with images and corresponding annotations of the 6 AQI classification levels.

I wanted my mentees to learn how to load their own data from scratch, so for the next 3 sessions we experimented with various train-val-test splits and with loading the data stored in a shared Google Drive. Anytime we were stuck on something, two of the kids would go back home and get it done by the next morning. Soon it felt like my own project, and I was actively trying to get each person involved and maximise their learning experience.
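To give a flavour of what a split "from scratch" can look like, here is a small sketch of a stratified train-val-test split in plain Python. It is not the team's actual code: the file names, the three labels, the 70/15/15 ratio and the function name `stratified_split` are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(samples, val_frac=0.15, test_frac=0.15, seed=0):
    """Split (path, label) pairs into train/val/test, class by class,
    so every class shows up in each split in similar proportions."""
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append((path, label))

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val, test = [], [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_val = round(len(items) * val_frac)
        n_test = round(len(items) * test_frac)
        val += items[:n_val]
        test += items[n_val:n_val + n_test]
        train += items[n_val + n_test:]
    return train, val, test


# Hypothetical file names and labels: 20 images for each of 3 classes.
samples = [
    (f"images/{label}_{i:03d}.jpg", label)
    for label in ("good", "moderate", "unhealthy")
    for i in range(20)
]
train, val, test = stratified_split(samples)
```

Splitting class by class matters for a dataset like ours, where some AQI levels may have far fewer images than others; a purely random split could leave a rare class out of the validation or test set entirely.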

By the end of the next 7 days, these kids had trained multiple models on 2 different dataset splits, fine-tuned pre-trained MobileNetV3 and ResNet models, and learned the importance of Batch Normalization, Dropout, the initial learning rate and learning rate schedulers.
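As a tiny illustration of how the initial learning rate and a scheduler interact, here is a sketch of a step-decay schedule in plain Python. The function name `step_decay` and all the numbers are assumptions for illustration; this is not necessarily the schedule the team settled on.

```python
def step_decay(initial_lr, epoch, drop=0.5, epochs_per_drop=5):
    """Multiply the learning rate by `drop` once every `epochs_per_drop` epochs."""
    return initial_lr * drop ** (epoch // epochs_per_drop)


# Learning rate for each of 15 epochs, starting from an assumed 1e-3.
schedule = [step_decay(1e-3, epoch) for epoch in range(15)]
```

The idea is to take large steps early, when the model is far from a good solution, and smaller steps later, so that fine-tuning does not undo what the pre-trained network already knows.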

Two highly energetic students kept me occupied with their never-ending questions about loading the model, saving the model, predicting and testing with a pre-trained model, etc., as they learnt the basics of Python and NumPy type conversions and read the Keras documentation for the first time. One quiet kid had immense creative potential: he came up with the project title, ‘PollutAI: detecting pollution, one pixel at a time’, on the spot. He was responsible for hyperparameter tuning. Another student took the initiative to create the project website, and the fifth kid was responsible for designing our slides. The final two days went into discussing the presentation flow (yes, they made a diagram of their own ML pipeline) and thinking about why our model was failing. They were intrigued by the confusion matrices they plotted and had fun working hard until presentation day. I even took them around the CMU campus, and we collected a small sample test image dataset, measuring the actual AQI that day using the EPA AirNow mobile application. We demonstrated these images on our website in the final presentation. Check out their project presentation here!
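For anyone curious about the confusion matrices mentioned above, here is a minimal sketch of how one is built and read. The labels and predictions are invented purely for illustration (they are not the team's results), and the helper names `confusion_matrix` and `per_class_recall` are my own.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Entry [t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_recall(matrix):
    """Fraction of each true class that the model got right."""
    return [
        row[i] / sum(row) if sum(row) else 0.0
        for i, row in enumerate(matrix)
    ]


# Made-up labels for a 3-class toy problem.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 2, 2, 1, 2]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
recall = per_class_recall(cm)
```

The diagonal holds the correct predictions, while the off-diagonal entries show which classes get mistaken for one another. With ordered classes such as AQI levels, mistakes between neighbouring levels tend to sit right next to the diagonal.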

As there was no pre-defined system for mentoring them, every day I would do my homework on how to resolve the current issues and come in with clear next steps in mind. As the mentees in my group came with very different levels of prior coding experience, my biggest challenge was to ensure that everyone felt heard, actively contributed, and maximized their growth through this opportunity. It was a unique experience for me: I learned a lot about myself through this hands-on management, and I was proud of the team as we put in extra hours and late-night Zoom calls before the final presentation.

I saw some of the high school student teams design websites better than I ever have, and I was surprised by their confidence and clarity of thought during the presentations. Their energy was amazing, and I hope these passionate kids get into their dream schools this application season. This was my favourite 50-hour project in a very long time!