Inside the College’s Crash Course in Coding
The world is drowning in data. According to Statista, one of the leading providers of market and consumer data, more than 300 million terabytes of data are being created each day, and 90 percent of the world’s data has been created in the last two years alone. For undergraduates entering the job market in the next few years, that abundance of information could represent a tremendous opportunity, especially for those who have the skills to manage massive amounts of information and to understand what it’s trying to tell us.
Outside of earning a degree in computer science, though, learning the coding skills to manage large data sets has been a challenge for students interested in making use of the fire hose of information being created about our health, our behavior and the world around us. A new course created by Prof. Craig Group of the Department of Physics in UVA’s College and Graduate School of Arts & Sciences is giving students the practical skills they need to analyze and visualize data on a large scale, skills they’ll also need to be competitive in an increasingly data-centric world.
For several years, Group, an associate professor of physics whose research centers on experimental particle physics, taught a 2000-level course that introduced undergraduates to the C programming language and to computational and statistical analysis techniques. However, Group felt the learning curve was too steep for students who were less interested in becoming computer programmers and statisticians and who simply needed a way to solve the kinds of problems they are likely to encounter in a lab or in an office.
“You really couldn’t do much more than learn the basics of C in a one-semester class,” Group said. “And the language doesn’t include higher-level tools that you could apply to problems in physics or whatever kind of science you might do.”
A conversation with his department’s chair led to the creation of a new course called “Introduction to Python for Scientists and Engineers,” a 1000-level course launched in the spring of 2022. The course requires no prior knowledge of coding or physics and can be used to satisfy general-education requirements for students in any discipline. Instead of focusing on languages like C, the course teaches students Python, a popular programming language that’s rapidly becoming the language of choice for coders in the sciences. Much of its appeal comes from its libraries of ready-made modules, packages and other functionality, which allow coders to break large programming tasks into smaller, more manageable subtasks that can be easily combined to create more comprehensive applications. Just as significantly, Python minimizes the amount of code students need to write from scratch.
The course doesn’t teach students everything about becoming a Python programmer, but it teaches them enough to do the kind of work they’re likely to encounter in the lab or in any situation where they’re working with data.
“Python is a much higher-level language with a lot of tools just built in that you can use to do more advanced things,” Group said. “And the syntax is really simple. It doesn’t take you as long to get to the point of being able to do something pretty complex like making really nice plots of data that are publication ready, and you can learn to do that in a matter of hours.”
Group calls the class a bootcamp for data scientists. He spends the first third of the course giving students a fast-paced, hands-on introduction to the Python language and its syntax. In the second third of the semester, students focus on the basic statistics they need to understand what the data is telling them, and Group spends the last third of the class on some of Python’s more advanced features.
“By the end of the class, we're training neural networks and doing some pretty advanced things,” Group said.
Students in Group’s class also learn to use practical and popular collaboration tools like Jupyter Notebook and GitHub. These web-based applications for group-based coding projects are useful job tools whether Group’s students pursue careers in academia or in industry.
The class also teaches students how to solve real-world programming problems in much the same way that a professional programmer would.
Darren Upton, who graduated this spring with an undergraduate degree in physics and will begin a Ph.D. program in nuclear physics in the fall, has worked with Group as a teaching assistant for three semesters. As a T.A., Upton helped Group shape the class into something that’s practical and accessible for students of any background or degree path. For Upton, learning to code is less about becoming fluent in a particular programming language and more about learning to be resourceful.
“The resources for programming are more numerous than you think, but knowing where to look, and knowing how to look for them is the real skill in programming,” Upton said. “You just need to know which questions to ask and where to look.”
Group’s class is also designed to be an active-learning experience where students take a hands-on approach to learning the subject rather than just listening to lectures. Students spend time working on the kinds of problems they’ll see in a lab or in real research settings. They work in groups for much of the semester before getting the opportunity to demonstrate what they can do on their own at the end of the class.
“I think we owe it to our students not to just teach them physics, but to teach them modern skills that are going to make them marketable, to make them ready to go out into the world to do something that's super valuable right now,” Group said. “They can certainly use those skills in physics, but they can also use those skills almost anywhere else.”
Lindsay Grose, who graduated in May with a double major in environmental sciences and statistics and will begin her graduate studies at the University of Rhode Island in the fall, was heavily involved in research as an undergraduate at UVA. For Grose, learning coding opened a new world of opportunities.
“Coding lets me do so many more things in so much less time,” Grose said. “And it allows you to do research that’s never been done before.”
As a Ph.D. student, she’ll study how ocean currents transport heat around the globe and how that influences climate change.
“I can say confidently that I would not have gotten into graduate school for what I want to do without knowing coding,” Grose said. “It definitely makes you a much more desirable candidate.”
Sarah Hunter-Chang, a doctoral candidate in neuroscience, credits her knowledge of coding with her success as a student at UVA. She took a coding class in high school and later realized it might be useful in managing big data sets.
“I think you can definitely have a productive career in science without coding skills,” Hunter-Chang said. “But increasingly, I think you’re going to miss out on opportunities. It opens so many doors.”