Bridging the Gap to Statistical Programming Languages

Kyle Cox
This course serves as a bridge into advanced research courses such as multilevel modeling and structural equation modeling. From the very first time I taught the course, I knew it was imperative to introduce R as the primary software for conducting analyses in the class. Many students enter this class with limited experience in SPSS and Excel, so switching to statistical software with a command-line interface (i.e., programming or coding with syntax) is daunting. The learning curve is steep, and coupling it with the anxiety-inducing topic of statistics can compound the difficulties. Even with these challenges, I continue to emphasize the use of R because it provides immediate access to the most cutting-edge statistical tools. The ability to clean, manage, and analyze data in a programming language is nearly a professional requirement for educational scholars. Completing a graduate degree without some exposure to this skill set would be a disservice to students.

I provide substantial support to help students become familiar and functional in R: initial code files, videos, class time, and office hours are all dedicated to improving students’ ability to conduct statistical analyses in R. Learning a programming language takes this much effort because it truly is a different language. If only there were a tool available to translate common terms into R code.
Over the past year, I began to hear about the ability of ChatGPT, Gemini, and other generative AI tools to generate code in many programming languages. I began experimenting with these tools to help write R code for my own research tasks, including data cleaning, data management, data visualization, and a range of analyses. Compared with sorting through blog posts and discussion boards, generative AI proved to be an incredibly efficient way to help me code in R.
This semester in my Advanced Statistics class, I demonstrated how Gemini can support coding in R. Specifically, I showed students how Gemini could be used to (a) interpret and explain the R code I provided, (b) explain and provide guidance on any errors or warnings students received while coding in R, and (c) generate R code based on specific prompts. These three capabilities help students move from novice users to functional programmers. R is an incredibly powerful tool for educational researchers, but one that has often been inaccessible. With the help of generative AI, students are better equipped to bridge this gap and access the newest and most innovative analytic tools.