My Experience with R
- Abhay Sri
- Mar 14, 2021
As you may or may not know, R is a language used primarily for data science and statistics. It is highly specialized: most of its built-in features and packages are aimed at people who work with data. It's a language I fell in love with, and I first got into it through my non-profit organization, pH2O Analytics.
To create data models for pH2O Analytics, I needed a program to refine and model the data. Originally, I wrote one in Java, the language I'm most familiar with. However, I found Java incredibly inefficient to work with for statistics, especially when the data sets ranged anywhere from 200,000 to 500,000 data points. Finding a simple graphing package in Java also proved to be hard. As a result, I did some research on programming languages that would be useful for refining and modeling data. I came across two main options: Python and R. I already had experience with Python, as I learned it in 9th grade. However, R was new to me. After a little more research, I found that R would be the best fit. It had graphing utilities built in, and the language provided an easy way for me to filter data.
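To give a feel for what I mean by built-in filtering and graphing, here is a minimal sketch in base R. The data and the column names (`site`, `ph`) are made up for illustration and are not the actual pH2O Analytics data set.

```r
# Hypothetical water-quality readings; the columns are invented for this example.
readings <- data.frame(
  site = rep(c("A", "B"), each = 5),
  ph   = c(6.8, 7.1, 7.4, 9.9, 7.0, 6.5, 7.2, 7.3, 2.1, 7.6)
)

# Filter with logical indexing: keep only rows whose pH is in an assumed plausible range.
clean <- readings[readings$ph >= 6 & readings$ph <= 9, ]

# Summarize the mean pH for each site.
site_means <- aggregate(ph ~ site, data = clean, FUN = mean)
print(site_means)

# Base R graphics ship with the language, so no extra package is needed.
boxplot(ph ~ site, data = clean, main = "pH by site (made-up data)", ylab = "pH")
```

The filter is a single line, and the plot is one function call, which is roughly why R felt like a better fit for this kind of work than my Java program.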
The next step was learning R. I knew Codecademy had a course on R, so I went straight to it and learned the basics. The logic was similar to Python and Java, but the syntax was slightly different. Unlike Java, R does not force you to wrap everything in classes and methods; it does support object-oriented programming, but you can get a lot done with plain functions and scripts. Furthermore, one thing I really love about R is that it starts its elements at index 1, unlike most other popular programming languages, which start at index 0. After I finished the Codecademy course, I immediately started working on creating models for pH2O Analytics. I first used read_csv, a function from the readr package, to load the dataset into a data frame, which allowed me to edit the values in R. I also used the popular R package ggplot2, which allowed me to visually represent the data. The rest is, well, history. If you are new to data science and want to get into the field, I highly recommend learning R, and a great place to start is R for Data Science. There is a free online version here: https://r4ds.had.co.nz. You can also check out my code for pH2O here: https://github.com/aabhaysri/ph2oanalytics.
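For anyone curious what the steps above look like in practice, here is a rough sketch. It assumes the readr and ggplot2 packages are installed, and it uses a tiny made-up CSV with invented columns (`site`, `ph`) rather than the real pH2O Analytics data; the actual code in the repository may look quite different.

```r
library(readr)    # provides read_csv()
library(ggplot2)  # plotting

# R vectors are 1-indexed: the first element lives at position 1, not 0.
sites <- c("upstream", "midstream", "downstream")
first_site <- sites[1]

# Write a small made-up CSV to a temporary file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "site,ph",
  "upstream,7.1",
  "midstream,6.8",
  "downstream,7.4"
), csv_path)

# read_csv() returns a tibble, which behaves like a data frame.
water <- read_csv(
  csv_path,
  col_types = cols(site = col_character(), ph = col_double())
)

# Once the data is in a data frame, values can be edited or added directly.
water$acidic <- water$ph < 7

# Build a simple bar chart of pH per site with ggplot2.
p <- ggplot(water, aes(x = site, y = ph)) +
  geom_col() +
  labs(title = "pH by site (made-up data)", x = "Site", y = "pH")
print(p)
```

Swapping `csv_path` for the path to a real file is all it takes to try this on your own data.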