
Cloud Computing & Data Science

  • Writer: Abhay Sri
  • Nov 28, 2020
  • 3 min read

The idea behind cloud computing was first introduced in the 1960s, when Joseph Licklider built it into his work on ARPANET, the forerunner of the internet. However, the term and the field expanded greatly after 2006, when large companies started using "cloud computing" to describe new technologies. Since then, cloud computing has grown into a 371.4 billion dollar industry, and with good reason: it has revolutionized storage, databases, analytics, and the field of data science.



AWS, A Popular Cloud Computing Platform


Not only has cloud computing reduced storage costs, but it has also made data more accessible. Companies no longer need massive amounts of their own hardware, and they can rely on cloud services to access data from anywhere in the world. One great example of cloud computing that has become widespread is Google Drive. With Google Drive, you can share documents and access them from anywhere. In addition, you can collaborate with other people, and all of the edits are saved online. Google Drive has changed the way we create and share documents, and as a student, I can confidently say my education would not be the same without it.


Cloud computing has now become integrated with the field of data science. Virtually every data scientist in today's world needs to know how to work with cloud computing services. Cloud computing provides a means of storage for big data, and data scientists can collaborate on large datasets like never before. As the world consumes more products and services, monumental amounts of data will be produced, and cloud computing helps data scientists and companies manage the data and information that consumers will generate.



Cloud computing can also be used to store and analyze trends in ocean acidification. In fact, the datasets pH2O Analytics uses are likely stored on a cloud platform themselves. Monitoring stations around the world generate data points every second, and cloud computing can greatly simplify the storage and accessibility of these datasets. Traditionally, the data would be stored on a server or a local machine. Someone would therefore have to access the data manually before it became available, which would be tedious and time-consuming. Furthermore, if someone wanted to apply a machine learning model or framework to the data, the limited processing power of the local machine could make the job take months or even years. In addition, the data being generated by the monitoring stations in real time would not be processed instantly, delaying the availability of datasets. With cloud computing and storage, however, most if not all of these problems disappear. If cloud servers are used, the data generated by monitoring stations can be processed almost instantaneously, and data retrieval becomes simpler. In addition, if credentials are shared, data scientists from across the world can view the datasets generated by the monitoring stations and work together to find information, connections, and trends. As a result, I am really excited about cloud computing and its potential impact on data science. I know that the field, along with data science, will only continue to grow and create insights in the future.




EDIT:


The pandemic has caused many companies to shift over to the cloud so that their employees can work remotely. Read more about its advantages here.

 
 