In this post, I will take you through a programming language that keeps gaining attention: R. Most of us already know a few programming languages, so why should R be the next one you learn? How does it differ from other languages? Come on, let’s have a look.
R is a fascinating programming language. That’s partly because the language has grown significantly in popularity; it’s now used in a range of professions including software development, business analysis, statistical reporting and scientific research. Additionally, as the amount of data-intensive work increases, the demand for tools like R for processing, data-mining and visualization will also increase.
- R is not just a statistics package, it’s a language.
- R is designed to operate the way that problems are thought about.
- R is both flexible and powerful.
The Importance of Being a Language
- Because R is a language, you can create new forms of regression (and many people have).
- R also allows you to easily perform the same sort of standard regression on your 5 datasets (or maybe it is 500 datasets).
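To make the second point concrete, here is a minimal sketch of fitting the same regression to several datasets. The datasets are simulated purely for illustration, and the names `datasets`, `fits` and `slopes` are my own. The functions `lm`, `lapply` and `sapply` ship with a standard R installation.

```r
# Hypothetical example: three small simulated datasets (not from any real study).
set.seed(42)
datasets <- lapply(1:3, function(i) {
  x <- runif(20)
  data.frame(x = x, y = 2 * x + rnorm(20, sd = 0.1))
})

# Fit the same linear model to every dataset.
fits <- lapply(datasets, function(d) lm(y ~ x, data = d))

# Collect the estimated slope from each fit.
slopes <- sapply(fits, function(fit) coef(fit)[["x"]])
print(slopes)
```

Going from 3 datasets to 500 only means changing `1:3` to `1:500` (or reading 500 files into the list instead); the fitting code stays the same.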
R in business
R originated as an open-source version of the S programming language in the 90s. Since then, it has gained the support of a number of companies, most notably RStudio and Revolution Analytics, which created tools, packages, and services related to the language. But it isn’t limited to these more specialized companies; R also has support from large companies that power some of the largest relational databases in the world. Oracle, for one, has incorporated R into its offerings. Earlier this year Microsoft acquired Revolution Analytics and is including the language in SQL Server 2016. SQL Server administrators and .NET developers now have R at their fingertips, installed with their standard platform tools.
R in higher education
Here’s a fun fact: R originated in academia. Ross Ihaka and Robert Gentleman at the University of Auckland in New Zealand created it, and it’s been widely adopted in graduate programs that include intensive statistical study. R has also been used in massive open online courses (MOOCs) such as the Coursera Data Science Program and in courses here at Pluralsight (including my own on R and RStudio). Folks taking graduate studies that involve crunching data are bound to encounter R, and like many other technologies, its introduction in schools leads naturally to its wider adoption in industry. R’s presence in higher education is confirmation of the demand for these skills in business settings.
R is profitable
Technology is fun, sure, but most of us who enjoy it also do it for a living. Fortunately, R is not only a pleasure to use; the demand for it in business often translates into higher salaries for its practitioners. The Dice Technology Salary Survey conducted last year ranked R among the highest-paying skills. The most recent O’Reilly Data Science Salary Survey also includes R among the skills used by the highest-paid data scientists.
R has a diverse community
The R community is diverse, with many individuals coming from unique professional backgrounds. This list includes academics, scientists, statisticians, business analysts and professional programmers, among others. CRAN, the Comprehensive R Archive Network, maintains packages created by community members that reflect this colorful background. Packages exist to perform stock market analysis, create maps, engage in high-throughput genomic analysis and do natural language processing. This is only the tip of the iceberg; over 7,000 packages are available on CRAN as of this writing. Additionally, R-Bloggers is a blog-aggregation site that serves as a hub for news related to the R community.
R is fun
R is FUN! Initially, I was drawn to R for its ability to generate charts and plots in very few lines of code. Tasks that would require several hundred lines of code in another language could be accomplished in only a few. While it’s considered quirky when you compare it with many popular languages, it includes powerful features specifically geared toward data analysis.
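As a small illustration of the "few lines" point, here is a sketch that draws a scatter plot with a trend line. It uses `mtcars`, an example dataset bundled with R; the choice of columns and the axis labels are my own.

```r
# mtcars is an example dataset bundled with R.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "Fuel economy vs. weight")

# Overlay a least-squares trend line on the existing plot.
fit <- lm(mpg ~ wt, data = mtcars)
abline(fit, col = "red")
```

In an interactive session the chart appears in a plot window; when the script is run non-interactively, R writes it to a file instead.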
Why R for data analysis
- Interactive language
- Data structures
- Missing values
- Functions as first-class objects
R has a fantastic mechanism for creating data structures. Real data have missing values, so missing values are an integral part of the R language, and many functions have arguments that control how they are handled. R also has a package system that makes it extremely easy for people to add their own functionality so that it is indistinguishable from the central part of R. The R community is very strong and quite committed to improving data analysis.
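Here is a minimal sketch of three of these points together: a data structure, a missing value, and functions treated as ordinary objects. The data and the names `scores`, `summaries` and `results` are made up for illustration. `NA` and the `na.rm` argument are standard parts of R.

```r
# A small data frame with one missing value (hypothetical data).
scores <- data.frame(
  name  = c("a", "b", "c"),
  score = c(10, NA, 30)
)

mean(scores$score)                # NA: the missing value propagates
mean(scores$score, na.rm = TRUE)  # 20: the missing value is dropped first

# Functions are ordinary objects: store them in a list and pass them around.
summaries <- list(mean = mean, max = max)
results <- sapply(summaries, function(f) f(scores$score, na.rm = TRUE))
print(results)
```

Because R is interactive, each of these lines can also be typed at the console one at a time to inspect the result before moving on.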
COMPARISON BETWEEN R AND PYTHON
R: R is the open-source counterpart of SAS, which has traditionally been used in academia and research. Because of its open-source nature, the latest techniques get released quickly. There is a lot of documentation available on the internet, and it is a very cost-effective option.
Python: Originating as an open-source scripting language, Python has grown in usage over time. Today, it sports libraries (numpy, scipy and matplotlib) and functions for almost any statistical operation or model building you may want to do. Since the introduction of pandas, it has become very strong in operations on structured data.
The tools can be compared on the following attributes:
- Availability / Cost
- Ease of learning
- Data handling capabilities
- Graphical capabilities
- Advancements in the tool
- Job scenario
- Customer service support & community
Other factors: Here are some more points worth noting:
- Python is used widely in web development. So if you are in an online business, using Python for development and analytics can provide synergies.
- SAS used to have a big advantage in deploying end-to-end infrastructure (Visual Analytics, Data warehouse, Data quality, reporting and analytics), which has been mitigated by the integration and support of R on platforms like SAP HANA and Tableau. R is still far from the seamless integration SAS offers, but the journey has started.
- If you are a fresher entering the analytics industry (especially in India), I would recommend learning SAS as your first language. It is easy to learn and holds the highest job market share.
- If you are someone who has already spent time in the industry, you should try to diversify your expertise by learning a new tool.
- Experts and pros in the industry should know at least two of these tools. That would add a lot of flexibility for the future and open up new opportunities.
- If you are in a start-up or freelancing, R / Python is more useful.