My first experience with R

Hi everybody,

I am Sachin Kumar K M, working as an Associate Software Engineer.

It was my first day at the workplace after four months of enjoyable training in the Business Intelligence stream. I was assigned to proof-of-concept work along with three of my fellow trainees. I introduced myself to my new colleagues, who seemed very serious about their work. We were informed that we would be working with R.

RStudio is an IDE for R that provides a powerful and productive user interface. I installed the latest version of R from the CRAN R project website, along with RStudio (which is open source), on the system allocated to me. It was all Greek to me when I started experimenting with R. I went through various videos on YouTube to get familiar with the options available in the tool.

I tried the Try R website to see how R works. It is an interactive tutorial, and through it I realized that R is a language and an environment for statistical computing and graphics.

What amazed me was the number of packages backing it. R has more than 5000 packages, each meant to serve a different purpose. I spent my initial days at work with my headphones on, as I believed YouTube was one of the best places to start learning. Meanwhile, I grew curious about word clouds.

A word cloud is an eye-catching way of representing and analyzing a text document. I am an avid reader of newspapers, and I had been seeing political parties and business enterprises use word clouds in their newspaper advertisements. I found an article that included code to generate a word cloud in R.


A sample word cloud

Yes, it worked. As soon as I executed the code, there was a beautiful, ordered word cloud on my screen. I was elated. It reminded me of math classes in my school days, when solving a complex problem correctly used to give me immense satisfaction.

As days passed, I learned that there are a number of online communities, forums and blogs that support R users in their journey. I signed up for active communities such as R-bloggers, Stack Overflow, the R mailing lists, Revolution Analytics, statistics blogs and many more.

They helped enhance my data science knowledge. I started reading the latest posts regularly. I had many questions about a career in data science on my mind, and I found answers to them by referring to blogs. My first week at work was full of curiosity, learning and experimenting with RStudio.


    # Install the required packages (only needed once)
    install.packages("tm")            # for text mining
    install.packages("SnowballC")     # for text stemming
    install.packages("wordcloud")     # word-cloud generator
    install.packages("RColorBrewer")  # color palettes

    # Load the packages
    library(tm)
    library(SnowballC)
    library(wordcloud)
    library(RColorBrewer)

    # Read the text file
    text <- readLines("POC.txt")
    # Load the text as a corpus
    docs <- Corpus(VectorSource(text))
    # Replace special characters with a space
    toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
    docs <- tm_map(docs, toSpace, "/")
    docs <- tm_map(docs, toSpace, "@")
    docs <- tm_map(docs, toSpace, "\\|")
    # Convert the text to lower case
    docs <- tm_map(docs, content_transformer(tolower))
    # Remove numbers
    docs <- tm_map(docs, removeNumbers)
    # Remove common English stop words
    docs <- tm_map(docs, removeWords, stopwords("english"))
    # Remove your own stop words
    # (specify your stop words as a character vector)
    docs <- tm_map(docs, removeWords, c("blabla1", "blabla2", "may", "viabl", "page", "often", "also"))
    # Remove punctuation
    docs <- tm_map(docs, removePunctuation)
    # Eliminate extra white space
    docs <- tm_map(docs, stripWhitespace)
    # Text stemming
    docs <- tm_map(docs, stemDocument)

    # Build the term-document matrix and count how often each term occurs
    dtm <- TermDocumentMatrix(docs)
    m <- as.matrix(dtm)
    v <- sort(rowSums(m), decreasing = TRUE)
    d <- data.frame(word = names(v), freq = v)
    head(d, 10)

    # Draw the word cloud
    wordcloud(words = d$word, freq = d$freq, min.freq = 2,
              max.words = 200, random.order = FALSE, random.color = TRUE,
              scale = c(3, 0.3), rot.per = 0.15,
              colors = brewer.pal(8, "Dark2"))
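
To get a feel for what the frequency table `d` holds before it is handed to `wordcloud()`, here is a minimal sketch of the same counting idea in base R, with no packages needed. The sample sentences and the names `sample_text`, `words`, `freq` and `d_sketch` are made up for illustration. The sketch skips stemming and stop-word removal, so it only approximates what the tm steps above do.

```r
# Hypothetical sample text (an assumption, not the contents of POC.txt)
sample_text <- c("R is a language for statistics",
                 "R is also an environment for graphics")

# Lower-case the text and split it on anything that is not a letter
words <- unlist(strsplit(tolower(sample_text), "[^a-z]+"))
words <- words[nchar(words) > 0]  # drop empty strings left by the split

# Count each word and sort from most to least frequent
freq <- sort(table(words), decreasing = TRUE)

# Same shape as the data frame passed to wordcloud(): one row per word
d_sketch <- data.frame(word = names(freq), freq = as.vector(freq))
head(d_sketch, 5)
```

The words that appear in both sentences end up at the top of `d_sketch`, and those are the ones a word cloud would draw largest.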

