Take note, it’s time to take notes

Here is a great post from Richard Branson that I wanted to share with you…

In his experience, 99 per cent of people in leadership roles don’t take notes. What’s more, males are less likely to take notes than their female counterparts.

 

He recently met with 30 chief executives for a dinner-table conversation about closing the gender gap. They discussed how men can counteract bias in the workplace by speaking up and championing their female colleagues. It was a wonderfully eye-opening discussion, full of valuable insights; yet he was the only person who took notes the entire time – “and boy did I take notes, I ran out of white space and had to write over my notes, my hotel notepad, my report and even my name tag!”, he said.

Sheryl Sandberg, Facebook’s chief operating officer and the founder of LeanIn.Org, expressed her surprise over his note-taking in this excellent piece about women in the workplace co-written with Adam Grant. Sheryl is putting together a series on the subject on LinkedIn too, and he’s looking forward to getting involved.

Back in their meeting, conversation came around to the subject of women being more likely to be note takers in meetings, because there is an unfair expectation on them to do support work. In other words, as a society, we expect the office housework to fall to a woman.

Not only is this unfair to women, but it’s also disadvantageous to men. It’s time for men to step up and do their share of support work. On top of counteracting gender bias in the workforce, it will also give men a better understanding of what is going on within the business and what needs to be done to make things run more effectively. Mentoring, training and note taking – these are wonderful development areas, which everyone, men and women alike, can greatly benefit from.

Note taking is one of his favorite pastimes. As he put it, “I can’t tell you where I’d be if I hadn’t had a pen on hand to write down my ideas (or more importantly, other people’s) as soon as they came to me.” Some of Virgin’s most successful companies have been born from random moments – if they hadn’t opened their notebooks, they would never have happened.

No matter how big, small, simple or complex an idea is, get it in writing. But don’t just take notes for the sake of taking notes; go through your ideas and turn them into actionable and measurable goals. If you don’t write your ideas down, they could leave your head before you even leave the room.

To counteract the gender bias, men shouldn’t take over the note taking from women; everyone should be taking notes!

Miller-Rabin Primality Test

Math ∩ Programming

Problem: Determine if a number is prime, with an acceptably small error rate.

Solution: (in Python)

Discussion: This algorithm is known as the Miller-Rabin primality test, and it was a very important breakthrough in the study of probabilistic algorithms.

Efficiently testing whether a number is prime is a crucial problem in cryptography, because the security of many cryptosystems depends on the use of large randomly chosen primes. Indeed, we’ve seen one on this blog already which is in widespread use: RSA. Randomized algorithms also have quite useful applications in general, because a solution which is correct with probability, say, $latex 1 - 2^{-100}$ is often good enough in practice.
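The listing from the original post is not reproduced in this excerpt. As a stand-in, here is a minimal sketch of the Miller-Rabin test; it is not the original author's code, and the function name `is_probable_prime` and the `rounds` and `rng` parameters are my own assumptions. A composite number survives a single round with probability at most 1/4, so 50 independent rounds push the error bound down to 4^-50 = 2^-100.

```python
import random


def is_probable_prime(n, rounds=50, rng=None):
    """Miller-Rabin test (illustrative sketch).

    Returns False if n is certainly composite, True if n is probably prime.
    For a composite n, the chance of wrongly returning True is at most
    4 ** -rounds.
    """
    rng = rng or random.Random()
    if n < 2:
        return False
    # Settle small primes and their multiples by trial division.
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as 2**s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)  # random base in [2, n - 2]
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue  # this base tells us nothing; try another
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # the base a is a witness that n is composite
    return True
```

For example, `is_probable_prime(2**61 - 1)` returns `True` (that number is a Mersenne prime), while `is_probable_prime(561)` returns `False` (561 = 3 × 11 × 17).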

But from a theoretical and historical perspective, primality testing lay at the center of a huge problem in complexity theory. In particular, it is unknown whether algorithms which have access to randomness and can output probably correct answers are more…


Beginning Data Science with R by Manas A. Pathak, Springer

Why we should start learning R to get into data science…

Compudicted

Beginning Data Science with R

Continuing on with the Springer series on Computational Intelligence and Complexity, I picked another book on the ever more popular R.

Besides, I had already read several books from other publishers in 2014. The books aimed at different levels, and at people from different professional backgrounds. As a data practitioner positioned rather far from being a data scientist, sitting closer to the server side with periodic ETL or Business Intelligence development tasks at hand, I started to realize the times have changed: each new project requires new depth and breadth of data analysis. Using Excel and its data add-ons no longer cuts it. I was aware of tools such as MATLAB, SAS and SPSS, but boy, they cost!

I was always in love with data, linear and discrete algebra, and statistics in general, so for me R was the natural choice. Learning tools such as R (not just a…


Using GitHub with R and RStudio

This is an amazing post from the Molecular Ecologist blog. I loved it and would like to share it with you…

A few weeks back, the Molecular Ecologist released an article about GitHub and also created an organization where you can fork or simply download code shared by the Molecular Ecology community. A few of you out there may still be skeptical about the benefits of using GitHub. Or you may find it confusing and not worth the bother. You may be thinking to yourself (well, at least, I was guilty of this) that all of your code is backed up on Dropbox, Google Drive, and three external hard drives – so what could possibly go wrong? The short answer is: lots! The longer answer is that there really are some tremendous advantages associated with using Git and GitHub that may not be immediately apparent.

Git is a version control system and allows you to save copies of your code throughout the entire developmental process. Git isn’t the only version control system out there (e.g., SVN), but it is one of the more popular implementations. GitHub allows you to push your code from your local workspace to be hosted online. GitHub, which seamlessly integrates with Git, allows you to 1.) keep copies of all of your code through time, 2.) compare code from various points in time (very useful for debugging), 3.) collaborate with people on the same project in a non-chaos inducing fashion, and 4.) keep copies of your code both locally and online (note that you should still officially back up all of your work). Still not convinced? I suggest you google ‘why should I use version control?’

Below, I show how to use GitHub with RStudio and also show that it is equally easy to use GitHub with any simple file of code. Thus, the take home message for the day is ‘GitHub is easy and you should use it.’

RStudio is an excellent integrated development environment built specifically for R. It also integrates version control with Git and SVN. Below I outline the simple steps to get RStudio working with GitHub.

  1. Set up a GitHub account here.
  2. Download and install RStudio.
  3. Download and install the platform-specific version of Git (not GitHub); the default options work well.
  4. Configure Git with global commands. I have found this step necessary both times I ran through this process. Open up the bash version of Git and type the following:
         git config --global user.name "your GitHub account name"
         git config --global user.email "GitHubEmail@something.com"
  5. Open RStudio and set the path to the Git executable. Go to Tools > Options > Git/SVN.
     Screenshot 2013-11-12 09.53.56 - Copy

It is important that you find your Git executable (git.exe on Windows, as shown above). This may be located in any number of places depending on your operating system, but the location of your Git install is a good first place to look.

Restart RStudio and that is all there is to it! There are some simple guidelines at the RStudio website, which may be helpful. Now that you have successfully installed everything, let’s run through a quick example. There are four terms associated with Git that you must learn: repository, commit, push, and pull. A repository is the location and name for all the files associated with a particular project. The first step is to log into your GitHub account and create a new repository. Make sure you check the box ‘Initialize this repository with a README.’ When you are done, you should be able to view the repository like below:

Screenshot 2013-11-12 09.36.42 - Copy

Notice the box highlighted in red. That box is really important – remember it as the ‘red box’. Now, open RStudio and go to Project > Create Project > Version Control > Git and you should see a screen like below:

Screenshot 2013-11-12 09.37.08 - Copy

In the Repository URL box, you should copy and paste the URL indicated in the ‘red box’ above. This is how RStudio knows what repository to use and associates it with your new project files. In this box you can also set the project directory. Now do some work in your new R project and create and save some files. The next step is to ‘commit’ your work – essentially making a copy of all of your script files (i.e., .R files) associated with the R project. To do this go to Tools > Version Control > Commit. This brings up the following window:

Untitled

Here you can see that I have saved two files, test1 and test2. Now I simply check the files that I want to commit and press the ‘commit’ button, highlighted with the green box. If I want to also move these files onto the GitHub servers, I will click on the red box, marked ‘push’. Look at your repository online to double check that your files actually made it there. That is pretty much all there is to it. You can also use the ‘git’ box in the top right-hand corner of RStudio to make commits or use the various keyboard shortcuts. One feature that I think would be useful is for a commit to be made every time you save a file. I haven’t figured out how to do this, so please post a comment if you know how – or if you think that this would actually be a bad idea in practice.

What if you decide that RStudio isn’t for you because you can’t live without Notepad++ or Sublime Text? No worries – GitHub is super easy to use on Mac or Windows (and, of course, Linux, but you probably already knew that). Simply download GitHub for Windows or GitHub for Mac.

Follow the installation directions.  Create a few files and use the GUI to commit and push your files (see screenshot below) – it couldn’t be easier!

Screenshot 2013-11-12 11.50.53

One advantage that I find to using RStudio is that everything is integrated, so it really takes no time at all to commit my R code and push it on to GitHub.  This extra convenience means that I make more frequent commits.  Remember that it is a good idea to commit and push often.  Well that’s about it.  Please feel free to contribute and pull from the Molecular Ecologist’s repositories – this resource will only get better as more people use it. Also, please add any tricks or tips to the comments below!

The R Markdown Cheat Sheet

RStudio Blog

R Markdown is a framework for writing versatile, reproducible reports from R. With R Markdown, you write a simple plain text report and then render it to create polished output. You can:

  1. Transform your file into a pdf, html, or Microsoft Word document—even a slideshow—at the click of a button.
  2. Embed R code into your report. When you render the file, R will run the code and insert its results into your report. Use this feature to add graphs and tables to your report: if your data ever changes, you can update your figures by re-rendering the report.
  3. Make interactive documents and slideshows. Your report becomes interactive when you embed Shiny code.

We’ve created a cheat sheet to help you master R Markdown. Download your copy here. You can also learn more about R Markdown at rmarkdown.rstudio.com and Introduction to R Markdown.


Should you teach Python or R for data science?

Here is an amazing post from R-bloggers that I thought would be worth thinking about…

Comparing Python and R in terms of their use in data science is very hard. I personally love to work in Python, but R has some amazing features that make it faster than Python for solving some problems. Let’s compare them from different perspectives.

Is familiarity important?

If you have some programming experience, learning Python may be the better choice, because its syntax is somewhat similar to other languages (not by much, but more so than R’s). R has very odd syntax rules that will give you cramps most of the time, but once you get used to it, I assure you, you will love it. If you don’t have programming experience and want to start from the very beginning, R would be good, because once you get used to the syntax, you can transfer your knowledge to other high-level programming languages. Lastly, as a Python lover, I would like to point out that Python is the best programming language for beginners; most Ivy League schools have changed their introductory language to Python…

Which area do you want to work in? (Academia or industry)

In academia, R is more widely used than Python because of its roots in statistics. But in industry, productivity is very important, and Python gets the job done faster.

 

Machine Learning or Statistical Learning?

The line between these two terms is blurry, but machine learning is concerned primarily with predictive accuracy over model interpretability, whereas statistical learning places a greater priority on interpretability and statistical inference. scikit-learn, by far the most popular machine learning package for Python, is more concerned with predictive accuracy. Thus, R is probably the better choice if you are teaching statistical learning, though Python also has a nice package for statistical modeling (Statsmodels) that duplicates some of R’s functionality.

Are you looking for your language to look sexy?

If so, R is not a very sexy language. As I said above, the syntax of R is weird and looks creepy. It feels old, and its website looks like it was created around the time the web was invented. Python is the “new kid” on the data science block, and has far more sex appeal. From a marketing perspective, Python may be the better choice simply because it will attract more students.

And more information…

Installing both languages is quite simple.

Installing R is a simple process, and installing RStudio (the de facto IDE for R) is just as easy. Installing new packages or upgrading existing packages from CRAN (R’s package management system) is a trivial process within RStudio, and even installing packages hosted on GitHub is a simple process thanks to the devtools package.

By comparison, Python itself may be easy to install, but installing individual Python packages can be much more challenging. In my classroom, we encourage students to use the Anaconda distribution of Python, which includes nearly every Python package we use in the course and has a package management system similar to CRAN. However, Anaconda installation and configuration problems are still common in my classroom, whereas these problems were much more rare when using R and RStudio. As such, R may be the better choice if your students are not computer savvy.

Data cleaning (also known as “data munging”) is the process of transforming your raw data into a more meaningful form. I find data cleaning to be easier in Python because of its rich set of data structures, as well as its far superior implementation of regular expressions (which are often necessary for cleaning text).
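To make the point about regular expressions concrete, here is a small sketch of text cleaning with Python's standard `re` module. The helper names `clean_record` and `extract_amount`, and the sample strings, are made up for illustration and do not come from the original post.

```python
import re


def clean_record(raw):
    """Normalize a messy text field (hypothetical helper)."""
    text = raw.strip().lower()
    text = re.sub(r"\s+", " ", text)        # collapse runs of whitespace
    text = re.sub(r"[^a-z0-9 ]", "", text)  # drop punctuation and symbols
    return text


def extract_amount(raw):
    """Pull the first number out of a string, or None if there is none."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if match is None:
        return None
    # Remove thousands separators before converting.
    return float(match.group().replace(",", ""))


print(clean_record("  Hello,\tWORLD!! "))      # hello world
print(extract_amount("Total: USD 1,234.50"))   # 1234.5
```

Real data usually needs rules tailored to each column, so treat these patterns as a starting point rather than a general-purpose cleaner.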

The pandas package in Python is an extremely powerful tool for data exploration, though its power and flexibility can also make it challenging to learn. R’s dplyr is more limited in its capabilities than pandas (by design), though I find that its more focused approach makes it easier to figure out how to accomplish a given task. As well, dplyr’s syntax is more readable and thus is easier for me to remember. Although it’s not a clear differentiator, I would consider R a slightly easier environment for getting started in data exploration due to the ease of learning dplyr.

R’s ggplot2 is an excellent package for data visualization. Once you understand its core principles (its “grammar of graphics”), it feels like the most natural way to build your plots, and it becomes easy to produce sophisticated and attractive plots. Matplotlib is the de facto standard for scientific plotting in Python, but I find it tedious both to learn and to use. Alternatives like Seaborn and pandas plotting still require you to know some Matplotlib, and the alternative that I find most promising (ggplot for Python) is still early in development. Therefore, I consider R the better choice for data visualization.

We chose it because we deal with huge amounts of data. Besides, it sounds really cool.

Larry Page, founder of Google

I hope you love the topic… Leave a comment below and share your thoughts with me; I will be pleased to hear them.

10 Productivity-Killing Habits

Being productive isn’t easy, regardless of how badly you’d like to be and how hard you think you’re willing to work. But increasing your output at work and in life is a much more attainable goal if you’re not sabotaging yourself with bad habits.

Here are 10 things you should stop doing right now:

1. Impulsive web browsing: Since most of us work with access to the internet, it’s easy to get side-tracked looking up the answer to a random question that just popped into your head.

That’s why Quora user Suresh Rathinam recommends writing down these thoughts or questions on a notepad. This way, you can look up the information you want later, when you’re not trying to get work done.

2. Moral licensing: Whether it’s a new diet, workout routine, or work schedule, one of the most difficult things about forming a new habit is the urge to cheat as a reward for sticking to a routine for a while. This idea that we “deserve” to splurge on a fancy meal after being thrifty for a week is called “moral licensing,” and it undermines a lot of people’s plans for self-improvement.

Instead, try making your goal part of your identity, such that you think of yourself as the kind of person who saves money or works out regularly, rather than as someone who is working against their own will to do something new.

3. Putting off your most important work until later in the day: People often start off their day by completing easy tasks to get themselves rolling and leave their more difficult work for later. This is a bad idea, and one that frequently leads to the important work not getting done at all.

As researchers have found, people have a limited amount of willpower that decreases throughout the day. That being the case, it’s best to get your hardest, most important tasks done at the beginning of the day.

4. Taking too many meetings: Nothing disrupts the flow of productivity like an unnecessary meeting. And with tools like email, instant messenger, and video chat at your fingertips, it’s best to only use meetings for introductions and serious discussions that can only be held in person.

BlueGrace Logistics founder Bobby Harris recommends that people don’t accept a meeting unless the person who requested it has put forth a clear agenda and stated exactly how much time they will need. And even then, Harris recommends giving the person half of the time they initially requested.

5. Multi-tasking: While many people believe they are great at doing two things at once, scientific research has found that just 2% of the population is capable of effectively multi-tasking.

For the rest of us, multi-tasking is a bad habit that decreases our attention spans and makes us less productive in the long run.

6. Hitting the snooze button: It might feel like pressing the snooze button in the morning gives you a little bit of extra rest to start your day, but the truth is that it does more harm than good.

That’s because when you first wake up, your endocrine system begins to release alertness hormones to get you ready for the day. By going back to sleep, you’re slowing down this process. Plus, nine minutes doesn’t give your body time to get the restorative, deep sleep it needs.

7. Failing to prioritize: It’s only natural for people to hedge against failure by keeping their options open and trying to pursue a bunch of different goals simultaneously. Take, for instance, the person who is five years into a career in marketing, but preparing themselves for law school just in case. Unfortunately, this sort of wavering can be extremely unproductive.

Warren Buffett has the perfect antidote. Seeing that his personal pilot was not accomplishing his life goals, Buffett asked him to make a list of 25 things he wanted to get done before he died. But rather than taking little steps toward completing every one of them, Buffett advised the pilot to pick five things he thought were most important and ignore the rest.

8. Over-planning: Many ambitious and organized people try to maximize their productivity by meticulously planning out every hour of their day. Unfortunately, things don’t always go as planned, and a sick child or unexpected assignment can throw a wrench into their entire day.

Instead, you might want to try planning just four or five hours of real work each day; that way you’re able to be flexible later on.

9. Under-planning: With that being said, you should take time to strategize before attempting to achieve any long-term goal. Trying to come up with the endgame of a project you’re doing midway through the process can be extremely frustrating and waste a huge amount of time.

Harvard lecturer Dr. Robert Pozen recommends that you first determine what you want your final outcome to be, then lay out a series of steps for yourself. Once you’re halfway through, you can review your work to make sure you’re on track and adjust accordingly.

10. Keeping your phone next to your bed: The LED screens of our smartphones, tablets, and laptops give off what is called blue light, which studies have shown can damage vision and suppress production of melatonin, a hormone that helps regulate the sleep cycle.

Research also suggests that people with lower melatonin levels are more prone to depression.

I hope you love the topic… Leave a comment below and share your thoughts with me; I will be pleased to hear them.
Productivity is being able to do things that you were never able to do before
Franz Kafka