Shantel A. Martinez


Wheat Molecular Genetics | Preharvest Sprouting


R, Data, & Viz

Tricks of the trade. A compilation of topics I found to be useful while coding in R. Thought I would share them with you all.


Plant Breeding & Genetics

Heritability: Are we still talking about how to calculate heritability? Yes… yes we are. It’s been a topic in our office for what feels like a month, so I wrote up a small summary & how to calculate H2 in R.

QTL Analysis: I recently published an article that included QTL analysis; follow the entire workflow from data to the published figures.

GWAS in GAPIT: Determining which allele is “favorable” for your trait based on the + or - effect in GAPIT.

Genomic Prediction: The most recent topic I’m learning, so you can see my work in progress as I learn too.


Data Visualization

The Basics

How do I even read that graph? The Data Visualisation Catalogue provides a great breakdown of different plot types.

Data Visualization Chapter 5 is where I learned what those pretty correlation table/plots were called: correlogram! Now I can google how-to’s so I can make one. A lot of great basics in this book.

Want some figure inspiration for your R plots?

#TidyTuesday hashtag on Twitter. TidyTuesday has people coding on the same topic every week. You can see how different graphics can be made from the same dataset, and often people share the scripts that created the graph. Engaging in #TidyTuesday expands my creativity and improves my technical skills. Here is an example of a #TidyTuesday I did for Week 20.

tidyverse and dataviz are two other hashtags on Twitter I keep an eye on, in addition to following Dr. Cédric Scherer, whose data viz shares are always inspiring.

Data Visualization Society: Join the society and their Slack workspace. They have critique, inspiration, and showcase channels that I obsessively read to see how others create stories from data.

Nightingale has really good data viz blog posts.

Favorite blog posts

What to consider when creating tables: Dos & don’ts of table design.
What to consider when choosing colors for data visualization: A ton of ‘not ideal’ and ‘better’ graphic comparisons.


Resources

Computing Resources includes topics such as electronic notebooks, coding shortcuts and links to cheatsheets.

Data Management:

No more excuses for non-reproducible methods. If I ever want to publish raw data files and the scripts that organize and analyze the data, all the way down to producing the final figures, I know from personal experience that I need to be extremely organized, and this article is a good start. I have great admiration for scientists who are very transparent, and I dream of getting to the point of publishing a GitHub repo with every raw data file and a ‘clean’ script for the public to follow.

Also… can we really avoid Excel, Google Sheets, and CSV files? No. But we can make our lives easier downstream with these tips:

How to prepare your data for analysis and charting in Excel & Google Sheets: Clean up data to prepare it for further analysis.

Data organization for Spreadsheets: 13 basics by Karl W. Broman and Kara H. Woo

Tips on how to improve your R code:

This is where I started. But most of the new tricks I learn come from reading Stack Overflow posts when I am trying to find a coding solution. Twitter's #Rstat and #tidytuesday hashtags also give me a lot of articles on how to tidy code up or run things a different way.
Google style suggestions
Collaborating reproducibly is a talk by Karl Broman
Karl Broman: Blog posts on everything coding, science, and reproducibility


HOME v2019.12.08