Welcome

This personal blog is where I publish articles on science, data science, sports, and bioinformatics. Feel free to contact me via my socials. Thanks for your interest, Sciathlon

How to install the Robitools

Introduction: The OBITools are a set of tools meant to format, edit, filter and analyze environmental DNA (eDNA) next-generation sequencing (NGS) data on the command line using bash. They were developed at the Laboratoire d’Ecologie Alpine (LECA), in Grenoble, France. There are many other pipelines for metabarcoding data analysis, but if you choose these, they are fairly easy to get a handle on, as long as you can format your files for the OBITools (a sometimes non-trivial task, I know…). [Read More]

Anaconda packages and environments for bioinformatics: a tutorial

After abandoning this blog for several years, I am back with different content. I want to write tutorials on what I’ve learned doing bioinformatics since I started my PhD, among other things that might be interesting or helpful to others. I started using Anaconda during my postdoc in genomics and I use it every single day. Today, I want to write a summary of things I’ve learned using Anaconda packages and environments for bioinformatics (but the same commands will work for other types of packages as well). [Read More]

Athlete's foot and its treatment

Today we will be looking at data about a health issue that affects many athletes: athlete’s foot. It’s not a very glamorous subject but it’s still interesting, as fungi are warrior eukaryotes that survive everywhere. Athlete’s foot generally refers to a fungal infection of the feet called “tinea pedis”, but it can also mean something else that causes inflammation of the foot; the definitions vary according to who you talk to… I am going to stick to the more accepted definition, which is the fungal infection. [Read More]

Running races and waste

I am tackling a new topic today: the waste generated during races. I’ll tell you why this is coming up now. I started the racing season with a 10 km road race called Asparun. It was a relatively small race, with 354 runners on the 10k plus probably as many on the 5k, which is still more people than I usually see at my small local races. I saw so many people throwing their water cups not IN the trash but sometimes 500 meters away from it, and it shocked me. [Read More]

Favorite trail race: Tencin 15km analysis

I am continuing my journey to learn awk and I finally managed to process (almost) an entire file today, so let’s analyse the 2018 Tencin trail race. It was my first trail race of the season, and it kicks off the “challenge intercommunal du grésivaudan”, a set of 11 races in the Grésivaudan, a valley between the Belledonne and Chartreuse mountain ranges in the Alps. [Read More]

My first half marathon and its data analysis

Today’s data analysis is a me-centered one: I will share my experience running my first half marathon, along with the race data. I’ve only been running regularly for a year, and I’d been dreaming about this day since I started. It was still kind of a dream for the longest time, something that I wanted but was still only a concept until the day arrived and I realized: “I’m running a half marathon today. [Read More]

Biathlon data analysis

This week I’m at a metabarcoding school in Norway, where it gets dark very early, so I have time for another post. In honor of the 2018 Winter Olympics finally kicking off, I asked a friend of mine who loves sports which Winter Olympic sport he would be interested in seeing data on, and he answered “biathlon.” I found a website that collects lots of data on the sport called biathlonresults. [Read More]

Triathlon pubmed analysis

Today I am using the RISmed package for R to analyze publications about triathlon. It is an amazing package for searching the PubMed database for what it holds on a given subject. PubMed is an NIH (USA) funded database which hosts articles about medicine and biology. Today, I am looking at studies that have been done on injuries, disease and human physiology in relation to triathlon. I am not a triathlete yet; I am just focusing on running this year, but it is on my radar. [Read More]

Figure skating athletes' personal best

Today I am writing another piece about figure skating, and another piece of data analysis on the sport. This time I am not focusing solely on the Olympics but on the best scoring athletes and the best scoring event of each athlete. There is a lot of data to go through, so let’s get right into it! The data comes from the International Skating Union website, and I downloaded it on 20/01/2018 (if you try my code and the results are different, it might be because the data on the website has changed). [Read More]

R figure skating analysis

Analysing medals won per athlete/per country with R: As I am learning data science by doing little projects, today I am introducing a little data analysis using R on figure skating in the Olympics. I am going to look at the medals won using data from the official Olympics database website. First I formatted the data to make it a CSV (comma-separated values) file, which is very easy to parse in R [Read More]