yHat

How LDA Works, Using Shiny for Python

A small Shiny for Python app exploring how LDA works

Nov 1, 2023
2 min

2023 #30DayChallenge

A few charts related to time-series packages for the 30DayChallenge

Apr 3, 2023
29 min

Performance Benchmarking Data Read Write

Which of the popular data read/write methods is fastest? Let’s find out.

Sep 17, 2022
5 min

Tétouan Electric Consumption, EDA & Predictive Modeling

An exploration of electric consumption, and some quick predictive modeling

Aug 4, 2022
14 min

EDA: McDonalds India

Exploration of McDonalds India data from a Kaggle dataset

Aug 1, 2022
17 min

EDA: R Package History

Exploration of CRAN Package History

Jul 31, 2022
25 min

Making the Anomaly Database

This is part two of a two-part post on Docker, Postgres databases, and anomaly datasets. Read Part 1…
Aug 18, 2021
4 min

Docker based RStudio & PostgreSQL

How to set up a Docker-based workflow for development in RStudio with a local Postgres server, also hosted in Docker

Aug 7, 2021
7 min

Enhance ETL pipeline monitoring with text plots

Quick visualizations in command line using {txtplot}

Jul 19, 2021
1 min

Visualizing Correlations

Correlation plot for Kepler’s Planets, for day 13 of the 2021 30-day-chart-challenge

Apr 13, 2021
23 min

TidyTuesday - The Tate Collection

Last week's #TidyTuesday. Had something very specific in mind & it forced me to learn a new pkg and some base R to finish this plot.

I wanted to showcase the change in the dominant medium from Graphite to…
Jan 19, 2021
7 min

TidyTuesday - Transit Costs

Comparing Indian rail projects to those of our neighbour China, I find that, on average, Indian lines are longer and have more stations than their Chinese counterparts.

Also, number of stations to track length is amazingly linear…
Jan 11, 2021
8 min

TidyTuesday - Big Mac Index

For my first #TidyTuesday post, I've attempted a comparison of the 2015 to 2020 movement of the Big Mac index: https://t.co/AOGOvt3ve5 #RStats #dataviz #r4ds #ggplot2 pic.twitter.com/1…
Jan 6, 2021
6 min

Perf Benchmarking Dummy Variables - Part II

Is {fastDummies} any better than {stats} for creating dummy variables? Let’s find out.

Dec 16, 2020
13 min

M5 Competition Virtual Awards Ceremony

Notes from the M5 Forecasting Competition keynote speakers.

Oct 29, 2020
5 min

Reproducible Work in R

A few ways I ensure my work is reproducible in R

Oct 10, 2020
6 min

Using tryCatch for robust R scripts

A quick introduction to tryCatch below, followed by three use-cases I use on a regular basis.

Dec 20, 2018
9 min

Performance Benchmarking for Date-Time conversions

I pit six methods against each other to find the fastest way to convert character strings to date-times in large datasets.

Apr 12, 2018
2 min

Books I Reference

A list of Data Science books I reference

Feb 13, 2018
1 min

Visualising Linear Discriminant Analyses

Linear Discriminant Analysis visualized using Shiny

Jan 27, 2018
1 min

Performance Benchmarking for Dummy Variable Creation

How do the four popular methods of creating dummy variables perform on large datasets? Let’s find out!

Sep 27, 2017
7 min

Pur(r)ify Your Carets

You’ll learn how to use purrr, caret and list-cols to quickly create hundreds of dataset + model combinations, store data & model objects neatly in one tibble, and post-process programmatically. These tools enable succinct functional programming in which a lot gets done with just a few lines of code.

Sep 17, 2017
9 min
2023 rahul-s