Data Science Services
I am currently available for full-time, part-time, or freelance opportunities. Below are the areas of machine learning and statistics that I have taken the time to learn.
Data Cleanup and ETL
- Comfortable extracting and loading data from a wide range of sources and formats: CSV and plain-text files, ODBC, MongoDB and other database systems, images, audio, video, network packets, JSON, XML
- Expert knowledge of Python packages for cleaning up the messiest of data: pandas, NumPy, SciPy
- Also proficient with Excel and R, though I reserve R mainly for statistical modeling
Data Modeling/Mining
Bolded items are algorithms I have implemented from scratch in R or Python.
- Time series forecasting (ARIMA, Holt-Winters triple exponential smoothing, Monte Carlo simulation)
- Statistical models (linear regression, (M)ANOVA, chi-squared test, correlation)
- Feature selection and regularization using Lasso or Ridge regression
- Neural networks (deep, recurrent, convolutional, LSTM)
- Computer vision (OpenCV and facial recognition)
- Clustering (hierarchical, k-means, DBSCAN, Fourier-transform-based time series clustering)
- Decision trees (C4.5 algorithm, random forest)
- Collaborative filtering
- Information retrieval (Boolean model)
- Bagging and boosting algorithms to increase performance
- Technologies: TensorFlow, scikit-learn, R forecast package
- Log5 model
Data Visualization Tools
- Python Matplotlib and Altair libraries
- R ggplot2
- Tableau
- Power BI
- TIBCO Spotfire
Distributed Computing
- Hadoop (HDFS, MapReduce)
- Spark (RDDs, Scala)