Tools Intermediate
Summary
Extensive experience using pandas for data manipulation in research and personal projects. It is my core tool for data preprocessing in IEEE-published healthcare analytics research, where it handles large datasets, complex transformations, and feature engineering.
How I Apply This Skill
- Preprocessed SEER-Medicare healthcare data for three IEEE publications
- Built data cleaning pipelines handling missing values and outliers
- Engineered derived variables as input features for ML models
- Applied groupby aggregations for statistical analysis across patient cohorts
- Combined multiple data sources with merge/join operations
- Exported publication-ready datasets for statistical analysis
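
The steps above can be sketched roughly as follows. This is a minimal illustration on made-up data, not the actual research pipeline: the column names (`patient_id`, `cohort`, `age`, `cost`, `n_visits`), the median imputation, and the IQR-based clipping rule are all assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical patient-level records (illustrative only, not SEER-Medicare data).
patients = pd.DataFrame({
    "patient_id": [1, 2, 3, 4, 5, 6],
    "cohort": ["A", "A", "A", "B", "B", "B"],
    "age": ["61", "70", None, "58", "66", "64"],  # stored as strings, one missing
    "cost": [1200.0, 1500.0, 1300.0, 900.0, 50000.0, 1100.0],  # one extreme value
})

# Hypothetical second data source keyed on the same patient_id.
visits = pd.DataFrame({
    "patient_id": [1, 2, 3, 4, 5],
    "n_visits": [3, 5, 2, 4, 8],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Type conversion, median imputation, and IQR-based outlier clipping."""
    out = df.copy()
    # Type conversion: non-numeric or missing entries become NaN.
    out["age"] = pd.to_numeric(out["age"], errors="coerce")
    # Missing value imputation (assumed strategy: column median).
    out["age"] = out["age"].fillna(out["age"].median())
    # Outlier handling (assumed rule: clip to 1.5 * IQR beyond the quartiles).
    q1, q3 = out["cost"].quantile([0.25, 0.75])
    iqr = q3 - q1
    out["cost"] = out["cost"].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)
    return out


cleaned = clean(patients)

# Merge: a left join keeps every patient, even those without visit records.
merged = cleaned.merge(visits, on="patient_id", how="left")

# Feature engineering: derived variables computed with vectorized operations.
merged["is_senior"] = merged["age"] >= 65
merged["cost_per_visit"] = merged["cost"] / merged["n_visits"]

# Groupby aggregation: summary statistics per cohort.
summary = merged.groupby("cohort").agg(
    mean_age=("age", "mean"),
    mean_cost=("cost", "mean"),
    n_patients=("patient_id", "count"),
)
```

A left join is used here so that patients missing from the second source stay in the dataset with `NaN` values instead of being dropped silently; whether that is appropriate depends on the analysis.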
Key Strengths
- Data Cleaning: Missing value imputation, outlier detection, type conversion
- Transformations: Groupby, pivot tables, apply functions, vectorized operations
- Merging: Inner/outer joins, multi-key merges, data alignment
- Time Series: DateTime indexing, resampling, rolling windows
- Integration: NumPy arrays, CSV/Excel I/O, database connections
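
As a small illustration of the time series items listed above, the sketch below builds a synthetic daily series and applies DateTime indexing, resampling, and a rolling window. The data, the variable names, and the window size are hypothetical and chosen only to keep the example short.

```python
import pandas as pd

# Hypothetical daily signal: 14 consecutive days starting on a Monday.
index = pd.date_range("2024-01-01", periods=14, freq="D")
series = pd.Series(range(14), index=index, dtype="float64")

# DateTime indexing: label-based slicing with date strings (both ends inclusive).
first_week = series.loc["2024-01-01":"2024-01-07"]

# Resampling: aggregate daily values into weekly means (weeks ending on Sunday).
weekly_mean = series.resample("W").mean()

# Rolling window: 3-day moving average; the first two entries are NaN
# because a full window is not yet available.
rolling_mean = series.rolling(window=3).mean()
```

The same calls apply to a DataFrame with a `DatetimeIndex`, where the aggregation runs column by column.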
Related Projects
- Long COVID Prediction - Healthcare data preprocessing
- Echo State Network - Time series data preparation