Duration: 16 Weeks
Fees: Kshs. 15,000 per month
Comprehensive Modules: 10
Capstone Portfolio Project: 1
Core Content Hours: 140+
Curriculum Deep Dive
Unlocking Insights from Data: From Fundamentals to Advanced Machine Learning
Explore the complete journey from data fundamentals to advanced analytics and machine learning.
Intro to Data Science
Duration: Approx. 1 Week
Learning Objectives
- Define Data Science, Data Analytics, and their differences.
- Understand the Data Science workflow (CRISP-DM or similar).
- Explore the roles and skills of a Data Scientist/Analyst.
Topics Covered
- What is Data Science? Definition & Scope; Relationship with AI/ML, Big Data
- Data Science Life Cycle: Problem understanding, Data collection, Cleaning, Modeling, Deployment, Communication
- Key Concepts: Types of data (structured, unstructured); Descriptive vs. Inferential Statistics
Programming for Data
Duration: Approx. 4 Weeks
Learning Objectives
- Master Python/R fundamentals for data manipulation.
- Utilize key libraries for numerical computing (NumPy) and data structures (Pandas).
- Write clean, efficient, and well-commented data code.
Topics Covered
- Python/R Basics: Variables, Data Types, Control Flow; Functions, Classes (OOP basics)
- Python: NumPy for Numerical Data: Arrays, Vectorization; Mathematical operations
- Python: Pandas for Data Structures: DataFrames, Series; Indexing, Slicing, Merging, Grouping
- R: Data Structures & Base Functions: Vectors, Data Frames, Lists; Data manipulation
- Code Best Practices: Readability, Comments, Error Handling
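As an illustration of the Code Best Practices topic, here is a minimal Python sketch of a small, documented function with explicit error handling. The function name parse_age, the sample values, and the 0 to 120 age range are assumptions made for this example and are not part of the course material.

```python
def parse_age(raw):
    """Convert a raw text field to an integer age, or None if it is unusable."""
    try:
        age = int(raw.strip())
    except (ValueError, AttributeError):
        # Non-numeric text or a non-string value: treat the field as missing.
        return None
    # Reject values outside an assumed plausible range instead of passing them on.
    return age if 0 <= age <= 120 else None


# Hypothetical raw values as they might arrive from a form or a file.
raw_ages = ["34", " 27 ", "unknown", None, "-5"]
clean_ages = [parse_age(value) for value in raw_ages]
print(clean_ages)  # prints [34, 27, None, None, None]
```

Catching only the specific exceptions that are expected keeps unrelated bugs visible instead of silently hiding them.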
Data Collection & Wrangling
Duration: Approx. 2 Weeks
Learning Objectives
- Acquire data from various sources (CSV, SQL, APIs).
- Perform data cleaning: handle missing values, outliers, inconsistencies.
- Transform data into a suitable format for analysis.
Topics Covered
- Data Acquisition: Reading CSV, Excel files; SQL Databases (basic queries); Web Scraping (intro); APIs (intro to fetching data)
- Data Cleaning: Handling Missing Values (imputation, removal); Outlier Detection & Treatment; Data Type Conversion
- Data Transformation: Feature Scaling, Encoding Categorical Data; Merging, Joining, Reshaping Data
- Data Validation: Ensuring data quality and integrity
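The acquisition and cleaning topics above can be sketched in a few lines. This example uses only the Python standard library (csv, io, statistics) rather than Pandas, so that it runs anywhere; the sample records and the choice of median imputation are assumptions made for illustration.

```python
import csv
import io
import statistics

# Hypothetical sample data; an in-memory string stands in for a CSV file.
RAW_CSV = """name,age,city
Amina,34,Eldoret
Brian,,Nairobi
Chebet,29,Eldoret
David,41,
"""

# Data acquisition: read each row into a dictionary keyed by column name.
rows = list(csv.DictReader(io.StringIO(RAW_CSV)))

# Data type conversion: ages arrive as text; empty strings mark missing values.
known_ages = [int(r["age"]) for r in rows if r["age"]]
median_age = statistics.median(known_ages)

for r in rows:
    # Imputation: fill a missing age with the median of the known ages.
    r["age"] = int(r["age"]) if r["age"] else median_age
    # Make a missing city explicit instead of leaving an empty string.
    r["city"] = r["city"] or "Unknown"

for r in rows:
    print(r["name"], r["age"], r["city"])
```

Median imputation is only one option; removing incomplete rows, as the topic list also mentions, is the simpler alternative when few values are missing.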
EDA & Visualization
Duration: Approx. 2 Weeks
Learning Objectives
- Conduct exploratory data analysis (EDA) to uncover patterns and insights.
- Create compelling static and interactive data visualizations.
- Effectively communicate data findings through visual storytelling.
Topics Covered
- Exploratory Data Analysis (EDA): Summary Statistics; Correlation Analysis; Hypothesis Generation
- Static Visualization (Matplotlib/Seaborn in Python, ggplot2 in R): Histograms, Scatter Plots, Bar Charts, Line Plots; Heatmaps, Box Plots
- Interactive Visualization (Plotly/Dash, Bokeh in Python, Shiny in R): Creating dynamic charts; Dashboards (conceptual)
- Principles of Data Visualization: Choosing appropriate chart types; Clarity, Aesthetics, Storytelling
Statistical Foundations
Duration: Approx. 8 Weeks
Learning Objectives
- Understand key statistical concepts relevant to data science.
- Apply hypothesis testing for data-driven decision making.
- Perform regression and classification analyses.
Topics Covered
- Probability & Distributions: Basic probability, Conditional probability; Normal, Binomial, Poisson distributions
- Descriptive Statistics: Measures of central tendency (mean, median, mode); Measures of dispersion (variance, std dev)
- Inferential Statistics & Hypothesis Testing: Sampling, Confidence Intervals; T-tests, Chi-square tests, ANOVA (overview); P-values and significance
- Regression Analysis: Linear Regression (simple & multiple); Assumptions, Interpretation
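To make the descriptive statistics and simple linear regression topics concrete, here is a minimal sketch using only the Python standard library. The data (hours studied against test score) are invented for illustration, and the least-squares formulas are written out by hand rather than taken from a statistics package.

```python
import statistics

# Hypothetical data: hours studied (x) and test score (y).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

# Descriptive statistics: central tendency and dispersion.
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
std_score = statistics.stdev(scores)  # sample standard deviation

# Simple linear regression by ordinary least squares:
#   slope = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
#   intercept = mean_y - slope * mean_x
mean_x = statistics.mean(hours)
sxy = sum((x - mean_x) * (y - mean_score) for x, y in zip(hours, scores))
sxx = sum((x - mean_x) ** 2 for x in hours)
slope = sxy / sxx
intercept = mean_score - slope * mean_x

print(f"mean={mean_score}, median={median_score}, std={std_score:.2f}")
print(f"score = {intercept:.1f} + {slope:.1f} * hours")
```

For this sample the fitted slope is 4.1, which reads as roughly 4.1 extra points per additional hour studied.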
Machine Learning Fundamentals
Duration: Approx. 4 Weeks
Learning Objectives
- Grasp core Machine Learning concepts (supervised, unsupervised).
- Implement common ML algorithms using libraries (Scikit-learn).
- Evaluate model performance and mitigate overfitting/underfitting.
Topics Covered
- Introduction to Machine Learning: Supervised vs. Unsupervised Learning; Training, Validation, Test Sets; Bias-Variance Tradeoff
- Regression Algorithms: Linear Regression (revisited), Ridge, Lasso; Decision Trees, Random Forests (for regression)
- Classification Algorithms: Logistic Regression; K-Nearest Neighbors (KNN); Support Vector Machines (SVM); Decision Trees, Random Forests (for classification)
- Clustering Algorithms: K-Means, Hierarchical Clustering (overview)
- Model Evaluation: Metrics (Accuracy, Precision, Recall, F1-score, ROC-AUC); Cross-validation
- Feature Engineering: Creating new features from existing data
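The evaluation metrics named above follow directly from the four confusion-matrix counts. The sketch below computes them by hand in plain Python instead of calling Scikit-learn, so the definitions stay visible; the function name classification_metrics and the label lists are assumptions made for this example.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical true labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Here the predictions contain 3 true positives, 1 false positive, 1 false negative and 3 true negatives, so all four metrics come out at 0.75.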
Advanced ML & Deep Learning (Optional)
Duration: Approx. 4 Weeks
Learning Objectives
- Explore advanced machine learning techniques.
- Understand the basics of neural networks and deep learning.
- Apply deep learning frameworks for practical problems.
Topics Covered
- Ensemble Methods: Bagging (Random Forests revisited); Boosting (Gradient Boosting, XGBoost, LightGBM)
- Neural Networks Fundamentals: Perceptrons, Activation Functions; Feedforward Networks, Backpropagation (conceptual)
- Introduction to Deep Learning Frameworks: TensorFlow/Keras or PyTorch (basics)
- Convolutional Neural Networks (CNNs) – Intro: Image processing basics
- Recurrent Neural Networks (RNNs) – Intro: Sequence data basics
Big Data Technologies (Optional)
Duration: Approx. 4 Weeks
Learning Objectives
- Understand the challenges and solutions of Big Data.
- Gain an overview of distributed computing frameworks (Hadoop, Spark).
- Learn about data warehousing and data lakes.
Topics Covered
- Big Data Concepts: Volume, Velocity, Variety, Veracity; Distributed computing overview
- Hadoop Ecosystem (Conceptual): HDFS, MapReduce (overview)
- Apache Spark (Conceptual): Resilient Distributed Datasets (RDDs); Spark SQL, Spark Streaming
- Data Warehouses vs. Data Lakes: Purpose and use cases; Cloud-based solutions (e.g., AWS S3/Redshift, GCP BigQuery/Cloud Storage)
Data Storytelling & Deployment
Duration: Approx. 4 Weeks
Learning Objectives
- Effectively communicate data insights to diverse audiences.
- Build interactive dashboards and reports.
- Understand basic concepts of deploying ML models.
Topics Covered
- Effective Communication: Structuring data narratives; Tailoring insights to audience
- Dashboarding Tools (Conceptual): Tableau, Power BI, Looker Studio (formerly Google Data Studio) (overview); Building interactive dashboards
- Web Frameworks for Data Apps: Dash (Python) or Streamlit (Python); Flask/Django for simple web UIs for models
- Model Deployment Basics: Saving/Loading Models; API Endpoints for Predictions (conceptual)
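The Saving/Loading Models topic can be illustrated with Python's built-in pickle module. In this sketch the "model" is just a dictionary of line coefficients, and the file name model.pkl and the predict helper are hypothetical; a real course project would more likely save a fitted Scikit-learn or Keras object.

```python
import os
import pickle
import tempfile

# Hypothetical "model": the coefficients of an already fitted line.
model = {"slope": 4.1, "intercept": 47.7}


def predict(fitted, x):
    """Return the predicted value for input x from the stored coefficients."""
    return fitted["intercept"] + fitted["slope"] * x


# A temporary directory keeps the example from leaving files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pkl")

    # Save: serialize the model object to disk.
    with open(path, "wb") as f:
        pickle.dump(model, f)

    # Load: restore the object, as a prediction service would at start-up.
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(predict(restored, 3))
```

Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code.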
Capstone Project & Career Prep
Duration: Approx. 4 Weeks
Learning Objectives
- Apply the entire data science workflow to a real-world problem.
- Develop a comprehensive data science portfolio.
- Prepare for data science/analytics job interviews.
Topics Covered
- Capstone Project: Problem Definition, Data Collection & Cleaning; EDA, Modeling, Evaluation, Deployment (optional), Storytelling
- Building a Data Science Portfolio: Showcasing projects (GitHub, Kaggle); Blogging about projects
- Resume & LinkedIn Optimization: Highlighting data science skills
- Interview Preparation: Behavioral, Technical (SQL, Python, ML concepts); Case studies
Take the first step towards achieving your academic, career, and life goals.
Whether you're preparing for global opportunities or reskilling for the digital economy, Oval Training Institute is your trusted partner.
Contact Us: 0741 518 500 / 0741 426 603
Visit: www.oti.co.ke
Location: Zion Mall, 2nd Floor, Uganda Road, Eldoret





