1. Basics of Statistics
1. Descriptive Statistics
- Measures of Central Tendency (Averages):
  - Mean
  - Median
  - Mode
- Measures of Dispersion:
  - Range
  - Percentiles, Quartiles, Interquartile Range (IQR)
  - Variance
  - Standard Deviation
- Measures to Describe the Shape of a Distribution:
  - Skewness
  - Kurtosis
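As a quick illustration, here is a minimal sketch that computes each of these measures on a small made-up sample, using pandas and SciPy:

```python
import pandas as pd
from scipy import stats

data = pd.Series([2, 4, 4, 4, 5, 5, 7, 9])

# Central tendency
print(data.mean())     # arithmetic mean
print(data.median())   # middle value
print(data.mode()[0])  # most frequent value

# Dispersion
print(data.max() - data.min())                     # range
q1, q3 = data.quantile(0.25), data.quantile(0.75)
print(q3 - q1)                                     # interquartile range (IQR)
print(data.var(), data.std())                      # sample variance, standard deviation

# Shape of the distribution
print(stats.skew(data), stats.kurtosis(data))      # skewness, excess kurtosis
```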
2. Measures of Correlation
- Correlation:
  - Importance of Correlation
  - Types of Correlation
  - Degree of Correlation
- Methods to Measure Correlation:
  - Scatter Diagram
  - Karl Pearson’s Coefficient of Correlation
  - Spearman’s Rank Correlation Coefficient
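A minimal sketch of the three measurement methods above, on made-up data:

```python
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats

df = pd.DataFrame({"hours": [1, 2, 3, 4, 5],
                   "score": [52, 58, 65, 71, 80]})

# Scatter diagram: a visual check of the form and degree of the relationship
df.plot.scatter(x="hours", y="score")
plt.show()

# Karl Pearson's coefficient of correlation (strength of a linear relationship)
r, p = stats.pearsonr(df["hours"], df["score"])

# Spearman's rank correlation coefficient (strength of a monotonic relationship)
rho, p_rank = stats.spearmanr(df["hours"], df["score"])
print(r, rho)
```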
3. Probability for Statistics
- Types of Events:
  - Independent Events
  - Dependent Events
  - Mutually Exclusive Events
  - Inclusive Events
- Types of Probability:
  - Marginal Probability
  - Joint Probability
  - Conditional Probability
- Bayes’ Theorem
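Bayes’ theorem ties these probability types together; the classic diagnostic-test example below uses hypothetical numbers throughout:

```python
# Hypothetical numbers for illustration only.
p_disease = 0.01                    # marginal (prior) probability P(D)
p_pos_given_disease = 0.95          # conditional probability P(+ | D)
p_pos_given_healthy = 0.05          # conditional probability P(+ | not D)

# Joint probabilities: P(D and +) = P(D) * P(+ | D)
p_disease_and_pos = p_disease * p_pos_given_disease
p_healthy_and_pos = (1 - p_disease) * p_pos_given_healthy

# Total probability of a positive test
p_pos = p_disease_and_pos + p_healthy_and_pos

# Bayes' theorem: P(D | +) = P(+ | D) * P(D) / P(+)
p_disease_given_pos = p_disease_and_pos / p_pos
print(round(p_disease_given_pos, 3))  # ~0.161
```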
4. Inferential Statistics
- Estimation
- Hypothesis Testing
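A short SciPy sketch of both ideas on a made-up sample: a t-based confidence interval (estimation) and a one-sample t-test (hypothesis testing):

```python
import numpy as np
from scipy import stats

sample = np.array([4.8, 5.2, 4.9, 5.5, 5.1, 4.7, 5.3, 5.0])

# Estimation: a 95% t-based confidence interval for the population mean
ci = stats.t.interval(0.95, df=len(sample) - 1,
                      loc=sample.mean(), scale=stats.sem(sample))

# Hypothesis testing: is the population mean different from 5.0?
t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)
print(ci, p_value)  # reject H0 at the 5% level only if p_value < 0.05
```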
5. Probability Distributions
- Discrete Distributions:
  - Uniform Distribution
  - Binomial Distribution
  - Poisson Distribution
- Continuous Distributions:
  - Exponential Distribution
  - Normal Distribution (Gaussian Distribution / Bell Curve)
  - Standard Normal Distribution (Z-Distribution)
  - Student’s t-Distribution
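Each of these distributions is available in `scipy.stats`; a few one-liners show how they are typically evaluated:

```python
from scipy import stats

# Discrete: P(X = 3) under Binomial(n=10, p=0.5) and Poisson(mu=2)
print(stats.binom.pmf(3, n=10, p=0.5))
print(stats.poisson.pmf(3, mu=2))

# Continuous: densities and probabilities under the normal curve
print(stats.norm.pdf(0))       # standard normal density at z = 0
print(stats.norm.cdf(1.96))    # P(Z <= 1.96), about 0.975
print(stats.expon.cdf(1.0))    # exponential: P(X <= 1) with rate 1

# Student's t approaches the standard normal as degrees of freedom grow
print(stats.t.cdf(1.96, df=5), stats.t.cdf(1.96, df=1000))
```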
2. Introduction to Python Libraries for Data Science
- NumPy: Numerical Computing
- Pandas: Data Manipulation
- Matplotlib: Data Visualization
- Seaborn: Statistical Data Visualization
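A minimal sketch of how the four libraries typically work together (all four assumed installed):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

arr = np.arange(10)                           # NumPy: fast n-dimensional arrays
df = pd.DataFrame({"x": arr, "y": arr ** 2})  # Pandas: labeled tabular data

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.plot(df["x"], df["y"])                      # Matplotlib: low-level plotting
sns.scatterplot(data=df, x="x", y="y", ax=ax2)  # Seaborn: statistical plots
plt.show()
```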
3. Introduction to Artificial Intelligence and Machine Learning
- Overview of AI and ML: Foundations of AI and its capabilities.
- Applications of AI: Real-world use cases of Artificial Intelligence.
- AI Project Life Cycle: Key stages in the development and deployment of AI projects.
- Types of Learning:
  - Supervised Learning: Predict outcomes using labeled data.
  - Unsupervised Learning: Discover patterns without labels.
  - Semi-Supervised Learning: Leverage both labeled and unlabeled data.
  - Reinforcement Learning: Learn optimal actions through rewards.
- AI Ethics and Bias: Ensuring fairness and reducing bias in AI systems.
4. Foundations of Machine Learning
- Steps to Build an ML Model: End-to-end ML model creation process.
- Overfitting vs Underfitting: Balancing model complexity for better predictions.
- Data Preprocessing: Handling missing values, outliers, and noisy data.
- Evaluation Metrics (a short sketch follows this list):
  - Regression Metrics: MAE, MSE, R² Score.
  - Classification Metrics: Accuracy, Precision, Recall, F1-Score.
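A sketch of these metrics with scikit-learn, on hypothetical true and predicted values:

```python
from sklearn.metrics import (accuracy_score, f1_score, mean_absolute_error,
                             mean_squared_error, precision_score, r2_score,
                             recall_score)

# Regression metrics on hypothetical true vs. predicted values
y_true_reg = [3.0, 2.5, 4.0, 5.5]
y_pred_reg = [2.8, 2.7, 3.6, 5.4]
print(mean_absolute_error(y_true_reg, y_pred_reg))  # MAE
print(mean_squared_error(y_true_reg, y_pred_reg))   # MSE
print(r2_score(y_true_reg, y_pred_reg))             # R² score

# Classification metrics on hypothetical binary labels
y_true_clf = [1, 0, 1, 1, 0, 1]
y_pred_clf = [1, 0, 0, 1, 0, 1]
print(accuracy_score(y_true_clf, y_pred_clf))
print(precision_score(y_true_clf, y_pred_clf))
print(recall_score(y_true_clf, y_pred_clf))
print(f1_score(y_true_clf, y_pred_clf))
```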
5. Supervised Learning
- Regression Algorithms:
  - Linear Regression, Polynomial Regression.
  - Ridge Regression (L2 Regularization), Lasso Regression (L1 Regularization).
  - ElasticNet Regression (L1 and L2 Regularization).
- Classification Algorithms:
  - Decision Tree, Random Forest, Support Vector Machines (SVM).
  - Naive Bayes, K-Nearest Neighbors (KNN).
- Advanced Topics: Handling imbalanced data and class weighting (see the sketch below).
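As a minimal end-to-end sketch, here is a Random Forest classifier on scikit-learn's built-in Iris data, with class weighting enabled (one simple option for imbalanced data):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# class_weight="balanced" reweights classes inversely to their frequency
model = RandomForestClassifier(n_estimators=100, class_weight="balanced",
                               random_state=42)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # accuracy on held-out data
```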
6. Unsupervised Learning
- Clustering:
  - K-Means Clustering, Hierarchical Clustering.
  - DBSCAN (Density-Based Spatial Clustering of Applications with Noise).
- Dimensionality Reduction:
  - Principal Component Analysis (PCA), t-SNE (t-Distributed Stochastic Neighbor Embedding).
- Association Rule Mining:
  - Apriori Algorithm, FP-Growth Algorithm.
- Anomaly Detection:
  - Isolation Forest, One-Class SVM.
- Advanced Clustering: Spectral Clustering and Affinity Propagation.
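A short scikit-learn sketch of the two most common tasks above, clustering and dimensionality reduction, on synthetic data:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

X, _ = make_blobs(n_samples=300, centers=3, n_features=5, random_state=42)

# Clustering: assign each point to one of 3 K-Means clusters
labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)

# Dimensionality reduction: project 5-D points onto the top 2 principal components
X_2d = PCA(n_components=2).fit_transform(X)
print(labels[:10], X_2d.shape)
```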
7. Feature Engineering
- Encoding Techniques:
  - Label Encoding, One-Hot Encoding.
  - Count Encoding, Mean Encoding, Weight of Evidence Encoding.
- Feature Interaction: Creating new features from existing data.
- Datetime Functions: Extracting useful features from time data.
- Text Features: Tokenization and text vectorization (e.g., Word2Vec).
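A pandas sketch of encoding, feature interaction, and datetime extraction on a tiny hypothetical table:

```python
import pandas as pd

df = pd.DataFrame({"city": ["Delhi", "Pune", "Delhi"],
                   "signup": pd.to_datetime(["2024-01-05", "2024-02-11", "2024-03-20"]),
                   "price": [100.0, 250.0, 175.0],
                   "qty": [2, 1, 3]})

# Encoding: one-hot encode a categorical column
df = pd.get_dummies(df, columns=["city"])

# Feature interaction: build a new feature from existing ones
df["revenue"] = df["price"] * df["qty"]

# Datetime features: extract useful parts of a timestamp
df["signup_month"] = df["signup"].dt.month
df["signup_dayofweek"] = df["signup"].dt.dayofweek
```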
8. Feature Selection
- Filter Methods: Removing irrelevant features using statistical scores (e.g., correlation).
- Wrapper Methods (illustrated below):
  - Forward Selection, Backward Elimination.
  - Recursive Feature Elimination (RFE).
- Embedded Methods:
  - Ridge Regression, Lasso Regression, ElasticNet.
  - Decision Tree-Based Methods (e.g., Random Forest, XGBoost, LightGBM).
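A sketch of one wrapper method, Recursive Feature Elimination, with scikit-learn; the base model and the number of features kept are arbitrary choices here:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Wrapper method: repeatedly fit a model and drop the weakest feature
selector = RFE(LogisticRegression(max_iter=5000), n_features_to_select=10)
selector.fit(X, y)
print(selector.support_)  # boolean mask marking the 10 selected features
```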
9. Optimization and Model Building
- Loss Functions: Quantifying the error in predictions.
- Gradient Descent (a from-scratch sketch follows this list):
  - Batch Gradient Descent, Stochastic Gradient Descent (SGD).
  - Mini-Batch Gradient Descent.
- Hyperparameter Optimization:
  - Grid Search, Random Search, Bayesian Optimization.
- Model Tuning: Fine-tuning hyperparameters for improved accuracy.
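To make the loss-function/gradient-descent connection concrete, here is a from-scratch batch gradient descent fitting a simple linear model to made-up data:

```python
import numpy as np

# Made-up data: y = 3x + noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 100)
y = 3 * x + rng.normal(0, 0.1, 100)

w, b, lr = 0.0, 0.0, 0.1              # parameters and learning rate
for _ in range(2000):
    y_hat = w * x + b                 # current predictions
    error = y_hat - y
    loss = np.mean(error ** 2)        # MSE loss function
    grad_w = 2 * np.mean(error * x)   # dLoss/dw
    grad_b = 2 * np.mean(error)       # dLoss/db
    w -= lr * grad_w                  # step against the gradient
    b -= lr * grad_b
print(w, b, loss)  # w should end up near 3, b near 0
```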
10. Advanced Techniques
- Ensemble Learning:
  - Bagging (Bootstrap Aggregation): Random Forest.
  - Boosting: Gradient Boosting Machines (GBM), XGBoost, LightGBM, CatBoost.
- Transfer Learning: Reusing pretrained models for new tasks.
- Time Series Analysis:
  - ARIMA and SARIMA Models, Prophet for Forecasting.
  - Feature Engineering for Time Series Data.
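A scikit-learn sketch contrasting bagging and boosting on a built-in dataset; plain GBM stands in here for the XGBoost/LightGBM/CatBoost family:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Bagging: many trees trained on bootstrap samples, predictions averaged
bagger = BaggingClassifier(n_estimators=50, random_state=42)

# Boosting: trees added sequentially, each correcting its predecessors' errors
booster = GradientBoostingClassifier(n_estimators=100, random_state=42)

for name, model in [("bagging", bagger), ("boosting", booster)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```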
11. Explainability and Interpretability
- SHAP (SHapley Additive exPlanations): Explaining feature impacts on predictions.
- LIME (Local Interpretable Model-agnostic Explanations): Simplifying complex models for human interpretation.
- Advanced Tools: Counterfactual Explanations and Saliency Maps.
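A minimal SHAP sketch for a tree-based regressor (assumes the `shap` package is installed; return shapes and plotting details vary slightly across shap versions):

```python
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(random_state=42).fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Summary plot: which features move predictions, and in which direction
shap.summary_plot(shap_values, X)
```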
12. Model Deployment and Real-World Applications
- Tools for Deployment:
  - Deploying with Flask, Django, Streamlit.
  - APIs and Endpoints: Building interfaces for model access.
- Real-World Application Projects: End-to-end deployment of ML solutions.
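A minimal Flask sketch of a prediction endpoint; the model file name and the request schema are hypothetical:

```python
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("model.joblib")  # hypothetical path to a trained model

@app.route("/predict", methods=["POST"])
def predict():
    # Expected body (hypothetical schema): {"features": [[5.1, 3.5, 1.4, 0.2]]}
    features = request.get_json()["features"]
    prediction = model.predict(features).tolist()
    return jsonify({"prediction": prediction})

if __name__ == "__main__":
    app.run(port=5000)
```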
13. Ethics and Governance in AI
- AI Fairness and Transparency: Ensuring equitable AI decisions.
- Bias Mitigation Techniques: Strategies to reduce biases in AI systems.
- Ethical Use Cases: Real-world examples addressing ethical challenges.
14. End-of-Course Projects
- Hands-on Real-World Applications:
  - Predictive Modeling.
  - Customer Segmentation (Clustering).
  - Fraud Detection (Anomaly Detection).
  - Time Series Forecasting.
  - Model Deployment Project.