2022. 11. 19. 19:54 · Reference / Tech
i.am.ai AI Expert Roadmap
https://i.am.ai/roadmap/#note
required for any path
fundamentals - Basics
Matrices & Linear Algebra Fundamentals
Database Basics
- Relational vs non-relational databases
- SQL + Joins (Inner, Outer, Cross, Theta Join)
- NoSQL
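The join types listed above can be compared side by side on a tiny example. The following is a minimal sketch using Python's built-in `sqlite3` module; the `authors` and `books` tables and their rows are made up for illustration and are not part of the roadmap.

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Alan');
    INSERT INTO books VALUES (10, 1, 'Notes'), (11, 2, 'Compilers');
""")

# Inner join: only authors that have a matching book.
inner_rows = conn.execute("""
    SELECT a.name, b.title FROM authors a
    INNER JOIN books b ON b.author_id = a.id
    ORDER BY a.id
""").fetchall()

# Left outer join: every author, with NULL (None) where no book matches.
left_rows = conn.execute("""
    SELECT a.name, b.title FROM authors a
    LEFT OUTER JOIN books b ON b.author_id = a.id
    ORDER BY a.id
""").fetchall()

# Cross join: every author paired with every book (3 x 2 rows).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM authors CROSS JOIN books"
).fetchone()[0]

# Theta join: the join condition is an arbitrary comparison, here "<".
theta_rows = conn.execute("""
    SELECT a.name, b.title FROM authors a
    JOIN books b ON b.author_id < a.id
    ORDER BY a.id, b.id
""").fetchall()

print(inner_rows)
print(left_rows)
print(cross_count)
print(theta_rows)
```

The author without a book ('Alan') is dropped by the inner join and kept with a `None` title by the left outer join, which is usually the quickest way to remember the difference.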
Tabular Data
DataFrame & Series
Extract, Transform, Load (ETL)
Reporting vs BI vs Analytics
Data Formats
- JSON / XML / CSV
Python Programming
- Expressions
- Variables
- Data Structures
- Functions
- install packages (via pip, conda or similar)
- Codestyle, e.g. PEP8
- Numpy
- Pandas
Jupyter Notebooks / Lab
Data Sources
Exploratory Data Analysis / Data Munging / Wrangling
Principal Component Analysis (PCA)
Dimensionality & Numerosity Reduction
Data Scrubbing, Handling Missing Values
Unbiased Estimators
Data Science Roadmap
Statistics
- Randomness, random variable and random sample
- Conditional probability and Bayes' theorem
- iid
- cdf, pdf, pmf
- Cumulative distribution function (cdf)
- Probability density function (pdf)
- Probability mass function (pmf)
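For the conditional probability and Bayes' theorem item above, a small worked example may help. This sketch assumes a hypothetical diagnostic test; the function name `posterior` and the numbers (1% prevalence, 95% sensitivity, 5% false positive rate) are illustrative assumptions, not values from the roadmap.

```python
def posterior(prior: float, sensitivity: float, false_positive_rate: float) -> float:
    """Return P(condition | positive test) via Bayes' theorem.

    P(C | +) = P(+ | C) * P(C) / P(+), where
    P(+) = P(+ | C) * P(C) + P(+ | not C) * P(not C).
    """
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / evidence


# Hypothetical numbers: rare condition, fairly accurate test.
p = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(round(p, 3))
```

With these assumed numbers the posterior stays well below 50%, because the false positives from the large healthy group outnumber the true positives from the small affected group.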
Continuous distributions (pdf's)
- Beta
Discrete distributions (pmf's)
- Binomial
- Poisson
Summary statistics
- Expectation and mean
- Variance, standard deviation (sd)
- Covariance and correlation
- Percentile / Quantile
- Mode
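The summary statistics listed above can be computed with Python's standard library. This is a minimal sketch on two short, made-up samples `x` and `y`; covariance and correlation are written out by hand so each step of the definition stays visible.

```python
import math
import statistics

# Hypothetical paired samples.
x = [2, 4, 4, 5, 7, 8]
y = [1, 3, 2, 5, 6, 7]

mean_x = statistics.mean(x)
var_x = statistics.variance(x)   # sample variance (divides by n - 1)
sd_x = statistics.stdev(x)       # square root of the sample variance
mode_x = statistics.mode(x)      # most frequent value
quartiles = statistics.quantiles(x, n=4)  # three cut points; the middle one is the median

# Sample covariance: average product of deviations, divided by n - 1.
mean_y = statistics.mean(y)
cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)

# Pearson correlation: covariance scaled by both standard deviations.
corr_xy = cov_xy / (sd_x * statistics.stdev(y))

print(mean_x, var_x, mode_x, quartiles[1], cov_xy, round(corr_xy, 3))
```

Correlation is the unit-free version of covariance, so it always lies between -1 and 1 regardless of the scale of `x` and `y`.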
Important Laws
- Maximum Likelihood Estimation (MLE)
- Kernel Density Estimation (KDE)
- p-Value
- F-test
- t-test
Visualization
Chart Suggestions thought starter
Python
- Bokeh
- seaborn
Web
- D3.js
Dashboards
- Dash
BI
- Tableau
- PowerBI
Machine Learning Roadmap
General
Concepts, Input & Attributes
- Categorical Variables
- Ordinal Variables
- Numerical Variables
Cost functions and gradient descent
Training, validation and test data
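The cost functions and gradient descent item above can be illustrated with a one-variable linear regression. This is a minimal sketch in plain Python, assuming a mean squared error cost and made-up data generated from y = 2x + 1; the names `mse`, `w`, `b`, and the learning rate of 0.05 are illustrative choices, not prescribed by the roadmap.

```python
# Hypothetical training data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def mse(w: float, b: float) -> float:
    """Mean squared error of the line y = w*x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


w, b = 0.0, 0.0          # initial parameters
learning_rate = 0.05     # assumed step size; too large a value would diverge
initial_loss = mse(w, b)

for _ in range(2000):
    # Gradients of the MSE with respect to w and b.
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    # Step against the gradient to reduce the cost.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

final_loss = mse(w, b)
print(round(w, 3), round(b, 3))
```

In practice the same loop would be fitted on the training split only, with the validation split used to choose settings such as the learning rate and the test split held back for a final check.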
Methods
- Classification Rate
- SVM
- Gaussian Mixture Models
Unsupervised Learning
- Clustering
- DBSCAN
- HDBSCAN
- Fuzzy C-Means
- Mean Shift
- Agglomerative
- OPTICS
- Association Rule Learning
- Apriori Algorithm
- ECLAT algorithm
- FP Trees
- Dimensionality Reduction
- Principal Component Analysis (PCA)
- Random Projection
- NMF
- t-SNE
- UMAP
Ensemble Learning
- Boosting
- Bagging
- Stacking
Reinforcement Learning
- Q-Learning
Use Cases
Sentiment Analysis
Collaborative Filtering
Prediction
Tools
Deep Learning Roadmap
Papers
Deep Learning Papers Reading Roadmap
Papers with code - state of the art
Neural Networks
Vanishing / Exploding Gradient Problem
Architectures
Convolutional Neural Network (CNN)
- Pooling
Recurrent Neural Network (RNN)
- LSTM
- GRU
- Encoder
- Decoder
Generative Adversarial Network (GAN)
Training
Optimizers
- SGD
- Momentum
- Adam
- AdaGrad
- AdaDelta
- Nadam
- RMSProp
- Dropout
Tools
Important Libraries
Model optimization (advanced)
Neural Architecture Search (NAS)
Data Engineer Roadmap
Summary of Data Formats
Data Discovery
Data Source & Acquisition
Data Integration
Data Fusion
Transformation & Enrichment
Data Survey
How much Data
Using ETL
Dockerize your Python Application
Big Data Engineer Roadmap
Big Data Architectures
Architectural Patterns & Best Practices (video)
Principles
Horizontal vs vertical scaling
Tools
Check the Awesome Big Data List
- HDFS
- Loading data with Sqoop and Pig
Spark (in memory)
RAPIDS (on GPU)
Flume, Scribe: for unstructured data
Data Warehouse with Hive
Elastic (ELK) Stack