Tuesday, August 15, 2017

Biostatistics for biomedical use..

Feature selection is used to identify the most discriminating features for biomarker discovery, medical diagnosis, and gene selection. 
Random Forest (RF): an ensemble classifier built from multiple decision trees. It uses bagging to construct the ensemble and randomizes the growth of each tree. RF is suitable for high-dimensional, small-sample datasets.
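The two sources of randomness mentioned above (bagging and randomized tree growth) can be sketched in base R. This is not a full random forest, only the sampling steps; the dataset sizes, the number of trees and the seed are made-up values for illustration.

```r
# Sketch of the two sources of randomness in a random forest (base R only).
set.seed(1)        # arbitrary seed, for reproducibility
n_samples  <- 20   # hypothetical small-sample dataset
n_features <- 100  # hypothetical high-dimensional feature set
n_trees    <- 5    # hypothetical ensemble size

# Bagging: each tree is grown on a bootstrap sample,
# i.e. n_samples rows drawn with replacement.
bootstrap_rows <- lapply(seq_len(n_trees), function(i) {
  sample(n_samples, size = n_samples, replace = TRUE)
})

# Randomized growth: at a split, only a random subset of features is considered.
# floor(sqrt(p)) is a common choice for classification.
mtry <- floor(sqrt(n_features))
candidate_features <- sample(n_features, size = mtry)

# Rows never drawn for a tree are "out-of-bag" for that tree.
oob_rows <- lapply(bootstrap_rows, function(rows) {
  setdiff(seq_len(n_samples), rows)
})
```

In a real forest the feature subset is redrawn at every split of every tree; a single draw is shown here to keep the sketch short.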
Support Vector Machine (SVM): a supervised classifier, generally used for binary classification problems, but it can be extended to multi-class problems.
Example: provide part of the data to a linear SVM and tune its parameters so that the SVM acts as a discriminant function separating the ham messages from the spam messages.
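# In R
# Read the SMS data; keep the message text as character strings, not factors
sms_data <- read.csv("sms_spam.csv", stringsAsFactors = FALSE)

# Inspect the first few rows
head(sms_data)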
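Providing "a part of the data" to the classifier usually means a train/test split. A minimal sketch on a toy data frame follows; the column names `type` and `text`, the 75% proportion and the seed are assumptions, not taken from the original file.

```r
# Toy stand-in for sms_data; the columns `type` and `text` are assumed.
toy_sms <- data.frame(
  type = rep(c("ham", "spam"), times = c(12, 8)),
  text = paste("message", 1:20),
  stringsAsFactors = FALSE
)

set.seed(42)                                    # arbitrary seed
n <- nrow(toy_sms)
train_idx <- sample(n, size = floor(0.75 * n))  # 75% for training (assumed)

train_set <- toy_sms[train_idx, ]   # the part given to the classifier
test_set  <- toy_sms[-train_idx, ]  # held out to check the ham/spam separation

# Class balance in the training part
table(train_set$type)
```

The held-out rows are what the tuned SVM would be evaluated on, so that the reported separation is not measured on the same messages used for training.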
************
Parzen window estimation calculates the probability density function (pdf) with a non-parametric approach, i.e. without assuming a particular form for the distribution

Each data point contributes equally to the pdf
Window (kernel) choices: uniform distribution
or normal distribution (bell curve); the pdf is a Gaussian function here
Variance is the square of the standard deviation
Probability (p) = k/n, where k of the n data points fall inside the window
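The notes above can be tried out with a small one-dimensional estimator in base R. This is a sketch under assumptions: the function name `parzen_density`, the sample values and the window width `h` are all made up for illustration.

```r
# Parzen window density estimate in one dimension (base R only).
# `samples` are the observed data points, `h` is the window width.
parzen_density <- function(x, samples, h, kernel = c("gaussian", "uniform")) {
  kernel <- match.arg(kernel)
  sapply(x, function(x0) {
    u <- (x0 - samples) / h
    k <- if (kernel == "gaussian") {
      dnorm(u)                     # Gaussian window: bell curve around each point
    } else {
      as.numeric(abs(u) <= 0.5)    # uniform window: 1 if the point is inside, else 0
    }
    mean(k) / h                    # every data point contributes equally
  })
}

samples <- c(1.0, 1.2, 1.9, 2.1, 2.4, 3.0, 3.3, 4.8)  # made-up data
h <- 1

# Uniform window: this is (k/n) / h, with k = points inside the window at x = 2
p_uniform <- parzen_density(2, samples, h, "uniform")

# Gaussian window: a smooth estimate at the same location
p_gauss <- parzen_density(2, samples, h, "gaussian")
```

With the uniform window the estimate is the fraction k/n of points inside the window, divided by the window width so that the result is a density rather than a probability.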
************
Artificial neural network (ANN)
ANN helps model complex relations between input and output. It finds patterns in data (e.g. protein catabolic rate, optical character recognition (OCR))
Input---hidden---output
An ANN architecture can have many layers, e.g. layer 1 (3 nodes), layer 2 (4 nodes), layer 3 (2 nodes)...
Transfer function = sum over all inputs of (weight * input)
There are many activation functions
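The weighted sum and activation can be sketched as a forward pass through the 3-4-2 architecture mentioned above. The sigmoid activation, the random weights, the zero biases and the helper name `layer_forward` are assumptions chosen for illustration, not something the notes specify.

```r
# One common activation function (an assumed choice; many others exist)
sigmoid <- function(z) 1 / (1 + exp(-z))

# Forward pass through one layer: weighted sum of inputs, then activation
layer_forward <- function(input, weights, bias, activation = sigmoid) {
  z <- as.vector(weights %*% input + bias)  # sum of weight * input per node
  activation(z)
}

set.seed(7)                            # arbitrary seed
x  <- c(0.5, -1.0, 2.0)                # layer 1: 3 input nodes
W1 <- matrix(rnorm(4 * 3), nrow = 4)   # weights into layer 2 (4 hidden nodes)
b1 <- rep(0, 4)
W2 <- matrix(rnorm(2 * 4), nrow = 2)   # weights into layer 3 (2 output nodes)
b2 <- rep(0, 2)

hidden <- layer_forward(x, W1, b1)       # input -> hidden
output <- layer_forward(hidden, W2, b2)  # hidden -> output
```

Each row of a weight matrix holds the weights of one node, so the matrix product computes all weighted sums of a layer at once.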
Deep learning aims to make data analysis sophisticated enough to infer high-level traits such as personality
E is the error: the lowest E means the smallest difference between the desired and actual value (each training iteration tends to minimize E)
A genetic algorithm (GA) searches more randomly
If the error surface E(w) is wavy (many local minima), use a GA
If it has a steep descent, use back propagation or any other method based on gradient descent
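Minimizing E by gradient descent can be sketched for a single linear neuron with one weight. The data, the learning rate `lr`, the number of iterations and the squared-error form of E are assumptions for illustration; the GA alternative is not shown.

```r
# Made-up training data: the desired output is exactly 2 * input
x       <- c(1, 2, 3, 4)
desired <- c(2, 4, 6, 8)

# Error between desired and actual output for weight w (half mean squared error)
E  <- function(w) 0.5 * mean((desired - w * x)^2)
# Gradient of E with respect to w
dE <- function(w) -mean((desired - w * x) * x)

w  <- 0      # starting weight (assumed)
lr <- 0.05   # learning rate (assumed)
errors <- numeric(100)

for (i in seq_len(100)) {
  w <- w - lr * dE(w)   # step downhill along the gradient
  errors[i] <- E(w)     # track E after each training iteration
}
```

On this smooth, bowl-shaped error surface each iteration lowers E and `w` approaches 2. On a wavy E(w), the same update can stall in a local minimum, which is the case where the notes suggest a GA.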
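Clustering can be crisp (each point belongs to exactly one cluster) or fuzzy (each point has a degree of membership in several clusters)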
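The difference can be illustrated in base R. The crisp labels come from `kmeans`; the fuzzy memberships are a one-shot calculation using the fuzzy c-means membership formula around the k-means centers, not a full fuzzy c-means fit. The data, the seed and the fuzzifier `m = 2` are assumptions.

```r
set.seed(3)  # arbitrary seed
# Made-up 2-D data: two groups of 10 points around (0, 0) and (3, 3)
pts <- rbind(
  matrix(rnorm(20, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(20, mean = 3, sd = 0.5), ncol = 2)
)

# Crisp clustering: every point gets exactly one label
km <- kmeans(pts, centers = 2, nstart = 5)
crisp_labels <- km$cluster

# Fuzzy view: degree of membership of every point in every cluster,
# computed from the distances to the k-means centers.
m <- 2  # fuzzifier (assumed value)
d <- sapply(seq_len(nrow(km$centers)), function(j) {
  center <- matrix(km$centers[j, ], nrow(pts), ncol(pts), byrow = TRUE)
  sqrt(rowSums((pts - center)^2))
})
d[d < 1e-12] <- 1e-12            # guard against division by zero
inv <- d^(-2 / (m - 1))
membership <- inv / rowSums(inv)  # each row sums to 1

head(round(membership, 3))
```

A point close to one center gets a membership near 1 for that cluster, while a point between the two centers gets memberships closer to 0.5 each.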
