DEEP LEARNING


Deep learning is a subset of machine learning that uses neural networks, which are built from artificial neurons. In a neural network, the first layer is called the input layer, the last is the output layer, and the layers in between are called hidden layers. Adding several hidden layers can improve results, and each layer can use a different activation function to obtain a better model. The neural network is therefore the basic building block of deep learning architectures.
 
Artificial Neural Network
A multilayer perceptron (MLP) is a feed-forward network consisting of at least three layers of neurons: the first layer is the input layer, the last layer is the output layer, and the middle layer is called the hidden layer. A multilayer perceptron must first be trained on training data; it can then be given new inputs as test data to predict the outcome. When the output is compared with the target data and the error is minimized by propagating it backward through the network, the model is called an ANN with backpropagation.
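The training procedure described above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions: the network size (2 inputs, 3 hidden neurons, 1 output), the sigmoid activations, the squared-error loss, the learning rate and the toy logical-OR data are all choices made for the example, not details taken from the text.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

random.seed(0)
n_in, n_hidden = 2, 3  # assumed sizes for this toy example
# Hidden layer: one weight row per hidden neuron. Output layer: one neuron.
W1 = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [0.0] * n_hidden
W2 = [random.uniform(-1, 1) for _ in range(n_hidden)]
b2 = 0.0

def forward(x):
    """Feed-forward pass: input layer -> hidden layer -> output layer."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
    return h, y

def mean_squared_error(data):
    return sum((forward(x)[1] - t) ** 2 for x, t in data) / len(data)

def train_step(x, t, lr=0.5):
    """Compare the output with the target and propagate the error backward."""
    global b2
    h, y = forward(x)
    # Error signal at the output neuron for the loss 0.5 * (y - t)^2.
    delta_out = (y - t) * y * (1 - y)
    # Error signals at the hidden neurons, computed before any weight changes.
    delta_hid = [delta_out * W2[j] * h[j] * (1 - h[j]) for j in range(n_hidden)]
    for j in range(n_hidden):
        W2[j] -= lr * delta_out * h[j]
        for i in range(n_in):
            W1[j][i] -= lr * delta_hid[j] * x[i]
        b1[j] -= lr * delta_hid[j]
    b2 -= lr * delta_out

# Toy training data (assumed): logical OR of two binary inputs.
train_data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

loss_before = mean_squared_error(train_data)
for _ in range(2000):
    for x, t in train_data:
        train_step(x, t)
loss_after = mean_squared_error(train_data)

print("loss before training:", loss_before)
print("loss after training: ", loss_after)
print("prediction for [1, 0]:", forward([1, 0])[1])
```

A real project would normally use a deep learning library and a separate test set; the loop above only shows where the forward pass, the error comparison and the backward weight updates sit relative to one another.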
 
""
Recurrent Neural Network
A Recurrent Neural Network (RNN) is a deep learning technique in which the network learns from previous computations. In an RNN, the hidden layer at each time step receives both the input for that step and its own output from the previous step. It can therefore take sequential input and produce more than one output.

""

In the above figure, $x_t$ represents the input at time step t and $h_t$ represents the output at time step t. A plain RNN has no dedicated long-term memory, so it generally suffers from the vanishing gradient problem on long sequences. This problem can be mitigated using the LSTM model.
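As a rough illustration of the recurrence and of why gradients vanish, the sketch below runs a one-unit RNN over a sequence. The text gives no formula for the RNN, so the update h_t = tanh(w_x * x_t + w_h * h_{t-1} + b) and the hand-picked scalar weights are assumptions made for the example.

```python
import math

def rnn_forward(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Run a one-unit RNN over a sequence and return one output per input."""
    h = 0.0  # initial hidden state
    outputs = []
    for x_t in inputs:
        # The new state depends on the current input and the previous state.
        h = math.tanh(w_x * x_t + w_h * h + b)
        outputs.append(h)
    return outputs

def gradient_through_time(outputs, w_h=0.8):
    """Derivative of the last state with respect to the first state.

    Each step contributes a factor w_h * (1 - h_t^2), the derivative of
    tanh(w_x * x_t + w_h * h_prev + b) with respect to h_prev.
    """
    grad = 1.0
    for h_t in outputs[1:]:
        grad *= w_h * (1.0 - h_t ** 2)
    return grad

# One impulse followed by zeros: the influence of the first input fades.
outputs = rnn_forward([1.0, 0.0, 0.0, 0.0])
print(outputs)

# Over a longer sequence the product of per-step factors becomes tiny.
long_run = rnn_forward([1.0] + [0.0] * 49)
print(gradient_through_time(long_run))
```

Because every per-step factor here is below 0.8 in magnitude, the product over 49 steps is very small, which is the vanishing gradient effect in miniature.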
 
Long Short-Term Memory (LSTM)
When a memory element is added to an RNN, the result is called an LSTM model. It is a deep learning technique used for problems with long sequences: it mitigates the vanishing gradient problem and is capable of learning long-term dependencies. The memory element is the LSTM cell, which is controlled by three gates: an input gate, a forget gate and an output gate. At each time step the cell receives three inputs: two from the previous step (the previous cell state and the previous output) and the external input. The input gate uses a sigmoid activation and works alongside a tanh layer that proposes candidate values; the forget gate uses a sigmoid activation; at the output gate a sigmoid is combined with a tanh applied to the cell state. In common deep learning libraries an LSTM layer is typically fed three-dimensional input data (samples, time steps, features) and typically returns a two-dimensional output (samples, units). LSTM lets an RNN remember inputs over a long time because the cell state stores information. The input gate controls how much new information from the external input and the previous output is written to the cell state, the forget gate determines whether stored information is kept or deleted, and the output gate controls what the cell emits as output.

""

$i_t$ represents the input gate.
$f_t$ represents the forget gate.
$o_t$ represents the output gate.
$\sigma$ represents the sigmoid function.
$W_{xy}$ represents the weights of gate $x$ applied to input $y$ (for example, $W_{fh}$ is the forget gate weight on $h_{t-1}$).
$h_{t-1}$ represents the output of the previous LSTM block.
$x_t$ represents the input at the current time step.
$b_x$ represents the bias of gate $x$.
$C_t$ represents the cell state at time step t.
$\tilde{C}_t$ represents the candidate for the cell state at time step t.
The forget gate determines whether information should be kept or deleted. It takes $h_{t-1}$ and $x_t$ as
input and produces an output between 0 and 1 for each value in the cell state $C_{t-1}$. A value of
1 means keep the value completely, and 0 means discard it.
$f_t = \sigma(W_{fh} h_{t-1} + W_{fx} x_t + b_f)$
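The forget gate can be sketched for a small layer as follows. The layer size (two units, one input feature) and all numeric values are assumptions chosen so that one unit ends up close to "keep" and the other close to "discard".

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(row, vec):
    return sum(w * v for w, v in zip(row, vec))

def forget_gate(h_prev, x_t, W_fh, W_fx, b_f):
    """Return one value in (0, 1) per unit: sigmoid(W_fh h_prev + W_fx x_t + b_f)."""
    return [sigmoid(dot(W_fh[k], h_prev) + dot(W_fx[k], x_t) + b_f[k])
            for k in range(len(b_f))]

# Hypothetical values for a layer with two units and one input feature.
h_prev = [0.1, -0.2]
x_t = [1.0]
W_fh = [[0.5, -0.3], [0.2, 0.4]]
W_fx = [[1.0], [-1.0]]
b_f = [6.0, -6.0]  # a large positive bias pushes toward 1, a negative one toward 0

f_t = forget_gate(h_prev, x_t, W_fh, W_fx, b_f)
print(f_t)  # first unit close to 1 (keep), second close to 0 (discard)
```

In a trained network these weights are learned, so the gate decides per unit and per time step how much of the old cell state survives.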
The next step decides which new information should be stored in the cell state. It involves two layers: a
sigmoid layer and a tanh layer. The input gate (the sigmoid layer) determines which values to update, and the
tanh layer computes a vector of candidate values $\tilde{C}_t$. These are combined in the next step to produce
the update for the state.
$i_t = \sigma(W_{ih} h_{t-1} + W_{ix} x_t + b_i)$
$\tilde{C}_t = \tanh(W_{ch} h_{t-1} + W_{cx} x_t + b_C)$
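For a single unit, the input gate and the candidate value reduce to two scalar expressions. The sketch below uses made-up scalar weights (all numbers are assumptions) to show the different output ranges of the two layers and how the gate scales the candidate.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def input_gate_and_candidate(h_prev, x_t, w_ih, w_ix, b_i, w_ch, w_cx, b_c):
    """Single-unit version of the input gate and the tanh candidate layer."""
    i_t = sigmoid(w_ih * h_prev + w_ix * x_t + b_i)        # in (0, 1): how much to write
    c_tilde = math.tanh(w_ch * h_prev + w_cx * x_t + b_c)  # in (-1, 1): what to write
    return i_t, c_tilde

# Hypothetical weights and inputs for illustration only.
i_t, c_tilde = input_gate_and_candidate(
    h_prev=0.2, x_t=-1.5,
    w_ih=0.4, w_ix=0.7, b_i=0.1,
    w_ch=0.3, w_cx=0.9, b_c=0.0,
)
update = i_t * c_tilde  # the part that will be added to the cell state
print(i_t, c_tilde, update)
```

The sigmoid output acts as a dial between 0 and 1, so the magnitude of `update` never exceeds that of the candidate itself.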
Now the old cell state $C_{t-1}$ is updated to the new cell state $C_t$. Multiply $C_{t-1}$ by
$f_t$ element-wise, then add $i_t \odot \tilde{C}_t$. The second term is the new candidate values, scaled by
how much we decided to update each state value.
$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$
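A small worked example makes the "multiply by the forget gate, then add the scaled candidate" step concrete. The gate values below are hand-picked assumptions chosen to show three cases: keep the old value, replace it, and blend the two. In the standard LSTM formulation no further activation is applied to this sum.

```python
def update_cell_state(c_prev, f_t, i_t, c_tilde):
    """Element-wise update: f_t * c_prev + i_t * c_tilde."""
    return [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]

c_prev = [2.0, -1.0, 0.5]    # old cell state
f_t = [1.0, 0.0, 0.5]        # keep fully, forget fully, keep half
i_t = [0.0, 1.0, 0.5]        # write nothing, write fully, write half
c_tilde = [0.9, 0.9, -0.5]   # candidate values

c_t = update_cell_state(c_prev, f_t, i_t, c_tilde)
print(c_t)  # → [2.0, 0.9, 0.0]
```

The first unit keeps its old value of 2.0 untouched, which also shows that the cell state is not confined to the range of a sigmoid or tanh.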
The last step decides the output. The final output depends on $h_{t-1}$, $x_t$ and the updated cell state $C_t$. Two layers
are involved: a sigmoid layer (the output gate) and a tanh layer applied to the cell state. The output is two-dimensional data
obtained from the previous state and these two layers.
$o_t = \sigma(W_{oh} h_{t-1} + W_{ox} x_t + b_o)$
$h_t = o_t \odot \tanh(C_t)$
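Putting the gate equations together, one LSTM step and a loop over a batch of sequences can be sketched in plain Python. This is an illustrative sketch, not a reference implementation: the parameter names (`W_fh`, `b_f` and so on), the random initialisation, the zero initial states and the layer sizes are assumptions, and real code would use a vectorised library instead of nested lists.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W_h, W_x, b, h_prev, x_t):
    """Return W_h h_prev + W_x x_t + b as a list with one entry per unit."""
    return [sum(w * h for w, h in zip(wh_row, h_prev))
            + sum(w * x for w, x in zip(wx_row, x_t)) + bias
            for wh_row, wx_row, bias in zip(W_h, W_x, b)]

def lstm_step(x_t, h_prev, c_prev, p):
    """One time step: forget gate, input gate, candidate, cell update, output."""
    f = [sigmoid(z) for z in affine(p["W_fh"], p["W_fx"], p["b_f"], h_prev, x_t)]
    i = [sigmoid(z) for z in affine(p["W_ih"], p["W_ix"], p["b_i"], h_prev, x_t)]
    g = [math.tanh(z) for z in affine(p["W_ch"], p["W_cx"], p["b_c"], h_prev, x_t)]
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    o = [sigmoid(z) for z in affine(p["W_oh"], p["W_ox"], p["b_o"], h_prev, x_t)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

def init_params(n_units, n_features, seed=0):
    """Random weights and zero biases for the four layers (f, i, c, o)."""
    rng = random.Random(seed)
    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    p = {}
    for gate in "fico":
        p[f"W_{gate}h"] = matrix(n_units, n_units)
        p[f"W_{gate}x"] = matrix(n_units, n_features)
        p[f"b_{gate}"] = [0.0] * n_units
    return p

def lstm_forward(batch, n_units, params):
    """Map 3-D input [samples][time steps][features] to 2-D output [samples][units]."""
    outputs = []
    for sequence in batch:
        h = [0.0] * n_units
        c = [0.0] * n_units
        for x_t in sequence:
            h, c = lstm_step(x_t, h, c, params)
        outputs.append(h)  # keep only the output of the last time step
    return outputs

# Two samples, three time steps, two features each (made-up numbers).
batch = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
]
params = init_params(n_units=4, n_features=2)
final_h = lstm_forward(batch, 4, params)
print(len(final_h), len(final_h[0]))  # → 2 4
```

Keeping only the last output per sequence is one common convention; returning the output at every time step instead would give a three-dimensional result.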
