Convolutional Neural Network written in Java. This project tries to rewrite the ConvNetJS system built in Javascript.

I've taken the liberty of changing some names, and I haven't implemented all classes yet, but I will try to do that in the future.

The net class is renamed JavaCNN, and the features for creating a network from a definition have been removed. I might build those in the future, but they aren't required for this first implementation.

This implementation is also covered in a video series that you can follow along with at Machine Learning.


The class Vol in ConvNetJS has the new name DataBlock.


A DataBlock holds all the data handled by the network. A layer receives a block of this class and returns a similar block as output, which is then used by the next layer in the chain.


After a back propagation pass through the network we receive the set of weight adjustments required to learn. This result set contains the data used by the trainer.
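As a rough illustration of such a block, here is a minimal sketch of a three-dimensional volume stored in a flat array. The class name `DataBlockSketch` and its fields are hypothetical and chosen for this example; the project's actual DataBlock class may be laid out differently.

```java
// Hypothetical sketch, not the project's actual DataBlock class.
class DataBlockSketch {
    final int width, height, depth;
    final double[] weights;   // the values (activations) carried between layers
    final double[] gradients; // one gradient per value, filled during back propagation

    DataBlockSketch(int width, int height, int depth) {
        this.width = width;
        this.height = height;
        this.depth = depth;
        this.weights = new double[width * height * depth];
        this.gradients = new double[width * height * depth];
    }

    // Maps (x, y, d) to a position in the flat array, with depth varying fastest.
    int index(int x, int y, int d) {
        return (width * y + x) * depth + d;
    }

    double get(int x, int y, int d) {
        return weights[index(x, y, d)];
    }

    void set(int x, int y, int d, double value) {
        weights[index(x, y, d)] = value;
    }
}
```

A 28x28x1 grayscale image would then be `new DataBlockSketch(28, 28, 1)`, holding 784 values.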


A convolutional neural network is built from layers that the data traverses forward and backward in order to predict what the network sees in the data.


The input layer is a simple layer that passes the data through and creates a window into the full training data set. For instance, if we have an image of size 28x28x1, meaning 28 pixels along the x axis, 28 pixels along the y axis, and one color channel (grayscale), then this layer might give you a window of another size, for example 24x24x1, at a randomly chosen position. This adds some distortion to the dataset so the algorithm doesn't over-fit the training data.
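The random window can be sketched as a crop at a random offset. This is only an illustration under the assumption of a single-channel image stored as `double[row][column]`; the class and method names are hypothetical and not taken from the project.

```java
import java.util.Random;

// Hypothetical sketch of the random window described above.
class RandomCropSketch {
    // Copies a cropSize x cropSize window out of the image at a random offset.
    static double[][] crop(double[][] image, int cropSize, Random rng) {
        int maxOffsetY = image.length - cropSize;
        int maxOffsetX = image[0].length - cropSize;
        int offsetY = rng.nextInt(maxOffsetY + 1); // 0..maxOffsetY inclusive
        int offsetX = rng.nextInt(maxOffsetX + 1); // 0..maxOffsetX inclusive
        double[][] window = new double[cropSize][cropSize];
        for (int y = 0; y < cropSize; y++) {
            System.arraycopy(image[offsetY + y], offsetX, window[y], 0, cropSize);
        }
        return window;
    }
}
```

For a 28x28 image and a crop size of 24, each axis has five possible offsets (0 to 4), so the same image can yield 25 slightly different training windows.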


The convolution layer uses different filters to find attributes of the data that affect the result. For example, there could be a filter that finds horizontal edges in an image.
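To make the horizontal-edge example concrete, here is a minimal sketch that slides one 3x3 filter over a single-channel image with stride 1 and no padding. The names and the filter values are assumptions for illustration; in a trained network the filter weights are learned rather than fixed.

```java
// Hypothetical sketch of applying one filter; not the project's convolution layer.
class ConvolutionSketch {
    // Example filter that responds where brightness increases from top to bottom.
    static final double[][] HORIZONTAL_EDGE = {
        {-1, -1, -1},
        { 0,  0,  0},
        { 1,  1,  1}
    };

    // Slides the filter over the image (stride 1, no padding) and sums the products.
    static double[][] convolve(double[][] image, double[][] filter) {
        int fh = filter.length, fw = filter[0].length;
        int oh = image.length - fh + 1, ow = image[0].length - fw + 1;
        double[][] out = new double[oh][ow];
        for (int y = 0; y < oh; y++) {
            for (int x = 0; x < ow; x++) {
                double sum = 0;
                for (int fy = 0; fy < fh; fy++) {
                    for (int fx = 0; fx < fw; fx++) {
                        sum += image[y + fy][x + fx] * filter[fy][fx];
                    }
                }
                out[y][x] = sum;
            }
        }
        return out;
    }
}
```

On a uniform image every output is zero, while an image that is dark on top and bright below produces a strong positive response along the edge.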


The local response normalization (LRN) layer is useful when we are dealing with ReLU neurons. Why is that? Because ReLU neurons have unbounded activations and we need LRN to normalize them. We want to detect high-frequency features with a large response. If we normalize around the local neighborhood of the excited neuron, it becomes even more sensitive compared to its neighbors.

At the same time, it will dampen the responses that are uniformly large in any given local neighborhood. If all the values are large, then normalizing those values will diminish all of them. So basically we want to encourage some kind of inhibition and boost the neurons with relatively larger activations. This has been discussed nicely in Section 3.3 of the original paper by Krizhevsky et al.
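The normalization in Section 3.3 of Krizhevsky et al. divides each activation by a term built from the squared activations of neighboring channels at the same position. The sketch below follows that formula for one spatial position; the class name is hypothetical, the parameter names `n`, `k`, `alpha` and `beta` follow the paper, and this project's own layer may scale the terms differently.

```java
// Hypothetical sketch of across-channel local response normalization.
class LrnSketch {
    // a: activations of all channels at one (x, y) position.
    // n: neighborhood size; k, alpha, beta: constants as named in the paper.
    static double[] normalize(double[] a, int n, double k, double alpha, double beta) {
        double[] b = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            int lo = Math.max(0, i - n / 2);
            int hi = Math.min(a.length - 1, i + n / 2);
            double sumSquares = 0;
            for (int j = lo; j <= hi; j++) {
                sumSquares += a[j] * a[j];
            }
            b[i] = a[i] / Math.pow(k + alpha * sumSquares, beta);
        }
        return b;
    }
}
```

When all neighbors are large the denominator grows and every value is dampened, which is the inhibition effect described above.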


The pooling layer reduces the dataset by creating a smaller, zoomed-out version. In essence, you take a cluster of pixels, take their sum, and put the result in the corresponding position of the new, reduced image.
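A minimal sketch of this reduction, summing each non-overlapping cluster as the paragraph describes. The names are hypothetical; max pooling, a common variant, would keep the largest value of each cluster instead of the sum.

```java
// Hypothetical sketch of the reduction step; not the project's pooling layer.
class PoolingSketch {
    // Reduces each non-overlapping size x size cluster to the sum of its pixels.
    static double[][] pool(double[][] image, int size) {
        int oh = image.length / size, ow = image[0].length / size;
        double[][] out = new double[oh][ow];
        for (int y = 0; y < oh; y++) {
            for (int x = 0; x < ow; x++) {
                double sum = 0;
                for (int dy = 0; dy < size; dy++) {
                    for (int dx = 0; dx < size; dx++) {
                        sum += image[y * size + dy][x * size + dx];
                    }
                }
                out[y][x] = sum;
            }
        }
        return out;
    }
}
```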


Neurons in a fully connected layer have full connections to all activations in the previous layer, as seen in regular Neural Networks. Their activations can hence be computed with a matrix multiplication followed by a bias offset.
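The matrix multiplication followed by a bias offset can be sketched as follows. The class and parameter names are hypothetical; `weights[i]` is assumed to hold the incoming weights of output neuron `i`.

```java
// Hypothetical sketch of a fully connected forward pass.
class FullyConnectedSketch {
    // Computes out = weights * input + bias.
    static double[] forward(double[][] weights, double[] bias, double[] input) {
        double[] out = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            double sum = bias[i];
            for (int j = 0; j < input.length; j++) {
                sum += weights[i][j] * input[j];
            }
            out[i] = sum;
        }
        return out;
    }
}
```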


The dropout layer removes some randomly chosen activations in order to counter over-fitting.
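A minimal sketch of the idea, assuming the standard (non-inverted) dropout formulation: activations are zeroed with probability `dropProb` during training, and at prediction time all activations are kept but scaled by `1 - dropProb`. The names are hypothetical and the project's layer may handle prediction-time scaling differently.

```java
import java.util.Random;

// Hypothetical sketch of dropout; not the project's dropout layer.
class DropoutSketch {
    static double[] forward(double[] in, double dropProb, boolean training, Random rng) {
        double[] out = new double[in.length];
        for (int i = 0; i < in.length; i++) {
            if (training) {
                // Drop this activation with probability dropProb.
                out[i] = rng.nextDouble() < dropProb ? 0.0 : in[i];
            } else {
                // Keep everything, scaled to match the expected training-time output.
                out[i] = in[i] * (1.0 - dropProb);
            }
        }
        return out;
    }
}
```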


The maxout layer implements the Maxout nonlinearity, which computes x -> max(x) where x is a vector of size group_size. Ideally, of course, the input size should be exactly divisible by group_size.
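As a sketch, maxout over a flat input takes the maximum of each consecutive group of `groupSize` values. The class name is hypothetical, and this version rejects inputs whose length is not divisible by the group size rather than guessing how to handle the remainder.

```java
// Hypothetical sketch of the maxout nonlinearity.
class MaxoutSketch {
    static double[] forward(double[] in, int groupSize) {
        if (in.length % groupSize != 0) {
            throw new IllegalArgumentException("input size must be divisible by groupSize");
        }
        double[] out = new double[in.length / groupSize];
        for (int i = 0; i < out.length; i++) {
            double max = in[i * groupSize];
            for (int j = 1; j < groupSize; j++) {
                max = Math.max(max, in[i * groupSize + j]);
            }
            out[i] = max;
        }
        return out;
    }
}
```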


The ReLU layer is a layer of neurons that applies the non-saturating activation function f(x)=max(0,x). It increases the nonlinear properties of the decision function and of the overall network without affecting the receptive fields of the convolution layer.


The sigmoid layer implements the sigmoid nonlinearity elementwise, x -> 1/(1+e^(-x)), so the output is between 0 and 1.


The tanh layer implements the tanh nonlinearity elementwise, x -> tanh(x), so the output is between -1 and 1.
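The three elementwise nonlinearities above (max(0,x), sigmoid and tanh) can be written directly from their formulas. The class name is hypothetical; each function would be applied to every value in a block.

```java
// Hypothetical sketch of the elementwise activation functions.
class ActivationSketch {
    static double relu(double x) {
        return Math.max(0.0, x);
    }

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    static double tanh(double x) {
        return Math.tanh(x);
    }
}
```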

Loss layers


The softmax layer squashes the activations of the fully connected layer and gives you a value between 0 and 1 for every output activation.


The SVM layer uses the input activations to try to find a line that separates the correct activation from the incorrect ones.


The regression layer is used when your output is an array of values, that is, when there is no single class that is the correct activation and you instead try to produce a result close to your training target.
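Assuming the squashing described here is the usual softmax function, it can be sketched as below. The class name is hypothetical; subtracting the maximum before exponentiating is a standard trick to avoid overflow and does not change the result.

```java
// Hypothetical sketch of the softmax function.
class SoftmaxSketch {
    static double[] softmax(double[] activations) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : activations) {
            max = Math.max(max, v);
        }
        double[] out = new double[activations.length];
        double sum = 0;
        for (int i = 0; i < activations.length; i++) {
            out[i] = Math.exp(activations[i] - max); // shifted for numerical stability
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum; // each value ends up in (0, 1) and the values sum to 1
        }
        return out;
    }
}
```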


Trainers take the generated output of activations and gradients in order to modify the weights in the network to make a better prediction the next time the network runs with a data block.


The adaptive delta (AdaDelta) trainer adapts the step size of each weight from running averages of its past squared gradients and past squared updates.


The adaptive gradient (AdaGrad) trainer accumulates the squared gradients over time and uses that sum to scale the weight changes.
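A minimal sketch of the AdaGrad update rule: each weight keeps a running sum of its squared gradients, and the step is divided by the square root of that sum. The class, field names and the small constant `eps` are assumptions for illustration, not the project's trainer API.

```java
// Hypothetical sketch of the AdaGrad update rule.
class AdaGradSketch {
    final double learningRate;
    final double eps; // small constant to avoid division by zero
    final double[] gradSquaredSum;

    AdaGradSketch(int size, double learningRate, double eps) {
        this.learningRate = learningRate;
        this.eps = eps;
        this.gradSquaredSum = new double[size];
    }

    // Updates the weights in place using one gradient per weight.
    void step(double[] weights, double[] grads) {
        for (int i = 0; i < weights.length; i++) {
            gradSquaredSum[i] += grads[i] * grads[i];
            weights[i] -= learningRate * grads[i] / Math.sqrt(gradSquaredSum[i] + eps);
        }
    }
}
```

Because the sum only grows, the effective step size of a frequently updated weight shrinks over the whole run.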


Adaptive Moment Estimation (Adam) is an update to the RMSProp optimizer. It uses running averages of both the gradients and the squared gradients.


Nesterov's method is another extension of gradient descent, due to Yurii Nesterov in 1983, which has subsequently been generalized.
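The Adam update can be sketched as follows, using the commonly published form with bias-corrected running averages. The class and field names are hypothetical, and the project's trainer may organize this differently.

```java
// Hypothetical sketch of the Adam update rule.
class AdamSketch {
    final double learningRate, beta1, beta2, eps;
    final double[] m; // running average of gradients
    final double[] v; // running average of squared gradients
    int t = 0;        // number of steps taken so far

    AdamSketch(int size, double learningRate, double beta1, double beta2, double eps) {
        this.learningRate = learningRate;
        this.beta1 = beta1;
        this.beta2 = beta2;
        this.eps = eps;
        this.m = new double[size];
        this.v = new double[size];
    }

    // Updates the weights in place using one gradient per weight.
    void step(double[] weights, double[] grads) {
        t++;
        for (int i = 0; i < weights.length; i++) {
            m[i] = beta1 * m[i] + (1 - beta1) * grads[i];
            v[i] = beta2 * v[i] + (1 - beta2) * grads[i] * grads[i];
            // Correct the bias caused by starting both averages at zero.
            double mHat = m[i] / (1 - Math.pow(beta1, t));
            double vHat = v[i] / (1 - Math.pow(beta2, t));
            weights[i] -= learningRate * mHat / (Math.sqrt(vHat) + eps);
        }
    }
}
```

On the very first step the corrected averages equal the gradient and its square, so the weight moves by almost exactly the learning rate.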


Stochastic gradient descent (often shortened to SGD), also known as incremental gradient descent, is a stochastic approximation of the gradient descent optimization method for minimizing an objective function that is written as a sum of differentiable functions. In other words, SGD tries to find minima or maxima by iteration.
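As a small worked example, consider minimizing the sum over samples x_i of (w - x_i)^2 / 2, updating `w` from one sample at a time. The names below are hypothetical, and for reproducibility the samples are visited in a fixed order; a real SGD run would normally shuffle or sample them at random.

```java
// Hypothetical sketch of per-sample gradient descent on a one-dimensional problem.
class SgdSketch {
    // One update: move against the gradient, scaled by the learning rate.
    static double step(double w, double grad, double learningRate) {
        return w - learningRate * grad;
    }

    // Minimizes sum_i (w - x_i)^2 / 2 one sample at a time.
    static double fit(double[] samples, double learningRate, int epochs) {
        double w = 0;
        for (int e = 0; e < epochs; e++) {
            for (double x : samples) {
                double grad = w - x; // gradient of (w - x)^2 / 2 with respect to w
                w = step(w, grad, learningRate);
            }
        }
        return w;
    }
}
```

With a constant learning rate the result hovers near the minimizer (the mean of the samples) instead of settling exactly on it, which is typical of SGD.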


Windowgrad is AdaGrad but with a moving-window weighted average, so the gradient is not accumulated over the entire history of the run. It's also referred to as Idea #1 in Zeiler's paper on AdaDelta.
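The difference from AdaGrad is a single line: the sum of squared gradients becomes a decaying average. A minimal sketch follows, with hypothetical names; `ro` is the decay factor of the moving window.

```java
// Hypothetical sketch of AdaGrad with a moving-window average.
class WindowGradSketch {
    final double learningRate, ro, eps;
    final double[] gradSquaredAvg;

    WindowGradSketch(int size, double learningRate, double ro, double eps) {
        this.learningRate = learningRate;
        this.ro = ro;
        this.eps = eps;
        this.gradSquaredAvg = new double[size];
    }

    // Updates the weights in place using one gradient per weight.
    void step(double[] weights, double[] grads) {
        for (int i = 0; i < weights.length; i++) {
            // Older squared gradients fade out instead of accumulating forever.
            gradSquaredAvg[i] = ro * gradSquaredAvg[i] + (1 - ro) * grads[i] * grads[i];
            weights[i] -= learningRate * grads[i] / Math.sqrt(gradSquaredAvg[i] + eps);
        }
    }
}
```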
