AlexandreBrillant/tensorflow-examples
Hello,

While studying the book "Deep Learning with JavaScript" by Shanqing Cai, Stanley Bileschi, Eric D. Nielsen, and François Chollet, I was not entirely satisfied with the examples and results obtained using the TensorFlow.js library.

As a result, I decided to rewrite the examples from scratch using different strategies to see if better results could be achieved. I also replaced the CommonJS modules with modern ES6 modules and removed all unnecessary dependencies (such as the one for the CSV format).

You need to install TensorFlow.js on your machine (I used the plain tfjs package).

Using my package:

npm i

or directly:

npm i @tensorflow/tfjs

(c) Alexandre Brillant

Prediction: example1.js

Data

The data directory contains a local version of the Boston Housing dataset (in CSV format), which includes 12 features and 333 samples. I wrote the CSV parsing myself inside the loadData method.
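For illustration, here is a minimal sketch of what such hand-rolled parsing can look like (the helper name and file layout are assumptions, not the repo's actual loadData code):

// Illustrative sketch only — not the repo's actual loadData code.
// Assumes a header row followed by comma-separated numeric values.
import { readFileSync } from 'fs';

function parseCsv( path ) {
    const lines = readFileSync( path, 'utf8' ).trim().split( '\n' );
    const header = lines[ 0 ].split( ',' );              // feature names
    const rows = lines.slice( 1 )
        .map( line => line.split( ',' ).map( Number ) ); // numeric samples
    return { header, rows };
}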

I chose to normalize the data using min-max scaling: (value − min_value) / (max_value − min_value). I do not use the book's normalization function based on the mean. This is a general normalizer for any tensor2d object.

I added a second parameter, colValues, containing the min/max values for each column of the training set, to make sure the test data is normalized in the same space.

function normalizer( tensor2d, colValues ) {
    const shape = tensor2d.shape;
    const colCount = shape[ 1 ];
    const normalisees = [];
    const lastColValues = [];

    for ( let i = 0; i < colCount; i++ ) {
        // Extract column i as a [rowCount, 1] tensor
        const col = tensor2d.slice( [ 0, i ], [ -1, 1 ] );
        // Reuse the training set's min/max when colValues is provided,
        // otherwise compute them from this tensor's own column
        const minValue = colValues ? colValues[ i ].minValue : col.min();
        const maxValue = colValues ? colValues[ i ].maxValue : col.max();
        const delta = maxValue.sub( minValue );
        // Min-max scaling: (value - min) / (max - min)
        const colNorm = ( col.sub( minValue ) ).div( delta );
        normalisees.push( colNorm );
        lastColValues.push( {
            maxValue,
            minValue
        } );
    }

    return {
        tensor: tf.concat( normalisees, 1 ),
        colValues: lastColValues
    };
}
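Used this way, the test set is projected into the training set's normalization space (variable names here are illustrative):

// Normalize the training data, then reuse its per-column min/max
// so the test data shares the same normalization space.
const train = normalizer( trainTensor );
const test = normalizer( testTensor, train.colValues );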

Goal

I train a non-linear model with 2 layers using various strategies.

The goal is to estimate the price of a house from 12 features (crim, zn, indus, chas, nox, rm, age, dis, rad, tax, ptratio, lstat).

Run the example

npm run ex1

or

node src/example1.js

Book result

The book reports a final loss of about 23, with 50 units for the first layer.

Strategies

All my strategies (the hyperparameters) are configurable via this table:

const strategies = [
    { maxUnits : 1, maxEpochs : 100, loss : "meanAbsoluteError", activation: "relu", optimizer : "sgd" },  // Good
    { maxUnits : 5, maxEpochs : 100, loss : "meanAbsoluteError", activation: "relu", optimizer : "sgd" },  // Good
    { maxUnits : 5, maxEpochs : 100, loss : "meanSquaredError", activation: "relu", optimizer : "sgd" }, // Bad
    { maxUnits : 5, maxEpochs : 100, loss : "meanAbsoluteError", activation: "sigmoid", optimizer : "sgd" }, // Bad
    { maxUnits : 5, maxEpochs : 100, loss : "meanAbsoluteError", activation: "relu", optimizer : "adam" },  // Bad
];

The best result uses only 5 units and the basic "sgd" optimizer.
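As an illustration, here is a minimal sketch of how one strategy entry could be turned into the two-layer model described above (the exact code in example1.js may differ):

import tf from '@tensorflow/tfjs';

// Illustrative sketch: build a two-layer regression model from a strategy.
function buildModel( { maxUnits, loss, activation, optimizer } ) {
    const model = tf.sequential();
    // Hidden layer: maxUnits neurons over the 12 normalized features
    model.add( tf.layers.dense( { inputShape: [ 12 ], units: maxUnits, activation } ) );
    // Output layer: a single neuron predicting the house price
    model.add( tf.layers.dense( { units: 1 } ) );
    model.compile( { loss, optimizer } );
    return model;
}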

Result

The result for this data is:

Loss Result with {"maxUnits":1,"maxEpochs":100,"loss":"meanAbsoluteError","activation":"relu","optimizer":"sgd"} .... : 3.983289031982422

Loss Result with {"maxUnits":5,"maxEpochs":100,"loss":"meanAbsoluteError","activation":"relu","optimizer":"sgd"} .... : 3.623922109603882

Loss Result with {"maxUnits":5,"maxEpochs":100,"loss":"meanSquaredError","activation":"relu","optimizer":"sgd"} .... : 19.283775329589844

Loss Result with {"maxUnits":5,"maxEpochs":100,"loss":"meanAbsoluteError","activation":"sigmoid","optimizer":"sgd"} .... : 6.683498382568359

Loss Result with {"maxUnits":5,"maxEpochs":100,"loss":"meanAbsoluteError","activation":"relu","optimizer":"adam"} .... : 9.521827697753906

The best loss achieved is 3.6, but it is possible to go below 2 by increasing the number of epochs.

Conclusion

The choices proposed in the book were definitely not optimal. With a much more computationally expensive strategy, it reaches a loss of 23, roughly double the worst meanAbsoluteError result of my examples.

It is possible that the normalization technique also has a significant impact.

Binary classification: example2.js

Data

The data directory contains a local version of the phishing websites dataset (in CSV format), which includes 30 features and about 5,000 samples. Again, I wrote the CSV parsing myself inside the loadData method.

There is no need to normalize the data, since every value is already 0 or 1.

Goal

I train a non-linear model with 2 layers using various strategies.

The goal is to classify a website as phishing or not using 30 features. Label 1 means "yes, this is a phishing website" and label 0 means "no, this is not a phishing website".

Run the example

npm run ex2

or

node src/example2.js

Strategies

const strategies = [
    { maxUnits : 10, maxEpochs : 100, loss : "binaryCrossentropy", activation: "sigmoid", optimizer : "adam", threshold:0.5 },
    { maxUnits : 10, maxEpochs : 100, loss : "binaryCrossentropy", activation: "sigmoid", optimizer : "adam", threshold:0.6 },
    { maxUnits : 10, maxEpochs : 100, loss : "binaryCrossentropy", activation: "sigmoid", optimizer : "adam", threshold:0.7 },
    { maxUnits : 10, maxEpochs : 100, loss : "binaryCrossentropy", activation: "sigmoid", optimizer : "adam", threshold:0.8 }
];

We use only binaryCrossentropy, which rewards or penalizes the model depending on how close its predicted probability is to the true label.

When the predicted probability is greater than the threshold, the sample is classified as label 1; otherwise it is label 0, as sketched below. Each time we increase the threshold, we improve the precision of the prediction, because we demand a very high probability.
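A sketch of that decision rule (variable names are illustrative):

// Turn predicted probabilities into 0/1 labels with a threshold
const probabilities = model.predict( xTest );                      // values in [0, 1]
const labels = probabilities.greater( threshold ).cast( 'int32' ); // 1 = phishing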

Result

The results are reported for label 1 only (phishing detection). "Good prediction" measures the quality of the model's label-1 predictions (precision); "missed prediction" covers the cases where the model predicts label 0 for an actual label 1.
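One plausible reading of these two rates, sketched from prediction counts (tp, fp and fn are illustrative variables for true positives, false positives and false negatives):

// Good prediction: share of correct answers among all label-1 predictions
const goodPrediction = 100 * tp / ( tp + fp );
// Missed prediction: share of actual label-1 samples predicted as label 0
const missedPrediction = 100 * fn / ( fn + tp );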

  • Label 1 : Good prediction (97.64791025872988%) - Missed prediction (4.776551474579338%)
  • Label 1 : Good prediction (97.90121223086665%) - Missed prediction (5.210783426813824%)
  • Label 1 : Good prediction (98.91442011941378%) - Missed prediction (8.576081056631084%)
  • Label 1 : Good prediction (100%) - Missed prediction (43.495567215487604%)
  • Label 1 : Good prediction (99.6924190338339%) - Missed prediction (11.072914781979373%)

High precision (100% at threshold 0.7): the model only predicts "phishing" when it is almost certain. Pro: no false alarms (all predicted "phishing" sites are truly phishing). Con: many actual phishing sites are missed (a high missed-prediction rate, e.g. 43%).

Low missed predictions (lower threshold): the model predicts "phishing" more often, catching more actual phishing sites. Pro: fewer missed phishing sites. Con: more false alarms (lower precision).

There is no universal solution! The choice depends on your priority.

Multi-class classification: example3.js

Data

The data directory contains a dataset for the iris flowers.

There are 4 features for detecting the following flowers:

  • Iris-setosa
  • Iris-versicolor
  • Iris-virginica

Each flower label is one-hot encoded as an array with 3 columns:

  • Iris-setosa: [1,0,0]
  • Iris-versicolor: [0,1,0]
  • Iris-virginica: [0,0,1]

Each column is a probability (so 1 means 100%).
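With tf.oneHot, these arrays can be produced directly from class indices (a small sketch):

// One-hot encode class indices 0..2 into 3 columns:
// 0 -> [1,0,0], 1 -> [0,1,0], 2 -> [0,0,1]
const labels = tf.oneHot( tf.tensor1d( [ 0, 1, 2 ], 'int32' ), 3 );
labels.print();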

Goal

Be able to detect a flower from 4 features.

Run the example

npm run ex3

or

node src/example3.js

Strategies

const strategies = [
    { maxUnits : 100, maxEpochs : 500, loss : "categoricalCrossentropy", activation: "sigmoid", optimizer : "adam" },
    { maxUnits : 10, maxEpochs : 250, loss : "categoricalCrossentropy", activation: "sigmoid", optimizer : "adam" },
    { maxUnits : 100, maxEpochs : 250, loss : "categoricalCrossentropy", activation: "relu", optimizer : "adam" },
    { maxUnits : 10, maxEpochs : 500, loss : "categoricalCrossentropy", activation: "relu", optimizer : "adam" },
    { maxUnits : 10, maxEpochs : 500, loss : "categoricalCrossentropy", activation: "tanh", optimizer : "adam" }
];

The "categoricalCrossentropy" is required for a multi-class problem. Here we have 3 labels for 3 flowers.

Result

We display both the total accuracy and the per-flower accuracy for each strategy. You may run the example several times to compare results.

{"maxUnits":100,"maxEpochs":500,"loss":"categoricalCrossentropy","activation":"sigmoid","optimizer":"adam"}
Total Accuracy =98%
- Flower Iris-setosa Accuracy = 100 %
- Flower Iris-versicolor Accuracy = 100 %
- Flower Iris-virginica Accuracy = 96 %
--------------------------------------------
{"maxUnits":10,"maxEpochs":250,"loss":"categoricalCrossentropy","activation":"sigmoid","optimizer":"adam"}
Total Accuracy =68%
- Flower Iris-setosa Accuracy = 100 %
- Flower Iris-versicolor Accuracy = 100 %
- Flower Iris-virginica Accuracy = 65 %
--------------------------------------------
{"maxUnits":100,"maxEpochs":250,"loss":"categoricalCrossentropy","activation":"relu","optimizer":"adam"}
Total Accuracy =98%
- Flower Iris-setosa Accuracy = 100 %
- Flower Iris-versicolor Accuracy = 100 %
- Flower Iris-virginica Accuracy = 73 %
--------------------------------------------
{"maxUnits":10,"maxEpochs":500,"loss":"categoricalCrossentropy","activation":"relu","optimizer":"adam"}
Total Accuracy =97%
- Flower Iris-setosa Accuracy = 100 %
- Flower Iris-versicolor Accuracy = 100 %
- Flower Iris-virginica Accuracy = 77 %
--------------------------------------------
{"maxUnits":10,"maxEpochs":500,"loss":"categoricalCrossentropy","activation":"tanh","optimizer":"adam"}
Total Accuracy =98%
- Flower Iris-setosa Accuracy = 100 %
- Flower Iris-versicolor Accuracy = 96 %
- Flower Iris-virginica Accuracy = 100 %

The relu activation is good enough for the first layer; the sigmoid used in the book is not necessary. 10 neurons are also enough for the first layer.

Convolution for MNIST: example4.js

Data

The MNIST dataset contains 60,000 28x28 images for training and 10,000 28x28 images for testing. Each image shows a digit between 0 and 9, and the label is that digit.

The database uses a specific ubyte format for storing each image and label.
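For reference, the image file in that format starts with four big-endian 32-bit integers (magic number, image count, rows, columns), followed by one unsigned byte per pixel. A minimal Node.js sketch (the helper name is an assumption):

// Illustrative sketch: decode the MNIST ubyte image file.
import { readFileSync } from 'fs';

function readImages( path ) {
    const buf = readFileSync( path );
    const count = buf.readUInt32BE( 4 );   // number of images
    const rows = buf.readUInt32BE( 8 );    // 28
    const cols = buf.readUInt32BE( 12 );   // 28
    const pixels = new Float32Array( count * rows * cols );
    for ( let i = 0; i < pixels.length; i++ ) {
        pixels[ i ] = buf[ 16 + i ] / 255; // scale each byte to [0, 1]
    }
    return { count, rows, cols, pixels };
}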

Goal

Be able to detect the digit drawn inside an image.

Run the example

npm run ex4

or

node src/example4.js

Important note: training on the MNIST database is very costly for the CPU, so it is recommended to switch from

import tf from '@tensorflow/tfjs';

to

import tf from '@tensorflow/tfjs-node';

or, better, if you have a GPU:

import tf from '@tensorflow/tfjs-node-gpu';

Note that on ARM (my CPU here), you cannot switch to tfjs-node or tfjs-node-gpu.

I have added a limitSize parameter to cap the number of samples when you use plain tfjs. Otherwise, you should set

const limitSize = Number.MAX_SAFE_INTEGER;

to run on the whole MNIST database.

Strategies

const strategies = [
    { kernelSize : 2, filters : 8, units : 32 },
    { kernelSize : 3, filters : 16, units : 64 },
    { kernelSize : 3, filters : 32, units : 64 },
    { kernelSize : 3, filters : 32, units : 128 },
    { kernelSize : 3, filters : 64, units : 64 },
    { kernelSize : 2, filters : 32, units : 64 },
    { kernelSize : 4, filters : 32, units : 64 }
];

I have limited the model to a single convolution layer for performance reasons; a sketch of such a model is shown below.
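A minimal sketch of this single-convolution model, built from one strategy entry (the pooling and output layers are assumptions; the actual model in example4.js may differ):

// Illustrative sketch: one convolution layer, then a dense classifier.
function buildModel( { kernelSize, filters, units } ) {
    const model = tf.sequential();
    model.add( tf.layers.conv2d( {
        inputShape: [ 28, 28, 1 ],   // one grayscale 28x28 image
        kernelSize,
        filters,
        activation: 'relu'
    } ) );
    model.add( tf.layers.maxPooling2d( { poolSize: 2 } ) );
    model.add( tf.layers.flatten() );
    model.add( tf.layers.dense( { units, activation: 'relu' } ) );
    model.add( tf.layers.dense( { units: 10, activation: 'softmax' } ) ); // one class per digit
    model.compile( { optimizer: 'adam', loss: 'categoricalCrossentropy', metrics: [ 'accuracy' ] } );
    return model;
}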

Result

The training/evaluation here uses only 1,000 images, which impacts the accuracy rate. I also did not increase the epochs parameter, for performance reasons.

{"kernelSize":2,"filters":8,"units":32}
Loss : 2.100555896759033 / Accuracy : 40.50%
Evaluating...
================================
{"kernelSize":3,"filters":16,"units":64}
Loss : 1.7298730611801147 / Accuracy : 62.10%
Evaluating...
================================
{"kernelSize":3,"filters":32,"units":64}
Loss : 1.7157704830169678 / Accuracy : 52.60%
Evaluating...
================================
{"kernelSize":3,"filters":32,"units":128}
Loss : 1.5706546306610107 / Accuracy : 56.00%
Evaluating...
================================
{"kernelSize":3,"filters":64,"units":64}
Loss : 1.6900482177734375 / Accuracy : 53.20%
Evaluating...
================================
{"kernelSize":2,"filters":32,"units":64}
Loss : 1.820560097694397 / Accuracy : 48.10%
Evaluating...
================================
{"kernelSize":4,"filters":32,"units":64}
Loss : 1.6055097579956055 / Accuracy : 56.80%

It seems there is no big gain beyond roughly 16 filters, or beyond roughly 64 units for the dense layer.
