Develop a Neural Network for Banknote Authentication
It can be challenging to develop a neural network predictive model for a new dataset.
One approach is to first inspect the dataset and develop ideas for what models might work, then explore the learning dynamics of simple models on the dataset, then finally develop and tune a model for the dataset with a robust test harness.
This process can be used to develop effective neural network models for classification and regression predictive modeling problems.
In this tutorial, you will discover how to develop a Multilayer Perceptron neural network model for the banknote binary classification dataset.
After completing this tutorial, you will know:
- How to load and summarize the banknote dataset and use the results to suggest data preparations and model configurations to use.
- How to explore the learning dynamics of simple MLP models on the dataset.
- How to develop robust estimates of model performance, tune model performance, and make predictions on new data.
Let's get started.
Develop a Neural Network for Banknote Authentication
Photo by Lenny K Photography, some rights reserved.
Tutorial Overview
This tutorial is divided into four parts; they are:
- Banknote Classification Dataset
- Neural Network Learning Dynamics
- Robust Model Evaluation
- Final Model and Make Predictions
Banknote Classification Dataset
The first step is to define and explore the dataset.
We will be working with the "Banknote" standard binary classification dataset.
The banknote dataset involves predicting whether a given banknote is authentic, given a number of measures taken from a photograph.
The dataset contains 1,372 rows with 5 numeric variables. It is a classification problem with two classes (binary classification).
Below is a list of the five variables in the dataset.
- variance of Wavelet Transformed image (continuous).
- skewness of Wavelet Transformed image (continuous).
- kurtosis of Wavelet Transformed image (continuous).
- entropy of image (continuous).
- class (integer).
Below is a sample of the first few rows of the dataset.
3.6216,8.6661,-2.8073,-0.44699,0
4.5459,8.1674,-2.4586,-1.4621,0
3.866,-2.6383,1.9242,0.10645,0
3.4566,9.5228,-4.0112,-3.5944,0
0.32924,-4.4552,4.5718,-0.9888,0
4.3684,9.6718,-3.9606,-3.1625,0
...
You can learn more about the dataset here:
- Banknote Dataset (banknote_authentication.csv): https://raw.githubusercontent.com/jbrownlee/Datasets/master/banknote_authentication.csv
We can load the dataset as a pandas DataFrame directly from the URL; for example:
# load the banknote dataset and summarize the shape
from pandas import read_csv
# define the location of the dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/banknote_authentication.csv'
# load the dataset
df = read_csv(url, header=None)
# summarize shape
print(df.shape)
Running the example loads the dataset directly from the URL and reports the shape of the dataset.
In this case, we can confirm that the dataset has 5 variables (4 input and one output) and that the dataset has 1,372 rows of data.
This is not many rows of data for a neural network and suggests that a small network, perhaps with regularization, would be appropriate.
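If regularization does turn out to be needed, one simple option is an L2 weight penalty on the hidden layer. The sketch below is only an illustration under assumptions (the penalty value of 0.01 is an arbitrary placeholder, and this is not used later in the tutorial):

# sketch: a small MLP with L2 weight regularization (penalty chosen arbitrarily)
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.regularizers import l2
n_features = 4  # the banknote dataset has four input variables
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', kernel_regularizer=l2(0.01), input_shape=(n_features,)))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy')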
The small dataset also suggests that using k-fold cross-validation would be a good idea, given that it will provide a more reliable estimate of model performance than a train/test split, and because a single model will fit in seconds instead of the hours or days required for the largest datasets.
Next, we can learn more about the dataset by looking at summary statistics and a plot of the data.
# show summary statistics and plots of the banknote dataset
from pandas import read_csv
from matplotlib import pyplot
# define the location of the dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/banknote_authentication.csv'
# load the dataset
df = read_csv(url, header=None)
# show summary statistics
print(df.describe())
# plot histograms
df.hist()
pyplot.show()
Running the example first loads the data and then prints summary statistics for each variable.
We can see that the values vary with differing means and standard deviations; perhaps some normalization or standardization would be required prior to modeling.
                 0            1            2            3            4
count  1372.000000  1372.000000  1372.000000  1372.000000  1372.000000
mean      0.433735     1.922353     1.397627    -1.191657     0.444606
std       2.842763     5.869047     4.310030     2.101013     0.497103
min      -7.042100   -13.773100    -5.286100    -8.548200     0.000000
25%      -1.773000    -1.708200    -1.574975    -2.413450     0.000000
50%       0.496180     2.319650     0.616630    -0.586650     0.000000
75%       2.821475     6.814625     3.179250     0.394810     1.000000
max       6.824800    12.951600    17.927400     2.449500     1.000000
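If standardization does prove useful, a minimal sketch using scikit-learn's StandardScaler is shown below. Note that the tutorial itself fits the model on the raw values, so this step is an optional idea rather than part of the workflow that follows:

# sketch: standardize the four input columns to zero mean and unit variance
from pandas import read_csv
from sklearn.preprocessing import StandardScaler
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/banknote_authentication.csv'
df = read_csv(url, header=None)
X = df.values[:, :-1].astype('float32')
X_scaled = StandardScaler().fit_transform(X)
# confirm the transformed columns have mean ~0 and std ~1
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))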
A histogram plot is then created for each variable.
We can see that perhaps the first two variables have a Gaussian-like distribution, and the next two input variables may have a skewed Gaussian distribution or an exponential distribution.
We may see some benefit from using a power transform on each variable in order to make the probability distribution less skewed, which would likely improve model performance.
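A minimal sketch of such a power transform with scikit-learn's PowerTransformer is below. The Yeo-Johnson method is assumed because, unlike Box-Cox, it handles the negative values present in these columns; this is only an idea to try, not a step used in the models that follow:

# sketch: apply a Yeo-Johnson power transform to reduce skew in the inputs
from pandas import read_csv
from sklearn.preprocessing import PowerTransformer
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/banknote_authentication.csv'
df = read_csv(url, header=None)
X = df.values[:, :-1].astype('float32')
# fit the transform and make each column more Gaussian-like
X_trans = PowerTransformer(method='yeo-johnson').fit_transform(X)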

Histograms of the Banknote Classification Dataset
Now that we are familiar with the dataset, let's explore how we might develop a neural network model.
Neural Network Learning Dynamics
We will develop a Multilayer Perceptron (MLP) model for the dataset using TensorFlow.
We cannot know which model architecture or learning hyperparameters would be good or best for this dataset, so we must experiment and discover what works well.
Given that the dataset is small, a small batch size is probably a good idea, e.g. 16 or 32 rows. Using the Adam version of stochastic gradient descent is a good idea when getting started, as it will automatically adapt the learning rate and works well on most datasets.
Before we evaluate models in earnest, it is a good idea to review the learning dynamics and tune the model architecture and learning configuration until we have stable learning dynamics, then look at getting the most out of the model.
We can do this by using a simple train/test split of the data and reviewing plots of the learning curves. This will help us see if we are over-learning or under-learning; then we can adapt the configuration accordingly.
First, we must ensure all input variables are floating-point values and encode the target label as integer values 0 and 1.
...
# ensure all data are floating point values
X = X.astype('float32')
# encode strings to integer
y = LabelEncoder().fit_transform(y)
Next, we can split the dataset into input and output variables, then into 67/33 train and test sets.
...
# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
# split into train and test datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
We can define a minimal MLP model. In this case, we will use one hidden layer with 10 nodes and one output layer (chosen arbitrarily). We will use the ReLU activation function in the hidden layer and the "he_normal" weight initialization, as together they are a good practice.
The output of the model is a sigmoid activation for binary classification, and we will minimize binary cross-entropy loss.
...
# determine the number of input features
n_features = X.shape[1]
# define model
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy')
We will fit the model for 50 training epochs (chosen arbitrarily) with a batch size of 32, because it is a small dataset.
We are fitting the model on raw data, which we think might be a good idea, but it is an important starting point.
...
# fit the model
history = model.fit(X_train, y_train, epochs=50, batch_size=32, verbose=0, validation_data=(X_test,y_test))
At the end of training, we will evaluate the model's performance on the test dataset and report performance as the classification accuracy.
...
# predict test set
yhat = model.predict_classes(X_test)
# evaluate predictions
score = accuracy_score(y_test, yhat)
print('Accuracy: %.3f' % score)
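Note that predict_classes was removed from the Keras Sequential API in recent TensorFlow releases (around version 2.6). If your version no longer provides it, a sketch of the equivalent thresholding on the sigmoid probabilities, using the same model, X_test, and y_test as above, is:

...
# sketch: equivalent of predict_classes on newer TensorFlow versions
# model.predict returns sigmoid probabilities; threshold at 0.5 to get class labels
yhat = (model.predict(X_test) > 0.5).astype('int32').flatten()
score = accuracy_score(y_test, yhat)
print('Accuracy: %.3f' % score)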
Finally, we will plot learning curves of the cross-entropy loss on the train and test sets during training.
...
# plot learning curves
pyplot.title('Learning Curves')
pyplot.xlabel('Epoch')
pyplot.ylabel('Cross Entropy')
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='val')
pyplot.legend()
pyplot.show()
Tying this all together, the complete example of evaluating our first MLP on the banknote dataset is listed below.
# fit a simple mlp model on the banknote dataset and review learning curves
from pandas import read_csv
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from matplotlib import pyplot
# load the dataset
path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/banknote_authentication.csv'
df = read_csv(path, header=None)
# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
# ensure all data are floating point values
X = X.astype('float32')
# encode strings to integer
y = LabelEncoder().fit_transform(y)
# split into train and test datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
# determine the number of input features
n_features = X.shape[1]
# define model
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy')
# fit the model
history = model.fit(X_train, y_train, epochs=50, batch_size=32, verbose=0, validation_data=(X_test,y_test))
# predict test set
yhat = model.predict_classes(X_test)
# evaluate predictions
score = accuracy_score(y_test, yhat)
print('Accuracy: %.3f' % score)
# plot learning curves
pyplot.title('Learning Curves')
pyplot.xlabel('Epoch')
pyplot.ylabel('Cross Entropy')
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='val')
pyplot.legend()
pyplot.show()
Running the example first fits the model on the training dataset, then reports the classification accuracy on the test dataset.
Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.
In this case, we can see that the model achieved great or perfect accuracy of 100 percent. This might suggest that the prediction problem is easy and/or that neural networks are a good fit for the problem.
Line plots of the loss on the train and test sets are then created.
We can see that the model appears to converge well and does not show any signs of overfitting or underfitting.

Learning Curves of Simple Multilayer Perceptron on Banknote Dataset
We did amazingly well on our first try.
Now that we have some idea of the learning dynamics for a simple MLP model on the dataset, we can look at developing a more robust evaluation of model performance on the dataset.
Robust Model Evaluation
The k-fold cross-validation procedure can provide a more reliable estimate of MLP performance, although it can be very slow.
This is because k models must be fit and evaluated. This is not a problem when the dataset size is small, such as the banknote dataset.
We can use the StratifiedKFold class and enumerate each fold manually, fit the model, evaluate it, and then report the mean of the evaluation scores at the end of the procedure.
...
# prepare cross validation
kfold = StratifiedKFold(10)
# enumerate splits
scores = list()
for train_ix, test_ix in kfold.split(X, y):
    # fit and evaluate the model...
    ...
...
# summarize all scores
print('Mean Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))
We can use this framework to develop a reliable estimate of MLP model performance with our base configuration, and even with a range of different data preparations, model architectures, and learning configurations.
It is important that we first developed an understanding of the learning dynamics of the model on the dataset in the previous section before using k-fold cross-validation to estimate the performance. If we started tuning the model directly, we might get good results, but if not, we might have no idea why, e.g. that the model was overfitting or underfitting.
If we make large changes to the model again, it is a good idea to go back and confirm that the model is converging appropriately.
The complete example of this framework to evaluate the base MLP model from the previous section is listed below.
# k-fold cross-validation of base model for the banknote dataset
from numpy import mean
from numpy import std
from pandas import read_csv
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from matplotlib import pyplot
# load the dataset
path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/banknote_authentication.csv'
df = read_csv(path, header=None)
# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
# ensure all data are floating point values
X = X.astype('float32')
# encode strings to integer
y = LabelEncoder().fit_transform(y)
# prepare cross validation
kfold = StratifiedKFold(10)
# enumerate splits
scores = list()
for train_ix, test_ix in kfold.split(X, y):
    # split data
    X_train, X_test, y_train, y_test = X[train_ix], X[test_ix], y[train_ix], y[test_ix]
    # determine the number of input features
    n_features = X.shape[1]
    # define model
    model = Sequential()
    model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
    model.add(Dense(1, activation='sigmoid'))
    # compile the model
    model.compile(optimizer='adam', loss='binary_crossentropy')
    # fit the model
    model.fit(X_train, y_train, epochs=50, batch_size=32, verbose=0)
    # predict test set
    yhat = model.predict_classes(X_test)
    # evaluate predictions
    score = accuracy_score(y_test, yhat)
    print('>%.3f' % score)
    scores.append(score)
# summarize all scores
print('Mean Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))
Running the example reports the model performance for each iteration of the evaluation procedure and reports the mean and standard deviation of classification accuracy at the end of the run.
Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.
In this case, we can see that the MLP model achieved a mean accuracy of about 99.9 percent.
This confirms our expectation that the base model configuration works very well for this dataset; indeed, the model is a good fit for the problem, and perhaps the problem is quite trivial to solve.
This is surprising (to me) because I would have expected some data scaling and perhaps a power transform to be required.
>1.000
>1.000
>1.000
>1.000
>0.993
>1.000
>1.000
>1.000
>1.000
>1.000
Mean Accuracy: 0.999 (0.002)
Next, let's look at how we might fit a final model and use it to make predictions.
Final Model and Make Predictions
Once we choose a model configuration, we can train a final model on all available data and use it to make predictions on new data.
In this case, we will use the base model configuration with a small batch size as our final model.
We can prepare the data and fit the model as before, although on the entire dataset instead of a training subset of the dataset.
...
# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
# ensure all data are floating point values
X = X.astype('float32')
# encode strings to integer
le = LabelEncoder()
y = le.fit_transform(y)
# determine the number of input features
n_features = X.shape[1]
# define model
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy')
We can then use this model to make predictions on new data.
First, we can define a row of new data.
...
# define a row of new data
row = [3.6216,8.6661,-2.8073,-0.44699]
Note: I took this row from the first row of the dataset, and the expected label is a '0'.
We can then make a prediction.
...
# make prediction
yhat = model.predict_classes([row])
Then we invert the transform on the prediction, so we can use or interpret the result as the correct label (which is just an integer for this dataset).
...
# invert transform to get label for class
yhat = le.inverse_transform(yhat)
And in this case, we will simply report the prediction.
...
# report prediction
print('Predicted: %s' % (yhat[0]))
Tying this all together, the complete example of fitting a final model for the banknote dataset and using it to make a prediction on new data is listed below.
# fit a final model and make predictions on new data for the banknote dataset
from pandas import read_csv
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
# load the dataset
path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/banknote_authentication.csv'
df = read_csv(path, header=None)
# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
# ensure all data are floating point values
X = X.astype('float32')
# encode strings to integer
le = LabelEncoder()
y = le.fit_transform(y)
# determine the number of input features
n_features = X.shape[1]
# define model
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy')
# fit the model
model.fit(X, y, epochs=50, batch_size=32, verbose=0)
# define a row of new data
row = [3.6216,8.6661,-2.8073,-0.44699]
# make prediction
yhat = model.predict_classes([row])
# invert transform to get label for class
yhat = le.inverse_transform(yhat)
# report prediction
print('Predicted: %s' % (yhat[0]))
Running the example fits the model on the entire dataset and makes a prediction for a single row of new data.
Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.
In this case, we can see that the model predicted a "0" label for the input row.
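If you also want the model's confidence in that prediction, a small sketch (using the same model and row from the listing above) is to call predict directly and read off the sigmoid output, which is the predicted probability of class 1:

...
# sketch: report the predicted probability of class '1' for the new row
from numpy import asarray
prob = model.predict(asarray([row]))[0, 0]
print('Predicted probability of class 1: %.3f' % prob)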
Further Reading
This section provides more resources on the topic if you are looking to go deeper.
Summary
In this tutorial, you discovered how to develop a Multilayer Perceptron neural network model for the banknote binary classification dataset.
Specifically, you learned:
- How to load and summarize the banknote dataset and use the results to suggest data preparations and model configurations to use.
- How to explore the learning dynamics of simple MLP models on the dataset.
- How to develop robust estimates of model performance, tune model performance, and make predictions on new data.
Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.



Give attention to capex: Centre removes month-to-month spend caps

Our Favourite Instructor Bracelets to Give and Obtain

European Values Confront AI Innovation in EU’s Proposed AI Act – AI Tendencies

33 Black Historical past Month Actions for February and Past

Entrance-Finish Efficiency Guidelines 2021 — Smashing Journal
