Joyful AI, Book 1
First published by Joyously Aware Media in 2018.
Copyright © Alan French, 2018. All rights reserved.
First edition 2018.
ISBN: 978-988-78725-4-2 (ebook), 978-988-78725-5-9 (print)
Published in Hong Kong.
No part of this publication may be reproduced, stored, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise without written permission from the publisher. It is illegal to copy this book, post it to a website, or distribute it by any other means without permission.
Copyediting by Leonora Bulbeck, reedsy.com/leonora-bulbeck
Image credits: Cover image by user ulleo at Pixabay. https://pixabay.com/en/walnut-nut-shell-nutshell-open-3072681/ CC0 Creative Commons licence, free for commercial use, no attribution required.
Many thanks to my students, who kept asking all the good questions. I hope that this little book answers some of them.
This is a book on neural networks for non-technical readers. Nowadays, when AI and neural networks influence and control all of our lives, everyone needs to have at least a basic idea of what neural networks are and how they work. Students of the humanities, such as philosophy, ethics, linguistics and translation, but also of social science, political science, medicine and even the arts, will most likely, at some point in their careers, be confronted with some profound effect of AI systems on their fields of study. Specialists in all fields, car drivers, programmers and web designers will interact more and more with AI. Many of us will eventually be replaced entirely by AI systems on the job market. Already, neural networks built into our phones, computers, intelligent fridges and smart speakers are slowly taking over control of our homes.
If you are anything like me, you are fascinated by artificial intelligence, neural networks and all the magic they can do. But you are scared by the math that is behind them. If you look at online courses on neural networks, perhaps your eyes cloud over when you first meet the derivative of the sigmoid function, without which no neural network would be able to learn.
Or perhaps you are just a citizen who is concerned about how deep learning is changing our society. It is crucial for a democracy that we, as citizens, are able to understand the forces that control our lives, our futures and the futures of our children, and that we don’t surrender control to a caste of AI priests that manage our societies in our name. This is a problem not only for AI but also for many other technologies. If I want to know whether I want nuclear power in my country, I need to have a basic understanding of what nuclear power is and what its dangers and benefits are. In short, I need to be able to make an informed choice.
It is crucial for a democracy that we, as citizens, are able to understand the forces that control our lives.
This is precisely the aim of this book. This book is not meant for programmers, and even less for AI researchers. But I hope that it can be of use in an introductory AI class outside of computer science, for example, in the humanities, arts and social sciences. It should also be useful to the general reader who wants an accessible first introduction to these technologies.
After reading this book, you should have a pretty good understanding, not only of the technology behind neural networks, artificial intelligence and deep learning but also of the significance of these technologies for society, for ethics and for questions of responsibility and liability ascription in law.
Let’s dive in!
Neural networks have become fashionable. But the technology is not as new as one might think. The first artificial neurons were actually created in the 1950s, so they are just as old as other forms of artificial intelligence and almost as old as digital computers in general.
The reason artificial neural networks did not catch on earlier is that they require relatively large amounts of computational power. Early computers could not run neural network code at speeds sufficient to be practical.
Only since around 2010 has the development of hardware caught up with the requirements of deep neural networks. Suddenly, after decades of silence around neural networks, all the magical applications that we can see around us today became possible.
Before artificial neural networks were widely used, a programmer would tell the computer what to do by issuing a sequence of commands. The machine would then execute these commands one by one. This way of programming a computer is called imperative programming. The problem with imperative programming is that a programmer can only solve problems for which she can provide such a list of commands. Unfortunately, many real-world problems are so hard that programmers don’t know how to solve them.
Dealing with noisy input is precisely what neural networks are particularly good at.
For example, assume that you want to recognise a handwritten letter. Let’s say the letter a. You could try to describe to the computer what an a looks like. The problem is that different people have different handwriting. Even the same person might write the letter a differently from time to time. It would be very difficult to describe in abstract terms how an a should look and which variations should still count as an a while others do not.
Other kinds of real-world data that are not precise, but noisy, would cause similar problems. Take voice recognition, for example. Right now, I am dictating this paragraph, and the computer recognises my spoken words and types them into a document. It would be extremely hard, if not impossible, to describe in abstract terms how each one of these different words sounds, particularly since no two utterances of the same word sound exactly the same. One might pronounce particular vowels differently from time to time, or one might have a cold, or other environmental noises might interfere with the recording. As we will see later, dealing with noisy input is precisely what neural networks are particularly good at.
Artificial neural networks, as opposed to conventional imperative programs, can be taught to recognise patterns by example. This means that we can create systems that are able to recognise patterns even if we are not able to clearly describe the pattern itself. A neural network that has been shown a great number of different handwritten letters will be able to recognise these letters even if the programmer is not able to describe their differences in a precise way. A neural network that has been successfully trained to recognise a spoken word will be able to identify this word even if the programmer does not know how to describe the word in terms of sound frequencies.
Additionally, neural networks are not sensitive to small changes in their input patterns. If I have trained a neural network by showing it different versions of handwritten letters a, then the neural network will be able to recognise not only the letters it was trained upon but also similar, previously unseen letters. Neural networks can deal with noisy input.
In the rest of this small book, we will see how neural networks achieve these results. Let us first begin with a look at the basic idea behind biological neurons.
Artificial neural networks are inspired by the neurons in living organisms. Although we don’t know precisely and in every detail how biological neurons work, the basic principle behind them is easy to understand.
Whether artificial neural networks actually work like biological ones or not does not really matter much for AI. In the same way that an aeroplane or helicopter can fly without having feathers like a bird, an artificial neural network can process information successfully and perform some of the functions of a biological brain without needing to work technically in exactly the same way.
Here is a very basic image of the functional architecture of a biological neuron.
On the left side of this image, we can see the input side of the neuron. This is where the signals from other neurons come in. Every neuron is connected to many other neurons through long tendrils called dendrites. All the dendrites end up connecting to the cell body. The cell body processes the input signals and decides whether it should emit an output signal or not. If the cell decides to emit an output signal, this signal then travels down the axon. At the end of the axon, the signal splits into many tendrils again, which then connect to the dendrites of other neurons.
Each dendrite connects to its neuron at one point that is called a synapse. Every synapse has the ability to either strengthen or weaken the signal that comes through it. In an abstract sense, we can see the synapse as a kind of regulator that turns the input signal’s “volume” up or down. We therefore speak of synaptic weights. A synaptic weight is just a factor by which the synapse multiplies the input signal before it reaches the neuron. Each synapse can have a different weight, and in this way, each synapse can process its signal differently, either strengthening or weakening it.
The human brain is made up of billions of such neurons. These neurons are arranged into bigger groups that specialise in particular kinds of information processing. Some neurons are responsible for the processing of images from our eyes, while other parts of our brain specialise in memory, hearing, smell, the processing of speech or in controlling our muscles.
In 1957, Frank Rosenblatt developed the first artificial neuron. It is actually a very simple computational device.
The perceptron, as it is called, is inspired by biological neurons. On the left side, you can see the inputs to the perceptron, which correspond to the dendrites of a biological neuron. Each input has its own synaptic weight. A synaptic weight is just a factor that we multiply the input signal by. It is easiest to think of these weights as little “volume dials” that regulate the strength of the input signal. So Weight 1 would be a number between 0 and 1 that is multiplied by Input 1. In this way, the weight can regulate the strength of the input between a minimum of 0 and a maximum of the value of Input 1. The same applies to all other inputs and weights. We will see in a moment why these weights are important.
If the sum of the weighted inputs exceeds a particular threshold, then the neuron will produce an output.
All these weighted inputs are then fed into a function that adds them all up. This is the main processing unit of the neuron. If the sum of the weighted inputs exceeds a particular threshold, then the neuron will produce an output on its output side (which corresponds to the axon). If the sum of the inputs is not strong enough (that is, if it is lower than the threshold value), the neuron will not fire but stay quiet, effectively “swallowing” its input signals.
The threshold, therefore, is a cut-off value. If the sum of the inputs is lower than the threshold, the neuron stays quiet. If the sum of the inputs is bigger than the threshold, the neuron will produce an output signal.
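For readers who are curious, the weighted sum and cut-off described above can be written down in a few lines of Python. This is only an illustrative sketch, and the weights and threshold are made-up example values, not numbers from any real network:

```python
# A minimal sketch of a threshold neuron (perceptron).
# The weights and threshold below are made-up example values.

def perceptron(inputs, weights, threshold):
    # Multiply each input by its synaptic weight ("volume dial") and add them up.
    total = sum(i * w for i, w in zip(inputs, weights))
    # Fire (output 1) only if the weighted sum exceeds the threshold;
    # otherwise stay quiet (output 0), "swallowing" the input signals.
    return 1 if total > threshold else 0

# Three inputs with different weight settings:
print(perceptron([1, 1, 0], [0.9, 0.3, 0.5], threshold=1.0))  # 1.2 > 1.0, fires: 1
print(perceptron([1, 0, 0], [0.9, 0.3, 0.5], threshold=1.0))  # 0.9 < 1.0, quiet: 0
```

Notice that the same neuron gives different answers for different inputs purely because of how the weights amplify or dampen each signal; this is the whole mechanism in a nutshell.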
Sometimes it is useful that neurons do not only work in this binary way, where the output is “silence” or “signal”, or in other words, “0” or “1”. Instead, we might want the neuron to produce a signal that is in some way proportional to the sum of its inputs. In the image you can see that a particular function is used. This is called the sigmoid function. There are many functions that one can use to calculate the output signal from the sum of the input signals of the neuron. All have different properties and are suitable for particular kinds of applications.
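As a small illustration (a sketch, not code from any particular system), the sigmoid function itself takes only one line of Python. It smoothly squashes any sum of inputs into an output between 0 and 1, instead of the all-or-nothing answer of the threshold:

```python
import math

def sigmoid(x):
    # Squashes any number into the range (0, 1):
    # large negative sums give outputs near 0, large positive sums near 1,
    # and a sum of exactly 0 gives 0.5.
    return 1 / (1 + math.exp(-x))

print(sigmoid(-5))  # close to 0
print(sigmoid(0))   # exactly 0.5
print(sigmoid(5))   # close to 1
```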
But we don’t need to go into this detail right now.
A single artificial neuron does not yet do anything very interesting. In order to get artificial neurons to do something useful, we must connect them with each other, just as biological neurons are connected to each other in our brains.
In the picture, we see an example of a small artificial neural network that has eight neurons.
You see that the neurons are now arranged in different layers. First, we have an input layer