One of today’s most overused buzzwords is “Artificial Intelligence”. Both the technical and the general press are full of articles about machines that compose music, paint masterpieces, recognize cats, drive autonomous cars and invent new languages. Many also talk about intelligent machines being a threat to humanity. Machine Learning is an essential part of the AI puzzle and Deep Learning is one of the most popular approaches to implementing Machine Learning.
Interestingly, Deep Learning is not new. Geoffrey Hinton, together with David Rumelhart and Ronald Williams, demonstrated the use of back-propagation of errors for training multi-layer neural networks in 1986, more than 30 years ago. Even earlier, in the 1960s, Kelley, Bryson and Ho published research papers about dynamic optimization which many consider the basis for back-propagation. Generations of researchers have shown that, given enough data, neural networks can be trained to recognize things. This training consists of slow, iterative adjustments that allow the network to progressively configure itself to produce the desired answer. Deep Learning is not new, but it recently became popular because of the availability of GPU/TPU/VPU architectures which offer some level of parallelism and therefore deliver acceptable performance for some applications.
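To make the idea of slow, iterative adjustment concrete, here is a minimal sketch in Python. It is deliberately reduced to a single linear neuron trained by gradient descent, so it only shows the repeated error-driven correction that back-propagation applies layer by layer, not a multi-layer network. The function name, the learning rate and the toy data are illustrative assumptions.

```python
def train_linear_neuron(samples, learning_rate=0.05, epochs=500):
    """Fit y ~ w*x + b by many small corrections (plain gradient descent).

    Hypothetical helper for illustration only: one neuron, no hidden layers.
    """
    w, b = 0.0, 0.0
    history = []  # mean squared error measured at the start of each epoch
    n = len(samples)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        squared_error = 0.0
        for x, target in samples:
            error = (w * x + b) - target
            squared_error += error * error
            grad_w += 2.0 * error * x   # d(error^2)/dw
            grad_b += 2.0 * error       # d(error^2)/db
        # Nudge the parameters a small step against the average gradient.
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
        history.append(squared_error / n)
    return w, b, history


# Toy data generated from y = 2x + 1 (an assumption made for this example).
samples = [(x, 2.0 * x + 1.0) for x in (0.0, 1.0, 2.0, 3.0)]
w, b, history = train_linear_neuron(samples)
print(f"w={w:.3f}, b={b:.3f}, first error={history[0]:.3f}, last error={history[-1]:.6f}")
```

Even on four data points, hundreds of passes are needed before the parameters settle, which gives a feel for why training large networks on large datasets is computationally expensive.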
Based on a totally different technology called RBF neural networks, Cogito Instruments develops and sells products that also enable machine learning and give machines the ability to use their learned knowledge to recognize things, without the need to learn from back-propagation of errors. The technology Cogito Instruments uses is not new either. Its principles were formulated in 1988 by Broomhead and Lowe. It was then implemented in hardware on the IBM ZISC in 1993. Later, Guy Paillet, who was also at the origin of the IBM ZISC, and Anne Menendez implemented the CM1K chip, which can learn up to 1024 patterns and perform fast learning and classification of patterns. This technology, called NeuroMem®, is at the heart of Cogito Instruments products.
What are the key differences between Deep Learning and RBF-based solutions and why is Cogito not using Deep Learning?
Online vs Offline learning
Deep Learning is an offline learning process. The learning phase and the execution (inference) phase are separate and, very often, are not even processed on the same machine. Typically, the learning phase happens in a data center. A massive data set is crunched to generate a neural network. This takes huge computing resources and can take days depending on the size of the data set and the number of levels in the network. Once the network has been generated, it can be executed to perform the required recognition tasks. Such inference execution can sometimes be achieved on relatively low power devices (Intel-Movidius or Nvidia Jetson are good examples of embedded inference processing platforms that are not capable of embedded learning). More often, powerful PCs with GPU accelerators are used, leading to significant cost and power consumption. Moreover, as the training dataset grows during the learning phase, there is no guarantee that the target inference hardware will remain sufficient and users may have to upgrade their inference hardware to execute properly after a new network has been generated during the learning phase. In a way, this is similar to the PC world where you have to upgrade your hardware regularly if you want to run the newest games. This continuous and fast upgrade cycle is driving a healthy consumer business but is totally unacceptable in an industrial environment.
The most important limitation of this approach is that new training data cannot be incorporated directly and immediately in the executable knowledge. In a fairly static environment where the training data is not changing often, this may not be a problem. For example, speed limit road signs are always the same so you don’t need to learn new ones dynamically.
However, in an industrial environment, novelty is very common. New components, new suppliers and new configurations appear almost every day, and it is critical to be able to train an industrial machine dynamically, on the job, just as we train operators. In fact, modern manufacturing techniques tend to encourage novelty, with smaller volumes per product and higher levels of customization.
Cogito Instruments’ approach, using NeuroMem® neuromorphic technology, solves that problem. Training can happen online, at any time, dynamically. In addition, unlike Deep Learning networks, RBF networks are free of convergence problems and they can be easily mapped onto hardware because the structure of the network does not change with the learned data. This ability to map the complete network onto specialized hardware allows RBF networks to reach unbeatable performance in terms of speed and power dissipation, both for learning and for recognition.
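The following Python sketch illustrates what online learning can look like in an RBF-style classifier. It loosely follows the Restricted Coulomb Energy scheme often associated with this family of networks. It is not the NeuroMem® API: the class, the method names, the Manhattan distance and the default radius are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Neuron:
    prototype: tuple   # the stored pattern
    category: str      # the label taught with that pattern
    radius: float      # influence field: the neuron fires if distance < radius


def l1_distance(a, b):
    """Manhattan distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


class OnlineRbfClassifier:
    """Hypothetical sketch: every neuron has the same fixed structure,
    and learning only commits a new neuron or shrinks existing radii."""

    def __init__(self, max_radius=10.0):
        self.max_radius = max_radius
        self.neurons = []

    def learn(self, vector, category):
        vector = tuple(vector)
        covered = False
        radius = self.max_radius
        for neuron in self.neurons:
            d = l1_distance(vector, neuron.prototype)
            if neuron.category == category:
                covered = covered or d < neuron.radius
            else:
                # A neuron of another category must not fire on this example.
                if d < neuron.radius:
                    neuron.radius = d
                # The new neuron must not reach into that neuron's prototype.
                radius = min(radius, d)
        if not covered:
            self.neurons.append(Neuron(vector, category, radius))

    def classify(self, vector):
        firing = []
        for neuron in self.neurons:
            d = l1_distance(vector, neuron.prototype)
            if d < neuron.radius:
                firing.append((d, neuron.category))
        if not firing:
            return None  # nothing recognized
        return min(firing, key=lambda pair: pair[0])[1]


clf = OnlineRbfClassifier(max_radius=10.0)
clf.learn((0, 0), "good")
clf.learn((20, 20), "scratch")
before = clf.classify((50, 0))    # a kind of part the classifier has never seen
clf.learn((50, 0), "dent")        # taught on the spot, in a single step
after = clf.classify((51, 0))
print(before, after, clf.classify((1, 1)), clf.classify((19, 21)))
```

In this toy run, teaching the new category is a single call with no iteration over the earlier examples, and the answers for the two earlier categories are unchanged afterwards.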
In contrast, any other Neural Network solution that relies on back-propagation of errors for learning needs to be mapped (and remapped after each learning process) onto programmable hardware (CPU, GPU or FPGA, with or without specific hardware assist), which is a lot costlier in terms of complexity and power dissipation. Deep Learning is fundamentally a software technology which requires powerful, expensive and power-hungry hardware to achieve reasonable levels of performance. It often also requires a fair amount of hand coding and tuning to deliver useful performance on the target hardware and is therefore not easily portable.
Local vs Remote learning
Another issue with Deep Learning is that data is usually crunched in a data center, which means that it is handled on someone else’s computer. This may create confidentiality or security issues. Many industrial customers prefer to keep their precious data local. The data used by industrial customers is very sensitive because it may contain their process and quality secrets. Ownership and control of this data are usually critical to their business. With Cogito Instruments’ approach, precious data stays local. It is learned and then recognized on the same machine, in a totally controlled environment. This allows domain experts to train the machines themselves without having to outsource the training process to IT specialists who do not necessarily understand the meaning behind the data. The domain expert has a lot more control over the training of the machine and has full control over the qualification of that machine and its release to production.
Additive learning vs Destroy-and-Re-build learning
When a Deep Learning based system needs to learn something new, it needs to forget everything it knows and learn from scratch, based on the new dataset. In a way, it is similar to the “old manufacturing” style, in which you have to break the old mold and build a new one if you want a different plastic casing. In our modern world of additive manufacturing and flexible production chains, it is paradoxical to introduce a machine learning technology which is not additive in nature.
Besides the lack of flexibility, this creates another potential problem. When a Deep Learning based system “batch-learns” from a new incrementally better dataset, there is no guarantee that previous results will be maintained. In an inspection system, parts that were good before may be bad now and vice versa. When Cogito Instruments’ products learn something new, prior knowledge is not forgotten. It is still there and still produces the same results on known valid data. The result of learning is that what was unknown before becomes known and therefore classifiable. RBF learning is an additive adaptive process, unlike Deep Learning.
It is also important to note that Deep Learning requires a lot of training data to produce acceptable results, whereas Cogito systems can usually be trained with a few tens or hundreds of vectors in most applications. Even with minimal training, the NeuroMem® RBF classifier will output the closest match along with a confidence factor. It is also capable of pinpointing uncertainties and unknowns, thereby enabling dynamic learning. Redundant RBF classifiers can also work in parallel, using different features, to produce more robust decisions. The ability to detect unknown situations is essential for the implementation of anomaly detection in predictive maintenance applications.
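As a rough illustration of how a classifier of this kind can separate confident answers from uncertain and unknown ones, consider the Python sketch below. The three status names, the tuple layout of the neurons and the use of the raw distance as a stand-in for a confidence factor are assumptions made for this example, not a description of the NeuroMem® outputs.

```python
def l1_distance(a, b):
    """Manhattan distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def recognize(neurons, vector):
    """Return (status, closest_category, closest_distance).

    neurons: list of (prototype, category, radius) tuples.
    status is "identified" when all firing neurons agree,
    "uncertain" when they disagree, "unknown" when none fires.
    """
    firing = []
    for prototype, category, radius in neurons:
        distance = l1_distance(vector, prototype)
        if distance < radius:
            firing.append((distance, category))
    if not firing:
        return ("unknown", None, None)
    firing.sort()  # closest firing neuron first
    best_distance, best_category = firing[0]
    categories = {category for _, category in firing}
    status = "identified" if len(categories) == 1 else "uncertain"
    return (status, best_category, best_distance)


# Hand-written knowledge base with two overlapping influence fields.
neurons = [
    ((0, 0), "good", 10),
    ((12, 0), "bad", 10),
]

print(recognize(neurons, (1, 0)))    # only the "good" neuron fires
print(recognize(neurons, (5, 0)))    # both neurons fire and disagree
print(recognize(neurons, (40, 40)))  # no neuron fires
```

An "unknown" or "uncertain" result is exactly the kind of signal that can be routed to an operator as a request for a new training example, or raised as an anomaly in a monitoring application.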
The bottom line for machine makers and machine users is that the marginal cost of increasing the capability of a machine is much lower with Cogito-enabled machines than with Deep Learning.
Predictable recognition latency
Thanks to the NeuroMem® hardware architecture, Cogito products process the vectors in a fully parallel way and therefore produce results in a predictable and constant number of clock cycles, whatever the size of the dataset. This is particularly important if, for example, the recognition is part of a control loop. In this case, having a constant latency is critical for the stability of the whole system. For all industrial applications, low and constant latency is a very desirable feature because it guarantees high and predictable productivity.
With Deep Learning, latency varies. Typically, the more the system learns, the slower it gets. This is due to the Von Neumann architecture bottlenecks found in all computers which are sequential by nature. Even the most modern multi-core architectures, even the best GPU/TPU/VPU architectures, have limitations to their parallelism because some resources (cache, external memory access bus, …) are shared between the cores and therefore limit their true parallelism. The neuromorphic architecture used in Cogito products goes beyond the Von Neumann paradigm and, thanks to its in-memory processing and fully parallel nature, does not slow down when the training dataset grows.
In addition, the shallow nature (3 levels) of RBF networks is not a disadvantage for such applications as researchers have shown that 3 layers are sufficient to solve any pattern classification problem. The quality of the recognition is therefore not compromised.
Deep Learning produces results and, most of the time, it produces good results. However, it is very difficult to explain how the machine got to that result. The deep, multi-layer nature of Deep Learning neural networks makes it very hard to justify a decision. This may even have legal implications considering the new General Data Protection Regulation (GDPR) that will be enacted next year (see the link to Fabio Ciucci’s article below). From a more practical standpoint, the much more transparent and traceable results produced by RBF classifiers are an advantage when trying to understand what went wrong. Cogito products offer full transparency and it is always easy to trace which vector triggered which neuron in the neural network, making it possible to pinpoint erroneous or contradictory training in case of an unexpected false positive.
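A small Python sketch can show what tracing a decision back to its training examples might look like. The record layout, the provenance field and the function name are hypothetical, chosen only to illustrate the idea that each firing neuron can be listed together with the example that created it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Neuron:
    neuron_id: int
    prototype: tuple
    category: str
    radius: float
    source: str  # hypothetical provenance note recorded at training time


def l1_distance(a, b):
    """Manhattan distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def trace_decision(neurons, vector):
    """List every neuron that fired for this vector, closest first."""
    hits = []
    for neuron in neurons:
        distance = l1_distance(vector, neuron.prototype)
        if distance < neuron.radius:
            hits.append((distance, neuron))
    hits.sort(key=lambda hit: hit[0])
    return [
        f"neuron {n.neuron_id} ({n.category}, taught from {n.source}) "
        f"fired at distance {d}"
        for d, n in hits
    ]


# Hand-written knowledge base for the example.
neurons = [
    Neuron(0, (0, 0), "good", 10, "batch A, part 17"),
    Neuron(1, (12, 0), "defect", 10, "batch B, part 4"),
]

report = trace_decision(neurons, (7, 0))
for line in report:
    print(line)
```

If the part at (7, 0) was wrongly rejected, the report points directly at the training example behind the "defect" neuron, which can then be reviewed or corrected.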
Reverse engineering a decision taken by a machine (or a person) is indeed very useful when performing root cause analysis or, more generally, when auditing a process. The “black-box” nature of Deep Learning networks is something researchers and engineers might be fascinated with, but it is unlikely to be welcomed by managers who must manage and minimize risks.
Deep Learning is an exciting field of research and it has produced amazing results in many Cloud-based applications where its limitations are not critical. However, in an industrial environment that demands real-time operation, high productivity and high predictability together with high flexibility, Cogito Instruments considers that Deep Learning is not the best approach to solve the machine learning problems the market is facing for inspection, monitoring, maintenance and robotics applications. In fact, any environment which needs dynamic on-the-job learning, fast and predictable latency, and easy auditing of decisions is likely to be better served by the RBF neural networks offered by Cogito Instruments than by Deep Learning neural networks.
At Cogito Instruments, we look forward to working with our partners to bring exciting Machine Learning solutions to the market!
The Cogito Instruments team