What’s strange about these write-ups is that they don’t mention much about what the brain does. We know it has dedicated portions for memory, from immediate needs to long-term storage. It also ties specialized areas into that memory to feed it and be checked by it.
The natural thought should be: can we do this with neural networks? And what have people made that works like the hippocampus? And, since content-addressable memories solve similar problems, can we do that with neural networks?
If you look into that, you find that people have built these things. Now ML engineers just need to build open-source prototypes that let more people experiment with them. I’d especially like to see them used where we can assess whether they mitigate hallucinations. They might be great for synthetic data generation, too.
https://pmc.ncbi.nlm.nih.gov/articles/PMC1074338/
https://elifesciences.org/articles/77185
https://perso.uclouvain.be/michel.verleysen/papers/jssc89mv....
https://proceedings.mlr.press/v162/sharma22b/sharma22b.pdf
https://apps.dtic.mil/sti/tr/pdf/ADA192716.pdf
https://www.cell.com/iscience/fulltext/S2589-0042(23)02448-3
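For concreteness, here is a minimal sketch of the content-addressable idea: a classic Hopfield network that stores binary patterns with a Hebbian rule and recalls one from a corrupted probe. It’s a NumPy toy with made-up pattern counts and sizes, not a reconstruction of any particular model from the papers above.

    import numpy as np

    # Toy Hopfield network: a content-addressable memory built from
    # a single layer of +1/-1 neurons. Sizes here are illustrative only.

    def train_hopfield(patterns):
        """Hebbian learning: W accumulates outer products of stored patterns."""
        n = patterns.shape[1]
        W = np.zeros((n, n))
        for p in patterns:
            W += np.outer(p, p)
        np.fill_diagonal(W, 0)          # no self-connections
        return W / patterns.shape[0]

    def recall(W, probe, steps=10):
        """Iteratively settle from a noisy probe toward a stored pattern."""
        state = probe.astype(float).copy()
        for _ in range(steps):
            state = np.sign(W @ state)
            state[state == 0] = 1
        return state

    rng = np.random.default_rng(0)
    stored = rng.choice([-1, 1], size=(3, 64))   # three random 64-bit "memories"
    W = train_hopfield(stored)

    noisy = stored[0].copy()
    noisy[:10] *= -1                             # corrupt the first 10 bits
    print(np.array_equal(recall(W, noisy), stored[0]))  # usually True: recall by content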
Well, part of the reason that this writeup doesn't mention the brain is that the author is ignorant of precise details of how the brain works, and chose not to speculate :) Which is exactly why these papers are such a useful reading list. Cheers!
It's fascinating that we have had so many attempts to add memories to NNs, but it was the end-to-end-learned one based on textual context that eventually drove us to models that have emergent reasoning abilities.
RNNs basically have a memory, in the form of their hidden state, but it's been hard to train them efficiently in parallel. There may be a better way, though, if we can find it.
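To make the hidden-state point concrete, here is a tiny sketch of a plain Elman-style RNN in NumPy. The weights and shapes are arbitrary; the thing to notice is that each h_t depends on h_{t-1}, so the loop over time is inherently sequential, unlike attention over a full context.

    import numpy as np

    # Minimal Elman-style RNN forward pass. The hidden state is the "memory";
    # its recurrence is what makes parallel training over time hard.

    def rnn_forward(x_seq, W_xh, W_hh, b_h):
        h = np.zeros(W_hh.shape[0])
        states = []
        for x_t in x_seq:                                 # sequential by construction
            h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)      # h_t depends on h_{t-1}
            states.append(h)
        return np.stack(states)

    rng = np.random.default_rng(0)
    d_in, d_hid, T = 8, 16, 5
    x_seq = rng.normal(size=(T, d_in))
    W_xh = rng.normal(scale=0.1, size=(d_hid, d_in))
    W_hh = rng.normal(scale=0.1, size=(d_hid, d_hid))
    b_h = np.zeros(d_hid)

    print(rnn_forward(x_seq, W_xh, W_hh, b_h).shape)      # (5, 16): one state per step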
I forgot one tech that’s actually commercialized by lamini.ai:
https://arxiv.org/abs/2406.17642
https://medium.com/pythons-gurus/mixture-of-memory-experts-l...
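For anyone wondering what a “mixture of memory experts” lookup might look like mechanically, here is a very loose sketch of the general idea: route a query to a few stored key/value “experts” and blend their values. Every name, shape, and the routing rule here is an assumption for illustration; it is not Lamini’s actual implementation.

    import numpy as np

    # Loose sketch: pick the top-k most similar memory "experts" by cosine
    # similarity and return a softmax-weighted blend of their values.

    def retrieve(query, keys, values, top_k=2):
        sims = keys @ query / (np.linalg.norm(keys, axis=1) * np.linalg.norm(query) + 1e-9)
        idx = np.argsort(sims)[-top_k:]                   # indices of the winners
        weights = np.exp(sims[idx]) / np.exp(sims[idx]).sum()
        return weights @ values[idx]

    rng = np.random.default_rng(0)
    keys = rng.normal(size=(1000, 32))      # 1000 hypothetical memory experts
    values = rng.normal(size=(1000, 32))
    query = rng.normal(size=32)

    print(retrieve(query, keys, values).shape)            # (32,): blended memory readout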
Re your reply
I’ll add two things to “drove us to models that have emergent reasoning abilities”:
1. It didn’t have that until they spent around $30 million pre-training a specific architecture (GPT-3 175B) on terabytes of data, with portions containing built-in reasoning steps it could extrapolate from. Doing the same with other architectures might have led to similar, better, or worse results. GPT-3’s training investment, not the Transformer architecture itself, is what led to seeing these models as reasoning engines.
2. The memory-less architectures also had lots of hallucinations. Another submission said they’re like professional liars who also don’t know when they’re lying. That used to bite me a lot, even with GPT-4-generated code. What reduces it, especially just using huge models, drives pre-training cost back up to exorbitant levels.
So, if you have tens of millions of dollars, you can pretrain models for reasoning that also confidently lie to us. The open question is whether that’s built into the prevailing architectures. I think it is, given brains’ performance, and given how damage to specific brain components causes hallucinations: brain architecture has layers of mitigations for it.
If we integrate memory, we might see architectures with few to no hallucinations. That can be combined with other techniques already in use. They might also be cheaper to train, if they build on what they remember. That’s a bigger maybe.
Yes! What fun to read, and I think you are pointing in a smarter direction.
Memory as recursion, and the key step toward AGI.
The trick is recursive self-control of attention while being battered by 1001 input streams. Perhaps better to think about “input” as interrupt requests that a “self” must evaluate and usually ignore to stay on task: making all of those sandwiches!
Bodies may be essential soon, along with all of the hard knocks of selection as motivation to memorize, learn, and adapt.
Here is a great book that Terry Winograd and Fernando Flores read very carefully:
“Autopoiesis and Cognition: The Realization of the Living” by Maturana and Varela (1980).
This intense book is an axiomatic approach to life and cognition. Enactivist philosophers are now extending their original insights:
1. Terrence W. Deacon: Incomplete Nature
2. Evan Thompson: Mind in Life
3. Luis Favela: The Ecological Brain: Unifying the Sciences of Brain, Body, and Environment
4. Alva Noë: Out of Our Heads
5. Terry Winograd and Fernando Flores: Understanding Computers and Cognition
6. Douglas Hofstadter: I Am a Strange Loop
Pretty sure that Demis Hassabis, Rich Sutton, and Karl J. Friston, like Maturana and Varela, all have strong backgrounds in neuroscience.
Wonderful! Thanks, I'll definitely read the Maturana and Varela work; it would be good to work through this line of thinking.