It’s been a year since I was here; apologies. But here we go again!
I’ve gone farther backwards this time, in two directions, digging into the origins of language, of machine language, of language machines, and the possibilities of future languages, as yet undone.
The origin of human language is always complicated, and no one has ever found an answer that can be agreed upon. The French forbade the topic outright in the 19th century, when the Société de Linguistique de Paris banned discussion of the origin of language in 1866, and there was quite a pause before people took it up again.
And in a bookshop last week I found Minsky’s 1968 Semantic Information Processing, which has me reading the past. Symbolic logic, cybernetics, minimal self-organizing systems, artificial intelligence, machine modeling of human behavior. It’s interesting what we have chosen to bring to the present with us. And how much Chomsky there is. Not surprising, but this is no longer a Chomskian world, I would say.
Because my interests combine language and culture, I care less about why humans have language and more about what that means, both day to day in usage, and over time. The current American world, in this respect, is astonishing. Watching cable news, day by day, the words morph and shift in the mouths of speakers, meaning the opposite one day, then again the next. The speed is astonishing. In both my education and my life I can never recall language moving so quickly, and being so forcefully used to cleave.
But then, we all know, human memory is faulty.
I found myself going back to quotes about language, famous ones, and wondering what they would mean if uttered by, or applied to, languages spoken between machines. To make this non-threatening, apply these thoughts to C-3PO and R2-D2.
“Language, that most human invention, can enable what, in principle, should not be possible. It can allow all of us, even the congenitally blind, to see with another person’s eyes.”
That is Oliver Sacks. We give machines the ability to modify what we teach them. If they modify language, then, is this no longer the most human invention? Or is the machine human, does it become human, once it has language?
In earlier work I did on dying and dead languages, the position was that it is a breach of human rights to outright kill a language. This is meant for minority populations. If two machines speak a modified language together, and a human kills it off, by re-programming, by shutting down the machines, however it is done, is this a breach of human rights?
Remember, there was a day when half the humans on this planet did not qualify as human.
When we travel from the other side, asking what rights a language has and what rights a machine has, we do find violations. But how to validate these violations? How to even understand if they apply, outside of the thought exercise?
The mission of the new MIT–IBM Watson AI Lab seems to assume that ethics are a form of applicable math, that they have natural laws, and can be understood and applied without undue complexity. The mission here
The collaboration aims to advance AI hardware, software, and algorithms related to deep learning and other areas; increase AI’s impact on industries, such as health care and cybersecurity; and explore the economic and ethical implications of AI on society.
is followed by a note that there will be calls for proposals from those affiliated with either of the institutions, in these key areas:
- AI algorithms: Developing advanced algorithms to expand capabilities in machine learning and reasoning. Researchers will create AI systems that move beyond specialized tasks to tackle more complex problems and benefit from robust, continuous learning. Researchers will invent new algorithms that can not only leverage big data when available, but also learn from limited data to augment human intelligence.
- Physics of AI: Investigating new AI hardware materials, devices, and architectures that will support future analog computational approaches to AI model training and deployment, as well as the intersection of quantum computing and machine learning. The latter involves using AI to help characterize and improve quantum devices, and researching the use of quantum computing to optimize and speed up machine-learning algorithms and other AI applications.
- Application of AI to industries: Given its location in IBM Watson Health and IBM Security headquarters in Kendall Square, a global hub of biomedical innovation, the lab will develop new applications of AI for professional use, including fields such as health care and cybersecurity. The collaboration will explore the use of AI in areas such as the security and privacy of medical data, personalization of health care, image analysis, and the optimum treatment paths for specific patients.
- Advancing shared prosperity through AI: The MIT–IBM Watson AI Lab will explore how AI can deliver economic and societal benefits to a broader range of people, nations, and enterprises. The lab will study the economic implications of AI and investigate how AI can improve prosperity and help individuals achieve more in their lives.
The last one is where they note that ethics lives, and it does not seem integral to the research in all areas.
Also to note, there is no nuanced, or even explicit, mention of the difficulties of ethics, of cultural relativity, or of the cultures involved: both those of the teams doing the creating and those where these advances will be put into the world.
And of course at the end, they note that a key aspect is commercialization.
Another example of enormous dollars being put to something which will fundamentally change the ways in which humans and machines function in the world, without seeming to desire to understand how that will change the world, and what this means. The ‘delivery of … benefits’ sounds secondary to the creation of commercial enterprises and new technologies. The good will be an added benefit, if it comes, and the structure of the technology is not focused on beginning with clean data and an understanding of the sociological contexts into which it is going. Engineering-heavy organizations, those who dictate the future of the 21st century.
Really, I like to think that Orcas already are fluent in human speech, and have just been not speaking to us, realizing that once they opened that door there would be no closing it.
The history of modern human science assumes that humans are the smarter species in what seems like all cases. I’m going to prefer to believe that the creepy AMY orca calls are the orca finally tired of humansplaining.
Does this description:
Humanoid form and flexibility – SecondHands will feature an active sensor head, two redundant torque controlled arms, two anthropomorphic hands, a bendable and extendable torso, and a wheeled mobile platform.
Match this image:
It is the wheeled mobile platform. To me, this robot should not have legs, it should look more like the maid from The Jetsons.
Reading the Bloomberg article on NLP comprehension.
Alibaba Group Holding Ltd. put its deep neural network model through its paces last week, asking the AI to provide exact answers to more than 100,000 questions comprising a quiz that’s considered one of the world’s most authoritative machine-reading gauges. The model developed by Alibaba’s Institute of Data Science of Technologies scored 82.44, edging past the 82.304 that rival humans achieved.
What is notable to me is that in this instance, these questions can only have one answer, to be correct.
The quiz itself is based on Wikipedia articles. Remember when you would never let your students use Wikipedia as a source?
As the Bloomberg article notes, NLP ‘mimics’ human comprehension. The underlying belief is that the machines can answer objective questions.
“That means objective questions such as ‘what causes rain’ can now be answered with high accuracy by machines,” Luo Si, chief scientist for natural language processing at the Alibaba institute, said in a statement.
Functionality, and thus comprehension and correctness, is based on a binary model of knowledge, with Wikipedia as the source of what is correct. Much about that sentence is complicated, from my perspective. The binary model of correctness allows for no nuance, and is based on those who have the power to control the narrative. No alternate views, no other models.
It reminds me of taking standardized tests, when none of the answers seemed exactly correct, and I spent my test-taking time trying to imagine which one the test makers believed to be correct. I was forced to fit into the culture of the creators of the exam. Extending this out to what it means that machines ‘know’, and allowing them to provide authoritative answers, seems reductive and dangerous, and it seems to be moving ahead apace.
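The binary scoring described above can be sketched in a few lines. The quoted article does not say how the quiz is scored, so this assumes exact-match scoring against an answer key; the function names `normalize` and `exact_match` and the sample answers are invented for illustration, not taken from the quiz or from Alibaba’s model.

```python
import re
import string


def normalize(answer: str) -> str:
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    answer = answer.lower()
    answer = "".join(ch for ch in answer if ch not in string.punctuation)
    answer = re.sub(r"\b(a|an|the)\b", " ", answer)
    return " ".join(answer.split())


def exact_match(prediction: str, accepted_answers: list) -> int:
    """Return 1 if the prediction equals any accepted answer, else 0.

    There is no partial credit: an answer is either in the key or it is wrong.
    """
    target = normalize(prediction)
    return int(any(target == normalize(a) for a in accepted_answers))


# Hypothetical answer key for 'what causes rain'.
accepted = ["condensation of water vapor"]

print(exact_match("Condensation of water vapor.", accepted))           # 1
print(exact_match("Moist air rises, cools, and condenses", accepted))  # 0
```

The second answer is a reasonable description of rain, and it still scores zero, because whoever wrote the key did not phrase it that way.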
As we sit here this morning, I am reading of Berber languages, and W is reading of Sumer and Akkadian.
Berber, and the Tamasheq variant that particularly interests me, has had a long life as an oral language. Sumerian is one of the first known written languages, and while there is a sample of someone reading Gilgamesh in Akkadian, it was a language that was dead long ago, and we modern humans really do not know what it sounds like. The recreation, however, is beautiful.
Many of the languages which die off are oral languages: the last speakers die, and the language goes with them. This has long been a concern of linguists, and the popular press doesn’t seem to make it through a year without a piece about it as well.
The rise of social media based on images, the use of video, and the use of emoji are all interesting language shifts at play now. It is difficult to make any long-range assumptions, but that makes it no less interesting to watch younger demographics (in particular) prefer to engage with English in its oral form. Not just the in-person conversations that have always existed (and it may be argued that the in-person is diminishing) but the endless YouTube videos and channels with millions of followers. The language variations spoken by many of these English speakers are certainly not the written language that we find in standard texts, lexically and grammatically. The use of emoji shifts English toward a pictographic language: rather than a symbol corresponding to a sound, it corresponds to a concept or an idea.
There are, thus, interesting ideas about the future of the visual language, both photographic and iconic/ideographic, which I am not going to touch at this time.
What I wonder, however, is whether there will be an orality of language that is prioritized in the future, one that shifts the current power and status dynamic in which unwritten languages are a lesser language, an archaic form from a culture which has ‘failed to develop’, despite the many ways in which the more oral languages do have advantages in a cultural context.
Imagine a world in which oral English is how stories are shared, in which access to the storytellers, beyond books, is required to belong, to understand. It is a world we have, perhaps, never lived in, not in the modern English that we speak now. It has been centuries since English was predominantly oral, and that was an earlier version of English. Back to the time of the bards, except this time around, our bards will be digital.
To start, each constructs bilingual dictionaries without the aid of a human teacher telling them when their guesses are right. That’s possible because languages have strong similarities in the ways words cluster around one another. The words for table and chair, for example, are frequently used together in all languages.
So how does the computer know that table and chair are often used together? What about cultures that do not have chairs, but do have tables? How is the computer mapping co-occurrences that have a significant variation by culture, or simply do not exist?
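One simple way a computer can ‘know’ this is by counting. The sketch below is only that, a sketch: it is not the method of the papers (which I haven’t yet read), and the function name `cooccurrence_counts`, the `window` parameter, and the three-sentence corpus are all made up for illustration. It counts how often two words land within a few tokens of each other.

```python
from collections import Counter


def cooccurrence_counts(sentences, window=4):
    """Count unordered word pairs appearing within `window` tokens of each other."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, left in enumerate(tokens):
            # Pair each word with the next `window` words in the same sentence.
            for right in tokens[i + 1 : i + 1 + window]:
                if left != right:
                    counts[tuple(sorted((left, right)))] += 1
    return counts


# Hypothetical toy corpus; a real system would use millions of sentences.
corpus = [
    "the table and chair are by the window",
    "she pulled a chair up to the table",
    "the low table sat on the carpet",
]

counts = cooccurrence_counts(corpus)
print(counts[("chair", "table")])   # 2
print(counts[("carpet", "table")])  # 1
print(counts[("carpet", "chair")])  # 0
```

The counts are only as good as the text they come from. In a corpus from a culture with tables and no chairs, the table and chair pair would simply be zero, and whatever the machine maps table to in the other language would rest on different neighbors.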
This article uses Chinese and Arabic as the example languages for mapping. The underlying cultural principles are rather different, for community, behaviors, constructs, and as these are mapping IN language, does one become more like the other? Does the machine create an Arabic with Chinese sentiments? [The papers use French and English, which are much more similar, culturally and linguistically.]
Treating language like math is not going to turn out well. Though I haven’t yet read the papers that support these assumptions.
Back to my regularly repeated statement: machine translation and language that does not address the significant cultural components of language, as communication, as culture, as transfer mode of ideology, will, in the end, create a different or new culture, and now would be a very good time to be paying more attention to this.
In a recent Atlantic article, “Should Children Form Emotional Bonds With Robots?”, this:
Shut your phone off, and Cozmo shuts down too.
Another example of human affordances to the machine. In order to have your robot work, you must leave your phone on at all times.
Every morning, of late, when I read the news, there is a slew of headlines of what AI has done for us lately.
Just this morning, I read:
- AI has designed Halloween costumes
- AI uncovers anti-aging plant extracts
- AI physicists solve problems humans cannot
- AI robot gets Saudi citizenship
- AI will be the new electricity
Robert Wickham of Salesforce is the source of the last statement, that AI will be the new electricity, once we are done oohing and ahhing. Or being afraid that we will all lose our jobs.
AI, however, is not like electricity. It is not so straightforward. While it may, eventually, be ubiquitous and unconsidered, so far we cannot provide a single and clear definition for what it is, and thus these reductive metaphors create greater confusion than clarity.
In each article ‘AI’ describes something different. Deep learning, neural networks, robotics, hardware, a combination, etc. Even within deep learning or neural networks, the meanings can be different, as can the nuts and bolts. Most media and humans use ‘AI’ as shorthand for whatever suits their context. AI has no agreed-upon definition, and the lack of clarity, differentiation, and understanding makes it very difficult to discuss in a nuanced manner.
There is code, there is data, there is an interface–for inputs and outputs, and all of these are (likely) different in each instantiation. Most of the guts are proprietary, in the combination of code and data and training. So we don’t necessarily know what makes up the artificial intelligence.
Even code, as shorthand to a layperson, as the stuff that makes computers do what they do, is a broad and differentiated category. In this case, like language, it is used for a particular purpose, so this reduction is perhaps not as dangerous. We’ve never argued that code is going to take over the world, or that rogue code is creating disasters. As compared to algorithms, a few years ago, and AI, now.
So how much of this lumping is a problem? We lump things, such as humans or cats, into categories based on like attributes, but we do have some ways to differentiate them. These may not be useful categories: nationality, breed, color, behavior, gender. (Even these are pretty fraught of late, so perhaps our categorization schemes for mammals need some readdressing.) On the other side, we could consider cancer, an incredibly reductive title for a very broad range of…well of what? Tumor types? Mechanisms? Genetic predispositions? There are discussions, given recent research, as to whether cancer should be a noun at all; perhaps it is better served as a verb. Our bodies cancer, I am cancering, to show the current activity of internal cellular misbehavior.
What if we consider this on the intelligence side: how do we speak of intelligence, artificial or otherwise? For intelligence, like consciousness, in humans, we do not have clear explanations for what it is or how it works. So it is not perhaps the simplest domain to borrow language from and apply to machines. Neural networks are one aspect, modeled on human brains, but the term is limited to the structural pathways, a descriptor of how information travels and is stored.
The choice to use AI to represent such a broad range of concepts, behaviors, and functions concerns me. Even in the set of headlines above, it is difficult to know what is being talked about, from a continuum of visible outputs to robots who speak. If we cannot be more clear about what we are discussing it is incredibly complicated to make clear decisions, about functions, about inputs, about outputs, data, biases, ethics, and all the things which have broad impacts on society.
I haven’t seen clear work on how we should use this language, and though I worked with IBM Watson for a while on exactly this concern, I can’t say I have a strong recommendation for how we categorize not just what we have now, but, as importantly, what is being built and what will exist in the future. The near future.
I’ll work on this later, as in soon, ways in which to talk about these parts in a public context that are clearer, and allow for growth of creations into a systems model. Check back!