Obfuscating Language

Google announced that instead of solving bias issues in their algorithms, they will simply stop using words that make that bias explicit.

It is unclear how opting not to label an image with a word that denotes gender will solve any problem currently caused by the assumption of gender based on visuals.

So many forms in the modern world require us to tick a gender box, and we know that it has impact. Think of the to-do over the Apple Card and the men who received better terms and higher credit limits, based on being male. It was revealed that this was embedded in the algorithm. So sure, we can go back and remove gender markers from algorithms, but the question is, what are we trying to do here?

Men tend to have higher insurance rates, as they have more car accidents and deaths by misadventure. Which makes one wonder to what end we've all been reduced to maths, and where that is valuable and where it is not.

Either way, short of never calling out gender, and allowing it to have no impact on any calculations of any sort, refusing to allow an AI to choose the gender of a human in a photo seems a stop-gap measure.

I am reminded of a linguistics paper I was reading, on the gender of animals in nature, and the tendency, in English, to call most animals 'he' unless we have clear evidence to the contrary. An interesting bias, given that English is not a gendered language, like the Romance languages, where such a default would be built in.

Duolingo, for example, always defaults to the masculine, in every sentence, and does not explain that in the early levels of its Romance language apps. If you see a cat and type gatta instead of gatto, it's an error. Which is a pretty hefty bias, in my opinion. It would be lovely if they just swapped the default sentences, and not just the ones about cats, to the feminine.

Conflicted Phonemes

Lawrence Abu Hamdan's visually stunning project, Conflicted Phonemes, records the use of accent and language tests by the Dutch immigration service to validate or deny asylum claims.

Abu Hamdan, Conflicted Phonemes


It reminded me of the October 1937 mass killing of Haitians on the island of Hispaniola, in which Dominican soldiers would ask people they suspected of being Haitian to name the sprig of parsley they carried for this purpose. Perejil, in Spanish, a word difficult for Haitians to pronounce properly. It is unknown how many Haitians were murdered under Trujillo's orders; historians' estimates range from 12,000 to 35,000. I read of this as a teenager, and it still sticks in my memory.

Abu Hamdan's work carries similar emotional weight. Using language to sort people is not new, though the Somalis sent back on the basis of these tests are often killed.

The tests themselves, as he notes, hinge on a couple of words and particular accents. Language as a resource is thus not spread equally across communities, and being born into a particular one may cost a person their asylum claim, or their life.

The visuals he uses, mapping the phonemes, the voices, and the outcomes, are striking, and more so when one understands what is happening: how language is being used against minority communities, against those who do not have documents, and against those whose accents may not match the expectation of the interviewer.

The only thing I can imagine that could make this worse is to let an AI run it. Or maybe that wouldn't be worse? Except it would be difficult to create a database large enough, with enough voices and variety, to ensure adequate representation. Perhaps, though. Hmmm.

If language and accents, as identified by humans, are being used to deny legitimate claims of asylum, perhaps this is a space where we could see greater justice with AI assistance.

Handheld rebel robots

“Researchers at the University of Bristol in the U.K. have developed a handheld robot that predicts a user’s plans, and then frustrates the user by rebelling against those plans, demonstrating an understanding of human intention.”

Demonstrating an understanding of human intention: more than we can say for most humans, at the conscious level. Studies show that we react before our thinking brains get around to deciding what to do, though the thinking brain can override the so-called animal brain. It is interesting that we think we can create robots that successfully predict human actions. And then thwart them.

The goal of this research, from the University of Bristol, is to better understand human-machine cooperation, and thus to better develop helper robots.

“This research is a new and interesting twist on human-robot research as it aims to first predict what users want and then go against these plans.”

Professor Mayol-Cuevas said: “If you are frustrated with a machine that is meant to help you, this is easier to identify and measure than the often elusive signals of human-robot cooperation. If the user is frustrated when we instruct the robot to rebel against their plans, we know the robot understood what they wanted to do.”

I'd be interested to have access to the speech that occurs in this process, to assess the levels of anger and frustration in the voices, and to see any acts of violence by the humans engaged with the robots.

Language and politics

In the Johnson column of the September 7, 2019 issue of The Economist, Johannes Aavik and his museum in Kuressaare, Estonia, serve as a concrete reminder that language is a political tool. This theme runs across all branches of linguistics and the philosophy of language, so it is, of course, nothing new.

Aavik, Johnson tells us, coined more words that came into common use than almost anyone; maybe not Shakespeare, but many. Aavik set about coining new words from 1918 on, when Estonia declared its independence after having been under the control of one neighbour or another for most of the territory's history.

Aavik wanted words, “that sounded beautiful and seemed Estonian.”

While Aavik was part of a wave of nations and languages seeking purity, to ensure that their language was a reflection of their nation and culture, this desire has touched every aspect of cultures of control, from churches and religions to the treatment of minority populations, and more.

The once 'standard' languages and their vernacular counterparts offered a clear view of high and low culture, a divide slowly dissolved by authors writing in the vernacular and by the languages 'of the people' taking hold over the languages of the elite. This has happened to varying degrees, depending on language and place. Some of the languages now dying off are doing so because they are not given value equal to the majority languages, such as Spanish or English. The dying of a language is, at base, a complex mix of history, culture, and economics.

Americans know less of this, as English swells to include words from other languages, slang, dialects, and other in-group words. Rapid advances in technology require more words, and the speed of the internet, particularly Twitter, spreads these words around quickly.

In addition to meaning and culture and history, words have emotional valence, which can be personal or broader. American English is currently showing rapid shifts, as the President taints words with contextual meanings that may not dissolve anytime soon. For example, 'crooked' is now tied to Hillary Clinton and, depending on where one's beliefs fall, to a scandal of illegality or a scandal of libel. It's just one of many. Watching the news in the US, Fox versus CNN, one can watch words swell up with new meanings and underlying accusations and threats.

In research I ran in 2018, on the meaning of language in the realm of human rights, participants in one session parceled out words by party: freedom is for the right, justice for the left. And no matter which party they belonged to, each word was tainted, held a hidden agenda, and had lost meaning.

It’s hard to know what happens in a country when the concepts of freedom and justice have been politicized and are no longer shared concepts.

While many countries have, for political reasons, attempted to engineer a purist language, as Johnson notes, I'm not sure I've seen one inadvertently bifurcate its language by party to such a degree. Two things come to mind: propaganda and doublespeak. With the rise of populism and the narrowing lines of hate in politics and on the internet, it's hard to imagine a shared language being possible. But if we cannot find a way to agree on the meaning of the core tenets of our country's existence, it's hard to imagine we can slow the divide and become unified around shared ideals.

Non-human languages: birds

I admit, I spend so much time thinking about the evolutionary pathways that machines may take with language, if given enough autonomy, that I rarely write about the types of communication that exist in the world around us.

David G. Haskell's beautiful article, Five Practices for Listening to the Language of Birds, reminds me that paying attention to the world is not enough.

I love that he writes of this as augmented reality:

The practice of listening to other species is the original “augmented reality.” In opening our minds to the language of species, we experience connection and meaning that far transcend anything offered by electronic simulacra. Why so deep? Because attending to the tongues of other species is our inheritance, bequeathed by a lineage of ancestors extending back hundreds of millions of years. Every one of these grandmothers and grandfathers lived in attentive relationship with the sounds of other species, the diverse conversation of the living Earth.

Though at the same time, I wish he wouldn't, because this is simply reality. Perhaps more pleasurable is the phrase "ecological polylinguists," which is exactly what we are, and what other species are as well. Perhaps the greatest difference is that many humans pretend we are not, that the languages of other species, of the earth herself, are not as relevant as the languages we speak.

Language is not the only channel through which we fail to hear the communications of our planet. Scent is another crucial pathway of information. And I like to believe that the earth's magnetic field, which we humans are not good at sensing but other species are, is another source of enormously important knowledge.

Machine translation from the ancient world

The MIT Technology Review recently posted an article touting the success of machine learning in the translation of long-lost languages. Think Linear A or Linear B: scripts found on ancient tablets from the Minoan and Mycenaean civilizations. Like many ancient languages, they went unread for a very long time; Linear B was deciphered only in the 1950s, and Linear A still has not been.

In the non-machine-assisted manner, you have a man of a different culture and a different time looking at the language's patterns and at what is known of the culture, and interpreting the symbols in a manner that is considered translation. Language IS culture, so if you don't know the culture, you probably cannot really translate the language. Think of English today, with its different varieties and jargons: you might recognize the words but, in context, have no idea what they mean. And this is in a world you largely recognize. Though maybe not.

I often point to the images of the Sumerian gods when asked about translation, and when asked about machine translation, I did the same. Take the following image, and imagine, if you will, that this person lived on Earth, had wings, carried a pail, wore a great skirt, and had two magical flower watches. Place yourself within your six-year-old mind and the stories you could tell about what this person could do. Now imagine you are a British gentleman from the 1800s. What stories would he tell? And you, as you are today, now? We have thrown out so many possible interpretations of the image, because we no longer believe things that may have been believed then. And we certainly don't all believe the same things today.

So back to machine translation, and machine learning. Someone has to input the culture, or, alternately, ignore that there may be variance. This article is about statistical analysis of a language's structure, and it assumes that the analysis can apply to more than one language, that this is meaningful, and that the output may approximate a reality we cannot know.

Another way to think about this: I translate something for you, from English to French, and I tell you it is true in both languages, because I know it to be true, because it matches the pattern.

The scholarly article is here. The authors speak of the 'decipherment of lost languages,' in which they look for cognates, and they claim success, correctly translating 63.7% of the cognates.

Last year I did some linguistic research for the McCain Institute, on the meaning of human rights. In that research, done in the US across demographics, people could not agree on the meaning of human rights, equality, equity, justice, freedom, and other such words. Yes, these are complex concepts, but the degree of variation in meaning was surprising, and I have been doing this kind of research for a long time. Part of the issue seems to be that meanings are changing very fast right now, under pressure from the media, fake news, and other forces acting on language and culture. A corpus analysis of media usage shows enormous variation over the past few years, in usage, structure, and domains. So what I am saying is: we cannot even do this for English, going from 2019 back to 2009.
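To make the cognate-matching idea concrete, here is a deliberately toy sketch of its core mechanic: pairing words from an undeciphered corpus with the most similar words in a known related language. This is not the paper's actual model; the normalized edit-distance scoring and the example words are my own hypothetical illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a single-row dynamic programme."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(
                dp[j] + 1,          # delete ca
                dp[j - 1] + 1,      # insert cb
                prev + (ca != cb),  # substitute (free if characters match)
            )
    return dp[-1]

def best_cognate(word: str, candidates: list[str]) -> str:
    """Pick the candidate at minimal length-normalized edit distance."""
    return min(candidates,
               key=lambda c: edit_distance(word, c) / max(len(word), len(c)))

# A hypothetical 'lost' word matched against a known language's vocabulary:
print(best_cognate("pater", ["mother", "father", "brother"]))  # prints "father"
```

A real decipherment model adds far more structure than this, such as constraints reflecting regular sound change, and that added structure is precisely where the cultural and linguistic assumptions I worry about get baked in.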

Yet here is research saying they have derived grammar from probabilities: a statistical approach to meaning. And as anyone who reads this knows, I don't agree that we can model culture mathematically, nor that we can understand or know the past when there are no living speakers to tell us what the concepts meant. Our view is reductive, especially post-Enlightenment.

Here’s my fellow, crossing the space-time barrier with his fancy beard and his magic pine cone.

Non-magnetic comms networks and the evolution of thought

While I haven't been writing, I have still been watching. The world of AI continues to be interesting, with updates and changes, though we are still arguing over sentience and intelligence, still using dirty and biased data sources, and still arguing whether the demise of humans is on the way, at the hands of machines too bright and too amoral to let us live.

A recent article suggesting that certain humans do have the sensory ability to perceive the earth's magnetic field got me thinking about whether electromagnetism could be a communications network or pathway for the many, many species who can sense it, many in very sophisticated ways. What if there is a whole world happening we've never seen? What would it be to tap into it? Which makes me wonder, as well: as a world of species that evolved on a planet with terrestrial magnetism, what happens if something evolves without it? What would a non-magnetic sentience look like? Could this be what the AIs become?

Not that I actually care about artificial intelligences, specifically; rather I am curious about language evolution in non-human species, about the transmission of information, language, and culture, and about the ways in which we encode the past into a future through the unthinking choices we make in our view of the present. Granted, on that last bit, we think more now than two, five, or ten years ago. But if all of our data comes from the textual residue of the past few hundred years, well, not good, I think. Just not good.


Caliban, revolution and human rights

I've gone farther backwards, this time, in two directions, digging into the origins of language, of machine language, of language machines, and the possibilities of future languages, as yet undone.

The origin of human language is always complicated, and no one has ever found an answer that can be agreed upon. The Société de Linguistique de Paris went so far as to ban the topic in 1866, and there was quite a pause before scholars took it up again.

And in a bookshop last week I found Minsky's 1968 Semantic Information Processing, which has me reading the past: symbolic logic, cybernetics, minimal self-organizing systems, artificial intelligence, machine modeling of human behavior. It's interesting what we have chosen to bring into the present with us. And how much Chomsky there is. Not surprising, but this is no longer a Chomskyan world, I would say.

Because my interests combine language and culture, I care less about why humans have language and more about what it means, both in day-to-day usage and over time. The current American moment, in this respect, is astonishing. Watching cable news, day by day, words morph and shift in the mouths of speakers, meaning one thing one day and the opposite the next. In neither my education nor my life can I recall language moving so quickly, or being used so forcefully to cleave.

But then, we all know, human memory is faulty.

I found myself going back to famous quotes about language, wondering what they would mean if uttered by, or applied to, languages spoken between machines. To make this non-threatening, apply these thoughts to C-3PO and R2-D2.

“Language, that most human invention, can enable what, in principle, should not be possible. It can allow all of us, even the congenitally blind, to see with another person’s eyes.”

That is Oliver Sacks. We give machines the ability to modify what we teach them. If they modify language, is it then no longer the most human invention? Or is the machine human, does it become human, once it has language?

In earlier work I did on dying and dead languages, I noted that outright killing a language is considered a breach of human rights, a protection meant for minority populations. If two machines speak a modified language together, and a human kills it off, by re-programming, or by shutting the machines down, is this a breach of human rights?

Remember, there was a day when half the humans on this planet did not qualify as human.

If we travel from the other side, asking what rights a language has, what rights a machine has, we do find violations. But how to validate these violations? How to even understand whether they apply, outside of the thought exercise?

AI, ethics, and culture

The mission of the new MIT–IBM Watson AI Lab seems to assume that ethics is a form of applicable math, with natural laws, able to be understood and applied without undue complexity. The mission here

The collaboration aims to advance AI hardware, software, and algorithms related to deep learning and other areas; increase AI’s impact on industries, such as health care and cybersecurity; and explore the economic and ethical implications of AI on society.

is followed by a note that there will be calls for proposals from those affiliated with either of the institutions, in these key areas:

  • AI algorithms: Developing advanced algorithms to expand capabilities in machine learning and reasoning. Researchers will create AI systems that move beyond specialized tasks to tackle more complex problems and benefit from robust, continuous learning. Researchers will invent new algorithms that can not only leverage big data when available, but also learn from limited data to augment human intelligence.
  • Physics of AI: Investigating new AI hardware materials, devices, and architectures that will support future analog computational approaches to AI model training and deployment, as well as the intersection of quantum computing and machine learning. The latter involves using AI to help characterize and improve quantum devices, and researching the use of quantum computing to optimize and speed up machine-learning algorithms and other AI applications.
  • Application of AI to industries: Given its location in IBM Watson Health and IBM Security headquarters in Kendall Square, a global hub of biomedical innovation, the lab will develop new applications of AI for professional use, including fields such as health care and cybersecurity. The collaboration will explore the use of AI in areas such as the security and privacy of medical data, personalization of health care, image analysis, and the optimum treatment paths for specific patients.
  • Advancing shared prosperity through AI: The MIT–IBM Watson AI Lab will explore how AI can deliver economic and societal benefits to a broader range of people, nations, and enterprises. The lab will study the economic implications of AI and investigate how AI can improve prosperity and help individuals achieve more in their lives.

The last item is the only one where, they note, ethics lives, and it does not seem integral to the research in the other areas.

Also of note: there is no nuanced, or even cursory, discussion of the difficulties of ethics, of cultural relativity, of the cultures of the teams doing the creating, or of the cultures into which these advances will be put.

And of course at the end, they note that a key aspect is commercialization.

Another example of enormous dollars being put toward something that will fundamentally change the ways humans and machines function in the world, without any apparent desire to understand how that will change the world, or what it means. The 'delivery of … benefits' sounds secondary to the creation of commercial enterprises and new technologies. The good will be an added benefit, if it comes; the technology is not built on a foundation of clean data and an understanding of the sociological contexts into which it is going. Engineering-heavy organizations: these are who dictate the future of the 21st century.