Natural Language Generation: a double-edged weapon for activism

This piece of news may have gone unnoticed internationally: Italy’s Senate reform risks being blocked by Roberto Calderoli, an opposition MP. At the beginning of September 2015, Mr Calderoli filed no fewer than 500,000 amendments to the reform under discussion. A couple of weeks later, the same MP submitted many more, reaching a total of over 80 million proposals. His aim is obviously to block the debate (it would take some 156 years to discuss them all at a rate of one every minute, 24 hours a day) and so to gain a bargaining chip with the government.

How could he produce so many draft proposals in such a short time? With article spinning, a branch of what is known as Natural Language Generation. It’s a technique for producing new texts from an existing one by replacing words with synonyms or expressions that have the same meaning – and all this can be done by a piece of computer software, very quickly and completely automatically.

Let’s take an example. Article 3 of the European Convention on Human Rights says “No one shall be subjected to torture or to inhuman or degrading treatment or punishment”. I fed this sentence into a free online article spinning portal 10 times and got 10 different sentences. The number of possible combinations, though, can be much higher.

Using an online dictionary, I found that some words in this article have the following numbers of synonyms that would work in this context (each number includes the original word):

No one: 4
shall: 6
be subjected to: 4
torture: 2
inhuman: 7
degrading: 4
punishment: 4

Multiplying these numbers (4 × 6 × 4 × 2 × 7 × 4 × 4) gives 21,504 possible combinations – hence more than 21,000 sentences with basically the same meaning. Of course, not all of them would be written in good, elegant English. Still, they would all need to be discussed if presented as amendments in a hypothetical reform of the Convention.
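The arithmetic above, and the basic idea of article spinning, can be sketched in a few lines of Python. This is only an illustration, not the software used by the online portal or by Mr Calderoli. The synonym counts are the ones listed in the post; the small synonym table (`SLOTS`), the template and the function names are hypothetical and chosen just to show the mechanism.

```python
import itertools
import math

# Synonym counts per word, as listed in the post
# (each count includes the original word).
SYNONYM_COUNTS = {
    "No one": 4,
    "shall": 6,
    "be subjected to": 4,
    "torture": 2,
    "inhuman": 7,
    "degrading": 4,
    "punishment": 4,
}


def count_variants(counts):
    """Number of distinct sentences: the product of the per-word counts."""
    return math.prod(counts.values())


# Hypothetical, deliberately tiny synonym table (NOT the author's lists):
# only three slots, with two options each.
TEMPLATE = (
    "{subject} shall be subjected to torture or to "
    "{adj1} or {adj2} treatment or punishment"
)
SLOTS = {
    "subject": ["No one", "Nobody"],
    "adj1": ["inhuman", "cruel"],
    "adj2": ["degrading", "humiliating"],
}


def spin_all(template, slots):
    """Yield every sentence obtained by picking one option per slot."""
    names = list(slots)
    for choice in itertools.product(*(slots[name] for name in names)):
        yield template.format(**dict(zip(names, choice)))


if __name__ == "__main__":
    print(count_variants(SYNONYM_COUNTS))
    for sentence in spin_all(TEMPLATE, SLOTS):
        print(sentence)
```

Even this toy table, with three slots of two options each, already yields eight variants of the sentence; the number grows multiplicatively with every extra slot or synonym, which is why the full list above reaches the tens of thousands.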

Article spinning – and, more widely, Natural Language Generation (NLG) – has a wide range of implications for the development world. The example at the beginning of this post shows that one MP is enough to block draft laws cutting development funding – but, of course, this technique could also be used to block new initiatives going the opposite way.

On the Internet, search engines tend to give more weight to websites with original content than to those with duplicated text. NLG can therefore be a tool for activism through Google bombing.

From an engagement point of view, these techniques can be interesting tools for petitions, both online and offline. If an MP receives tens of thousands of letters or emails saying the same thing in different words, they may feel more compelled to take their voters’ opinion into account.

Moreover, some software solutions don’t need a text to start from. They are able to crunch huge quantities of data and produce a meaningful summary, giving many people access to information they would otherwise be unable to reach. There are already plenty of examples, such as Wikipedia (where a piece of software is the “author” of 1 article out of 10). Another example is weather forecasting: the UK Met Office is one of many weather services worldwide whose text forecasts are written by NLG software, quickly (5-day forecasts for 10,000 locations in less than 2 minutes) and with a high level of customer satisfaction (a paper on this case study is available as a PDF). NLG is also being applied in the medical sector, with the automated production of texts from analyses, and in the financial sector (further information is available from the Association for Computational Linguistics).

Of course, there are also several risks. Despite plenty of reassurance, the automated production of news articles and web content is quite likely to shrink the space given to human beings. Where will the human touch be if – within 15 years – 90% of the news is written by computer software? How can the voices of men and women be heard if they simply have nobody of flesh and blood to talk to?

And what are the implications for Communication for Development?

  1. Michael O'Regan says:

    Hey Adriano,
    From a political process point of view, I can’t see NLG making much of an impact, at least not in the immediate term. After all, the political centre has all sorts of structures and processes in place to stymie these sorts of unorthodox interventions. In the case you cite, Renzi’s government has already resourcefully revived a long-dormant statute, to bypass addressing Calderoli’s digital filibuster and ensure safe passage of Senate reform.

    Calderoli’s a pretty unsavoury character, so I won’t shed any tears over his failure, but I can envisage a scenario in which an even more developed NLG becomes more of a hindrance than a help for progressive activists. Once automated content becomes effectively indistinguishable from its human-created equivalent, what’s to stop public representatives from dismissing large-scale correspondence as an orchestrated, subversive campaign that doesn’t reproduce the feelings of the majority of their constituents?

    From a journalism point of view, all this seems like something of a double-edged sword. It’s well known that organisations like AP have been on a sticky financial wicket, and if using Automated Insights’ magical algorithms to collate market data and churn out by-the-numbers reports of baseball games allows its remaining staff to crack on with more substantive reporting, then more power to them.

    But like you write, this is more than likely the thin end of a very large wedge, and I sympathise with Podolny’s unease with this apparently inexorable discourse of technological evolutionism, which may ultimately deprive us of the “insights a curious and fertile mind could impart when considering the same information”.

    Anderson’s chapter in ‘The Social Media Reader’—cracking read, by the way; highly recommended—includes a fascinating discussion on how the different paradigms of journalistic practice are closely connected to conceptualisations of both civic participation and democracy as a whole. He contends that the algorithmic approach of companies like Demand Media is “not reducible to [the established] conversational, aggregative, or agonistic forms of democratic life”, which implies that we’re sailing in uncharted waters.

    The task of figuring out the implications for Communication for Development is, of course, the guarded province of ComDev theoreticians, so I’m not sure a dilettante like myself can offer much insight. That said, an artificial press would seem sharply antithetical to the precepts of a discipline founded on the desirability of humanistic and dialogical cultural production. Were Paulo Freire alive today, I doubt he’d find much revolutionary potential in the ’20 celebrities who look a bit different than they did last Tuesday’ type articles that are the calling cards of this brave new world.

    Anderson, C.W. (2011). From Indymedia to Demand Media: Journalism’s Visions of its Audience and the Horizons of Democracy. In M. Mandiberg (Ed.), The Social Media Reader (pp. 77-96). New York: NYU Press.

    • Adriano Pedrana says:

      Hi Michael,

      Thank you for your insightful comment.
      I agree with you on the possibly antithetical (and anti-ethical) value of Natural Language Generation (NLG) applied to Communication for Development in itself. Giving voice to a software program rather than to persons doesn’t really seem to improve how people are represented in the media. I can imagine a possible, ‘acceptable’ use of this tool in C4D when some type of information has to be broadcast in several languages – like weather forecasts, as mentioned above.

      As for its use as a political weapon, I am not really sure that political institutions all over the world are able to counter NLG. True, Calderoli’s attempt to block Italy’s Senate reform has been defused. But this was possible only by stretching a statute – something which is not immune to a court appeal. Many people – especially in Calderoli’s party – have seen this decision as an attack on democracy. And anyway, the first batch of amendments (some 500,000) could not be blocked.

      I feel that political institutions are more flexible in some countries than in others. Take the US shutdown, for example: something like this would not be possible in those nations where interim budgets are put into place, either automatically or through a simple act of law. Therefore, the apparent immunity of Italian institutions to digital filibustering doesn’t mean that other governments are automatically vaccinated.

