Art Meets AI Alignment Hurdle: Can We Afford Bananas on the (AI)Wall?

Art, film and other emerging forms of storytelling have always been critical to shaping identity, promoting awareness and social innovation, and fostering critical thinking and civic dialogue. They have also been used to influence public opinion and attitudes. While that isn’t new, the convergence of Generative AI’s deployment with a growing digital economy and rising geopolitical tensions is. It is at that intersection that we find an AI Alignment issue that doesn’t just affect artists and art producers, but all of us.

Valentine Goddard
Dec 13, 2024 · 12 min read

For this article, I want to focus on RunwayML’s new tool, Act-One, which lets users transfer the facial performance and emotion from a video onto a still image. In Part 1, I will share examples to illustrate how I used it, what I got excited about and what needs improvement (technical, aesthetic). In Part 2, I will explain what I mean when I say we might be feeding a dangerous beast, and leave you with some fun questions to munch on.

*Although I refer to RunwayML as an example, keep in mind that there is a growing number of software companies providing video generation, such as OpenAI’s Sora. Part of the reason I focus on one platform is very simple: each one requires a monthly fee. For access to high-resolution videos, Sora’s monthly fee is US$200.

Part 1: Artistic Explorations

1. Mirror, mirror, do you see me now?

Buddha, Watercolour, Valentine Goddard, 2009

I chose this small watercolour as my first test, uploaded it into Act-One, then recorded a 30-second video of myself talking as expressively as I could. Much like a style transfer (make your pic into a Van Gogh), it applies your facial expressions to the chosen image, a sort of “emotion transfer”. I could have chosen a video of someone else, say a well-known actor or politician, and applied it to a photograph of a third person. Actually, I made one of my husband saying I was always right, to which he promptly replied: fake news! :-)

For now, click the video below to see my abstract-ish Buddha come to life.

I’ve always felt this painting had something to say, its eye and mouth frozen in their tracks, left incapable of pronouncing its last words. As the artist, I can now explore what those words could be. I can animate the watercolour in Fresco and record other 30-second segments to make a short poetic animation. I have to say, after this first test was completed, I was quite excited to keep going.
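For readers who would rather script this workflow than use the web interface, here is a minimal sketch, in Python, of what an “image plus driving video in, animated clip out” pipeline looks like. RunwayML does offer a developer API, but the endpoint, field names and polling details shown here are illustrative assumptions, not its documented interface; the sketch only captures the shape of the process I just described.

```python
import time
import requests

# Hypothetical endpoint and field names -- for illustration only.
# RunwayML has a developer API, but the Act-One specifics below are assumptions.
API_URL = "https://api.example-video-gen.com/v1/character_performance"
API_KEY = "YOUR_API_KEY"

def emotion_transfer(image_path: str, driving_video_path: str) -> str:
    """Send a still image plus a 'driving' facial video; return a URL to the generated clip."""
    with open(image_path, "rb") as img, open(driving_video_path, "rb") as vid:
        response = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"reference_image": img, "driving_video": vid},
            data={"duration_seconds": 30},  # current output cap, per the limitations listed below
        )
    response.raise_for_status()
    task = response.json()

    # Generation is asynchronous: poll until the clip is ready.
    while task.get("status") not in ("SUCCEEDED", "FAILED"):
        time.sleep(5)
        task = requests.get(
            f"{API_URL}/{task['id']}",
            headers={"Authorization": f"Bearer {API_KEY}"},
        ).json()

    if task["status"] == "FAILED":
        raise RuntimeError("Generation failed")
    return task["output_url"]

# Example: animate the Buddha watercolour with my own 30-second recording.
# clip_url = emotion_transfer("buddha_watercolour.png", "me_talking_30s.mp4")
```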

2. Self-perception, identity and emotion transfer

Self portrait, acrylic and collage, 2008 (beyond the edges of the original, the rest is extended using Photoleap’s AI extender)

I’m theoretically open to the idea that human creativity and machine interpretation of an artist’s intentions can result in a sort of curating process, one that can lead to unpredictable but welcome outcomes. That being said, a self-portrait is in a category of its own, because I do want it to reflect how I see myself. In January 2024, I tried making animated AI variations of this particular self-portrait using RunwayML. All the interesting details of the actual painting vanished and all that was left were bland, homogenized twirls of colour. Overall, across the 12 pieces for the Algorithmic Frontiers online exhibit, the generated outputs standardized my originals, erasing the distinct traits I intended to express (not to mention the biased and sexualized outputs). For example, textures as simple as smile wrinkles were washed out. With Act-One, this is no longer the case.

In the video below, I can be the voice, the actor of my own painting. There are some changes to the painting, but overall they are not as distracting from the intention. Act-One’s “emotion transfer” allows me to be both the painter and the director. The generated video is the result of my visual expression, my voice (biometric data), my emotions, and my words (opinion, culture). Artistic intent is respected.

Click to see Test 2.

Reacting to this video on Instagram, Erin Gee, Canadian performance artist and assistant professor in the Music Faculty at Université de Montréal, said: “omg your voice is so emotive while I find the filter isn’t that much different in the way it treats your face than someone with a lot of botox…”

3. Selkie-Selfie

Here’s one more example. “Selkie” is a digital assemblage I created for the online exhibit Algorithmic Frontiers. It is made with layers of real graphite drawings and digital AI variations, which are then digitally overpainted (the full process is explained in the exhibit). Because I had created animations of her using Fresco, I have a number of different versions of her that I could use in Act-One. I can then paste them together in Final Cut Pro, VideoLeap or another video editing platform, and make a short animation to guide visitors through my virtual exhibit. Fun project ahead!

One of the Selkie versions in Algorithmic Frontiers, digital layer, Valentine Goddard, 2024

Some of the Current Technical Limitations

We’re in December 2024. Things change pretty fast, so it’s worth mentioning the month. Here are some of the current technical limitations.

  • The generated output videos are limited to 30 seconds.
  • Generally, only a limited part of the face will animate. An input image with realistic eyes (like Test 2, the Self-Portrait above) and other clearly defined facial features seems to help Act-One identify “keyframes”, which improves the overall animation.
  • The generated videos often add teeth where there aren’t any, filling the mouth with them when the figure talks. That can look rather cartoonish on more abstract figurative portraits.
  • You must collage the generated video onto the larger image (see the compositing sketch after this list). As shown in the screenshot below, I removed the background using Videoleap, then video-collaged it over the original image. In the animation process (when the video is applied to the prompted input image), the head, or at least the face, moves to varying degrees, which can make the exact pasting tricky or impossible; still, some of my tests were quite seamless.
  • In the same screenshot example below, you’ll also notice that the thick oil paint texture is faded out in the “animation” process, thus not achieving the desired outcome. Results vary, and there is no option to add a written prompt to correct this and better shape outcomes.
Screenshot of a not so good animation result as well as the video editing process
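For the compositing step described in the last two bullets, here is a rough sketch of how the same overlay could be done in Python with the moviepy library (1.x API) instead of Videoleap. The file names, the coordinates and the assumption that the face clip already has a transparent background are placeholders for illustration, not my actual settings.

```python
from moviepy.editor import ImageClip, VideoFileClip, CompositeVideoClip

# Placeholder file names -- substitute your own assets.
# The face clip is assumed to already have its background removed
# (e.g. exported with an alpha channel after background removal).
face_clip = VideoFileClip("act_one_face.mov", has_mask=True)

# The original artwork, held as a static background for the clip's duration.
painting = ImageClip("original_painting.png").set_duration(face_clip.duration)

# Position the animated face roughly where it sits in the painting.
# Because the head drifts during generation, these coordinates may need
# nudging, and some clips may simply never line up perfectly.
composite = CompositeVideoClip([painting, face_clip.set_position((420, 180))])

composite.write_videofile("animated_portrait.mp4", fps=24)
```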

Transition from Part 1 to Part 2

Bananas & responsible art making in the age of AI.

Have you heard about the recent Sotheby’s art auction where a banana duct-taped to a wall went for US$6.2 million? The buyer, a cryptomillionaire who reportedly seeks to make America Great Again, ate the banana in front of journalists. It captures rather well the arrogant disregard of Big Tech millionaires for the social impacts of AI.

Short story:

Conceptual art-banana sells for US$ 6,200,000.

More than 700 million people are hungry (Dec 2024).

One day’s worth of food for approximately 12.4 million people (at roughly US$0.50 per person per day) goes down the cryptomillionaire’s gut.

The artist, Maurizio Cattelan, has said that the work “was not a joke”, telling The Art Newspaper: “It was a sincere commentary and a reflection on what we value.” No kidding!

This inspired the last example for this blog as I transition to Part 2.

I made the banana drawing with a better understanding of the type of images the tool seems to work best on. I drew clear, basic facial features, distinguishable from each other. I added a thick acrylic layer on the eyelids, and I avoided anything too complex around the face to make the layering-collage process easier afterwards.

Les reels de la banane, Valentine Goddard

If you liked that video (in French), you’ll be glad to hear I had so much fun with it that I opened a new Instagram page called Les Reels de la banane. *If you didn’t like it, don’t worry, I start my comedy classes at the École nationale de l’humour in February. I should get better :-P.

Part 2: Artistic Freedom Meets Dual Use of AI Technologies

Despite my excitement, I feel compelled to understand what Generative AI platforms do with my images, my voice, my facial expressions.

Videos of people expressing emotions are valuable biometric data, especially in an era of increased surveillance and rising authoritarianism that affects political freedoms and democratic norms.

Context

In September 2024, RunwayML announced a groundbreaking partnership with Lionsgate to develop a custom generative AI model tailored to Lionsgate’s proprietary catalogue of over 20,000 titles. The goal is to integrate AI into various stages of filmmaking, including:

  • Storyboarding: automating the creation of visual layouts for scenes;
  • Backgrounds and visual effects: generating cinematic visuals, such as explosions or landscapes;
  • Pre-production and post-production: enhancing workflows for filmmakers and creative teams.

Lionsgate is a Canadian-American film production and distribution studio headquartered in California, renowned for producing and distributing a diverse array of films such as The Hunger Games. It focuses on both foreign and independent films, contributing to its status as one of North America’s most successful mini-major film studios (source: Wikipedia). RunwayML is a for-profit company that could be purchased by anyone, anytime, for the right price, or simply come under the control of its main investors, which include Google and NVIDIA.

Apart from facing U.S. antitrust scrutiny over monopoly laws, Google has sought contracts with the U.S. Department of Defense, is involved with the Israeli government, and has worked with militaries indirectly through its venture capital arm, Gradient Ventures, which in turn supports startups providing AI technology to military and law enforcement agencies.

I’ve become rather curious about how data flows from artistic exploration into the training of algorithmic systems, ever since major tech conferences (NeurIPS, ACM, etc.) took a sudden interest in art and a growing number of publicly funded AI research institutes began holding AI art residencies. I discussed the militarization of artists in a previous blog, so I’ll simply state here that RunwayML isn’t the only company building video-generation software, and somehow I don’t think it’s for the sake of democratizing art.

Other video-generating platforms:

Google search screenshot of other video generating platforms, December 2024

Is Open-Sourcing of Generative AI Models better?

VP and Chief AI Scientist at Meta, Yann LeCun posted on LinkedIn that “every institution, library, foundation, cultural group, and government around the world that possesses cultural content should make it available for training free and open AI foundation models. Free and open AI systems will constitute the repository of all human knowledge and culture. Perhaps someone could draft a new content license to that effect: you can use our content to train your AI system, but only if you make it freely available with open weights and open source inference code.” Meanwhile, “Top Chinese research institutions linked to the People’s Liberation Army have used Meta’s publicly available Llama model to develop an AI tool for potential military applications, according to three academic papers and analysts.” Pomfret, Pang, November 1st, 2024 (Reuters)

I don’t want to imagine what an authoritarian regime would do with our cultural data and Meta’s open-source “human knowledge and culture repository”.

The use of AI in the military, which includes self-defence and climate disaster responses, is inevitable and obviously sometimes desirable. However, for democracy’s sake, the conversation around AI Art shouldn’t be limited to whether GenAI is creating good or bad art; it should amplify artists’ moral rights and, at the very least, address the importance of their control over the use of their biometric data (fingerprint, face, voice, heartbeat), which is being injected into data pools that could then be diverted to deadly purposes.

Going back to the art-banana analogy, what should I do as an artist who cares about world hunger, peace and democracy? Can I adopt this new tool and be certain that my data (face, voice, acting skills, visual art, caricatures, etc) will be used only for training algorithms used in Lionsgate’s film productions?

Emotion Transfer, Deepfakes & Social Media as Warfare

Generative AI tools like RunwayML’s Act-One present risks when misused for propaganda, disinformation, and polarizing campaigns. The data collected by RunwayML, or by other video-generation platforms, can be used to produce deepfake videos that spread false information or impersonate individuals, undermining trust in media and democratic institutions, and to produce emotionally charged, persuasive yet misleading content, fuelling political polarization and eroding democratic discourse.

A 2023 MIT Technology Review report highlights that these tools are increasingly used by political actors to amplify disinformation, with at least 47 governments employing such technologies to influence public opinion.

The Journal of Democracy notes that AI-generated propaganda is as credible as human-created content, posing significant threats to democratic processes by enabling large-scale dissemination of tailored disinformation.

Research published in the journal Social Sciences examines the ‘polarization loop,’ where emotionally charged, AI-generated content intensifies societal divisions, leading to increased political polarization and distrust in institutions.

The debate over whether generated content is art or not suddenly feels rather mundane considering that approximately 5.22 billion people worldwide are social media users, and that all of them, you, us, are potential targets of political propaganda and increased polarization.

The U.S. Department of Defense (DoD) is increasingly exploring information operations and influence campaigns. A 2023 Joint Special Operations Command document revealed interest in technologies capable of generating convincing AI-created personas for use on social media platforms to gather information from public online forums. According to The Intercept, United States Special Operations Command is looking for companies to help it create deepfake internet users that would be undetectable by humans and computers. They certainly aren’t alone: national security reports show that Canada’s state adversaries are using cyber operations to disrupt and divide.

UNESCO’s recent policy recommendations call for the watermarking of synthetic content and for content moderation. The use of deepfakes in social influence campaigns raises significant alignment issues with those recommendations. Watermarking deserves an entire blog of its own, given its technical and regulatory challenges, but suffice it to say that social media users are now like citizens on a battlefield. Creative and cultural data are the ammunition.

Is it just me or does Shepard Fairey’s piece “Make Art Not War”, created during the Iraq war, suddenly have a new ring to it?

Artist Shepard Fairey

Art Meets AI Alignment Hurdle: Can We Afford To Eat Bananas on the (AI)Wall?

Whether AI “art” is art or not, whether artists’ superhuman creativity can make AI work for them or not, isn’t as important as asking ourselves whether we are eating bananas on a wall.

What I mean by this is that we have an AI Alignment problem that artists and arts organizations must resolve, and in order to do so we must answer questions such as: when we give our data (art, face, voice, etc.) to make a film or an animation, who are we giving it to, and what are we giving it away for? When we choose whether or not to share our data, does that choice reflect the values we believe in? Will the future of visual storytelling feed a two-headed beast?

Helen Frankenthaler, a major contributor to the history of postwar American painting, said: “There are no rules. That is how art is born.” In AI-generated art, film and other emerging forms of storytelling, does this affirmation still stand?

I’m stopping here with that question, but I’ll quickly add that Art-Laws is a growing movement of artists with eyes wide open, on a quest for responsible, equitable and sustainable uses of Generative AI in the arts. We will be making our policy recommendations in spring 2025. FYI, you can join us on January 23rd, 2025 for a symposium on AI, art and climate, and then again at the end of March for AI Impact Alliance’s annual conference, where these artists’ projects will be shared. You can also join that movement and become a member. Bye for now.

Some sources for paragraph one.

  • https://nanovic.nd.edu/features/fighting-for-democracy-and-human-rights-through-the-arts/
  • https://www.europeanproceedings.com/article/10.15405/epsbs.2022.06.10
  • https://research-information.bris.ac.uk/ws/portalfiles/portal/187301935/art_as_a_pathway_to_impact.pdf
  • https://www.animatingdemocracy.org/sites/default/files/documents/reading_room/Animating_Democracy_Study.pdf
  • https://pmc.ncbi.nlm.nih.gov/articles/PMC7288198/

Written by Valentine Goddard

Advisory Council of Canada/United Nations expert on AI & Data Policy & Governance; Lawyer/Mediator/Curator; Socioeconomic, legal, political implications of AI.
