
Talking to the Machine: a Longitudinal Analysis of Text-to-image Prompt Trajectories

Understanding human-AI communication through syntactic analysis of user prompts with MidJourney AI.


About the Project

This project examines how humans adapt their communication with AI through the lenses of Communication Accommodation Theory and Register Alignment Theory (both originally developed for human-human communication). It analyzes user prompts fed into MidJourney AI, a text-to-image (TTI) model, during a 24-hour hackathon conducted at the iSchool.

Date:

May 2024

Client:

Syracuse University

Services:

UI/UX

Project Details

Research Summary

Generative AI has become an increasingly popular topic in the human-computer interaction (HCI) field over the last decade or so. However, it remains unclear whether human-AI communication can be distinguished from human-human or human-animal communication. This study aims to fill that gap by conducting a longitudinal analysis of how humans adapt their communication with a text-to-image (TTI) AI over an extended period of time.

"As users prompt a TTI, how does the part-of-speech (POS) structure of their prompts change over time?"

Methodology

  • 3 teams created a social justice-themed comic book within 24 hours by prompting MidJourney AI (MJ).
  • Their prompts were recorded and then cleaned in Excel; cells that were empty or contained corrupted text were omitted.
  • A Python program was developed to read the cleaned data, apply POS tags to the prompts, and count the instances of each POS. Below, you can see the code for the developed program.
  • The POS counts were exported and visualized using Tableau.
# Reads a cleaned team prompt log (e.g. "iDare_Team06.csv"), tokenizes each prompt,
# applies POS tagging to assign words their grammatical function, counts the
# instances of each POS per prompt, and exports the counts for visualization.
# Created by Oanh Nguyen as part of the 2024 REU Program at Syracuse University.

import nltk  # natural language toolkit
import pandas as pd  # dataframes
from nltk.tag import pos_tag  # part-of-speech tagger
from nltk.tokenize import word_tokenize  # word tokenizer
from collections import Counter  # tallies POS tags

nltk.download('punkt')  # downloads the punkt tokenizer models from nltk
nltk.download('averaged_perceptron_tagger')  # downloads the POS tagger model from nltk

# reads the csv file, keeping only the cleaned prompt text, time, and date columns
df = pd.read_csv("iDare_Team06.csv", usecols=['Content_Cleaned', 'Time', 'Date'])

df.drop([3, 6, 7], inplace=True)  # drops rows 3, 6, and 7 (empty or corrupted prompts)

df['Content_Cleaned'] = df['Content_Cleaned'].astype(str)  # ensures prompts are strings
df['Content_Cleaned'] = df['Content_Cleaned'].apply(str.lower)  # lowercases all prompts

tok_and_tag = lambda x: pos_tag(word_tokenize(x))  # tokenizes a prompt and POS-tags it
df['tagged_sent'] = df['Content_Cleaned'].apply(tok_and_tag)

# counts the instances of each POS tag in a given row
df['pos_counts'] = df['tagged_sent'].apply(lambda x: Counter(tag for word, tag in x))
pos_df = pd.DataFrame(df['pos_counts'].tolist()).fillna(0).astype(int)  # one column per POS tag
pos_df.index = df.index  # keeps pos_df aligned with df

pos_df.insert(0, 'Time', df['Time'])  # adds time as the leftmost column of pos_df
pos_df.insert(1, 'Date', df['Date'])  # adds date as the second leftmost column of pos_df

pos_df.to_csv('POS_Team6.csv', index=True)  # exports the result as a csv -- change the name per team!
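The analysis also tracked the percent change in POS counts between prompts; that helper isn't shown in the listing above, but a minimal sketch might look like the following (the name `percent_change` is my own, not taken from the project code):

```python
def percent_change(start, end):
    """Percent change from a starting POS count to an ending one."""
    return (end - start) / start * 100

# e.g. a team's noun count rising from 10 to 15 per prompt is a +50% change
print(percent_change(10, 15))  # -> 50.0
```

Expressing the counts as percent changes makes teams with different prompt lengths comparable on the same Tableau chart.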

Analysis

Our results show that humans do adapt their communication over extended interaction with text-to-image models. To achieve a specific result from MidJourney, users may increase or decrease the count of each POS in their prompts, creating "spikes" and "drops," or hold the counts steady, creating "plateaus."
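As a rough illustration of how such spikes, drops, and plateaus could be flagged from the exported POS counts (the counts and the threshold here are invented for the example, not taken from the study data):

```python
import pandas as pd

# hypothetical noun counts across five consecutive prompts
noun_counts = pd.Series([5, 5, 9, 4, 4])

def label_change(delta, threshold=2):
    """Labels a prompt-to-prompt change as a spike, drop, or plateau."""
    if delta > threshold:
        return "spike"
    if delta < -threshold:
        return "drop"
    return "plateau"

# diff() gives the change from the previous prompt; the first value is NaN
labels = noun_counts.diff().dropna().apply(label_change)
print(list(labels))  # -> ['plateau', 'spike', 'drop', 'plateau']
```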

When we investigated these changes further, we discovered that they correlate with the images MidJourney generated. For example, Team 1 underwent a stylistic change between Prompt #223 and Prompt #478: they began with a semi-realistic style and ended in a more typical "comic" style. This shift correlates with the drastic changes in parts of speech visualized in Team 1's graph.

Generated images from Team 1 for Prompts #223 to #478.

Conclusion

Future Goals

While this project has shown that humans do indeed adapt their communication with AI, we still don't know why they accommodate it. As I work with Dr. Banks, I will continue exploring this project by reviewing the existing literature on human-AI communication in hopes of developing a concise thesis. Ideally, I would interview the team members who participated in this hackathon to better understand the reasoning behind their prompt choices, but that will have to be explored at another time.

Impact

Recently, this research project won third place in the Undergraduate Student Cohort of the 2024 National Student Data Corps Data Science Symposium. I'm fortunate to have been given this opportunity, and I hope to continue furthering the field of human-computer interaction with additional research.
