Pat Carroll

AI Collaboration to Build an Album Art Generator

2025-11-06

I used AI in some interesting ways over the past couple of days to do two main tasks: edit writing I had done, and build a website for making generative graphics for album art. The former is probably not so interesting anymore (which is crazy to think, given how novel these workflows are), but the latter definitely was. Both of these collaborations have been interesting in exploring the spectrum between non-use and overuse of AI.

Non-Use and Overuse of AI

Non-use, to me, feels a little short-sighted in some settings, as it denies the possibility of augmenting my skill set to do new things. There are absolutely times for non-use, but I definitely want to avoid total Luddism. Overuse, on the other hand, is basically getting the AI to make the entire thing for you: the article, the image, the song, the code script. AI can be overused in these ways, and then you can pass the products off as your own. But even though the AI created the thing, isn’t it still ‘your own’? Or do you have to create the thing entirely yourself in order to say it is, in fact, ‘your own’? These sorts of questions relate to areas outside of AI, such as the use of audio samples from platforms like Splice, or stock images on Unsplash.

I should say that this is not an exercise in proving overuse to be outright Bad — like most things, there’s nuance to be emphasised. Instead, I’m simply exploring my experience of different uses of the technology — what does it feel like to not use it at all? Is it still rewarding to use AI to entirely create the thing? How can there be a balance of getting the reward of creating something while still leveraging the capabilities of the AI?

Collaborating with AI
- Writing
What I was exploring in my two uses of AI yesterday was using it in an assistive way. For the written work, I wrote the draft, and the AI went through it, pointing out possible edits, fixing errors, and identifying wrong information (in my case, I got the author of a book wrong). In this instance, the AI was acting in the same way that a human editor acts. I went through a very similar process when I wrote my PhD: passing the draft I had written to the editor, waiting a month, and then receiving a document full of suggestions and fixes. The downsides of this process were that it took a long time, and that the editor had made some errors in their suggestions (something to do with the reference style, from memory). The pros, however, were that I gave someone a job: the work employed them, and gave them money for their labour. And, while I listed the waiting as a con, I actually enjoyed that time off and away from the thesis to gather my thoughts about it, and to approach it again with fresh eyes and ideas. By using an AI to edit the document, I am effectively choosing not to give a human editor that job.

If I were to get the AI to write the entire article for me, I would not develop any of my writing or thinking skills. By using AI in a more assistive way, I engage my abilities through the act of writing the draft and editing it, constantly practising my writing and thinking skills.

It comes down to this core question: do I want the thing done, or do I want to do the thing?

In using AI, I am trading some work off to it, but, importantly, I’m able to manage how much of this outsourcing I am doing.
- Programming
The other way I was using AI was to build a small program for creating generative visual art pieces for album covers, using traditional generative art techniques. In generative art, the artist creates a set of rules and processes which then execute to produce the final art piece, rather than creating the finished piece directly. Each run yields a unique piece, generated within the constraints of the rules laid out in the system. These sorts of systems can be built using code, but I have no experience writing code, so I decided to talk ChatGPT through my ideas for the program, and see how it went. The very first program it created worked very well, generating images exactly like what I was after. The program had a few sliders to adjust parameters like Density and Stroke Weight, and allowed me to select which types of shapes it would use. An element of randomness was implemented, and pressing the ‘Regenerate’ button produced a new image each time, under the same core rules. This allows me to generate a cohesive set of images that share similar characteristics but are individually unique.
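To make the idea of rules plus randomness concrete, here is a minimal sketch in Python. It is not the program ChatGPT wrote for me. The function name `generate_cover`, the parameter names, and the choice of SVG as the output format are all assumptions made for illustration. The density and stroke weight arguments stand in for the sliders, and changing the seed plays the role of the ‘Regenerate’ button.

```python
import random


def generate_cover(seed, density=40, stroke_weight=2.0,
                   shapes=("circle", "line", "rect"), size=600):
    """Return one cover as an SVG string (hypothetical sketch, not the original program).

    The rules (shape types, density, stroke weight) stay fixed;
    only the seed changes the individual image.
    """
    rng = random.Random(seed)  # private generator, so the same seed reproduces the same image
    style = f'fill="none" stroke="black" stroke-width="{stroke_weight}"'
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">',
        f'<rect width="{size}" height="{size}" fill="white"/>',  # plain background
    ]
    for _ in range(density):
        kind = rng.choice(shapes)
        x, y = rng.uniform(0, size), rng.uniform(0, size)
        extent = rng.uniform(size * 0.02, size * 0.2)  # shape size, within fixed bounds
        if kind == "circle":
            parts.append(f'<circle cx="{x:.1f}" cy="{y:.1f}" r="{extent:.1f}" {style}/>')
        elif kind == "line":
            parts.append(f'<line x1="{x:.1f}" y1="{y:.1f}" '
                         f'x2="{x + extent:.1f}" y2="{y - extent:.1f}" {style}/>')
        elif kind == "rect":
            parts.append(f'<rect x="{x:.1f}" y="{y:.1f}" '
                         f'width="{extent:.1f}" height="{extent:.1f}" {style}/>')
        else:
            raise ValueError(f"unknown shape type: {kind}")
    parts.append("</svg>")
    return "\n".join(parts)
```

Calling `generate_cover(seed)` with a series of different seeds and the same settings, and saving each returned string to a `.svg` file, would give a set of covers that share the same rules but differ in detail.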

Two main issues arose from my lack of coding experience. Firstly, I could not easily edit or debug the generated program myself. When I prompted ChatGPT for fixes, its accuracy was inconsistent, sometimes leaving me unable to add or alter elements. Secondly, I struggled at times to find where to paste the new code, so I asked ChatGPT to tell me what the old code looked like so that I could find it and replace it. This collaborative process, however, became a learning experience. ChatGPT responded to me as if I were a beginner, rather than a completely clueless coder. This pushed me slightly beyond my capabilities, developing some of my understanding of how code works.

I obviously didn’t feel like I had created the program myself. Sure — the artworks it produced felt sort of like mine, but the program itself didn’t. If I had coded that program, I would feel far more rewarded every time it produced an artwork.

Reward in the Creative Process; Ownership

But is this much different to, say, a person who works in woodworking, doing most things by hand, who then acquires a particular machine that allows them to do so much more? It’s still creative work, but now the person is relying on a machine to do some of the work that they originally wouldn’t have been able to do themselves. Is there much of a difference here?

(Something I did observe was that it did drive me to really want to learn to code. I’ve been interested in other forms of programming using objects in platforms like MaxMSP and Bitwig’s The Grid, but I’ve never fully taken the plunge with learning to code. That could be a side project I undertake this summer.)

Again, it comes back to the core question: do I want to have the thing done, or do I want to do the thing?

Do I want to learn the techniques, put them to use, fail, succeed, learn and feel ownership over my creations? For sure. But is there also a bit of joy in having this program in front of me that has been made specifically for me, based off my ideas? Absolutely.

I don’t think it’s black or white. Having the AI simply produce the generative art images itself, and then calling them my own… that feels far more empty. In the same way, getting the AI to write the entire article, or getting it to produce an entire piece of music, seems like too much outsourcing to feel much reward in, and connection to, what has been created. There’s very little creative joy in those types of processes.

There is something that feels good about being able to do things ourselves. Sure, we can store information in a personal knowledge management program like Obsidian or Notion, building a large collection of notes about our interests. Or we can just say, ‘Hey, it’s on the internet; what’s the need to remember these things?’ But it feels good to know the things yourself: to hold the ideas in your head, and be able to merge them and explore the connections yourself. There’s a self-sufficiency that comes from that. It feels good to learn new things, and to gain new capabilities and skills. It feels good to be very good at something. Just as a software update makes a phone a more capable device, going through skill- or knowledge-development processes makes us more capable, and that feels deeply rewarding. Gaining new capabilities is one of the things we praise in our culture: development, growth, maturity, advancement. Think of Neo in The Matrix gaining the capabilities of Kung Fu fighting. Think of the montages of characters in sports films, training hard, struggling, falling, getting up again, training, training, training, and eventually getting very good at what they struggled with before. These sorts of stories permeate our culture because they align with a core element of modern experience: development and expanding capabilities.

AI Augmenting Capabilities

A major part of this is that I can use AI to help me do things I can’t do on my own, rather than getting it to do things that I can and want to do, such as writing out my ideas. It’s important to be aware that whatever I get ChatGPT to do, I won’t get practice in. If I get it to write out my ideas (for example, brainstorm something, or write out an entire article), then I won’t get practice in thinking and converting ideas to written words, which I see as an extremely valuable ability. If I get it to edit my writing, however, I will get practice in writing the ideas and some editing, but I won’t get practice in the proper fine-toothed-comb editing of writing. But the same would be true if I worked with a human editor. If I get it to write code for programs based on my ideas, I won’t get practice coding. However, I do feel like I learnt a bit about code yesterday by working alongside the AI, copying and pasting chunks of code and looking around the script. I didn’t learn anywhere near the amount I would have if I had written the script myself, but it would take me a very long time to be able to do that. This isn’t a bad thing: learning is supposed to take time. But this was a different experience to traditional approaches to learning: I could immediately create things of higher complexity, while learning how code works in the process.

But the counter to all of this hyper-optimism is that these positive outcomes will only occur if users are aware of AI’s potential to do the exact opposite: to limit our capabilities, expressive capacities and creativity, to cut us off from opportunities, and to raise new barriers. Over-reliance on the technology will stop us from doing the things that allow for these positive outcomes, and will stunt our growth in developing our own skills and capabilities. Over-reliance will reduce users’ knowledge and mental capabilities, causing all sorts of issues in navigating the world due to under-education.

Just like many past tools and technologies, AI is both a gift and a burden; it can both extend us, and hinder us. Which one of these it falls towards depends on the users’ modes of use.

Patience and Long Projects

2025-11-04

Last year, I read Four Thousand Weeks by Oliver Burkeman. It is a wonderful, realistic perspective on time and time management. One thing that resonated with me was his chapter on patience and the loss of the ability to carry out longer projects — to really stick with them for the time they need to come into being.

I have personally found it difficult at times to sustain a project, or to continue feeling engaged with what I am doing if it is a longer commitment. Part of this difficulty is that my primary motivation for being in the studio has been to transform a work in progress into a completed track, rather than solely focusing on the process of writing music. In other words, the purpose of being in the studio is to attain something in the future, as opposed to finding a fulfilling experience in the present moment.

This relates to notions of mindfulness, internal motivation, flow states, and delayed gratification, and I believe the ability to carry out fulfilling work lies in finding a balance between these factors. This would entail doing creative work that is still working towards a future goal, but is also engaging in the present moment, and ticks all the boxes for facilitating states of flow.

I remember a while back seeing a post somewhere on social media of yet another piece of software being hailed as the thing that will give producers the ability to pump out full tracks in minutes, or something along those lines. What I do remember clearly, though, is the top comment on it: “What’s the rush?” I found this perfectly summed up an issue with how music production is marketed, but also how creativity itself is marketed or valued. There is such a focus on the quantity of work, rather than the quality of work.

I believe that what is more important is for composers, producers, and anyone else involved in creative work to cultivate the patience required to allow projects to come to fruition. Sometimes pieces of music need to be sat with or shelved for weeks, months, or even years before their issues and potential ways to elevate them reveal themselves. If producers only aim to create many, many tracks as fast as possible, they lose sight of the importance of the ability to slowly chip away at a project.

I have felt this with my own creative work, but I have also seen it in the context of my education. Sometimes it has taken me weeks to fully grasp a concept I have been studying. I always feel that there is no way that the process could have been sped up: it often simply needs to take weeks or months. The idea that I could learn something and it will immediately click is unrealistic — it needs to circulate in my thoughts, my subconscious, and coalesce with my prior knowledge before reaching the point when it locks in and I understand it. It simply needs to take time and cannot be rushed. This is why I’ve always been against book summaries and ‘gists’; ideas require time and immersion to properly take root, something a book can provide, but a quick summary cannot.

This points to two perspectives on the goals of studio work: should the goal be to spend 3 hours in the studio, or to complete a project? Of course, it should be something in between these: probably to spend 3 hours on the project. The point is to consider whether there is too much of an emphasis on leaving the studio with a finished product, rather than on simply having spent the time on creative work.

These considerations affect the deadlines imposed on a piece. “Spend three hours on the project” is a goal that doesn’t yet set an arbitrary deadline. If the goal is to come out of the studio with a full track, this sets a deadline of 3 hours on the piece, which will inevitably limit its scope. I believe that deadlines are extremely useful, but should not be applied to a project until its possible end shape is clear to the composer. In this way, the composer knows what the piece could look like when finished, judges how long it might take to get it to that level, and then creates the deadline.

I have tried various creative challenges aimed at these two approaches: write for 3 hours every day for two weeks, and write a piece of music a day for two weeks. Personally, I enjoyed the former more, as the required goal was a fixed time commitment. I couldn’t make those three hours shorter or longer; it was just a three-hour block that I worked within. On the other hand, ‘a piece of music’ could be anything, and on a few days I found myself just getting into the studio, recording a little improvisation on the synth, and calling it a day. In the three-hour block, that improvisation would have gotten more layers added to it, been edited, and been further processed; more work was required due to the nature of the challenge. On some days the three hours flew by, and on others they felt like a slog. Even so, I found it easier to get interested in what I was doing, as I had no choice but to keep writing for the three-hour block, whereas on the track-oriented challenge, I could make the day’s session finish anytime I wanted by calling the track I was working on ‘done’.

I personally get much of my creative inspiration from authors who create giant series of books, especially fantasy and science fiction epics. The writer’s ability to sit down and chip away at a book is something I find admirable, as it demands that the author acknowledge that this thing they’re working on isn’t going to be finished until well into the future, which means that the gratification from writing “The End” isn’t coming anytime soon.

So where do they get the gratification from? The act of writing itself—the time spent on the creative task, in the creative process. This highlights internal motivation, a key theme in Daniel Pink’s book, Drive, where motivation is derived from doing the act itself, as opposed to an external reward such as a salary or, say, income from the sales of a book.

In the field of music, I feel that we see this less often today. There are definitely still plenty of artists who spend a lot of time on longer projects, slowly inching towards a finished record of intricate music. But there tends to be a focus on immediacy of output — of staying relevant on people’s feeds.

We see the ‘long game’ approach to creative works in classical music. The best example is Wagner’s Ring Cycle — a massive cycle of operas spanning 15 hours of music, which took the composer 26 years to write.

To me, this type of prolonged creative process requires patience, internal motivation, and regular creative work, while also keeping an eye on the broader picture: working on the trees (the daily work in the studio) without losing sight of the forest (the broader body of work you’re creating, and the reason why you’re creating the work). I often wonder whether relevance is worth sacrificing in order to create truly creative work. And I often lean towards yes.

Balancing Structure and Variation in the Creative Process

2025-11-04

Desired Output vs. Conventional Structure

When I’ve recently sat down to write music, I’ve been more drawn towards ambient, experimental, and spatial work. It’s difficult for me to pin down exactly the type of music I’d like to be writing, but I have found that I’ve recently been avoiding dance music and popular music approaches: trending styles with conventional popular structures. To me, this music is often about trying to match something that has come before. I’ve found myself interested in exploring approaches to music that are new and haven’t been explored before, and I often want to actively avoid conventions where possible.

I remember Daniel Avery mentioning how he is confused to hear people talk about wanting to create sounds that reflect natural sounds and sonic behaviours, as he wants to explore sounds that are impossible, new, and haven’t been heard before. I definitely resonate with this feeling.

But right now, my creative process feels like it’s in a bit of disarray. Every time I’ve recently sat down, I seem to want to approach composition in a new way. This does align with wanting to push music into new territory, but it also leaves me feeling fairly untethered. There’s huge value in having a process down pat that you can use to make new, wildly variable music, using the same set of tools. I believe it comes from having a process that is rigid enough to offer security, but flexible enough to facilitate a diverse and experimental output.

The Creative Tension

I think that’s part of the tension: I’m always trying to find the balance between established structure in the creative process and the abandonment of conventions in the music itself.

An established structure acts as a foundation. If I ever get lost, I can fall back on it and use it for guidance. I know which synths to use for certain applications, or which plugins to use for specific processing methods. I’d really like that: having, say, a single plugin for a distortion pedal, or a single plugin for a tape emulation. That way, whenever I want distortion or tape emulation, I know exactly where to go. At the same time, this bank of equipment (and creative techniques) must have the potential to create experimental music, absent of conventional structures.

Defining the Creative Process

What actually is the creative process? To me, it’s a collection of procedures, techniques, and equipment/technologies.

Procedures are linear processes I move through to fulfil certain tasks. Techniques are the creative, often signature ways of carrying out those procedures. And finally, equipment/technologies are the tools (DAWs, specific plugins, modulators) required to execute them.

The creative process is almost fractal in nature: micro-techniques nested within macro-procedures. Take, for example, creating a layer of interesting glitchy material. I go through an overarching procedure: setting up the material, routing it to an auxiliary channel, loading a complex processing chain, recording the output, chopping up the resulting recording, isolating the parts that resonate with me, and then further processing them through pitch shifting and time stretching. That whole sequence is a procedure, but the specific choices I make define my technique. Another composer would have an entirely different set of techniques — and use a set of different tools — to carry out their procedure of creating a layer of interesting glitchy material. A single piece might have a hundred of these procedures embedded within it, from long, extensive sequences of actions to concise procedures for things such as isolating transients in a sound.

Rigidity; Avoiding Dogmatism

It’s worth considering the results of rigidity in the creative process. If it’s very rigid, and the composer knows exactly what to do step-by-step — which rhythms to use, which synths to use — the composer will feel secure, but the piece will sound similar to their last one. But if the process is completely unstructured, the composer may feel lost, and each piece will sound wildly different.

The essential task is finding that balance: the process should have enough rigidity to allow the composer to feel somewhat secure and have options for what to do next, but not so much rigidity that the resulting works are predictable. Variation — and its ability to create unexpected outcomes — can be applied to technologies, techniques, and procedures alike.

To me, it’s similar to the move in certain DAWs towards ‘ranges’ in parameters, rather than fixed values. In Live, you can set a velocity range within which each note will play each time it is triggered, creating a subtle sense of randomness in the sound. In Bitwig, you can create a range that an automated parameter will sit within, rather than a fixed value. This ability to establish ranges rather than fixed values allows for some structure, but offers the possibility for variation each time the track is played. Applying this to the creative process, there is structure and security, but also the possibility for variation each time the creative process is undertaken.
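The difference between a fixed value and a range can be sketched in a few lines of Python. This is only an illustration of the idea, not how Live or Bitwig implement it. The names `ParamRange` and `trigger_notes` are hypothetical, and velocities are assumed to be on the usual MIDI scale of 0 to 127.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamRange:
    """A parameter given as a range rather than a single value (hypothetical helper)."""
    low: float
    high: float

    def sample(self, rng):
        # Pick a value somewhere between the two bounds each time it is asked for.
        return rng.uniform(self.low, self.high)


def trigger_notes(pitches, velocity, rng):
    """Return (pitch, velocity) pairs; velocity is a fixed number or a ParamRange."""
    notes = []
    for pitch in pitches:
        if isinstance(velocity, ParamRange):
            value = velocity.sample(rng)  # varies on every trigger, within the range
        else:
            value = velocity  # identical on every trigger
        notes.append((pitch, round(value)))
    return notes
```

With a fixed velocity of 100, every note in `trigger_notes([60, 64, 67], 100, random.Random())` plays at 100. With `ParamRange(90, 110)`, each note lands somewhere between 90 and 110, so the structure stays the same while the detail changes on every pass.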

My ideal process is one that results in works that do not align with conventional structures. If you make music based on techniques you’ve heard artists use, your music may align with their sound. This is completely fine early on in a career, as you develop your skill set, but at a certain point, departing from these conventions allows for a unique, personal voice to emerge. You have to allow yourself to develop and evolve the techniques you learn and mimic: to embed them in your own process, to adapt them, and to break them into their fundamental components and vary these. Techniques can be built upon others, merging to create entirely new ones, akin to how technologies evolve. I believe it’s worth being aware of the true nature of techniques, and how adapting them can open us up to entirely new ways of working and entirely new types of creative outcomes. I think we’re all aware of techniques, and can talk about them well, but it’s interesting to really unpack them and get to their essence: understanding how they’re similar to technologies; how they relate to craftsmanship; how they construct our own unique ways of doing things.

Reading x like a text

2025-10-31

Something I’ve been interested in for a while is the ways that we interpret our inputs; the way we make sense of what we’re observing and experiencing. This is very connected to semiotics and representation. Perhaps my interest in it is a result of being educated in a creative arts institution like the Sydney Conservatorium of Music, and tending to have a fairly ‘artistic’ lens. I don’t mean that as a brag — I’m merely pointing out that I have tended to see the world as, or in terms of, art and representation.

Representation to me is the core of the arts. It’s the way that certain pieces of art and creative media represent certain meanings that we apply to our own lives and experiences. Certain characters within texts resonate with us differently than they do with the next audience member, because we relate to them differently; we might have gone through similar experiences to what they’re going through, and we’re intuitively comparing what they’re doing to what we did or would do. For another audience member, that character’s experience might be completely foreign, and there’s far less of a way to relate to them. We see a tree in the world, but to some, a tree represents concepts like growth, branching, and lineage. A tree is really just a tree, but to the beholder, it can mean more. A painting of a tree is just a series of lines, shapes, textures and colours. But that painting represents a real-world tree, which in turn represents growth, branching, and lineage. This is the fundamental experience of art.

The world and everything in it can similarly be read like a text. So-called ordinary things come to represent something for us. This is the domain of semiotics, where images, sounds, and words stand in for and link to other things, either real-world objects or abstract concepts. For example, the word ‘dog’, consisting of the squiggles ‘d’, ‘o’, and ‘g’, represents to English-speaking people our furry little animal friends. But simultaneously, that furry friend can represent play, joy, vitality, curiosity, or companionship, depending on its behaviours. This is the same experience as above, of observing a piece of art. But the same processes exist outside of the world of art: we do them all day long, and they are at the core of human creativity.

We can apply this to broader phenomena. What does the gradual move towards isolating technologies such as noise-cancelling headphones say about us? That we don’t enjoy the company of strangers? What about the efforts of tech giants to build social media platforms that pit us against each other, causing massive engagement, increasing revenue from advertisements? That the powerful in our culture prioritise revenue over the wellbeing and cohesion of the social fabric?

The same applies to reading historical artistic movements. In the late 1940s, composer-engineer Pierre Schaeffer pioneered Musique Concrète, a new compositional method based entirely on recorded sounds (fragments of everyday life, musical instruments) manipulated through tape editing and playback. Around the same period, in Cologne, German composers such as Herbert Eimert and Karlheinz Stockhausen developed Elektronische Musik, built exclusively from electronically generated tones. What do these approaches mean? The two schools represent contrasting philosophies of sound: the empirical and phenomenological (building from the concrete) versus the rational and synthetic (building from the abstract). In 1956, Stockhausen’s Gesang der Jünglinge united the two approaches, combining a boy’s recorded voice with electronic tones in a single composition. What does this mean? The fusion can be read as a collapse of mid-century dogmatism, an early sign of the move toward pluralism and post-modern openness in the arts, where boundaries between methods and materials were no longer seen as absolute.

This process of drawing conclusions about events (of ‘reading’ history in particular ways, and using history as evidence for certain conclusions) is a difficult skill to develop. However, I believe it’s fundamentally creative work. Perhaps more precisely, it’s a combination of creativity and critical thinking. It entails making connections between disparate phenomena and justifying the conclusions critically. Reading historical events like a text and drawing creative conclusions is a very similar, if not the same, process as drawing interpretations of pieces of art. It all involves considering what things mean; what can be made of them; what they represent.

It’s the process for any interpretation of data. Researchers go through it when analysing interview responses or quantitative statistical data. It’s all ‘read’; it’s the researcher’s role to make meaning of these forms of data, in the same way that it’s the audience member’s task to make meaning of the art they observe. (Of course, I’m referring here to the more active approach to cultural consumption, rather than the passive ‘lean back’ consumption that is encouraged by streaming platforms.)

***

This process is one I teach my students. Consider these two images that represent disabled people in our society. These are universal symbols these days, and you find them in most public settings. In my Cultural Studies class, I get the students to compare the two images, looking at what they mean.

Figure 1 – the passive disability icon

Figure 2 – the active disability icon

Compared to the first image, the second is far more active; it pairs disabled people with a representation of agency. Once the meanings of the images have been established, I ask what it means that people in our society designed this second image. Perhaps it means that parts of our society value the wellbeing of disabled people, and would like to recognise their agency. Perhaps it means that parts of our society want to break the stigma of disabled people as entirely incapable. What does this mean? Perhaps it highlights that our society values inclusion and the wellbeing of members of society, representing a streak of altruism.

***

Getting students to really see the world around them and think about what it all means is, in my opinion, a crucial part of their education. While the above example relates to social justice, these sorts of exercises develop the students’ capacity more broadly for creative and critical thinking. It really brings them to the question, ‘What does it all mean?’ I think the more we can strive to answer that question, the less confusion there will be in society, and the more connection we will have with the world around us, and the people living within it.