Clarinet’s Puppet Method

Discord Contact: @whatnostop#6700 (clarinet) If this was fun/helpful, let me know on Discord, I could always use the dopamine
You can use Midjourney to stage the same characters like actors over and over again in different scenes. I call this the "puppet" method after a similar practice in digital animation. It’s useful for storytelling and storyboarding.
There is a non-spammy link if you are sharing this document: https://bit.ly/Clarinet-MJ-Puppets
ALL YOU NEED TO DO IS USE THIS TEMPLATE. (Don’t worry about the rest unless you love details.)

⚙️Template

http://www.host.com/ReferenceImage.jpg Setting is place:: Name is description:: Setting contains Name doing activity:: atmosphere, filler details, media type, artists --v N --aspect X:Y

FOR EXAMPLE…

https://as2.ftcdn.net/pFtKgQ.jpg Setting is an empty midnight forest clearing:: Amit is a middle-aged Indian man with short black hair:: Setting contains a frightened Amit wearing pajamas pointing a flashlight into the trees:: ominous rainstorm atmosphere, fairy forest details, photography by Anton Pieck, Frida Kahlo, Beatrix Potter, Gediminas Pranckevicius, Robert Rauschenberg --v 4 --ar 2:3
🎭TLDR: Use this template with its three keywords (Setting is, Name is, Setting contains). The reference image(s) plus the description of the person (together) act as your reusable Puppet (actor, player, character). You can change the Setting, Activity, Atmosphere, Media Type, etc., around it to stage-direct that character.
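If you assemble prompts in a script rather than by hand, the template can be sketched as a small helper. This is only an illustration: the function name build_puppet_prompt and its parameters are made up for this example, and the defaults (--v 4, --ar 2:3) simply mirror the example prompt above.

```python
def build_puppet_prompt(reference_url, setting, name, description, action,
                        styling, version=4, aspect="2:3"):
    """Join the template pieces: URL, three '::'-terminated clauses, styling, flags."""
    parts = [
        reference_url,
        f"Setting is {setting}::",          # anchor the stage
        f"{name} is {description}::",       # anchor the puppet
        f"Setting contains {action}::",     # puppet the puppet
        styling,
        f"--v {version} --ar {aspect}",
    ]
    return " ".join(parts)


prompt = build_puppet_prompt(
    reference_url="http://www.host.com/ReferenceImage.jpg",  # placeholder URL
    setting="an empty midnight forest clearing",
    name="Amit",
    description="a middle-aged Indian man with short black hair",
    action="a frightened Amit wearing pajamas pointing a flashlight into the trees",
    styling="ominous rainstorm atmosphere, fairy forest details, photography",
)
print(prompt)
```

Swapping only the setting, action, and styling arguments while keeping the URL, name, and description fixed is what makes the character reusable.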

If you are sensitive to being overwhelmed by too much information, stop reading now. The rest of this document is needed only if you need help with troubleshooting or if you want to go deeper.

👣 Step-by-Step through the prompt template…

🖼️ Method Zero: Midjourney Controlled Image-Based Puppets (v4 only)

In v4, you can /imagine with one URL of your character and one URL of your background and do nothing else. Midjourney will make a best guess about how to combine them, as shown here with a bathroom and a wizard. This works great if you’re willing to let Midjourney take the wheel.
https://lh6.googleusercontent.com/5W5GmUpqF7A_End6nlRIKX1siQFEW7WV6By6M26DLqB6jWgb5INE9jG7e1qUtJ2tsp5QAVjX_9QFJUdyVrM_Ig5tid0gxuI__emS8-tOWOyLhDP3pbgQrBfjU7ye9LnbXQbfN8d1zF68e7To
https://lh4.googleusercontent.com/E-vMw0BD27ySPF_R8SOy0IEED9BSaYLlvJOxk-miBxib0f2f7fDXlWmGz_yz2YsxytOhhvNnCpjQe4fEIoHN9eN4tMSglMLGHQCg_MvdSDsy4ztjhNsDDPpk3ITG8UTKseSqPe2j5EZIYJ2_
https://lh6.googleusercontent.com/c66o9xGQUHbeGdpARJNVaY55xKShInRz5lCVR-w8-sT14W431mB4Y49lDvdORhIFzUH6TNndbpILWuDKgvsiIUOTzmOEEgA8t2HdFZ1xB00uM2z1X1Y1hO6CjnnSmfUdKW6UxMAI-t9I9Lv-

🖼️Method #1: Image-Based Puppets (for v3 and v4)

If you want to control the action and scene, you’ll want to write a prompt to go with your URLs.
CHECKLIST:
  • reference images (urls ending in a file-format extension like png, jpg, gif)
  • scene, background, or setting (short phrase)
  • description of your character (short phrase)
  • description of the character doing something (short phrase)
  • style words and/or artist references
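The first checklist item can be checked mechanically. The sketch below is a rough pre-flight check, not a rule enforced by Midjourney as far as this document knows: the function name is hypothetical, and including .jpeg in the extension list is my own assumption on top of the png, jpg, and gif named in the checklist.

```python
from urllib.parse import urlparse

# Extensions from the checklist, plus .jpeg as an assumed variant of jpg.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif")


def looks_like_image_url(url):
    """Return True if the URL is http(s) and its path ends in an image extension."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    return parsed.path.lower().endswith(IMAGE_EXTENSIONS)


print(looks_like_image_url("https://as2.ftcdn.net/pFtKgQ.jpg"))
print(looks_like_image_url("https://s.mj.run/CUB6zznrkZo"))
```

Note that the wizard/bathroom example in this document uses s.mj.run short links with no extension, so treat a False result as a hint to double-check the link rather than as proof it will fail.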
  1. The reference image of your character…
The reference image of your character should roughly reflect what you want the character to look like when Midjourney has blended it into the final scene. In v4, if you provide a head-and-shoulders reference, be aware that Midjourney will struggle to invent a body for the character. And, if you provide an image where the character’s face is only a few pixels in size, be aware that Midjourney will struggle to invent a face for it. So you want to roughly match the reference image to its ultimate appearance in the final image, to support Midjourney’s efforts.
Use an uncomplicated or blank background in your character reference image to make it easier for Midjourney to focus on the character.
In both v3 and v4, you can use multiple URLs. In v4, I recommend using just one picture devoted to the scene and one devoted to the character, because otherwise the blending effects can get away from you pretty quickly. That said, experiment and find out what works for you.
Weights can help control the blending effect, but they are not available yet in v4. In v3, you can add a space plus ::N after the URL to weight it. The higher the value of N, the more influence your reference image will have. N can be between 0 and 10,000. We recommend you experiment to discover the weight that works. In v3, many people find 0.05 to 6 to be a workable range. In v4, some people have reported 0.01 to 2 to be enough.
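As a small sketch of the v3 weighting syntax described above (a space plus ::N after the URL, with N between 0 and 10,000), here is a helper that formats a weighted reference and rejects out-of-range values. The function name is invented for this example, and the range check only restates the limits given in this document.

```python
def weight_reference(url, weight):
    """Format 'URL ::N' as described for v3; N must lie between 0 and 10,000."""
    if not 0 <= weight <= 10_000:
        raise ValueError("weight must be between 0 and 10,000")
    return f"{url} ::{weight}"


# Try a few weights from the range people report as useful in v3.
for n in (0.05, 1, 6):
    print(weight_reference("http://www.host.com/ReferenceImage.jpg", n))
```

Printing a handful of weights like this makes it easy to paste variations into Discord and compare the rolls.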
✅Good example of a reference image (load this URL to see the example): https://as2.ftcdn.net/v2/jpg/01/88/09/29/1000_F_188092957_gmFoL2BPkfMfG0h9FzmcgscHRLpFtKgQ.jpg
🚫Bad example of a reference image (load this URL to see the example): https://as1.ftcdn.net/v2/jpg/04/71/68/56/1000_F_471685627_FfgZX96Z5pa9E3xaOj0AZaCcPDpfnreb.jpg
Remember, Midjourney will interpret the reference images of your scene and your character freely unless the prompt controls them.
  2. The text description of your scene or setting…
If you use a reference image of a scene or setting and want Midjourney to respect it, then the prompt should match/reiterate it. If the prompt and the reference images don’t match/reiterate each other, you will see this reflected in your output as chaotic blending. You can use this effect creatively to your advantage, of course. Here, I provided the same picture of the bathroom but prompted for a jungle.
The text description of your scene or setting must start with Setting is (case-sensitive) and should be short. The setting is just the background or stage for the action. To avoid chaotic blending, the description should match/reiterate the reference image if you’re using one. This is an anchoring step, like setting up a background layer. Do not add players or objects to the scene yet; you can do that later.
✅Good:
Setting is an empty midnight forest clearing::
🚫Bad:
Setting is an empty forest clearing at midnight where Oberon stands on a tree stump with Puck and a host of fairy soldiers::
The image reference will likely win over your prompt unless you use weights to increase the influence of the prompt.
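The rules for the setting clause (start with Setting is, case-sensitive, and keep it short) lend themselves to a quick self-check. The sketch below is only a heuristic: the function name and the 10-word cutoff are my own assumptions, since the guide says "short" without giving a number.

```python
MAX_WORDS = 10  # hypothetical cutoff for "short"; tune to taste


def check_setting_clause(clause):
    """Return a list of problems found in a 'Setting is ...::' clause."""
    problems = []
    if not clause.startswith("Setting is "):
        problems.append("must start with 'Setting is' (case-sensitive)")
    if not clause.endswith("::"):
        problems.append("should end with '::'")
    if len(clause.split()) > MAX_WORDS:
        problems.append(f"longer than {MAX_WORDS} words; keep the setting short")
    return problems


good = "Setting is an empty midnight forest clearing::"
bad = ("Setting is an empty forest clearing at midnight where Oberon stands "
       "on a tree stump with Puck and a host of fairy soldiers::")
print(check_setting_clause(good))
print(check_setting_clause(bad))
```

A word count cannot tell whether you have smuggled characters into the setting, so it only catches the most obvious overloading.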
Here’s where we are in the v4 wizard/bathroom example: <https://s.mj.run/CUB6zznrkZo> <https://s.mj.run/NbRZHucYw8k> Setting is a modern bathroom:: Hinkleberry is a wizard in a yellow hat:: Setting contains Hinkleberry carrying a laundry basket:: morning atmosphere, photographed by Anton Pieck, Frida Kahlo, Beatrix Potter, Gediminas Pranckevicius, Robert Rauschenberg --v 4 --q 2
  3. The text description of your character…
If you use a reference image of your character, the text description of your character in the prompt should match/reiterate the reference. If the text and reference image disagree, you’ll see chaotic blending. The image reference will likely win over your prompt unless you use weights to increase the influence of the prompt.
The text description should always start with Name is (case-sensitive) and consist of the character’s signature characteristics. You can dress up and puppet the character in a moment. Don’t do it here. Keep it short and focused. This is an anchoring step. It is invoking the puppet and helping Midjourney use your reference image(s) in a helpful way.
✅Good:
Amit is a middle-aged Indian man with short black hair::
🚫Bad:
Amit is a man with blue eyes of Indian descent who has normal short hair and some tattoos and wears business casual like a crisp work shirt and pants and leather shoes and a belt and he’s probably a father of three kids and he’s holding a broadsword::
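One way to picture the puppet is as a small record holding the reference image and the short signature description, from which the Name is clause is generated. This is an illustrative sketch only: the Puppet class and its fields are names I made up, and the URL is the placeholder from the template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Puppet:
    """A reusable character: a name, a short signature description, an optional image."""
    name: str
    description: str
    reference_url: str = ""

    def anchor_clause(self):
        # The anchoring step: 'Name is description::'
        return f"{self.name} is {self.description}::"


amit = Puppet(
    name="Amit",
    description="a middle-aged Indian man with short black hair",
    reference_url="http://www.host.com/ReferenceImage.jpg",  # placeholder URL
)

# The same puppet dropped into two different settings.
for setting in ("an empty midnight forest clearing", "a modern bathroom"):
    print(f"{amit.reference_url} Setting is {setting}:: {amit.anchor_clause()}")
```

Keeping wardrobe and props out of the description field is what lets you change them freely in the next step.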
Here’s where we are in the v4 wizard/bathroom example: <https://s.mj.run/CUB6zznrkZo> <https://s.mj.run/NbRZHucYw8k> Setting is a modern bathroom:: Hinkleberry is a wizard in a yellow hat:: Setting contains Hinkleberry carrying a laundry basket:: morning atmosphere, photographed by Anton Pieck, Frida Kahlo, Beatrix Potter, Gediminas Pranckevicius, Robert Rauschenberg --v 4 --q 2
  4. The text description of the character doing something…
The text description of your character doing something in the scene or setting must start with Setting contains (case-sensitive). Now you’re puppeting the puppet. This statement should be kept to only two or three elements, such as the character’s wardrobe, an emotion, and a key activity. (See example.) Changing the character’s clothes? Giving him a prop? Giving him a dynamic pose? Remember that the image reference will likely win over your prompt unless you use weights to increase the influence of the prompt.
✅Good:
Setting contains a frightened Amit with messy hair, wearing pajamas, pointing a flashlight into the trees::
🚫Bad:
Setting contains the cinematic wide-angle shot of Amit being horrified at a glimpse of a demonic face in the trees and fumbling in his backpack for his flashlight or a weapon and he realizes he’s wearing pajamas and starts crying::
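To keep the action clause down to a few elements, you could build it from a subject plus a capped list of details. This is a sketch under my own assumptions: the function name is hypothetical, and the cap of three details is one reading of the "two or three elements" advice above.

```python
MAX_DETAILS = 3  # assumed cap, based on the "two or three elements" advice


def action_clause(subject, details):
    """Build 'Setting contains subject, detail, detail::' from short phrases."""
    if len(details) > MAX_DETAILS:
        raise ValueError(f"use at most {MAX_DETAILS} details; trim the action")
    return f"Setting contains {', '.join([subject, *details])}::"


print(action_clause(
    "a frightened Amit with messy hair",
    ["wearing pajamas", "pointing a flashlight into the trees"],
))
```

If the helper raises, that is a prompt to decide which detail matters most for this roll and save the rest for a later scene.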
Here’s where we are in the v4 wizard/bathroom example: <https://s.mj.run/CUB6zznrkZo> <https://s.mj.run/NbRZHucYw8k> Setting is a modern bathroom:: Hinkleberry is a wizard in a yellow hat:: Setting contains Hinkleberry carrying a laundry basket:: morning atmosphere, photographed by Anton Pieck, Frida Kahlo, Beatrix Potter, Gediminas Pranckevicius, Robert Rauschenberg --v 4 --q 2

5) The text styling/sourcing prompt to help MJ render your vision…

The text styling/sourcing prompt to help Midjourney render your vision is familiar to you from other prompts. I recommend 1-2 atmosphere words, 1-2 filler details, a media type, and some artists. See the example prompt below. The image reference styles will likely win over your prompt styles unless you use weights to increase the influence of the prompt.
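The styling recipe (1-2 atmosphere words, 1-2 filler details, a media type, and some artists) can be sketched as a helper that enforces those counts. The function name and argument layout are invented for this example; the word order of the output simply copies the Amit example prompt.

```python
def styling_clause(atmosphere, details, media, artists):
    """Compose 'X atmosphere, Y details, media by artists' from short word lists."""
    if not 1 <= len(atmosphere) <= 2:
        raise ValueError("use 1-2 atmosphere words")
    if not 1 <= len(details) <= 2:
        raise ValueError("use 1-2 filler detail words")
    return (f"{' '.join(atmosphere)} atmosphere, "
            f"{' '.join(details)} details, "
            f"{media} by {', '.join(artists)}")


print(styling_clause(
    atmosphere=["ominous", "rainstorm"],
    details=["fairy", "forest"],
    media="photography",
    artists=["Anton Pieck", "Frida Kahlo"],
))
```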
Here’s where we are in the wizard/bathroom example: <https://s.mj.run/CUB6zznrkZo> <https://s.mj.run/NbRZHucYw8k> Setting is a modern bathroom:: Hinkleberry is a wizard in a yellow hat:: Setting contains Hinkleberry carrying a laundry basket:: morning atmosphere, photographed by Anton Pieck, Frida Kahlo, Beatrix Potter, Gediminas Pranckevicius, Robert Rauschenberg --v 4 --q 2

📣 Here’s how it looks all together…

https://as2.ftcdn.net/pFtKgQ.jpg Setting is an empty midnight forest clearing:: Amit is a middle-aged Indian man with short black hair:: Setting contains a frightened Amit wearing pajamas pointing a flashlight into the trees:: ominous rainstorm atmosphere, fairy forest details, photography by Anton Pieck, Frida Kahlo, Beatrix Potter, Gediminas Pranckevicius, Robert Rauschenberg --v 3 --ar 9:16
🎭TLDR: Use this template with its three keywords (Setting is, Name is, Setting contains). The reference image(s) plus the description of the person (together) act as your reusable Puppet (actor, player, character). You can change the Setting, Activity, Atmosphere, Media Type, etc., around it to stage-direct that character.
Use weights to increase the influence of various parts of this prompt.

Example v4 wizard/bathroom output…

<https://s.mj.run/CUB6zznrkZo> <https://s.mj.run/NbRZHucYw8k> Setting is a modern bathroom:: Hinkleberry is a wizard in a yellow hat:: Setting contains Hinkleberry carrying a laundry basket:: morning atmosphere, photographed by Anton Pieck, Frida Kahlo, Beatrix Potter, Gediminas Pranckevicius, Robert Rauschenberg --v 4 --q 2
BEHOLD! The v4 wizard does not have a body or arms. That’s because my reference image did not include them. Your reference image needs to include all the features you want to see in your final output, because Midjourney will struggle to invent what isn’t already there. Poor guy. He just wanted to do his laundry. My reference image has no arms, see.
https://lh3.googleusercontent.com/N-rQWIilzXxGXLByrPusFRB7D8yXd5GgIJFvosmVJALa-6y-1IP2ENAA3popE1KTzNBBG2EEJaywZFdgNWhaEpNFaBn3YZK8KKa0jXRzB74gMdfmtvk9JHi7VLZIT_5OiGy5GeK9bh5X6W4q
Be sure to read up on prompting differences in v4 for best effect!

🔁Example v3 output, first roll, Image-Based Puppet, with one character reference image…

Notes: Use weights to increase or decrease the influence of your setting, character, and actions. For example, if Amit just refuses to wear his pajamas or point his flashlight into the trees, increase the weight of that phrase to ::1.1 and roll again.
More about weights: https://discord.com/channels/662267976984297473/1020578042285920327
Or, if you’re feeling confident, try moving his costume description into his character definition line, or reiterating it there.
FAQ: It’s midnight, why is it so bright in this picture? Because I specified it as photography, which is always going to be more brightly lit. Experiment with other media types for other effects.
https://lh4.googleusercontent.com/UIb86mUy8pD9-MWp1FY-83hNbAfE3fK6UpoL7Y_UTUySN2lphhaQmQ71VmI0M7lUoi7gRdSmUPkPeSIRgpkqaHeUSpo3QundjX1snGi_8J-6GPcZuIPJCf1PMhWon0nVy8lA61NxMS9WajYi
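The note about raising a stubborn phrase to ::1.1 can be sketched as a text edit on the prompt. This is a minimal illustration with a hypothetical function name; it assumes the phrase is followed by a bare :: (no weight yet) and refuses to touch a clause that already has one.

```python
def boost_clause(prompt, clause_text, weight=1.1):
    """Append a weight to the '::' that follows clause_text, e.g. '...trees::1.1'."""
    marker = f"{clause_text}::"
    start = prompt.find(marker)
    if start == -1:
        raise ValueError("clause not found in prompt")
    end = start + len(marker)
    # A bare '::' is followed by whitespace or the end of the prompt.
    if end < len(prompt) and not prompt[end].isspace():
        raise ValueError("clause already carries a weight")
    return f"{prompt[:end]}{weight}{prompt[end:]}"


prompt = ("Setting is an empty midnight forest clearing:: "
          "Setting contains a frightened Amit wearing pajamas:: "
          "ominous rainstorm atmosphere")
print(boost_clause(prompt, "a frightened Amit wearing pajamas"))
```

Nudging one clause at a time and rolling again makes it easier to see which weight actually changed the picture.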

Method #2: Name-Based Puppet (v3 mostly - v4 may be too chaotic for this)

CHECKLIST:
  • You will need an evocative made-up name for your character.
  • You will need 1 text description of a scene, background, or setting.
  • You will need 1 text description of your character. (do not skip)
  • You will need 1 text description of the character doing something.
  • You will need 1 text styling/sourcing prompt to help MJ render your vision.
In v3, start prompting “A portrait of Made Up Name” until your result looks like the character you want.
That’s the person Midjourney associates with that made-up name.
You can use this made-up name in your prompts to summon up / invoke a character that looks roughly the same every time.
It’s best if the name encodes some characteristics about the person. For example, “Athena Proudbody” and “Happy Youngman” will create more consistent characters than “Megan Smith” or “Michael McCullough”.
For coherency, you can remix or remaster at the end, but we suggest starting in v3 for this method.
Example:
A portrait of Hipsterman Greenfellow
A portrait of Artemis Darkshadow
A portrait of Oldman McSmileface
I would prompt until I find a character’s appearance that I like…
Now, every time I want to see someone roughly like Oldman McSmileface, I can invoke him with the name and a supporting description.
Setting is an empty midnight forest clearing:: Oldman McSmileface is an old man with messy hair:: Setting contains a frightened Oldman McSmileface wearing pajamas pointing a flashlight into the trees:: ominous rainstorm atmosphere, fairy forest details, photography by Anton Pieck, Frida Kahlo, Beatrix Potter, Gediminas Pranckevicius, Robert Rauschenberg --ar 9:16
https://lh6.googleusercontent.com/6UpYGNnuDcJ5u0VUVgTdllO1acT3uAiX4DrCy6WCGzqREwlWwNILe2nz3Jvfw9u_dg2o0bwjhVCI2U7E2DqPyTl2C_mq5DxOYqQwCcmDAScHeE9SwsugM1FumdV1mIoE68TcEppPRUQ0pB8s
https://lh3.googleusercontent.com/S73DzDKp3FQ7iQI5AlvzFYiv4ixGj12UADFShGWuM5FmWDQuAEmjR9Ji0DikndCrvlfynA8xEifd-zUcai10_JTjM3HoajOxqM1_gNEDOzxFtfSfNiSLQpPteSLi7xBKomftB8CvpwA5Tuaa

🔁Example v3 output, first roll, Name-Based Puppet, no reference image…

Notes: As with everything in Midjourney, the quality of the picture may be helped through rerolling, vrolling, and increasing quality settings. For example, you can experiment with --q 2 and --q 5, but be aware these burn more GPU time.
Struggling with writing effective prompts? https://bit.ly/Clarinet-Prompt-Troubleshooting
https://lh4.googleusercontent.com/LJC0llxLCo0s9JstTr3HYhM4w_kA3MaLmialgj4PhHXFr_KNlbi4p4fmyN1IbKOW3VOLe-y_efmS2jxAgWL-lyHmoPoVaMj0TjBIpZ7RuBcVxrnQiY0JD0LEjwBHAiLZa6st7DmwkZwiUpA7

📖Telling a Story Using Puppets (v3 example)

Remember that multiple rolls are required to evolve any composition. Below is an example of what this “puppet” template can accomplish within only a handful of rolls. Think of this as a proof-of-concept only.
Watch the scene, action, and atmosphere change around the core character.
Mary had a little lamb whose fleece was white as snow
Setting is an empty grassy meadow:: Mary is a blonde nine-year-old girl in blue 16th century peasant garb:: Setting contains a lamb and standing Mary:: Atmosphere is serene springtime morning, charming details --v 3 --aspect 5:7 --q 2 --uplight - @whatnostop Rolls: 1
Setting is a dirt road winding through the countryside:: Mary is a blonde nine-year-old girl in blue 16th century peasant garb:: Setting contains a lamb and walking Mary:: Atmosphere is serene springtime morning, quaint antique charming details --v 3 --aspect 5:7 --q 2 --uplight - @whatnostop Rolls: 1
https://lh5.googleusercontent.com/ztjqujhTaBBOjBsZvEMmSxeBHT9cusus9NJal42r0sUupwu4DwiLJiFcZapIsOaoQqfJbaYi3YSbNnwvMOKiZKn4jWMK8dDY_KV7qAEvSoHzmTyVkEqrGU9iRgpxUeFnXYKjGxmIosm42jXi
Setting is a school-house exterior:: Mary is a blonde nine-year-old girl in blue 16th century peasant garb:: Setting contains a frightened lamb::1.3 nearby, a pensive Mary:: Atmosphere is ominous springtime, charming details --v 3 --aspect 5:7 --q 2 --uplight - @whatnostop Rolls: 2
https://lh5.googleusercontent.com/9YNHb9YXWnbLpFpLW1efInqtldrECgQBWT9FprFjztl0-xqBXGQsnEuhchyrv8ObUWuZJDZLKZpoIA9YsTQI0VCZMMC-OPOr6j9vDb9uH-6XVGRGTlBQXdGIKmRMtE4Dh9B8lm6cG-FDoRHs
Setting is a school-house interior:: Mary is a blonde nine-year-old girl in blue 16th century peasant garb:: Setting contains a lamb::1.2 and Mary:: and several happy excited playful celebrating children::1.1 Atmosphere is joyful light magical springtime, charming details --v 3 --aspect 5:7 --q 2 --uplight - @whatnostop Rolls: 3
https://lh6.googleusercontent.com/LkDIxx5lTLHa8LtiIaB_7o7VBS2N__K2uck43GPNE4-DpPIZVXAaf6t_T6gSeF_CuUh7d0JyhfyeJKKAT8kGQLDdmZssftGVbD2QFltaZtSwzudxFToH8_YqRCTDUXN1PmjDm00NA8mhSkl3
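The Mary sequence above follows a pattern you could script: hold the character clause and flags constant, and vary only the setting, action, and atmosphere per panel. The sketch below reproduces the first two panels; the function name storyboard and the tuple layout are my own invention, and it leaves out the per-phrase weights used in the later panels.

```python
CHARACTER = "Mary is a blonde nine-year-old girl in blue 16th century peasant garb::"
FLAGS = "--v 3 --aspect 5:7 --q 2 --uplight"

# (setting, action, atmosphere) for each panel of the story.
SCENES = [
    ("an empty grassy meadow",
     "a lamb and standing Mary",
     "serene springtime morning, charming details"),
    ("a dirt road winding through the countryside",
     "a lamb and walking Mary",
     "serene springtime morning, quaint antique charming details"),
]


def storyboard(character, scenes, flags):
    """Return one prompt per scene, with the character clause held constant."""
    return [
        f"Setting is {setting}:: {character} "
        f"Setting contains {action}:: Atmosphere is {mood} {flags}"
        for setting, action, mood in scenes
    ]


for panel in storyboard(CHARACTER, SCENES, FLAGS):
    print(panel)
```

Each panel will still need its own rolls; the script only keeps the wording of the puppet identical from scene to scene.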
Last updated by @whatnostop#6700 (clarinet) 5:00 PM 11/14/2022.
Please contact me via Discord with feedback and new discoveries! Let’s keep this updated with our community research.
Together, we can improve and refine this method.

🖍️Clarinet’s Midjourney Profile🖍️

Appendix of User Discoveries

11/14/2022 UPDATE

I am still exploring puppets in v4. Want to help? Send me a note in Discord if you discover anything and let’s add your discoveries to this document.

08/09/2022 UPDATE

[10:54 AM] Joe Penna (MysteryGuitarMan): Do we know that the "setting is" kind of thing works with MJ?
[11:03 AM] clarinet: It's responding to the repeated term, I think, if-and-only-if that term is not producing any other effect. For example, the same prompt using Scene is actually produces a stage, curtains, actors, and a director. A literal scene. But Setting does not. But perhaps Thsudjfl also would not, because it is also a meaningless string.
[11:04 AM] clarinet: Similarly, Background is has its own baggage.
[11:04 AM] clarinet: Background, scene, backdrop, surroundings, context... all these words did something else.
[11:05 AM] clarinet: The word Setting was the first one I found that was sufficiently baggage-free to use in this manner.
[11:06 AM] clarinet: But listen, try removing the words “Setting is” and “Name is” and “Setting contains” from each phrase and tell me if you get the same results. I am bemused, I don’t think they’re as vital as the template implies.
[11:09 AM] shamgar: huh, you're right. ah so much to play with.
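To try the comparison clarinet suggests in the chat above, you could generate a keyword-free twin of any template prompt and roll both. This is a rough sketch with a hypothetical function name; it assumes the keywords appear exactly as written in the template.

```python
import re

# Matches the two Setting keywords from the template.
SETTING_KEYWORD = re.compile(r"\bSetting (?:is|contains) ")


def strip_keywords(prompt, name):
    """Remove 'Setting is', 'Setting contains', and the first 'Name is' from a prompt."""
    without_setting = SETTING_KEYWORD.sub("", prompt)
    return without_setting.replace(f"{name} is ", "", 1)


original = ("Setting is an empty midnight forest clearing:: "
            "Amit is a middle-aged Indian man with short black hair:: "
            "Setting contains a frightened Amit wearing pajamas:: "
            "ominous rainstorm atmosphere")
print(original)
print(strip_keywords(original, "Amit"))
```

Rolling both versions a few times each is a fairer test than comparing single images.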