I have several bookmarks that are my go-to when people ask me for help with UTAU, so I figured I'd post them all for you guys XD

What to Look For in a Japanese Reclist (and what to avoid)
How to Pronounce Japanese F's and R's
Oto Theory Tutorial by Cdra
Essential UTAU mixing kit
The Complete Moresampler Tutorial
List of Standard UTAU Flags (and what they do)
PaintedCZ's VCCV tutorials: YouTube playlist | Written Version

If you have any tutorials you like that you want added, go ahead and put them in the comments!

I literally copied this out of Word, so if the images don't load right, please try one of these downloadable versions.


    Please note that while this is not a direct transcript, all material from this tutorial is taken from CZ’s video tutorials on their channel. I’ve simply recorded the information in those videos in a written format for people who either can’t hear very well, don’t speak English very well, or don’t like sitting through video tutorials. You may translate or repost this tutorial in any form as long as you credit PaintedCZ for the tutorial’s content.

I. Introduction

                This video serves as an introduction to the system as well as the rest of the tutorials. This reclist (called the CORE English Reclist) was created for users of UTAU with the intention of being easy to record and use. This list was designed to be modular, so that things can be easily added or removed, and to be a base for users to work from. This list includes custom phonetics, aliasing, otoing, and usting systems, each of which will have its own section in this tutorial.

                The reclist is split into ten different sections to make it easier to record and oto. Each section is organized alphabetically and by vowel, as seen in Figure 1. You will also notice that the file names of each list are followed by numbers: the first number is how many recordings are in that list, while the second number (in parentheses) is the number of oto lines that result from that list. Each section has its own folder and its own base oto, which are provided with the reclist. The reclist also includes background music for use while recording with OREMO.


II. Pronunciation (Phonetics)

                In this video, CZ tells how to pronounce all of the phonetics used in the VCCV CORE English Reclist. While this list was designed for a general American accent, it works for almost any English accent. Accent compatibility is still being tested. I’ve included CZ’s reference charts (Figures 2-6) in this section, as well as my own chart (Figure 7), but I highly recommend that you watch the video to hear the phonetics pronounced out loud and to practice writing with them.

Figure 2

Figure 3

Figure 4

Figure 5

Figure 6


Figure 7

III. OREMO (Recording)

                This tutorial goes over how to install and use OREMO for the VCCV reclist. While it is not necessary to use the OREMO program, it is highly recommended by both CZ and myself. The first thing to do is to make sure your system locale is set to Japanese. Open the Control Panel, click on “Clock, Language, and Region,” then “Region and Language.” Under the Administrative tab, click on “Change System Locale” and select “Japanese (Japan)” from the drop-down menu. Click “OK” to save this. You will need to restart your computer for this to take effect. If you don’t set your system locale to Japanese, OREMO may have issues installing and running. Download OREMO here, extract the folder to any location, and run the program.

                The first thing you should do now is load the reclist and recording folder. Each reclist comes with its own corresponding folder, and you want to save the recordings from each reclist into its corresponding folder. In OREMO, go to File > Load Voice List, navigate to where you’ve saved the reclist and pick the first section. Then, go to File > Set Recording Folder and select the corresponding folder. Folders will NOT be listed in the same order as the text files, so be careful not to pick the wrong one. Then go to Options > Audio I/O Settings and make sure you are set to the default (Wave Mapper) on both Input and Output.

                Next, go to Options > Recording Style Settings. Select “Automatic Recording 1” (the second option). This way, when you press R once, the selected background music will play and OREMO will start recording on its own until the music finishes. At the bottom of the window where it says “Guide Background Music,” click the blue folder and navigate to the background music folder in the VCCV reclist. Select the pitch that you want to record on and hit OK for both windows.

                After that, go to Options > Advanced Settings. Check “Show Target Tone” in the bottom right and set the target tone to the pitch you selected earlier. Then hit “Go” underneath that so that the range of displayed pitches will zoom in on the pitch you’ve selected. Above that, check “Show Key Grids.” In the left column, you can change the color in which the spectrum is displayed; CZ and I both prefer color1. Then hit “Apply” and “OK” at the bottom.

                Now go to “Show” and check “Show Spectrum,” “Show Power,” and “Show F0.” This will allow you to see the spectrum, volume, and pitch of your recording. Then go to File > Save Current Configurations to Initialization File and just click “Save.” That way, you won’t have to set all of this up every time you open OREMO.

                CZ demonstrates how to record each section’s recordings along with the background music provided. When you press “R,” you will hear the tone you are recording on, then a cymbal crash (which is where you should breathe in), and then four quarter notes at 120 BPM at the pitch you are recording on. You should record the syllables of the recording on beat with these quarter notes. Once you’ve recorded, all you have to do is press the down arrow key to move on to the next sound. OREMO will save the recordings to the folder you’ve chosen automatically. You can also press the space bar to play your recording again.
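If you want to know how long each of those guide notes actually is, it's simple arithmetic. Here's a little sketch in Python; the function name is just something I made up, and the only thing it assumes is what's stated above (four quarter notes at 120 BPM).

```python
def quarter_note_ms(bpm: float) -> float:
    """Length of one quarter note in milliseconds at the given tempo."""
    return 60_000 / bpm

# Onsets of the four guide notes, measured from the start of the first note
beat_ms = quarter_note_ms(120)
guide_onsets_ms = [i * beat_ms for i in range(4)]

print(beat_ms)          # 500.0
print(guide_onsets_ms)  # [0.0, 500.0, 1000.0, 1500.0]
```

So each syllable in a recording gets about half a second, which is worth keeping in mind when you look at the base otos later.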

PLEASE NOTE: Some sections, like the CC- section, utilize solitary consonants as well as underscores and dashes. These are only for organizational purposes and do not affect the recording. DO NOT record solitary consonants.

                You may record with other methods, but keep in mind that if you don’t use this method your recordings will not line up with the base otos at all and you’ll only make it harder on yourself. This is the quickest and easiest method to record VCCV. Just make sure that when you switch to the lists of other sections you also switch folders or you’ll mess yourself up.
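If you're worried about having saved recordings into the wrong folder, a quick script can compare a reclist against a folder. This is only a rough sketch and it makes a couple of assumptions that you should check against your own files: that the reclist is a plain text file with entries separated by spaces or line breaks, and that each recording is saved as the entry name plus ".wav". The function names and the second sample entry are made up for illustration.

```python
from pathlib import Path

def missing_recordings(reclist_text: str, wav_names: list[str]) -> list[str]:
    """Return the reclist entries that have no matching .wav file name."""
    entries = reclist_text.split()  # assumes whitespace-separated entries
    recorded = {
        Path(name).stem for name in wav_names if name.lower().endswith(".wav")
    }
    return [entry for entry in entries if entry not in recorded]

def check_folder(reclist_path: str, folder_path: str) -> list[str]:
    """Compare one reclist text file against one recording folder on disk."""
    # The encoding here is an assumption; change it if your reclist differs.
    text = Path(reclist_path).read_text(encoding="utf-8")
    names = [p.name for p in Path(folder_path).iterdir()]
    return missing_recordings(text, names)

# Made-up example: two entries in the list, only one recorded so far
print(missing_recordings("-ba_bab_kab -da_dad_kad", ["-ba_bab_kab.wav"]))
```

If a folder comes back with nearly every entry missing, you probably recorded that section into a different folder.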

                Another thing to keep in mind is that in the CV_CVC section some consonants are aspirated and others are non-aspirated. For example, in the recording “-ba_bab_kab” the last b in “bab” is aspirated while the b in “kab” is not. There’s more info on this in CZ’s video as well. It is also important to note that in the V section, the first vowel should encompass two quarter notes, whereas the second sound should only encompass one. This is especially important in diphthongs like “A” and “O” – do not say the vowel twice, simply hold it for an extended amount of time. In the VV folder, you should pronounce both vowels together without a break in between, and each vowel should last a quarter note. Ignore dashes and underscores.

                If you aren’t sure what to record, check the pronunciation tutorial or download CZloid’s bank to look at the recording specifically. OREMO’s font can make capital I’s and lower-case L’s look the same, but the pattern of consonants and vowels never changes within sections, so you can look at the recording’s surroundings for clues.



IV. SETPARAM (Otoing)

                This section covers how to oto VCCV English in a program called SETPARAM. While the program isn’t necessary, it is extremely useful for the type of oto that VCCV uses. VCCV banks are huge, and otos can be intimidating, but using SETPARAM will help you finish with minimal suffering.

                To download SETPARAM, click here. Extract the files to any location and open the program. Navigate to where you saved your recordings and select a section to work on. Open it and load the voice configurations that came with the reclist download. Just like you did with OREMO, go to “Show” and check “Show Spectrum,” “Show Power,” and “Show F0.” Then go to Options > Mouse and Key Settings and check “Utterance Timing Adjustment Mode.” Then go back to Options > Mouse and Key Settings > When left blank is modified… and select “Other parameters are modified accordingly to keep relative positions.” You also want to select this option for “When preutterance is modified….” Then go to Options > Mouse and Key Settings > Auto Focus Settings and make sure preutterance is selected.

                Next, go to Advanced Settings and set the spectrum color to color1. You can also have it show the target tone as described in the OREMO tutorial. After that, go to Show > Time Zoom and select x10, then save the current configurations to the initialization file like you did with OREMO.

                The base oto will not be spot-on, but it will be very close as long as you followed the recording method described in the previous tutorial, and it will be very easy to fix the oto to suit your bank. This is because with the way we’ve configured SETPARAM, you can set the preutterance with a click and the rest of the configurations will move with it. Now all there is to do is set the preutterance where it goes, and the oto will be done before you know it! CZ outlines how to oto each individual type of otoing line, as well as what to look for in the spectrum. You will be looking at the spectrum for pretty much everything you need to know. Black areas are areas of silence, looping colored regions are vowels, and consonants look somewhat different from each other, but are always easy to distinguish from a vowel on the spectrogram. CZ also goes in-depth as to what all the lines and numbers mean and do if you’ve never otoed before, and she explains it better than I could. For now, I will simply list where the preutterance and other values should go on each type of otoing segment (I may produce a more detailed tutorial on otoing theory in the future). The overlap must ALWAYS be half of the preutterance value, otherwise the UST won’t work right. CZ includes a list of number sets you can copy and paste to make sure of this.
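To double-check the "overlap is half of the preutterance" rule after you've finished a section, you can run the oto lines through a short script. This is a sketch, not an official tool. It relies on the usual oto.ini line layout (wav file name, then alias, left blank, consonant, right blank, preutterance, overlap), and the sample values, the function names, and the half-millisecond tolerance are all made up for illustration.

```python
def parse_oto_line(line: str) -> dict:
    """Split one oto.ini line into named fields (blank numbers count as 0)."""
    wav, _, params = line.strip().partition("=")
    alias, *numbers = params.split(",")
    left_blank, consonant, right_blank, preutterance, overlap = (
        float(n or 0) for n in numbers
    )
    return {
        "wav": wav,
        "alias": alias,
        "left_blank": left_blank,
        "consonant": consonant,
        "right_blank": right_blank,
        "preutterance": preutterance,
        "overlap": overlap,
    }

def overlap_is_half(entry: dict, tolerance: float = 0.5) -> bool:
    """True if overlap is half of preutterance, within a small tolerance (ms)."""
    return abs(entry["overlap"] - entry["preutterance"] / 2) <= tolerance

# Made-up sample line: preutterance 80, overlap 40
good = parse_oto_line("-ba_bab_kab.wav=-ba,500,120,-250,80,40")
print(good["alias"], overlap_is_half(good))  # -ba True
```

Any line that comes back False is one where the preutterance or overlap got moved on its own, so paste one of CZ's number sets back in for that line.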

    ·         -CV and CV

    o   These types make up most of the bank.

    o   Preutterance should be placed after the consonant right before the start of the vowel.

    o   The left blank of a CV oto should be right after the previous vowel, so that none of the previous vowel is included. If the preutterance is too large for this, copy and paste a smaller number set into the chart.

    §  The left blank of a -CV oto doesn’t matter as long as it leaves ample space before the consonant.

    o   The consonant portion should include the entire consonant and part of the start of the vowel, until the vowel becomes consistent.

    o   The right blank should be placed at the end of the vowel, before the vowel becomes inconsistent.

    ·         V C (note the space)

    o   This section will look the most like a VCV oto.

    o   Preutterance should go at the end of the vowel (the very end, after it fades out). It should encompass the majority of the previous note. It should also go where the left blank was on the previous sample, so that the two crossfade smoothly.

    o   The left blank should be after the previous consonant, where the vowel starts to become consistent.

    o   The consonant portion should include the majority of the recording, stopping sometime after the consonant.

    o   The right blank literally does not matter. As long as you are USTing properly, you will never hear anything after the consonant portion. Leave the right blank alone.

    ·         VC

    o   These are NOT otoed the same way as VC- sections.

    o   These are supposed to be short, aspirated consonants that come before another consonant.

    o   The preutterance should go after the consonant, before the aspiration. If there is no aspiration, then just put the preutterance after the consonant. CZ goes over what this looks like more in-depth in the video.

    o   The left blank should cut off the previous consonant, just like with V C otos.

    o   The consonant portion should encompass all of the consonant, vowel, and aspiration.

    o   The right blank should come just a few milliseconds after the consonant, cutting off the following consonant and leaving a tiny bit of empty space.

    ·         VC-

    o   These are NOT otoed the same way as VC sections.

    o   These are used to end with a consonant and to attach consonant combinations to the end of samples.

    o   Preutterance goes at the very end of the vowel, like with a V C section.

    o   The left blank should exclude the previous consonant like in VC and V C.

    o   The consonant should include EVERYTHING, both the vowel and the consonant.

    o   The right blank doesn’t matter as long as there’s silence between it and the consonant.

    ·         -CC and beginning CC

    o   In these, we treat the second consonant as if it were the vowel.

    o   Preutterance should go right between the two consonants; right where the first one ends and the second one begins.

    o   The left blank should exclude the previous vowel.

    o   The consonant should cover all of the second consonant.

    o   Like in V C, the right blank does not matter.

    ·         _CV

    o   These are used to connect to -CC and beginning CCs.

    o   The preutterance should go where the vowel starts. It’ll probably be fairly small.

    o   The left blank should go at the end of the first consonant, right before the second one begins. Make sure the beginning consonant is cut off.

    o   The consonant should include the second consonant and part of the vowel.

    o   The right blank should be right before the vowel stops being consistent.

    ·         VCC

    o   The preutterance should go after the last consonant, like a VC oto.

    o   All other values are set the same way as VC

    ·         VCC-

    o   The preutterance should go after the first consonant, treating that consonant like the vowel in a VC- oto.

    o   All other values are set the same way as VC-

    ·         Ending CC

    o   These are otoed the exact same way as VCCs except the left blank should exclude the vowel.

    ·         CC-

    o   These are otoed the exact same way as VCC-s except the left blank should exclude the vowel.

    ·         -V

    o   Preutterance should go at the start of the vowel.

    o   The left blank should leave adequate space before the vowel.

    o   The consonant should go until the vowel becomes consistent.

    o   The right blank should cut off before the vowel becomes inconsistent (but if you recorded right you shouldn’t need to worry about it).

    ·         V

    o   The preutterance’s exact location doesn’t matter. All that matters here is that the left blank cuts off the very beginning of the vowel and that the right blank cuts off the end of the vowel before it becomes inconsistent. These rarely require adjustment.

    ·         V-

    o   This is otoed the same way as a VC-, with the preutterance at the end of the vowel.

    ·         _V

    o   These are for blending vowels with consonants that aren’t followed by vowels in the main reclist, such as ng.

    o   The preutterance should be at the start of the vowel. It’ll be really small.

    o   The left blank should be after the consonant at the beginning.

    o   The consonant and right blank are the same as a _CV.

    ·         VV

    o   The preutterance should go right between the two vowels.

    o   The left blank should cut off the beginning of the first vowel.

    o   The right blank should cut off the end of the second vowel.

    CZ also shows what every consonant looks like on the spectrum at the end of her video. I won’t for the sake of time, but you can see them on her video beginning at 17:19.

V. USTing

                This tutorial assumes that you’ve watched/read all of the previous tutorials and that you have a basic understanding of how to use UTAU. If you have never used UTAU, or don’t have a basic knowledge of the program, read this tutorial by Cdra before you continue. You will also need this flag editor plugin, because making a VCCV ust requires you to edit the Consonant Velocity of every note. I cannot emphasize this enough: consonant velocity is the most important part of making a VCCV ust.

                CZ describes in great detail how the oto works to make a ust. Since you recorded quarter notes at a BPM of 120 (if you recorded correctly), you can replicate recordings in UTAU by making quarter notes at 120 BPM. Quarter notes at 120 BPM do not need to have their consonant velocity edited, since consonant velocity is only used to compensate for making notes longer or shorter.

                To make a ust, start with the CV notes as seen in Figure 8. Then, grab the end of the first note and drag to create a rest that’s the same size as the envelope on the previous note, as seen in Figure 9. Double-click the rest note you created to change the lyric to the VC or V C lyric that you want to use.

                To actually use this in a song, import a MIDI or VSQ into UTAU and then change the notes to the CV portions of the lyrics for a small section. Then, highlight that section and open the flag editor plugin. For notes that are shorter than a quarter note at 120 BPM, increase the consonant velocity, and for notes that are longer, decrease the consonant velocity. Then go through the section adding the VC portions of the notes as seen in the previous paragraph. After you do that, highlight the entire section again and open the flag editor plugin. VC notes should have the same consonant velocities as their CV counterparts. After setting the consonant velocities for the VC notes, right-click on the selection and select Properties. The second box from the top is labeled Modulation, and it should be greyed out. Double-click inside that box and type “0.” If you don’t, it will sound awful, just trust me. Then, while the section is still selected, press the buttons in the top right labeled “P2P3” and “ACPT” in that order. Repeat for the entire ust and tune as normal.
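If you want a rough starting number for the consonant velocity instead of guessing, here's a sketch. Big caveat: these are NOT CZ's numbers. It assumes UTAU scales the consonant length by 2^(1 - velocity/100), so that 100 leaves it alone, 200 halves it, and 0 doubles it, and it simply picks the velocity that scales the consonant by the same ratio as the note. The function name is made up, and you should still adjust by ear.

```python
import math

RECORDED_NOTE_MS = 500.0  # a quarter note at 120 BPM, as in the recordings

def suggested_velocity(note_ms: float) -> float:
    """Starting-point consonant velocity for a note of the given length.

    Assumes consonant length is scaled by 2 ** (1 - velocity / 100).
    """
    ratio = note_ms / RECORDED_NOTE_MS
    velocity = 100.0 * (1.0 - math.log2(ratio))
    return max(0.0, min(200.0, velocity))  # keep within the 0 to 200 range

print(suggested_velocity(500))   # 100.0 (same as recorded, no change)
print(suggested_velocity(250))   # 200.0 (shorter note, higher velocity)
print(suggested_velocity(1000))  # 0.0   (longer note, lower velocity)
```

This matches the rule of thumb above: shorter notes get a higher consonant velocity and longer notes get a lower one.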

                CZ goes more in depth about the alias types and functions both in the previous tutorials and her video of this tutorial. All of her tutorials can be found in order in this playlist on YouTube. I highly recommend that you watch all of them all the way through before using or recording VCCV.

Please note these prices are for standard size reclists. I may ask for additional payment for extras depending on how much they add to the oto size.

CV Japanese: $1 per pitch
VCV Japanese: $5 for the first pitch and $2 for each additional pitch
CVVC Japanese: $4 for the first pitch and $2 for each additional pitch
VCCV English: $12 for the first pitch and $6 for each additional pitch

CV Sample | VCV Sample | CVVC Sample | VCCV English Sample | Demo Reel

Email me to ask for a commission, or note me here on DA.
I've submitted both Bunny and Gemini's official artwork to the print shop in hopes of generating a little money, and I'm willing to send some of my older works to the shop as well but only if someone actually wants them (in my opinion all my old art is ugly and worthless so XD)

I'm also thinking of opening commissions for artwork and UTAU otos, but I'm not sure what to do about pricing. I'm also not entirely sure how point commissions work, so they might be cash only.
Normally I don't post my covers here unless there's some art or MMD to go with them but the YouTube video is blocked in five countries and so I posted the cover on as well.


This guide only has the 100 most commonly used words in English (according to Wikipedia) and only with CZ's phonetic system. I may add other systems like Arpasing and X-SAMPA at some point in the future.

The list is limited, but I hope it's still helpful :)

PLEASE NOTE: In this list, a / indicates where you should put the preceding/following consonant to blend properly. If there is no preceding/following consonant, then use a - in place of the /.


