project – Hexar-Decimal

Programmatic Sentence Generation – Shakespeare!

Introduction

Many moons ago, when AOL Instant Messenger was all the rage, I really wanted to write some kind of chat bot, something like SmarterChild except, perhaps, more amusing and less useful. I do recall attempting to use some Perl-based chatbot APIs to connect one of my accounts to AIM, but had very little success. So I gave up on the AIM API, and focused more on the core functionality of what I wanted a chatbot to do: simply construct a sentence given a “seed” word as the first word in the sentence. Therefore, the “Shakespeare” project was born.

What’s in a Name?

Why is this application called Shakespeare? Well, actually, originally it wasn’t – it was called Jabberwocky. I needed a source text from which to construct sentences, and one of the free online books available on Project Gutenberg was Through the Looking Glass by Lewis Carroll, and Jabberwocky is one of my favorite poems from the book. Not to mention the fact that a programmtically-generated sentence would likely sound rather silly and hilarious. Or so I hoped. Anyway, the first version of my program used Jabberwocky as the source text, but later I decided that I wanted a much larger source text, such as one of the works of Shakespeare. I grabbed the text from The Merchant of Venice, cleaned it up, and started using it, and the sentences it generates now are just as amusing or more.

Program Source

This project is written in C++, and is available on GitHub for public consumption here: Shakespeare. Feel free to check it out. The project opens in Visual Studio 2008, but if you don’t have VS2008 installed then it should be pretty easy to edit and compile using g++ or Eclipse or something =)

Source Explanation

The program works via a simple two-step process:

The source text is parsed and string pairs are put into a multimap
The program reads lines from stdin until “quit” is received, and generates a “sentence” for each word entered.

Let’s examine these steps more closely:

Parsing the source text


    while (!file.eof())
    {
        std::string s;
        file >> s;
        s = strlower(s);

        // Look for a period
        if (s.length() > 1 && s[s.length() - 1] == '.')
        {
            s = s.substr(0, s.length()-1);
            file.putback('.');
        }
        // Look for an exclamation
        else if (s.length() > 1 && s[s.length() - 1] == '!')
        {
            s = s.substr(0, s.length()-1);
            file.putback('!');
        }
        // Look for a question-mark
        else if (s.length() > 1 && s[s.length() - 1] == '?')
        {
            s = s.substr(0, s.length()-1);
            file.putback('?');
        }
        // Look for a comma
        else if (s.length() > 1 && s[s.length() - 1] == ',')
        {
            s = s.substr(0, s.length()-1);
            file.putback(',');
        }

        // Add the data to the map
        if (last != "" && last != "." && last != "!" && last != "?")
        {
            wordmap.insert(make_pair(last, s));
        }
        
        last = s;
    }

The code here reads strings from the file, where each string is separated by whitespace. For the most part these strings represent words, but some of the words end with periods, commas, exclamation marks, or question marks. If one of these characters is encountered, we want to treat it like its own “word”. Next, for each word that we find, we insert it into a multimap where the previously encountered word is the key and the current word is the value. By the way, a multimap is just a C++ map that can have multiple non-unique key-value pairs.

Generating the sentence


        while (true)
        {
            // Check to see if the word is in the multimap
            if (wordmap.find(word) != wordmap.end())
            {
                // Get first mapped value
                multimap::iterator lb = wordmap.lower_bound(word);

                // Count how many there are
                long count = wordmap.count(word);

                // Select one of them
                long index = rand() % count;

                // Advance map iterator
                for (long i = 0; i < index; i++)
                {
                    lb++;
                }

                // Check for characters that would end a sentence
                if (lb->second == "." || lb->second == "!" || lb->second == "?")
                {
                    // If sentence is too short, select another word
                    if (wordcount  1)
                    {
                        // Update the word count in case the only endings to this
                        // word are ! or . or ?
                        wordcount++;
                        continue;
                    }

                    // Print word
                    cout <second << endl;
                    break;
                }
                else if (lb->second == ",")
                {
                    // Print commas
                    cout <second;
                }
                else
                {
                    // Print the word with a space
                    cout << " " <second;
                    wordcount++;
                }

                // Update the current word
                word = lb->second;
            }
            else
            {
                // Not found; end the sentence
                cout << ".\n";
                break;
            }
        }

The code here looks up the first mapped value for the current word in the multimap (if it exists). If it does, then we pick one of the values for this key at random. Note that multiple identical values can exist because of how the source text was parsed, so you have a higher chance of getting a more common “following” word than one that is not so common. Each following word is printed, unless the word is a “sentence-ending” word such as the period, exclamation mark, or question mark and the sentence is deemed to be too small. Not shown here is the initialization of the first word to be the word that the user typed into the application.

Some Example Outputs

I find many of these to be hilarious. The sentences have a tendency to run on and obviously don’t quite follow the rules of sentences. But because each source word precedes a destination word, the sentences generated often almost make sense:

Will be render’d, will you, monsieur le bon?

Why should i know your master will tarry?

Try confusions with his surety and you the occasion to his affections, dispatch all of rich value of fair hand i am in silver ewes, as he hath the fiend bids me this shadow ere thou to him.

Thou like a night troilus methinks it.

Wit in the duke only are not love me to speak for the truth is no so again his outward wall, for a man knows but who bids me the wind!

Pity on his wealth being seasoned with us in alabaster?

Jump with imagined speed, indeed, for this letter call me dog and that is sum that the knave and it to offend, but assumes some taste of blood to my clerk, for i do you a key, there.

Improvements?

Feel free to leave a comment if you have any ideas for ways to make the generated sentences either more realistic, or more amusing =)

RenameM4A

Introduction

For better or for worse, I use iTunes to manage the music on my iPhone. And since I have to have iTunes installed on my computer, I might as well also use it to rip my audio CDs into music files. Traditionally, the file format of choice was MP3, also known as MPEG-1 Audio Layer 3. However, a few years ago I heard about a new file format called AAC (Advanced Audio Coding) promising better sound quality than MP3s at similar bitrates, which iTunes also supports. So I went ahead with a project to re-rip all of my CDs from MP3 into AAC files, which by the way, have the file extension .m4a.

The problem I encountered was that I really disliked the file/folder structure that iTunes automatically uses when it rips your CDs. There seems to be no way to customize this file/folder structure in iTunes, so if you really care about what your files are named and in what directories they are placed, as I do, you’re stuck renaming all of your music files by hand, which is a major pain.

Therefore RenameM4A was born. This simple C# program allows you to select a set of M4A files and rename them according to a rename rule.

Rename Rules: Using RenameM4A

A rename rule is a simple way of enforcing a particular file/folder structure for any M4A file that is to be renamed. Because this is a very simple project, right now the rename rules support the following tags:

As you can see, the tags are surrounded by brackets which denote some kind of information contained within the file. All AAC files have metadata embedded in them which tells you information about the track and where it came from. When RenameM4A processes each file, it looks at the rename rule and replaces any tags encountered with information gathered from the file itself. This step takes place when you press the Preview button, which generates a list of files to be renamed, giving their original file locations and the new file location.

Then, you can hit either the Rename button or the Copy button to move or copy the file according to the destination paths given in the GridView. In fact, if you want to you can edit the GridView in order to fix any discrepancies that may have occurred when running the rule. For example, you might have a track from a particular band’s album where another artist is featured. Since you want the entire album contained within the same folder, you might edit the line for that track to remove the name of the other artist from the directory name.

If you take a look at the rename rule combo box, you can also see that there are a number of pre-loaded rename rules that may appeal to you.

How is AAC better than MP3?

In short, AAC files have superior sound quality and transparency than MP3s do, for the same bitrate. Put another way, you can encode your music at a lower bitrate than an MP3 for the same sound quality. Of course, there are a lot of other improvements, including:

More sample frequencies
Up to 48 channels of audio
Better handling of audio frequencies above 16 kHz
More flexibility in designing codecs

Advanced Audio Coding File Format

A SourceForge project called Atomic Parsley appears to have a lot of good information on the MPEG-4 file format, especially as it relates to M4A files. As it turns out, the AAC data is actually just a section within an MPEG-4 file, which can actually contain all kinds of nested data. Each nested section in an MPEG-4 file is called an atom (or sometimes a box); the first four bytes of each box defines its length in bytes, the next four is the name of the box, and the rest is either data or other nested boxes. In a way, that’s pretty similar to how XML works.

Since the only thing that RenameM4A really cares about is the album, artist, year, title, and track number, it simply scans the file for the five atoms it’s interested in and grabs the data. It’s also worth noting that text data appears to be stored in UTF-8 format, so you can’t simply assume that it will be straight-up ASCII.

Recent Improvements

This blog post actually took me a ridiculously long time to write. Mainly that’s because I was a bit embarrassed to release my source code; it was written several years ago and the way it was parsing M4A files was extremely naive. Recently, I rewrote most of the parsing code to make the following improvements:

More intelligent parsing; use atom information to jump to the sub-atoms of interest instead of scanning the entire file for “udta” and then each meta tag.
Reading only part of the file; since all metadata is under the “moov” atom, RenameM4A only needs to load the portion of the file under moov. The vast majority of a music file’s size is its music data, which is under the separate mdat atom, so we can avoid loading the entire file into memory when we don’t need to.

I did a simple benchmark test of the new and old parsing algorithms by trying to parse 14 half-hour audio files ten times each. The old parser takes about 36 seconds; the new one takes 0.14 seconds; an improvement of around 270 times faster. Not bad.

Download RenameM4A Source + Executable

[RenameM4A Source + Executable]
Note: Requires .NET Framework 2.0 or later.

References

MPEG-1 Audio Layer 3

Advanced Audio Coding

MPEG-4 Part 14

Atomic Parsley

Atoms, Boxes, Parents, Children & hex (oh my)

Subject: Dissection of m4a format – msg#00046

HexarViz: An XNA Music Visualizer

Media Player Visualization Data

Recently, at a local Microsoft GameDevz meeting that I attended, we had a short 45-minute contest to see which 2-3 person team could come up with the coolest visualization given a relatively simple XNA project that retrieves visualization data from the song currently playing in the Media Center. HexarViz is an evolution of the project that my partner Daniel Taylor and I came up with during this competition.

The way to retrieve music visualization data from XNA is very simple. Call the MediaPlayer.GetVisualizationData method in XNA 3.0 or later, and you will retrieve an object of type VisualizationData. The VisualizationData class provides only two properties, Samples and Frequencies. The former is an array of 256 float values from -1.0 to 1.0 indicating the raw amplitude of the currently playing sound sample. The latter is an array of 256 float values from 0 to 1.0 indicating the logarithmic scaled power value of each frequency band in the currently playing sound sample – essentially the result of a FFT on the sound data. The frequency distribution seemed like a more obvious choice to use in a visualizer, so that’s what we used.

HexarViz

The idea for this visualizer was inspired by the standard horizontal bar UI that is typically used to display the currently playing frequency bands. It looks like this:

And in fact, this is the first part we implemented, by using a simple 10×10 white square as a texture, and drawing one red rectangle per frequency band, from the left side of the screen to the right. But, let’s be honest, that’s boring, because it’s been done a million times.

We wanted something more interesting, so I thought maybe it would be cool to make the frequency band values rotate around a center point, kinda like a circle. Except since each rectangle drawn rotationally this way will have a different length depending on its power value – so the shape of this sound circle will change drastically over time. Here’s basically what that looks like:

This image depicts pretty closely what we ended up with for the competition, in about 45 minutes. We also had a few rudimentary features:

Previous Song / Next Song
Sound Circle Rotation

But like I said, HexarViz is an evolved version of the results of that competition, meaning that after the competition I added a lot of features, namely:

Pause
Volume Up / Volume Down
Displaying the currently playing playlist, album, artist, and song name
Playlist change support
Toggle individual Red/Green/Blue components of the target color
Color Cycling
Change the current texture

Color Cycling was probably the most difficult part to implement. In order to effect a smooth transition of target colors, I calculate the HSB (Hue/Saturation/Brightness) of the current target color, increment the hue, and then convert back to an RGB value. The color of each individual rectangle is also a linear interpolation between black and the target color, based on the frequency band’s percentage of maximum power.

Changing the current texture, from a 10×10 white square to something else, also has some very interesting effects on the way that the sound circle looks. Give that a try.

Source and Windows Installer

[HexarVizSource]

[HexarVizInstaller]

Controls

Change Playlist – P
Previous Song – Left Arrow
Next Song – Right Arrow
Pause – Spacebar
Increase Volume – Up Arrow
Decrease Volume – Down Arrow
Toggle Spin – S
Toggle Red – R
Toggle Green – G
Toggle Blue – B
Cycle Colors – C
Change Texture – T
Quit – Escape

HexShooter: An XNA 2D Tutorial Upgrade

XNA Game Studio 3.1

XNA Game Studio, if you haven’t heard of it, is an integrated development environment for the development of Windows and Xbox 360 games using either C# Express 2008 or Visual Studio 2008. It is essentially a Visual Studio Package which includes a set of XNA assemblies, as well as new project types to create Windows and Xbox 360 games. The latest version, 3.1, was released earlier this year and its newest features include video playback, an updated audio API, avatar support, and use of the XBox Live Party system.

In addition, budding game developers with a Premium Creators Club account can submit their game to the XNA Creators Club for peer review; if passed, these games can be sold on Xbox Live Indie Games for $1, $3, or $5. Developers receive 70% of all sales revenue for copies of their game that are purchased via Xbox Live. Not too bad a deal, though creating an independent game with mass market appeal and a decent purchase rate is no small task.

Creators Club Website

The epicenter of all things XNA is located at the XNA Creators Club Website. This is where you obtain a Premium account, submit your game for review, playtest others’ creations, and access a multitude of resources including forums and tutorials.

The 2D Tutorial

The Beginner’s 2D Tutorial uses video to demonstrate how to create a simple 2D shooter starting from a blank Windows Game project, given a couple of pre-created images. The videos walk you through creation of the game project, adding assets, drawing the background, creating cannons, firing cannonballs, adding enemies, and destroying enemies. It’s actually a very handy tutorial, and you end up with something like this (skip to 4:17)

HexShooter

As you can see in the video above, there are three enemy ships moving from right to left at all times, and you can shoot them to earn points. It’s pretty smooth, but there’s something important missing. Because there’s no inherent challenge, no sense of danger, and no real goal, this project is unfortunately little more than a toy, and the fun factor is completely missing. HexShooter is my attempt to take this project and change it in ways that will make it challenging and fun.

So here’s a list of some of the features I added:

Explosions and Smoke (from Joshua Foss’ excellent Adding 2D Particles and Explosions videos.
The ability to move the cannon from left to right with the left thumstick, aiming now with the right.
Enemies now spawn from either the left or right sides of the screen, and they bounce off the edges and come back with the same height and speed.
Player cannonballs are now very fast, but you can only have one on screen at once.
Enemies now fire green cannons back at you, and if they hit the center circle of your cannon, you lose.
For every ten enemies killed, a new enemy is added.

There are several reasons why I think this upgrade is a lot more fun to play than the original. First, there is the very real possibility of losing, and there is a challenge. Namely, kill as many aliens as you can while avoiding enemy fire for as long as possible. Because of the way that enemies are added, the game starts out easy but becomes increasingly challenging the more enemies you kill. Eventually you will find yourself spending most of your attention on dodging enemy fire, which reminds me of the fun and challenge of Galaga.

The choice to change the cannon to only fire one fast cannonball at once was also intentional. In the tutorial version, you can fire three cannonballs at once, but they are very slow. This makes aiming and firing tedious and somewhat pointless. With only one cannonball, you have to make your shots matter, so your aim counts.

For a while I considered creating a complete mesh-based collision detection system for the cannon, including not just the circle but the barrel too. But I realized that not only would that be a lot of extra work, but it would probably detract from the gameplay. Having to worry about your cannon aim in order to dodge enemy cannonballs is silly, and would make evasion nearly impossible in the late game.

Source and Executable

I am offering both the source code and executable for inspection and playtesting. If you want to compile and run the source you must have XNA Game Studio 3.1 installed.

[HexShooter Source]

If you just want to play the game, you need to have the Microsoft XNA Framework Redistributable 3.1 installed, and either:

Xbox 360 Controller for Windows ($29, USB).
OR
Xbox 360 Wireless Gaming Receiver ($40, USB) + Xbox 360 controller ($38, wireless).

Or, you could just use the keyboard.

[HexShooter Installer]

Using the keyboard isn’t too bad, but if you’d like to use an Xbox controller, I’d recommend the cheaper USB wired controller, unless you already have some Xbox 360 controllers lying around and you absolutely must have wireless controllers.

Controls

Move – Left thumbstick or A/D keys
Aim – Right thumbstick or Left/Right arrow keys
Shoot – Right trigger or Spacebar
New Game – Start or Enter