Programmatic Sentence Generation – Shakespeare!

Introduction

Many moons ago, when AOL Instant Messenger was all the rage, I really wanted to write some kind of chat bot, something like SmarterChild except, perhaps, more amusing and less useful. I do recall attempting to use some Perl-based chatbot APIs to connect one of my accounts to AIM, but had very little success. So I gave up on the AIM API, and focused more on the core functionality of what I wanted a chatbot to do: simply construct a sentence given a “seed” word as the first word in the sentence. Therefore, the “Shakespeare” project was born.

What’s in a Name?

Why is this application called Shakespeare? Well, actually, originally it wasn’t – it was called Jabberwocky. I needed a source text from which to construct sentences, and one of the free online books available on Project Gutenberg was Through the Looking Glass by Lewis Carroll, and Jabberwocky is one of my favorite poems from the book. Not to mention the fact that a programmtically-generated sentence would likely sound rather silly and hilarious. Or so I hoped. Anyway, the first version of my program used Jabberwocky as the source text, but later I decided that I wanted a much larger source text, such as one of the works of Shakespeare. I grabbed the text from The Merchant of Venice, cleaned it up, and started using it, and the sentences it generates now are just as amusing or more.

Program Source

This project is written in C++, and is available on GitHub for public consumption here: Shakespeare. Feel free to check it out. The project opens in Visual Studio 2008, but if you don’t have VS2008 installed then it should be pretty easy to edit and compile using g++ or Eclipse or something =)

Source Explanation

The program works via a simple two-step process:

  1. The source text is parsed and string pairs are put into a multimap
  2. The program reads lines from stdin until “quit” is received, and generates a “sentence” for each word entered.

Let’s examine these steps more closely:

Parsing the source text

    while (!file.eof())
{
std::string s;
file >> s;
s = strlower(s);

// Look for a period
if (s.length() > 1 && s[s.length() - 1] == '.')
{
s = s.substr(0, s.length()-1);
file.putback('.');
}
// Look for an exclamation
else if (s.length() > 1 && s[s.length() - 1] == '!')
{
s = s.substr(0, s.length()-1);
file.putback('!');
}
// Look for a question-mark
else if (s.length() > 1 && s[s.length() - 1] == '?')
{
s = s.substr(0, s.length()-1);
file.putback('?');
}
// Look for a comma
else if (s.length() > 1 && s[s.length() - 1] == ',')
{
s = s.substr(0, s.length()-1);
file.putback(',');
}

// Add the data to the map
if (last != "" && last != "." && last != "!" && last != "?")
{
wordmap.insert(make_pair(last, s));
}

last = s;
}

The code here reads strings from the file, where each string is separated by whitespace. For the most part these strings represent words, but some of the words end with periods, commas, exclamation marks, or question marks. If one of these characters is encountered, we want to treat it like its own “word”. Next, for each word that we find, we insert it into a multimap where the previously encountered word is the key and the current word is the value. By the way, a multimap is just a C++ map that can have multiple non-unique key-value pairs.

Generating the sentence

        while (true)
{
// Check to see if the word is in the multimap
if (wordmap.find(word) != wordmap.end())
{
// Get first mapped value
multimap::iterator lb = wordmap.lower_bound(word);

// Count how many there are
long count = wordmap.count(word);

// Select one of them
long index = rand() % count;

// Advance map iterator
for (long i = 0; i < index; i++)
{
lb++;
}

// Check for characters that would end a sentence
if (lb->second == "." || lb->second == "!" || lb->second == "?")
{
// If sentence is too short, select another word
if (wordcount 1)
{
// Update the word count in case the only endings to this
// word are ! or . or ?
wordcount++;
continue;
}

// Print word
cout <second << endl;
break;
}
else if (lb->second == ",")
{
// Print commas
cout <second;
}
else
{
// Print the word with a space
cout << " " <second;
wordcount++;
}

// Update the current word
word = lb->second;
}
else
{
// Not found; end the sentence
cout << ".\n";
break;
}
}

The code here looks up the first mapped value for the current word in the multimap (if it exists). If it does, then we pick one of the values for this key at random. Note that multiple identical values can exist because of how the source text was parsed, so you have a higher chance of getting a more common “following” word than one that is not so common. Each following word is printed, unless the word is a “sentence-ending” word such as the period, exclamation mark, or question mark and the sentence is deemed to be too small. Not shown here is the initialization of the first word to be the word that the user typed into the application.

Some Example Outputs

I find many of these to be hilarious. The sentences have a tendency to run on and obviously don’t quite follow the rules of sentences. But because each source word precedes a destination word, the sentences generated often almost make sense:

Will be render’d, will you, monsieur le bon?

Why should i know your master will tarry?

Try confusions with his surety and you the occasion to his affections, dispatch all of rich value of fair hand i am in silver ewes, as he hath the fiend bids me this shadow ere thou to him.

Thou like a night troilus methinks it.

Wit in the duke only are not love me to speak for the truth is no so again his outward wall, for a man knows but who bids me the wind!

Pity on his wealth being seasoned with us in alabaster?

Jump with imagined speed, indeed, for this letter call me dog and that is sum that the knave and it to offend, but assumes some taste of blood to my clerk, for i do you a key, there.

Improvements?

Feel free to leave a comment if you have any ideas for ways to make the generated sentences either more realistic, or more amusing =)

HexarViz: An XNA Music Visualizer

Media Player Visualization Data

Recently, at a local Microsoft GameDevz meeting that I attended, we had a short 45-minute contest to see which 2-3 person team could come up with the coolest visualization given a relatively simple XNA project that retrieves visualization data from the song currently playing in the Media Center. HexarViz is an evolution of the project that my partner Daniel Taylor and I came up with during this competition.

The way to retrieve music visualization data from XNA is very simple. Call the MediaPlayer.GetVisualizationData method in XNA 3.0 or later, and you will retrieve an object of type VisualizationData. The VisualizationData class provides only two properties, Samples and Frequencies. The former is an array of 256 float values from -1.0 to 1.0 indicating the raw amplitude of the currently playing sound sample. The latter is an array of 256 float values from 0 to 1.0 indicating the logarithmic scaled power value of each frequency band in the currently playing sound sample – essentially the result of a FFT on the sound data. The frequency distribution seemed like a more obvious choice to use in a visualizer, so that’s what we used.

HexarViz

The idea for this visualizer was inspired by the standard horizontal bar UI that is typically used to display the currently playing frequency bands. It looks like this:

And in fact, this is the first part we implemented, by using a simple 10×10 white square as a texture, and drawing one red rectangle per frequency band, from the left side of the screen to the right. But, let’s be honest, that’s boring, because it’s been done a million times.

We wanted something more interesting, so I thought maybe it would be cool to make the frequency band values rotate around a center point, kinda like a circle. Except since each rectangle drawn rotationally this way will have a different length depending on its power value – so the shape of this sound circle will change drastically over time. Here’s basically what that looks like:

This image depicts pretty closely what we ended up with for the competition, in about 45 minutes. We also had a few rudimentary features:

  • Previous Song / Next Song
  • Sound Circle Rotation

But like I said, HexarViz is an evolved version of the results of that competition, meaning that after the competition I added a lot of features, namely:

  • Pause
  • Volume Up / Volume Down
  • Displaying the currently playing playlist, album, artist, and song name
  • Playlist change support
  • Toggle individual Red/Green/Blue components of the target color
  • Color Cycling
  • Change the current texture

Color Cycling was probably the most difficult part to implement. In order to effect a smooth transition of target colors, I calculate the HSB (Hue/Saturation/Brightness) of the current target color, increment the hue, and then convert back to an RGB value. The color of each individual rectangle is also a linear interpolation between black and the target color, based on the frequency band’s percentage of maximum power.

Changing the current texture, from a 10×10 white square to something else, also has some very interesting effects on the way that the sound circle looks. Give that a try.

Source and Windows Installer

[HexarVizSource]

[HexarVizInstaller]

Controls

Change Playlist – P
Previous Song – Left Arrow
Next Song – Right Arrow
Pause – Spacebar
Increase Volume – Up Arrow
Decrease Volume – Down Arrow
Toggle Spin – S
Toggle Red – R
Toggle Green – G
Toggle Blue – B
Cycle Colors – C
Change Texture – T
Quit – Escape

HexShooter: An XNA 2D Tutorial Upgrade

XNA Game Studio 3.1

XNA Game Studio, if you haven’t heard of it, is an integrated development environment for the development of Windows and Xbox 360 games using either C# Express 2008 or Visual Studio 2008. It is essentially a Visual Studio Package which includes a set of XNA assemblies, as well as new project types to create Windows and Xbox 360 games. The latest version, 3.1, was released earlier this year and its newest features include video playback, an updated audio API, avatar support, and use of the XBox Live Party system.

In addition, budding game developers with a Premium Creators Club account can submit their game to the XNA Creators Club for peer review; if passed, these games can be sold on Xbox Live Indie Games for $1, $3, or $5. Developers receive 70% of all sales revenue for copies of their game that are purchased via Xbox Live. Not too bad a deal, though creating an independent game with mass market appeal and a decent purchase rate is no small task.

Creators Club Website

The epicenter of all things XNA is located at the XNA Creators Club Website. This is where you obtain a Premium account, submit your game for review, playtest others’ creations, and access a multitude of resources including forums and tutorials.

The 2D Tutorial

The Beginner’s 2D Tutorial uses video to demonstrate how to create a simple 2D shooter starting from a blank Windows Game project, given a couple of pre-created images. The videos walk you through creation of the game project, adding assets, drawing the background, creating cannons, firing cannonballs, adding enemies, and destroying enemies. It’s actually a very handy tutorial, and you end up with something like this (skip to 4:17)

HexShooter

As you can see in the video above, there are three enemy ships moving from right to left at all times, and you can shoot them to earn points. It’s pretty smooth, but there’s something important missing. Because there’s no inherent challenge, no sense of danger, and no real goal, this project is unfortunately little more than a toy, and the fun factor is completely missing. HexShooter is my attempt to take this project and change it in ways that will make it challenging and fun.

So here’s a list of some of the features I added:

  • Explosions and Smoke (from Joshua Foss’ excellent Adding 2D Particles and Explosions videos.
  • The ability to move the cannon from left to right with the left thumstick, aiming now with the right.
  • Enemies now spawn from either the left or right sides of the screen, and they bounce off the edges and come back with the same height and speed.
  • Player cannonballs are now very fast, but you can only have one on screen at once.
  • Enemies now fire green cannons back at you, and if they hit the center circle of your cannon, you lose.
  • For every ten enemies killed, a new enemy is added.

There are several reasons why I think this upgrade is a lot more fun to play than the original. First, there is the very real possibility of losing, and there is a challenge. Namely, kill as many aliens as you can while avoiding enemy fire for as long as possible. Because of the way that enemies are added, the game starts out easy but becomes increasingly challenging the more enemies you kill. Eventually you will find yourself spending most of your attention on dodging enemy fire, which reminds me of the fun and challenge of Galaga.

The choice to change the cannon to only fire one fast cannonball at once was also intentional. In the tutorial version, you can fire three cannonballs at once, but they are very slow. This makes aiming and firing tedious and somewhat pointless. With only one cannonball, you have to make your shots matter, so your aim counts.

For a while I considered creating a complete mesh-based collision detection system for the cannon, including not just the circle but the barrel too. But I realized that not only would that be a lot of extra work, but it would probably detract from the gameplay. Having to worry about your cannon aim in order to dodge enemy cannonballs is silly, and would make evasion nearly impossible in the late game.

Source and Executable

I am offering both the source code and executable for inspection and playtesting. If you want to compile and run the source you must have XNA Game Studio 3.1 installed.

[HexShooter Source]

If you just want to play the game, you need to have the Microsoft XNA Framework Redistributable 3.1 installed, and either:

Or, you could just use the keyboard.

[HexShooter Installer]

Using the keyboard isn’t too bad, but if you’d like to use an Xbox controller, I’d recommend the cheaper USB wired controller, unless you already have some Xbox 360 controllers lying around and you absolutely must have wireless controllers.

Controls

Move – Left thumbstick or A/D keys
Aim – Right thumbstick or Left/Right arrow keys
Shoot – Right trigger or Spacebar
New Game – Start or Enter

Converting Managed C++ to C++/CLI

What is Managed C++?

Managed C++, also known as Managed Extensions for C++, was a set of syntactic extensions and keywords released with Visual Studio .NET in 2002 which allows you to target the .NET Framework in C++. In other words, with the release of Managed C++, it became possible to write .NET programs in C++. This is particularly interesting because unlike pure managed languages like C# and Visual Basic .NET, Managed C++ is also directly compatible with native C++ code, without having to use PInvoke or COM. In fact you can write code that mixes both unmanaged and managed C++ in the same assembly, and even within the same function! This kind of “mixed mode” interop is what makes Managed C++ so interesting.

So how does it work? Basically in addition to the standard C++ built-in types, references, and pointers, you can now declare what is known as a garbage-collected pointer to point to a managed .NET object that has been allocated with a managed new operation. As long as you keep your regular unmanaged classes and pointers separate from your managed, garbage-collected ones, you’re good to go. Here’s an example:

__gc class G { public: int i; }
G __gc* pG = new G;
pG->i = 5;
int i = pG->i;

The __gc keyword is by far the most ubiquitous in Managed C++, but there are a number of other ones:

  • __abstract: for declaring a class as abstract
  • __box: for “boxing” value types
  • __delegate: for declaring a .NET delegate
  • __event: for declaring a .NET event
  • __identifier: for using a C++ keyword as an identifier
  • __interface: for declaring a .NET interface
  • __nogc: for explicitly declaring an unmanaged type
  • __pin: for declaring a “pinning pointer”
  • __property: for declaring a .NET property
  • __sealed: protects a class or method from derivation
  • __try_cast: for a dynamic cast that throws an exception on failure
  • __typeof: for determining the .NET type of an object
  • __value: for declaring a value class

Used correctly, these keywords are all you need to write code in Managed C++.

What is C++/CLI?

Sometime after introducing Managed C++ to the world via Visual Studio .NET, Microsoft started working on a newer revision of the language specification in an attempt to supersede Managed C++. Why did they decide to revise the language? For one thing, although Microsoft chose the common extension prefix “__” for its managed keywords, developers generally thought that the syntax was ugly. Really ugly. Additionally, the code was often difficult to read and understand in ambiguous situations. So for Visual Studio 2005, Microsoft created a new language specification called C++/CLI which completely revised the language both for the aesthetics of prettier code and the clarity of unambiguous syntax.

How did the syntax change?

Handles: Where before you had to refer to managed classes using gc pointers (__gc*), now when you use a managed type you use the syntax ClassName^, to clearly differentiate managed and unmanaged objects.

Contextal and Spaced Keywords: Instead of introducing a slew of new .NET keywords, and marking them as off limits for identifiers, Microsoft chose only three new keywords (gcnew, generic, and nullptr), and the rest are either spaced keywords (for each, enum class, interface class, ref class, value class) or contextual keywords which only appear where an identifier would never appear. This was a great decision because it means that the syntax is prettier but you break very few existing programs, unless they call their variables nullptr and gcnew. Contextual keywords include abstract, finally, in, override, sealed, and where.

Tracking Reference to No Object: The keyword nullptr was introduced to represent the managed equivalent of null. Because of this change, the integral value zero no can no longer be used as a tracking reference; nullptr must be used instead.

Garbage-Collected New: To differentiate the standard C++ new keyword from a managed allocation which will automatically be garbage collected, the keyword gcnew was introduced. Anytime you allocate a new .NET object you must use this gcnew keyword.

Generics: Just as C# 2.0 added support for generics, so too did C++/CLI add support for .NET generics in C++ using the generic keyword.

Dispose and Finalize: The changes to destructor semantics in C++/CLI are complicated; destructors now go to Dispose instead of Finalize, but you can still declare an explicit finalizer if necessary. See the MSDN translation guide for more details.

There are other changes that are too numerous to list here; consult my References for more information.

Quick and Dirty Translation Guide

Managed Class Declaration
=========================
Old: public __gc class MyClass
New: public ref class MyClass

Property Declaration
====================
Old: __property String *get_Name();
New: property String^ Name { String^ get(); }

Old: __property System::String __gc* get_Description();
Old: __property void set_Description(System::String __gc* value);
New: property String^ Description { String^ get(); void set(String^); }

Dispose
=======
Old: void Dispose();
New: ~MyClass();

Note: The Dispose/Finalize pattern has some tricky changes. See Translation Guide: Moving Your Programs from Managed Extensions for C++ to C++/CLI.

Handles
=======
Old: String* ToString();
New: String^ ToString();

Pin Ptr
=======
Old: std::list* __pin *pinner = &m_stringList;
New: pin_ptr<std::list > pinner = m_stringList;

Property Definition
===================
Old: String *MyClass::get_Name() { }
New: String^ MyClass::Name::get() { }

Old: System::Double MyClass::get_Values() __gc[] { }
New: Array^ MyClass::Values::get() { }

Old: double MyReader::EndRead(IAsyncResult* asyncResult) __gc [,] { }
New: Array MyReader::EndRead(IAsyncResult^ asyncResult) { }

Array Declaration
=================
Old: Object* args[] = { GetType()->Name, this->VirtualName };
New: array^ args = { GetType()->Name, this->VirtualName };

Managed Memory Allocation
=========================
Old: throw new ObjectDisposedException(S"MyClass");
New: throw gcnew ObjectDisposedException("MyClass");

Boxing
======
Old: Object* myObjArray __gc[] = { __box(1), __box(2), __box(3) };
New: array^ myObjArray = {1, 2, 3}; // Now implicit

Unboxing
========
Old: System::Object* o = __box(123);
Old: int a = *(dynamic_cast(o));
New: Object^ o = z; // implicit boxing
New: int y = *reinterpret_cast(o); // unboxing

Dereferencing a GC Pointer / Handle
===================================
Old: this->Name
New: this->Name // no change

Public Inheritance
==================
Old: public __gc class SubClass : public MyClass {
New: public ref class SubClass : MyClass { // Public inheritance by default

Implementing ICollection (New Syntax Only)
==========================================
ref class MyCollection : ICollection
{
public:
virtual IEnumerator^ GetEnumerator() = IEnumerable::GetEnumerator { return nullptr; }
virtual property int Count { int get() = ICollection::Count::get { return 0; } }
virtual property Object^ SyncRoot { Object^ get() = ICollection::SyncRoot::get { return nullptr; } }
virtual property bool IsSynchronized { bool get() = ICollection::IsSynchronized::get { return false; } }
virtual void CopyTo(Array^, int) = ICollection::CopyTo { return; }
};

For Each (New Syntax Only)
=========================
array^ classes = { gcnew MyClass() };
for each (MyClass^ c in classes)
{
MessageBox::Show(c->Name);
}

String Literals
===============
Old: S"Managed String"
New: "Managed String"

Try-Cast -> Safe-Cast
TypeOf -> TypeID
=====================
Old: m_bigFloat = *__try_cast(info->GetValue(S"_bigFloat",__typeof(f64)));
New: m_bigFloat = safe_cast(info->GetValue("_bigFloat", f64::typeid));

CLI Types
=========
Old: __int64
New: Int64

Delegates / Events
==================
Old: public __delegate void MyEventHandler(Object *sender, MyEventArgs *e);
Old: __gc class Task {
__event void add(MyEventHandler *eh);
__event void remove(MyEventHandler *eh);
}

New: public delegate void MyEventHandler(Object^ sender, MyEventArgs^ e);
New: ref class Task {
event MyEventHandler^ MyEvent { void add(MyEventHandler^ eh); void remove(MyEventHandler^ eh); }
}

Private Public
==============
Old: private public:
New: internal:

Operator Overloading
====================
Old: static bool op_Equality(DataFrequency d1, DataFrequency d2);
New: static bool operator ==(DataFrequency d1, DataFrequency d2);

Conversion Operators
====================
Old: static double op_Implicit(int i);
Old: static float op_Explicit(int i);
New: static implicit operator double(int i);
New: static explicit operator float(int i);

CLI Enums
=========
Old: __value enum Result { Pass, Fail };
Old: Result r = Pass;
New: enum class Result { Pass, Fail };
New: Result r = Result::Pass;

NullPtr
=======
Old: Object* obj = 0;
New: Object^ obj = nullptr;

Visual Studio Search & Replace Strings

Use these in order, using Visual Studio 2005 with Regular Expressions, per .h file. Voila.

String RegExp
=============
Find: String __gc\*
Replace: String^

Property RegExp
===============
Find: __property
Replace: property

Get Property RegExp
===================
Find: property {[^ ]*} get_{[^\(]*}\(\);
Replace: property \1 \2 { \1 get(); }

Set Property RegExp
===================
Find: property void set_{[^\(]*}\({[^ ]*} value\);
Replace: property \1 { void set(\2 value); }

Get Array
=========
Find: property {[^ ]*} get_{[^\(]*}\(\) __gc\[\];
Replace: property array^ \2 { array^ get(); }

Set Array
=========
Find: property void set_{[^\(]*}\({[^ ]*} value __gc\[\]\);
Replace: property \1 { void set(array^ value); }

Combine RegExp
==============
Find: property {[^ ]*} {[^ ]*} \{ [^ ]* get\(\); \}\n[^p]*property \2 \{ void set\(\1 value\); \}
Replace: property \1 \2 { \1 get(); void set(\1 value); }

Use the following in order per .cpp file:

Get
===
Find: get_{[^(]*}\(\)
Replace: \1::get()

Set
===
Find: set_{[^(]*}\({[^)]*}\)
Replace: \1::set(\2)

References

MSDN Translation Guide

Herb Sutter’s Blog

Wikipedia

Various Optimization Strategies

I’ve been a bit busy lately, and my blogging has suffered because of it. That’s really too bad, because I have a big list of topics / code that I’d like to post about, but most of them require at least a few hours of work to clean up and flesh out. It’s a bit overwhelming.

In the meantime, I wanted to do a quick post on a couple of interesting optimization strategies, especially ones I’ve come across here at Microsoft. These strategies are well-known but some are used quite often in Visual Studio to help speed up performance, so I thought I’d write about them, just for fun. Maybe you’ll learn something, or maybe you’ll already know all of these.

Copy-On-Write

Copy-on-write is an optimization strategy which delays the copying of two or more initially identical resources until a user of the resource attempts to modify it. When a resource user does attempt to modify the resource, a true private copy is performed behind the scenes and returned to the requestor. This is an optimization especially in cases where resources rarely need to be modified, but you cannot guarantee they won’t be at some point.

One example of this strategy in use is in virtual memory operating systems. Multiple copies of the same process generally have identical binary code in system memory, so the memory used by the first process is marked copy-on-write when additional instances of the process are loaded. If any of these processes do attempt to modify the memory, then the kernel detects the write and makes a transparent copy of the binary code so that the other processes are unaffected.

Copy-on-write can also be used in application code. One possible implementation could involve creating a class which wraps a property defined in a read-only data source. As long as callers ask to read the resource, you can simply return the value stored in the data source. But if a caller wants to modify the value, then the class makes a private local copy and sets a flag indicating that from then on it should return the local copy instead of the data source value.

Here’s some sample code (C++):

class StringWrapper
{
private:
char* s;
bool local;
public:
StringWrapper(char* source) : local(false) { s = source; }
char* Get() { return s; }
void Append(char* app)
{
if (!local)
{
char* temp = new char[strlen(s) + 1];
strcpy(temp, s);
s = temp;
local = true;
}

strcat(s, app);
}
};

Lazy Initialization

The idea behind lazy initialization is that you should try to limit the amount of pre-processing you have to do to get your program to run, until the last possible moment. So you initialize resources the first time they are requested, and no sooner.

For example, let’s say that you have a word processing program that supports spell-checking for hundreds of languages. Each separate language can be loaded from a dictionary file into a language-specific spell-checker object, from which you can call your spell-check method. You could just load every single dictionary file from disk when your word processor loads, but that would probably take a long time, and most of the languages are unlikely to be used in a single user session. In fact, the user might not even use the spell-checker at all. A better strategy would be to wait until the user requests a spell-check, determine whether or not the current language’s dictionary has already been loaded, and load it if it hasn’t yet. Then, you can proceed with your spell-check knowing that the user only incurred the cost of loading a single language at the time that they needed it.

Of course, you have to strike a balance here. If you lazily initialize everything in your program, then there’s a good chance that everything the user does will be a bit sluggish. Would the user be willing to wait a little bit longer for your program to load if it means that the most common operations are quicker? This seems especially relevant for operating systems. Slow boot times are annoying, but if you only reboot once a week, then a few extra seconds of pre-processing might be a good tradeoff for better OS performance.

Here’s some example code (C#):

public class MathConstants
{
private bool piInit = false;
private double pi;

public double PI
{
get
{
if (!piInit)
{
pi = CalcPi();
piInit = true;
}

return pi;
}
}
}

Double-Checked Locking

This pattern is often used in conjunction with lazy initialization, but in a multi-threaded environment. For example, in C# you might have a singleton class like so:

public class Singleton
{
private static Singleton instance;
private static Object lockObject = new Object();

public static Singleton Instance
{
lock (lockObject)
{
if (instance == null)
{
instance = new Singleton();
}
return instance;
}
}
}

This code will prevent multiple threads who are accessing Singleton.Instance at the same time from accidentally creating multiple copies of the singleton. But there is an overhead cost to acquiring the lock on every access, even though the singleton instance is only initialized once! So a better implementation might perform an “unsafe check” first, before acquiring the lock, like so:

public class Singleton
{
private static volatile Singleton instance;
private static Object lockObject = new Object();

public static Singleton Instance
{
if (instance == null)
{
lock (lockObject)
{
if (instance == null)
instance = new Singleton();
}
}

return instance;
}
}

Notice that the test if (instance == null) must be performed twice because two threads could still get past the first if at the same time (since there is no lock preventing them from doing so), and then sequentially create two different singletons, one after the other. Also, the volatile keyword must be used on the instance, indicating that the field might be modified in the program by a concurrently executing thread. This prevents a thread from “caching” the value of instance, and ensures that each thread always retrieves the current value of the object at the point it is requested.

Caching / Lookup Table

Let’s say you want to write a screensaver which will draw tons of circles of random sizes, colors, and positions all over the screen. And let’s say that your algorithm for drawing the arcs involves a loop that looks something like this:

void DrawCircle(int r, int x, int y, Color color)
{
for (double d = 0.0; d <= 2 * PI; d += 0.001)
{
int x1 = x + cos(d) * r;
int y1 = y + sin(d) * r;
SetPixel(x1, y1, color);
}
}

Okay, so for each circle, you need to call cos and sin about 6283 times. Let’s say you’re extremely interested in runtime performance, and want to make this function faster. One thing you can do is to pre-compute the sin/cos values at the beginning of your program and store them in memory, so that instead of calling a potentially expensive math function for each circle, you just access memory instead, which is always fast.

Depending on the operation, you might even try pre-computing these values before the program even starts up, writing the values to a file or putting them directly in source, depending on how many values you need.

void DrawCircle(int r, int x, int y, Color color)
{
int i = 0;
if (!initTrig)
{
for (double d = 0.0; d <= 2 * PI; d += 0.001)
{
costable[i] = cos(d);
sintable[i++] = sin(d);
}
initTrig = true;
}

for (i = 0; i <= 6283; i++)
{
int x1 = x + costable[i] * r;
int y1 = y + sintable[i] * r;
SetPixel(x1, y1, color);
}
}

References

Copy-on-Write

Lazy Initialization

Implementing Singleton in C#

volatile (C#)

Cryptogram Solver

Introduction

If you’re anything like me, everytime you see a cryptogram in the entertainment section of the newspaper, you stare at it for about 60 seconds and then you give up. I’ve always wanted to be able to solve cryptograms but I never really learned how. Furthermore, even after I read a few things about the strategies you can use to solve them, I was intrigued by (in my mind) a much more interesting problem – could you write a computer program to solve cryptograms for you?

First, some definition of the problem is in order. The object of the program is to change a ciphertext consisting of cipher words into a cleartext made up of words which exist in a dictionary file, by mapping each cipher letter into exactly one cleartext letter. There is a 1-to-1 mapping in that no two cipher letters share the same cleartext and vice versa. I will also allow for apostrophes as letters which always map to apostrophes, and ignore all other characters (periods, commas). Words are separated by any non-alphabetic or apostrophe letter.

Example

Gb py bgioy lkq ckd’y oqwwttc, yil, yil prpgd.

Random Assignment

My first attempt at solving this problem was very rudimentary. Basically, assign a random mapping of all cipher letters to cleartext letters and produce the result. Check to see that every word is in fact in the dictionary. This solution will inevitably work, but it will also take forever. How many mappings are there for 26 letters? Well, for “A” you have 26 choices, for “B” you have 25 choices left, etc. In other words there are 26! mappings to choose from and only one will give you the right answer. In fact this is worse than just looping through all possibilities because you will start duplicating random assignments that you’ve already tried. Don’t do this.

Distribution Ranking

This is a similar approach to what a human would do. Basically, we know that for most texts, e is the most common letter, followed by t, a, o, i, n, s, h, and r. Insert RSTLNE joke here. What we can try to do is to analyze our ciphertext and rank the letters by occurrence frequency and then assign letters based on our letter ranking list. As it turns out, this approach didn’t work very well. If your text has exactly the same letter distribution as the English language in general, then your first assignment should be the answer, and you’re done. This is not the case, and it’s even possible that the most common letter isn’t even E. And that’s a big problem, because when it comes to recursive algorithms, getting the first part wrong means you spend lots of time searching for solutions where none will be found. There may be a way to do this using some kind of breadth first search which will lead to a relatively quick solution, but I have to say I think this approach is misguided.

Pattern Search

The solution I eventually came up with has much less to do with the letters themselves and more to do with the uniqueness of each word. When you’re looking at a cipher word, you don’t know what the letters actually represent, but you do know one very important thing. You know that everytime the same letter is used in the cipher, it is the same letter in the real word too. In other words, the letter pattern of a word is a property shared by the cleartext of that word.

Here’s an example:

oqwwttc (could be rewritten as)
0122334 (numerically speaking)
abccdde (as a normalized cipher word)
succeed (an actual word with this pattern)

As it turns out, in my dictionary, there are only five seven-letter words with that letter pattern, and succeed is one of them. Fantastic! We now know that the solution, if it will be found, must assign letters according to one of those five words. This leads us to the following pseudocode:

  1. Load the dictionary into a hash table where the hash is the normalized letter pattern of each word, and the real words are added to the table according to their hash
  2. For each ciphertext word, normalize it and get a list of all English words that match this pattern
  3. Sort the list of ciphertext words in increasing count of English words
  4. Recursively assign a word to the current cipher. Double check that this does not violate any previously assigned letters for previous words; if so, back out. Pick the next word and recurse.

This algorithm is actually really fast. For the cipher I gave above, there are only 5 choices for the first* word and only 7 for the second*, and this results in an assignment of 8 letters immediately which have a high (1/35) chance of being correct. The search space increases to 49 choices for the next word, and more for each word following, but by the time you reach the words with a high number of pattern matches you’ve already decided most of your letter assignments already, so your choices are easy to prune.

Here’s the source: View Source

Summary

A true programmatic crypogram solver doesn’t necessarily need to use the same skills as a human would use to solve the same problem; we can use the computer’s advantage of being able to complete repetitive tasks by pre-processing every word in the dictionary based on its letter pattern, and draw on that information to choose whole word assignments instead of individual letters. Choosing from the smallest number of options severely reduces the recursive search space, and from there it’s just a matter of testing the remaining word patterns until we reach the solution(s).

If you see any glaring errors in my code, or possible optimizations, please feel free to add a comment saying how I could improve. Thanks!

Oh and by the way, if at first you don’t succeed, try, try again.

* Not actually the first word, but the word with the lowest matching pattern count in the dictionary.

Lambda Functions in C++ for RAII

I learned a new trick recently at Microsoft involving the use of C++ lambda functions for automatic resource cleanup, which I thought was pretty cool. It involves combining the RAII programming pattern with a new feature of C++ 0x, namely lambda functions.

What is RAII?

RAII stands for Resource Acquisition Is Initialization, a design pattern which is popular in languages such as C++, D, and Ada. The idea is that you want to acquire resources during the initialization of objects, i.e. as soon as possible, so that you cannot accidentally use an uninitialized object, and also that you want your object to automatically release the resource upon destruction. One of the main advantages of this pattern is that your resources will always be released, even if there are errors or exceptions between when your object is initialized and your object goes out of scope. How does one do this? Here’s a simple example:

#include 
using namespace std;
class FileCloser
{
public:
FileCloser(char* fname)
{
fp = fopen(fname, "r");
}
void ReadLine(char* line, int count)
{
fgets(line, count, fp);
}
~FileCloser()
{
fclose(fp);
}
private:
FileCloser() { }
FILE* fp;
};

int main(int argc, char* argv[])
{
FileCloser fc("test.txt");
char line[100];
fc.ReadLine(line, 100);
cout << line << endl;
return 0;
}

As you can see, we didn’t have to explicitly call fclose on the file pointer because our FileCloser object’s destructor did it for us as soon as the FileCloser object went out of scope in main. Thus, even if we change main to the following, we will still close fp after the exception is thrown:

int main(int argc, char* argv[])
{
FileCloser fc("test.txt");
char line[100];
fc.ReadLine(line, 100);
cout << line << endl;
throw;
return 0;
}

What are lambda functions?

Lambda functions are a new feature of the proposed new standard for the C++ programming language called C++0x, although it’s likely to be introduced sometime in 2009. They are a functional programming technique based on lambda calculus, which you may or may not remember from Theory of Computation. Or was it Programming Languages? Either way, I like to think of it as defining an anonymous inline method where you need it, instead of having to define your own function pointer or delegate method elsewhere. Here’s an example of a simple lambda function:

#include 
#include
#include
using namespace std;
int main(int argc, char* argv[])
{
vector<int> v = { 1, 2, 3, 4, 5 }; // Another new feature of C++ 0x
for_each(v.begin(), v.end(), [](int& x) { x = x * x; });
for_each(v.begin(), v.end(), [](int x) { cout << x << endl; });
return 0;
}

How to use lambda functions to perform RAII

Here’s where we combine these two techniques to perform RAII using lambda functions. In the first code snippet I showed you where we used RAII to automatically release the resource, we still had to define our own class, complete with a destructor, in order to get the kind of resource release we want in the face of errors and exceptions. And although you could design a generic class which could handle automatically deleting pointers which were allocated via new, you would have to create a separate class for each resource type if you have to do something a little bit more complicated to release a resource (for example, by calling a method). But thanks to lambda functions, this is no problem:

#include 
#include
using namespace std;
using namespace std::tr1;

int main(int argc, char* argv[])
{
FILE* fp = fopen("test.txt", "r");
shared_ptr fileReleaser(fp, [](FILE* fp) { fclose(fp); });
char line[100];
fgets(line, 100, fp);
cout << line << endl;
throw;
return 0;
}

The shared_ptr in this example is used to create a sharable pointer to the object we want to release. The arguments to its constructor are first the pointer, and then a pointer to a deleter method, in this case the lambda function. When the shared_ptr goes out of scope outside of main it will automatically call the lambda function to release the file.

Summary

In summary, RAII is a great defensive programming technique you can use to make sure your resources are released and your code doesn’t leak, and lambda functions are a great new way to make using RAII much less painful than it ever was before.

References

Wikipedia: RAII
HackCraft: RAII
Wikipedia: C++0x
Herb Sutter’s Blog
MSDN: shared_ptr

Solving Tetris via a Scoring Algorithm

When I was in college, I wrote a simple Windows Tetris game I called WinTetris, because the name wasn’t taken and I wasn’t feeling very creative that day. This project was at first just an exercise in Win32 programming and game logic, and the first version featured very ugly GDI graphics with DrawRect.

Over time, I polished it quite a bit, and added a lot of new features. I replaced the ugly graphics with DirectDraw, added support for mouse/keyboard controls, added sound and music, A/B game types, and LAN multiplayer. But quite possibly my favorite new feature was the “demo mode”, which starts when you launch the program.

Demo mode was a simple way for me to use the tetris engine I had already developed to do something cool; write an algorithm that could play tetris extremely well. The bonus was that I could show off both my AI programming and my game at the same time, thanks to my demo mode.

So how does it work? At the top-most level, it does the following:

  1. Considers every possible piece placement for the current piece, including both lateral movement and rotation.
  2. Assigns each piece placement a score, according to some kind of scoring mechanism.
  3. Chooses the placement with the highest score.
  4. Moves and rotates the piece into place.

This arrangement is pretty logical, but everything hinges on the scoring algorithm. How would you write such a function? Personally, I thought a bit about things that are good, and things that are bad when it comes to playing tetris, in general:

  • Completing a line is a good thing. Completing more than one line at once, all the way up to a tetris, is even better.
  • Reducing the maximum stack height is a good thing. Conversely, making the maximum stack height higher is bad.
  • Placing a piece lower in the playing field is generally better than placing it higher.
  • Creating empty holes beneath a piece is bad.

So now that you know what to look for, you write methods that will tell you the effects of your piece placement, assign each measure a weight, and total up the plusses and minuses. But what weight values to pick?

Personally, what I did was to pick weights that seemed reasonable, watch how the game played, and then made adjustments if I thought it was playing in a way that was either too reckless or too safe. And although I didn’t mention it before, my goal was an algorithm that would take a long time to lose, not one that loves to set up tetrises or rack up a high score.

In the end, I made the following weight table:

Hole Count: -500
Lines Completed: +500
Block Height: -150
Chasm Height: -400

If I was to tweak these weights again, I would want to take a more scientific approach. I could use even more AI programming to run thousands of tetris simulations, and see which weight values yielded the lowest average stack height, and use those values. Perhaps that could be a future project of mine.

The last trick I added was a modification of the process, not the scoring algorithm. In the process I gave above, the code only evaluates the current piece. But players get to see the next piece! So instead of choosing the piece placement with the highest score, I modified my algorithm to calculate the best combined score (A + B) for the current piece as well as the next piece, and use that information to place only the current piece. So the current piece is placed with the highest potential score. The weird thing about this is that the algorithm is not committed to that second placement. It can choose to do something completely different if the next piece changes the optimal two-piece score.

So now that you know how to write a Tetris AI, break out your favorite compiler and improve on my methods. I implemented WinTetris such that you can create your own AI DLLs if you want to play around with it. The installer and source code for WinTetris is available at http://www.hexar.net/wintetris.php. Enjoy!