<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Open Shakespeare &#187; adalovelace</title>
	<atom:link href="http://www.openshakespeare.org/author/adalovelace/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.openshakespeare.org</link>
	<description></description>
	<lastBuildDate>Fri, 27 Jan 2012 14:09:00 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>XML and the Natural Language Toolkit</title>
		<link>http://www.openshakespeare.org/2010/02/26/xml-and-the-natural-language-toolkit/</link>
		<comments>http://www.openshakespeare.org/2010/02/26/xml-and-the-natural-language-toolkit/#comments</comments>
		<pubDate>Fri, 26 Feb 2010 11:54:06 +0000</pubDate>
		<dc:creator>adalovelace</dc:creator>
				<category><![CDATA[Technical]]></category>
		<category><![CDATA[Texts]]></category>

		<guid isPermaLink="false">http://blog.openshakespeare.org/?p=76</guid>
		<description><![CDATA[I&#8217;ve been playing with NLTK (the Natural Language Toolkit) and the really useful Jon Bosak XML-annotated corpus these days, and these are some of the graphs I&#8217;ve been able to produce after analyzing the speech of the main characters &#8230; <a href="http://www.openshakespeare.org/2010/02/26/xml-and-the-natural-language-toolkit/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been playing with NLTK (the Natural Language Toolkit) and the really useful Jon Bosak XML-annotated corpus these days, and these are some of the graphs I&#8217;ve been able to produce after analyzing the speech of the main characters of the play (characters that say more than 100 lines):</p>

<div id="attachment_77" class="wp-caption aligncenter" style="width: 610px"><img class="size-full wp-image-77" title="exclamations and interrogations" src="http://blog.openshakespeare.org/wp-content/uploads/2010/02/macbexagerat.png" alt="exclamations and interrogations" width="600" height="300" /><p class="wp-caption-text">exclamations and interrogations</p></div>

<p>Here we can see that Macduff is screaming a lot, and that when everybody else talks, it is never to question but to assert&#8230; Poor Macbeth and Lady Macduff question everything, while Lady Macbeth questions just as much as she asserts.</p>
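
<p>Counts like these can be approximated with a few lines of Python. What follows is a rough sketch, not the script behind the graph: it uses the standard library&#8217;s ElementTree instead of NLTK, it assumes the SPEECH / SPEAKER / LINE element layout of the Bosak files, and the tiny play built in <code>build_sample_play</code> is a hypothetical stand-in with hand-written lines, not exact quotations.</p>

```python
import xml.etree.ElementTree as ET
from collections import Counter


def build_sample_play():
    """Build a tiny stand-in tree using the Bosak element names (assumed layout)."""
    play = ET.Element("PLAY")
    # Hand-written sample lines for illustration only, not exact quotations.
    sample = [
        ("MACDUFF", ["O horror, horror, horror!", "Awake, awake!"]),
        ("MACBETH", ["Is this a dagger which I see before me?",
                     "Who can be wise, amazed?"]),
        ("LADY MACBETH", ["What do you mean?", "Give me the daggers."]),
    ]
    for speaker, lines in sample:
        speech = ET.SubElement(play, "SPEECH")
        ET.SubElement(speech, "SPEAKER").text = speaker
        for text in lines:
            ET.SubElement(speech, "LINE").text = text
    return play


def punctuation_profile(play):
    """Count '!' and '?' characters per speaker over every SPEECH element."""
    exclamations, questions = Counter(), Counter()
    for speech in play.iter("SPEECH"):
        speaker = speech.findtext("SPEAKER")
        for line in speech.iter("LINE"):
            # itertext() also picks up text nested inside child elements.
            text = "".join(line.itertext())
            exclamations[speaker] += text.count("!")
            questions[speaker] += text.count("?")
    return exclamations, questions


exclamations, questions = punctuation_profile(build_sample_play())
for speaker in sorted(set(exclamations) | set(questions)):
    print(speaker, exclamations[speaker], questions[speaker])
```

<p>For a real play, <code>ET.parse(path).getroot()</code> on a downloaded Bosak file would take the place of <code>build_sample_play()</code>. Speeches with more than one SPEAKER element would need extra handling, which this sketch skips.</p>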

<p>As for the number of words spoken in the play, Macbeth is by far the one who talks the most:</p>

<div id="attachment_78" class="wp-caption aligncenter" style="width: 610px"><img class="size-full wp-image-78" title="Macbeth main characters / words spoken" src="http://blog.openshakespeare.org/wp-content/uploads/2010/02/macbjwordspoken.png" alt="amount of words spoken by main characters " width="600" height="300" /><p class="wp-caption-text">amount of words spoken by main characters </p></div>
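
<p>A similar sketch gives lines and words per speaker, plus the &#8220;main character&#8221; filter. The assumptions are again loud ones: ElementTree rather than NLTK, the Bosak SPEECH / SPEAKER / LINE layout, a simple regular expression as the tokenizer, and a hypothetical hand-built sample instead of the real file.</p>

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed tokenizer: runs of letters and apostrophes count as one word.
WORD = re.compile(r"[A-Za-z']+")


def build_sample_play():
    """Build a tiny stand-in tree using the Bosak element names (assumed layout)."""
    play = ET.Element("PLAY")
    # Hand-written sample lines for illustration only.
    sample = [
        ("MACBETH", ["So foul and fair a day", "I have not seen"]),
        ("BANQUO", ["How far is it called to Forres"]),
        ("MACBETH", ["Speak if you can"]),
    ]
    for speaker, lines in sample:
        speech = ET.SubElement(play, "SPEECH")
        ET.SubElement(speech, "SPEAKER").text = speaker
        for text in lines:
            ET.SubElement(speech, "LINE").text = text
    return play


def speaker_totals(play):
    """Return (lines per speaker, words per speaker) as two Counters."""
    lines, words = Counter(), Counter()
    for speech in play.iter("SPEECH"):
        speaker = speech.findtext("SPEAKER")
        for line in speech.iter("LINE"):
            text = "".join(line.itertext())
            lines[speaker] += 1
            words[speaker] += len(WORD.findall(text))
    return lines, words


def main_characters(lines, min_lines):
    """Keep only the speakers who say more than min_lines lines."""
    return sorted(s for s, n in lines.items() if n > min_lines)


lines, words = speaker_totals(build_sample_play())
for speaker, total in words.most_common():
    print(speaker, lines[speaker], total)
```

<p>On the full play, <code>main_characters(lines, 100)</code> would apply the cut-off of 100 lines mentioned at the top of the post. A different tokenizer will shift the word totals a little, so the exact numbers depend on that choice.</p>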

<p>But what about lexical variety? The next graph breaks down the vocabulary each character uses:</p>

<div id="attachment_79" class="wp-caption aligncenter" style="width: 610px"><img class="size-full wp-image-79" title="Macbeth - lexical variety" src="http://blog.openshakespeare.org/wp-content/uploads/2010/02/macb-lexvar.png" alt="Macbeth - lexical variety" width="600" height="400" /><p class="wp-caption-text">Macbeth - lexical variety</p></div>

<p>Here we can see the variety in each character&#8217;s speech.</p>

<p>The brownish words are said just once by each character. The light greens are words that are repeated in their speech, and the dark greens are the repetitions of those light green words. I still need to take more measurements to see if this is actually the way everybody speaks: by repeating a lot of small words, with just a few new words once in a while. (There are more words that appear just once than words you repeat through most of your speech! Think about it!)</p>
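
<p>The three colour bands can be reproduced from a plain word count. This is a minimal sketch of one possible reading of the graph, not the code that drew it: the function name, the regular-expression tokenizer and the sample sentence are all assumptions, and the input is simply all of one character&#8217;s lines joined into a single string.</p>

```python
import re
from collections import Counter

# Assumed tokenizer: lowercase runs of letters and apostrophes.
WORD = re.compile(r"[a-z']+")


def variety_breakdown(text):
    """Split one character's words into the three bands described above."""
    counts = Counter(WORD.findall(text.lower()))
    # Brownish band: distinct words this character says exactly once.
    once = sum(1 for n in counts.values() if n == 1)
    # Light green band: distinct words this character says more than once.
    repeated = sum(1 for n in counts.values() if n > 1)
    # Dark green band: every occurrence of a repeated word after its first.
    repetitions = sum(n - 1 for n in counts.values() if n > 1)
    return {
        "once": once,
        "repeated": repeated,
        "repetitions": repetitions,
        "total": sum(counts.values()),
    }


# Hypothetical sample text standing in for one character's speech.
speech = "tomorrow and tomorrow and tomorrow creeps in this petty pace"
bands = variety_breakdown(speech)
print(bands)
```

<p>The three bands always add up to the total number of words spoken, so they can be stacked in one bar per character. Dividing <code>once + repeated</code> by <code>total</code> gives the usual distinct-words-per-word ratio as a single number for comparing characters.</p>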
]]></content:encoded>
			<wfw:commentRss>http://www.openshakespeare.org/2010/02/26/xml-and-the-natural-language-toolkit/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

