{"id":214,"date":"2012-05-10T16:47:20","date_gmt":"2012-05-10T16:47:20","guid":{"rendered":"http:\/\/www.silverclaw.net\/?p=214"},"modified":"2012-05-16T08:11:07","modified_gmt":"2012-05-16T08:11:07","slug":"markov-chains-in-asp-net","status":"publish","type":"post","link":"https:\/\/www.silverclaw.net\/?p=214","title":{"rendered":"Markov chains in ASP.NET"},"content":{"rendered":"<p><a title=\"Markov Chains Explained\" href=\"http:\/\/www.silverclaw.net\/?p=206\">My previous article<\/a> covers the concepts behind Markov chains. In this article, we&#8217;ll look at how Markov chains can be generated using C# code. You can\u00a0 download the complete source code of this example on the <a title=\"Downloads\" href=\"http:\/\/www.silverclaw.net\/?page_id=10\">Downloads<\/a> page.<\/p>\n<p>To start with, the text will need to be sanitized to remove any symbols and punctuation marks that might interfere with the analysis of the text. You can do this using a regular expression, as in the following code:<\/p>\n<pre>Input = Regex.Replace(Input, @\"[^A-Za-z \\.]\", \"\");<\/pre>\n<p>Regular expressions like this one are covered in Session 10 of my upcoming <a href=\"http:\/\/www.learnasp4.com\/free_tutorials\/free_tutorials_with_video.htm#expert\" target=\"_blank\">ASP.NET, C# and Visual Studio Expert Skills course<\/a>.<\/p>\n<p>You&#8217;ll notice that the full-stop characters are left in place. This is because they&#8217;ll be used to indicate the end of a chain. For this to work, however, they&#8217;ll need to be separated from the words in the text using the following code:<\/p>\n<pre>Input = Input.Replace(\".\", \" . \");<\/pre>\n<p>You could do more sanitization at this point to remove double-spaces or HTML tags, but this is enough for this example.<\/p>\n<p>Next the individual words must be extracted from the text and placed in an array. This can be done using the <em>Split<\/em> method:<\/p>\n<pre>string[] Words = Input.Split(' ');<\/pre>\n<p>Now the individual words have been placed in the <em>Words<\/em> array.<\/p>\n<p>Next you&#8217;ll need to analyze the text to get the probability data needed to create a Markov chain. To do this, you&#8217;ll need a class structure to store the probabilities:<\/p>\n<pre>private class MarkovWord\r\n{\r\n\u00a0\u00a0\u00a0 public MarkovWord(string NewWord, int NewInstances)\r\n\u00a0\u00a0\u00a0 {\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Word = NewWord;\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Instances = NewInstances;\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 FollowingWords = new List&lt;MarkovWord&gt;();\r\n\u00a0\u00a0\u00a0 }\r\n\u00a0\u00a0\u00a0 public string Word;\r\n\u00a0\u00a0\u00a0 public int Instances;\r\n\u00a0\u00a0\u00a0 public double Probability;\r\n\u00a0\u00a0\u00a0 public List&lt;MarkovWord&gt; FollowingWords;\r\n}<\/pre>\n<p>Now you&#8217;ll need to use this class to extract the probabilities from the text. First you&#8217;ll need to extract a unique list of words to use as a &#8216;top level&#8217;:<\/p>\n<pre>string[] UniqueWords = Words.Select(UniqueWord =&gt; UniqueWord).Distinct().ToArray();<\/pre>\n<p>This code uses LINQ to extract a unique list of words from the <em>Words<\/em> collection. Using LINQ in this way is covered in Session 10 of my <a href=\"http:\/\/www.learnasp4.com\/free_tutorials\/free_tutorials_with_video.htm\" target=\"_blank\">ASP.NET, C# and Visual Studio Essential Skills course<\/a>.<\/p>\n<p>Now you&#8217;ll need some more complex code to extract the probability data from the text. You&#8217;ll need to create a new collection of <em>MarkovWord<\/em> objects and use a <em>foreach<\/em> loop to cycle through each of the unique words.<\/p>\n<pre>List&lt;MarkovWord&gt; MarkovData = new List&lt;MarkovWord&gt;();\r\nforeach (string Word in UniqueWords)\r\n{<\/pre>\n<p>Now you&#8217;ll need to create a new <em>MarkovWord<\/em> object for the current word and a collection of <em>MarkovWord<\/em> objects to store each of the words that follows the current word:<\/p>\n<pre>MarkovWord NewWord = new MarkovWord(Word, 1);\r\nList&lt;MarkovWord&gt; FollowingWords = new List&lt;MarkovWord&gt;();<\/pre>\n<p>Next, you need to loop through every word in the complete text and look for the current word. You can do this with the following <em>for<\/em> loop:<\/p>\n<pre>for (int Counter = 0; Counter &lt; Words.Length - 1; Counter++)\r\n{<\/pre>\n<p>When you find an instance of the current word, you need to get the word that follows it in the <em>Words<\/em> array and add it to the <em>FollowingWords<\/em> collection. If it already exists, you&#8217;ll increase the <em>Instances<\/em> property of the existing object instead of adding a new one:<\/p>\n<pre>if (Words[Counter] == Word)\r\n{\r\n\u00a0\u00a0\u00a0 string FollowingWord = Words[Counter + 1];\r\n\u00a0\u00a0\u00a0 MarkovWord ExistingWord = FollowingWords.SingleOrDefault(FWord =&gt; FWord.Word == FollowingWord);\r\n\u00a0\u00a0\u00a0 if (ExistingWord == null) FollowingWords.Add(new MarkovWord(FollowingWord, 1));\r\n\u00a0\u00a0\u00a0 else ExistingWord.Instances++;\r\n}<\/pre>\n<p>Now it&#8217;s time to calculate the probability of each following word. This is easy to do by simply adding up the total number of following words and dividing the number of instances of each following word:<\/p>\n<pre>double TotalFollowingWords = FollowingWords.Sum(FWord =&gt; FWord.Instances);\r\nforeach (MarkovWord FollowingWord in FollowingWords)\r\n{\r\n\u00a0\u00a0\u00a0 FollowingWord.Probability = FollowingWord.Instances \/ TotalFollowingWords;\r\n}<\/pre>\n<p>Now all that&#8217;s needed is to add the finished <em>MarkovWord<\/em> object to the <em>MarkovData<\/em> collection:<\/p>\n<pre>NewWord.FollowingWords = FollowingWords;\r\nMarkovData.Add(NewWord);<\/pre>\n<p>You have now created the complete data structure needed to create a Markov chain.<\/p>\n<p>To create the Markov chain, you&#8217;ll need a <em>Random<\/em> object:<\/p>\n<pre>Random Randomizer = new Random();<\/pre>\n<p>You&#8217;ll use this to randomly select words for the chain.<\/p>\n<p>Next, you need to select a starting word and create variables to store the random probability and output string:<\/p>\n<pre>MarkovWord CurrentWord = MarkovData[0];\r\ndouble RandomValue;\r\nstring OutputString = CurrentWord.Word;<\/pre>\n<p>Now you&#8217;ll create a <em>while<\/em> loop to cycle until a full stop is found:<\/p>\n<pre>while (CurrentWord.Word != \".\")\r\n{<\/pre>\n<p>Within the <em>while<\/em> loop, you will need to generate a random value. This will determine which word is selected from the <em>FollowingWords<\/em> collection.<\/p>\n<pre>bool WordPicked = false;\r\nRandomValue = Randomizer.NextDouble();\r\ndouble CurrentProbability = 0;<\/pre>\n<p>The <em>NextDouble<\/em> method will populate the <em>RandomValue<\/em> variable with a number from 0.0 to 1.0. The <em>WordPicked<\/em> and <em>CurrentProbability<\/em> variables will be used later.<\/p>\n<p>Next you&#8217;ll need to loop through all of the words in the <em>FollowingWords<\/em> collection and use the <em>RandomValue<\/em> to determine which one is selected:<\/p>\n<pre>foreach (MarkovWord FollowingWord in CurrentWord.FollowingWords)\r\n{\r\n\u00a0\u00a0\u00a0 if (!WordPicked)\r\n\u00a0\u00a0\u00a0 {\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 CurrentProbability += FollowingWord.Probability;\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 if (CurrentProbability &gt;= RandomValue)\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 {\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 WordPicked = true;\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 OutputString += \" \" + FollowingWord.Word;\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 CurrentWord = MarkovData.Single(BaseWord =&gt; BaseWord.Word == FollowingWord.Word);\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 }\r\n\u00a0\u00a0\u00a0 }\r\n}<\/pre>\n<p>This works by adding the probability of each following word to <em>CurrentProbability<\/em> and selecting the word if <em>CurrentProbability<\/em> is greater than or equal to <em>RandomValue<\/em>.<\/p>\n<p>For example, using the following probabilities:<br \/>\n<em>and (0.33333)<\/em><br \/>\n<em>fell (0.66666)<\/em><\/p>\n<p>&#8230;the word <strong>and<\/strong> will only be selected if <em>RandomValue<\/em> is less than or equal to <em>0.33333<\/em>.<\/p>\n<p>Your code is now complete. A Markov chain is generated and placed in the <em>OutputString<\/em> variable. You can download a complete working version of this code on the <a title=\"Downloads\" href=\"http:\/\/www.silverclaw.net\/?page_id=10\">Downloads<\/a> page.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>My previous article covers the concepts behind Markov chains. In this article, we&#8217;ll look at how Markov chains can be generated using C# code. You can\u00a0 download the complete source code of this example on the Downloads page. To start &hellip; <a href=\"https:\/\/www.silverclaw.net\/?p=214\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"gallery","meta":[],"categories":[12],"tags":[],"_links":{"self":[{"href":"https:\/\/www.silverclaw.net\/index.php?rest_route=\/wp\/v2\/posts\/214"}],"collection":[{"href":"https:\/\/www.silverclaw.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.silverclaw.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.silverclaw.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.silverclaw.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=214"}],"version-history":[{"count":10,"href":"https:\/\/www.silverclaw.net\/index.php?rest_route=\/wp\/v2\/posts\/214\/revisions"}],"predecessor-version":[{"id":233,"href":"https:\/\/www.silverclaw.net\/index.php?rest_route=\/wp\/v2\/posts\/214\/revisions\/233"}],"wp:attachment":[{"href":"https:\/\/www.silverclaw.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=214"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.silverclaw.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=214"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.silverclaw.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=214"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}