<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-20452128</id><updated>2012-01-24T11:53:12.646-08:00</updated><category term='install'/><category term='Time complexity'/><category term='debug'/><category term='setup'/><category term='scheme'/><category term='hack'/><category term='market share'/><category term='upper bound'/><category term='client'/><category term='java'/><category term='Job Flow'/><category term='convert'/><category term='Amazon'/><category term='development'/><category term='graham and brent theorem'/><category term='example'/><category term='trace'/><category term='multicore'/><category term='nontail call'/><category term='lambda'/><category term='tail call'/><category term='Map Reduce'/><category term='API'/><category term='fibonacci'/><category term='Map'/><category term='firefox'/><category term='recursion tree'/><category term='Terminate'/><category term='minimize cost'/><category term='cost'/><category term='each line'/><category term='Hadoop'/><category term='web service'/><category term='Eclipse'/><category term='tail recursion'/><category term='elastic map reduce'/><category term='IE'/><category term='programmatically'/><category term='nontail recursion'/><category term='recursion'/><category term='scheduling'/><title type='text'>Chathura Herath 's Blog</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default?max-results=100'/><link rel='alternate' type='text/html' 
href='http://chathurah.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>25</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-20452128.post-4784122488423400462</id><published>2011-02-18T18:50:00.000-08:00</published><updated>2011-02-18T20:04:29.314-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='nontail call'/><category scheme='http://www.blogger.com/atom/ns#' term='example'/><category scheme='http://www.blogger.com/atom/ns#' term='nontail recursion'/><category scheme='http://www.blogger.com/atom/ns#' term='recursion tree'/><category scheme='http://www.blogger.com/atom/ns#' term='fibonacci'/><category scheme='http://www.blogger.com/atom/ns#' term='tail call'/><category scheme='http://www.blogger.com/atom/ns#' term='scheme'/><category scheme='http://www.blogger.com/atom/ns#' term='tail recursion'/><category scheme='http://www.blogger.com/atom/ns#' term='convert'/><title type='text'>Non-tail call to tail call conversion - Scheme Fibonacci example with recursion tree</title><content type='html'>A discussion I had with &lt;a href="http://terkhorn.com/"&gt;Felix Terkhorn&lt;/a&gt; prompted me to write this post, which shows off Scheme's tail-call optimization using Scheme's trace. I implemented this optimization when I wrote my Scheme compiler for &lt;a href="https://www.cs.indiana.edu/~dyb/"&gt;Prof Dybvig&lt;/a&gt;'s class. 
&lt;br /&gt;&lt;br /&gt;Also on display here is the conversion of a non-tail call into a tail call. The most common example of this is &lt;a href="http://en.wikipedia.org/wiki/Tail_call#Example_programs"&gt;Factorial &lt;/a&gt;which can be seen &lt;a href="http://"&gt;here&lt;/a&gt;, but I took Fibonacci as the example because it requires a little more thinking. Take a look at the Fibonacci implementation below. We know that if a lambda is in tail context, the last expression of its body is also in tail context. Likewise, if a cond/if is in tail context, its then and else branches are also in tail context. But the recursive calls to fib appear as arguments to +. Although the call to + is in tail context, its arguments are not: each recursive call must return before + can be applied. This shows up when we calculate (fib 7), where we can see the stack growing and shrinking in the recursion tree.&lt;br /&gt;&lt;br /&gt;Now take a look at the rewritten tailfib function and its trace. You will notice the tail-call optimization at work, as the stack does not grow in the trace. In other words, it is an iteration. The idea is to use an accumulator, as in the Factorial example, but Fibonacci needs two accumulators because it is a second-order recurrence that refers to the two previous values. The accumulators start at the initial conditions (F-1 = 1, F-2 = 0) and are updated upward on each call, while the counter n is decremented. 
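&lt;br /&gt;&lt;br /&gt;For comparison, here is a minimal sketch of the single-accumulator conversion for Factorial. The names fact and tailfact are hypothetical, chosen for this illustration, and the code is not taken from the linked article. In fact the multiplication has to wait for the recursive call to return, while in tailfact the pending work is carried along in the acc argument, so the recursive call is the last thing the procedure does.&lt;br /&gt;&lt;br /&gt;

```scheme
; Non-tail version: (* n ...) is still pending when fact calls itself,
; so every call needs its own stack frame.
(define fact
  (lambda (n)
    (if (= n 0)
        1
        (* n (fact (- n 1))))))

; Accumulator version (tailfact is a hypothetical helper name):
; the recursive call sits in tail position of the if, so it is a tail call.
(define tailfact
  (lambda (n acc)
    (if (= n 0)
        acc
        (tailfact (- n 1) (* n acc)))))

(display (fact 5))        ; prints 120
(newline)
(display (tailfact 5 1))  ; prints 120
(newline)
```

&lt;br /&gt;&lt;br /&gt;Calling (tailfact 5 1) starts the accumulator at 1, the value of 0!.&lt;br /&gt;&lt;br /&gt;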
Passing the accumulators as arguments lets the recursive call to tailfib sit in the tail context of the cond special form, thus making it a tail recursion.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&gt;(define fib (lambda (n)&lt;br /&gt;&lt;span style="padding-left:10px"&gt;                 (cond&lt;/span&gt;&lt;br /&gt;&lt;span style="padding-left:20px"&gt;                  ( (= n 0) 0)&lt;/span&gt;&lt;br /&gt;&lt;span style="padding-left:20px"&gt;                  ( (= n 1) 1)&lt;/span&gt;&lt;br /&gt;&lt;span style="padding-left:20px"&gt;                  ( else (+ (fib (- n 1)) (fib (- n 2)))))))&lt;/span&gt;&lt;br /&gt;&gt;(fib 7)&lt;br /&gt;13&lt;br /&gt;&gt; (trace fib)&lt;br /&gt;(fib)&lt;br /&gt;&gt; (fib 7)&lt;br /&gt;|(fib 7)&lt;br /&gt;| (fib 5)&lt;br /&gt;| |(fib 3)&lt;br /&gt;| | (fib 1)&lt;br /&gt;| | 1&lt;br /&gt;| | (fib 2)&lt;br /&gt;| | |(fib 0)&lt;br /&gt;| | |0&lt;br /&gt;| | |(fib 1)&lt;br /&gt;| | |1&lt;br /&gt;| | 1&lt;br /&gt;| |2&lt;br /&gt;| |(fib 4)&lt;br /&gt;| | (fib 2)&lt;br /&gt;| | |(fib 0)&lt;br /&gt;| | |0&lt;br /&gt;| | |(fib 1)&lt;br /&gt;| | |1&lt;br /&gt;| | 1&lt;br /&gt;| | (fib 3)&lt;br /&gt;| | |(fib 1)&lt;br /&gt;| | |1&lt;br /&gt;| | |(fib 2)&lt;br /&gt;| | | (fib 0)&lt;br /&gt;| | | 0&lt;br /&gt;| | | (fib 1)&lt;br /&gt;| | | 1&lt;br /&gt;| | |1&lt;br /&gt;| | 2&lt;br /&gt;| |3&lt;br /&gt;| 5&lt;br /&gt;| (fib 6)&lt;br /&gt;| |(fib 4)&lt;br /&gt;| | (fib 2)&lt;br /&gt;| | |(fib 0)&lt;br /&gt;| | |0&lt;br /&gt;| | |(fib 1)&lt;br /&gt;| | |1&lt;br /&gt;| | 1&lt;br /&gt;| | (fib 3)&lt;br /&gt;| | |(fib 1)&lt;br /&gt;| | |1&lt;br /&gt;| | |(fib 2)&lt;br /&gt;| | | (fib 0)&lt;br /&gt;| | | 0&lt;br /&gt;| | | (fib 1)&lt;br /&gt;| | | 1&lt;br /&gt;| | |1&lt;br /&gt;| | 2&lt;br /&gt;| |3&lt;br /&gt;| |(fib 5)&lt;br /&gt;| | (fib 3)&lt;br /&gt;| | |(fib 1)&lt;br /&gt;| | |1&lt;br /&gt;| | |(fib 2)&lt;br /&gt;| | | (fib 0)&lt;br /&gt;| | | 0&lt;br /&gt;| | | (fib 1)&lt;br /&gt;| | | 1&lt;br /&gt;| | |1&lt;br /&gt;| | 2&lt;br 
/&gt;| | (fib 4)&lt;br /&gt;| | |(fib 2)&lt;br /&gt;| | | (fib 0)&lt;br /&gt;| | | 0&lt;br /&gt;| | | (fib 1)&lt;br /&gt;| | | 1&lt;br /&gt;| | |1&lt;br /&gt;| | |(fib 3)&lt;br /&gt;| | | (fib 1)&lt;br /&gt;| | | 1&lt;br /&gt;| | | (fib 2)&lt;br /&gt;| | | |(fib 0)&lt;br /&gt;| | | |0&lt;br /&gt;| | | |(fib 1)&lt;br /&gt;| | | |1&lt;br /&gt;| | | 1&lt;br /&gt;| | |2&lt;br /&gt;| | 3&lt;br /&gt;| |5&lt;br /&gt;| 8&lt;br /&gt;|13&lt;br /&gt;13&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&gt;(define tailfib (lambda (n F-1 F-2)&lt;br /&gt;&lt;span style="padding-left:10px"&gt;  (cond &lt;/span&gt;&lt;br /&gt;&lt;span style="padding-left:20px"&gt;  ( (= n 0) 0)&lt;/span&gt;&lt;br /&gt;&lt;span style="padding-left:20px"&gt;  ( (= n 1) F-1)&lt;/span&gt;&lt;br /&gt;&lt;span style="padding-left:20px"&gt;  ( else (tailfib (- n 1) (+ F-1 F-2) F-1)))))&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&gt;(define fib(lambda (n)&lt;br /&gt;&lt;span style="padding-left:10px"&gt;                    (tailfib n 1 0)))&lt;/span&gt;&lt;br /&gt;&gt;(trace tailfib)&lt;br /&gt;&lt;br /&gt;&gt;(fib 7)&lt;br /&gt;|(tailfib 7 1 0)&lt;br /&gt;|(tailfib 6 1 1)&lt;br /&gt;|(tailfib 5 2 1)&lt;br /&gt;|(tailfib 4 3 2)&lt;br /&gt;|(tailfib 3 5 3)&lt;br /&gt;|(tailfib 2 8 5)&lt;br /&gt;|(tailfib 1 13 8)&lt;br /&gt;|13&lt;br /&gt;13&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-4784122488423400462?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/4784122488423400462/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=4784122488423400462' title='100 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/4784122488423400462'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/20452128/posts/default/4784122488423400462'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2011/02/non-tail-call-to-tail-call-coversion.html' title='Non-tail call to tail call conversion - Scheme Fibonacci example with recursion tree'/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>100</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-1868748821607444164</id><published>2011-01-29T10:10:00.000-08:00</published><updated>2011-01-29T10:37:57.105-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='lambda'/><category scheme='http://www.blogger.com/atom/ns#' term='trace'/><category scheme='http://www.blogger.com/atom/ns#' term='recursion'/><category scheme='http://www.blogger.com/atom/ns#' term='debug'/><category scheme='http://www.blogger.com/atom/ns#' term='scheme'/><title type='text'>Trace lambda in Scheme to display recursion tree</title><content type='html'>When writing recursive procedures in Scheme, it is sometimes useful to see the growth of the recursion tree, either for debugging or to build intuition about ways to improve the algorithm, such as dynamic programming. The following is a purely recursive insertion sort. It is obviously not an efficient implementation, yet it illustrates the use of the &lt;span style="font-style:italic;"&gt;&lt;span style="font-weight:bold;"&gt;trace&lt;/span&gt;&lt;/span&gt; procedure.&lt;br /&gt;&lt;br /&gt;&lt;code&gt; (define insert (lambda (val sorted)&lt;br /&gt;&lt;span style="padding-left:20px"&gt;(if (null? 
sorted)&lt;/span&gt;&lt;br /&gt;&lt;span style="padding-left:40px"&gt;(list val)&lt;/span&gt;&lt;br /&gt;&lt;span style="padding-left:40px"&gt;(if (&lt; val (car sorted))&lt;/span&gt;&lt;br /&gt;&lt;span style="padding-left:60px"&gt;(cons val sorted)&lt;/span&gt;&lt;br /&gt;&lt;span style="padding-left:60px"&gt;(cons (car sorted) (insert val (cdr sorted)))))))&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;(define sort (lambda (vals)&lt;br /&gt;&lt;span style="padding-left:20px"&gt;(if (null? vals)&lt;/span&gt;&lt;br /&gt;&lt;span style="padding-left:40px"&gt;vals&lt;/span&gt;&lt;br /&gt;&lt;span style="padding-left:40px"&gt;(insert (car vals) (sort (cdr vals))))))&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;(trace insert)&lt;br /&gt;(trace sort)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;(sort '(7 6 5 9 3 4 2 8))&lt;/code&gt;&lt;br /&gt;|(sort (7 6 5 9 3 4 2 8))&lt;br /&gt;| (sort (6 5 9 3 4 2 8))&lt;br /&gt;| |(sort (5 9 3 4 2 8))&lt;br /&gt;| | (sort (9 3 4 2 8))&lt;br /&gt;| | |(sort (3 4 2 8))&lt;br /&gt;| | | (sort (4 2 8))&lt;br /&gt;| | | |(sort (2 8))&lt;br /&gt;| | | | (sort (8))&lt;br /&gt;| | | | |(sort ())&lt;br /&gt;| | | | |()&lt;br /&gt;| | | | (insert 8 ())&lt;br /&gt;| | | | (8)&lt;br /&gt;| | | |(insert 2 (8))&lt;br /&gt;| | | |(2 8)&lt;br /&gt;| | | (insert 4 (2 8))&lt;br /&gt;| | | |(insert 4 (8))&lt;br /&gt;| | | |(4 8)&lt;br /&gt;| | | (2 4 8)&lt;br /&gt;| | |(insert 3 (2 4 8))&lt;br /&gt;| | | (insert 3 (4 8))&lt;br /&gt;| | | (3 4 8)&lt;br /&gt;| | |(2 3 4 8)&lt;br /&gt;| | (insert 9 (2 3 4 8))&lt;br /&gt;| | |(insert 9 (3 4 8))&lt;br /&gt;| | | (insert 9 (4 8))&lt;br /&gt;| | | |(insert 9 (8))&lt;br /&gt;| | | | (insert 9 ())&lt;br /&gt;| | | | (9)&lt;br /&gt;| | | |(8 9)&lt;br /&gt;| | | (4 8 9)&lt;br /&gt;| | |(3 4 8 9)&lt;br /&gt;| | (2 3 4 8 9)&lt;br /&gt;| |(insert 5 (2 3 4 8 9))&lt;br /&gt;| | (insert 5 (3 4 8 9))&lt;br /&gt;| | |(insert 5 (4 8 9))&lt;br /&gt;| | | (insert 5 (8 9))&lt;br /&gt;| | | (5 8 9)&lt;br /&gt;| | |(4 
5 8 9)&lt;br /&gt;| | (3 4 5 8 9)&lt;br /&gt;| |(2 3 4 5 8 9)&lt;br /&gt;| (insert 6 (2 3 4 5 8 9))&lt;br /&gt;| |(insert 6 (3 4 5 8 9))&lt;br /&gt;| | (insert 6 (4 5 8 9))&lt;br /&gt;| | |(insert 6 (5 8 9))&lt;br /&gt;| | | (insert 6 (8 9))&lt;br /&gt;| | | (6 8 9)&lt;br /&gt;| | |(5 6 8 9)&lt;br /&gt;| | (4 5 6 8 9)&lt;br /&gt;| |(3 4 5 6 8 9)&lt;br /&gt;| (2 3 4 5 6 8 9)&lt;br /&gt;|(insert 7 (2 3 4 5 6 8 9))&lt;br /&gt;| (insert 7 (3 4 5 6 8 9))&lt;br /&gt;| |(insert 7 (4 5 6 8 9))&lt;br /&gt;| | (insert 7 (5 6 8 9))&lt;br /&gt;| | |(insert 7 (6 8 9))&lt;br /&gt;| | | (insert 7 (8 9))&lt;br /&gt;| | | (7 8 9)&lt;br /&gt;| | |(6 7 8 9)&lt;br /&gt;| | (5 6 7 8 9)&lt;br /&gt;| |(4 5 6 7 8 9)&lt;br /&gt;| (3 4 5 6 7 8 9)&lt;br /&gt;|(2 3 4 5 6 7 8 9)&lt;br /&gt;(2 3 4 5 6 7 8 9)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-1868748821607444164?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/1868748821607444164/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=1868748821607444164' title='105 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/1868748821607444164'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/1868748821607444164'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2011/01/trace-lambda-in-scheme-to-display.html' title='Trace lambda in Scheme to display recursion tree'/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' 
src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>105</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-1995150885439409508</id><published>2010-12-02T20:54:00.000-08:00</published><updated>2010-12-02T20:56:36.722-08:00</updated><title type='text'>Latex Equation Editor</title><content type='html'>&lt;a href="http://www.numberempire.com/texequationeditor/equationeditor.php"&gt;Here &lt;/a&gt;is a great site that lets you edit and render mathematical equations.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-1995150885439409508?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/1995150885439409508/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=1995150885439409508' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/1995150885439409508'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/1995150885439409508'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2010/12/latex-equation-editor.html' title='Latex Equation Editor'/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-2438444748593043203</id><published>2010-09-06T21:17:00.000-07:00</published><updated>2010-09-06T21:48:48.710-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='install'/><category 
scheme='http://www.blogger.com/atom/ns#' term='development'/><category scheme='http://www.blogger.com/atom/ns#' term='Eclipse'/><category scheme='http://www.blogger.com/atom/ns#' term='setup'/><category scheme='http://www.blogger.com/atom/ns#' term='hack'/><category scheme='http://www.blogger.com/atom/ns#' term='Hadoop'/><title type='text'>How to setup Eclipse project and development environment for changing Hadoop framework</title><content type='html'>Here are the three easy steps I follow when I want to hack the Hadoop code for my research. I have seen this question asked and answered a few times, and I think the approach I took is worth blogging.&lt;br /&gt;&lt;br /&gt;First of all, you need to decide which version you want to hack on. I usually use the 0.20.2 tag: I download and install the binary version of that release and follow the &lt;a href="http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29"&gt;instructions &lt;/a&gt;posted there. The idea is to check out the same tag from svn, set up an Eclipse project, build that project, and replace the jar in the binary installation with the newly built jar.&lt;br /&gt;&lt;br /&gt;Step 1: Install - Download and install the stable version of &lt;a href="http://hadoop.apache.org/"&gt;Hadoop &lt;/a&gt;that you have chosen to develop on.&lt;br /&gt;&lt;br /&gt;Step 2: Set up the &lt;a href="http://hadoop.apache.org/"&gt;Hadoop &lt;/a&gt;Eclipse project - To set up the &lt;a href="http://hadoop.apache.org/"&gt;Hadoop &lt;/a&gt;Eclipse project you need &lt;a href="http://www.eclipse.org/"&gt;Eclipse &lt;/a&gt;with the &lt;a href="http://subclipse.tigris.org/update_1.6.x/"&gt;svn plug-in&lt;/a&gt;. Add an svn repository using the URL &lt;a href="https://svn.apache.org/repos/asf/hadoop/common/"&gt;https://svn.apache.org/repos/asf/hadoop/common/&lt;/a&gt;. 
Once the repository is added, go into the tags in the repository tree, find the tag for the version you installed in step 1, and check it out using Eclipse. You will now have an Eclipse project with a lot of errors. The first step in getting the project to compile is to run the ant jar target in the build script. This fetches all the dependency jars and stores them in lib and other folders. You can try to add the jars one by one, but I would rather suggest using a file explorer like Konqueror to search for all the jar files under the project root folder and copy them into a folder named libn. You might have to manually download ant.jar and add it to libn as well. Then you can add all these jar files to the Eclipse classpath. Once this is done you are almost set, except for selecting the src folders that you want compiled. Most of the exciting stuff in Hadoop happens in the src/mapreduce, src/core and src/hdfs source trees, so for a start I would add only these as source directories. By now the Eclipse project should compile without errors.&lt;br /&gt;&lt;br /&gt;Step 3: Deployment - Once the Eclipse project is set up and you are done with your hacking, having changed, say, the &lt;a href="http://hadoop.apache.org/"&gt;Hadoop &lt;/a&gt;core, you can run the jar target of build.xml to build the project. This generates a jar file named hadoop-0.xx.x-dev-core.jar inside the build directory. Now go to the &lt;a href="http://hadoop.apache.org/"&gt;Hadoop &lt;/a&gt;installation directory from step 1, delete the core jar there, and replace it with the new jar you just built. 
Once you have done that, restart &lt;a href="http://hadoop.apache.org/"&gt;Hadoop &lt;/a&gt;and you will have your own custom &lt;a href="http://hadoop.apache.org/"&gt;Hadoop &lt;/a&gt;installation.&lt;br /&gt;&lt;br /&gt;Happy Hadoop hacking!!!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-2438444748593043203?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/2438444748593043203/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=2438444748593043203' title='112 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/2438444748593043203'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/2438444748593043203'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2010/09/how-to-setup-eclipse-project-and.html' title='How to setup Eclipse project and development environment for changing Hadoop framework'/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>112</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-6073605006130014279</id><published>2010-06-22T21:05:00.000-07:00</published><updated>2010-06-22T21:13:37.267-07:00</updated><title type='text'>Nice overview of Hadoop Pig</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://hadoop.apache.org/pig/images/pig-logo.gif"&gt;&lt;img style="float: left; margin: 0pt 10px 10px 0pt; cursor: 
pointer; width: 75px; height: 106px;" src="http://hadoop.apache.org/pig/images/pig-logo.gif" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;I came across this nice presentation on &lt;a href="http://hadoop.apache.org/"&gt;Hadoop&lt;/a&gt; &lt;a href="http://hadoop.apache.org/pig/"&gt;Pig&lt;/a&gt;. It can be found &lt;a href="http://www.cloudera.com/videos/introduction_to_pig"&gt;here&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-6073605006130014279?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/6073605006130014279/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=6073605006130014279' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/6073605006130014279'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/6073605006130014279'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2010/06/nice-overview-of-hadoop-pig.html' title='Nice overview of Hadoop Pig'/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-917821006511734684</id><published>2010-06-15T11:39:00.000-07:00</published><updated>2010-06-15T11:40:58.046-07:00</updated><title type='text'>Great Hadoop installation guide</title><content type='html'>Following is the link to the most straightforward installation guide I've seen so far&lt;br /&gt;&lt;a 
href="http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29"&gt;&lt;br /&gt;Link!!&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-917821006511734684?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/917821006511734684/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=917821006511734684' title='66 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/917821006511734684'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/917821006511734684'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2010/06/great-hadoop-installation-guide.html' title='Great Hadoop installation guide'/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>66</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-1602418662076191763</id><published>2010-05-04T17:30:00.000-07:00</published><updated>2010-05-04T18:18:58.517-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='programmatically'/><category scheme='http://www.blogger.com/atom/ns#' term='API'/><category scheme='http://www.blogger.com/atom/ns#' term='Terminate'/><category scheme='http://www.blogger.com/atom/ns#' term='Map Reduce'/><category scheme='http://www.blogger.com/atom/ns#' term='Amazon'/><category scheme='http://www.blogger.com/atom/ns#' term='Job Flow'/><category scheme='http://www.blogger.com/atom/ns#' 
term='elastic map reduce'/><title type='text'>Amazon Elastic Map Reduce: API call to find and terminate job flows without Amazon Console</title><content type='html'>Recently one of the interns in the lab asked me how he could kill the running job flows in &lt;a href="http://aws.amazon.com/elasticmapreduce/"&gt;Amazon Elastic Map Reduce&lt;/a&gt; so that he would not run up a big bill. This is straightforward to do in the &lt;a href="http://aws.amazon.com/console/"&gt;Amazon Management Console&lt;/a&gt;, but in this case the student didn't have the username/password for the Amazon account because he was using a community account (although he did have the access keys). Another application would be a cron job that checks for leftover Job Flows and their EC2 instances every night, so being able to do this programmatically helps.&lt;br /&gt;&lt;br /&gt;Terminating a Job Flow is straightforward using the &lt;a href="http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2305"&gt;client API&lt;/a&gt; that Amazon provides, and a sample can be found &lt;a href="http://www.cs.indiana.edu/%7Echerath/java/TerminateJobFlowsSample.java"&gt;here&lt;/a&gt;. The issue is how to find the running Job Flows using API calls. The DescribeJobFlows API may give the impression that one needs to supply a Job Flow id up front in order to query the Job Flow status, but my testing shows that if we don't specify any Job Flow ids in the request, as in the sample &lt;a href="http://www.cs.indiana.edu/%7Echerath/java/DescribeJobFlowsSample.java"&gt;here&lt;/a&gt;, it returns the Job Flows with the most recent activity. Any Job Flows that are still running will be among those with the most recent activity. 
So once you have these two, it is easy to write a piece of code that glues them together to kill all the running Job Flows.&lt;br /&gt;&lt;br /&gt;That's it!!!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-1602418662076191763?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/1602418662076191763/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=1602418662076191763' title='29 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/1602418662076191763'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/1602418662076191763'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2010/05/amazon-elastic-map-reduce-api-call-to.html' title='Amazon Elastic Map Reduce: API call to find and terminate job flows without Amazon Console'/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>29</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-7314218854846661559</id><published>2010-04-19T11:25:00.000-07:00</published><updated>2010-04-19T12:04:45.552-07:00</updated><title type='text'>Google Scholar, I want BibTex all the time !!!</title><content type='html'>Don't get me wrong, I am extremely grateful for Google Scholar, especially its import to BibTex feature, and yes, I do want to stand on the shoulders of giants.&lt;br /&gt;&lt;br /&gt;The BibTex feature is very useful when including related work for a paper, but it is 
not there by default, and I think they should put it there by default. You do have the option, though, to go to preferences and switch it on, but apparently the setting is not associated with my Gmail account, so if I log in from a different machine I have to switch it on again.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-7314218854846661559?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/7314218854846661559/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=7314218854846661559' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/7314218854846661559'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/7314218854846661559'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2010/04/google-scholar-i-want-bibtex-all-time.html' title='Google Scholar, I want BibTex all the time !!!'/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-7677113352344544227</id><published>2010-04-09T17:49:00.000-07:00</published><updated>2010-04-11T19:52:45.064-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='multicore'/><category scheme='http://www.blogger.com/atom/ns#' term='upper bound'/><category scheme='http://www.blogger.com/atom/ns#' term='Time complexity'/><category scheme='http://www.blogger.com/atom/ns#' term='graham and brent theorem'/><category 
scheme='http://www.blogger.com/atom/ns#' term='scheduling'/><title type='text'>Graham and Brent theorem for speedup when using multiprocessors or multicores</title><content type='html'>Since we are big into multicores these days, it would be useful to predict, or at least conservatively estimate, how an application's speedup behaves as we throw more and more processors at it. Let us consider the two obvious extreme cases, one favorable and the other unfavorable. They are embarrassingly parallel applications and strictly sequential applications respectively, and we will look at an example of each in the reverse order.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Strictly Sequential.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Let's consider calculating the famous nth Fibonacci number, and let's assume we do NOT know the clever mathematical way to solve a second order recurrence, so we will just use the formula as it is. Starting from F0 and F1 we can calculate F2 and so forth up to the nth number. If we write a program to calculate Fi, we can observe that it has a strict data dependency on Fi-1 and Fi-2. In such data dependent cases there is no room for making things parallel. Such an application obviously won't speed up as you throw more processors at the problem.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Fn+2 = Fn+1 + Fn&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Embarrassingly parallel&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Consider an application that requires adding two large matrices. The addition of two corresponding elements of the matrices is totally independent of any other element. So there is no data dependency in such a case, and the matrices can easily be split into sub-problems, handed to however many processors are available, and run in parallel. 
In such applications programs could achieve linear speedup, and in some cases, with some clever caching techniques, even super-linear speedups can be achieved.&lt;br /&gt;&lt;br /&gt;In the two cases discussed above we can calculate how an application may speed up in the boundary cases, but most applications that we encounter are in between strictly sequential and embarrassingly parallel. Most applications have a certain degree of parallelism and some amount of data dependency which forces sequential behavior, and analyzing the speedup of such applications is much harder. In this post we have a look at a clever theorem discovered independently by Graham[1] and Brent[2]; one of the MIT talks I listened to consolidated their ideas into a very insightful theorem.&lt;br /&gt;&lt;br /&gt;First, some naming conventions.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;T_n  - Time taken for the application to run when n processors are at its disposal. T_1 represents the sequential case and T_infinity represents the case where the application may have as many processors as needed &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Based on the above conventions we can define:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Parallelism = T_1/T_infinity&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Speedup for p processors  = T_1/T_p&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;An observation assuming no cache tricks to achieve super-linear speedup:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;T_p &gt;= T_1/p&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This simply means that by having two processors you cannot cut the time by more than half; in practice it will always be a little more than half, due to possible data dependencies and of course the extra overhead associated with making the application parallel.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Model&lt;/span&gt;&lt;br 
/&gt;&lt;br /&gt;This theorem models a given application as a graph where nodes represent computational units and edges represent the data dependencies, similar to the workflow idea. Graham and Brent argue that if you have infinitely many processors available for the application, the time will depend upon the critical path of the graph, that is, the longest path from start to finish. So T_infinity may also be looked upon as the critical path time. They argue that the speedup that can be achieved depends on this critical path as well as the number of processors available to the scheduler during the execution of the application. Based on these two parameters they propose an upper bound on the time taken by an application when it is presented with p processors. An upper bound on time is useful because it allows you to provision your resources for the worst case scenario.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Theorem:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;&lt;br /&gt;T_p &lt;=  T_1/p   + T_infinity&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The theorem means that an application with p processors takes at most the sum of the sequential time divided by p and the critical path time. For example, if T_1 = 100, T_infinity = 10 and p = 4, then T_p is at most 100/4 + 10 = 35. Many techniques have since been developed to tighten that upper bound, but in my view this is the most insightful theorem of its nature, and its simplicity makes it quite appealing when making a conservative estimate or during formal analysis.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;[1] R. L. Graham. Bounds on multiprocessing timing anomalies. SIAM Journal on Applied Mathematics, 17(2):416-429, March 1969.&lt;br /&gt;[2] Richard P. Brent. The parallel evaluation of general arithmetic expressions. 
Journal of the ACM, 21(2):201-206, April 1974.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-7677113352344544227?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/7677113352344544227/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=7677113352344544227' title='61 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/7677113352344544227'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/7677113352344544227'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2010/04/graham-and-brent-theorem-for-speedup.html' title='Graham and Brent theorem for speedup when using multiprocessors or multicores'/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>61</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-8888648333369595849</id><published>2010-04-06T19:09:00.000-07:00</published><updated>2010-04-06T19:41:14.882-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='minimize cost'/><category scheme='http://www.blogger.com/atom/ns#' term='Map Reduce'/><category scheme='http://www.blogger.com/atom/ns#' term='cost'/><category scheme='http://www.blogger.com/atom/ns#' term='Amazon'/><category scheme='http://www.blogger.com/atom/ns#' term='Hadoop'/><category scheme='http://www.blogger.com/atom/ns#' term='elastic map reduce'/><title type='text'></title><content type='html'>&lt;span 
style="font-size:180%;"&gt;Reducing debug cycle during Amazon elastic map reduce development&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The cost model that Amazon has published for &lt;a href="http://aws.amazon.com/elasticmapreduce/"&gt;Amazon Elastic Map Reduce&lt;/a&gt; is totally unfair during the development process. The minimum billing unit is an hour, and these hours add up quickly to run up your bill if you are not careful enough. If you are doing anything serious using &lt;a href="http://aws.amazon.com/elasticmapreduce/"&gt;Amazon Elastic Map Reduce&lt;/a&gt;, that is to say you are running something other than the word count example and you choose not to install &lt;a href="http://hadoop.apache.org/"&gt;Hadoop &lt;/a&gt;yourself but rather to develop off the &lt;a href="http://hadoop.apache.org/"&gt;Hadoop &lt;/a&gt;in Amazon Elastic Map Reduce, you will end up making a lot of debug runs to get the configurations right. In each of these runs, if Hadoop gets launched even for a minute, you will be charged for an entire hour times the number of instances you launched. For example, a two-minute debug run on 10 instances is billed as 10 instance hours. Especially if you are using the Amazon Management Console, you will end up having to start a new Job Flow every time you change your application and want to test it. 
These costs quickly add up if you are not careful.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Tips to reduce the costs&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic; font-weight: bold;"&gt;Avoid using the extra large machines &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;During development avoid using extra large instances, because they cost much more, and because they have 8 cores you will be billed 8 normalized CPU hours as soon as the instance gets launched.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Programmatically Launch Job Flow with keep alive&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I wrote a &lt;a href="http://chathurah.blogspot.com/2010/03/programmatically-launch-elastic-map.html"&gt;blog &lt;/a&gt;earlier showing how to launch a Job Flow programmatically, and in it I showed how to keep the Job Flow and the instances alive after your map reduce application finishes. Then you can simply add a Job Flow Step to the already running Job Flow. 
This not only shortens the debug cycle, because the instance boot-up time no longer applies to the subsequent Job Flow Steps, but also lets you launch multiple map reduce runs as Job Flow Steps within an hour while paying for only one hour of CPU, because you are not shutting down the instances after each run.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Do not develop on Amazon Elastic Map Reduce&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;One option is to install &lt;a href="http://hadoop.apache.org/"&gt;Hadoop &lt;/a&gt;locally and test your application there before coming to Amazon, so you do not end up paying an hour's price for every few minutes of debug run.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;Amazon should provide development instances billed per minute.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The best scenario would be for Amazon to either provide cheaper instances for development or bill per minute during development.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-8888648333369595849?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/8888648333369595849/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=8888648333369595849' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/8888648333369595849'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/8888648333369595849'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2010/04/reducing-debug-cycle-during-amazon.html' title=''/><author><name>Chathura 
Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-6249022507951700964</id><published>2010-03-31T07:14:00.000-07:00</published><updated>2010-03-31T09:55:32.819-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Map'/><category scheme='http://www.blogger.com/atom/ns#' term='Map Reduce'/><category scheme='http://www.blogger.com/atom/ns#' term='Time complexity'/><category scheme='http://www.blogger.com/atom/ns#' term='Hadoop'/><category scheme='http://www.blogger.com/atom/ns#' term='each line'/><title type='text'></title><content type='html'>&lt;span style="font-size:180%;"&gt;Optimal Map tasks for Map Reduce Applications based on Time Complexity ???&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Some Analysis&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;One of the strong points of &lt;a href="http://hadoop.apache.org/"&gt;Hadoop &lt;/a&gt;is its ability to efficiently partition the input files using API hooks in HDFS and to call the Map tasks with each of these partitions. It's useful to understand how to choose this task distribution to get the best achievable performance. There are a few factors to consider when trying to pick the right number of map tasks, but the obvious one is to strike a balance between the speedup from the distribution and the overhead of distribution.&lt;br /&gt;&lt;br /&gt;Consider the case where the time complexity of the application is O(n), meaning the running time depends entirely on the size of the input, as in the case of WordCount. 
We can argue that the partition size that the system can handle efficiently would be the deciding factor, because the running time only grows linearly with the input and you have to read the input anyway. So in such cases letting Hadoop decide on the partition sizes is the right approach, because it will base the partition size on the optimal block sizes. Why? The following may convince you.&lt;br /&gt;&lt;br /&gt;Let's assume the Hadoop distribution overhead is linear in the number of Map tasks; then the Total time would look like the following equation. This may be an oversimplified equation, because it leaves out the reduction complexity and assumes without argument that the overhead of Hadoop is linear in the Map Tasks. But this simplicity gives you more insight.&lt;br /&gt;&lt;br /&gt;Total time = O (Input Size/Map Tasks) + Overhead * Map Tasks&lt;br /&gt;&lt;br /&gt;So if the application has linear time complexity O(n) then:&lt;br /&gt;&lt;br /&gt;Total time = (Input Size/Map Tasks)+ Overhead * Map Tasks&lt;br /&gt;&lt;br /&gt;Taking the derivative with respect to Map Tasks and setting it to zero, -(Input Size)/(Map Tasks)^2 + Overhead = 0, tells you that the Total time is minimum when&lt;br /&gt;&lt;br /&gt;Map Tasks = sqrt(Input Size/Overhead)&lt;br /&gt;&lt;br /&gt;Since this number is very dependent on the Overhead, it makes sense to let Hadoop do the partitioning and distribution.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Structured input for Map Reduce&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Consider the case where the time complexity of the application is not linear but higher; then the dominant term in the Total time is O (Input Size/Map Tasks). In case the time complexity is O(n^2)  =&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Total time = (Input Size/Map Tasks) ^ 2 + Overhead * Map Tasks&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Here the first term grows much faster, so minimizing it by increasing the number of Map tasks is essential.  
So in such cases you cannot let Hadoop come up with the number of Map tasks based on the optimal block sizes that it can handle; rather, you need to force the number of map tasks upon it.&lt;br /&gt;&lt;br /&gt;For example, consider a &lt;a href="http://j3d.sourceforge.net/"&gt;java ray tracing&lt;/a&gt; application, which has O(n^2)  complexity according to &lt;a href="http://www.odesk.com/users/Software-Architect-USA_d43caee4d56de953"&gt;Michael&lt;/a&gt;. In this you try to render a scene using ray tracing, and this can be easily parallelized because each pixel calculation is independent of the others. When you parallelize it, you will have different segments you want to compute in the input file, and you need to force Map Reduce to calculate those segments in different map tasks.  So you need much more structured input than in the word count example, because your number of map tasks and your running time depend on it. Letting the partitioning be done to optimize the file sizes in such a compute intensive application would be unwise and in many cases counterproductive. It should be noted, though, that Hadoop was designed for data intensive parallelism rather than compute intensive parallelism.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Forcing Hadoop to run a map task for each line in input file&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Hadoop provides an input format called &lt;a href="http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/lib/NLineInputFormat.html"&gt;NLineInputFormat &lt;/a&gt;where each line in the input file gets mapped to a separate map task, which gives you control over the number of map tasks that you want to force upon the Hadoop framework. The following is incomplete code to set up the JobConf, and the bold line shows what you need to do to set the NLineInputFormat. 
So if you know how many map tasks you need to run, you can put the input for each task on its own line in the input file, and Hadoop will create a new map task for each line.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;JobConf conf = new JobConf(HadoopRayTracer.class);&lt;br /&gt;        conf.setJobName("raytrace");&lt;br /&gt;        conf.setOutputKeyClass(Text.class);&lt;br /&gt;        conf.setOutputValueClass(Text.class);&lt;br /&gt;        conf.setMapperClass(Map.class);&lt;br /&gt;        conf.setReducerClass(Reduce.class);&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;        conf.setInputFormat(NLineInputFormat.class);&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;        conf.setOutputFormat(TextOutputFormat.class);&lt;br /&gt;.......&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-6249022507951700964?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/6249022507951700964/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=6249022507951700964' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/6249022507951700964'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/6249022507951700964'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2010/03/optimal-map-tasks-for-map-reduce.html' title=''/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' 
src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-2141175495257937072</id><published>2010-03-28T23:19:00.000-07:00</published><updated>2010-03-31T01:19:18.374-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='java'/><category scheme='http://www.blogger.com/atom/ns#' term='Map Reduce'/><category scheme='http://www.blogger.com/atom/ns#' term='Amazon'/><category scheme='http://www.blogger.com/atom/ns#' term='Hadoop'/><category scheme='http://www.blogger.com/atom/ns#' term='elastic map reduce'/><category scheme='http://www.blogger.com/atom/ns#' term='client'/><title type='text'></title><content type='html'>&lt;span style="font-weight: bold;font-size:180%;" &gt;Programmatically Launch Elastic Map Reduce applications&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I am writing this mainly to help the students that I am teaching this semester in &lt;a href="https://www.cs.indiana.edu/%7Eplale/"&gt;Prof Beth Plale&lt;/a&gt;’s &lt;a href="http://www.cs.indiana.edu/classes/b534/"&gt;B534 &lt;/a&gt;class.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;Short Background&lt;/span&gt;&lt;br /&gt;&lt;a href="http://aws.amazon.com/elasticmapreduce/"&gt;&lt;br /&gt;Amazon Elastic MapReduce&lt;/a&gt; provides an API to run &lt;a href="http://hadoop.apache.org/"&gt;Hadoop &lt;/a&gt;MapReduce jobs using Amazon EC2. If you have a Hadoop application you can use the &lt;a href="http://aws.amazon.com/console/"&gt;Amazon Console&lt;/a&gt; to deploy and run it on EC2. 
A great tutorial on how to use the Amazon Console to run your Hadoop application can be found &lt;a href="http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/index.html?CreatingaJobJAR.html"&gt;here&lt;/a&gt;.&lt;br /&gt;If you do not have a Hadoop application you need to have a look at the famous word count example which can be found &lt;a href="http://wiki.apache.org/hadoop/WordCount"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;Amazon abstraction of Map reduce.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Amazon has introduced two abstractions for its Elastic MapReduce framework:&lt;br /&gt;&lt;br /&gt;1) Job Flow&lt;br /&gt;2) Job Flow Step&lt;br /&gt;&lt;br /&gt;It is important to understand these abstractions before using the API to launch a MapReduce application. The following are the definitions provided by Amazon for Job Flow and Job Flow Step, but it is much more intuitive to understand them in the context of an application.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://aws.amazon.com/elasticmapreduce/faqs/#gen-6"&gt;Q: What is an Amazon Elastic MapReduce Job Flow?&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;A Job Flow is a collection of processing steps that Amazon Elastic MapReduce runs on a specified dataset using a set of Amazon EC2 instances. A Job Flow consists of one or more steps, each of which must complete in sequence successfully, for the Job Flow to finish.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://aws.amazon.com/elasticmapreduce/faqs/#gen-6"&gt;Q: What is a Job Flow Step?&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;A Job Flow Step is a user-defined unit of processing, mapping roughly to one algorithm that manipulates the data. A step is a Hadoop MapReduce application implemented as a Java jar or a streaming program written in Java, Ruby, Perl, Python, PHP, R, or C++. 
For example, to count the frequency with which words appear in a document, and output them sorted by the count, the first step would be a MapReduce application which counts the occurrences of each word, and the second step would be a MapReduce application which sorts the output from the first step based on the counts.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;Terminology: &lt;span style="font-style: italic;"&gt;Single MapReduce Run&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;When you run a MapReduce application, the data files are read by the Hadoop framework and divided into, say, N segments; your map function is called N times in parallel, and on completion of the map tasks your reduce function is called. Once the reduce function has finished executing, your outputs are written to the output folder. We shall call this a &lt;span style="font-style: italic;"&gt;Single MapReduce Run&lt;/span&gt; for the purpose of identification. So if you word count a particular log file within a &lt;span style="font-style: italic;"&gt;Single MapReduce Run&lt;/span&gt;, and at the end of it or during the run you receive another log message that you want to word count, you may want to launch another &lt;span style="font-style: italic;"&gt;Single MapReduce Run&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;For the purpose of the class, a rendering of a given scene using the modified &lt;a href="http://j3d.sourceforge.net/"&gt;ray tracing library&lt;/a&gt; will be a &lt;span style="font-style: italic;"&gt;Single MapReduce Run&lt;/span&gt;. In this &lt;span style="font-style: italic;"&gt;Single MapReduce Run&lt;/span&gt; you will split the scene into sections, ray trace these sections/subviews in different Map tasks, combine the sections/subviews in the Reduce task and write the result to the output. 
So if you have a second scene to render, it will be another&lt;span style="font-style: italic;"&gt; Single MapReduce Run&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;&lt;span style="font-style: italic;"&gt;Single MapReduce Run&lt;/span&gt; –Job Flow Step&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The Amazon Job Flow corresponds to a running application in Amazon Elastic MapReduce and could contain one or more &lt;span style="font-style: italic;"&gt;Single MapReduce Run&lt;/span&gt;s. So the &lt;span style="font-style: italic;"&gt;Single MapReduce Run&lt;/span&gt; described above corresponds to a Job Flow Step in Amazon Elastic Map Reduce. Job Flow and Job Flow Step have something like a parent-child relationship, and it is one-to-many.&lt;br /&gt;&lt;br /&gt;The Job Flow corresponds to the setup of infrastructure, EC2 reservation and, I am guessing, accounting. Within that Job Flow one may run Hadoop MapReduce applications. Each &lt;span style="font-style: italic;"&gt;Single MapReduce Run&lt;/span&gt; will be a Job Flow Step inside that Job Flow.&lt;br /&gt;&lt;br /&gt;EC2 machine reservations are done at the Job Flow level and once reserved they are fixed for all the Job Flow Steps in that particular Job Flow. This will become much clearer when you get to the API calls.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;Launching Elastic MapReduce jobs programmatically using Java&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Amazon provides an API client that enables Amazon Elastic MapReduce users to launch Hadoop MapReduce jobs programmatically. Again I assume by this time you have already launched and tested your application manually using Amazon MapReduce Console. 
You can download the client from &lt;a href="http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2305"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Once you download it, set it up in your IDE and find the com.amazonaws.elasticmapreduce.samples package. Inside this package you will find a few Java files which allow you to create Job Flows, add Job Flow Steps, terminate Job Flows and query Job Flow status. In this discussion we will focus on the first two (RunJobFlowSample.java and AddJobFlowStepsSample.java).&lt;br /&gt;&lt;br /&gt;If you open the RunJobFlowSample class from Amazon you may notice its main method is pretty empty, but the API they provide is very easy to use: you fill in your configuration on the bean objects they have provided and set them on the request. The following is the code to set up a Job Flow with a single Job Flow Step. In other words, it will set up the Hadoop framework using 11 machines and will launch a &lt;span style="font-style: italic;"&gt;Single MapReduce Run&lt;/span&gt;. You can download the implementation class &lt;a href="http://www.cs.indiana.edu/%7Echerath/java/RunJobFlowSample.java"&gt;here&lt;/a&gt;. 
You will obviously have to change the Amazon credentials to your own.&lt;br /&gt;&lt;br /&gt;Once you launch the Job Flow, you can go to the Amazon console and monitor the status of your Job Flow.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;public static void main(String[] args) {&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;String accessKeyId = "FHJKDFIHKJDF";&lt;br /&gt;String secretAccessKey = "DFJLDFODF/AND/NO/THIS/IS/NOT/MY/ACCESS/KEY";&lt;br /&gt;&lt;br /&gt;AmazonElasticMapReduceConfig config = new AmazonElasticMapReduceConfig();&lt;br /&gt;config.setSignatureVersion("0");&lt;br /&gt;// config.set&lt;br /&gt;AmazonElasticMapReduce service = new AmazonElasticMapReduceClient(&lt;br /&gt;accessKeyId, secretAccessKey, config);&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;RunJobFlowRequest request = new RunJobFlowRequest();&lt;br /&gt;JobFlowInstancesConfig conf = new JobFlowInstancesConfig();&lt;br /&gt;conf.setEc2KeyName("class");&lt;br /&gt;conf.setInstanceCount(11);&lt;br /&gt;conf.setKeepJobFlowAliveWhenNoSteps(true);&lt;br /&gt;conf.setMasterInstanceType("m1.small");&lt;br /&gt;conf.setPlacement(new PlacementType("us-east-1a"));&lt;br /&gt;conf.setSlaveInstanceType("m1.small");&lt;br /&gt;&lt;br /&gt;request.setInstances(conf);&lt;br /&gt;request.setLogUri("s3n://b534/logs");&lt;br /&gt;&lt;br /&gt;String jobFlowName = "Class-job-flow" + new Date().toString();&lt;br /&gt;jobFlowName = Utils.formatString(jobFlowName);&lt;br /&gt;System.err.println(jobFlowName);&lt;br /&gt;&lt;br /&gt;request.setName(jobFlowName);&lt;br /&gt;String stepname = "Step" + System.currentTimeMillis();&lt;br /&gt;List steps = new LinkedList();&lt;br /&gt;StepConfig stepConfig = new StepConfig();&lt;br /&gt;stepConfig.setActionOnFailure("CANCEL_AND_WAIT");&lt;br /&gt;HadoopJarStepConfig jarsetup = new HadoopJarStepConfig();&lt;br /&gt;List arguments = new LinkedList();&lt;br /&gt;arguments.add("s3n://b534/inputs/");&lt;br /&gt;arguments.add("s3n://b534/outputs/"+jobFlowName+"/"+stepname+"/");&lt;br 
/&gt;jarsetup.setArgs(arguments);&lt;br /&gt;jarsetup.setJar("s3n://b534/Hadoopv400.jar");&lt;br /&gt;jarsetup.setMainClass("edu.indiana.extreme.HadoopRayTracer");&lt;br /&gt;stepConfig.setHadoopJarStep(jarsetup);&lt;br /&gt;&lt;br /&gt;stepConfig.setName(stepname);&lt;br /&gt;steps.add(stepConfig);&lt;br /&gt;&lt;br /&gt;request.setSteps(steps);&lt;br /&gt;&lt;br /&gt;invokeRunJobFlow(service, request);&lt;br /&gt;&lt;br /&gt;}&lt;span style="font-size:85%;"&gt;&lt;span style="font-family:courier new;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-size:130%;" &gt;Adding a Job Flow Step to existing Job Flow&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The above Job Flow will not shut down once it finishes executing its Job Flow Step; the following line is responsible for that.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:85%;"&gt;&lt;span style="font-family:courier new;"&gt;conf.setKeepJobFlowAliveWhenNoSteps(true);&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Now we will attempt to add another Job Flow Step to the Job Flow we started earlier. This will be another &lt;span style="font-style: italic;"&gt;Single MapReduce Run &lt;/span&gt;because Job Flow Steps correspond to &lt;span style="font-style: italic;"&gt;Single MapReduce Run&lt;/span&gt;s. Have a look at the AddJobFlowStepsSample.java class, which is used to add a Job Flow Step to an already running Job Flow. The following is the implemented main method of that class. In it you will have to set the jobflowID and the jobFlowName, apart from the credentials; they identify the already running Job Flow. Once you run it you can again go to the Amazon Management Console and monitor the progress. 
The implemented class can be found &lt;a href="http://www.cs.indiana.edu/%7Echerath/java/AddJobFlowStepsSample.java"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;This is useful for reusing already allocated and booted up EC2 resources and launching multiple &lt;span style="font-style: italic;"&gt;Single MapReduce Run&lt;/span&gt;s one after another without incurring the setup cost each time. So if you want to word count a second data file, or, if you are a student in my class, render a second scene, you can use this client to add a new Job Flow Step without having to set up all the machines again.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:85%;"&gt;public static void main(String... args) {&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;//Set these values&lt;br /&gt;String accessKeyId = "FHJKDFIHKJDF";&lt;br /&gt;String secretAccessKey = "DFJLDFODF/YES!/YOU/GUESSED/IT/NO/THIS/IS/NOT/MY/ACCESS/KEY/NEITHER";&lt;br /&gt;String jobflowID = "j-6XL4RL7E5A2";&lt;br /&gt;String jobFlowName = "Class_job_flowSat_Mar_27_23_12_16_EDT_2010";&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;AmazonElasticMapReduce service = new AmazonElasticMapReduceClient(&lt;br /&gt;accessKeyId, secretAccessKey);&lt;br /&gt;AddJobFlowStepsRequest request = new AddJobFlowStepsRequest();&lt;br /&gt;String stepName = "Step" + System.currentTimeMillis();&lt;br /&gt;System.err.println(stepName);&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;request.setJobFlowId(jobflowID);&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;List steps = new LinkedList();&lt;br /&gt;StepConfig stepConfig = new StepConfig();&lt;br /&gt;stepConfig.setActionOnFailure("CANCEL_AND_WAIT");&lt;br /&gt;HadoopJarStepConfig jarsetup = new HadoopJarStepConfig();&lt;br /&gt;List arguments = new LinkedList();&lt;br /&gt;arguments.add("s3n://b534/inputs/");&lt;br /&gt;arguments.add("s3n://b534/outputs/"+jobFlowName +"/"+stepName+"/");&lt;br /&gt;jarsetup.setArgs(arguments);&lt;br 
/&gt;jarsetup.setJar("s3n://b534/Hadoopv400.jar");&lt;br /&gt;jarsetup.setMainClass("edu.indiana.extreme.HadoopRayTracer");&lt;br /&gt;stepConfig.setHadoopJarStep(jarsetup);&lt;br /&gt;&lt;br /&gt;stepConfig.setName(stepName);&lt;br /&gt;steps.add(stepConfig);&lt;br /&gt;&lt;br /&gt;request.setSteps(steps);&lt;br /&gt;&lt;br /&gt;invokeAddJobFlowSteps(service, request);&lt;br /&gt;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-2141175495257937072?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/2141175495257937072/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=2141175495257937072' title='24 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/2141175495257937072'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/2141175495257937072'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2010/03/programmatically-launch-elastic-map.html' title=''/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>24</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-7213936499427240352</id><published>2009-02-28T16:41:00.000-08:00</published><updated>2009-02-28T17:20:58.839-08:00</updated><title type='text'></title><content type='html'>&lt;span style="font-weight: 
bold;"&gt;Java method-level generics - Getting the First Element from an Iterator&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;There are plenty of cases where you are given an iterator but are only interested in the first element of the collection. While reading some code that did WSDL parsing, I ran into a pattern that most probably annoyed the programmer: the same loop had to be written again for each Iterator type, just to iterate and extract the first element. One solution, besides writing a helper for each case, is to drop the typing altogether and write a non-generic helper method.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;public static Object getFirst(Iterable itr){&lt;br /&gt;    for (Object object : itr) {&lt;br /&gt;         return object;&lt;br /&gt;    }&lt;br /&gt;    throw new RuntimeException("Iterator empty");&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Obviously this is not why I wrote this post. I wrote the following code, which is type safe and gives exactly what I want. It makes use of Java method-level generics. 
This is type safe because the type parameter T is inferred at each call site, so the return value is typed according to the caller's argument and no cast is necessary.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;public static &amp;lt;T extends Object&amp;gt; T getfirst(Iterable&amp;lt;T&amp;gt; vals) {&lt;br /&gt;    for (T val: vals) {&lt;br /&gt;         return val;&lt;br /&gt;    }&lt;br /&gt;    throw new RuntimeException("Iterator empty");&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-7213936499427240352?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/7213936499427240352/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=7213936499427240352' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/7213936499427240352'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/7213936499427240352'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2009/02/java-method-level-generics-getting.html' title=''/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-4129233559646469951</id><published>2009-02-03T14:54:00.001-08:00</published><updated>2009-02-03T15:18:56.523-08:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='web service'/><category scheme='http://www.blogger.com/atom/ns#' term='client'/><title type='text'></title><content type='html'>&lt;span style="font-weight: bold;"&gt;Generic Online web service client&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;As I've mentioned many times before, Eclipse WTP is cool for many reasons, one of them being the ability to generate a web service client to test a service in an instant.&lt;br /&gt;&lt;br /&gt;Lately I came across an online tool for making web service calls; it is pretty useful for testing purposes and does save you time. It takes the WSDL (as an HTTP URL or otherwise) and generates a web client with text boxes for the simple types that make up each complex type.&lt;br /&gt;&lt;br /&gt;One client can be found at&lt;a href="http://www.service-repository.com"&gt; service-repository&lt;/a&gt; and another at&lt;a href="http://www.soapclient.com/soaptest.html"&gt; soap-client&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-4129233559646469951?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/4129233559646469951/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=4129233559646469951' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/4129233559646469951'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/4129233559646469951'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2009/02/generic-online-web-service-client-as-i.html' title=''/><author><name>Chathura 
Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-564621927958929935</id><published>2008-06-17T20:50:00.000-07:00</published><updated>2008-06-17T20:57:13.569-07:00</updated><title type='text'></title><content type='html'>&lt;span style="font-weight: bold;"&gt;What's Up Vegas!!&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Last week I attended the &lt;a href="http://www.tacc.utexas.edu/tg08/"&gt;TeraGrid conference&lt;/a&gt; in Las Vegas, where I presented a &lt;a href="http://www.tacc.utexas.edu/tg08/index.php?abstractID=36&amp;amp;m_b_c=paperSchedule"&gt;paper&lt;/a&gt;.&lt;br /&gt;Besides the conference proceedings, the gambling and the other Vegas activities were pretty interesting. The best attraction for me was the Bellagio Fountains, which were amazing; a video of them can be found &lt;a href="http://youtube.com/watch?v=cP0K6H2QK7A"&gt;here&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-564621927958929935?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/564621927958929935/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=564621927958929935' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/564621927958929935'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/564621927958929935'/><link rel='alternate' type='text/html' 
href='http://chathurah.blogspot.com/2008/06/whats-up-vegas-last-week-i-attended.html' title=''/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-1539794292680924332</id><published>2008-06-05T08:51:00.000-07:00</published><updated>2008-06-05T08:56:46.309-07:00</updated><title type='text'></title><content type='html'>&lt;span style="font-weight:bold;"&gt;Deepal's Axis2 Book&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://blogs.deepal.org/"&gt;Deepal Jayasinghe&lt;/a&gt; has published a &lt;a href="http://www.packtpub.com/creating-web-services-with-apache-axis-2/book"&gt;book on Axis2&lt;/a&gt;, and I believe it's the first book published by one of my peers from the &lt;a href="http://www.cse.mrt.ac.lk/"&gt;University of Moratuwa&lt;/a&gt;. 
Job well done!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-1539794292680924332?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/1539794292680924332/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=1539794292680924332' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/1539794292680924332'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/1539794292680924332'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2008/06/deepals-axis2-book-deepal-jayasinghe.html' title=''/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-4111352277763553257</id><published>2008-06-03T22:25:00.000-07:00</published><updated>2008-06-03T22:37:32.010-07:00</updated><title type='text'></title><content type='html'>&lt;span style="font-weight:bold;"&gt;Proud of my school&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://www.cse.mrt.ac.lk/"&gt;Department of Computer Science&lt;/a&gt;, &lt;a href="http://www.mrt.ac.lk/"&gt;University of Moratuwa&lt;/a&gt;, the school where I did my undergraduate studies and a place I revere, recently became the highest-ranking school in &lt;a href="http://google-opensource.blogspot.com/2008/05/this-weeks-top-10s-universities-for.html"&gt;Google Summer of Code acceptance&lt;/a&gt;. 
Well done.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-4111352277763553257?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/4111352277763553257/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=4111352277763553257' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/4111352277763553257'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/4111352277763553257'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2008/06/proud-of-my-school-department-of.html' title=''/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-6776287348145569049</id><published>2008-05-06T16:31:00.000-07:00</published><updated>2008-05-06T17:25:29.684-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='firefox'/><category scheme='http://www.blogger.com/atom/ns#' term='IE'/><category scheme='http://www.blogger.com/atom/ns#' term='market share'/><title type='text'></title><content type='html'>&lt;span style="font-size:130%;"&gt;Browser market share.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;I was curious how &lt;a href="http://www.mozilla.com/firefox/"&gt;Firefox&lt;/a&gt; was doing in its quest to "take back the web", and I came across this interesting article. 
It seems Firefox is eating away at the share of &lt;a href="http://www.microsoft.com/windows/products/winfamily/ie/default.mspx"&gt;IE&lt;/a&gt;, and the numbers seem to add up. I think it will continue to grow: from my personal experience, I have introduced Firefox to many people, and none seem to have gone back to IE. The original report can be found &lt;a href="http://marketshare.hitslink.com/report.aspx?qprid=0"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;Browser&lt;/td&gt;&lt;td&gt;2007&lt;/td&gt;&lt;td&gt;2008&lt;/td&gt;&lt;td&gt;Change&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;IE&lt;/td&gt;&lt;td&gt;79.1%&lt;/td&gt;&lt;td&gt;74.8%&lt;/td&gt;&lt;td&gt;-4%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Firefox&lt;/td&gt;&lt;td&gt;14.6%&lt;/td&gt;&lt;td&gt;17.8%&lt;/td&gt;&lt;td&gt;+3%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Safari&lt;/td&gt;&lt;td&gt;4.5%&lt;/td&gt;&lt;td&gt;5.8%&lt;/td&gt;&lt;td&gt;+1%&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-6776287348145569049?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/6776287348145569049/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=6776287348145569049' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/6776287348145569049'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/6776287348145569049'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2008/05/browser-market-share.html' title=''/><author><name>Chathura 
Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-116904118209331346</id><published>2007-01-17T04:56:00.000-08:00</published><updated>2007-01-17T11:37:16.106-08:00</updated><title type='text'></title><content type='html'>&lt;span style="font-size:130%;"&gt;How to accommodate an Out-Only MEP service in Axis2&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;Axis2 users have asked me quite a few times about writing services with an Out-Only MEP using the Axis2 wsdl2java compiler, so I thought of writing this down for future use.&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;First of all, Axis2 does not have first-class support for the Out-Only MEP as of now, mainly because this MEP is not widely used. Further, the technique described below maps the Out-Only MEP onto the In-Only MEP by switching the roles of the client and the server.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;From a technical point of view there are other concerns. If an Out-Only service is sitting inside Tomcat, what would trigger it to send out its message? This trigger cannot be an incoming SOAP message to that service, because then the MEP of the service would change to In-Out. So we need some trigger interface to the service implementation to tell it 'OK service, now it's time to send out your message'. 
Say we do all that; what would be the gain? I would say not much. Even though the service resides inside Tomcat, it will not gain much from Tomcat's scalability, because when it sends out its message it uses Commons HTTP to initiate the communication, i.e. it does not use Tomcat's features. So we might as well keep it outside Tomcat, and the following is the way to do it.&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;&lt;span style="font-style: italic;font-size:130%;" &gt;Method&lt;/span&gt;&lt;br /&gt;Say you have an Out-Only operation named op1 (which does not have an input message reference).&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;The WSDL operation would look something like the following:&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/x/blogger/2511/2049/1600/224116/fig1.png"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://photos1.blogger.com/x/blogger/2511/2049/320/40157/fig1.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;&lt;br /&gt;We always refer to the MEP with respect to the server's message exchange.&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;Consider an In-Only MEP; its operation would look like the following.&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/x/blogger/2511/2049/1600/618710/fig2.png"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://photos1.blogger.com/x/blogger/2511/2049/320/643652/fig2.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;span 
style="font-size:100%;"&gt;&lt;br /&gt;&lt;br /&gt;Now compare the service of the Out-Only MEP against the client of the In-Only MEP. They are semantically the same: both send out only one message.&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;So if you have a WSDL with an Out-Only MEP like the following:&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/x/blogger/2511/2049/1600/552797/fig3.png"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://photos1.blogger.com/x/blogger/2511/2049/320/250136/fig3.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;br /&gt;Change it to the following (i.e. change the output message reference to an input message reference):&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/x/blogger/2511/2049/1600/579792/fig4.png"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://photos1.blogger.com/x/blogger/2511/2049/320/899846/fig4.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The result is a very familiar MEP for Axis2, the In-Only MEP. 
Using this changed WSDL, instead of generating code for the server we generate code for the client with the wsdl2java compiler.&lt;br /&gt;&lt;br /&gt;You will get a client that sends out messages to the endpoint you define whenever you call (trigger) it. Does that functionality sound familiar? It is the same as what an Out-Only service would do.&lt;br /&gt;&lt;br /&gt;The only flaw that I can think of is that there will be no endpoint where you could run a ?wsdl query and get the WSDL. That is something you would get had this service been deployed inside Tomcat. It's a compromise.&lt;br /&gt;&lt;/span&gt;&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-116904118209331346?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/116904118209331346/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=116904118209331346' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/116904118209331346'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/116904118209331346'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2007/01/how-to-accommodate-out-only-mep.html' title=''/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' 
src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-115508862305323962</id><published>2006-08-08T18:24:00.000-07:00</published><updated>2006-08-08T18:57:03.066-07:00</updated><title type='text'></title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/2511/2049/1600/classDiagram_doplerSource.jpg"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://photos1.blogger.com/blogger/2511/2049/320/classDiagram_doplerSource.jpg" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Eclipse again - UML!!&lt;/span&gt;&lt;br /&gt;One of the research groups asked for my opinion on a modeling tool before they jumped in and purchased Rational Rose. I already knew the Eclipse SOA guys had done significant work in this direction. &lt;a href="http://www.eclipse.org/emf/"&gt;EMF&lt;/a&gt; and &lt;a href="http://www.eclipse.org/uml2/"&gt;Eclipse UML2&lt;/a&gt; have seen quite a bit of development, but then I found this product, which has everything nicely packaged and, of course, is free. It's a product called &lt;a href="http://www.omondo.com/"&gt;Omondo&lt;/a&gt;, and it is worth downloading. The class diagram posted here was done using the tool, and it impressed me. 
It threw exceptions once in a while, but oh well, I should fix at least one bug before I start complaining.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-115508862305323962?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/115508862305323962/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=115508862305323962' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/115508862305323962'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/115508862305323962'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2006/08/eclipse-again-uml-i-was-asked-of-my.html' title=''/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-115457554247568376</id><published>2006-08-02T19:59:00.000-07:00</published><updated>2006-08-02T20:25:42.493-07:00</updated><title type='text'></title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/2511/2049/1600/niagara.jpg"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://photos1.blogger.com/blogger/2511/2049/320/niagara.jpg" alt="" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;span style="font-size:130%;"&gt;Niagara!!!&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;A few days ago I visited my Woden and WSDL Working Group 
colleagues at IBM Toronto, and Arthur Ryman was kind enough to arrange a visit to Niagara on the second day. It was a once-in-a-lifetime experience to witness such a massive volume of water (over half a million gallons per second) falling more than 150 feet. Next time I go, I will be sure to be ready to get wet. You don't need to take the boat ride to get wet; I got badly wet at the top of the falls from the rain-like mist generated by the falls. It is the experience of a lifetime.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-115457554247568376?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/115457554247568376/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=115457554247568376' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/115457554247568376'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/115457554247568376'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2006/08/niagara-few-days-ago-i-visited-my.html' title=''/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-115221492887534134</id><published>2006-07-06T12:31:00.000-07:00</published><updated>2006-07-06T12:42:08.886-07:00</updated><title type='text'></title><content type='html'>&lt;span style="font-size:130%;"&gt;Eclipse Web Tools Platform&lt;/span&gt;&lt;br 
/&gt;&lt;br /&gt;I met some of the IBM guys from Toronto, and they wanted me to look at this "other" Eclipse thing that they are developing. Of course it's open source; otherwise there would be no point talking about it. I downloaded it and set it up, some of the guys there showed me some of its useful functionality, and I was quite convinced. It has some cool XML tools that make the life of XML speakers like me a bit easier than before.&lt;br /&gt;Here is the location to &lt;a href="http://www.eclipse.org/webtools/"&gt;download &lt;/a&gt;it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-115221492887534134?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/115221492887534134/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=115221492887534134' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/115221492887534134'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/115221492887534134'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2006/07/eclipse-web-tool-platform-i-met-some.html' title=''/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-114223026438913844</id><published>2006-03-12T21:50:00.000-08:00</published><updated>2007-01-17T11:38:26.143-08:00</updated><title type='text'></title><content type='html'>&lt;span style="font-weight: 
bold;"&gt;&lt;span style="font-size:130%;"&gt;Numbers we use !!!!!!!!&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size:130%;"&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;&lt;br /&gt;After a mobile communications talk at IU that claimed Google indexes numbers, I wanted to check for myself. Google does indeed index numbers, and this led me to the observation that as numbers become larger their usage decreases superlinearly. Do a Google search for some random number like 43538; you will get around 200,000 hits. If you search for a random 10-digit number, most probably you will not get any hits. The interesting part is that for any random number shorter than 10 digits you will probably get hits. As the number becomes shorter (in terms of digits), there will be multiple hits for a given random number, and the hit count increases superlinearly as the number shrinks.&lt;br /&gt;Bottom line: the internet may serve as a sample of how people use numbers, and most of the time the numbers we use fall within the first 10 billion, as I observed. 
Further, we use big numbers less often as they get bigger, even within the above 10 billion range.&lt;br /&gt;This would be an interesting study for a web mining course.&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-weight: bold;"&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-114223026438913844?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/114223026438913844/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=114223026438913844' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/114223026438913844'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/114223026438913844'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2006/03/numbers-we-use-after-mobile.html' title=''/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-114202478517832948</id><published>2006-03-10T12:58:00.000-08:00</published><updated>2006-03-10T13:06:25.196-08:00</updated><title type='text'></title><content type='html'>&lt;span style="font-weight: bold;"&gt;WS-Messenger paper &lt;/span&gt;&lt;br /&gt;For the past few weeks I have been working with my colleague Yi Huang to improve his WS-Notification and WS-Eventing broker, WS-Messenger. This is a publish-subscribe framework developed at the Extreme Lab @ IU. 
A research paper about this work was published at the CCGrid conference. Yi did most of the work, so great job, Yi. The paper is available &lt;a href="http://www.extreme.indiana.edu/xgws/messenger/doc/HuangY-WSMessenger.pdf"&gt;here&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-114202478517832948?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/114202478517832948/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=114202478517832948' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/114202478517832948'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/114202478517832948'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2006/03/ws-messenger-paper-i-was-working-with.html' title=''/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20452128.post-113625923399166122</id><published>2006-01-02T18:41:00.000-08:00</published><updated>2006-01-02T19:33:54.000-08:00</updated><title type='text'></title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://photos1.blogger.com/blogger/2511/2049/1600/gt4-sub-time.0.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://photos1.blogger.com/blogger/2511/2049/400/gt4-sub-time.jpg" alt="" border="0" 
/&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt; &lt;p class="MsoNormal"&gt;GT4 - WS - Notification behavior under multiple subscriptions.&lt;br /&gt;&lt;br /&gt;I tried out the Globus Toolkit lates version (4.0, aka gt4) recently and my interest was with its WSRF and WS-Notification capability. With respect to WS-Notification GT4 container was quite capable of getting a heavy pounding with multiple clients and could in fact deliver notifications without loosing messages up to 90 subscriptions to a particular topic. Such a load was possible when the no two publishings were interleaved or in other word the container was given enough time to deliver to all the subscribers before publishing the next message.&lt;br /&gt;Above graph shows the notification delivery time as the number of subscribers to a given topic increases and the delivery time is linearly proportional to the number of subscribers.&lt;br /&gt;Bottom line from these results is, GT4 Notification can be used for applications that would use less than 100 subscribers for a given topic. This is not a bad outcome, after all many practical applications would fall below this limit.&lt;br /&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;The SOAP message in concern is a very light weight soap message which has only a simple xsd:string as a input parameter.&lt;/li&gt;&lt;li&gt;SOAP style used is Document and the Use is Literal.&lt;/li&gt;&lt;li&gt;Ran the test on a linux box and the web service calls did not involve network overhead. 
(everything ran on localhost)&lt;/li&gt;&lt;li&gt;Notification was done in the form of a value-changed event of a WS-Resource.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20452128-113625923399166122?l=chathurah.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://chathurah.blogspot.com/feeds/113625923399166122/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=20452128&amp;postID=113625923399166122' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/113625923399166122'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20452128/posts/default/113625923399166122'/><link rel='alternate' type='text/html' href='http://chathurah.blogspot.com/2006/01/gt4-ws-notification-behavior-under.html' title=''/><author><name>Chathura Herath</name><uri>http://www.blogger.com/profile/17002308263448601628</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='30' src='http://people.apache.org/~chathura/images/chathura.jpg'/></author><thr:total>1</thr:total></entry></feed>
