Is AndersH really losing any sleep?
"From where I sit, Ruby has the language thought leadership position and is the competitor I hope AndersH is losing the most sleep over nowadays"

Don Box - Gosling on Ruby

This month's DDJ issue shares a similar thought in its article "Ruby On Rails", subtitled "It makes development fast, agile manageable".

I'm not sure I'm qualified to do a comparative analysis between C# 3.0 and Ruby, but with LINQ, DLINQ and XLINQ on the horizon, I certainly admire the advances in this version of C#. Being able to do the following is awesome, and it's just scratching the surface.

public void Linq1() {
    int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };

    var lowNums =
        from n in numbers
        where n < 5
        select n;

    Console.WriteLine("Numbers < 5:");
    foreach (var x in lowNums) {
        Console.WriteLine(x);
    }
}

Result
Numbers < 5:
4
1
3
2
0
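
For comparison, the compiler turns a query expression like the one above into ordinary method calls, so the same filter can be written directly against `Where`. This is a minimal sketch, assuming the `System.Linq` namespace of the shipped C# 3.0 rather than the `System.Query` namespace of the May 2006 CTP; the `LowNumbersExample` class and `LowNumbers` method are names made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LowNumbersExample
{
    // Same filter as the query expression: keep values below the limit, in source order.
    public static IEnumerable<int> LowNumbers(IEnumerable<int> numbers, int limit)
    {
        return numbers.Where(n => n < limit);
    }

    public static void Main()
    {
        int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };

        Console.WriteLine("Numbers < 5:");
        foreach (int x in LowNumbers(numbers, 5))
        {
            Console.WriteLine(x);
        }
    }
}
```

It prints the same five numbers as the query-syntax version; which form reads better is mostly a matter of taste.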


Microsoft Visual Studio Code Name “Orcas” - LINQ CTP (May 2006)
The LINQ Project
LINQ on Channel9
C# 3.0 Language Specification

5/29/2006 2:31:46 AM (Pacific Standard Time, UTC-08:00)


Text To Phone Web Service
This method will call any phone number in the US/Canada and read the TextToSay to that phone number using the voice of Diane (voiceid: 0). PhoneNumberToDial must be filled in (it can be in any format as long as there are 10 digits).

I just called my cell and heard the bot read the message along with the IP. Pretty cool, eh?
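
The "any format as long as there are 10 digits" rule is easy to check on the caller's side before invoking the service. A minimal sketch, assuming nothing about the Cdyne API itself: `PhoneNumberFormat` and `TryNormalize` are hypothetical helpers, and the sample numbers are fictional.

```csharp
using System;
using System.Linq;

public static class PhoneNumberFormat
{
    // Hypothetical helper: keeps only the ASCII digits and requires exactly ten of them.
    public static bool TryNormalize(string input, out string digits)
    {
        digits = new string((input ?? string.Empty)
            .Where(c => c >= '0' && c <= '9')
            .ToArray());
        return digits.Length == 10;
    }

    public static void Main()
    {
        string[] samples = { "(555) 555-0142", "555.555.0142", "555-0142" };
        foreach (string sample in samples)
        {
            string digits;
            bool ok = TryNormalize(sample, out digits);
            Console.WriteLine("{0} -> {1}", sample, ok ? digits : "rejected");
        }
    }
}
```

The first two samples normalize to the same ten digits; the third has only seven and is rejected.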

By Cdyne

5/27/2006 11:59:34 PM (Pacific Standard Time, UTC-08:00)


Clever Quotes Revisited

"I want a baby. I want a baby which can pass the Turing Test so they believe it's real!"

-Jeff Bergman, contemplating getting a baby shower for himself


5/27/2006 9:38:44 PM (Pacific Standard Time, UTC-08:00)


Call for Papers - AI and Data Mining Conferences
Some interesting conferences are coming up; the CFPs follow.

IEEE INTELLIGENT SYSTEMS Special Issue on Argumentation Technology
Submission deadline: 2 Mar. 2007


A PDF version of this call is available here:

The theory of argumentation is a rich, interdisciplinary area of research lying across philosophy, communication studies, linguistics, and psychology. Over the last few years, formal models of argumentation have been gaining increasing importance in artificial intelligence, and have found a wide range of applications ranging from specifying semantics for logic programs, to natural language text generation, to supporting legal reasoning, to facilitating multiagent dialogue and negotiation.

Topics include, but are not limited to:

  • Applications of argumentation in complex-systems engineering and design
  • Argument-based decision support systems
  • Argument extraction from real data
  • Argumentation in computer-supported collaborative work
  • Argumentation in natural language processing
  • E-democracy and e-government applications
  • Educational applications of argumentation
  • Intelligent tools for argument construction and analysis
  • Legal applications
  • Medical applications
  • Practical argument-based multiagent systems
  • Semantic Web applications

 Important Dates

Submissions due for review: 2 Mar. 2007
Notification of acceptance: 27 Apr. 2007
Final version submitted: 22 June 2007
Issue publication: Sept./Oct. 2007



PADM'06 - IEEE International Workshop on Privacy Aspects of Data Mining
December 18, 2006

Privacy is essential for the provision of electronic and knowledge-based services in modern e-business, e-commerce, e-government, and e-health environments. Nowadays, service providers can easily track an individual’s actions, behaviors, and habits.  Given large data collections of person-specific information, providers can data mine to learn patterns, models, and trends that can be used to provide personalized services…. (read more)

Important dates
===============
o July 30, 2006: Submission deadline
o September 8, 2006: Author notification
o September 29, 2006: Submission of Camera-ready papers
o December 18, 2006: Workshop

Topics
================
The workshop will seek submissions that cover aspects of privacy protection solutions and threats as they pertain to various data mining endeavors. The following comprises a sample, but not complete, listing of topics:

  • Biomedical and healthcare data mining research privacy
  • Cryptographic tools for privacy preserving data mining
  • Inference and disclosure control for data mining
  • Learning algorithms for randomized/perturbed data
  • Legal and regulatory frameworks for data mining and privacy
  • Privacy and anonymity in e-commerce and user profiling
  • Privacy aspects of business processes and enterprise management
  • Privacy aspects of geographic, spatial, and temporal data
  • Privacy aspects of ubiquitous computing systems
  • Privacy enhancement technologies in web environments
  • Privacy policy infrastructure, enforcement, and analysis
  • Privacy preserving link and social network analysis
  • Privacy preserving applications for homeland security
  • Privacy preserving data integration
  • Privacy protection in fraud and identity theft prevention
  • Privacy threats due to data mining
  • Query systems and access control
  • Trust management for data mining

Workshop On Uncertainty And Fuzziness In Case-Based Reasoning
Taking Place On Sep 5, 2006, As Part Of ECCBR 2006 - 8TH EUROPEAN
CONFERENCE ON CASE-BASED REASONING, Turkey

The paper submission deadline of our workshop has been extended to
JUNE 11. For detailed information, please visit the workshop website at http://wwwiti.cs.uni-magdeburg.de/iti_dke/events/WS_UFCBR.html



5/27/2006 9:35:35 PM (Pacific Standard Time, UTC-08:00)


Antlr C# code generation using Visual Studio.NET
When I started using Antlr this semester, I had to go through a learning curve to use C# as Antlr's generated language.

The following is a step-by-step guide I came up with for anyone interested in using Antlr with C#/Visual Studio.NET.

Happy Antlr'ing.


Antlr_C_Sharp_code_generation_using_Visual_Studio.NET.pdf (102.82 KB)
5/25/2006 7:09:02 AM (Pacific Standard Time, UTC-08:00)


Clever Quotes
It's like an outlook with an MBA.
-From Linkedin website about LinkedIn Toolbar

So, LinkedIn is like myspace for Dilbert crowd, huh?
-Jeremy Myers

Yes, the Ping method in this API will be a two way Ping; Ping Request and then Ping Commit.
-David Gullett

I hit over 500m with my driver today, threw ball on the other side of the fence. Oh wait, ball might have actually come to our parking lot, did you see it?
-Jeff Bergman, bragging about his golf skills

Well, if Adnan is not going, I'll be the only one who doesn't know how to play golf!

-Joon-Shim Chua

There is one issues [sic] remaining.
-Antony Chhan

Poison cupcakes – mmmm :)
-Joanne Siegel

OMG  surprise its jelly filled!
-David Phan

Thank you for the cupcakes!  The one I had was very delicious (but don't tell my doctor.)
-Sharen Goodman


5/24/2006 10:22:31 PM (Pacific Standard Time, UTC-08:00)


Things I like About You - IE 7.0 beta

Besides the slick look and tabbed browsing, the two features I like best about IE 7 are its RSS feed handling and website thumbnails.

The clean and well-organized RSS aggregator is much better from a usability perspective than anything I've tried before. Import your OPML files and try it out for yourself.

IE7-RSS-Reader.jpg

The website thumbnail view shows you all the websites you have open, like a live version of print preview.

IE7-Thumbnails.jpg

And yes, it's ironic that www.ie7.com points to FF and that the clipboard exploit (Retrieve Clipboard Text To A Web Page With Javascript) is still there.

5/21/2006 5:02:01 PM (Pacific Standard Time, UTC-08:00)


Upgrade from D800 to D610

Thanks to my director, Andrea Kim, I've been able to upgrade from my rickety old Dell Latitude D800 to a flashy, mobile Latitude D610. With a 2GHz/1.32 Bus, 2GB RAM and an 80GB HDD, it's a decent developer machine. The only downside was abandoning the wide screen, so I made up for it by putting a 17-inch LCD next to it. With the Intel Centrino processor, the approximately 4-hour battery life was a big plus. The D810 was another possible choice, but it's much heavier and thicker than even the D800, so I ruled it out.

So far, I'm loving it!

PCMAG Review Dell Latitude D610
Dell Latitude D800 review by PC Magazine
Dell Latitude D810 review by PC Magazine

Dell-D800-to-D610.jpg


5/21/2006 4:22:37 PM (Pacific Standard Time, UTC-08:00)


Da Vinci Code Tour

I got a chance to read The Da Vinci Code in early 2004, long before the hype built up. Having also read Dan Brown's lesser-known books Angels & Demons, Digital Fortress and Deception Point, and knowing his writing style, I don't think the movie did the book much justice. It may have had some interesting artifact displays and flashbacks, but the movie failed to deliver what made The Da Vinci Code a definite page-turner. I still hold that Angels & Demons is much more interesting and better woven than The Da Vinci Code but, then again, whatever provokes the most controversy gets to be the best seller.

During my European backpacking tour in 2003 (France, Germany, Netherlands, Belgium), I took some pictures around the Louvre Museum, which is shown and much discussed in the book and the movie, especially the symmetrical pyramids, not knowing that one day they would be part of a best-selling book and movie.





 


Now also, compare the cover of ASP.NET Developer's Cookbook with the picture below. Do you notice some similarity?
ASP.NET Developers Cookbook -0672325241.jpg




5/21/2006 1:28:19 PM (Pacific Standard Time, UTC-08:00)


Grid Computing using Microsoft.NET Framework

Forbes Magazine ran a cover story in its January 2006 issue titled “A Super Computer for your living room”. It discusses the IBM Cell processor, powered by eight co-processors with their own memory, being used in multimedia.  This helps provide realistic imaging through real-time rendering; as the article states:

“Cell crunches through millions of lines of topographical and photographic data per second to paint topographically accurate, photo-quality pictures at a movie quality 30 frames per second. On a similar program a Pentium takes two minutes to sketch a single frame.” –Forbes, Jan 2006.

Besides the chip’s power, whenever it comes to large-scale parallel processing, Linux is usually the first name that comes to mind. Even though Microsoft is trying to catch up, it seems already a bit late in the OS/hardware game. However, while assisting with a friend’s Master's thesis on “Grid computing with XML web services”, I came across several decent Microsoft application frameworks and services that support grid computing. Grid computing is a form of distributed computing infrastructure “that provides the ability to perform higher throughput computing by taking advantage of many networked computers to model a virtual computer architecture that is able to distribute process execution across a parallel infrastructure.” (wikipedia). Microsoft’s early initiatives on grid computing are explained in this talk by Jim Gray, and the subsequent eScience workshops have added to the momentum. The list of services and frameworks comprises TerraService.net, which is a poster child for both eGovernment and .NET; SkyServer.sdss.org or SkyQuery.net; the clustered version of Windows; the Web Services Resource Framework (WSRF.NET Webcast: Grid Computing Using .NET); and, last but not least, Alchemi.

Alchemi is defined as “an open source software framework that allows you to painlessly aggregate the computing power of networked machines into a virtual supercomputer (computational grid) and to develop applications to run on the grid.”

The Distributed Fractal Generator is an interesting example and, with a little setup, programming for a grid can be as intuitive and simple as the following sample taken from Alchemi’s user guide.

    using System;
    // Alchemi namespaces, assumed here from the user guide sample
    using Alchemi.Core;
    using Alchemi.Core.Owner;

    class MultiplierApplication
    {
        static GApplication ga;

        [STAThread]
        static void Main(string[] args)
        {
            Console.WriteLine("[enter] to start grid application ...");
            Console.ReadLine();
            // create grid application
            ga = new GApplication("localhost", 9099);
            // add GridThread module (this executable) as a dependency
            ga.Manifest.Add(new ModuleDependency(typeof(MultiplierThread).Module));
            // create and add 10 threads to the application
            for (int i = 0; i < 10; i++)
            {
                // create thread
                MultiplierThread thread = new MultiplierThread(i, i + 1);
                // set the thread finish callback method
                thread.FinishCallback = new GThreadFinish(ThreadFinished);
                // add thread to application
                ga.Threads.Add(thread);
            }
            // set the application finish callback method
            ga.FinishCallback = new GApplicationFinish(ApplicationFinished);
            // start application
            ga.Start();
            // keep the console open while the grid threads run
            Console.ReadLine();
        }

        // ThreadFinished, ApplicationFinished and the MultiplierThread class
        // are part of the user guide sample and are not reproduced here.
    }
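
To try the same fan-out-and-collect shape without a grid installation, here is a rough local analogue built on .NET tasks. It is only a sketch and does not use Alchemi at all: `LocalMultiplier` and `MultiplyPairs` are made-up names, and it assumes that each `MultiplierThread(i, i + 1)` simply multiplies its two arguments.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class LocalMultiplier
{
    // Local stand-in for the grid threads: one task per (i, i + 1) pair.
    public static int[] MultiplyPairs(int count)
    {
        Task<int>[] tasks = Enumerable.Range(0, count)
            .Select(i => Task.Run(() => i * (i + 1)))
            .ToArray();

        // Blocks until every task has finished; results come back in index order.
        return Task.WhenAll(tasks).Result;
    }

    public static void Main()
    {
        int[] results = MultiplyPairs(10);
        for (int i = 0; i < results.Length; i++)
        {
            Console.WriteLine("task # {0} finished with result '{1}'", i, results[i]);
        }
        Console.WriteLine("application finished");
    }
}
```

The difference that matters is where the work runs: here the tasks share one machine's thread pool, whereas a grid framework ships each unit of work to another node, which is why the Alchemi sample has to register the module as a dependency first.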

5/14/2006 6:23:38 AM (Pacific Standard Time, UTC-08:00)


Google and CAPTCHA Pages
With Firefox and the Google personalized homepage as my defaults, this morning I encountered a CAPTCHA challenge-response upon opening the web browser. A page with a CAPTCHA stated:

“A computer virus or spyware application is sending us automated requests, and it appears that your computer or network has been infected.”

It was so un-Google-like that I had to double-check the URL to make sure it wasn’t a phishing attempt; it wasn’t.

It seemed to be one of the following:

  • A DDoS prevention scheme for the heavily loaded, process- and memory-intensive personalized pages.
  • Automatic blacklisting of an originating IP that sends a large number of requests (the same outgoing IP shared by a large number of users).
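
The second guess can be sketched as a per-IP sliding-window counter. This is purely an illustration of the idea and not how Google does it: `RequestThrottle`, its threshold and its window are all assumptions, and the address in the demo comes from a documentation-only range.

```csharp
using System;
using System.Collections.Generic;

public class RequestThrottle
{
    private readonly int _maxRequests;
    private readonly TimeSpan _window;
    private readonly Dictionary<string, Queue<DateTime>> _hits =
        new Dictionary<string, Queue<DateTime>>();

    public RequestThrottle(int maxRequests, TimeSpan window)
    {
        _maxRequests = maxRequests;
        _window = window;
    }

    // Records one request and returns true when the IP should get a challenge.
    public bool ShouldChallenge(string ip, DateTime now)
    {
        Queue<DateTime> queue;
        if (!_hits.TryGetValue(ip, out queue))
        {
            queue = new Queue<DateTime>();
            _hits[ip] = queue;
        }

        // Drop timestamps that have fallen out of the window.
        while (queue.Count > 0 && now - queue.Peek() >= _window)
        {
            queue.Dequeue();
        }

        queue.Enqueue(now);
        return queue.Count > _maxRequests;
    }

    public static void Main()
    {
        var throttle = new RequestThrottle(3, TimeSpan.FromSeconds(10));
        DateTime start = new DateTime(2006, 5, 10, 8, 0, 0);
        for (int second = 0; second < 5; second++)
        {
            bool challenge = throttle.ShouldChallenge("203.0.113.7", start.AddSeconds(second));
            Console.WriteLine("request at +{0}s: {1}", second, challenge ? "challenge" : "ok");
        }
    }
}
```

A scheme like this cannot tell one noisy client from hundreds of quiet users behind the same proxy, which would explain seeing the page on a shared corporate network.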

Given the intelligent Bayesian filtering and intrusion detection algorithms Google has in place, a CAPTCHA looked a bit out of place to me, or maybe it was just me.

References

CAPTCHA is the squiggly characters / security code one encounters during signups and other online activities to keep bots away. It is an acronym for "completely automated public Turing test to tell computers and humans apart" and is a type of challenge-response test used in computing to determine whether or not the user is human.

Spy Blog: Stupid Google virus/spyware captcha page
15 Seconds : Fighting Spambots with .NET and AI
Adnan Masood


5/10/2006 5:47:05 PM (Pacific Standard Time, UTC-08:00)


Firefox for Visual Studio.NET Debugging

Thanks to Peter's Gekko for the tip on Using Firefox (and IE) to debug your apps in Visual Studio. It worked like a charm on the first try; however, after I installed IE 7.0 beta, VS.NET started acting up. The debug process kept detaching from the browser, and resetting the project's debug options to IE fixed it.

I'm not really sure what could be causing it, but it's a heads-up for FF zealots.


5/6/2006 9:10:03 PM (Pacific Standard Time, UTC-08:00)


Data Mining vs. Text Mining for Business Applications

I think I’m hung up on semantics again. In this era of connected systems, businesses rely heavily on knowledge management to ‘know thy visitor’. All the consumer-specific data they can get their hands on, and all the customer trends that could be derived from it, are deemed an asset, or at times simply a given. The knowledge management process comprises several activities including, but not limited to, summarization, filtering, visualization, searching, categorization, mining and extraction, and clustering. In this arena of digging up clues, text mining is the creation, discovery, derivation or deduction of new and previously unknown patterns from text documents.

As the Hearst paper puts it, “Another way to view text data mining is as a process of exploratory data analysis that leads to heretofore unknown information, or to answers for questions for which the answer is not currently known” (Hearst, 1999). Data and text mining are crucial to business process orchestration today. The process is variously described as unsupervised learning, lexical analysis, information extraction, live classification, self annotation, hierarchical text classification, semantic web, etc. From an executive point of view, it’s merely CRM. However, the core idea gets mixed up in the plethora of buzzwords. What is the difference between text and data mining, and where should the line be drawn? Wikipedia defines text mining as follows:

“Text mining is a young interdisciplinary field which draws on information retrieval, data mining, machine learning, statistics and computational linguistics”.

The semantic problem arises when data mining and text mining are used interchangeably, which is a fallacy. With applications like Riya and multimedia content filtering going mainstream (read TiVo), trend analysis is not text-bound anymore. Re-routing your help desk ticket to the right correspondent using Bayesian inference is one thing, but if you are matching up your interactive voice response (IVR) logs with customers’ demographics, banner clicks and web hits to evaluate the business requirements, it’s beyond mere text mining. Thuraisingham (1999) makes the point: “The difference between text mining and information retrieval is analogous to the difference between data mining and database management”. Also, the idea of intelligent text mining vs. standard text mining mirrors the distinction between mere statistical clustering and the application of heuristics or specialized learning algorithms to text streams.
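
To make the help desk example concrete, here is a minimal sketch of ticket routing with a multinomial naive Bayes classifier and add-one smoothing, which is one simple form of Bayesian inference over text. Everything in it is an assumption made for illustration: the `TicketRouter` class, the queue names and the four training tickets are invented, and a real system would need far more data and better tokenization.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TicketRouter
{
    private readonly Dictionary<string, Dictionary<string, int>> _wordCounts =
        new Dictionary<string, Dictionary<string, int>>();
    private readonly Dictionary<string, int> _ticketCounts = new Dictionary<string, int>();
    private readonly HashSet<string> _vocabulary = new HashSet<string>();

    private static string[] Tokenize(string text)
    {
        return text.ToLowerInvariant()
            .Split(new[] { ' ', ',', '.', '!', '?' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Adds one labelled ticket to the per-queue word counts.
    public void Train(string queue, string text)
    {
        if (!_wordCounts.ContainsKey(queue))
        {
            _wordCounts[queue] = new Dictionary<string, int>();
            _ticketCounts[queue] = 0;
        }
        _ticketCounts[queue]++;
        foreach (string word in Tokenize(text))
        {
            int count;
            _wordCounts[queue].TryGetValue(word, out count);
            _wordCounts[queue][word] = count + 1;
            _vocabulary.Add(word);
        }
    }

    // Returns the queue with the highest posterior log-probability, or null if untrained.
    public string Route(string text)
    {
        string[] words = Tokenize(text);
        int totalTickets = _ticketCounts.Values.Sum();
        string best = null;
        double bestScore = double.NegativeInfinity;
        foreach (string queue in _ticketCounts.Keys)
        {
            int wordsInQueue = _wordCounts[queue].Values.Sum();
            // Log prior: the share of training tickets that went to this queue.
            double score = Math.Log((double)_ticketCounts[queue] / totalTickets);
            foreach (string word in words)
            {
                int count;
                _wordCounts[queue].TryGetValue(word, out count);
                // Add-one (Laplace) smoothing so unseen words do not zero out the score.
                score += Math.Log((count + 1.0) / (wordsInQueue + _vocabulary.Count));
            }
            if (score > bestScore)
            {
                bestScore = score;
                best = queue;
            }
        }
        return best;
    }

    public static void Main()
    {
        var router = new TicketRouter();
        router.Train("billing", "refund charged twice on my invoice");
        router.Train("billing", "invoice shows wrong amount");
        router.Train("network", "cannot connect to vpn");
        router.Train("network", "wifi connection keeps dropping");

        Console.WriteLine(router.Route("charged the wrong amount on invoice"));
        Console.WriteLine(router.Route("vpn connection dropping"));
    }
}
```

With this toy data the first ticket routes to billing and the second to network. Joining IVR logs, demographics and click data, as described above, needs more than a bag-of-words model like this one.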

There are several conferences coming up dedicated to business applications of text mining, for instance the 2nd Annual Text Analytics Summit, 22-23 June 2006, in Boston, Massachusetts, which has several interesting tracks. Those that seem particularly interesting to me include:

  • Understand, predict and act by Olivier Jouve, VP Text mining, SPSS
  • Enabling your enterprise with Text Analytics – A financial perspective by John Anthony – Director, P&C Innovation Lab The Hartford Financial Services Group, Inc
  • How HP perform needs-based customer segmentation using text mining by Randy Collica – Sr. Business Analyst Hewlett Packard
  • Methodology for Defining Text Enabled Business Intelligence Applications by Jay Henderson - Director of Product Marketing, ClearForest
  • High performance text analysis architectures & applications, Ramana Rao – CTO, Inxight
  • Visualising textual data by Bill Inmon – CEO Inmon Data Systems

Categorization, structuring and cleanup of text are discussed in much detail in both (Hearst, 1999) and (Jan H. Kroeze et al, 2003), and there are counter-opinions as well: “It is a fallacy that text data are unstructured” (Nasukawa et al, 2001). Hence this discussion will go on in both camps but, IMHO, Nasukawa derives his point from Google PageRank.

Further Reading

  • Differentiating data- and text-mining terminology. Jan H. Kroeze, Machdel C. Matthee, Theo J. D. Bothma. Proceedings of the 2003 annual research conference of the South African institute of computer scientists and information technologists on Enablement through technology (SAICSIT '03)
  • Untangling Text Data Mining – (Hearst, 99) ACM
  • Data Mining 2005 Sixth International Conference on Data Mining, Text Mining and their Business Applications
  • Business Intelligence Text Mining
  • UT ML Group: Text Data Mining
  • Experimental study of discovering essential information from customer inquiry – ACM
  • Mining concept associations for knowledge discovery in large textual databases – ACM
  • Generating association graphs of non-cooccurring text objects using transitive methods – ACM
  • Unsupervised Learning of Soft Patterns for Generating Definitions from Online News - Cui, H., Kan, M-Y. and Chua, T-S
  • Information Retrieval and Text Mining: A domain independent environment for creating information extraction modules – ACM
  • Text mining as integration of several related research areas: report on KDD's workshop on text mining 2000
  • Evaluating the novelty of text-mined rules using lexical knowledge
  • Artificial intelligence #2: Topic-based clustering of news articles


5/4/2006 5:54:20 PM (Pacific Standard Time, UTC-08:00)


DARPA's Urban Challenge 2007
After a huge leap forward in robotics R&D (see my 10/05 post, Autonomous land vehicles - drivers not required), DARPA has again raised the bar. It is now entering the realm of “the real urban driving challenge”, and the deadline is 3rd Nov 2007. Boundary conditions aside, how will the machines do? How well will artificial neural network training hold up against a true urban environment? We will find out. I can’t wait to see an autonomous vehicle running a light (or not). Kudos to the scientists and developers working on it.

This reminds me: one of the objections to Alan Turing’s 1950 artificial intelligence paper was that a computer will NEVER be able to “drive in the center of Cairo”; that day is not too far off!

5/4/2006 7:22:23 AM (Pacific Standard Time, UTC-08:00)


HCI / Usability Resources Linkopoloza
Courtesy: All the Nova Class fellows!


5/1/2006 8:01:52 PM (Pacific Standard Time, UTC-08:00)


Book Review: Don't Make Me Think by Steve Krug

Last term, during a graduate course on HCI (human-computer interaction), my professor, Dr. Maxine Cohen, a SUNY Ph.D. in Systems Science, created a WebCT forum thread called “HCI Resources”. Students were to post entries ranging from web-based HCI resources to newspaper articles and books they had found useful. Krug’s “Don’t Make Me Think”, subtitled “A common sense approach to web usability”, made quite a mark among the listings with multiple mentions and was highlighted in several other assignment posts. That is when I started reading it and, despite the common belief that technical books are selective reads, “Don’t Make Me Think” is an addictive page-turner in its own right.

DontMakeMeThink.jpg

Krug’s book is generally based on the KIS (keep it simple) principle and is an easy read. Like its topic, the book is aesthetically well designed and organized into distinct chapters addressing different topics of web usability and HCI in general.  Chapter titles are not your usual everyday headlines but rather Daily Onion-style ones depicting the theme discussed in the chapter. Krug’s laws of usability may seem like common sense to most of us, but you’d be amazed to see how many websites around us don’t follow these simple guidelines to enhance the user experience.

This two-hundred-page book is divided into eleven chapters and definitely deserves to be called “an illustrated guide on making your web presence meaningful!” Steve Krug has worked hard to provide concrete details and no-fluff advice on all things web usability. With gentle wit and humor, he emphasizes designing web pages for scanning instead of reading, presenting simple, mindless choices to the user, providing short, meaningful text, and recognizing the business requirement for frequent changes. Along with pertinent illustrations, the author has provided the reader in