Top Data Mining Challenges / Research Problem Areas
The KDD 2006 panel produced a report on the grand challenges in the field of data mining. The report was published in the ACM SIGKDD Explorations Newsletter and is listed under References at the end of this post. It identifies the following areas as prominent ones for active research and development, and it makes a good repository for those looking for an idea to expand upon for their dissertation :)
 
  • Will you cheat for me please, my dear computer: A text-mining and understanding system that can use the web to pass standard tests, e.g. the SAT in World History.
  • Literature-based discovery of the side effects of drug X.
  • Nip in the bud: Fraud detection based on company financial statements. (Can we find another Enron before it collapses?)
  • Autonomous Tagging: Automatic tagging and classification of 1 billion digital photos on the web.
  • Social Networking 2.0: Mining user behaviors in interactions with multimedia data and using the knowledge extracted in this process to anticipate future behaviors or to diagnose medical or psychological conditions of the users. This generally falls under the area of crossing the semantic gap between multimedia data and semantics.
  • Where do I belong?: Link mining Challenge (extracting graphs describing entities and relationships from unstructured data)
  • Lots of Traffic!: Estimating a predictive model over a large dataset. The data comes from 833 traffic sensors in the Chicago metropolitan region, and the goal is to identify anomalous traffic patterns.
  • Gold in the Text: Entity extraction and autonomous text analysis from large-scale unstructured text repositories.
  • And of course the genetics side, mining the proteome: large-scale analysis of databases from sequencing projects, microarray studies, gene-function studies, protein-protein interactions, comparative genomics, structural biology, and open source journal articles.
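
To make the traffic challenge above a bit more concrete, here is a minimal sketch of one simple way to flag anomalous readings from a single sensor, using a modified z-score based on the median and the median absolute deviation (MAD). This is only an illustration and not the method from the panel report: the function name `flag_anomalies`, the threshold of 3.5, and the sample counts are all my own assumptions.

```python
from statistics import median


def flag_anomalies(readings, threshold=3.5):
    """Return indices of readings whose modified z-score exceeds threshold.

    Hypothetical helper for illustration; not taken from the panel report.
    """
    med = median(readings)
    abs_dev = [abs(x - med) for x in readings]
    mad = median(abs_dev)
    if mad == 0:
        # No spread around the median, so no reading can be scored.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Made-up vehicle counts per interval from one sensor (illustrative data only).
counts = [52, 48, 50, 51, 49, 47, 53, 5, 50, 52]
print(flag_anomalies(counts))  # prints [7]
```

The median and MAD are used instead of the mean and standard deviation because a single extreme reading distorts them far less. A real system covering hundreds of sensors would also have to account for time of day, day of week, and correlations between neighboring sensors.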

Other areas of research interest mentioned in the data mining literature include:

  • Parallelization of data mining algorithms.
  • Designing and developing scalable algorithms to operate on massive data sets.
  • Distributed data mining across multiple topologies (local data, distributed app and so on …)
  • Standardizing the languages, underlying protocols, and application-level integration for data mining and predictive modeling.
  • Systems that preserve privacy and security in data mining.
  • Visualization of large datasets; mapping their corresponding associations, hierarchies and underlying patterns.
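
As a rough illustration of the parallelization item above, the sketch below counts item frequencies (the first pass of frequent-itemset mining) by splitting the transactions into partitions, counting each partition independently, and merging the partial counts. The function names and the tiny dataset are hypothetical. I use `ThreadPoolExecutor` only to show the partition-and-merge pattern; for CPU-bound work in CPython you would more likely swap in a process pool or a cluster framework.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_items(partition):
    """Count, per item, how many transactions in this partition contain it."""
    counts = Counter()
    for transaction in partition:
        # set() ensures an item is counted once per transaction.
        counts.update(set(transaction))
    return counts


def parallel_item_counts(partitions, max_workers=4):
    """Count each partition concurrently, then merge the partial counts."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(count_items, partitions))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Made-up transactions, already split into two partitions.
partitions = [
    [["milk", "bread"], ["bread", "eggs"]],
    [["milk", "eggs", "bread"], ["milk"]],
]
totals = parallel_item_counts(partitions)
print(dict(totals))
```

The merge step works because counts are additive across partitions, which is what makes this kind of task easy to parallelize. Algorithms whose intermediate results cannot simply be summed need more care.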
References

What Are The Grand Challenges for Data Mining? KDD-2006 Panel Report





9/15/2007 3:59:48 PM (Pacific Standard Time, UTC-08:00)

 

All content © 2008, Adnan Masood