Call for Papers: Special Issue on "Smart Data: Where the Big Data Meets the Semantics", Computational Intelligence and Neuroscience (Impact Factor 0.596). Deadline: 11 March 2016.


Welcome to Smart Data Lab (SDL)!

Big data technology is designed to handle the three Vs of big data: volume (massive amounts of data), velocity (the speed of data moving in and out), and variety (the range of data types and sources). Big data is often captured without a specific purpose, so most of it is irrelevant to any given task. The most important feature of data is neither volume nor the other Vs, but value. If big data is the technological foundation for data-driven business decision-making, Smart Data is an organized way of semantically bringing together, correlating, and analyzing different data sources so that the result is accurate, actionable, and agile enough to feed smarter decision-making.

Smart Data addresses the three V challenges by combining Semantics and Big Data so that the full value can be extracted from the data. For volume, Semantics technology helps convert massive amounts of data into abstractions, meaning, and insight useful for human decision-making. Neural network algorithms can learn from the whole data set instead of samples of it. In particular, they can take advantage of massively parallel (brain-like) computation on very simple processors, which other machine learning technologies cannot use. For variety, the well-defined form of ontologies, together with natural language processing, facilitates integration. For velocity, ontology evolution techniques support dynamically, flexibly, and adaptively creating models of new objects, concepts, and relationships, and using them to better understand new cues in the data that capture rapidly evolving events and situations.

In addition, Semantics and Neuroscience are applied in intelligent analytics to find actionable insight. Smart Data bridges a gap by facilitating information extraction and insight discovery, and it can help people make smarter decisions.

In our lab we usually concentrate on the following subjects:

  • Big Data
  • Semantics
  • Smart Data
  • Intelligent e-Commerce Systems
  • Business Intelligence
  • Smart Search Engines
  • Social Network Analysis


Meet our team

  • All
  • Master Student
  • Ph.D Student
  • Alumni (Master)
  • Alumni (Ph.D)

What We Do

Diminer Data Intelligence Platform for All

The Diminer platform uncovers critical insights from enterprise and machine data. It quickly harnesses big data to support decisions, and it provides easily readable data visualizations, analytical charts, and push messages. The platform helps business owners make the right, timely, predictive decisions that save time, money, and lives.

WISDOM Collaborative e-Learning

This project proposes an intelligent e-learning system (IES) that personalizes recommendations of learning scenarios, chosen for each learner from a dynamic scenario repository. Different learners may be recommended different learning scenarios for the same subject or program, based on their preferences, learning styles, personal features, interests, and knowledge state. The main purpose of the system is to teach effectively by providing an optimal learning path at each step of the educational process.

Collaborative Vietnamese WordNet Building

The Ontology-based Vietnamese WordNet (OVW) plays an extremely important role in most areas related to Vietnamese language processing. In this project, we introduce some structural changes to enrich the structure of the Ontology-based WordNet and use it to develop the OVW. We propose a consensus-based collaboration method with reliability measurement for collaborative OVW building.

ESCATER Smart Interactive Search

Suppose we look for the apple fruit by entering the word "apple" into popular search engines such as Google, Yahoo!, and Bing. Pages that mention the fruit are not easy to find; most of the top results are about Apple Inc., an American multinational corporation.


  • ISI Articles

    1. Trong Hai Duong, Geun Sik Jo, Jung J.J., Nguyen N.T. Complexity Analysis of Ontology Integration Methodologies: A Comparative Study. Journal of Universal Computer Science 15(4), pp. 877-897. 2009.

    2. Trong Hai Duong, Nguyen N.T., Jo G.S. A Hybrid Method for Integrating Multiple Ontologies. Cybernetics and Systems 40(2), pp. 123-145. 2009.

    3. Trong Hai Duong, Nguyen N.T., Jo G.S. Constructing and Mining: A Semantic-Based Academic Social Network. Journal of Intelligent & Fuzzy Systems 21(3), pp. 197-207. 2010.

    4. Trong Hai Duong, Jo G.S. Collaborative Ontology Building by Reaching Consensus among Participants. Information-An International Interdisciplinary Journal, pp. 1557-1569. 2010.

    5. Mohd Zulkefli N. A., Trong Hai Duong, Inay Ha, and Geun Sik Jo. Consensus Based Information Search by Analyzing Keywords in Blogosphere. Information-An International Interdisciplinary Journal, pp. 1625-1637. 2010.


  • International Conferences

    1. Trong Hai Duong, Nguyen N.T., Jo G.S. A Method for Integration across Text Corpus and WordNet-based Ontologies. IEEE/ACM/WI/IAT 2008 Workshops Proceedings, pp. 1-4. IEEE Computer Society. 2008.

    2. Trong Hai Duong, Nguyen N.T., Jo G.S. A Method for Integration of WordNet-based Ontologies Using Distance Measures. Proceedings of KES 2008, Lecture Notes in Artificial Intelligence 5177, pp. 210-219. 2008.

    3. Trong Hai Duong, Ngoc Thanh Nguyen, Jason J. Jung, Geun-Sik Jo. CMOI: A Collaborative Method for Multiple Ontology Integration. International Workshop on Soft Computing for Knowledge Technology (SCKT 2008), IEEE Computer Society. 2008.

    4. Trong Hai Duong, Mohammed Nazim Uddin, Geun Sik Jo. Collaborative Web for Personal Ontology Generation and Visualization for a Social Network. KSE 2009, IEEE Computer Society, pp. 237-242. 2009.

    5. Trong Hai Duong, Mohammed Nazim Uddin, Li Delong, Geun-Sik Jo. A Collaborative Ontology-Based User Profiles System. ICCCI 2009, Lecture Notes in Computer Science, pp. 540-552. 2009.


  • Theses

  • Patents

  • Book Chapters

    1. Trong Hai Duong, Nguyen Ngoc Thanh, Geun Sik Jo. Effective Backbone Techniques for Ontology Integration Management. In: Nguyen N.T., Szczerbicki E. (Eds.): Intelligent Systems for Knowledge. Springer-Verlag, pp. 197-227. 2010.

    2. Trong Hai Duong, Ngoc Thanh Nguyen, Kozierkiewicz-Hetmanska, Geun-Sik Jo. Fuzzy Ontology Integration Using Consensus to Solve Conflicts on Concept Level. New Challenges for Intelligent Information, Springer-Verlag Berlin Heidelberg, SCI 351, pp. 33-42. 2011.

    3. Trong Hai Duong, Sy Dung Nguyen, Minh Hien Hoang. An Adaptive Neuro-Fuzzy Inference System for Seasonal Forecasting of Tropical Cyclones Making Landfall along the Vietnam Coast. Advanced Computational Methods for Knowledge Engineering, Studies in Computational Intelligence, Volume 479, pp. 225-236. 2013.

    4. Trong Hai Duong, Tran Hoang Chau Dao, Jason J. Jung, Ngoc Thanh Nguyen. Solving Conflicts in Video Semantic Annotation Using Consensus-Based Social Networking in a Smart TV Environment. Advanced Computational Methods for Knowledge Engineering, Advances in Intelligent Systems and Computing, Volume 282, pp. 201-216. 2014.


  • CS246 (Stanford University)

    Mining Massive Data Sets




    • Course information handout

    • Hadoop tutorial: helps you set up Hadoop and get started. Due on 1/13 at 5:00 PM.

    • Homework 1: Out on 1/8; due on 1/22 at 5:00 PM (max 1 late period allowed). (Solutions) (Code)

    • Homework 2: Out on 1/22; due on 2/5 at 5:00 PM (max 1 late period allowed). (Solutions) (Code)

    • Homework 3: Out on 2/5; due on 2/19 at 5:00 PM (max 1 late period allowed). (Solutions) (Code)

    • Homework 4: Out on 2/19; due on 3/5 at 5:00 PM (max 1 late period allowed). (Solutions) (Code)


  • CS224W:

    Social and Information Network Analysis


    The World Wide Web, blogging platforms, instant messaging, and Facebook can be characterized by the interplay between rich information content, the millions of individuals and organizations who create and use it, and the technology that supports it.
    The course covers recent research on the structure and analysis of such large social and information networks, and on models and algorithms that abstract their basic properties. The class explores how to analyze large-scale network data in practice and how to reason about it through models of network structure and evolution.
    Topics include methods for link analysis and network community detection, diffusion and information propagation on the web, virus outbreak detection in networks, and connections with work in the social sciences and economics.
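
As a small illustration of the link analysis topic mentioned above, the following is a minimal sketch of PageRank by power iteration in plain Python. It is not taken from the course materials: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the toy graph in the usage note are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank scores by power iteration.

    `links` maps each node to the list of nodes it links to.
    The parameter defaults are illustrative assumptions.
    """
    # Collect every node that appears as a source or as a target.
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every node receives the same "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's rank evenly among its out-links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no out-links spreads its rank over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

For a hypothetical toy graph such as `{"a": ["c"], "b": ["c"], "c": ["a"]}`, node `c` collects links from both `a` and `b` and ends up with the highest score, while `b`, which nothing links to, ends up with the lowest.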



Latest Seminars

Keep track of the latest news and updates

Apache Mahout

What is Mahout?

Apache Mahout is an open source project that is primarily used for creating scalable machine learning algorithms. It implements popular machine learning techniques such as: 

  • Recommendation
  • Classification
  • Clustering
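
To give a flavor of the clustering technique listed above, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It does not use Mahout's API and runs on a single machine; the function name `kmeans`, the choice of the first k points as starting centroids, and the sample points in the usage note are assumptions made to keep the example short and deterministic.

```python
def kmeans(points, k, iterations=20):
    """Cluster 2-D points into k groups with Lloyd's algorithm.

    Assumption for this sketch: the first k points serve as the
    initial centroids, which keeps the result deterministic.
    """
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]

    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for x, y in points:
            nearest = min(
                range(k),
                key=lambda i: (x - centroids[i][0]) ** 2 + (y - centroids[i][1]) ** 2,
            )
            clusters[nearest].append((x, y))

        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = (
                    sum(px for px, _ in members) / len(members),
                    sum(py for _, py in members) / len(members),
                )
    return centroids, clusters
```

Calling `kmeans(points, 2)` on two well-separated groups of points, for example three points near (0, 0) and three near (10, 10), separates them into two clusters. A scalable library distributes the same assignment and update steps across a cluster instead of running them in one loop.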

Apache Spark

What is Spark?

Apache Spark is a framework for performing general data analytics on a distributed computing cluster such as Hadoop.

Hadoop Introduction & Development Guideline

An introduction to Apache Hadoop, an open-source software framework for storage and large-scale processing of data sets on clusters, and a guideline on how to develop your first Hadoop project.
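
Hadoop's processing model is MapReduce, and a word count is a common first project. The sketch below simulates the map, shuffle, and reduce phases in plain Python on a single machine. It is not Hadoop code and uses no Hadoop API; the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, which the framework
    does between the map and reduce phases on a real cluster."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over a list of text records."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

For example, `word_count(["big data", "smart data"])` counts "data" twice and each of the other words once. On a cluster, the map and reduce functions run in parallel on different nodes, and the framework handles the shuffle.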



Contact Us

We look forward to hearing from you