Mohammad Al Hasan, Ph.D.
Associate Professor, Department of Computer & Information Science
Ph.D., Rensselaer Polytechnic Institute, 2009
Courses Taught / Teaching
CSCI 57300 Data Mining (Fall 2017)
CSCI 59000 Topics in Computer Science: Algorithms for Bioinformatics (Fall 2017)
CSCI 48100 Data Mining (Spring 2018)
Research Interests: Data Mining, Graph Mining, Network Sampling, Bio-Informatics, Biomedical-Informatics, Machine Learning and Social Network Analysis
Mining frequent patterns from a hidden dataset is an important task with various real-life applications. In this research, we propose a solution to this problem that is based on Markov Chain Monte Carlo (MCMC) sampling of frequent patterns.
In this work, we propose an interactive pattern discovery framework named PRIIME, which identifies a set of interesting patterns for a specific user without requiring any prior input from the user on the interestingness measure of patterns. The proposed framework is generic enough to support the discovery of interesting set, sequence, and graph patterns.
In this paper, we introduce a new home discovery tool called RAVEN. It uses interactive feedback over a collection of home feature-sets to learn a buyer's interestingness profile. It then recommends a short list of homes that match the buyer's interests.
In this work, we propose a frequent subgraph mining algorithm called FSM-H which uses an iterative MapReduce-based framework. FSM-H is complete as it returns all the frequent subgraphs for a given user-defined support, and it is efficient as it applies all the optimizations that the latest FSM algorithms adopt.
In this work, we propose a novel approach for solving graph classification using two alternative graph representations, which are the bag of vertices and the bag of partitions. For the first representation, we use deep learning based node features and for the second, we use traditional metric based features.
In this work, we propose a supervised regression (Cox regression) model inspired by survival analysis to predict the sale probability of a house given historical home sale information within an observation time window.
A novel method for metric embedding of node-pair instances for a dynamic network. DyLink2Vec models the metric embedding task as an optimal coding problem where the objective is to minimize the reconstruction error, and it solves this optimization task using a gradient descent method.
A novel method for graphlet-transition-based feature representation of node-pair instances. GraTFEL uses unsupervised feature learning on graphlet-transition-based features to give a low-dimensional feature representation of the node-pair instances.
A simple yet powerful algorithm that obtains the approximate graphlet frequency for all graphlets that have up to 5 vertices.
A Uniform Sampler for Constructing Frequency Histogram of Graphlets. GUISE uses a Markov Chain Monte Carlo (MCMC) sampling method to construct the approximate graphlet frequency distribution (GFD) of a large network.
An approximate triangle counting algorithm that runs on multi-core computers through a multi-threaded implementation.
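To illustrate what approximate triangle counting means in general, here is a minimal single-threaded sketch. It uses wedge sampling, a standard estimator chosen here for illustration only; it is not necessarily the method of the algorithm described above, and it omits the multi-threading. The function name `estimate_triangles` and the adjacency-list input format are assumptions.

```python
import random

def estimate_triangles(graph, num_samples=1000, seed=0):
    """Estimate the triangle count of an undirected graph by wedge sampling.

    graph: dict mapping each node to a list of its neighbors (assumed format).
    A wedge is a path a - center - b; it is closed when a and b are adjacent.
    Every triangle contains exactly three closed wedges.
    """
    rng = random.Random(seed)
    centers = [v for v in graph if len(graph[v]) >= 2]
    if not centers:
        return 0.0
    # Number of wedges centered at v is d(v) * (d(v) - 1) / 2.
    weights = [len(graph[v]) * (len(graph[v]) - 1) // 2 for v in centers]
    total_wedges = sum(weights)
    closed = 0
    # Picking a center proportionally to its wedge count, then a uniform
    # pair of its neighbors, yields a uniformly random wedge.
    for center in rng.choices(centers, weights=weights, k=num_samples):
        a, b = rng.sample(graph[center], 2)
        if b in graph[a]:
            closed += 1
    return (closed / num_samples) * total_wedges / 3
```

On a graph where every wedge is closed (for example a complete graph) the estimate is exact; otherwise accuracy improves as `num_samples` grows.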
12. Bayesian Non-Exhaustive Classification A Case Study: Online Name Disambiguation using Temporal Record Streams
A Bayesian non-exhaustive classification framework for solving the online name disambiguation task in the digital library domain.
Two indirect triple sampling methods based on the Markov Chain Monte Carlo (MCMC) sampling strategy. Triple-MCMC samples a triple by performing an MCMC walk on an imaginary triple sample space. Vertex-MCMC performs an MCMC walk on the original network to sample a node and then samples a triple centered at the selected node.
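The two-stage idea behind Vertex-MCMC can be sketched as follows. This is a generic Metropolis-Hastings walk written under stated assumptions, not the authors' implementation: the target distribution is assumed to weight each node by the number of triples centered at it, d(d-1)/2, and the function names, the `burn_in` parameter, and the adjacency-list input format are all hypothetical.

```python
import random

def triple_weight(graph, v):
    """Number of triples (pairs of neighbors) centered at node v."""
    d = len(graph[v])
    return d * (d - 1) // 2

def vertex_mcmc_triples(graph, start, num_samples, burn_in=100, seed=0):
    """Sample triples (a, center, b) using a Metropolis-Hastings walk on nodes.

    graph: dict mapping each node to a list of its neighbors (assumed format).
    start: a node with at least two neighbors.
    """
    if triple_weight(graph, start) == 0:
        raise ValueError("start node must have at least two neighbors")
    rng = random.Random(seed)
    current = start
    samples = []
    for step in range(burn_in + num_samples):
        # Proposal: move to a uniformly random neighbor, q(v -> u) = 1 / d(v).
        proposal = rng.choice(graph[current])
        w_cur = triple_weight(graph, current)
        w_prop = triple_weight(graph, proposal)
        # Acceptance ratio: (w(u) / w(v)) * (q(u -> v) / q(v -> u)).
        ratio = (w_prop * len(graph[current])) / (w_cur * len(graph[proposal]))
        if rng.random() < min(1.0, ratio):
            current = proposal
        if step >= burn_in:
            # Second stage: pick two distinct neighbors of the sampled node.
            a, b = rng.sample(graph[current], 2)
            samples.append((a, current, b))
    return samples
```

Because nodes with fewer than two neighbors have weight zero, a proposal to such a node is always rejected, so the walk only ever rests on nodes that can serve as the center of a triple. The walk needs only neighbor lists of visited nodes, which is why this style of sampling suits networks that can be explored only locally.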
Publications & Professional Activities
- Discovery of Functional Motifs from the Interface Region of Oligomeric Proteins using Frequent Subgraph Mining (Tanay Kumar Saha, Ataur Katebi, Wajdi Dhifli, Mohammad Al Hasan), In IEEE/ACM Transactions on Computational Biology and Bioinformatics, IEEE, 2017.
- Feature Selection for Classification under Anonymity Constraint, In Transactions on Data Privacy, volume 10, 2017.
- Name Disambiguation in Anonymized Graphs using Network Embedding, In Proceedings of the 26th ACM International on Conference on Information and Knowledge Management (CIKM), Singapore, Research Track Full Paper, 2017.
- Learning Sentence Representation with Context, In Proceedings of the 26th ACM International on Conference on Information and Knowledge Management (CIKM), Singapore, Research Track Full Paper, 2017.
- CON-S2V: A Generic Framework for Incorporating Extra-Sentential Context into Sen2Vec, In ECML-PKDD, 2017.
- How Fast Will You Get a Response? Predicting Interval Time for Reciprocal Link Creation, In The 11th International AAAI Conference on Web and Social Media (ICWSM), 2017.