Seminars & Colloquia
Abstract: The field of health informatics is challenged by an enormous amount of data. Data mining and Big Data analytics are helping to realize the goals of diagnosing, treating, helping, and healing all patients in need of healthcare. A common problem in medical research is to classify subjects into disease categories. Machine learning tools such as Artificial Neural Networks (ANN), Support Vector Machines (SVM), Linear Regression (LR), and Fisher’s Linear Discriminant Analysis (LDA) are widely used in the areas of prediction and classification. The main goal of these classification strategies is to predict a dichotomous outcome (e.g., disease/healthy) based on several features. Currently, predictive modeling faces two important problems: feature selection and class imbalance. Feature selection has become the focus of much research in the areas of gene expression array analysis and fMRI studies, for which datasets with tens or hundreds of thousands of variables are available. Class imbalance, on the other hand, is a “real-world” problem in medicine and biology, where data are often severely imbalanced. In this talk, I will present a classification framework for diagnosing patients with autism and major depressive disorder. The focus will be mainly on constructing and selecting subsets of features that are useful for building a good predictor using the elastic net, a regularization and variable selection method. Further, I will introduce a sampling-based method, SMOTE (Synthetic Minority Over-sampling Technique), to overcome class imbalance.
Fadi Mohsen Department of Computer Science University of North Carolina Monday, March 28, 2016 11:00am SC348 Title: Different Approaches for Countering the Risks of Third-party Mobile Applications
Abstract: Third-party applications have become an essential component of modern ecosystems such as Android and Facebook. These applications can be web-based, desktop-based, or mobile-device-based. The number of third-party applications has grown tremendously in recent years; for instance, the average number of applications installed on Facebook each day is 20 million, and the number of Android applications in the market is 1.8 million. Although these applications provide users with better and more customized experiences, they could pose security and privacy risks for mobile phone users. In this talk I will introduce our research on identifying a number of vulnerabilities in mobile operating systems such as Android, describe our proposed solutions, and discuss the user studies that we conducted to investigate users' awareness and comprehension of the vulnerabilities and the countermeasures.
Song Min Kim Department of Computer Science University of Minnesota Wednesday, March 23, 2016 11:00am SC340
Title: IoT Networking: From Coexistence to Collaboration
Abstract: IoT (Internet of Things) is anticipated to significantly improve the quality of our daily lives in the areas of home automation, healthcare, and transportation, among many others. This is achieved by massive numbers of diverse wireless devices covering every corner of our living space. This talk introduces intelligent IoT networking. I will first discuss a novel design for harmonious coexistence of heterogeneous systems, which addresses the key issue of interference caused by incompatibility between various wireless technologies. The second part of the talk goes a step further and aims to exploit the opportunity offered by the heterogeneous networks that make up the IoT. A new technique that establishes direct communication between incompatible devices is presented, enabling interoperation and collaboration among heterogeneous devices to provide services beyond the capability of each individual entity. The talk will conclude with a brief discussion on enhancing IoT infrastructure and its support for big data applications.
Andrey Kashlev Department of Computer Science Wayne State University Friday, February 19, 2016 11:00 SC348 Title: Big Data Management Using Scientific Workflows
Abstract: Big data has the potential to revolutionize many areas of human activity, including scientific research, education, healthcare, energy, environmental science, and transportation, just to name a few. Examples of possible big data innovations range from safer driving using connected vehicles to energy-efficient homes with learning thermostats, and from discovering new particles using the Large Hadron Collider (LHC) to saving lives with remotely monitored pacemakers. However, making these breathtaking innovations a reality requires managing terabytes and even petabytes of data, generated by billions of devices, products, and events, often in real time, in different protocols, formats, and data types. The volume, velocity, and variety of big data, known as the “3 Vs”, present formidable challenges, unmet by traditional data management approaches. In the first part of this talk, I will discuss how I address the volume challenge with my big data modeling approach for Apache Cassandra, a popular distributed database used in many cutting-edge big data applications, including Twitter’s trend mining and eBay’s product recommendation. I will present my query-driven schema design approach that enables efficient storage and querying of terabyte- and petabyte-scale data using Cassandra and will demonstrate my online tool that streamlines and automates the entire data modeling process. I will then focus on the variety challenge of big data in the context of scientific workflow composition. The variety of big data leads to heterogeneity of software components, such as Web services, that are connected into workflows to analyze data. Interfaces of such components are often incompatible with each other and therefore require mediation. I will present my type-theoretic approach that automatically inserts intermediate workflow components, called shims, that perform such mediation.
Finally, I will discuss the volume challenge of big data in the context of scientific workflows, and identify challenges of executing large-scale scientific workflows in the cloud. I will then present a system architecture for running scientific workflows that addresses many of these challenges.
Wei Wang Department of Computer Science University of Virginia Monday, February 22, 2016 11:00 SC340
Title: System Software Design for Computation-at-scale
Abstract: In recent years, the computer industry has been undergoing profound changes. New hardware, such as many-core and exascale computing platforms, is emerging, with an emphasis on massive parallelism. While traditional applications are being parallelized, new applications aimed at big data sets and massive user requests are also increasing. The primary goal of computer system design is no longer purely performance, but a balance of a range of requirements, including energy efficiency, reliability, and cost. However, the solutions for exploiting new hardware and applications to meet these goals rest with the system software. Realizing both the challenges and opportunities presented by the changing computer landscape, my research focuses on developing system software to meet the needs of the new hardware, new applications, and new goals.
In this talk, I will present two research projects involving system software. The first project looks at the scalability of multi-threaded, general-purpose and high-performance computing applications on many-core platforms. This project aims at developing system software to improve application scalability. More specifically, this project answers two basic questions affecting scalability: how many cores should be allocated to an application, and how many threads should be created, to achieve the best performance? I will present a novel run-time system, which combines theoretical and practical approaches to automatically execute multi-threaded applications with their optimal core allocations. The second project focuses on data center applications and warehouse-scale computers. This project aims at improving system software to consolidate data center workloads in order to increase server utilization and reduce cost and power consumption, while ensuring good quality of service. At the end of the talk, I will discuss future research directions for large-scale parallel systems.
Jeremy Logan Department of Computer Science
University of Maine Wednesday, February 17, 2016 11:00 SC340 Title: Towards a Generative Software Ecosystem for Scientific Computing
Abstract: Scientific simulation applications are often complex and costly to learn, maintain, optimize, and extend. In addition, users must navigate a large ecosystem of analysis, visualization, and data management tools. Additional costs are incurred with each new generation of hardware, as existing codes must be ported and optimized to take advantage of new processor architectures and accelerators. The use of domain-specific languages and generative programming promises to reduce these costs by automating aspects of code creation and maintenance, but currently comes with its own set of overheads. This talk will explore current and future research into generative techniques for creating and organizing software that decrease the overhead involved in generative programming and promise to significantly reduce the overall cost of scientific computing.
James Lynch Department of Computer Science Thursday, December 3, 2015 14:30 SC356
Title: Random Graphs and Systems Biology
Abstract: Graphs arise in many areas of science and engineering. Often, they are the result of random processes. This talk will describe various kinds of random graphs, focusing on some that are now being used in models of biochemical reaction systems. The talk will be mostly expository and will not assume any knowledge of random graphs or systems biology. Familiarity with introductory graph theory will be helpful, although a brief review of the subject will be given. Theorems about random graphs in systems biology will be presented. Detailed proofs will not be given. Instead, the methods used in the proofs for constructing and analyzing random graphs will be described. The talk will conclude with some open problems and topics for future research.
Natasha Banerjee Department of Computer Science Thursday, November 5, 2015 14:30 SC356
Title: How to Prepare a Good Talk
Abstract: We are all faced with having to deliver presentations for classes, conferences, thesis defenses, and job interviews. In this talk, I will discuss some tips on how you can create and deliver a good presentation. In particular, I will discuss the idea that delivering a good talk is about understanding the psychology of your audience, and about creating and delivering content that allows the audience to come to your side. Among other things, we will look at minimalism in slide design, the value of figures in your slides, and methods to present equations and algorithms in a presentation.
Gregory Dudek School of Computer Science McGill University Thursday, October 29, 2015 14:30 SC356
Title: Robotic System Design for Automated Marine Data Analysis
Abstract: This talk will address the deployment of robotic systems for data collection. This includes task specification, gait learning, and data analysis. As a concrete example I will discuss the automated analysis of video data, and specifically video data collected underwater with an amphibious vehicle (the Aqua 2 hexapod). Automated systems can collect data at prodigious rates, and the timely analysis of this data is a growing challenge, especially when there are bandwidth constraints between the data source and the people who must examine the data. We are specifically interested in the real-time summarization and detection of the most interesting events in a video sequence, for use by humans who will analyze the data either in real time or offline. To do this, we are developing methods that adapt to video data streams in real time to collect salient events, and we are using them in the context of a group of vehicles that fly, swim, and float.