
Frontier Academic Lecture Announcement

Author:Administrator Source:website Time:2018-07-09 08:53:29


Time :
2:30pm   Wednesday, July 11, 2018
Place : 303 Conference Room in Electronic and Information School

 

Lecture I
Topic : Computing Problems from the Humanities
Speaker : Associate Prof. Steve Cassidy,
Macquarie University, Sydney, Australia

Abstract : 
      Collaborating with researchers in the Digital Humanities is the source of a number of interesting problems in Computer Science. This talk looks at some examples from my past work and some emerging problems from more recent work.
      Linguists and other language researchers make use of large collections of language data in the form of text, audio and video recordings. They like to annotate this data to add information to the base signal. This can be textual annotation, such as grammatical structure or the location of named entities, or structure in spoken language, such as speaker turns or the location of words and phonetic segments. My work with this kind of data has looked at how best to represent these annotations and how to work with very large collections of annotations. Interesting problems include the development of a graph-based annotation model, the use of query languages to find interesting annotations, and the problem of version control for annotation data stored as a graph.
      More recently we have begun working with Oral History researchers who record stories from the community as the source material for their research. Our goal is to use speech and language technology to help the researchers make the most of the data that they collect. One of the interesting problems here is segmenting a long interview recording into useful chunks and assigning those chunks to different speakers. This is known as speaker diarization; it is a well-understood task, but some outstanding problems still turn up when looking at these interviews. Once we have a transcript, we can also begin to look at how to apply Natural Language Processing techniques to this data to make it more useful. I’ll present some of the research directions we are looking at as we explore this area.


About the speaker : 
     
      Steve Cassidy is a computer scientist who has worked in various areas relating to speech and language technology over the last 30 years, after completing a PhD in Cognitive Science.
      With Jonathan Harrington, he developed the Emu Speech Database System to support corpus-based research in speech and acoustic phonetics. Emu supports a flexible hierarchical annotation system and provides a query language and analysis environment based on the R Statistical environment. Emu is widely used to support research on small and large-scale speech corpora and includes tools to support every stage of the corpus collection and analysis lifecycle. Emu is now maintained by a team of developers in Munich.
      He was recently involved in the development and collection of an audio-visual corpus of Australian English from around 1000 speakers across Australia. He built the software for data capture and a server-based system for data upload and publishing.
      His most recent work has been on the Alveo Virtual Laboratory, which is both a repository for language resources and a platform to support tools for exploration and analysis of language data. Alveo currently holds around 30 collections, including audio, video and text resources, and is working on new acquisitions of data and tools.



Lecture II
Topic : Exploring Features for Complicated Objects: Cross-View Feature Selection for Multi-Instance Learning
Speaker : Dr. WU, Jia,
Macquarie University, Sydney, Australia


Abstract : 
     In traditional multi-instance learning (MIL), instances are typically represented using a single feature view. As MIL becomes popular in domain-specific learning tasks, aggregating multiple feature views to represent multi-instance bags has recently shown promising results, mainly because multiple views provide extra information for MIL tasks. Nevertheless, multiple views also increase the risk of involving redundant views and irrelevant features in learning. To this end, we formulate a new cross-view feature selection problem that aims to identify the most representative features across all feature views for MIL. To achieve this goal, we design a new optimization problem that integrates both multi-view representation and multi-instance bag constraints. The solution to the objective function ensures that the identified top-m features are the most informative ones across all feature views. Experiments on two real-world applications demonstrate the performance of cross-view feature selection for content-based image retrieval and social media content recommendation.

About the speaker :
      Dr. Jia Wu is an Assistant Professor/Lecturer of Data Science in the Department of Computing at Macquarie University. He received his PhD degree in Computer Science from the University of Technology Sydney under the supervision of Prof. Chengqi Zhang and A/Prof. Xingquan Zhu.
      Dr. Wu is the recipient of the Best Student Paper Award (IJCNN 2017) and the Best Paper Candidate Award (ICDM 2014). He is a member of the IEEE.