This month we’ll focus on the work of the Digital Institute at Newcastle University. There will be three talks:
1. The TMS Platform for Analysing Streaming and Historic Twitter Data
Rebecca Simmonds, Newcastle University
Analysis of social media data has the potential to provide useful insights in a wide range of domains including social science, advertising and policing. Performing social media analysis in real time on streaming data can give insights into events as they occur. Low-latency querying of historic data is similarly valuable. However, the rate at which new data is generated makes it a real challenge to design a system that can achieve both goals. This talk will describe and demonstrate such a system. It is cloud-based and exploits both continuous query and NoSQL database (Cassandra) technology. Evaluation results will be presented which show that the system can scale to process queries on data arriving at the rate of the full Twitter firehose.
2. The Urban Observatory
Phil James, Newcastle University
Understanding how the city works now is the first part of the challenge if we are to make it work better in the future. Cities are complex, with interactions happening across many scales and sectors. The Urban Observatory is collecting and managing data from across Newcastle, integrating sensor platforms with citizen-based data collection, geospatial and satellite data. It includes a new framework for integrating data into workflows, models and other applications. The Urban Observatory provides a baseline for gaining a richer understanding of how the city operates and, in the process, for repurposing and reusing the data across many different disciplines and applications. It is a key part of the £50m investment by Newcastle University in Science Central.
3. Big Data Analytics in the Cloud with e-Science Central
Hugo Hiden and Mark Turner, Newcastle University
e-Science Central is an open-source cloud platform for data analytics. It supports the storage, sharing and analysis of big data. Analysis is through scalable workflows that can combine services written in a variety of languages including Java and R. e-Science Central is portable across internal clusters of servers, as well as a variety of public clouds, including Azure, Amazon and OpenShift. Developed over the past 6 years, it now supports £M projects in academia and industry. This talk will describe its design, give examples of its use, and demo its capabilities.