Data & Analytics Articles

An Efficient Incremental Indexing Mechanism for Extracting Top-k Representative Queries Over Continuous Data-streams

CAKE LABS 21 January 2016

The annual ACM/IFIP/USENIX Middleware conference is a major forum for the discussion of innovations and recent advances in the design, construction and use of middleware systems. The scope of the conference is the design, implementation, deployment, and evaluation of distributed system platforms and architectures for computing, storage, and communication environments. Highlights of the conference include...

ElasticSearch

Sysco LABS 7 September 2015

ElasticSearch is an open-source search and analytics engine for both structured and unstructured data. Designed for the cloud and big data, it provides near-real-time analytics capabilities. In this Innovation Session, software engineers Yasanka Horawalavithana and Malinga Perera of Leapset introduce the architecture that drives ElasticSearch and the best ways to use...

Resource Aware Job Schedulers

Harnessing Resource Aware Job Schedulers in Storm

Sysco LABS 11 July 2015

On the Data & Analytics Team, we regularly investigate ways to improve the efficiency of our in-house-built RETL (Real Time Extract Transform Load). Any improvement may have a significant impact on time, space and cost complexity. This article provides an overview of the investigation we carried out to improve the performance of job schedulers...

Apache ZooKeeper

Sysco LABS 5 June 2015

For this week’s Innovation Session, Azeem Mumtaz and Lohitha Chiranjeewa from our Platform team spoke about Apache ZooKeeper. They started their presentation by introducing us to fallacies (mistakes of reasoning, as opposed to mistakes of a factual nature), followed by an introduction to Apache ZooKeeper. They mentioned that Apache ZooKeeper...

Splunk

Sysco LABS 5 February 2015

This week the topic of the session was Splunk, a log management tool that gathers machine data across an organization, identifies data patterns, provides metrics, diagnoses problems and provides intelligence for business operations. Two of our senior software engineers, Gihan Jayakody and Eshan Sudharaka, enlightened us on the technicalities of the system as well as...

Data Mining and Analytics with Spark

Sysco LABS 22 September 2014

Pankajan Chathirasegaran, Software Engineer (Data Science) on the Data & Analytics Team at Leapset, took us through this Friday’s Innovation Session with a discussion on Data Mining and Analytics with Spark. He gave an introduction to Apache Spark and brought in a lot of insights that were quite useful for the engineering team here at...