
TELSTRA

CASE - AUTOMATION OF DATA

Data Factory

CUSTOMER

Telstra

THE TECHNICAL FRAMEWORK

Hadoop, Hive, Teradata, Spark Streaming, Kafka, NiFi, Python, QlikView

SKILLS AND ROLES

Technical Data Analyst
Project Manager

BENEFITS FOR THE CUSTOMER

╋ Migration of a big data lake to a big data cloud solution: statistical analysis of the relevant sources and datasets in use, and work with data engineers to prioritise the components needed for real-time streaming (NiFi, Kafka brokers and Spark Streaming).
╋ A Python web-scraping POC that extracts location-specific events to help anticipate network congestion.
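The event-extraction idea behind the web-scraping POC can be sketched in a few lines of standard-library Python. The page structure, CSS class and attribute names below are hypothetical illustrations; the actual sources and fields Telstra used are not described in this case study.

```python
# Minimal sketch: pull (event, location) pairs out of an HTML event listing.
# The markup convention (<li class="event" data-location="...">) is assumed,
# not taken from the real POC.
from html.parser import HTMLParser

class EventParser(HTMLParser):
    """Collects (event name, location) pairs from event list items."""
    def __init__(self):
        super().__init__()
        self.events = []
        self._in_event = False
        self._location = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "event":
            self._in_event = True
            self._location = attrs.get("data-location")
            self._text = []

    def handle_data(self, data):
        if self._in_event:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "li" and self._in_event:
            self.events.append(("".join(self._text).strip(), self._location))
            self._in_event = False

sample = '<ul><li class="event" data-location="Sydney CBD">New Year fireworks</li></ul>'
parser = EventParser()
parser.feed(sample)
print(parser.events)  # [('New Year fireworks', 'Sydney CBD')]
```

In a real deployment the fetched events would then be geo-matched against network cells to flag likely congestion hotspots.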

OUR CHALLENGE

Automate the data pipeline: extraction from Internet sources, retention of raw files and ingestion into HDFS. Identify the data sets in the data lake that are useful for establishing network speed and reliability metrics.


THE CONTEXT

We worked in Telstra's operations and security department to help improve the network infrastructure (mobile and fixed). Telstra collects hundreds of terabytes of data per day on its infrastructure, both from its own systems and from external data sources. The objectives were to enrich the data lake to support data analysis, monitor the customer network experience and, through pattern detection, proactively anticipate major outages on the infrastructure.


THE PROJECT

╋ Project team of 20: data engineers, data scientists and business stakeholders.
╋ Identify valuable and critical data sets that can be cross-referenced to improve analytical models of Telstra network performance and create value for infrastructure analysis.
╋ External data: identify and ingest relevant new data sources that could enrich existing data models (e.g. a machine-learning solution for fault prediction).
╋ Data dictionary, data modelling and business/technical analysis of the main existing data sources in the big data lake.

Our experts

Alexandre Anne

Julien Labouze