
TELSTRA

CASE - AUTOMATION OF DATA


We're proud of our work!

CLIENT
TECHNICAL STACK

Hadoop, Hive, Teradata, Spark streaming, Kafka, NiFi, Python, Qlikview

SKILLS & ROLES

Technical Data Analyst
Project Manager

CUSTOMER BENEFITS

╋ Migration of the Big Data Lake to a cloud Big Data solution: statistical analysis of the relevant sources and datasets in use, and work with the data engineers to prioritize the components needed for real-time streaming flows (NiFi, Kafka broker and Spark Streaming).
╋ Python web-scraping proof of concept (POC) that extracts location-based events in order to anticipate congestion on the network.
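As an illustration of the location-based event idea, here is a minimal Python sketch, not the POC itself. It assumes the scraped events have already been parsed into records with a position and an expected attendance, and it flags large events within a given radius of a network site. The `Event` fields, the function names and the default thresholds are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class Event:
    # Hypothetical record shape for one scraped event.
    name: str
    lat: float
    lon: float
    expected_attendance: int


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def events_near(events, site_lat, site_lon, radius_km=2.0, min_attendance=5000):
    """Return the large events located within radius_km of a network site.

    The radius and attendance thresholds are illustrative assumptions.
    """
    return [
        e for e in events
        if e.expected_attendance >= min_attendance
        and haversine_km(e.lat, e.lon, site_lat, site_lon) <= radius_km
    ]
```

Events returned by `events_near` could then be joined with traffic history for the same site to estimate whether congestion is likely; that step is outside this sketch.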

WHAT IS THE CHALLENGE?

Automating the data pipeline: extraction from internet sources, curation of raw files, and ingestion into HDFS. Identifying valuable datasets in the Data Lake in order to build network speed and reliability metrics.
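The curation and ingestion stages described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the project's actual pipeline: the raw file is assumed to be a CSV that has already been downloaded, the column names (`timestamp`, `site_id`, `value`) are hypothetical, and the ingestion step only builds the standard `hdfs dfs -put` command rather than running it.

```python
import csv
import io

# Hypothetical schema for the raw files; the real columns are not given in the case study.
REQUIRED = ("timestamp", "site_id", "value")


def curate(raw_text, required=REQUIRED):
    """Keep only complete, de-duplicated rows, with whitespace stripped."""
    reader = csv.DictReader(io.StringIO(raw_text))
    seen = set()
    rows = []
    for row in reader:
        cleaned = {k: (row.get(k) or "").strip() for k in required}
        if not all(cleaned.values()):
            continue  # drop rows with a missing required field
        key = tuple(cleaned[k] for k in required)
        if key in seen:
            continue  # drop exact duplicates
        seen.add(key)
        rows.append(cleaned)
    return rows


def to_csv(rows, fields=REQUIRED):
    """Serialize curated rows back to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def hdfs_put_command(local_path, hdfs_dir):
    """Build (but do not run) the HDFS shell command that uploads a curated file."""
    return ["hdfs", "dfs", "-put", "-f", local_path, hdfs_dir]
```

In a scheduled job, the curated text would be written to a local file and the command list passed to `subprocess.run(..., check=True)` on a machine with a Hadoop client installed.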

 

WHAT ARE WE TALKING ABOUT?

We worked in the Operations & Security Department of Telstra, with the goal of improving the network infrastructure (mobile and fixed). Telstra collects hundreds of terabytes of data per day, both from its own infrastructure and from some external data sources. The objectives were to enrich the Data Lake to support data insights analysis, to monitor the customer experience of the network, and to anticipate the main outages that can occur on the infrastructure through pattern detection.

 

WHAT ABOUT DELIVERY?

╋ 20-person project team: data engineers, data scientists and business stakeholders
╋ Identify the valuable and critical datasets that can be cross-referenced to improve the analysis models of Telstra network performance and create value for Telstra infrastructure analysis
╋ External data: identify and ingest relevant new data sources that could enrich the existing data models (for example, a Machine Learning solution for outage prediction)
╋ Data dictionary, data modeling and technical business analysis of the main data sources in the Big Data Lake.

Our experts

Alexandre Anne

Julien Labouze