The lost art of data engineering 2: The origin of a data engineer
- Ara Islam

- Sep 25

To understand what a data engineer does, it helps to know that a data engineer is fundamentally a specialist software engineer. They are primarily responsible for data systems, but they also apply general software engineering principles that are not specific to data. For example, data engineers follow the same continuous integration and continuous delivery (CI/CD) processes when building data pipelines as software engineers do when building API endpoints.
Data systems come in two flavours: Online Analytical Processing (OLAP) and Online Transaction Processing (OLTP).
OLTP systems manage high volumes of real-time transactions. Think of a website like IKEA's. Customers place an order, payment is taken and the items are fulfilled. All of this happens quickly and smoothly for hundreds of thousands of customers.
Each stage of this process creates or moves data. An order has an entry in an orders table, with details like value and items. The payments system validates that the customer has the funds and then puts the required amount on hold, before batching and settling thousands of customer orders at the end of the day. Finally, an entry is added to the fulfilment table to assign the order for delivery, whilst the quantity in the inventory table is reduced by the order quantity.
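The order flow described above can be sketched as a single database transaction. This is only a minimal illustration using Python's built-in sqlite3 module, not how a retailer like IKEA actually implements it. The table layouts, column names and the `place_order` function are all assumptions made up for this example, and the end-of-day settlement batch is left out.

```python
import sqlite3

def create_schema(conn):
    # Hypothetical tables, named after the ones mentioned in the text.
    conn.executescript("""
        CREATE TABLE inventory (item_id INTEGER PRIMARY KEY, quantity INTEGER NOT NULL);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                             item_id INTEGER, quantity INTEGER, value REAL);
        CREATE TABLE payments (order_id INTEGER PRIMARY KEY, amount REAL, status TEXT);
        CREATE TABLE fulfilment (order_id INTEGER PRIMARY KEY, status TEXT);
    """)

def place_order(conn, customer_id, item_id, quantity, unit_price):
    """Write the order, payment hold, fulfilment entry and stock change together."""
    with conn:  # commits if the block succeeds, rolls back if anything raises
        # Reduce stock only if enough is available.
        cur = conn.execute(
            "UPDATE inventory SET quantity = quantity - ? "
            "WHERE item_id = ? AND quantity >= ?",
            (quantity, item_id, quantity),
        )
        if cur.rowcount != 1:
            raise ValueError("insufficient stock")
        value = quantity * unit_price
        cur = conn.execute(
            "INSERT INTO orders (customer_id, item_id, quantity, value) "
            "VALUES (?, ?, ?, ?)",
            (customer_id, item_id, quantity, value),
        )
        order_id = cur.lastrowid
        # Put the amount on hold; settlement would happen later in a batch.
        conn.execute(
            "INSERT INTO payments (order_id, amount, status) VALUES (?, ?, 'held')",
            (order_id, value),
        )
        conn.execute(
            "INSERT INTO fulfilment (order_id, status) VALUES (?, 'assigned')",
            (order_id,),
        )
    return order_id

conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute("INSERT INTO inventory VALUES (1, 10)")
conn.commit()
order_id = place_order(conn, customer_id=42, item_id=1, quantity=3, unit_price=25.0)
remaining = conn.execute(
    "SELECT quantity FROM inventory WHERE item_id = 1"
).fetchone()[0]
print(order_id, remaining)
```

The point of wrapping everything in one transaction is that the customer never ends up with a half-written order: either every table is updated or none is.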
OLTP needs single-digit millisecond processing times. All of these steps usually need to be complete by the time the customer has finished blinking.
OLAP systems, on the other hand, handle large volumes of accumulated data for analysis. All the data created by the OLTP systems needs to be analysed. It needs to be prepared for reporting in a Business Intelligence (BI) dashboard, or for use in a Machine Learning (ML) model.
Back to our IKEA example: executives need to see how sales trend after they introduce a new marketing strategy. All the information they need to answer this question quantitatively should be found in their BI reports. The executives may then want to implement an ML recommender model. This model needs to be trained on customers' existing purchasing habits, then be fed their behaviour in real time. The model will recommend items to customers, which should increase sales.
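To make the contrast with OLTP concrete, here is a rough sketch of the kind of query that might sit behind a sales trend report. Where an OLTP system touches one order at a time, this scans every order and summarises them by month. The `orders` table, its columns and the sample figures are invented for illustration, and a real warehouse would use a dedicated analytical database rather than sqlite3.

```python
import sqlite3

def build_sample_warehouse():
    # Invented sample data standing in for orders copied from the OLTP system.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_date TEXT, value REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [
            ("2024-01-15", 120.0),
            ("2024-01-20", 80.0),
            ("2024-02-03", 150.0),
            ("2024-02-18", 100.0),
            ("2024-02-25", 50.0),
        ],
    )
    conn.commit()
    return conn

def monthly_sales(conn):
    """Return (month, number of orders, total value) for each month."""
    return conn.execute(
        "SELECT strftime('%Y-%m', order_date) AS month, "
        "       COUNT(*) AS order_count, "
        "       SUM(value) AS total_value "
        "FROM orders "
        "GROUP BY month "
        "ORDER BY month"
    ).fetchall()

warehouse = build_sample_warehouse()
for month, order_count, total_value in monthly_sales(warehouse):
    print(month, order_count, total_value)
```

If the marketing strategy launched at the start of February in this made-up data, the report would show the monthly total rising from 200.0 to 300.0.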
The OLAP side of data engineering is where this series will spend most of its attention. With the AI boom since November 2022, data engineers are becoming the backbone of organisations that now want to power their company with AI technologies. Whether you want to build a RAG (retrieval-augmented generation) system or use the MCP (Model Context Protocol), your data needs to be prepped and served. That is what data engineers do.

Data engineers primarily focus on extracting data from a source system, transforming it and then loading it into a target system. This process is known as ETL (extract, transform, load).
The counterpart to ETL is ELT (extract, load, transform), in which you load the raw data into your target system first and then do your transformation and cleaning there. Whatever the use case, data will need to be prepared before use. Think of it like cooking, with data being the ingredients. The meal you are trying to cook will determine how you prepare your ingredients.
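The difference between the two approaches is easiest to see side by side. The sketch below is a toy illustration: the raw rows, table names and cleaning rules are all made up, and sqlite3 stands in for the target system. In `run_etl` the cleaning happens in Python before anything is loaded. In `run_elt` the raw rows are loaded untouched and the cleaning is done afterwards with SQL inside the target.

```python
import sqlite3

# Invented raw extract: everything arrives as untidy strings.
RAW_ORDERS = [
    {"order_id": "1", "item": " Bookcase ", "value": "49.99"},
    {"order_id": "2", "item": "shelf unit", "value": "35.00"},
]

def clean(row):
    # Transform step for ETL: fix types and tidy the item name.
    return (int(row["order_id"]), row["item"].strip().lower(), float(row["value"]))

def run_etl(conn, raw_rows):
    """Extract, transform in Python, then load only the clean rows."""
    conn.execute("CREATE TABLE orders_clean (order_id INTEGER, item TEXT, value REAL)")
    conn.executemany(
        "INSERT INTO orders_clean VALUES (?, ?, ?)",
        [clean(row) for row in raw_rows],
    )
    conn.commit()

def run_elt(conn, raw_rows):
    """Extract, load the raw rows as-is, then transform inside the target."""
    conn.execute("CREATE TABLE orders_raw (order_id TEXT, item TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO orders_raw VALUES (:order_id, :item, :value)",
        raw_rows,
    )
    # Same cleaning rules as clean(), expressed in SQL.
    # Note: SQL trim() strips spaces only, which is enough for this sample.
    conn.execute(
        "CREATE TABLE orders_clean AS "
        "SELECT CAST(order_id AS INTEGER) AS order_id, "
        "       lower(trim(item)) AS item, "
        "       CAST(value AS REAL) AS value "
        "FROM orders_raw"
    )
    conn.commit()

def read_clean(conn):
    return conn.execute("SELECT * FROM orders_clean ORDER BY order_id").fetchall()

etl_target = sqlite3.connect(":memory:")
elt_target = sqlite3.connect(":memory:")
run_etl(etl_target, RAW_ORDERS)
run_elt(elt_target, RAW_ORDERS)
print(read_clean(etl_target) == read_clean(elt_target))
```

Both routes end with the same clean table. One practical difference is that ELT keeps the raw data in the target system, so the transformation can be changed and rerun later without extracting from the source again.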
Data engineers are often the invisible people in the middle, enabling features for customers on the OLTP side or supporting executives with their decision making on the OLAP side.
Contact information
If you have any questions about our Data Engineering services, or you want to find out more about other services we provide at Solirius Reply, please get in touch.



